The difference set from my preceding post, between the two mpeg+default-cpuflags encodes (using opt or noopt on that bvop rd module), consists of a ~180-frame sequence of P and B frames, starting at a P frame and ending at the next I frame. The rest of the 5000-frame clip is identical.
Very curious indeed - somehow opt/noopt of *this* module polluted / affected a P frame?
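Locating a divergent run like that can be sketched as a frame-by-frame byte comparison of the two decoded raw-YUV dumps. This is just an illustrative sketch, not the harness I actually used; the CIF frame size and the synthetic demo data are assumptions:

```python
import io

FRAME_BYTES = 352 * 288 * 3 // 2   # assumed CIF, YUV 4:2:0 frame size

def divergent_frames(stream_a, stream_b, frame_bytes=FRAME_BYTES):
    """Return indices of frames whose raw bytes differ between two streams."""
    diffs, i = [], 0
    while True:
        fa = stream_a.read(frame_bytes)
        fb = stream_b.read(frame_bytes)
        if len(fa) < frame_bytes or len(fb) < frame_bytes:
            break          # one stream ended
        if fa != fb:
            diffs.append(i)
        i += 1
    return diffs

# Synthetic demo: 10 frames, frames 3..5 perturbed in stream B.
frames_a = [bytes([k]) * FRAME_BYTES for k in range(10)]
frames_b = [f if k not in (3, 4, 5) else b"\xff" * FRAME_BYTES
            for k, f in enumerate(frames_a)]
d = divergent_frames(io.BytesIO(b"".join(frames_a)),
                     io.BytesIO(b"".join(frames_b)))
print(d)  # -> [3, 4, 5]
```

On real dumps you would pass open file objects instead of BytesIO; the first and last indices returned bound the divergent P/B run.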
Ran another series of tests, expanding on the 'diff' axes:
constant -> vhq-b=on, vhq=1, 4mv=off, mpeg
variable -> range of cpu flags
10 encodes
Code:
                   /diff\   /same\   /same\   /SAME\
full optimize   mmx      xmm      sse      3dn      3de
                 |        |        |        |        |
                same     same     same     same     DIFF
                 |        |        |        |        |
noopt bvop rd   mmx      xmm      sse      3dn      3de
                   \diff/   \same/   \same/   \DIFF/
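(In the diagram, vertical links compare opt vs noopt at the same cpuflag; the arcs compare adjacent cpuflags within one build.) The grid itself is easy to reproduce mechanically from per-encode digests. A hedged sketch, not the actual harness; the single-letter digests below are synthetic stand-ins for hashes of the 10 encoded bitstreams:

```python
FLAGS = ["mmx", "xmm", "sse", "3dn", "3de"]

def compare_grid(hashes):
    """hashes: {(build, cpuflag): digest}. Returns (vertical, horizontal)
    verdict dicts; True means 'bitstreams identical'."""
    # vertical: opt vs noopt at the same cpuflag
    vertical = {fl: hashes[("opt", fl)] == hashes[("noopt", fl)]
                for fl in FLAGS}
    # horizontal: adjacent cpuflags within one build
    horizontal = {(b, a, c): hashes[(b, a)] == hashes[(b, c)]
                  for b in ("opt", "noopt")
                  for a, c in zip(FLAGS, FLAGS[1:])}
    return vertical, horizontal

# Synthetic digests reproducing the observed pattern above.
hashes = {("opt", "mmx"): "A", ("opt", "xmm"): "B", ("opt", "sse"): "B",
          ("opt", "3dn"): "B", ("opt", "3de"): "B",
          ("noopt", "mmx"): "A", ("noopt", "xmm"): "B",
          ("noopt", "sse"): "B", ("noopt", "3dn"): "B",
          ("noopt", "3de"): "C"}
vertical, horizontal = compare_grid(hashes)
print(vertical["3de"])                      # -> False (the lone vertical DIFF)
print(horizontal[("opt", "3dn", "3de")])    # -> True  (SAME arc, top row)
print(horizontal[("noopt", "3dn", "3de")])  # -> False (DIFF arc, bottom row)
```

With real files you would fill `hashes` from, say, SHA-1 digests of each bitstream; the same comparisons then fall out automatically.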
I really don't know what to make of this. Going from 3dn to 3de switches in a fairly sizable set of asm routines, including dequant_mpeg_inter_3dne. When estimation_rd_based_bvop is compiled with the optimizer, I get the same results as with xmm, sse, and 3dn. But when it is compiled noopt, the results change. (Or perhaps the result was *supposed* to change in the optimized build too, but didn't?)
Well, at least it narrows things down further...