18th April 2007, 21:23   #3
akupenguin
x264 developer
Join Date: Sep 2004
Posts: 2,392
SSE4 will be implemented as soon as I get access to an SSE4 CPU, or someone else who has one decides to write it.

No need for such a complicated interface. I can't write SSE4 code without a CPU to test it on, and if I can test it, there's very little chance that it would break on other CPUs. No one has complained about SSSE3 crashes...

What's your source for "40%"? This one says "Motion estimation ... often accounts for about 40% of the total CPU cycles consumed by an encoder. ... This white paper will describe how video encoders can benefit from the Intel SSE4 instructions, achieving 1.6x to 3.8x performance speedups in integer motion vector search." They then go on to describe ESA (exhaustive search), and their results are probably correct for ESA. But the fast integer motion searches in x264 are more like 10-15% of the total CPU time, and SSE4 won't help them as much as it helps ESA. Even x264's successive-elimination ESA might be as fast as Intel's brute-force SSE4 ESA.
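
To put rough numbers on that, here is a minimal sketch using Amdahl's law. It assumes the rest of the encoder is unchanged and that the quoted 1.6x to 3.8x applies to the whole search stage, which is optimistic for the fast searches. The function name `overall_speedup` is made up for illustration, and the output is arithmetic, not a benchmark.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the
    runtime becomes `kernel_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


# 40% is the white paper's share for motion estimation;
# 10-15% is the share estimated above for x264's fast integer searches.
for fraction in (0.40, 0.15, 0.10):
    for kernel_speedup in (1.6, 3.8):
        gain = overall_speedup(fraction, kernel_speedup)
        print(f"search = {fraction:.0%} of runtime, "
              f"{kernel_speedup}x faster search -> {gain:.2f}x overall")
```

Even taking the full 3.8x, a 10-15% share works out to roughly 1.08x to 1.12x overall, against about 1.42x under the 40% assumption.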

Last edited by akupenguin; 18th April 2007 at 22:05.