X264 SSE3 support?
inurenegade
22nd March 2006, 02:49
Hello all
I have been wandering around the forums and reading up on some stuff, and I was wondering if it's possible for the latest x264 builds to support/use SSE3 instructions for Pentium 4 processors. So far I have found these threads: http://forum.doom9.org/showthread.php?t=94065&highlight=sse3+instructions
http://forum.doom9.org/showthread.php?t=94303&highlight=sse3+x264
but they appear outdated, so I was wondering if anyone had any new information about this.
thanks in advance ^_^
ChronoCross
22nd March 2006, 04:12
It currently does not support SSE3. Also, do not ask if it will be implemented.
inurenegade
22nd March 2006, 04:36
thank you
sysKin
22nd March 2006, 12:54
SSE3 has only one new integer instruction doesn't it? It loads unaligned stuff better. Can be useful for ME, on P4s only.
Sounds easy to add ~
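
To make the idea concrete, here is a minimal sketch (not x264's actual code) of why unaligned loads matter for ME: the candidate block in the reference frame can start at any byte offset, so its 16-byte row is almost never aligned. The function names sad16, sad16_ref and best_offset are made up for illustration. The sketch uses the lddqu intrinsic only when the compiler targets SSE3, falls back to movdqu on SSE2, and to plain C otherwise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(__SSE3__)
#include <pmmintrin.h>   /* SSE3: _mm_lddqu_si128 (also pulls in SSE2) */
#elif defined(__SSE2__)
#include <emmintrin.h>   /* SSE2: _mm_loadu_si128, _mm_sad_epu8 */
#endif

/* Scalar reference: sum of absolute differences over 16 pixels. */
static unsigned sad16_ref(const uint8_t *cur, const uint8_t *ref)
{
    unsigned sum = 0;
    for (int i = 0; i < 16; i++)
        sum += (unsigned)abs((int)cur[i] - (int)ref[i]);
    return sum;
}

/* Same result, using SIMD when available. Both pointers may be unaligned. */
static unsigned sad16(const uint8_t *cur, const uint8_t *ref)
{
#if defined(__SSE2__)
    __m128i c = _mm_loadu_si128((const __m128i *)cur);
#if defined(__SSE3__)
    __m128i r = _mm_lddqu_si128((const __m128i *)ref);  /* lddqu */
#else
    __m128i r = _mm_loadu_si128((const __m128i *)ref);  /* movdqu */
#endif
    /* psadbw yields two partial sums, one per 64-bit half. */
    __m128i s = _mm_sad_epu8(c, r);
    return (unsigned)_mm_cvtsi128_si32(s) +
           (unsigned)_mm_cvtsi128_si32(_mm_srli_si128(s, 8));
#else
    return sad16_ref(cur, ref);
#endif
}

/* Hypothetical 1-D search: try every horizontal offset in [0, range)
 * and return the first one with the lowest SAD.
 * ref_row must hold at least range + 15 readable bytes. */
static int best_offset(const uint8_t *cur, const uint8_t *ref_row, int range)
{
    assert(range > 0);
    int best = 0;
    unsigned best_sad = sad16(cur, ref_row);
    for (int dx = 1; dx < range; dx++) {
        unsigned s = sad16(cur, ref_row + dx);
        if (s < best_sad) {
            best_sad = s;
            best = dx;
        }
    }
    return best;
}
```

Every call with dx that is not a multiple of 16 is an unaligned load from the reference row, which is the case lddqu is aimed at. Whether it is actually faster depends on the CPU, as discussed in this thread.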
Sirber
22nd March 2006, 13:15
IIRC, wasn't x264 MMX only?
Sharktooth
22nd March 2006, 13:29
uh? no.
el divx
22nd March 2006, 14:43
SSE3 has only one new integer instruction doesn't it? It loads unaligned stuff better. Can be useful for ME, on P4s only.
Sounds easy to add ~
Didn't AMD add SSE3 support to the Athlon 64? IIRC, they only skipped 1 or 2 instructions that required hyperthreading.
sysKin
22nd March 2006, 14:45
Didn't AMD add SSE3 support to the Athlon 64? IIRC, they only skipped 1 or 2 instructions that required hyperthreading.
Yes they did; my 4200+ supports SSE3. But it's for compatibility only: unlike on P4s, this instruction is not faster than its older (SSE2) equivalent.
akupenguin
22nd March 2006, 20:51
For that matter, why is there a separate instruction lddqu? If they can make unaligned loads fast, why not do it in movdqu?
Sharktooth
22nd March 2006, 21:19
coz intel is weird... and they probably added SSE3 only to screw the competition.
oh... SSE4 are coming though...
Kostarum Rex Persia
22nd March 2006, 23:37
Yes, who knows what SSE4 will bring to us.
Romario
23rd March 2006, 03:16
As far as I know, SSE3 has one specific instruction, lddqu. If Intel isn't lying to us, this instruction has a big impact on speeding up video compression.
So implementing SSE3 in x264 would be very nice.
soresu
23rd March 2006, 04:07
As far as I am aware, SSE3 is optimised towards a cpu with a longer pipeline (P4) - so is any speedup likely to change when Intel moves to their new Conroe/Core architecture?
squid_80
23rd March 2006, 04:08
As far as I know, SSE3 has one specific instruction, lddqu.
There's more to sse3 than a single instruction. There's also haddpd, haddps, hsubpd, hsubps, movddup, movshdup and movsldup to name a few that might come in useful (I haven't checked the timings, just the usage).
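
For anyone unfamiliar with the horizontal adds, here is a small sketch of what haddps does, assuming a compiler that defines __SSE3__ when SSE3 is enabled (otherwise it falls back to plain C). The function name hsum4 is made up for illustration, and note that it works on floats, not on the integer pixel data an encoder mostly handles.

```c
#include <assert.h>
#if defined(__SSE3__)
#include <pmmintrin.h>   /* SSE3: _mm_hadd_ps */
#endif

/* Sum four floats. v may be unaligned. */
static float hsum4(const float *v)
{
    assert(v != 0);
#if defined(__SSE3__)
    __m128 x = _mm_loadu_ps(v);
    x = _mm_hadd_ps(x, x);   /* (v0+v1, v2+v3, v0+v1, v2+v3) */
    x = _mm_hadd_ps(x, x);   /* every lane now holds (v0+v1)+(v2+v3) */
    return _mm_cvtss_f32(x);
#else
    /* Same association order as the two haddps steps above. */
    return (v[0] + v[1]) + (v[2] + v[3]);
#endif
}
```

haddpd is the same idea on two doubles, and the hsub variants subtract adjacent pairs instead of adding them.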
sysKin
23rd March 2006, 04:42
There's more to sse3 than a single instruction. There's also haddpd, haddps, hsubpd, hsubps, movddup, movshdup and movsldup to name a few that might come in useful (I haven't checked the timings, just the usage).
They cannot be useful because they're floating point.
squid_80
23rd March 2006, 05:06
They cannot be useful because they're floating point.
Just because they're intended for floats doesn't mean they can't be used otherwise. AMD's own optimization guide recommends using movlpd/movhpd for a 128-bit unaligned move instead of movdqu. There are a couple of movhlps instructions in xvid's source code too, and they're not moving floats ;)
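
A rough sketch of the movlpd/movhpd trick with intrinsics, assuming an SSE2-capable compiler (with a memcpy fallback otherwise). The function name copy16_split is made up for illustration, and the compiler is free to pick different instructions than the ones named in the comments. The data is integer bytes the whole time; the "double" type is only there to select the instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2: _mm_loadl_pd, _mm_loadh_pd */
#endif

/* Copy 16 bytes of integer data; src and dst may both be unaligned. */
static void copy16_split(uint8_t *dst, const uint8_t *src)
{
    assert(dst != 0 && src != 0);
#if defined(__SSE2__)
    __m128d v = _mm_setzero_pd();
    /* Two 64-bit loads in place of one 128-bit unaligned load.
     * The bits are moved as-is; nothing is interpreted as a float. */
    v = _mm_loadl_pd(v, (const double *)(const void *)src);        /* movlpd */
    v = _mm_loadh_pd(v, (const double *)(const void *)(src + 8));  /* movhpd */
    /* Reinterpret as integer data (no instruction) and store. */
    _mm_storeu_si128((__m128i *)(void *)dst, _mm_castpd_si128(v));
#else
    memcpy(dst, src, 16);
#endif
}
```

Whether the split load beats movdqu depends on the CPU, so it would have to be timed on the target hardware.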
Romario
23rd March 2006, 09:55
OK, but squid_80, can these instructions be used in future x264 builds without causing problems elsewhere?
Sharktooth
23rd March 2006, 10:21
the only useful instruction in sse3 is lddqu, the others have higher timings and it's not likely that lddqu will "change your life"... so why bother?
Romario
23rd March 2006, 10:55
Well, perhaps we can get a 15% speed-up if we properly implement SSE3 instructions.
Sharktooth
23rd March 2006, 11:00
LOL... maybe in your dreams... :)
FFWD
23rd March 2006, 12:45
Yes, who knows what SSE4 will bring to us.
http://de.wikipedia.org/wiki/SSE4
As far as I am aware, SSE3 is optimised towards a cpu with a longer pipeline (P4) - so is any speedup likely to change when Intel moves to their new Conroe/Core architecture?
SSE, SSE2 & SSE3 instructions will execute even faster thanks to a feature called Advanced Digital Media Boost (http://www.tomshardware.com/2006/03/13/idf_spring_2006/page6.html).
Intel Core Microarchitecture demo:
http://www.intel.com/technology/architecture/coremicro/demo/demo.htm
Sirber
23rd March 2006, 13:16
looks like I'll have to move to Intel :S
Sirber
23rd March 2006, 13:24
Yes, who knows what SSE4 will bring to us.
Pricy CPUs? ;)
bratao
23rd March 2006, 13:26
Sirber, two weeks later AMD will release a processor with SSE4+ and ULTRA Advanced Digital Media Boost. At half the clock speed, with 3 times Intel's performance.
FFWD
23rd March 2006, 13:50
Pricy CPUs? ;)
Merom (Mobile)
* T5600: 1.83 GHz, FSB667, 2 MB L2 cache, $241 at launch
* T7200: 2.00 GHz, FSB667, 4 MB L2 cache, $294 at launch
* T7400: 2.16 GHz, FSB667, 4 MB L2 cache, $423 at launch
* T7600: 2.33 GHz, FSB667, 4 MB L2 cache, $637 at launch
Conroe (Desktop)
* E6700: 2.66 GHz, FSB1066, 4 MB shared L2 cache, $530 at launch
* E6600: 2.40 GHz, FSB1066, 4 MB shared L2 cache, $316 at launch
* E6400: 2.13 GHz, FSB1066, 2 MB shared L2 cache, $241 at launch
* E6300: 1.86 GHz, FSB1066, 2 MB shared L2 cache, $209 at launch
* E4200: 1.60 GHz, FSB800, 2 MB shared L2 cache, ???
* Conroe Extreme Edition (XE): Specifications unknown
Woodcrest (Server)
* Xeon DP 5110: 1.60 GHz, FSB1066, 4 MB L2 cache, $209 at launch
* Xeon DP 5120: 1.86 GHz, FSB1066, 4 MB L2 cache, $256 at launch
* Xeon DP 5130: 2.00 GHz, FSB1333, 4 MB L2 cache, $316 at launch
* Xeon DP 5140: 2.33 GHz, FSB1333, 4 MB L2 cache, $455 at launch
* Xeon DP 5150: 2.66 GHz, FSB1333, 4 MB L2 cache, $690 at launch
* Xeon DP 5160: 3.00 GHz, FSB1333, 4 MB L2 cache, $851 at launch