amd opteron for video encoding


Doom9
27th April 2003, 14:03
anandtech did a test trying to determine the desktop performance of an opteron. While the processor looks good in gaming, the settings aren't really real-life ones (pci gf4mx400 card) so the results definitely have to be taken with a grain of salt.

In video encoding the GFX card doesn't matter and a server can be used for video encoding without problems. After all.. servers are optimized for high I/O throughput and that's what video encoding needs.
Here are the results: http://www.anandtech.com/cpu/showdoc.html?i=1818&p=8 (divx5) and http://www.anandtech.com/cpu/showdoc.html?i=1818&p=9 (wmv9).
Now I know this is going to stir up another controversy, but let me point out a few things.
1) The video encoding scenario isn't very realistic. Xmpeg 4.5 is old for starters, and no respectable ripper would encode DVDs at full D1 resolution and waste bitrate on black bars. A good test would involve dvd2avi, avisynth 2.51 and virtualdub for encoding. Also, the fps shown at the end of encoding may not be a good overall speed indicator. Everybody who has ever watched the stats during an encoding session knows that framerates can fluctuate a lot, going up nearly 20fps in no time in certain situations.
2) Opteron supports SSE2 extensions. How many applications today are able to detect and utilize those extensions? I recall that when the Athlon got SSE1 extensions, programs at first didn't use them, and windows media encoder in particular required a special patch which helped to improve performance. I would assume that the situation is the same this time, and as programs start to support the SSE2 extensions the opteron offers we'd eventually see another performance increase. And since both processor architectures now support SSE2, it is highly likely that we'll see more SSE2 optimized applications in the future, which would lead to general speed improvements on both platforms.
So what am I trying to say? Video testing methods used by most publications are not very realistic, and whether or not multimedia extensions are being used has to be mentioned.
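To make point 1 concrete, here is a rough sketch in C of why the fps reading at the end of a run can differ from the real overall speed. The `Sample` structure and both function names are made up for illustration; they don't come from Xmpeg, virtualdub or the anandtech test.

```c
#include <assert.h>
#include <stddef.h>

/* One progress sample from an encoder's status window: frames encoded
   during an interval and the length of that interval in seconds.
   (Hypothetical structure, for illustration only.) */
typedef struct {
    unsigned frames;
    double   seconds;
} Sample;

/* Overall speed: total frames divided by total time. */
double overall_fps(const Sample *s, size_t n)
{
    unsigned long total_frames = 0;
    double total_seconds = 0.0;

    assert(s != NULL && n > 0);
    for (size_t i = 0; i < n; ++i) {
        total_frames  += s[i].frames;
        total_seconds += s[i].seconds;
    }
    return total_seconds > 0.0 ? (double)total_frames / total_seconds : 0.0;
}

/* Speed during the last interval only, i.e. roughly what a status
   window shows at the moment the job finishes. */
double final_fps(const Sample *s, size_t n)
{
    assert(s != NULL && n > 0);
    return s[n - 1].seconds > 0.0 ? s[n - 1].frames / s[n - 1].seconds : 0.0;
}
```

With invented samples of 300, 200 and 400 frames over three 10-second intervals, the run averages 30 fps while the final reading is 40 fps, so a benchmark should report total frames over total time.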

kenshin
4th May 2003, 02:13
would be nice if someone wrote a bundled video encoding benchmark suite just for dvd rippers and pc hardware junkies :D

gabest
4th May 2003, 23:09
Originally posted by Doom9
2) Opteron supports SSE2 extensions. How many applications today are able to detect and utilize those extensions?

Assembly coders, correct me if I'm wrong, but shouldn't this ability be marked with the same flag as for the p4 in the return value of the "cpuid" instruction? Then any application could detect and automatically select its sse2 optimized routines.
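The "select its optimized routines" part could look roughly like the sketch below, which picks a routine through a function pointer based on a feature flag. Everything named here (`sad_fn`, `sad_c`, `sad_sse2_placeholder`, `select_sad`) is hypothetical, and the "SSE2" routine is only a stand-in that calls the plain C code so the example stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Signature shared by all versions of the routine. */
typedef unsigned (*sad_fn)(const unsigned char *, const unsigned char *, size_t);

/* Plain C reference routine: sum of absolute differences. */
static unsigned sad_c(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned sum = 0;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i)
        sum += (unsigned)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Stand-in for an SSE2 version. A real one would use SSE2 instructions;
   this placeholder just forwards to the C routine. */
static unsigned sad_sse2_placeholder(const unsigned char *a,
                                     const unsigned char *b, size_t n)
{
    return sad_c(a, b, n);
}

/* Choose the routine from a feature flag (e.g. taken from cpuid),
   not from the CPU vendor name. */
sad_fn select_sad(int has_sse2)
{
    return has_sse2 ? sad_sse2_placeholder : sad_c;
}
```

The program would call `select_sad()` once at startup and keep the returned pointer, so the choice costs nothing inside the encoding loop.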

Doom9
5th May 2003, 13:47
gabest: I think that's the theory. In practice however, programs read out the CPU type and then use certain features. So if the CPU says it's AMD, SSE2 will not be used. At least that's the problem that existed in tests when the Athlons started supporting SSE and most programs weren't using it.

gabest
5th May 2003, 14:07
Yea, that's possible (but also lame coding in my opinion). I remember that when I upgraded from cel900 to xp1600+ in the past, I noticed that the very old mp3 encoder wingogo, which was made long before the athlon xp cpus appeared, could automatically detect and use sse. The funny part was that it encoded slower than with the mmx/3dnow instructions.

int 21h
6th May 2003, 06:21
Originally posted by Doom9
gabest: I think that's the theory. In practice however, programs read out the CPU type and then use certain features. So if the CPU says it's AMD, SSE2 will not be used. At least that's the problem that existed in tests when the Athlons started supporting SSE and most programs weren't using it.

No. Anyone who is looking for optimizations should do a CPUID, then test EDX for the appropriate bits... similar to how DVD2AVI does it:

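__asm
{
    mov   eax, 1              // standard feature flags are returned in EDX
    cpuid
    test  edx, 0x00800000     // bit 23: STD MMX
    jz    TEST_SSE
    mov   [cpu.mmx], 1
TEST_SSE:
    test  edx, 0x02000000     // bit 25: STD SSE
    jz    TEST_3DNOW
    mov   [cpu.ssemmx], 1
    mov   [cpu.ssefpu], 1
TEST_3DNOW:
    mov   eax, 0x80000000     // highest extended function supported
    cpuid
    cmp   eax, 0x80000001
    jb    TEST_END            // no extended feature flags on this CPU
    mov   eax, 0x80000001     // extended (AMD) feature flags in EDX
    cpuid
    test  edx, 0x80000000     // bit 31: 3D NOW
    jz    TEST_SSEMMX
    mov   [cpu._3dnow], 1
TEST_SSEMMX:
    test  edx, 0x00400000     // bit 22: SSE MMX (AMD MMX extensions)
    jz    TEST_END
    mov   [cpu.ssemmx], 1
TEST_END:
}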


More details here:

http://www.sandpile.org/ia32/cpuid.htm

However, if the CPU itself doesn't populate the register correctly (i.e. it supports SSE, but doesn't say it does) that's something entirely different. But anyone basing optimizations off of the VendorID is simply silly.
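For SSE2 specifically, the same idea can be sketched in C without inline assembly. The flag sits in bit 26 of EDX for CPUID function 1, next to the MMX (bit 23) and SSE (bit 25) bits tested above. This is only a sketch: the `StdFeatures` struct and the function names are invented here, and the `<cpuid.h>` helper `__get_cpuid` assumes a GCC or Clang toolchain on x86.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>            /* GCC/Clang helper; toolchain assumption */
#define HAVE_CPUID_H 1
#endif

/* Hypothetical container for the flags we care about. */
typedef struct {
    int mmx, sse, sse2;
} StdFeatures;

/* Decode EDX as returned by CPUID function 1. */
void decode_std_edx(unsigned edx, StdFeatures *out)
{
    assert(out != NULL);
    out->mmx  = (edx & 0x00800000u) != 0;   /* bit 23 */
    out->sse  = (edx & 0x02000000u) != 0;   /* bit 25 */
    out->sse2 = (edx & 0x04000000u) != 0;   /* bit 26 */
}

/* Query the running CPU. Returns 1 on success, or 0 (all flags cleared)
   when CPUID function 1 can't be read on this build or machine. */
int query_std_features(StdFeatures *out)
{
    assert(out != NULL);
    out->mmx = out->sse = out->sse2 = 0;
#ifdef HAVE_CPUID_H
    {
        unsigned eax, ebx, ecx, edx;
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
            decode_std_edx(edx, out);
            return 1;
        }
    }
#endif
    return 0;
}
```

Nothing here looks at the vendor string, so an Opteron and a P4 that both set bit 26 would be treated the same way.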

Originally posted by gabest
Yea, that's possible (but also lame coding in my opinion). I remember that when I upgraded from cel900 to xp1600+ in the past, I noticed that the very old mp3 encoder wingogo, which was made long before the athlon xp cpus appeared, could automatically detect and use sse. The funny part was that it encoded slower than with the mmx/3dnow instructions.

It may have encoded slower, but it also encoded with higher accuracy :) There is a similar situation between MMX iDCT and SSE2 iDCT: the speed is nearly identical, but the SSE2 iDCT is leagues more accurate. When comparing SSE or even MMX to SSE2, the thing to keep in mind is that you're reaching an area of diminishing returns. SSE2 won't necessarily perform faster than SSE, but it will perform higher accuracy calculations at a higher speed than doing the same accuracy level in SSE. Eventually you reach the point where assembly is assembly, and disregarding any large branch mispredicts, the speed will be nearly identical for either version of the instructions... and that is where things get interesting for the two behemoths (AMD and Intel) since they both are moving towards radically different CPU models... In theory, SSE2 instructions should execute more quickly on an Opteron than a P4 because of the shorter pipeline.....

symonjfox
10th May 2003, 13:08
Ok, but let's wait for Windows XP-64 for opteron/athlon 64, and let's wait for programmers to optimize and recompile all their software for 64 bit.

For example, let's imagine Xvid64, CCE64, and all other kinds of programs, compiled and optimized for Opteron.

I think that in the near future we'll see a new level of performance. Let's wait until September, when the Athlon 64 will be released. :rolleyes:

int 21h
12th May 2003, 18:36
LOL, I'm sure all of those authors are looking forward to completely rewriting all of the assembly portions of their programs.

symonjfox
12th May 2003, 19:52
The nice thing is that Opteron and Athlon 64 are 32bit compatible, and AMD says that in 32bit mode the Opteron has about 30% more performance than an Athlon XP. So even with non-optimized software there will be a huge speed increase.

XiS
25th June 2003, 16:34
Originally posted by Doom9
1) The video encoding scenario isn't very realistic. Xmpeg 4.5 is old for starters, and no respectable ripper would encode DVDs at full D1 resolution and waste bitrate on black bars. A good test would involve dvd2avi, avisynth 2.51 and virtualdub for encoding. Also, the fps shown at the end of encoding may not be a good overall speed indicator. Everybody who has ever watched the stats during an encoding session knows that framerates can fluctuate a lot, going up nearly 20fps in no time in certain situations.
2) Opteron supports SSE2 extensions. How many applications today are able to detect and utilize those extensions? I recall that when the Athlon got SSE1 extensions, programs at first didn't use them, and windows media encoder in particular required a special patch which helped to improve performance. I would assume that the situation is the same this time, and as programs start to support the SSE2 extensions the opteron offers we'd eventually see another performance increase. And since both processor architectures now support SSE2, it is highly likely that we'll see more SSE2 optimized applications in the future, which would lead to general speed improvements on both platforms.


Most video programs now use SSE2 instructions. Xmpeg 4.5 is old and out of date, of course. Have you tried the latest XMPEG version at www.xmpeg.net (http://www.xmpeg.net)? :) :) :p