RealVideo encoding speed (AMD64 and P4)


karl_lillevold
24th August 2004, 17:59
For the first time I ran a quick benchmark on an AMD64. I used Producer Mercury Alpha1 32-bit (note that a 64-bit Producer will not be built anytime soon, since the 64-bit compilers do not yet support inline assembly, and the result would undoubtedly run slower).

As you can see, the results are quite good. AMD64 includes MMX, SSE, and SSE2, which makes a huge difference for RealVideo encoding.

Source: 704x352 23.976 fps I420 1570 frames

Encoding time for 2 passes, EHQ 65/85

P4 3.60 GHz : 02:47.312
P4 3.60 GHz no HT: 03:21.078
AMD64 3400+ : 03:31.968

(HT = Hyper Threading)
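
To compare the three runs as throughput rather than elapsed time, the figures above can be converted to frames per second. This is only a rough sketch: it assumes both passes process all 1570 frames, and the helper names `parse_time` and `frames_per_second` are made up for illustration.

```python
# Benchmark figures copied from the post above.
FRAMES = 1570
PASSES = 2  # assumption: each pass processes every frame once

results = {
    "P4 3.60 GHz": "02:47.312",
    "P4 3.60 GHz no HT": "03:21.078",
    "AMD64 3400+": "03:31.968",
}


def parse_time(text):
    """Convert an 'MM:SS.mmm' string into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def frames_per_second(elapsed):
    """Average frames processed per second over both passes."""
    return FRAMES * PASSES / parse_time(elapsed)


for name, elapsed in results.items():
    print(f"{name:<18} {parse_time(elapsed):8.3f} s  "
          f"{frames_per_second(elapsed):5.2f} fps")
```

The averages hide any difference in speed between the first and second pass, so they are only useful for comparing the machines with each other.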

CPU Information (from CPU-Z)


P4 3.60 GHz
Name: Intel Pentium 4 560
Code name: Prescott
Specification: Genuine Intel(R) CPU 3.60GHz
Family/Model/Stepping: F34
Extended Family/Model: 0/0
Package: LGA775
Core Stepping: D0
Technology: 0.09µ
Instructions Sets: MMX, SSE, SSE2, SSE3
Clock Speed: 3599.6 MHz
Clock multiplier: x18.0
Multiplier range: 14 - 18
Front Side Bus Frequency: 200.0 MHz
Bus Speed: 799.9 MHz


AMD64 3400+
Name: AMD Athlon 64 3400+
Code name: ClawHammer
Specification: AMD Athlon(tm) 64 Processor 3400+
Family/Model/Stepping: F4A
Extended Family/Model: F/4
Brand ID: 1
Package: Socket 754
Core Stepping: SH7-CG
Technology: 0.13µ
Instructions Sets: MMX, Extended MMX, 3DNow!, Extended 3DNow!, SSE, SSE2, x86-64
Clock Speed: 2200.4 MHz
Clock multiplier: x11.0
HTT Bus Frequency: 200.0 MHz

iwod
24th August 2004, 19:27
@Karl - Do you have the sample clip available for us to test?

@All - Does anyone have a Pentium M laptop? I would like to see how well it does on Real encoding. That would tell us whether the RV encoder is a MHz-based or an IPC-based application.

Will Real support Intel EM64T? Intel has already released the Pentium 4F in some markets.

Sirber
24th August 2004, 19:40
Would a dual Opteron be faster than an AMD64?

karl_lillevold
24th August 2004, 19:56
iwod: no, I am afraid the clip is too large.

Producer will probably be compiled for 64 bit (both AMD64 and EM64T) when the compilers we use support inline assembly. The currently available compilers do not, and the porting effort is simply too large. The same may be true for RealPlayer: when 64-bit processors and tools become mainstream, it will be ported.

Sirber: a true dual CPU is about twice as fast as a single CPU for RealVideo encoding, since the encoder threading is very effective.

Dark-Cracker
24th August 2004, 20:20
Hmm, 2.2 GHz vs 3.6 GHz is not really fair :) Would some tests on a P4 2.2 GHz be possible?

++

karl_lillevold
24th August 2004, 20:22
The comparison is relatively fair, since AMD64s do more work per clock cycle. That's why they call it 3400+!

Sirber
24th August 2004, 22:47
Yeah right. More effective :rolleyes:

Hiro2k
25th August 2004, 00:52
OMG Sirber don't make me slap you! ;)

If you put up the two processors at the same speed you will see that the AMD chip will win! That's why AMD changed to the model numbering system that wasn't based on CPU speed.


AMD is correct that clock speed isn't everything - average instructions executed per clock (IPC) multiplied by clock speed would give you the real instruction throughput. Unfortunately, coming up with a precise measurement of IPC is virtually impossible - it varies depending on the code executed. Still, clock-for-clock, Athlons are definitely faster than P4 chips, and the PR ratings were relatively accurate, at least in the beginning.



And frankly I think the test was fair because it tested the best processors from both companies and we were able to determine which one is best for REAL encoding.
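
The clock-for-clock claim can be checked roughly against the numbers in the first post by multiplying each elapsed time by the clock speed CPU-Z reported, which gives the clock cycles that elapsed per frame. This is a sketch under assumptions, not a measurement of IPC: elapsed cycles include memory stalls and everything else the machine was doing, both passes are assumed to cover all 1570 frames, and `cycles_per_frame` is a made-up helper name. The "no HT" run is used for the P4 so that Hyper-Threading does not skew the comparison.

```python
# Elapsed time in seconds and CPU-Z clock in MHz, from the first post.
FRAME_PASSES = 1570 * 2  # assumption: two passes over all 1570 frames

runs = {
    "P4 3.60 GHz no HT": (3 * 60 + 21.078, 3599.6),
    "AMD64 3400+": (3 * 60 + 31.968, 2200.4),
}


def cycles_per_frame(seconds, clock_mhz):
    """Clock cycles that elapsed per frame per pass."""
    return seconds * clock_mhz * 1e6 / FRAME_PASSES


per_frame = {name: cycles_per_frame(*run) for name, run in runs.items()}
for name, cycles in per_frame.items():
    print(f"{name:<18} {cycles / 1e6:6.1f} million cycles per frame pass")

ratio = per_frame["P4 3.60 GHz no HT"] / per_frame["AMD64 3400+"]
print(f"P4 / AMD64 cycle ratio: {ratio:.2f}")
```

On these figures the P4 spends about one and a half times as many clock cycles per frame as the Athlon 64, which is consistent with the two chips finishing in similar times despite the 1.4 GHz clock gap.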

Sirber
25th August 2004, 01:04
2 AMD vs 1 P4? LOL

The Edge
25th August 2004, 02:06
Originally posted by iwod
@Karl - Do you have the sample clip available for us to test?

@All - Does anyone have a Pentium M laptop? I would like to see how well it does on Real encoding. That would tell us whether the RV encoder is a MHz-based or an IPC-based application.

Will Real support Intel EM64T? Intel has already released the Pentium 4F in some markets.

I have a P4M laptop, mate. 3.06 GHz.

Hiro2k
25th August 2004, 02:30
Originally posted by Sirber
2 AMD vs 1 P4? LOL

No, I meant that if you compared the two processors, AMD and Intel, at the same clock speed, the AMD would win.

Although if you put 2 AMDs against 1 P4 you would be equal, since the Hyper-Threading on the P4 counts as another processor :p hahaha

Sirber
25th August 2004, 03:04
HT adds 15-20%, not more.

superdump
25th August 2004, 03:06
It's not quite fair because of the 3400+ versus 3.6 GHz thing. However, I find it disappointing that the RV codec seems to be so optimised for Intel systems rather than equally for both AMD and Intel systems.

karl_lillevold
25th August 2004, 03:31
Since AMD64 now supports SSE2, RealVideo is optimized about the same for both. Intel has Hyper-Threading, that's true, but that is taking advantage of code that's optimized for dual CPUs, be they AMD or Intel.

Hiro2k
25th August 2004, 04:48
Thank you Karl for stopping this FUD. Real is not optimized for Intel. It's as simple as more MHz = faster encoding. AMD is lacking in that department, and always has been, but that's not really a problem for most applications.

Sirber
25th August 2004, 04:51
AMD lol! ;)

just kidding.

So, after all that reading, I'll get a P4 for my next computer.

iwod
25th August 2004, 05:59
Originally posted by The Edge
I have a P4M laptop, mate. 3.06 GHz.

Pentium M - IS different to Pentium 4 M.

But then Xvid is faster on AMD64. The new Prescott is simply too "HOT" :D

Still waiting for 0.09 AMD64.

superdump
25th August 2004, 16:17
Originally posted by Hiro2k
Thank you Karl for stopping this FUD. Real is not optimized for Intel. It's as simple as more MHz = faster encoding. AMD is lacking in that department, and always has been, but that's not really a problem for most applications.

But the lower-MHz processors are supposed to be doing more per clock cycle, so what's going on? Why are the Intels so much faster?

Sirber
25th August 2004, 16:35
I don't know. Front Side Bus speed? Dual Channel memory? DDR2 DIMM? Who knows :rolleyes:

Hiro2k
25th August 2004, 18:13
The FSB speed on the Athlon 64 and the P4 is the same, 200 MHz (for now). They can both use dual-channel memory; I don't know if they did in this test. DDR2 could help, but that would mean Real Producer is more dependent on memory speed than on CPU speed.

I think if you disabled Hyper-Threading you would see some different results ;)

iwod
25th August 2004, 19:39
Well, I don't think DDR2 would help, since the effective transfer rates are pretty much the same. (Higher CAS latency plus higher bandwidth in DDR2 = nearly the same as DDR.)

But the lower-MHz processors are supposed to be doing more per clock cycle, so what's going on? Why are the Intels so much faster?
Well, first there is the Intel compiler, which generally produces faster code on Intel CPUs. (It is also faster on AMD64, just not by as much.) Then there is Hyper-Threading, which encoding benefits from. Hyper-Threading becomes more useful as clock speed scales up. So in this case, while AMD can do more work per thread, Intel has more threads. Get the point?

I just thought of the Celeron Prescott as well. Does anyone have one to test?

karl_lillevold
25th August 2004, 21:07
Even though it has always helped the decoder, hyper threading did not help the encoder until the introduction of larger cache sizes with the P4EE and Prescott CPUs. This is due to the encoder needing fast access to much larger memory, and the two threads would cause what is called "cache contention" (competing for the fast cache).

I have previously measured the improvement from HT on the Prescott at around 20%. If you add 20% to 2:47, you get 3:20. So then I ran the exact same encoding with encoder threading turned off, and I got 00:03:21.078. This is getting close to AMD64. I have updated the first post with this result as well.

EDIT: The Prescott also includes SSE3, which the RealVideo encoder has been optimized for. This speeds it up about 5%. If you add 5% to 3:21, you get around 3:31, the exact same as AMD64. So with HT and SSE3 taken out of the picture, the AMD64 3400+ is around the same speed as the P4 3.6 GHz.
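
The two estimates above can be reproduced in a few lines. This sketch follows the post's own approach of adding the percentage to the elapsed time; the helper names `to_seconds` and `fmt` are made up for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert minutes and seconds into total seconds."""
    return minutes * 60 + seconds


def fmt(seconds):
    """Format a duration as M:SS, rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


HT_GAIN = 0.20    # improvement from Hyper-Threading quoted in the post
SSE3_GAIN = 0.05  # improvement from SSE3 quoted in the post

# Add 20% to the 2:47 run with HT to estimate the time without HT.
est_no_ht = to_seconds(2, 47) * (1 + HT_GAIN)

# Add 5% to the measured 3:21 run without HT to estimate the time without SSE3.
est_no_sse3 = to_seconds(3, 21) * (1 + SSE3_GAIN)

print(fmt(est_no_ht))    # 3:20, close to the measured 03:21.078
print(fmt(est_no_sse3))  # 3:31, close to the AMD64 time of 03:31.968
```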

The Edge
25th August 2004, 22:09
Originally posted by iwod
Pentium M - IS different to Pentium 4 M.

But then Xvid is faster on AMD64. The new Prescott is simply too "HOT" :D

Still waiting for 0.09 AMD64.

I know. The way I wrote that last post was just confusing... oops.

http://img.photobucket.com/albums/v223/Bren1/Post/laptop_cpu.gif

sh0dan
3rd September 2004, 09:55
Originally posted by karl_lillevold
Producer will probably be compiled for 64 bit (both AMD64 and EM64T) when the compilers we use support inline assembly. The currently available compilers do not, and the porting effort is simply too large. The same may be true for RealPlayer: when 64-bit processors and tools become mainstream, it will be ported.

If you are using MS Visual Studio, there doesn't seem to be much hope there. :(

karl_lillevold
4th September 2004, 20:27
Originally posted by sh0dan
If you are using MS Visual Studio, there doesn't seem to be much hope there. :(
This is unfortunate, but with the high level of MMX and SSE(1/2/3) optimizations that are already in the RealVideo decoder and encoder, the 32-bit performance when running on 64-bit is most likely as good as it can be, since these optimizations are all about processing data in 64 - 128 bit wide operations anyway.

CiNcH
6th September 2004, 00:24
Well, AMD64 dual-core CPUs are on the horizon, and they will of course benefit from multi-threaded software much more than Intel's Hyper-Threading does. HT is only a form of Simultaneous Multi-Threading (2 logical CPUs), not real Symmetric Multi-Processing (2 physical CPUs); dual core is a kind of chip-level Symmetric Multi-Processing. AMD64 dual-core CPUs will also support SSE3.

The Server parts will be available for Socket 940 and Desktop parts for Socket 939. So if you buy Athlon 64 for Socket 939 today you should be able to update to AMD64 Dual Core later (available by the end of 2005). Mainboards are already specified for a maximum Thermal Design Power of 105W.


This is unfortunate, but with the high level of MMX and SSE(1/2/3) optimizations that are already in the RealVideo decoder and encoder, the 32-bit performance when running on 64-bit is most likely as good as it can be, since these optimizations are all about processing data in 64 - 128 bit wide operations anyway.
AMD64 technology is more than just extending the GPRs to 64 bits. It also adds 8 more 64-bit GPRs (16 total) and 8 more 128-bit SSE registers (16 total). This especially reduces the number of instructions spent saving and restoring registers. The code will be shorter, more efficient, and faster.

Hiro2k
6th September 2004, 15:14
Originally posted by CiNcH
The Server parts will be available for Socket 940 and Desktop parts for Socket 939. So if you buy Athlon 64 for Socket 939 today you should be able to update to AMD64 Dual Core later (available by the end of 2005). Mainboards are already specified for a maximum Thermal Design Power of 105W.

Well, the sockets might be the same, but the current chipsets won't support the dual-core functions. So buying one now seems like a waste; that's why I'm waiting for the next generation of chipsets to come out before I upgrade.

iwod
6th September 2004, 16:01
I am still hoping that Intel will get its act together and make a decent dual-core CPU based on the Pentium M, so that the CPU is not acting as a heater.

Does HT work in conjunction with dual core? I mean, can you get 4 threads with an Intel dual-core CPU with HT?

Should that translate into another 50% performance increase? :cool:

CiNcH
6th September 2004, 16:07
Well the sockets might be the same, but the current chipsets won't support the dual core functions.

That's wrong. AMD CPU's based on the K8 architecture have certain Northbridge functions integrated. AMD has developed the K8 supporting "SMP on-a-chip" (or Chip Multi-Processing... CMP) from the beginning. The interface to the I/O Subsystems stays exactly the same, being a Host-Bridge HyperTransport Link to the Mainboard's Chipset.

AMD has already demonstrated a 4-way SMP System with 4 AMD64 Dual Core CPU's (codenamed 'Egypt' and scheduled for mid-2005) in a HP ProLiant DL585 Server. Only a BIOS update is required.

Hiro2k
6th September 2004, 19:29
Originally posted by CiNcH
Only a BIOS update is required.

So that means current chipsets don't have it, which is what I meant. I didn't know you could update it by just flashing your BIOS. But I'm still going to wait. I don't think flashing my BIOS will give me 2 PCI-e slots for my Nvidia SLI setup :devil:

CiNcH
6th September 2004, 19:57
So that means current chipsets don't have it, which is what I meant. I knew you could upgrade the bios for it to work, it seems I worded my last paragraph wrong.
I still don't see your point. The BIOSes of mainboards with current-generation chipsets do not support AMD dual core, and neither will the BIOSes of mainboards with next-generation chipsets. The BIOSes will be available when the dual-core CPUs are available. There won't be Socket 939 mainboards made exclusively for AMD dual-core CPUs, just updated BIOSes.

The only reason for waiting for the next chipset generation could be PCI-Express or High Definition Audio.


Ok, I see you edited your post, now I understand.

Wow, going from ATI Radeon 9600 SE to nVIDIA SLI configuration will most likely boost gaming experience! ;)