View Full Version : What Machine should I buy to perform RV9 - EHQ=80 realtime encoding ?


ookzDVD
30th September 2003, 03:30
As the title says.

I plan to buy a new machine.
With plain encoding (no filters),
what processor is fast enough for realtime RV9 EHQ=80 encoding?

PS. With my current machine (Athlon XP 1700+) the encoding time is
about 3.5-4x the movie duration :(

So I need 1700+ x 4 = Athlon 6800+?
Hmm.... When will that machine be available? :(

General Lee D. Mented
30th September 2003, 09:20
Dual Opteron 246s? =)

scharfis_brain
30th September 2003, 10:06
Which resolution are you trying to encode at?

Maybe lowering that would help.

Sirber
30th September 2003, 13:51
Just buying a P4 will boost your speed by 25%, and dual-channel DDR is a must. RV9 depends on SSE2, and on high memory bandwidth too :) After buying that, you could encode at EHQ=90... :D

iwod
30th September 2003, 16:15
The best you could get now is a Pentium 4 Extreme Edition 3.2 GHz.

And with 2x 256 MB DDR400 SDRAM (i.e. dual channel) you should be done.

However, karl said earlier that he wants to do an ultra super duper complex algorithm for EHQ 100 to increase the quality a little bit more, but it would demand even more CPU power compared to EHQ 80.

Therefore I suggest you wait until the next Intel Pentium processor, the Prescott.............. Maybe Helix will even optimize their encoder more with SSE3...........

Or you could get an Athlon 64 CPU, which supports SSE2.........

Sirber
30th September 2003, 16:59
I'm tired of AMD being behind on multimedia CPU extensions. I'll get a P4 next time.

karl_lillevold
30th September 2003, 17:03
an Athlon 1700 is really only something like a 1.4 GHz, so if you get a P4 2.8 GHz, you get 2X * 1.25 (25% extra for SSE2) = 2.5 speedup. If your encodes in the past were 4X realtime, they will be 4/2.5 = 1.6X with a P4 2.8 GHz. Not quite there, but almost. By the way, is 4X including 2 pass or just for 1 pass?

RV9 incl EHQ has already been optimized for SSE3, but not yet released. This speedup should be approx 5% in addition to SSE2, more measurements later.

EDIT: I don't know the effect of Dual DDR 800 MHz memory, but I am guessing it will help quite a bit. Video encoding is very memory access intensive.
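
Karl's back-of-the-envelope estimate can be written out as a few lines of Python. This is only a sketch of the arithmetic in the post: it assumes encode speed scales linearly with clock speed and that SSE2 adds a flat 25% on top, and the function and parameter names are made up for illustration.

```python
def estimated_realtime_factor(old_factor, old_clock_ghz, new_clock_ghz, sse2_bonus=1.25):
    """Estimate encode time as a multiple of the movie duration on a new CPU.

    Assumption (from the post, not a measurement): speed scales linearly
    with clock frequency, and SSE2 contributes a flat multiplicative bonus.
    """
    clock_ratio = new_clock_ghz / old_clock_ghz  # 2.8 / 1.4 = 2.0
    speedup = clock_ratio * sse2_bonus           # 2.0 * 1.25 = 2.5
    return old_factor / speedup                  # 4.0 / 2.5 = 1.6


# Athlon XP 1700+ (roughly 1.4 GHz) at 4x realtime, moving to a P4 2.8 GHz
factor = estimated_realtime_factor(old_factor=4.0, old_clock_ghz=1.4, new_clock_ghz=2.8)
print(f"Estimated encode time: {factor:.1f}x realtime")
```

A result above 1.0 means the encode still takes longer than the movie itself, so under these assumptions a single P4 2.8 GHz gets close to realtime but does not reach it.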

Sharktooth
30th September 2003, 17:14
Dual Opteron 246. It supports SSE2 and it's faster than a dual P4 Xeon.

Sirber
30th September 2003, 17:18
Opteron is quite expensive...

Sharktooth
30th September 2003, 17:23
Then go for an Athlon 64 FX-51 or FX-53 (when available).
64-bit codecs will be available very soon...
Memory bandwidth shouldn't be a problem since the FX has a dual-channel memory controller.

karl_lillevold
30th September 2003, 17:54
re 64bit: we currently have no plans to optimize either encoder or decoder specifically for 64 bit processors, simply because all the functions that would benefit from parallelizations have already been optimized for MMX/SSE/SSE2 which for SSE2 offers up to 128bit wide operations. It will be very interesting to see how RV9 encodes and decodes on Athlon 64 though, now that AMD has included SSE2. Maybe we have to make some changes in places where shift operations assume 32bit, but it should be fairly minimal.
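
To illustrate what "shift operations assume 32bit" can mean in practice, here is a small Python sketch that emulates fixed-width signed words. It is not RV9 code: the `(x << 24) >> 24` sign-extension idiom is just a hypothetical example of a shift pair that only works when the word is exactly 32 bits wide, and all function names are invented for the example.

```python
def wrap_signed(value, word_bits):
    """Wrap an integer into a signed two's-complement word of word_bits bits."""
    value &= (1 << word_bits) - 1
    if value & (1 << (word_bits - 1)):
        value -= 1 << word_bits
    return value


def sign_extend_byte_shift24(x, word_bits):
    """Emulate `(x << 24) >> 24` on a signed word; correct only for 32-bit words."""
    shifted = wrap_signed(x << 24, word_bits)  # left shift, truncated to the word
    return shifted >> 24                       # arithmetic right shift


def sign_extend_byte_portable(x, word_bits):
    """Same idea, but the shift amount is derived from the word width."""
    shift = word_bits - 8
    return wrap_signed(x << shift, word_bits) >> shift


for bits in (32, 64):
    print(bits, sign_extend_byte_shift24(0xFF, bits), sign_extend_byte_portable(0xFF, bits))
```

With a 32-bit word the hard-coded version turns the byte 0xFF into -1 as intended; with a 64-bit word the sign bit never reaches the top of the register and the result stays 255. Deriving the shift amount from the word width avoids the problem.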

deXtoRious
30th September 2003, 19:49
Sirber

Does that mean that your Athlon XP 2000+ (mine too) is a piece of crap?
You know, you got me worried ;)

Sirber
30th September 2003, 20:56
With a P4, same speed, I could encode at least 35% faster :(

slavickas
30th September 2003, 21:01
Originally posted by karl_lillevold
re 64bit: we currently have no plans to optimize either encoder or decoder specifically for 64 bit processors, simply because all the functions that would benefit from parallelizations have already been optimized for MMX/SSE/SSE2 which for SSE2 offers up to 128bit wide operations. It will be very interesting to see how RV9 encodes and decodes on Athlon 64 though, now that AMD has included SSE2. Maybe we have to make some changes in places where shift operations assume 32bit, but it should be fairly minimal.

but amd64 has twice as many xmm registers

btw Karl, I think setting patternAdaptivity=3 via the registry probably won't work (I have tested it on only 1 clip); via the job file it seems to work

TheXung
30th September 2003, 22:46
Your best bet is the aforementioned P4 3.2 system. Build yourself a water-chilled cooling system and overclock it. That could get you into the 3.6-4.0 GHz range. Something else that would make a difference is using a capturing or encoding program that is multithreaded.

I have heard the SSE2 implementation on the Athlon FX described as "stoddy", so take that for what it is worth.

karl_lillevold
30th September 2003, 23:03
@slavickas: xmm regs; I did not know that about amd64. Also, I just checked patternAdaptivity via the registry. Works fine here. Make sure you spell it correctly:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\RealNetworks\RV9]
"patternAdaptivity"=dword:00000003


@TheXung: multi-threaded encoders; yep, the RV9 encoder and decoder are both fully multi-threaded. A true dual system will encode (and decode) about twice as fast as a single CPU. There is currently not much advantage to hyper-threading for the encoder, due to cache contention, but there will be when Intel increases the cache size. Maybe Prescott already has this, I am not sure.

ookzDVD
1st October 2003, 02:01
Originally posted by karl_lillevold
an Athlon 1700 is really only something like a 1.4 GHz, so if you get a P4 2.8 GHz, you get 2X * 1.25 (25% extra for SSE2) = 2.5 speedup. If your encodes in the past were 4X realtime, they will be 4/2.5 = 1.6X with a P4 2.8 GHz. Not quite there, but almost. By the way, is 4X including 2 pass or just for 1 pass?


Thank you Karl for your information, I'll buy a P4 2.8 GHz then :)
4x realtime is just for 1 pass :(

Sirber
1st October 2003, 02:45
strange...

with my 2000+ I need 2 days for a 4h movie (Rock et belles oreilles), so 1 day per pass, 6h / 1 movie h, so 6x. :devil:
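
Sirber's 6x figure follows from dividing the time spent on one pass by the movie length. A minimal sketch of that arithmetic, assuming both passes take the same time (the function name is made up for illustration):

```python
def per_pass_realtime_factor(total_hours, passes, movie_hours):
    """Encode time of a single pass as a multiple of the movie duration.

    Assumes every pass takes the same amount of time.
    """
    hours_per_pass = total_hours / passes
    return hours_per_pass / movie_hours


# 2 days (48 h) for a 2-pass encode of a 4 h movie
sirber = per_pass_realtime_factor(total_hours=48, passes=2, movie_hours=4)
print(f"{sirber:.0f}x realtime per pass")
```

That works out to 24 hours per pass and 6 hours of encoding per hour of movie, which is why it looks slow next to ookzDVD's 3.5-4x on a 1700+.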

ookzDVD
1st October 2003, 02:48
Originally posted by Sirber
strange...

with my 2000+ I need 2 days for a 4h movie (Rock et belles oreilles), so 1 day per pass, 6h / 1 movie h, so 6x. :devil:

Maybe you use too many filters :)

ookzDVD
1st October 2003, 02:49
Originally posted by scharfis_brain
which resolution do you try to encode?

maybe lowering that would help.

I prefer to encode 640x... for 1CD-rip.

Sirber
1st October 2003, 11:54
anamorphic for 1 CD sometimes improves the quality...

slavickas
13th October 2003, 14:04
Originally posted by karl_lillevold
@slavickas: xmm regs; I did not know that about amd64. Also, I just checked patternAdaptivity via the registry. Works fine here. Make sure you spell it correctly:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\RealNetworks\RV9]
"patternAdaptivity"=dword:00000003




yup it works; in that thread (http://forum.doom9.org/showthread.php?s=&threadid=60712) your reg sample contained an extra space in the name.
Any target date for a new milestone or DLL build with the fixed scalingFactor?

karl_lillevold
13th October 2003, 15:32
@slavickas: nice catch, I did not see that extra space at the end. I have corrected my post.

Re ScalingFactor fix: I don't know when the next Milestone will be ready. However, I added a slight improvement I would really like some feedback on before it becomes the new default, so I may post a DLL to test soon. This will also have the scalingFactor fix included.
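
The trailing-space problem in the registry sample is easy to miss by eye, so here is a small Python sketch that scans .reg text for value names with stray whitespace. It is a simplified check written for this thread, not a full .reg parser (it ignores escaped quotes inside names, for example), and the helper name is hypothetical.

```python
def find_padded_value_names(reg_text):
    """Return value names from .reg text that have leading or trailing whitespace."""
    padded = []
    for line in reg_text.splitlines():
        line = line.strip()
        if not line.startswith('"'):
            continue  # skip the header, key paths and blank lines
        end = line.find('"', 1)
        if end == -1:
            continue  # malformed line, no closing quote
        name = line[1:end]
        if name != name.strip():
            padded.append(name)
    return padded


GOOD = r'''Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\RealNetworks\RV9]
"patternAdaptivity"=dword:00000003
'''

# Same sample with an extra space at the end of the value name
BAD = GOOD.replace('"patternAdaptivity"', '"patternAdaptivity "')

print(find_padded_value_names(GOOD))
print(find_padded_value_names(BAD))
```

The first call reports nothing, while the second flags the padded name. A name with the extra space is simply a different registry value, which would explain why the setting seemed to be ignored.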

Sirber
13th October 2003, 16:14
What's that fix about? 30% more quality? :D

karl_lillevold
13th October 2003, 22:46
To clarify, the scalingFactor fix is a minor fix only applicable when the scalingFactor param is used. I posted the details in this thread (http://forum.doom9.org/showthread.php?s=&postid=385462#post385462).

The improvement is something else, more details later. Nothing like 30%, probably not even visible most of the time, just another small step in the right direction.

ToiletDuck
23rd February 2004, 08:38
I'm on a dual Opteron 240 system running at 1500 MHz right now with EHQ=100 and I am encoding at around 40-45% of actual movie speed. I don't really understand the whole SSE optimization thing. Wasn't there an article on how you could just remove some flags or something similar and get a lot more speed out of the AMD chip?

Hiro2k
24th February 2004, 00:39
Originally posted by ToiletDuck
I'm on a dual Opteron 240 system running at 1500 MHz right now with EHQ=100 and I am encoding at around 40-45% of actual movie speed. I don't really understand the whole SSE optimization thing. Wasn't there an article on how you could just remove some flags or something similar and get a lot more speed out of the AMD chip?

Which AMD chip?? There are many different cores and it makes a difference. Also I'm glad to hear that RV9 is well suited for dual processors, one more reason to upgrade to dual Opterons! :D

If only vdub could use both processors. :devil:

ToiletDuck
24th February 2004, 01:39
When using AutoDub it uses both... and as for which processor, I didn't know that Opteron processors had different cores.

McoreD
24th February 2004, 13:12
AMD processors pretty much suck when it comes to video encoding. I can back up my statement with practical benchmarking tests done at TomsHardware Labs.

Video Benchmarks for:
Mainconcept MPEG Encoder 1.4.1
Pinnacle Studio 9
XMPEG 5.0.3 / DivX 5.1.1 Pro
http://www.tomshardware.com/cpu/20040201/prescott-14.html
Windows Media Encoder 9
Windows Movie Maker 2.0
http://www.tomshardware.com/cpu/20040201/prescott-15.html

If you check those pages, you will see AMD processors get nowhere near 1st, 2nd or 3rd.

That's enough for me to go with Intel®. :)

nFury8
24th February 2004, 13:38
I can back up my statement with practical benchmarking tests done at TomsHardware Labs.
It's well known that AMD procs can't keep up with Intel's P4 when it comes to video encoding. But for pity's sake, there are a lot more reputable and credible sites to base one's decisions on than 'that' site. Of course these more respectable sites still conclude that the P4 is faster in video encoding.

regards

Sirber
24th February 2004, 13:40
I agree that P4 is better for video encoding, but AMD is better on a student's budget :D

slavickas
24th February 2004, 14:27
Originally posted by Sirber
I agree that P4 is better for video encoding, but AMD is better on a student's budget :D

true, true :D nothing can beat an overclocked 1400 Duron for price/performance :cool:

ToiletDuck
24th February 2004, 20:28
I don't know about that. With my dual Opteron 240s running at 1500 MHz I encoded XviD at 640xXXX with sharpening and deblocking options all turned on and I still get about 60-70 fps. I think it comes down to the codecs being more optimized for Intel chips. I can use Adobe Premiere and edit video with realtime rendering without Matrox cards or anything. Should there be 64-bit optimized code for the AMD64 architecture, I think the Opterons would clobber Intel hands down.

********Edit*******
http://www.extremetech.com/article2/0,3973,1402705,00.asp

The Opteron gives the P4 a good run for its money here. Wins more than it loses. And all this is 32-bit.

In uniprocessor systems, Intel's Hyper-Threading technology often gives it an edge over an AMD64-based CPU in some benchmarks. However, once we toss in a second CPU, all bets are off. The dual Opteron system outperformed the Xeon 3.06GHz (and would likely outperform the 3.2GHz version) in benchmarks where Intel had won on single CPU performance.

Big_Berny
25th February 2004, 22:21
Well, I think it's a lot cheaper to get a DVD-burner, isn't it?
Why do you want a fast PC to encode realtime?

Big_Berny

Hiro2k
25th February 2004, 22:54
Because not all of us have the time to dedicate our computers to one single task. I know with Windows 2000 you can set the priorities lower and do other things, but sometimes if you're doing a test on a filter, then speed is an issue and you want to get through things as soon as possible.

ToiletDuck
25th February 2004, 23:21
Well, I think it's a lot cheaper to get a DVD-burner, isn't it?

What are you doing here then lol.

Big_Berny
25th February 2004, 23:39
Originally posted by ToiletDuck
What are you doing here then lol.

At the moment I don't even have enough money to buy a DVD burner! :D
I'll wait for the new double-layer generation... Until then I have to use one of these codecs.

Big_Berny

MfA
25th February 2004, 23:51
The P4 would be such a nicer processor if it didn't stall at the drop of a hat ... I would definitely get a Prescott if you want Intel; the SSE3 instruction which allows non-aligned accesses, without the ridiculous stalls you get on a normal P4, will probably become quite important for video encoding apps (for the rest, SSE3 is pretty much inconsequential).

Sirber
26th February 2004, 00:02
P4 is also generally quieter than an AMD machine :)

nFury8
26th February 2004, 01:55
P4 is also generally quieter than an AMD machine :)
You mean the non-Athlon64s? I believe A64s are much quieter than P4s, comparatively. But I like my procs a bit noisy 'cause that would mean it's doing a lot more work than a silent one. :D ;)

Maverick
26th February 2004, 01:55
Originally posted by karl_lillevold
an Athlon 1700 is really only something like a 1.4 GHz, so if you get a P4 2.8 GHz, you get 2X * 1.25 (25% extra for SSE2) = 2.5 speedup.
Wow. That's the most incredible statement I've ever read. Now, disregarding that the Athlon has a 12 stage pipeline and a Pentium 4 has a 20 stage pipeline, the cache mispredictions the Pentium 4 has so often, AND the fact that its tiny L1 instruction cache can barely cache a thing...
RealVideo's encoder must be doing some REALLY funky stuff. You can't say double the clock speed = double the MIPS/speed. The P4 3.2 Extreme does so well because of its giant 512K L2 + 2048K L3 caches. Most video encoding I see has *Big* data sets to process, so cache helps enormously, to the standard un-extreme P4's detriment.


Originally posted by karl_lillevold
RV9 incl EHQ has already been optimized for SSE3, but not yet released. This speedup should be approx 5% in addition to SSE2, more measurements later.
If it's only 5%, then Prescott at equal clock speeds, unfortunately, is going to be slower than Northwood. Intel decided it would be 'REALLY COOL' if they made the Prescott 30 stages. Yay. Intel will be releasing a newly optimised C++ compiler soon, look out for it.

Originally posted by karl_lillevold
EDIT: I don't know the effect of Dual DDR 800 MHz memory, but I am guessing it will help quite a bit. Video encoding is very memory access intensive.
Just curious, what hardware do you guys test your encoder on?

Sirber
26th February 2004, 02:07
always stuff that we don't have :(

ToiletDuck
4th March 2004, 07:01
However, as cheap as DVD burners are getting, it almost seems like in 2-3 years DivX, XviD, and all the other codecs will not be needed anymore. Maybe for live streaming media. Even then, bandwidth is rising so fast it is unreal. However, I could be biting my tongue in the future for not thinking of what else is coming out. The new "blue wave" or whatever technology will be coming out. Pretty soon we might have a 40 GB disc that we want to put on a 4 GB disc...... ehhhhh, only time will tell I guess.

Sirber
4th March 2004, 13:09
Higher disk capacity means more movies per disk :D Higher bandwidth means more realtime streaming. XviD/DivX/VP6/RV10 won't die soon :)

iwod
4th March 2004, 21:16
It is quite funny looking back at my own reply from September, when I still believed Intel was going to make Prescott better than the rumors.

However, all the wild rumors came true: it sucks power like hell. I couldn't even imagine buying such an expensive chip and then paying extra for its electricity bills.

People tend to say Intel is better for encoding only because all the comparisons have been done on applications which are specifically optimized for Intel CPUs.

And so far I haven't seen any article on encoding RV10 EHQ 100. (Suddenly thought I could recommend this to Anandtech.)

ToiletDuck
5th March 2004, 00:05
iwod, I encode at EHQ 100 and can do a 2hr movie in about 4 hours total. That is ripping it, doing audio, and a 2-pass encode.

nFury8
5th March 2004, 02:00
Originally posted by ToiletDuck
iwod, I encode at EHQ 100 and can do a 2hr movie in about 4 hours total. That is ripping it, doing audio, and a 2-pass encode.
Really? What's your exact rig setup? Encoding parameters? That rig screams, man. Awesome.

Maverick
5th March 2004, 02:50
Makes my 36hr WM9 encoding sessions (P4 2.6C) sound like marathons...

nFury8
5th March 2004, 05:34
I just searched a couple of posts back in this thread and realized this guy (ToiletDuck) is running a dual Opteron (240?). Jeez, where do these guys get all the money to buy life's expensive stuff? :D Duck, mind posting your whole setup, like memory and mobo, as well as encoding parameters?