View Full Version : MeGUI CPU Time Test - Compare different CPUs encoding the same file
tsp
31st March 2007, 21:59
Opteron 165, 8x300 MHz = 2.4 GHz, 3 GB RAM (but Windows XP only detects 2.5 :( )
http://img379.imageshack.us/img379/6404/meguihl3.png
And yes the voltage is wrong. Should be 1.4 V
aicjofs: The MT numbers on Vista could be correct, as the multithreading support has been improved in Vista.
graysky
1st April 2007, 10:57
@ tsp: FSB to DRAM = CPU/12? Is it 3:2 ?
squid_80
2nd April 2007, 02:22
PS: I know that a 64-bit meGUI won't help unless everything that meGUI accesses is 64-bit too.
64-bit test results (XP64 with 64-bit MeGUI, plugins and x264.exe) for an E6600 (9x266, 5-6-6-18): fps1 = 53.33, fps2 = 17.34.
graysky
2nd April 2007, 20:09
@tsp: sorry it took me so long but I finally updated the chart w/ your result.
@ squid_80: interesting that your XP64 results are more or less identical to other 9x266 chips such as darkstar's result.
morph166955
2nd April 2007, 22:36
To make sure that it worked for when I want to run it on the octa-core later in the week, I ran your test on my Lenovo X60 laptop w/ Core 2 Duo T7200. Here are the results so you can add them to your list. I tried to shut off as much as I could, but I'm sure there were a few things slowing this machine down... I'm going to format this laptop in the next month, and I'll rerun it once I get to that point.
EDIT: forgot to say the machine is Windows XP 32-bit, in case that matters for you
Selmer79
3rd April 2007, 14:26
And yes the voltage is wrong. Should be 1.4V
Updating CPU-Z might help.. ;) Latest version is 1.39, you're running 1.33.1..
Get it here: http://www.cpuid.com/download/cpu-z-139.zip
HookedOnTV
3rd April 2007, 18:13
The memory timings listed next to my numbers aren't mine. Forgot to submit them.
1:1 @ 3-4-4-10
graysky
4th April 2007, 18:50
@morph166955: thanks for the first merom data point, added
@ HOT: fixed
morph166955
5th April 2007, 04:13
I'm loading XP on the 8-core tomorrow... I should have some results for you all then :-D. Just for a hint... I loaded FreeBSD 64-bit on it just now... holy hell it's fast (and seeing a 7 under the CPU ID column when using top is kinda cool)
morph166955
6th April 2007, 00:22
OK, so the test has officially been run... sadly the test didn't measure up to what the computer could do. x264 only ran at 38% CPU on the first pass... did 82.55 fps. Couldn't coax it to do more. It did manage to run at 95%+ on the second pass and got me 68.1 fps. I'm going to try to tune this box a little to see what it can do though... I think 68.1 is kinda lacking... I'm shooting for 80 or so minimum.
morph166955
6th April 2007, 00:47
THANK GOD FOR MT.DLL!!!
First Pass: 142.59FPS!!!
Second Pass: 70.34FPS!!!
mt.dll pegged the CPU above 95% for both passes. I'm going to run it with the brand spanking new x264 to see what it does. (Oh, and the CPU has that SpeedStep technology, so ignore that it says 1999 MHz on the cpuinfo page... it steps down from 8.0x to 6.0x when it's idle.)
http://www.benswebs.com/public/pictures/meguitest.JPG
Hmm, x264 and AviSynth MT don't scale too well with so many cores... but still very, very fast.
morph166955
6th April 2007, 03:04
Doesn't scale well? The closest thing to that is an overclocked QX6700 set at 3.8 GHz, which was still 20 fps under that. An equivalent QX6700 at 2.66 GHz hit almost exactly half of what I hit. I'd say it scaled pretty damn well.
Blue_MiSfit
6th April 2007, 06:43
insert insane, slack-jawed, drooling jealousy here.
You can actually encode HD H.264 in a reasonable amount of time now!
~MiSfit
Doesn't scale well? The closest thing to that is an overclocked QX6700 set at 3.8 GHz, which was still 20 fps under that. An equivalent QX6700 at 2.66 GHz hit almost exactly half of what I hit. I'd say it scaled pretty damn well.
Well, if we compare your result without AviSynth MT with darkstar782 (Core 2 Duo 2.66 GHz) and drjay (QX6700 2.66 GHz), the framerate only increases by 66% every time the number of cores is doubled, as shown in this graph:
http://img46.imageshack.us/img46/191/scalingfj5.jpg
You could try disabling some of the cores (/NUMPROC= switch in boot.ini (http://www.microsoft.com/technet/sysinternals/information/bootini.mspx)) to see how it scales with AviSynth MT, as there is not enough data in this thread to create a similar comparison. (Going from 82.55 fps with 35% CPU utilization to 142.59 fps with >95% CPU utilization is not perfect scaling, which should be more like 225 fps, but that must be the fault of the creator of mt.dll :) )
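The "should be more like 225 fps" figure can be reproduced with a linear extrapolation. This is only a sketch of that arithmetic, assuming throughput would grow in proportion to CPU utilization (a best case, not a measurement). The function names are made up for illustration. It also runs the numbers with the 38% utilization morph166955 reported for the first pass, and converts the "+66% per doubling" figure into an efficiency.

```python
def ideal_fps(measured_fps: float, measured_util: float, target_util: float) -> float:
    """Frame rate expected if fps scaled linearly with CPU utilization (assumption)."""
    return measured_fps * target_util / measured_util


def doubling_efficiency(gain_per_doubling: float) -> float:
    """Fraction of a perfect 2x speedup achieved each time the core count doubles."""
    return (1.0 + gain_per_doubling) / 2.0


if __name__ == "__main__":
    # 82.55 fps is the first-pass result without mt.dll quoted in the thread.
    for util in (35.0, 38.0):
        projected = ideal_fps(82.55, util, 95.0)
        print(f"{util:.0f}% -> 95% utilization: about {projected:.0f} fps if scaling were linear")
    print("measured with mt.dll: 142.59 fps")
    print(f"+66% per doubling is {doubling_efficiency(0.66):.0%} of perfect scaling")
```

With 35% as the starting point the projection is about 224 fps, and with 38% it is about 206 fps, so the measured 142.59 fps falls short of linear scaling either way.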
graysky
6th April 2007, 20:43
Is the inefficient scaling a function of an inefficient OS? I think morph166955 needs to rip a DVD and encode the mpg stream under BSD and then use the same commandline options for x264.exe under windows and compare the two results. I'll bet dollars to doughnuts the NIX will do a more efficient job... you up to it dude?
graysky
6th April 2007, 21:21
updated the table
Is the inefficient scaling a function of an inefficient OS? I think morph166955 needs to rip a DVD and encode the mpg stream under BSD and then use the same commandline options for x264.exe under windows and compare the two results. I'll bet dollars to doughnuts the NIX will do a more efficient job... you up to it dude?
Well, as AviSynth 2.5 currently is not available for *NIX, it is difficult to compare this result. It would require a new test case without AviSynth, but I believe it is more x264's fault than Windows', as it is not easy to create an application that scales well when only working with 1 frame.
morph166955
7th April 2007, 02:30
Aww, ya only stuck my non-mt.dll line in there... looks so weak in comparison to the others. I think that for systems with a significant number of cores, mt.dll runs should be included (and labeled so people know), since it's obvious that MeGUI/AviSynth is not running anywhere near its potential and it's choking x264.
morph166955
7th April 2007, 02:35
Well, as AviSynth 2.5 currently is not available for *NIX, it is difficult to compare this result. It would require a new test case without AviSynth, but I believe it is more x264's fault than Windows', as it is not easy to create an application that scales well when only working with 1 frame.
Well, I suppose I could do a run using mencoder and have it do the resize/deint and everything else the AviSynth script does. I think that, in comparison, the actual x264 encode accounts for a hell of a lot more CPU usage than the resize/deint does. I'm down in Florida right now, but I'm terminaled into the server installing things, so depending on how fast they go (and let me tell you... they are going FAST... gcc 4.1.2 w/ C & C++ compiled in under 10 minutes!) I'll do my best to duplicate the tests. The other option, I suppose, is to resize/deint the image to a raw YUV 4:2:0 file and feed that into x264 with the options used in the jobs. I'll keep y'all updated.
graysky
7th April 2007, 11:53
I don't think you need the resize in there at all since you just want to compare the two OS's ability to handle 8 cores. I think a straight encode would work just fine. I would suggest you just rip a VOB from any DVD and rename it to .mpg since it's an mpeg-2 stream which I think x264 can use as an input and simply encode it to whatever options you want under BSD and compare the encode time to the same thing under windows keeping avisynth out of the picture all together. I took this example from my logs as a suggested output:
Pass1: x264 --pass 1 --bitrate 1582 --stats "D:\work\pass1.stats" --bframes 3 --b-pyramid --direct auto --subme 1 --analyse none --vbv-maxrate 25000 --me dia --threads auto --thread-input --progress --no-psnr --no-ssim --output NUL "D:\work\input.mpg"
Pass2: x264 --pass 2 --bitrate 1582 --stats "D:\work\pass1.stats" --ref 5 --mixed-refs --no-fast-pskip --bframes 3 --b-pyramid --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse all --8x8dct --vbv-maxrate 25000 --me umh --threads auto --thread-input --sar 40:33 --progress --no-psnr --no-ssim --output "D:\work\2ndpass.mkv" "D:\work\input.mpg"
graysky
7th April 2007, 12:34
I added the two mt.dll results we have to the first post in their own table.
@morph: what line did you add to your avs when you did your mt.dll test? For example, aicjofs added this line to the avs file: SetMTMode(2,0), and I wanna at least track the results people are getting with their settings.
ditche
7th April 2007, 14:32
Updated.
http://users.skynet.be/bk314761/megui03.png
http://users.skynet.be/bk314761/megui04.png
aicjofs
7th April 2007, 20:34
I ran some more tests since Tsp felt we didn't have enough info to draw conclusions about MT scaling.
I set QX6700 to default 2.66 and ran a quadcore test, and then disabled multiplexing(quad runs as only a dual core and will give rough dual core result) with MT.dll runs. Only issue here is RAM is now 400 4-4-4, instead of 519 4-4-4.
2.66 quad MT 103.38 37.79
2.66 dual MT 59.17 19.12
It really appears to me the gains for MT.dll are minimal for dual core and start increasing for quad core and more. At least I didn't see the benefit during this test of MT for dual core.
but going from 82.55 fps with 35 % cpu utilization to 142.59 fps with >95 % cpu utilization is not perfect scaling(should be more like 225 fps)
For the 1st pass I still worry about the accuracy of this, since it completes before frame rates have stabilized and I'm not sure frame rates rise in a linear fashion from start to maximum. This can skew the results, i.e. perhaps his avg framerate for a longer test would be closer to 225. Who knows...
I think that for the systems that run a significant amount of cores that mt.dll runs should be included (and labeled so people know) since its obvious that megui/avisynth is not run at anywhere near its potential and its choking x264.
Agreed. That's why I mentioned it in this thread, and we have got some good data already that shows the benefits for numerous cores. Really helps put all of this great data submitted by people into perspective we all get something out of.
I also ran some more Vista tests: MT, noMT, stock speed, quad core, dual core, etc. Here is a summary of everything run thus far.
XP
Clock (GHz)  Cores  MT    FPS1    FPS2   Pass 1  Pass 2
3.8          quad   MT    138.16  54.15  10 sec  28 sec
3.8          quad   noMT  121.20  55.01  12 sec  27 sec
2.66         quad   MT    103.38  37.79  15 sec  39 sec
2.66         quad   noMT   86.74  37.91  17 sec  39 sec
2.66         dual   MT     59.17  19.12  24 sec  77 sec
2.66         dual   noMT   58.78  19.28  25 sec  77 sec
VISTA
Clock (GHz)  Cores  MT    FPS1    FPS2   Pass 1  Pass 2
3.8          quad   MT    147.91  53.78  10 sec  29 sec
3.8          quad   noMT  123.67  53.16  12 sec  27 sec
2.66         quad   MT    105.67  38.28  14 sec  39 sec
2.66         quad   noMT   86.26  38.54  17 sec  39 sec
2.66         dual   MT     59.33  19.22  25 sec  77 sec
2.66         dual   noMT   58.6   19.13  25 sec  77 sec
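To make the MT vs. noMT comparison easier to read, here is a small sketch that turns the numbers above into percentage gains. The data is copied from the summary. The layout of the dictionary and the function names are just illustrative choices, and the two fps columns are assumed to be first-pass and second-pass fps in that order.

```python
# (OS, clock in GHz, cores) -> {"MT": (fps1, fps2), "noMT": (fps1, fps2)}
RESULTS = {
    ("XP", 3.8, "quad"): {"MT": (138.16, 54.15), "noMT": (121.20, 55.01)},
    ("XP", 2.66, "quad"): {"MT": (103.38, 37.79), "noMT": (86.74, 37.91)},
    ("XP", 2.66, "dual"): {"MT": (59.17, 19.12), "noMT": (58.78, 19.28)},
    ("Vista", 3.8, "quad"): {"MT": (147.91, 53.78), "noMT": (123.67, 53.16)},
    ("Vista", 2.66, "quad"): {"MT": (105.67, 38.28), "noMT": (86.26, 38.54)},
    ("Vista", 2.66, "dual"): {"MT": (59.33, 19.22), "noMT": (58.6, 19.13)},
}


def percent_gain(with_mt: float, without_mt: float) -> float:
    """Relative change in fps from enabling MT, in percent."""
    return (with_mt / without_mt - 1.0) * 100.0


def mt_gains(results):
    """Return (os, clock, cores, pass-1 gain, pass-2 gain) for every configuration."""
    rows = []
    for (os_name, clock, cores), runs in results.items():
        gain1 = percent_gain(runs["MT"][0], runs["noMT"][0])
        gain2 = percent_gain(runs["MT"][1], runs["noMT"][1])
        rows.append((os_name, clock, cores, gain1, gain2))
    return rows


if __name__ == "__main__":
    for os_name, clock, cores, gain1, gain2 in mt_gains(RESULTS):
        print(f"{os_name:<5} {clock:<4} {cores:<4} pass 1: {gain1:+5.1f}%  pass 2: {gain2:+5.1f}%")
```

On these numbers the first pass gains roughly 14 to 23% on the quad configurations and about 1% on the dual ones, while the second pass stays within about 2% in every case.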
HookedOnTV
7th April 2007, 22:58
...It really appears to me the gains for MT.dll are minimal for dual core and start increasing for quad core and more. At least I didn't see the benefit during this test of MT for dual core.
I did a test run with MT on my dual core and the total time ended up exactly the same.
Blue_MiSfit
8th April 2007, 04:31
MT AviSynth is only really necessary when you have enough CPU power to make non MT AviSynth the bottleneck.
It makes a HUGE difference when you are doing less intensive encoding tasks, like CCE, which can easily run 120+fps on a quad core system with a basic AviSynth script. x264 is so CPU intensive that MT doesn't help that much with most pedestrian dual cores, but with the big boys it can make a world of difference, depending on your script complexity and x264 configuration.
Now, if only it worked with MVTools :D
~MiSfit
morph166955
8th April 2007, 04:39
I added the two mt.dll results we have to the first post in their own table.
@morph: what line did you add to your avs when you did your mt.dll test? For example, aicjofs used this line to the avs file: SetMTmode(2,0) and I wanna at least track the results people are getting with their settings.
Same line as aicjofs. I wasn't very familiar with the plugin since I rarely use AviSynth, so I just went with the same thing. Is there anything I should change to make it faster?
morph166955
8th April 2007, 04:45
For the 1st pass I still worry about the accuracy of this, since it completes before frame rates have stabilized and I'm not sure frame rates rise in a linear fashion from start to maximum. This can skew the results, i.e. perhaps his avg framerate for a longer test would be closer to 225. Who knows...
I'm going to run some DVD rips through this thing tomorrow once I'm home from Florida to get a nice stable rate. I think that for us crazy high core/speed systems we need a much longer source to run off of (and possibly an HD source). I'll do up something in the next few days (moving back to school, so I may be delayed a little) for the quad-core people to test on (assuming I'm the only octa-core setup out there at the moment that's running these tests).
graysky
8th April 2007, 19:32
Cool let us know how it goes.
squid_80
9th April 2007, 09:34
I would suggest you just rip a VOB from any DVD and rename it to .mpg since it's an mpeg-2 stream which I think x264 can use as an input and simply encode it to whatever options you want under BSD and compare the encode time to the same thing under windows keeping avisynth out of the picture all together.
No way does x264 accept mpg as input.
morph166955
9th April 2007, 14:47
I'm movin' back up to school today and I'm having an issue getting mplayer to compile in 64-bit (and I'm having some issues using -m32 on gcc for it also... that's probably my fault though for not having the proper 32-bit libraries floating around). I'll be tinkering with it tonight and over the next few days though. If anyone's got some suggestions on what to do, please let me know... I'm getting assembly errors that are something like "xxx is not a valid 64 bit register". I'm pretty sure I have to build binutils in 32-bit mode to get those libraries, but I'm not 100% sure. As for source, I'll either pipe my source through mencoder to do x264 or I'll create a raw YUV 4:2:0 source for it to read out of, or something like that.
EDIT: just got to school, and I got mplayer to compile... had to update the decrepit old version of binutils that bsd64 comes with. I'm installing Samba now so that I can copy files to the box more easily. I should have results shortly on manual x264 passes with the same source.
morph166955
10th April 2007, 01:20
Using mencoder and the settings from job 2, I am now a firm believer that the video source we're using is nowhere near what's needed to stabilize a framerate. according to I hit 179.37 fps on the first pass, which took 10.413 seconds, and the frame rate was still climbing when it ended. I'm going to try a pass on a VOB now... be back soon
graysky
10th April 2007, 01:26
Interesting. Using pretty fast dualcore chips, many users have been able to get highly reproducible result with it. Your machine with 8 cores on the other hand might be the exception :)
What happens when you run the same encode several times (179.37 fps or near that each time?)
morph166955
10th April 2007, 01:51
I get within 1-2 fps of that each run. What I did notice is that I am once again limited by the encoder not running in multi-threaded mode (I believe it's mencoder this time). I'm only using the CPUs at 38% total on that first pass... and that's at 170+ fps! I'm converting the video into a raw YUV 4:2:0 video to pipe directly to x264... god help us all
morph166955
10th April 2007, 02:23
OK, so here are the results using FreeBSD, x264 r647 and the source video in raw YUV format, resized by mplayer with settings as close to those of the avs file as I could get. The times are generated using the simple GNU time command. Processes were run using "nice -n -20 time x264...". Source was stored in a ramdisk for maximum speed. I realize this isn't 100% faithful to what the official test is running, but I believe it's a good source of info to base theory on. I'm trying to figure out why x264 isn't doing threaded input on this file; it's kinda weird to me and I believe it's because it's not using AviSynth as the input source. I'm looking into AviSynth 3.0 since it is written to run on Linux. That may help.
pass 1:
encoded 1486 frames, 120.76 fps, 995.46 kb/s
12.38 real 29.65 user 3.76 sys
pass 2:
encoded 1486 frames, 93.30 fps, 1009.21 kb/s
16.08 real 83.08 user 2.10 sys
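The real/user/sys figures above also say how many cores were kept busy. The sketch below divides CPU seconds by wall-clock seconds. The 8-core count is an assumption taken from the machine described earlier in the thread, and the function names are invented for this example.

```python
def busy_cores(real_s: float, user_s: float, sys_s: float) -> float:
    """Average number of cores kept busy: CPU seconds divided by wall-clock seconds."""
    return (user_s + sys_s) / real_s


def utilization_percent(real_s: float, user_s: float, sys_s: float, cores: int) -> float:
    """Average utilization across all cores, in percent."""
    return busy_cores(real_s, user_s, sys_s) / cores * 100.0


if __name__ == "__main__":
    CORES = 8  # assumption: the octa-core box described in the thread
    runs = {
        "pass 1": (12.38, 29.65, 3.76),  # real, user, sys in seconds
        "pass 2": (16.08, 83.08, 2.10),
    }
    for name, (real_s, user_s, sys_s) in runs.items():
        cores_used = busy_cores(real_s, user_s, sys_s)
        util = utilization_percent(real_s, user_s, sys_s, CORES)
        print(f"{name}: {cores_used:.1f} cores busy on average, {util:.0f}% of {CORES} cores")
```

For these two runs that works out to roughly 2.7 busy cores (about 34%) on the first pass and 5.3 busy cores (about 66%) on the second.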
morph166955
11th April 2007, 20:48
So I got some Linux runs for you guys to contemplate. These were generated by using my XP laptop and mencoder to build the source file from the benchmark's avs file (converted it into a raw YUV file for x264 to use). Once I got that file built, I copied it onto my Linux setup (I'm now running Fedora Core 6 because of SMP/thread problems with BSD) and then used x264 with the exact options (I changed the stats file location, but that's it) from the job1-2 and job1-3 files. I had a script run the two x264 passes 10 times. Below are the fps/times from all the first passes and then all the second passes. You must divide the CPU % by 8 (I believe it's 8, not 10) to get the average CPU usage per core. As you can see, my first pass is only getting ~45% avg CPU usage and my second pass is getting ~80% avg CPU usage. I'm trying to get this to go faster, but I'm not sure what more I can do. Piping the video through mencoder using the comparable video filters gets me about the same CPU utilization and frame rates.
PASS1:
encoded 1488 frames, 189.16 fps, 989.84 kb/s
27.75user 0.42system 0:07.87elapsed 357%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 182.12 fps, 989.84 kb/s
28.00user 0.41system 0:08.17elapsed 347%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 187.77 fps, 989.84 kb/s
27.84user 0.41system 0:07.93elapsed 356%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 188.89 fps, 989.84 kb/s
27.83user 0.46system 0:07.88elapsed 358%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 188.22 fps, 989.84 kb/s
27.87user 0.42system 0:07.91elapsed 357%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 187.57 fps, 989.84 kb/s
27.79user 0.40system 0:07.94elapsed 355%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 185.17 fps, 989.84 kb/s
27.85user 0.41system 0:08.04elapsed 351%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 187.65 fps, 989.84 kb/s
27.78user 0.46system 0:07.93elapsed 356%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 185.20 fps, 989.84 kb/s
27.72user 0.42system 0:08.04elapsed 350%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 188.29 fps, 989.84 kb/s
27.77user 0.44system 0:07.91elapsed 356%CPU (0avgtext+0avgdata 0maxresident)k
PASS2:
encoded 1488 frames, 116.38 fps, 1009.85 kb/s
84.39user 0.90system 0:12.87elapsed 662%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 116.82 fps, 1009.74 kb/s
84.26user 0.91system 0:12.82elapsed 664%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 116.19 fps, 1009.75 kb/s
84.25user 0.90system 0:12.88elapsed 660%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 116.74 fps, 1009.39 kb/s
84.41user 0.91system 0:12.83elapsed 665%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 116.34 fps, 1009.38 kb/s
84.42user 0.84system 0:12.87elapsed 662%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 116.08 fps, 1009.44 kb/s
84.27user 0.93system 0:12.90elapsed 660%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 116.15 fps, 1009.78 kb/s
84.28user 0.89system 0:12.89elapsed 660%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 115.93 fps, 1009.79 kb/s
84.22user 0.92system 0:12.91elapsed 659%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 115.99 fps, 1009.38 kb/s
84.24user 0.87system 0:12.91elapsed 659%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 115.59 fps, 1009.48 kb/s
84.33user 0.87system 0:12.95elapsed 657%CPU (0avgtext+0avgdata 0maxresident)k
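Averaging ten runs by eye is tedious, so here is a hedged sketch that pulls the fps and %CPU values out of output shaped like the lines above and summarizes them. The regular expressions only cover the two line formats shown in this post, the sample text is the first two pass-1 runs, and dividing by 8 cores follows the post's own rule.

```python
import re
import statistics

FPS_RE = re.compile(r"encoded (\d+) frames, ([\d.]+) fps, ([\d.]+) kb/s")
TIME_RE = re.compile(r"([\d.]+)user ([\d.]+)system (\d+):([\d.]+)elapsed (\d+)%CPU")


def summarize(log_text: str, cores: int = 8) -> dict:
    """Mean/stdev of fps and mean per-core utilization for a block of run output."""
    fps = [float(m.group(2)) for m in FPS_RE.finditer(log_text)]
    cpu = [int(m.group(5)) for m in TIME_RE.finditer(log_text)]
    return {
        "runs": len(fps),
        "mean_fps": statistics.mean(fps),
        "stdev_fps": statistics.stdev(fps) if len(fps) > 1 else 0.0,
        "mean_util_per_core": statistics.mean(cpu) / cores,
    }


# First two pass-1 runs from the post, used here as sample input.
SAMPLE = """\
encoded 1488 frames, 189.16 fps, 989.84 kb/s
27.75user 0.42system 0:07.87elapsed 357%CPU (0avgtext+0avgdata 0maxresident)k
encoded 1488 frames, 182.12 fps, 989.84 kb/s
28.00user 0.41system 0:08.17elapsed 347%CPU (0avgtext+0avgdata 0maxresident)k
"""

if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(f"{summary['runs']} runs, mean {summary['mean_fps']:.2f} fps, "
          f"{summary['mean_util_per_core']:.1f}% per core")
```

Pasting the full PASS1 or PASS2 block into `summarize` gives the per-pass averages without any manual arithmetic.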
graysky
11th April 2007, 21:05
Pretty fast numbers, too bad you can't max it out. To put things into perspective, for me anyway, I'm using my Athlon XP 3200+ to encode some movies so I can watch them on my laptop. I'm getting about 23 fps on the first pass and 5 fps on the second!
rakan
14th April 2007, 16:43
I get an error when I try to run this on my machine. It compresses the audio file fine but runs into an error on the first pass. Here is a printout of my log...
Looking for job processor for job...
Processor found!
Starting job job1-1 at 11:44:30 PM
Starting preprocessing of job...
Preprocessing finished!
encoder commandline:
successfully started encoding
Processing ended at 11:44:36 PM
----------------------------------------------------------------------------------------------------------
Log for job job1-1
Channels=2, BitsPerSample=16, SampleRate=48000Hz
C:\Program Files\megui\tools\lame\lame.exe --abr 128 -h --silent - "C:\work\test-new T01 DELAY 0ms.mp3"
----------------------------------------------------------------------------------------------------------
Starting postprocessing of job...
Job completed successfully and deletion of intermediate files is activated
Postprocessing finished!
Looking for job processor for job...
Processor found!
Starting job job1-2 at 11:44:36 PM
Starting preprocessing of job...
Job 'job1-2' requires bitrate calculation. Calculating now...
Found another video job: job1-3.
Found audio stream: job1-1.
The video job has a desired final output size of 16777216 bytes and video bitrate of 1000kbit/s
Examining audio jobs found...
Audio job 'job1-1':
This job completed successfully, taking size into account...The size is of this track is 906648 bytes, and the type is MP3. Taking this into account in the bitrate calculation.
Desired video size after substracting audio size is 15483KBs. Setting the desired bitrate of the subsequent video jobs to 2045 kbit/s.
Preprocessing finished!
Calling setup of processor failed with error The file C:\work\test-NEW.avs cannot be opened.
Error message for your reference: Script error: there is no function named "Spline36Resize"
(C:\work\test-NEW.avs, line 8)
How would I fix this? I put the appropriate DLL files in the AviSynth plugins directory.
graysky
14th April 2007, 17:03
Interesting... it seems to be complaining about the resize step. I thought the resize functions were built into the app (i.e. didn't require a plugin). Are you using the latest version of Avisynth? If so, can you post the contents of your /avisynth/plugins dir here. Do it like this:
-open a console window (start>run type cmd and hit ok)
-cd to your avisynth plugins dir like cd "Program files\avisynth\plugins"
-type the following: dir > list.txt
then open the list.txt file and paste it in here. Here is mine for comparison:
Directory of C:\Program Files\AviSynth 2.5\plugins
12/24/2006 05:12 AM 24,576 DirectShowSource.dll
12/24/2006 05:12 AM 112,640 TCPDeliver.dll
07/05/2005 08:04 AM 7,129 colors_rgb.avsi
03/15/2007 03:00 PM 69,632 FluxSmooth.dll
03/15/2007 03:00 PM 2,541 Convolution3DYV12.txt
03/15/2007 03:00 PM 7,530 Convolution3d.txt
03/15/2007 03:00 PM 57,344 Convolution3DYV12.dll
03/15/2007 03:00 PM 53,248 UnDot.dll
03/15/2007 03:00 PM 2,640 Readme_UnDot.txt
03/15/2007 03:00 PM 6,456 readme.html
03/15/2007 03:00 PM 90,112 EEDI2.dll
03/15/2007 03:00 PM 5,301 README.txt
03/15/2007 03:00 PM 86,016 Decomb.dll
03/15/2007 03:00 PM 10,929 DecombFAQ.html
03/15/2007 03:00 PM 45,802 DecombReferenceManual.html
03/15/2007 03:00 PM 11,992 DecombTutorial.html
03/15/2007 03:00 PM 319,488 LeakKernelDeint.dll
03/15/2007 03:00 PM 7,670 LeakKernelDeintHelp.html
03/15/2007 03:00 PM 704,596 TomsMoComp.dll
03/15/2007 03:00 PM 188,416 TDeint.dll
03/15/2007 03:00 PM 36,189 tdeint.htm
03/15/2007 03:00 PM 48,285 TFM - READ ME.txt
03/15/2007 03:00 PM 1,883 TIVTC - READ ME.txt
03/15/2007 03:00 PM 444,416 TIVTC.dll
03/15/2007 03:00 PM 2,233 FieldDiff - READ ME.txt
03/15/2007 03:00 PM 5,391 FrameDiff - READ ME.txt
03/15/2007 03:00 PM 2,319 MergeHints - READ ME.txt
03/15/2007 03:00 PM 3,296 RequestLinear - READ ME.txt
03/15/2007 03:00 PM 7,396 ShowCombedTIVTC - READ ME.txt
03/15/2007 03:00 PM 44,117 TDecimate - READ ME.txt
03/15/2007 03:00 PM 27,827 DGDecodeManual.html
03/15/2007 03:00 PM 221,262 DGDecode.dll
03/15/2007 03:00 PM 7,712 Readme_SimpleResize.txt
03/15/2007 03:00 PM 102,400 ColorMatrix.dll
03/15/2007 03:00 PM 454,656 NicAudio.dll
03/15/2007 03:00 PM 1,343 NicAudio_Readme.txt
03/15/2007 03:00 PM 61,440 SimpleResize.dll
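The same comparison can be scripted instead of eyeballing two directory listings. This is only a sketch: the function names are made up, and the list of required DLLs has to be supplied by whoever runs it. It checks plugin files only, so it would not diagnose a missing function that is built into AviSynth itself, which is what graysky suspects about the resizer.

```python
from pathlib import Path


def list_plugins(plugin_dir):
    """Return (name, size_in_bytes) for every .dll in plugin_dir, sorted by name."""
    return sorted(
        (p.name, p.stat().st_size)
        for p in Path(plugin_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".dll"
    )


def missing_plugins(plugin_dir, required):
    """Return the required DLL names that are not present (case-insensitive match)."""
    present = {name.lower() for name, _ in list_plugins(plugin_dir)}
    return [name for name in required if name.lower() not in present]


if __name__ == "__main__":
    # Path taken from the listing above; adjust it to the local install.
    plugin_dir = r"C:\Program Files\AviSynth 2.5\plugins"
    # DLL names taken from the listing above; which ones a given script needs
    # is for the user to decide.
    wanted = ["DGDecode.dll", "Decomb.dll"]
    if Path(plugin_dir).is_dir():
        print("missing:", missing_plugins(plugin_dir, wanted))
    else:
        print("plugin directory not found:", plugin_dir)
```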
rakan
14th April 2007, 18:26
Thanks for your help graysky,
I uninstalled AviSynth, uninstalled MeGUI, reinstalled both, and updated the plugins as your instructions.txt said, and everything works. :)
graysky
14th April 2007, 19:19
Cool, I'll add your results when you post them. Be sure you're using the older version of x264.exe as mentioned in the instructions.txt so your results match the others in the table.
graysky
28th April 2007, 03:35
Updated the table with my new quad core system (Q6600); it's literally over 5x faster than my aged Athlon XP. Thanks to all who participated in this little thread that grew into a monster here!
graysky
28th April 2007, 13:51
But you should remember their TJunction is higher than C2Ds: 100C vs. 85C
Jeffy: what is your source for that statement?
jeffy
29th April 2007, 07:02
Jeffy: what is your source for that statement?
However, there is a problem. That reading although available in °C is _relative_ to the Tjunction which is CPU specific. For some models it is 100°C for others it is 85°C.
http://softwarecommunity.intel.com/isn/Community/en-US/forums/thread/30222546.aspx
(I) Core Temp 0.9X Tjunction 85c or 100c is Intel’s Tj max spec, is not a temp, and doesn't change.
http://forumz.tomshardware.com/hardware/Core-Duo-Temperature-Guide-ftopict221745.html
But to be clear, here is:
Some steppings of the mobile Intel® Core™2 processor do indicate Tj to be approximately 85 or 100 via a single bit in the EXT_CONFIG register (msr 0EEh) but desktop, workstation and server processors do not. Nor is there a register implemented in those processors that software can read to get the Tj value for either the Pentium® 4 processor, Intel® Xeon® processors or Intel® Core™2 processors.
http://softwarecommunity.intel.com/isn/Community/en-US/forums/post/30230975.aspx
The above statement is (according to http://softwarecommunity.intel.com/isn/Community/en-US/forums/permalink/30222546/30231056/ShowThread.aspx#30231056): "Note that the information in the post immediately before this one came from Intel, specifically from one of our internal contacts for the processor documentation, and is what we have been working with our friends on the hardware design side of Intel to get for you."
In all implementations the IA32_THERM_STATUS[22:16] value is relative to PROCHOT assertion and not an absolute temperature.
No, quite higher than that. Tjunction is 100 C on quads.
http://www.xtremesystems.org/forums/showthread.php?t=139857&page=2
Regarding Intel's note: there probably isn't much more available, AFAIK, to the general public or anyone who didn't sign the NDA as a company.
So until there is more information publicly available, I have to take the value shown in CoreTemp as the correct value, which is 100°C for Intel quad-cores and 85°C for most Intel dual-cores.
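A tiny worked example of why the assumed Tjunction matters. It is a sketch under one assumption: the sensor reports how many degrees the core is below Tjunction, so the absolute temperature is Tjunction minus the reading. The function name and the sample reading of 40 are invented for illustration.

```python
def core_temperature(tjunction_c: float, degrees_below_tjunction: float) -> float:
    """Absolute core temperature, assuming the sensor reports distance below Tjunction."""
    return tjunction_c - degrees_below_tjunction


if __name__ == "__main__":
    reading = 40  # hypothetical sensor value, in degrees below Tjunction
    for tjunction in (100, 85):
        temp = core_temperature(tjunction, reading)
        print(f"Tjunction {tjunction} C, reading {reading}: core at {temp} C")
```

The same raw reading gives temperatures 15°C apart depending on which Tjunction is assumed, which is why the 100°C vs. 85°C question matters when comparing quads and dual-cores.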
graysky
29th April 2007, 16:42
Wow, as usual thanks kindly for the very detailed reply (and with references to boot!) Too bad this small piece of info isn't officially published somewhere. I've been shooting for a TAT temp of 65C which limits my o/c to 9x333 but I'm sure it could do 9x400 or more (others are doing it). Oh well, 9x333 is over 5x faster than my old PC :p
Blue_MiSfit
2nd May 2007, 07:56
I got a new 3800+ X2 Toledo today :)
Here's some benchies:
Old CPU (3500+, @ standard 2.2GHz)
http://img479.imageshack.us/img479/4651/3500stockmq2.png (http://imageshack.us)
New CPU (3800+ Dual Core @ standard 2.0GHz)
http://img267.imageshack.us/img267/8302/3800x2stocksl8.png (http://imageshack.us)
New CPU (3800+ Dual Core @ 2.45 GHz :))
http://img267.imageshack.us/img267/1365/3800at245ghzri5.png (http://imageshack.us)
I'm also (as a test) re-encoding some 1080p footage (QuickTime trailer, H.264 @ 10mbps) into CRF22 x264, high profile with all the usual fancy settings. Averaging 2fps using DGAVCDecode to feed AviSynth. This will be better soon (I think).
~MiSfit
delacroixp
2nd May 2007, 14:37
A bit similar to x264 multi-core (4+) threading optimization (http://forum.doom9.org/showthread.php?t=124557) ...
I'll certainly recommend the same at AutoMKV ...
@graysky
Gr8 rig...
:):D:eek:
Pascal
graysky
3rd May 2007, 02:36
Updated
Is it possible to post here the avs code for the best quality you have with multithreading (for an 8-core Clovertown), for an SD (720x576) video with WAV sound?
Filter:
Deinterlace
No Crop
Thx
graysky
12th May 2007, 18:06
@ogg: that's going to vary depending on the source. The key to using multiple threads is in the commandline for x264.exe.
The --threads auto switch should take care of it. Have a look at the job1-3.xml file in the rar to see what I mean. Have you tried the test encode contained in the first post of this thread?
the test encode contained in the first post of this thread
But the avs file isn't set up for multithreading: I don't see MT(2,0) and I don't know where I have to write it :(
My octo-core @ 2.9 doesn't encode very quickly :(:(:(
So, what can I do to optimise?
My source is a QuickTime DV PAL (720x576, lower field) file with a .mov extension
graysky
13th May 2007, 12:41
Well, the avs IS for multithreaded. If you edit the job1-x.xml in notepad, you'll see the --threads auto switch as well as the <NbThreads>0</NbThreads> tag.
MT.dll will allow some plugins to go faster, but the actual x264.exe won't benefit from them.
Please post your FPS1 and FPS2 using the unmodified jobs files. If you look in the table, morph has a 8-core system with some data in there.
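For anyone who would rather check the thread setting with a script than with Notepad, here is a hedged sketch. Only the <NbThreads> tag name comes from this thread. The surrounding XML in the sample is a made-up stand-in, since the real layout of the job files isn't shown here, and the code simply searches for the tag wherever it sits.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a job file; only the NbThreads tag is taken from the thread.
HYPOTHETICAL_JOB_XML = """<Job>
  <Settings>
    <NbThreads>0</NbThreads>
  </Settings>
</Job>"""


def thread_settings(xml_text: str) -> list:
    """Return every NbThreads value found anywhere in the document, as integers."""
    root = ET.fromstring(xml_text)
    return [int(element.text) for element in root.iter("NbThreads")]


if __name__ == "__main__":
    for value in thread_settings(HYPOTHETICAL_JOB_XML):
        print("NbThreads =", value)
```

To check a real job file, read its contents into a string and pass that to `thread_settings`.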
FPS1 = 81.86 / (80.17 with <NbThreads>8</NbThreads>)
FPS2 = 68.35 / (70.27 with <NbThreads>8</NbThreads>)
2x Xeon Clovertown 5330ES @ 2.66
2x1 GB RAM Crucial PC667 FBDIMM
Maxtor 146 GB 15k
Vista Ultimate 64-bit
morph has: fps1 = 143 / fps2 = 70 with mt.dll
I would like to get the same performance by using mt.dll... but I don't understand how to do it :(
aicjofs
14th May 2007, 07:45
Grab MT.dll and the modded AviSynth here. 6th down, I think.
http://avisynth.org/tsp/
Then add SetMTMode(2,0) as the first line of test-NEW.avs in the "work" folder from graysky's first post of the thread:
SetMTMode(2,0)
global MeGUI_darx = 4
global MeGUI_dary = 3
DGDecode_mpeg2source("C:\work\test-new.d2v")
AssumeTFF()
Telecide(guide=1,post=2,vthresh=35) # IVTC
Decimate(quality=3) # remove dup. frames
crop( 2, 0, -10, -4)
Spline36Resize(640,480) # Spline36 (Neutral)
That's the simple way for this test. Each filter can use a different mode of MT.dll, and for true optimization the mode would be set accordingly throughout the script on a per-filter basis, but for a 4- or 8-core setup SetMTMode(2,0) is the easiest/quickest way to get a good boost. For an 8-core system it still seems x264 is the limiting factor with the current implementation. There is some good work going on in that department up in the AVC section of the forum.
I did what you said
but I have an error :
Log for job job1-2
avis [error]: unsupported input format (DIB )
x264 [error]: could not open input file 'C:\work\test-NEW.avs'
----------------------------------------------------------------------------------------------------------
The current job contains errors. Skipping chained jobs
... what is the problem ?
foxyshadis
14th May 2007, 14:45
Open it up in virtualdub and find out. (DIB ) format means an avisynth error.
I don't understand : what files do you want me to open in virtual dub ??
jeffy
15th May 2007, 04:12
could not open input file 'C:\work\test-NEW.avs'
You should try to open the file 'C:\work\test-NEW.avs'.
graysky
16th May 2007, 02:24
@Ogg - updated the table in the first thread w/ your results. Also, did you get the mt version of avisynth to work?
VirtualDub fails too:
http://4u9ur.free.fr/images/VBError.gif
the mt.dll is in the avisynth plugin folder ....
Ogg: you need to copy the avisynth.dll file supplied in the mt package to the windows\system32 directory and overwrite the default avisynth.dll file (as the SetMTMode function is inside avisynth.dll).
OK, so on Vista 64-bit, the dll has to be in Windows\SysWOW64
job 1-1 : ok
job 1-2 :
fps goes up to 143.13, then midway through the process the fps drops, and then nothing! (with the default video profile)
Starting job job1-2 at 20:45:41
Starting preprocessing of job...
Preprocessing finished!
encoder commandline:
--pass 1 --bitrate 2040 --stats "C:\work\test-NEW.stats" --bframes 3 --b-pyramid --direct auto --subme 1 --analyse none --vbv-maxrate 25000 --me dia --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --output NUL "C:\work\test-NEW.avs"
successfully started encoding
graysky
17th May 2007, 01:38
No idea man... what if you use a DVD source and do nothing but x264 like this:
DGDecode_mpeg2source("D:\source.d2v")
Does it use all 8 cores then?
I use this .avs (yours + SetMTmode)
SetMTMode(2,0)
global MeGUI_darx = 4
global MeGUI_dary = 3
DGDecode_mpeg2source("C:\work\test-new.d2v")
AssumeTFF()
Telecide(guide=1,post=2,vthresh=35) # IVTC
Decimate(quality=3) # remove dup. frames
crop( 2, 0, -10, -4)
Spline36Resize(640,480) # Spline36 (Neutral)
But, I need to know a thing :
What "Video Profile" do I have to use ?
=> in Default : midway through the process the fps drops, and then nothing
=> CQ Lossless : ok it is fast (about 135 first pass and 70 second pass)
=> HQ Insane : very slow
graysky
17th May 2007, 02:12
But, I need to know a thing :
What "Video Profile" do I have to use ?
=> in Default : midway through the process the fps drops, and then nothing
=> CQ Lossless : ok it is fast (about 135 first pass and 70 second pass)
=> HQ Insane : very slow
If you're doing the test the way you're supposed to, you don't need to use a video profile at all - it's hard coded in the jobs files.
ogg make sure you're using the latest version of MT:
http://www.avisynth.org/tsp/MT_07.zip
So, now, with a lot of software open and only 2x1 GB RAM in dual channel:
http://4u9ur.free.fr/images/MeGUIMT07.gif
with MT07
2x Xeon Clovertown 5330ES 2,13 @2,9 Ghz
I think it could be better if the multiplier didn't step down from 8.0x to 6.0x but always stayed at 8.0x
And now :
MeGUI 0.24.1041
Avisynth 2.5.7
MT.dll v0.7
x264r655
http://4u9ur.free.fr/images/mt07x264r655.gif
8 cores are used between 97% to 100% !!
greeeeeaaattt
thank you everybody ;):thanks:
With the same hardware and software configuration, but:
MeGui set in task manager to priority > Real Time
x264 set in task manager to priority > Real Time (twice : one for the first pass, one for the second pass)
http://4u9ur.free.fr/images/mt07x264r655RT.jpg
Next week I'm upgrading to 4x1 GB RAM in quad channel (FBDIMM memory on the i5000x chipset)
graysky
17th May 2007, 20:21
I'm in the process of making a new test that uses no filters except the DGDecode_mpeg2source and loop. My plan is to upload a 7 second 720x480 23 fps progressive clip and have the avs loop it out to like 5-10 min. My curiosity is how the octa core systems (Ogg this means you) will handle it with no filters. MT.dll shouldn't help with this since all the processing power comes from x264.exe. Lemme work it out and upload it here.
Ogg: are you game to try it?
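As a rough aid for the looping plan above, the repeat count needed to stretch a 7 second clip to 5-10 minutes can be worked out as follows. This is a sketch only: the function names are hypothetical, and the frame rate is an assumption (the post's "23 fps" is read here as the 23.976 fps film rate).

```python
import math


def loops_needed(clip_seconds: float, target_minutes: float) -> int:
    """Smallest whole number of repeats that reaches the target length."""
    return math.ceil(target_minutes * 60 / clip_seconds)


def total_frames(clip_seconds: float, loops: int, fps: float) -> int:
    """Approximate frame count of the looped clip."""
    return round(clip_seconds * loops * fps)


CLIP_SECONDS = 7             # from the post
ASSUMED_FPS = 24000 / 1001   # assumption: "23 fps" means 23.976 fps

for minutes in (5, 10):
    n = loops_needed(CLIP_SECONDS, minutes)
    print(minutes, "min ->", n, "loops,",
          total_frames(CLIP_SECONDS, n, ASSUMED_FPS), "frames")
```

The resulting repeat count is what would be passed to the loop call in the avs.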
yeahh, with pleasure !
is it possible to play with these files too :
http://4u9ur.free.fr/procotest/index.mht
??
Quicktime 1'20" => .mov 230 MB
Source : HDCAM, downscaled in QuickTime DV Lower field 25 fps
SetMTMode(2,0)
DirectShowSource("C:\procotest\LAOS.mov",fps=25,audio=false)
ConvertToYV12()
edeintted = last.AssumeBFF().SeparateFields().SelectEven().EEDI2(field=-1)
TDeint(order=0,full=false,edeint=edeintted)
#denoise
Video profiles = HQ Insane
FPS1 = 15,65
FPS2 = 15,41
graysky
17th May 2007, 21:02
OK.. here's the link to the new work file (it's just like the original): click here to download (http://home.insightbb.com/~pixels/test_encode/work2.rar)
Here is my result (q6600 @ 9x333 = 3.0 GHz). I'll make a table if we get some other results:
http://img375.imageshack.us/img375/5269/9x333hj0.jpg
My CPU usage was >99 % on the 2nd pass. Was yours on your 8 core system?
mitsubishi
17th May 2007, 21:33
Hi, I see you are talking about making some changes, should I be OK to run these tests tomorrow?
Well I say tomorrow, I rang up who I ordered some RAM off today and they hadn't noticed I'd sent in my payment, so hopefully I'll get it tomorrow, or it'll be Monday.
My mobo supports DDR and DDR2 (only up to 667 though)
But right now I have 2x512mb CL2.5 DDR400 and the new RAM is OCZ platinum DDR2-800 2x1GB. So I should be able to get some results showing the effect of memory speed on the whole thing. CPU is E6300, nice and cool with the zalman CNPS9700 I have on it, but won't get much of an overclock with this board.
Any suggestions what I should run? Stock at DDR-266,333,400 and DDR2-533, 667 with best and LCD timings and the same at max OC?
http://4u9ur.free.fr/images/work2.gif
8 Cores usage was between 86 -> 96 % on the 2nd pass.
graysky
17th May 2007, 22:28
Any suggestions what I should run? Stock at DDR-266,333,400 and DDR2-533, 667 with best and LCD timings and the same at max OC?
I dunno man, experiment with it would be my suggestion. I find that DRAM:CPU should be 1:1 for the best results. YMMV.
graysky
17th May 2007, 22:30
@Ogg: ok... the 2nd pass is really showing the power of those extra 4 cores. What was your CPU usage on the original (rich with filter) test?
As I said : http://forum.doom9.org/showpost.php?p=1004344&postcount=318
8 cores are used between 97% to 100% !!
graysky
17th May 2007, 22:37
Missed that, cool
http://forum.doom9.org/showpost.php?p=1004480&postcount=322 kill the workstation
mitsubishi
17th May 2007, 23:51
I dunno man, experiment with it would be my suggestion. I find that DRAM:CPU should be 1:1 for the best results. YMMV.
Yeah the plan was to run at 533 with as much OC as I can get (600 hopefully) and hope to get 3-3-3-10 timings out of the RAM at that until I get a better mobo.
But It'll be interesting to see what is best for x264 as that is the thing that most needs optimizing.
http://4u9ur.free.fr/images/MeGUI4Go.gif
and with 4x 1 GB (quad-channel mode on i5000x), 2x Xeon 5330 @ 2,9 ^^
morph166955
22nd May 2007, 17:12
Hey all, not sure how, but I somehow got unsubscribed from this thread a while back, so I haven't seen the updates for it in my inbox and kind of forgot about it. Graysky, I noticed that you were wanting some 8-core benches from Ogg; anything I can do to help out with that on my 8-core? I'm running Linux, but I could run avisynth via wine and then just pipe that to a raw x264 cli. I realize the results won't be perfectly identical, but since megui is just a frontend for the x264 cli/avisynth, I don't see any reason it won't work well enough for our purposes. It could also nicely show the Linux vs Windows differences on the same source/settings. I'm also contemplating loading XP64 into a vmware console on the Linux box, although I don't know if that will let me pass more than 2 CPUs to the system (my windows box only shows options for 2 CPUs as of right now). Lemme know!
graysky
22nd May 2007, 19:21
I think if you run it through WINE you won't get a true read since WINE has overhead associated with it. Plus I dunno how I'd comment your data in the table so that people who don't know about WINE/LINUX could understand it.
If you're game though, give it a shot and post your results. It'd be interesting to see.
graysky
22nd May 2007, 19:22
http://4u9ur.free.fr/images/MeGUI4Go.gif
and with 4x 1 GB (quad-channel mode on i5000x), 2x Xeon 5330 @ 2,9 ^^
I'd like to update the table with this result, but I don't understand it. You o/c'ed to 2.9 GHz from 2.6 GHz and went from 82 fps on pass 1 to more than double that?
anahita
23rd May 2007, 11:11
Cool ;)
You o/c'ed to 2.9 GHz from 2.6 GHz
My Xeon are 5330 = 2,16 Ghz
With BSEL Modification = 2,66 Ghz => http://forums.2cpu.com/showthread.php?t=77937
+ with Systool = 2,93 Ghz => http://www.techpowerup.com/systool/
went from 82 fps on pass 1 to more than double that
Before I had 2x1 GB = Dual Channel
Now : 4x1 GB = Quad Channel (on chipset i5000x)
FPS1 = 81,86 => without mt.dll, without the latest x264 (r655), without 4x1 GB (only 2x1 GB), no priority set
FPS1 = 163,13 => with MeGUI 0.24.1041, Avisynth 2.5.7, MT.dll v0.7, x264r655 set to real time in the task manager
morph166955
23rd May 2007, 16:16
mt.dll is the key to the whole thing. It's equally evident on my passes. Without mt.dll my system ran almost identically to Ogg's non-mt.dll pass. With it I hit double what I did without it.
Also Ogg, while the RAM is running mildly better in quad channel mode, I doubt it did a whole lot speed-wise. Also FYI, running x264 in realtime mode is actually bad and can slow it down. If x264 takes precedence over avisynth process-wise, it has the potential to slow down the whole thing instead of speeding it up! If you're going to attempt to adjust process priority, I would kick x264, avisynth and megui (probably not important but can't really hurt) up to AboveNormal... MAYBE High, although I think that would cause some issues also. Setting things to realtime, while it sounds like it should give them the most priority, can actually cause issues. It's not at all the equivalent of setting something's nice value to -20 in Linux, if that's how you're thinking it works. I normally just leave my stuff alone to handle itself, because as long as you're not running anything else that's a CPU hog during the encode, the processes will get all the CPU time they need anyway.
mitsubishi
23rd May 2007, 16:22
Had a nightmare trying to get it to work; kept getting DGDecode mismatches. It didn't work until I deleted all traces of DGDecode (WTF!), yet it still works now that I've put the included dll back.
Got 42.78 / 13.82 with current x264, and 40.3 / 13.18 with correct version.
Will do it all properly now I got it set up correctly and report back later.
mitsubishi
23rd May 2007, 18:06
Finished the DDR1 tests, not doing any overclocking with DDR.
Motherboard: ASRock 775Dual-VSTA
CPU: Intel E6300 B2
Gfx: 6800GT AGP
Ram: 2x512
Tested at SPD and lowest common denominator for each speed.
http://img504.imageshack.us/img504/3572/ddrresultsxa7.png
Clearly a difference at the lower speeds of DDR, will swap out the ram for DDR2 later and test. The only odd one is the first pass on the last test...
graysky
23rd May 2007, 19:55
http://img504.imageshack.us/img504/3572/ddrresultsxa7.png
Clearly a difference at the lower speeds of DDR, will swap out the ram for DDR2 later and test. The only odd one is the first pass on the last test...
Do you really think there's a difference here? One thing I never did was to run it like 10 times and calculate a SD and range to give error bars. At a first glance, it seems the DDR-400 is faster than the DDR-333 but without the error calculated, it's a tough call.
mitsubishi
23rd May 2007, 20:47
There certainly seems to be a trend in the right direction. I did consider running a few iterations since the test is so short; the first pass in particular is a bit too quick and will be thrown off by anything in the background. Perhaps I should have set priority to high to be on the safe side.
If you look at the last three results for 2nd pass,
266 > 1.5% > 333 > 1.8% > 400
It's not much, but I think enough to say it is not experimental error, no-one would expect it to be much on such a CPU intensive operation. Even a few percent is welcome on long jobs.
I'm not expecting there to be a noticeable difference between 533 and 667, but we'll see.
graysky
23rd May 2007, 20:53
Could be statistically significant. I dunno for sure since I'm no statistician. I just wanted to raise the question with nothing implied either way. I think the major speed differences come from RAM timings but I don't have any data to back that statement up, just a hunch.
graysky
26th May 2007, 21:54
So I ran the test 10 times and found when the hardware is unchanged, the results were all within 3% of each other for pass1 and within 1-1/2% of each other for pass2. If your machine is like mine with those errors, if the delta FPS is within 3% on pass1 and/or 1-1/2% on pass2, the results are identical.
It'll be interesting if you do that same n=10 experiment and calculate your errors as well.
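The n=10 experiment above amounts to computing a mean, standard deviation and spread for repeated runs, then checking whether a difference between two results exceeds that run-to-run noise. A minimal sketch in Python follows; the function names are hypothetical, and only the 3% (pass 1) and 1.5% (pass 2) thresholds come from the post.

```python
from statistics import mean, stdev


def summarize(runs):
    """Mean, sample SD and max-min spread (as % of mean) for repeated runs."""
    avg = mean(runs)
    return {
        "mean": avg,
        "sd": stdev(runs),
        "spread_pct": (max(runs) - min(runs)) / avg * 100,
    }


def distinguishable(fps_a, fps_b, noise_pct):
    """True if the relative gap between two results exceeds the noise band."""
    gap_pct = abs(fps_a - fps_b) / min(fps_a, fps_b) * 100
    return gap_pct > noise_pct


PASS1_NOISE_PCT = 3.0   # from the post: pass 1 runs fell within 3%
PASS2_NOISE_PCT = 1.5   # from the post: pass 2 runs fell within 1-1/2%
```

With these thresholds, a pass 2 gap of about 1.8% would sit just outside the noise band on a machine that behaves like the one described, while a gap of 1.5% would not.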
vjHeaven
28th May 2007, 00:33
My first desktop PC
CPU Type: AMD Athlon XP, 1666 MHz (12.5 x 133) 2000+
System Memory: 896 MB (DDR SDRAM @ 333Mhz)
http://img91.imageshack.us/img91/5493/56098685if1.jpg
My second desktop PC :D
AMD Athlon Thunderbird @ 1.20GHz (12.0 x 100.2)
Memory: 384MBytes (SDRAM @ 100Mhz)
http://img145.imageshack.us/img145/3934/72619801ro4.jpg
My Notebook (Acer TravelMate 2492NWLMi)
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz
Memory: 512MBytes DDR2, PC2-5300 (333 MHz)
http://img143.imageshack.us/img143/6949/12783526sg5.jpg
graysky
28th May 2007, 00:46
Cool, thanks for participating. I'll update the table tonight or tomorrow.
graysky
28th May 2007, 10:47
...updated
morph166955
28th May 2007, 17:55
@graysky
On the mt.dll image at the bottom of the first post it says "morph166955 used this avisynth line" but has nothing next to it. I used the same line aicjofs used, "SetMTmode(2,0)", to keep things uniform for the tests. Just figured I'd let ya know for the next time you're updating the images, if you wanted to include that.
graysky
28th May 2007, 23:28
I'll correct that, thanks for catching it.
simonhowson
5th June 2007, 06:33
I'm considering upgrading my computer so that it is faster at backing up DVDs to x264. I have over 1000 DVDs, so I want this process to be as fast as possible.
Specifically, I intend to use the MeGUI iPod video 5.5 profile, with the image 640 pixels wide, and an average bitrate of 1000 Kbps for the video, and ~100 Kbps AAC for the audio.
I'm wondering which of the following CPUs would be faster for encoding with these settings, either the Core 2 Quad 6600 (4 X 2.40 GHz) or a Core 2 Duo E6850 3 GHz (specifically the 9 x 333 MHz bus version, set for release on July 22nd).
Apparently on July 22nd both of these two CPUs will be the same price ~US$266. I'm just unsure if I should go for 2 cores @ 3 GHz on a 1333 MHz bus. Or 4 cores at 2.4 GHz, on a 1066 MHz bus. I don't intend to overclock, I just want the best performance at stock settings.
The closest comparison I can find shows the C2D E6700 at 9 x 333 has an encode ratio of 1.38:1. Whereas the C2Q 6600 at stock settings has an encode ratio of 1:1.06 (i.e., slightly faster than real time), which means the Core 2 Quad 6600 is about 31% faster.
Would encoding using the iPod 5.5 profile have similar results? Or does the benefit of Quad Core really depend on what encoding settings are in use at the time, thus making it hard to predict how much benefit the 2 extra cores are, and how this would compare to a faster dual core CPU?
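The two encode ratios quoted above can be compared in two ways, which give different percentages. The sketch below assumes each ratio means encode time : playback time (so 1.38:1 is slower than real time and 1:1.06 is slightly faster); that reading is an assumption, and the function name is hypothetical.

```python
def encode_time_per_playback(encode: float, playback: float) -> float:
    """Encode time needed per unit of playback time."""
    return encode / playback


c2d = encode_time_per_playback(1.38, 1.0)   # C2D figure: 1.38:1
c2q = encode_time_per_playback(1.0, 1.06)   # C2Q figure: 1:1.06

# Share of encode time saved by the quad relative to the dual.
time_saved_pct = (c2d - c2q) / c2d * 100

# How much more video the quad encodes in the same wall-clock time.
throughput_gain_pct = (c2d / c2q - 1) * 100

print(round(time_saved_pct, 1), round(throughput_gain_pct, 1))
```

Under this reading the quad needs roughly 31.6% less time, which matches the "about 31%" figure, while the same numbers expressed as throughput come to roughly 46% more frames per unit time.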
ditche
6th June 2007, 11:42
C2Q. :p
http://img402.imageshack.us/img402/7949/couper6dr7.png
http://img525.imageshack.us/img525/8574/couper7ge5.png
graysky
7th June 2007, 01:31
I'm considering upgrading my computer so that it is faster at backing up DVDs to x264. I have over 1000 DVDs, so I want this process to be as fast as possible.
Specifically, I intend to use the MeGUI iPod video 5.5 profile, with the image 640 pixels wide, and an average bitrate of 1000 Kbps for the video, and ~100 Kbps AAC for the audio.
I'm wondering which of the following CPUs would be faster for encoding with these settings, either the Core 2 Quad 6600 (4 X 2.40 GHz) or a Core 2 Duo E6850 3 GHz (specifically the 9 x 333 MHz bus version, set for release on July 22nd).
Apparently on July 22nd both of these two CPUs will be the same price ~US$266. I'm just unsure if I should go for 2 cores @ 3 GHz on a 1333 MHz bus. Or 4 cores at 2.4 GHz, on a 1066 MHz bus. I don't intend to overclock, I just want the best performance at stock settings.
The closest comparison I can find shows the C2D E6700 at 9 x 333 has an encode ratio of 1.38:1. Whereas the C2Q 6600 at stock settings has an encode ratio of 1:1.06 (i.e., slightly faster than real time), which means the Core 2 Quad 6600 is about 31% faster.
Would encoding using the iPod 5.5 profile have similar results? Or does the benefit of Quad Core really depend on what encoding settings are in use at the time, thus making it hard to predict how much benefit the 2 extra cores are, and how this would compare to a faster dual core CPU?
Bus speed really doesn't matter; see the first post of the thread. If you do a lot of x264, and it sounds like you do, the quad is the tool you want and will outperform the dual every time since there are 4 cores vs. 2 cores, and since x264 does a great job using all 4 cores in parallel.
Also, why get a Q6600 and run it @ stock values of 9x266? I've had mine clocked @ 9x333 (1333 quad pumped) for months. As long as you have decent cooling and keep your vcore low it should be fine.
graysky
7th June 2007, 01:32
Thanks for the graphic, ditche!
dancho
7th June 2007, 15:14
http://img528.imageshack.us/img528/9884/cpubb3.jpg (http://imageshack.us)
http://img403.imageshack.us/img403/7609/memko1.jpg (http://imageshack.us)
http://img201.imageshack.us/img201/4456/resultfs7.jpg (http://imageshack.us)
memory is DDR2 2*1GB PC6400 (800MHz) OCZ platinum XTC
mbo is Gigabyte N650SLI-DS4
all default and slightly better results....:rolleyes:
chainring
7th June 2007, 17:00
Nothing special here:
Acer Travelmate 8204 (laptop)
Tests done with "work2".
ditche
7th June 2007, 17:41
Thanks for the graphic, ditche!
Edited. :)
legoman666
7th June 2007, 20:04
finally got around to running this. Results could probably be better, computer hasn't been rebooted in 22 days.
http://img76.imageshack.us/img76/6677/cpuspeedkl3.jpg
http://img233.imageshack.us/img233/9499/meguioz2.jpg
As a side note, the .d2v file you included in the test rar was made using an old version of dgindex. To make it work with the new version, just open up the .d2v file in notepad and change the first line from "DGIndexProjectFile13" to "DGIndexProjectFile16". Besides that first line, the files are identical (the one made with the new version and the one made with the old version).
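The Notepad edit described above can be scripted. This is a sketch only: `retag_d2v` is a hypothetical helper, the two header strings are the ones quoted in the post, and it relies on the poster's observation that the files are otherwise identical, which may not hold for other DGIndex versions.

```python
OLD_HEADER = "DGIndexProjectFile13"  # quoted in the post
NEW_HEADER = "DGIndexProjectFile16"  # quoted in the post


def retag_d2v(text: str, old: str = OLD_HEADER, new: str = NEW_HEADER) -> str:
    """Swap the version tag on the first line; leave everything else alone.

    The text is returned unchanged unless the first line is exactly `old`
    (ignoring surrounding whitespace such as a trailing carriage return).
    """
    first, sep, rest = text.partition("\n")
    if first.strip() != old:
        return text
    return first.replace(old, new, 1) + sep + rest
```

Reading the file, passing its contents through `retag_d2v` and writing the result to a new file keeps the original .d2v intact in case the old DGDecode is needed again.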
graysky
9th June 2007, 12:14
@legoman: yeah, I think you're using the "dev" version of MeGUI. It downloads, and updates to version 0.2.4.1041 and if you notice when you run an update, all the plugins,apps, etc. are the dev (unstable) versions. Perhaps a bug?
I'll add the most recent results from you, chainring, and dancho in a few...
ditche
9th June 2007, 18:45
I want to restart the test... but which version of x264 file have I to use ?
And where can I find this file? Lapse of memory... :p
deets
9th June 2007, 18:48
download the older version of x264.exe from the below url and then manually copy it into your MeGUI\tools\x264 directory
http://mirror01.x264.nl/x264/?dir=./revision620
from the instructions :P
ditche
9th June 2007, 19:22
Yep, thank you, but it seems I've an old version of the file instructions.txt (02/25/07), because I don't see what you say. :)
ditche
9th June 2007, 19:37
Graysky, I think you can erase my bench with my A64 X2 4200+ @ 2.53 GHz...
RAM and CPU weren't synchronised...
Now RAM & CPU are at the same FSB : 220 MHz (before A64 @ 230), and the results are slightly better :
http://img354.imageshack.us/img354/6889/couperyd7.png
http://img354.imageshack.us/img354/7863/couper2ms2.png http://img354.imageshack.us/img354/7433/couper3um9.png
:)
graysky
10th June 2007, 02:49
OK! Finally updated the table... except for you, ditche. I'll get yours in there tomorrow when I have time.
Dr.Khron
11th June 2007, 03:18
Here you go. First my new work laptop:
http://img528.imageshack.us/img528/8048/notebookrun3st5.jpg
http://img148.imageshack.us/img148/6135/cpuzlaptop1hs4.jpg http://img201.imageshack.us/img201/82/cpuzlaptop2wg7.jpg
Now for the old Athlon desktop, with 0% OC and 25% OC:
http://img149.imageshack.us/img149/4082/desktop0ocev9.jpg
http://img149.imageshack.us/img149/9654/desktop25oczp5.jpg
http://img244.imageshack.us/img244/5744/cpuz0oc1wt0.jpg http://img505.imageshack.us/img505/8638/cpuz0oc2mg0.jpg
EDIT:
Both OC settings were with the memory = 100% of FSB
ditche
11th June 2007, 11:29
Dr.Khron : We need to see exactly the duration of your first test. :)
So, your 2100+ oc is @ 2,17 GHz ?
Graysky : OK, now I'm trying with a Sempron 2600+. :cool:
Dr.Khron
11th June 2007, 12:48
Um, why would you need to see the duration?
The FPS calc has more sig figs than the whole seconds provided by the start/stop times... just divide the number of frames by the FPS to figure the time elapsed.
If its that important to you, I'll have to run the test again.
As for the OC speed, I dunno, I didn't take a screenshot of CPUZ for OC'ed state. I would imagine its just:
(133 + 1/3 ) x 13.0 = 1,733
(166 + 2/3 ) x 13.0 = 2,167
As I said, I just cranked up the FSB and left all the other timings at stock, including the mem bus = 100% FSB.
Edit:
Oh DUH! 2,167 Mhz = 2.17 Ghz.... your euro-style use of commas as decimals confused me.
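The arithmetic in the post above (core clock = FSB x multiplier, elapsed time = frames / FPS) can be checked with a short sketch. The function names are hypothetical; the FSB values and the 13.0 multiplier are the ones given in the post, and the frame count used in the last line is a made-up illustration, not the test clip's real length.

```python
from fractions import Fraction


def core_clock_mhz(fsb_mhz, multiplier):
    """Core clock in MHz from front-side bus clock and CPU multiplier."""
    return float(fsb_mhz * multiplier)


def elapsed_seconds(frames, fps):
    """Encode duration recovered from a frame count and an average FPS."""
    return frames / fps


stock = core_clock_mhz(Fraction(400, 3), 13)   # 133 + 1/3 MHz FSB
oc = core_clock_mhz(Fraction(500, 3), 13)      # 166 + 2/3 MHz FSB

print(round(stock), round(oc))                 # clocks in MHz
print(round((oc / stock - 1) * 100))           # overclock in percent
print(elapsed_seconds(1000, 50.0))             # hypothetical: 1000 frames at 50 fps
```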
ditche
11th June 2007, 13:59
http://img523.imageshack.us/img523/1073/nouvelleimagemm3.png
http://img523.imageshack.us/img523/7079/nouvelleimage1ig2.png http://img523.imageshack.us/img523/7586/nouvelleimagerg5.png
ditche
11th June 2007, 18:34
Um, why would you need to see the duration?
The FPS calc has more sig figs than the whole seconds provided by the start/stop times... just divide the number of frames by the FPS to figure the time elapsed.
If its that important to you, I'll have to run the test again.
OK, don't run the test again, 53 sec and 158 sec for the 2 tests. :)
delacroixp
13th June 2007, 14:35
Um, why would you need to see the duration?
The FPS calc has more sig figs than the whole seconds provided by the start/stop times... just divide the number of frames by the FPS to figure the time elapsed.
FPS as an indication of speed is only valid if it's entirely constant...
A car (automobile) travelling at 90 km/h (55 mph) for 30 minutes and then 50 km/h (31 mph) uphill for another 30 minutes has an average speed of 70 km/h (43 mph).
For short encodes it's probably irrelevant but longer encodes are rarely consistent or constant...
I'm actually looking for an app that maps FPS over time... during H264 encodes... in the form of a graph.
:):D:eek:
Pascal
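The car example above is a time-weighted average, and the same calculation applies to an encode whose FPS changes partway through. A minimal sketch, with a hypothetical function name and made-up FPS figures for the second call:

```python
def average_rate(segments):
    """Overall rate for (rate, duration) segments: total work / total time."""
    total_time = sum(duration for _, duration in segments)
    total_work = sum(rate * duration for rate, duration in segments)
    return total_work / total_time


# The car from the post: 90 km/h for 0.5 h, then 50 km/h for 0.5 h.
car = average_rate([(90, 0.5), (50, 0.5)])

# Hypothetical encode: 60 fps for 10 s, then 20 fps for 30 s.
encode = average_rate([(60, 10), (20, 30)])

print(car, encode)
```

For the hypothetical encode the overall figure is 30 fps, not the 40 fps a plain average of the two rates would suggest, because the slow segment lasts three times as long.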
simonhowson
15th June 2007, 14:50
Bus speed really doesn't matter; see the first post of the thread. If you do a lot of x264, and it sounds like you do, the quad is the tool you want and will outperform the dual every time since there are 4 cores vs. 2 cores, and since x264 does a great job using all 4 cores in parallel.
Also, why get a Q6600 and run it @ stock values of 9x266? I've had mine clocked @ 9x333 (1333 quad pumped) for months. As long as you have decent cooling and keep your vcore low it should be fine.
Yeah the Q6600 is the better choice. But now the new Penryn 45 nm core is on the horizon, so I'm thinking of waiting for that (1333 MHz bus, 6 MB cache per core + SSE4 extensions)
graysky
15th June 2007, 23:43
@simon: SSE4 is sort of an unknown. Might take a while before x264.exe uses them. 2 extra megs per core might translate into something tangible... tough to say. Finally, 1333 MHz FSB is doable on today's hardware.
simonhowson
16th June 2007, 15:51
@simon: SSE4 is sort of an unknown. Might take a while before x264.exe uses them. 2 extra megs per core might translate into something tangible... tough to say. Finally, 1333 MHz FSB is doable on today's hardware.
Intel are claiming that all SSE instructions will receive some speed boost, but who knows if that is just marketing hype.
I have looked at some early penryn benchmarks, and it looks like it will be 10 - 20% faster at the same clock speed when compared to current Core 2 Duos. Add to that the fact the 45 nm process means that the CPUs will start at higher frequencies means this could be a great CPU for video encoding.
I just hope they introduce an entry level quad core part, like the Q6600. Since Intel only have 1 45 nm fab, I fear that they will introduce those CPUs at the high end, but keep the lower end parts as 65 nm CPUs, well, at least until their two new 45 nm fabs open (one in Arizona and one in Israel).
graysky
16th June 2007, 18:33
I've seen those data as well. I think it may take 6-9 mo. after their launch before you see SSE4 incorporated into x264. Have a look at this (http://forum.doom9.org/showthread.php?t=124885&highlight=SSE4) thread for example.
legoman666
18th June 2007, 02:59
@legoman: yeah, I think you're using the "dev" version of MeGUI. It downloads, and updates to version 0.2.4.1041 and if you notice when you run an update, all the plugins,apps, etc. are the dev (unstable) versions. Perhaps a bug?
I'll add the most recent results from you, chainring, and dancho in a few...
I set MeGUI to download beta versions of the tools because I knew there were newer versions of several of the apps out there and the MeGUI auto update wasn't downloading them. For example, the new version of DGIndex has support for .ts files.
delacroixp
18th June 2007, 08:38
Intel are claiming that all SSE instructions will receive some speed boost, but who knows if that is just marketing hype.
The speed boost may only amount to 1% in certain instances... but even 1% is worth pursuing... in the quest for the edge...
:):D:eek:
Pascal
simonhowson
20th June 2007, 18:23
According to this page (http://www.fudzilla.com/index.php?option=com_content&task=view&id=1548&Itemid=1) the first lot of Penryn Quad Core CPUs (code named Wolfdale) are going to run on the 1066 MHz bus, whereas even the new Conroe based Core 2 Duos (due for release on July 22nd) are going to run on a 1333 MHz bus. I wonder if this means early Penryn based Quad cores are going to run significantly slower than Penryn based dual cores?
graysky
20th June 2007, 20:49
@simonhowson - I dunno man. Just because the FSB is faster doesn't mean the chip will be faster. I will be very interested in seeing someone w/ one of these new chips run the MeGUI test...
Ronin-7
23rd June 2007, 15:59
My results done on Vista x64|Q6600 at 3Ghz|4GB DDR-2 800 {4-4-4-12}|MSI P35 Platinum. I used the MT.dll as well for these.
http://homepage.eircom.net/~rock2002/pictures/MeGUI-Bench.JPG
http://homepage.eircom.net/~rock2002/pictures/CPU-Z%20Bench.JPG http://homepage.eircom.net/~rock2002/pictures/CPU-Z%20RAM%20Bench.JPG
graysky
23rd June 2007, 22:08
You did have mt.dll enabled for these results? They are inline with my 9x333 numbers wo/ mt.dll; when I ran it w/ the mt.dll enabled, my pass-1 jumped up to about 114 fps. What was the syntax you used?
Ronin-7
24th June 2007, 08:30
Yeah, I was using mt.dll 0.7. The first time I ran it I got 97 fps but forgot to take a screenshot; the second time I got 92 fps on the first pass. Wondering if there was some sort of variability in performance, I re-ran the test 4 more times and it always came back at 92.xx fps.
Do you need to do anything special for mt.dll? I thought you just dropped it in over avisynth and MeGUI takes care of the rest, or am I misunderstanding its usage?
graysky
24th June 2007, 14:31
You need to correctly install the mt.dll (/avisynth/plugins) as well as replace the avisynth.dll in /windir/system32 and add the following to the avs as the first line:
SetMTmode(2,0)
Ronin-7
27th June 2007, 15:40
You need to correctly install the mt.dll (/avisynth/plugins) as well as replace the avisynth.dll in /windir/system32 and add the following to the avs as the first line:
SetMTmode(2,0)
I see thanks for the heads up, a new bios for the MSI P35 is out as well which improves things a small bit more so I might give that a go too.
graysky
7th July 2007, 15:59
I was just going through the data and noticed that we have no FX-60 or FX-anything examples... anyone out there w/ this chip care to be the first?
ditche
9th July 2007, 11:17
True, and there is no A64 X2 w/ 1 MB L2, like the 4400+ or 4800+, nor any A64 X2 AM2 > 4600+... :(
dansus
11th July 2007, 00:46
wow that comparison graph of the chips is music to my eyes, i was debating whether to get a QX6600 or 6700, but it seems the 66 OC is good enough to do the job and its not really worth splashing out the extra cash for the 67, especially now the intels are round the corner.
Im running 3700+ at the mo, with 25/7 fps. Hope i can see something near the 100/45 fps mark.
graysky
12th July 2007, 00:16
Glad you found it useful. I was thinking about re-doing the table... errr adding an additional simplified table that just shows the scalability of a single chip:
core 2 duo 6600-6700 up through clock ranges
core 2 quad 6600-6700 up through clock ranges
Like that. Really, 6600 or 6700 is just a model number. It might be easier to think in terms of the model (c2d or c2q) and a clock rate.
ditche
12th July 2007, 17:44
Updated.
http://img508.imageshack.us/img508/5475/couper9ah5.png
http://img508.imageshack.us/img508/7747/couper10zg5.png
http://img508.imageshack.us/img508/964/couper11ot3.png
(*) = OC
And without oc :
http://img508.imageshack.us/img508/8092/couper12nx9.png
dansus
12th July 2007, 22:30
I'm wondering why the Xeons have a more even fps over the two passes compared to the others?
aicjofs
12th July 2007, 22:50
Those are 8 CPU core systems. This leads to the first pass not scaling as it does for a single/dual/quad system. There is some discussion of whether x264 is entirely optimised for such a setup here.
http://forum.doom9.org/showthread.php?t=124557
I haven't read that thread in a while, but I think there were a few test builds that made the 1st pass for the 8 core system more on par with what we are used to seeing in the other data.
Escaflowne
14th July 2007, 22:38
Hi, I find this thread very interesting and I appreciate your efforts. I've always looked for benchmarks showing the capabilities of CPUs in encoding videos. I hope this thread will be continuously updated for future generations of CPUs as well.
After running the test I got no FPS results in the queue tab but I found them in the log. So I will add the log instead of a screenshot. Sorry for that.
My Notebook (Asus Z53M series )
AMD Turion 64 X2 TL-50 2x1600Mhz - stock speed - 1024 meg DDR2-667
FPS1 = 34.10
FPS2 = 10.90
My old desktop PC:
AMD Duron 1200Mhz - stock speed - 640 meg SD-Ram PC-133
FPS1 = 9.59
FPS2 = 2.74
My second Desktop PC:
Intel Pentium 4 1800Mhz - stock speed - 512 meg DDR-Ram 333Mhz
FPS1 = 10.01
FPS2 = 3.03
I was stunned when i saw the result of my old Duron compared with the result of the P4. The AMD Duron is much more efficient... . Seemingly the Intel CPUs were really not that powerful at that time.
ditche
22nd July 2007, 13:46
x264 r620 ??
miglet
25th July 2007, 00:36
Someone help me fix this error I get, fresh MeGUI install, replaced the x264.exe with correct version, log:
Calling setup of processor failed with error The file C:\work\test-NEW.avs cannot be opened.
Error message for your reference: MPEG2Source: DGIndex/DGDecode mismatch. You are picking up
a version of DGDecode, possibly from your plugins directory,
that does not match the version of DGIndex used to make the D2V
file. Search your hard disk for all copies of DGDecode.dll
and delete or rename all of them except for the one that
has the same version number as the DGIndex.exe that was used
to make the D2V file.
(C:\work\test-NEW.avs, line 3)
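The error message above asks for a search of the disk for every copy of DGDecode.dll. One way to do that search is sketched below in Python; `find_copies` is a hypothetical helper that only lists matching paths (case-insensitively) and deletes nothing, so which copy to keep is still decided by hand.

```python
import os


def find_copies(root, filename="DGDecode.dll"):
    """Return sorted paths under root whose file name matches filename.

    The comparison ignores case, since Windows file names are not
    case-sensitive.
    """
    target = filename.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() == target:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Running it once per drive, for example `find_copies("C:\\")`, gives the list of candidates to compare against the DGIndex.exe version that produced the D2V file.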
graysky
25th July 2007, 00:38
Yeah, you're using a newer version of DGDecode than the one that was used at the time I started that test. If you download version 1031 of MeGUI and make sure you auto update to the stable version, not the development version, it should work.
miglet
26th July 2007, 15:37
Got it sorted now, and with my new Q6600 posted times on a par with other Q6600s, so I'm damn pleased.
graysky
26th July 2007, 20:31
@miglet - glad to hear it... do you know which revision your q6600 is? Load up CPU-Z and have a look at the first screen: mine is a B3 as you can see.
http://img263.imageshack.us/img263/4551/ooey0.gif
legoman666
28th July 2007, 07:09
Updated results again. I dropped my HTT multipler to 3x so I could ramp up the CPU speed some more. Running at 2750mhz now as opposed to 2600mhz previously. Ram is also at 229 instead of 217mhz. It'll post all the way up to 2790mhz, but won't boot into windows. I guess that's the limit of my poor little 3800+ AMD X2. When I set it to 2800mhz, it doesn't even post at all. Not too shabby for a chip that runs at 2.0ghz default though.
http://img63.imageshack.us/img63/479/meguitest3ox2.jpg
ditche
28th July 2007, 13:43
legoman666 > I don't understand...
Your 1st pass is roughly equal to a C2D E6700, and your 2nd pass is roughly equal to a C2D E6700 @ 3,7 GHz...
:confused:
legoman666
28th July 2007, 14:50
I know. I was curious about that too, but I ran the benchmark 3 times to confirm what I was seeing.
ditche
20th August 2007, 17:06
Up. :)
graysky : You don't update your results ? :)
Episodio1
11th September 2007, 03:40
Here with a x2 +5600.
Waiting for links to be reposted (send me a PM to alert me). ^_^
graysky
11th September 2007, 08:39
@episodio1 - I have released a newer, "better" benchmark that the people at techarp were kind enough to host for me. Here (http://www.techarp.com/showarticle.aspx?artno=442&pgno=0) is the url to download it/see results. If you don't wanna reg. over there, just PM me your data here.
ditche
11th September 2007, 18:30
Fun, see you there. :)
legoman666
12th September 2007, 01:22
pm'ed ya my results.
graysky
12th September 2007, 08:50
Thanks for the results all. I started a new thread here (http://forum.doom9.org/showthread.php?p=1044126#post1044126) so you don't have to PM me the results... feel free to post them in that thread.