10th July 2009, 07:54 | #1 | Link |
Hardware Aspirin
Join Date: Jul 2007
Posts: 461
|
CUDA H.264 vs x264 Speed and Image Quality Benchmarks, discussion
Hardware
Software
Source Videos
VforVendetta 2000Kbps 1280x720 Clip
Mediainfo on Source
Code:
Video
ID : 2
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L4.0
Format settings, CABAC : Yes
Format settings, ReFrames : 4 frames
Muxing mode : Container profile=Unknown@4.0
Codec ID : V_MPEG4/ISO/AVC
Duration : 1mn 52s
Nominal bit rate : 2 000 Kbps
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 2.35
Frame rate : 23.976 fps
Resolution : 24 bits
Colorimetry : 4:2:0
Scan type : Progressive
Bits/(Pixel*Frame) : 0.091
Writing library : x264 core 65 r999+1 eb3ef1b
Encoding settings : cabac=1 / ref=4 / deblock=1:-1:-1 / analyse=0x3:0x113 / me=tesa / subme=9 / psy_rd=1.0:0.0 / mixed_ref=1 / me_range=32 / chroma_me=1 / trellis=0 / 8x8dct=1 / cqm=0 / deadzone=4,4 / chroma_qp_offset=-2 / threads=3 / nr=0 / decimate=0 / mbaff=0 / bframes=3 / b_pyramid=1 / b_adapt=2 / b_bias=0 / direct=3 / wpredb=1 / keyint=250 / keyint_min=25 / scenecut=40(pre) / rc=2pass / bitrate=2000 / ratetol=1.0 / qcomp=0.60 / qpmin=10 / qpmax=51 / qpstep=4 / cplxblur=20.0 / qblur=0.5 / ip_ratio=1.40 / pb_ratio=1.30 / aq=1:1.00
Encoding Settings
CUDA GPU
MeGUI x264
Preset ultrafast used here
Code:
program --bitrate 800 --no-mixed-refs --bframes 1 --no-weightb --direct temporal --nf --no-cabac --subme 1 --partitions none --scenecut 0 --me dia --threads auto --thread-input --aq-mode 0 --output "output" "input" --subme 0 --preset ultrafast
Speed Results
Image comparison
Source
x264 @ 800 Kbps
CUDA GPU @ 800 Kbps
Output File Mediainfo
x264 @ 800 Kbps
Code:
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main@L3.1
Format settings, CABAC : No
Format settings, ReFrames : 2 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 1mn 52s
Bit rate mode : Variable
Bit rate : 900 Kbps
Nominal bit rate : 800 Kbps
Maximum bit rate : 2 229 Kbps
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16/9
Frame rate mode : Constant
Frame rate : 23.976 fps
Resolution : 24 bits
Colorimetry : 4:2:0
Scan type : Progressive
Bits/(Pixel*Frame) : 0.041
Stream size : 12.1 MiB (100%)
Writing library : x264 core 68 r1179M 96e2229
Encoding settings : cabac=0 / ref=1 / deblock=0:0:0 / analyse=0:0 / me=dia / subme=0 / psy_rd=0.0:0.0 / mixed_ref=0 / me_range=16 / chroma_me=1 / trellis=0 / 8x8dct=0 / cqm=0 / deadzone=21,11 / chroma_qp_offset=0 / threads=6 / nr=0 / decimate=1 / mbaff=0 / bframes=1 / b_pyramid=0 / b_adapt=1 / b_bias=0 / direct=1 / wpredb=0 / keyint=250 / keyint_min=25 / scenecut=0 / rc=abr / bitrate=800 / ratetol=1.0 / qcomp=0.60 / qpmin=10 / qpmax=51 / qpstep=4 / ip_ratio=1.40 / pb_ratio=1.30 / aq=0
CUDA GPU @ 800 Kbps
Code:
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L5.1
Format settings, CABAC : Yes
Format settings, ReFrames : 2 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 1mn 52s
Bit rate mode : Variable
Bit rate : 869 Kbps
Maximum bit rate : 1 776 Kbps
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16/9
Frame rate mode : Constant
Frame rate : 23.976 fps
Resolution : 24 bits
Colorimetry : 4:2:0
Scan type : Progressive
Bits/(Pixel*Frame) : 0.039
Stream size : 11.6 MiB (100%)
Shows that GPU temperature rose by 6 C when encoding. CPU usage was almost 95% on all 4 cores during the encode.
More soon... Suggest settings and comparisons you would like to see.
EDIT: Thank you for your suggestions, guys. As I said in my OP, there is more to come with different sources at different resolutions. I'm not trying to advertise anyone here. I'm just feeding my curiosity to see what kind of encoding the free CUDA encoder does, and probably helping others who feel the same way in the process.
The settings in this encode were used to test the pure speed of x264, to see if it could be as fast as the GPU encode at similar or better quality. Since that didn't happen, I will try to match the quality and see what the difference in performance is.
As to the hardware used, this is the best thing I have access to right now. Also, as others said, the GTS 250 is a last-generation GPU based on the G92 chip used in the 8800 GTS 512 MB, 9800 GTX, and 9800 GTX+.
I may also use Badaboom and MediaShow Espresso in future if time permits. I also plan on using this video, 1080p VBR Video Quality Test Streams, Sony HDW-F900 footage, 1080p@25, 18 Mbps average, 30 Mbps peak in a 35 Mbps Transport Stream (259,534,064 bytes), on this page http://www.w6rz.net/ as one of the sources. Please let me know if there is another uncompressed source you would like me to try.
I'm not too sure about the questions regarding the decoder; I only have ffdshow + Haali Media Splitter installed on my system. I would really appreciate it if you let me know whether I need to change something with the decoders.
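For anyone checking the numbers, the Bits/(Pixel*Frame) values in the Mediainfo dumps above can be reproduced from bit rate, resolution, and frame rate. This is only a rough sketch: it assumes Mediainfo's "Kbps" means 1000 bits per second, and the function name is made up for illustration.

```python
def bits_per_pixel_frame(bitrate_kbps, width, height, fps):
    """Average number of bits spent on each pixel of each frame."""
    # Assumption: 1 Kbps = 1000 bits per second.
    return bitrate_kbps * 1000 / (width * height * fps)


# Bit rates as reported by Mediainfo for the three files in this post.
for label, kbps in [("source", 2000), ("x264", 900), ("CUDA", 869)]:
    value = bits_per_pixel_frame(kbps, 1280, 720, 23.976)
    print(f"{label}: {value:.3f}")
```

Under that assumption the three results round to the 0.091, 0.041, and 0.039 shown by Mediainfo, so both 800 Kbps encodes get less than half the bits per pixel of the source.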
__________________
Stream video on web in your quality Last edited by St Devious; 10th July 2009 at 15:03. |
10th July 2009, 08:56 | #3 | Link |
Straight to video
Join Date: Jun 2005
Posts: 637
|
I wonder what settings MediaCoder used to arrive at these benchmarks?
__________________
. |
10th July 2009, 08:59 | #4 | Link |
MediaCoder author
Join Date: Sep 2005
Location: Shanghai
Posts: 65
|
Absolutely. Though the decoding is done in a separate process and a large ring buffer connects the decoder and the encoder, decoding is still a bottleneck on a fast multi-core processor. Fortunately, mplayer-mt/ffmpeg-mt has multi-threaded H.264 decoding.
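To illustrate the arrangement described above, here is a minimal sketch of a decoder and an encoder linked by a bounded buffer. It is not MediaCoder's code: it uses Python threads and `queue.Queue` as a stand-in for separate processes and a real ring buffer, and `run_pipeline`, `decode`, and `encode` are hypothetical names.

```python
import queue
import threading


def run_pipeline(frames, decode, encode, buffer_size=64):
    """Decode in one thread and encode in another, linked by a bounded FIFO."""
    ring = queue.Queue(maxsize=buffer_size)  # put() blocks while the buffer is full
    done = object()                          # sentinel marking the end of the stream
    encoded = []

    def decoder():
        for frame in frames:
            ring.put(decode(frame))
        ring.put(done)

    def encoder():
        while True:
            item = ring.get()                # blocks while the buffer is empty
            if item is done:
                break
            encoded.append(encode(item))

    threads = [threading.Thread(target=decoder), threading.Thread(target=encoder)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return encoded
```

The buffer only smooths out short bursts. If the decoder is slower on average, the encoder spends most of its time blocked in `get()`, which is the bottleneck described here.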
__________________
When things work together, things work. MediaCoder makes audio and video things work. |
10th July 2009, 09:03 | #5 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Why did you use such retardedly low quality settings with x264?
At a minimum your goal should be to match the quality of the two; it makes no sense to test two encoders against each other in terms of speed at vastly different compression settings. Also, if you're going to compare two encoders, use raw video input, not a highly compressed H.264 stream whose decoding method differs between the two encoders.
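As a rough illustration of what "matching the quality" could mean in practice, one option is to score both outputs against the same raw source with an objective metric and adjust settings until the scores are close. The post does not name a metric; PSNR is swapped in here only because it is simple, and it is known to miss perceptual differences. The function name and the flat list-of-samples frame format are assumptions.

```python
import math


def psnr(reference, encoded, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit samples."""
    if len(reference) != len(encoded):
        raise ValueError("frames must have the same number of samples")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, encoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

A speed comparison then only means something between encodes whose scores against the raw source are roughly equal.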
__________________
Follow x264 development progress | akupenguin quotes | x264 git status ffmpeg and x264-related consulting/coding contracts | Doom10 Last edited by Dark Shikari; 10th July 2009 at 09:05. |
10th July 2009, 09:06 | #6 | Link |
MediaCoder author
Join Date: Sep 2005
Location: Shanghai
Posts: 65
|
I think the Q9450 and the GTS 250 are not hardware of the same level, or at least not of the same price. ;-)
10th July 2009, 09:14 | #8 | Link |
MediaCoder author
Join Date: Sep 2005
Location: Shanghai
Posts: 65
|
Actually, x264 does generate better quality when it is configured for maximum quality, but that will also make the transcoding extremely slow.
10th July 2009, 09:19 | #10 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
Of course, it doesn't help that he even turned off subpixel motion vectors though... his settings are completely ridiculous.
Yes, and he posted it in a very official-looking fashion despite the entire thing being done in a completely idiotic and haphazard manner. I mean seriously:
1. Pick a GPU that's more costly and faster than the CPU.
2. Go out of the way to pick the worst possible settings for x264 (literally!)
3. Use two different decoders to feed the different encoders.
4. Show how the GPU encoder looks so much better than x264.
I'm not going to bother with this anymore as it's clear that this guy is here solely to try to advertise crappy encoders by performing intentionally bad tests.
__________________
Follow x264 development progress | akupenguin quotes | x264 git status ffmpeg and x264-related consulting/coding contracts | Doom10 Last edited by Dark Shikari; 10th July 2009 at 09:22. |
|
10th July 2009, 09:30 | #11 | Link |
MediaCoder author
Join Date: Sep 2005
Location: Shanghai
Posts: 65
|
I've published the x264 options in my benchmark. Under this configuration, both encoders have nearly the same output quality (x264 is slightly better). The CPU I used costs about US$ 195; the GPU I used (a display adapter with 896MB of onboard GDDR3) costs about US$ 235.
PS1: For serious encoding, I myself use x264.
PS2: I don't know St Devious and have no relationship with him, and I don't think he can gain anything by advertising the crappy encoder. I just saw this post through a back-link to my blog.
__________________
When things work together, things work. MediaCoder makes audio and video things work. Last edited by stanleyhuang; 10th July 2009 at 09:44. |
10th July 2009, 10:13 | #13 | Link |
MediaCoder author
Join Date: Sep 2005
Location: Shanghai
Posts: 65
|
He might just want x264 to work as fast as it can.
10th July 2009, 11:44 | #16 | Link |
Registered User
Join Date: Jun 2005
Posts: 278
|
Probably due to decoding speed. Unfortunately it is unclear which decoders were used. If the source were uncompressed, or at least DXVA+readback were used for x264, or the source used some format that the GPU can't accelerate, it might make some sense as an encoder comparison. So far the main conclusion is: a fast GPU can decode H.264 a lot faster than a slow CPU with some random (probably single-threaded) decoder. Not exactly news. And not in any way related to x264.
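A back-of-the-envelope model of why the decoder can dominate the result, as a sketch only: the two function names are hypothetical and the frame rates in the usage note are made up, not measurements from this thread.

```python
def overlapped_fps(decode_fps, encode_fps):
    """Decode and encode run in parallel: the slower stage sets the pace."""
    return min(decode_fps, encode_fps)


def serial_fps(decode_fps, encode_fps):
    """Decode and encode take turns on the same resource: per-frame times add up."""
    return 1 / (1 / decode_fps + 1 / encode_fps)
```

With a hypothetical 60 fps decoder in front of a 200 fps encoder, the overlapped case tops out at 60 fps and the serial case at about 46 fps. In both cases the measured speed says more about the decoder than about the encoder.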
|
10th July 2009, 11:53 | #17 | Link | |
Registered User
Join Date: Apr 2008
Posts: 1,181
|
Quote:
|
|
10th July 2009, 12:06 | #18 | Link | |
Registered User
Join Date: Aug 2006
Location: Stockholm/Helsinki
Posts: 805
|
Quote:
And about picking the "worst possible settings", he's trying to match the speeds, not the quality. However, it doesn't say if x264 was able to use all cores. It says "95% on all 4 cores", was this during the x264 encode? And why only show a picture of the last core? There's a nice picture of the task manager with all 4, separately or combined. Not to mention what decoder was used (use uncompressed), and why not use the source of the source? |
|
10th July 2009, 13:07 | #19 | Link | |
Registered User
Join Date: Jun 2005
Posts: 278
|
Quote:
That is DXVA2 only though... |
|