10th July 2009, 07:54   #1
St Devious
Hardware Aspirin
 
Join Date: Jul 2007
Posts: 461
CUDA H.264 vs x264 Speed and Image Quality Benchmarks, discussion

Hardware
  • Intel Q9450 @ 3.2 GHz
  • 4GB RAM @ 800 MHz 5-4-4-12 2T
  • GTS 250 512 MB @ 738/1836/1100 MHz (Core/Shader/Memory)



Software
  • Nvidia GeForce 186.18 WHQL Drivers
  • MeGUI 0.3.1.1047
  • x264 r1178 Jeeb's Build
  • Mediacoder 0.7.1.4475 for CUDA GPU encoding


Source Videos
V for Vendetta clip, 1280x720 @ 2000 Kbps

Mediainfo on Source
Code:
Video
ID                               : 2
Format                           : AVC
Format/Info                      : Advanced Video Codec
Format profile                   : High@L4.0
Format settings, CABAC           : Yes
Format settings, ReFrames        : 4 frames
Muxing mode                      : Container profile=Unknown@4.0
Codec ID                         : V_MPEG4/ISO/AVC
Duration                         : 1mn 52s
Nominal bit rate                 : 2 000 Kbps
Width                            : 1 280 pixels
Height                           : 720 pixels
Display aspect ratio             : 2.35
Frame rate                       : 23.976 fps
Resolution                       : 24 bits
Colorimetry                      : 4:2:0
Scan type                        : Progressive
Bits/(Pixel*Frame)               : 0.091
Writing library                  : x264 core 65 r999+1 eb3ef1b
Encoding settings                : cabac=1 / ref=4 / deblock=1:-1:-1 / analyse=0x3:0x113 / me=tesa / subme=9 / psy_rd=1.0:0.0 / 
mixed_ref=1 / me_range=32 / chroma_me=1 /trellis=0 / 8x8dct=1 / cqm=0 / deadzone=4,4 / chroma_qp_offset=-2 / threads=3 / 
nr=0 / decimate=0 / mbaff=0 / bframes=3 / b_pyramid=1 / b_adapt=2 / b_bias=0 / direct=3 / wpredb=1 / keyint=250 / keyint_min=25 / 
scenecut=40(pre) / rc=2pass / bitrate=2000 / ratetol=1.0 / qcomp=0.60 / qpmin=10 / qpmax=51 / qpstep=4 / cplxblur=20.0 / qblur=0.5 / ip_ratio=1.40
 / pb_ratio=1.30 / aq=1:1.00
Encoding Settings
CUDA GPU



MeGUI x264
The ultrafast preset was used here.
Code:
program --bitrate 800 --no-mixed-refs --bframes 1 --no-weightb --direct temporal --nf --no-cabac --subme 1 --partitions none --scenecut 0 --me dia --threads auto --thread-input --aq-mode 0 --output "output" "input" --subme 0 --preset ultrafast
Speed Results
  • CUDA GPU H.264 - 23.5s 114.4 FPS
  • x264 preset ultrafast - 30s 90.8 FPS
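
To put the two speed figures side by side, here is a rough Python sketch that works out the speedup from the numbers listed above. The function names are made up for illustration, and the inputs are only the rounded times and FPS values quoted in this post, so treat the result as approximate.

```python
# Rough comparison of the two encodes, using only the rounded
# figures quoted in the Speed Results list.

def speedup(fast, slow):
    """Return how many times larger `fast` is than `slow`."""
    return fast / slow

def implied_frames(fps, seconds):
    """Frame count implied by an average FPS and a wall-clock time."""
    return fps * seconds

cuda_fps, cuda_seconds = 114.4, 23.5
x264_fps, x264_seconds = 90.8, 30.0

# Speedup measured two ways: by throughput and by elapsed time.
fps_speedup = speedup(cuda_fps, x264_fps)
time_speedup = speedup(x264_seconds, cuda_seconds)

print(f"CUDA vs x264 by FPS:  {fps_speedup:.2f}x")
print(f"CUDA vs x264 by time: {time_speedup:.2f}x")

# The two implied frame counts differ slightly, which suggests the
# reported times (especially the 30s figure) were rounded.
print(f"Frames implied by CUDA figures: {implied_frames(cuda_fps, cuda_seconds):.0f}")
print(f"Frames implied by x264 figures: {implied_frames(x264_fps, x264_seconds):.0f}")
```

Either way, the CUDA encode comes out roughly a quarter faster than x264 with these settings.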

Image comparison

Source



x264 @ 800 Kbps



CUDA GPU @ 800 Kbps


Output File Mediainfo

x264 @ 800 Kbps

Code:
Video
ID                               : 1
Format                           : AVC
Format/Info                      : Advanced Video Codec
Format profile                   : Main@L3.1
Format settings, CABAC           : No
Format settings, ReFrames        : 2 frames
Codec ID                         : avc1
Codec ID/Info                    : Advanced Video Coding
Duration                         : 1mn 52s
Bit rate mode                    : Variable
Bit rate                         : 900 Kbps
Nominal bit rate                 : 800 Kbps
Maximum bit rate                 : 2 229 Kbps
Width                            : 1 280 pixels
Height                           : 720 pixels
Display aspect ratio             : 16/9
Frame rate mode                  : Constant
Frame rate                       : 23.976 fps
Resolution                       : 24 bits
Colorimetry                      : 4:2:0
Scan type                        : Progressive
Bits/(Pixel*Frame)               : 0.041
Stream size                      : 12.1 MiB (100%)
Writing library                  : x264 core 68 r1179M 96e2229
Encoding settings                : cabac=0 / ref=1 / deblock=0:0:0 / analyse=0:0 / me=dia / subme=0 / psy_rd=0.0:0.0 / mixed_ref=0 / 
me_range=16 / chroma_me=1 / trellis=0 / 8x8dct=0 / cqm=0 / deadzone=21,11 / chroma_qp_offset=0 / threads=6 / nr=0 / decimate=1 / mbaff=0 /
 bframes=1 / b_pyramid=0 / b_adapt=1 / b_bias=0 / direct=1 / wpredb=0 / keyint=250 / keyint_min=25 / scenecut=0 / rc=abr / bitrate=800 /
 ratetol=1.0 / qcomp=0.60 / qpmin=10 / qpmax=51 / qpstep=4 / ip_ratio=1.40 / pb_ratio=1.30 / aq=0
CUDA GPU @ 800 Kbps

Code:
Video
ID                               : 1
Format                           : AVC
Format/Info                      : Advanced Video Codec
Format profile                   : High@L5.1
Format settings, CABAC           : Yes
Format settings, ReFrames        : 2 frames
Codec ID                         : avc1
Codec ID/Info                    : Advanced Video Coding
Duration                         : 1mn 52s
Bit rate mode                    : Variable
Bit rate                         : 869 Kbps
Maximum bit rate                 : 1 776 Kbps
Width                            : 1 280 pixels
Height                           : 720 pixels
Display aspect ratio             : 16/9
Frame rate mode                  : Constant
Frame rate                       : 23.976 fps
Resolution                       : 24 bits
Colorimetry                      : 4:2:0
Scan type                        : Progressive
Bits/(Pixel*Frame)               : 0.039
Stream size                      : 11.6 MiB (100%)
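
For anyone who wants to check the Mediainfo numbers, the Bits/(Pixel*Frame) values can be reproduced from the bit rate, resolution and frame rate reported above. This is only a sketch: the function names are mine, and it assumes Mediainfo's "Kbps" means 1000 bits per second. It also shows how far each encoder overshot the 800 Kbps target.

```python
# Reproduce Mediainfo's Bits/(Pixel*Frame) figure and measure how far
# each output overshoots the requested 800 Kbps.
# Assumption: 1 Kbps = 1000 bits per second.

def bits_per_pixel_frame(bitrate_kbps, width, height, fps):
    """Average number of bits spent on each pixel of each frame."""
    return bitrate_kbps * 1000 / (width * height * fps)

def overshoot_percent(actual_kbps, target_kbps):
    """How far the actual bit rate exceeds the target, in percent."""
    return (actual_kbps - target_kbps) / target_kbps * 100

WIDTH, HEIGHT, FPS = 1280, 720, 23.976
TARGET_KBPS = 800

# Bit rates as reported in the two Mediainfo dumps above.
outputs = {"x264": 900, "CUDA GPU": 869}

for name, kbps in outputs.items():
    bpp = bits_per_pixel_frame(kbps, WIDTH, HEIGHT, FPS)
    over = overshoot_percent(kbps, TARGET_KBPS)
    print(f"{name}: {bpp:.3f} bits/(pixel*frame), {over:.1f}% over target")
```

Both outputs land above the requested bit rate, with x264 overshooting a little more than the CUDA encoder, so the two files are close to each other in size but not identical.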


GPU temperature rose by 6 °C during encoding. CPU usage was almost 95% on all 4 cores during the encode.


More soon...

Suggest settings and comparisons you would like to see.

EDIT: Thank you for your suggestions, guys.

As I said in my OP, there is more to come, with different sources at different resolutions.

I'm not trying to advertise for anyone here. I'm just feeding my curiosity about what kind of encoding the free CUDA encoder does, and probably helping others who feel the same way in the process.

The settings in this encode were chosen to test the pure speed of x264, to see whether it could be as fast as the GPU encode at similar or better quality. Since that didn't happen, I will try to match the quality and see what the difference in performance is.

As for the hardware, this is the best I have access to right now. Also, as others said, the GTS 250 is a last-generation GPU based on the G92 chip used in the 8800 GTS 512 MB, 9800 GTX, and 9800 GTX+.

I may also test Badaboom and MediaShow Espresso in the future if time permits.

I also plan to use the following video from http://www.w6rz.net/ as one of the sources:

1080p VBR Video Quality Test Streams
Sony HDW-F900 footage, 1080p@25, 18 Mbps average, 30 Mbps peak in a 35 Mbps Transport Stream (259,534,064 bytes)

Please let me know if there is another uncompressed source you would like me to try.

I'm not too sure about the questions regarding the decoder; I only have ffdshow and Haali Media Splitter installed on my system.
I would really appreciate it if you let me know whether I need to change something with the decoders.

Last edited by St Devious; 10th July 2009 at 15:03.