Go Back   Doom9's Forum > Video Encoding > MPEG-4 AVC / H.264

Old 1st August 2014, 10:30   #21  |  Link
Dark Eiri
Registered User
 
Join Date: May 2006
Posts: 335
I've spent the last week experimenting with CUDA / NVENC and with x264 in both CPU-only and CPU+OpenCL modes. I have a 4770K and a GTX 770. I understand these results only apply to my own use and are neither ideal nor detailed. I'm just describing my own experience, based on how I personally encode things. I didn't try to emulate anyone's general habits; I did it the way I like to and would do every day.

My results are pretty much what I expected. They are for a 1080p encode of action-packed material (explosions, high-speed chases, blah blah) from a very detailed, grainy H.264 master with an insane bitrate, using direct file input with CUDA decoding in AME, and DGNV indexing for the x264 input. My objective was to find the mean speed of CUDA/NVENC, reproduce it with x264, and then compare the output quality.

>> SPEED
- CUDA ~120 fps
Mainconcept, Adobe Media Encoder CC 2014 - 1080p 8 Mbps, BluRay Profile

- NVENC ~102 fps
NVENC_export, Adobe Media Encoder CC 2014 - 1080p 8 Mbps, BluRay Profile

- x264 CPU ~92 fps
Preset: superfast. ABR to match the GPU encoders, naturally.

- x264 CPU + OpenCL ~92 fps
At first I started with superfast and it was a little bit faster.
Then I enabled all partitions, trellis 1, mb-tree with rc-lookahead 40, and me umh, which brought the speed back down to match the CPU-only run, hopefully with a nice boost in quality. That was a nice surprise.
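
For anyone wanting to try something similar, here is a rough sketch of how the tuned OpenCL run described above might be assembled as an x264 command line. It is not the exact command used in the test: the file names are hypothetical, 8000 kbps is assumed from the 8 Mbps GPU target, and `--opencl` is only available in x264 builds compiled with OpenCL support. The macroblock-tree setting is left out on purpose; check `x264 --fullhelp` for how your build exposes it.

```python
def x264_args(source, output, bitrate_kbps=8000, opencl=True):
    """Build an x264 argument list approximating the tuned superfast run.

    Assumptions: single-pass ABR (--bitrate is in kbit/s) and the
    superfast preset as the starting point, with the overrides from the post.
    """
    args = [
        "x264",
        "--preset", "superfast",      # baseline preset from the post
        "--bitrate", str(bitrate_kbps),  # ABR, to match the GPU encoders
        "--partitions", "all",        # "all partitions"
        "--trellis", "1",
        "--rc-lookahead", "40",
        "--me", "umh",
    ]
    if opencl:
        args.append("--opencl")       # offload lookahead work to the GPU
    args += ["--output", output, source]
    return args


if __name__ == "__main__":
    # Hypothetical file names, only for illustration.
    print(" ".join(x264_args("input.avs", "out.264")))
```

Dropping `opencl=False` into the call gives the CPU-only equivalent, which makes it easy to time the two runs against each other.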


>> QUALITY
- x264: same quality with and without OpenCL, and the speed boost was awesome, so, yeah, OpenCL all the way. This is my 10.0 reference, obviously. I wasn't expecting any other outcome; I just wanted to see how the GPU encoders would handle this one. Very nice quality and grain retention.

- CUDA: The new CUDA encoder doesn't make me want to tear my eyes out anymore. Nice improvement since I last tested it a few years ago. I also tried DVDFab and Freemake Video Converter and got the same quality and speed, so AME CC 2014 isn't needed at all; it seems to be the same SDK encoder in all of them. Grain retention isn't on par with any of the other encoders. I'd give it a 7.0.

- NVENC: A little bit sharper than CUDA. Nicer grain retention. Not awful. At all. 8.0.

Conclusions:
- GPUenc isn't worth it on a High End PC where x264 beats the f*ck out of the GPU encoders without being much slower.

- NVENC seems to be a VERY nice option for people on low-end systems paired with a very inexpensive GTX 750 Ti. CUDA will be slower on low-end cards, so NVENC is the way to go, especially since the 8xx series will have the exact same NVENC block even on the most basic (830, 820) chips. Better yet, the newer NVENC on the 750 Ti doesn't ramp the GPU up at all; it stays in power-saving mode, since the Maxwell chip has a dedicated ARM core to control it.

- Do you guys remember the Badaboom CUDA H.264 encoder from some years ago? That cringe-inducing Atari quality it produced? Yeah, CUDA is definitely not like that anymore. It does push the GPU to maximum output, so it's more power-hungry than NVENC. It still pales in comparison with x264 at similar speed.

I then activated my IGP and tried QuickSync just for kicks. I tested both maximum quality and default quality; since there was no difference in speed, I kept only the maximum-quality results. It's better than NVENC, not by much, but it is. The speed? Around 160 fps with ~180 fps peaks. I'd seriously consider it as a replacement for x264 on quick non-archival encodes. I'd say 8.5. Unmatched speed, though.
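
To put the numbers above side by side, here is a small sketch that expresses each encoder's speed relative to the x264 CPU run. The fps figures and scores are the ones reported in this post; the choice of x264 CPU as the baseline and the table layout are just my assumptions for illustration.

```python
# (approximate fps, subjective quality score) as reported in the post
RESULTS = {
    "CUDA (MainConcept)": (120, 7.0),
    "NVENC": (102, 8.0),
    "x264 CPU": (92, 10.0),
    "x264 CPU + OpenCL": (92, 10.0),
    "QuickSync": (160, 8.5),
}


def relative_speeds(results, baseline="x264 CPU"):
    """Return each encoder's fps divided by the baseline encoder's fps."""
    base_fps = results[baseline][0]
    return {name: fps / base_fps for name, (fps, _score) in results.items()}


if __name__ == "__main__":
    for name, ratio in relative_speeds(RESULTS).items():
        fps, score = RESULTS[name]
        print(f"{name:<20} {fps:>4} fps  {ratio:4.2f}x  score {score}")
```

Keep in mind the scores are one person's subjective ratings from a single source clip, so the ratios say more about speed than about quality.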
Old 1st August 2014, 14:04   #22  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
NVENC uses an ASIC, so it doesn't slow down the GPU much.

In the OBS forum, NVENC and AMD VCE normally win in terms of quality. NVENC has a lot of quality options, and the best ones give good results.
Haswell QuickSync is usually not as good as NVENC and AMD VCE. Ivy/Sandy Bridge QuickSync is a very, very bad joke...
Old 1st August 2014, 18:20   #23  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
Haswell QuickSync has a lot of quality options, I don't know about NVENC and VCE.

If possible, could you post some quality options that Haswell QS lacks compared to NVENC/VCE?
Old 30th August 2014, 20:55   #24  |  Link
cr0n=0sTT88
Registered User
 
Join Date: Aug 2013
Posts: 61
Sorry for my late response. I've been on holidays for one month without my computer

Dark Eiri, thank you very much for your EXCELLENT REVIEW!!!

I've put together a spreadsheet summarizing your results:

https://docs.google.com/spreadsheets...cds/edit#gid=0

Just a few questions (if you can remember :P). What software did you use for QuickSync, and with what parameters?
For x264, what software? HandBrake? MeGUI?
How much memory do you have? 8 GB or 16 GB? DDR3-1600?

Quote:
Originally Posted by NikosD View Post
Haswell QuickSync has a lot of quality options, I don't know about NVENC and VCE.

If possible, could you post some quality options that Haswell QS lacks compared to NVENC/VCE?

If anyone can answer this, I'm also interested in the information!

Last edited by cr0n=0sTT88; 30th August 2014 at 21:21.
Old 30th August 2014, 20:59   #25  |  Link
cr0n=0sTT88
Registered User
 
Join Date: Aug 2013
Posts: 61
Quote:
Originally Posted by huhn View Post
NVENC uses an ASIC, so it doesn't slow down the GPU much.

In the OBS forum, NVENC and AMD VCE normally win in terms of quality. NVENC has a lot of quality options, and the best ones give good results.
Haswell QuickSync is usually not as good as NVENC and AMD VCE. Ivy/Sandy Bridge QuickSync is a very, very bad joke...

Thank you for your opinion. Do you have a link to any review or post about this?

Thank you!
Old 31st August 2014, 08:54   #26  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
http://on-demand.gputechconf.com/gtc...ncoder-api.pdf

Here is the config file from the SDK:
Code:
///////////////////////////////////////////////////////////
// H.264 encoder parameter file
// syntax: [parameter [=] value] [[#|//] comment]
///////////////////////////////////////////////////////////

///////////////////////////////////////////////////////////
// Profile, Level

profile           = 100        // 0=auto(=lowest possible profile), 66=Baseline, 77=Main, 100=High, 128 = (High Profile Stereo), 244 = (High Profile 444), 257 = (Constrained High Profile)
level             = 40        // 10,11,12,13,20,21,22,30,31,32,40,41,42,50,51=level_idc, 0=auto(=lowest possible level)

///////////////////////////////////////////////////////////
// sequence structure

gopLength           = 30       // IDR frame distance
numBFrames          = 1        // number of B frames between I and P frames
fieldMode           = 0        // 0=prog frame, 1=field
bottomfieldFirst    = 0        // 0=top field first, 1=bottom field first
numSlices           = 0        // 0=auto, 1..PicHeightInMbs=number of slices per picture

///////////////////////////////////////////////////////////
// Input File Details
width           = 1920      // Width of Input Frame
height          = 1080      // Height of Input Frame
maxwidth        = 1920      // Max width for resource allocation. A resolution change during reconfiguration cannot go beyond MaxWidth, MaxHeight
maxheight       = 1080      // Max height for resource allocation. A resolution change during reconfiguration cannot go beyond MaxWidth, MaxHeight
inFile          =../YUV/1080p/PixelBlur-1920x1080.yuv


// Driver code path property
preset          = 1        // 0=NV_ENC_PRESET_DEFAULT, 1=NV_ENC_PRESET_LOW_LATENCY_DEFAULT, 2=NV_ENC_PRESET_HP, 3=NV_ENC_PRESET_HQ, 4=NV_ENC_PRESET_BD, 5=NV_ENC_PRESET_LOW_LATENCY_HQ, 6=NV_ENC_PRESET_LOW_LATENCY_HP
interfaceType   = 2         // 0=DX9, 1=DX11, 2=CUDA, 3=DX10
syncMode        = 1         // 0=AsyncMode, 1=SyncMode

    

///////////////////////////////////////////////////////////
// bitstream characteristics (Type I HRD)

maxbitrate        = 10000000   // Peakbitrate, maximum bitrate (bits/sec), 0=use encoder defined default (level limit)
vbvBufferSize     = 3333333  // vbvBufferSize, decoder buffer size (bits), 0=use encoder defined default (level limit)
vbvInitialDelay   = 1000000  // vbvInitialDelay, initial decoder buffer fullness (bits), 0=use encoder defined default (50%)

///////////////////////////////////////////////////////////
// rate control

rcMode              = 2        // 0=fixed QP, 1=VBR, 2=CBR, 4=VBR_MINQP, 8=Multi pass encoding optimized for image quality, 16= Multi pass encoding optimized for maintaining frame size, 32=Multi pass VBR
bitrate             = 10000000   // average bit rate (bits/s) (ignored if RCMode != 1)
enableInitialRCQP   = 1
initialQPI          = 28       // If enableInitialRCQP set then It will be used as InitialQP value for I slices for RCMode !=0. constant QP value for I slices For RCMode =0.
initialQPP          = 28       // If enableInitialRCQP set then It will be used as InitialQP value for P slices for RCMode !=0. constant QP value for P slices For RCMode =0.
initialQPB          = 34       // If enableInitialRCQP set then It will be used as InitialQP value for B slices for RCMode !=0. constant QP value for B slices For RCMode =0.
frameRateNum        = 30  
frameRateDen        = 1


///////////////////////////////////////////////////////////
// picture type decision

enablePtd           = 1        // 1=picture type decision will be taken by the encoder driver. 0=picture type decision will be taken by the application

You can make custom settings; I'm not sure whether NVENC has more options than QuickSync. I only know that NVENC usually has better quality than Haswell QS. I did a couple of tests too.
In the end, x264 looks worlds better. If quality is more important than time, these encoders don't stand a chance. Even for real-time encoding, x264 is still on top.
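
For anyone who wants to script against a parameter file like the one quoted above, here is a minimal sketch of a reader for its stated syntax, `[parameter [=] value] [[#|//] comment]`. This is not the SDK's own parser; `parse_params` and the `SAMPLE` text are my own illustration, and the sketch assumes values never contain `#` or `//` themselves.

```python
import re

# A comment starts at the first "#" or "//" on the line.
_COMMENT = re.compile(r"#|//")

SAMPLE = """\
///////////////////////////////////////////////////////////
// bitstream characteristics (Type I HRD)
maxbitrate        = 10000000   // Peak bitrate (bits/sec)
vbvBufferSize     = 3333333    // decoder buffer size (bits)
rcMode 2                       # "=" is optional in the syntax
inFile          =../YUV/1080p/PixelBlur-1920x1080.yuv
"""


def parse_params(text):
    """Parse 'parameter [=] value' lines into a dict, dropping comments."""
    params = {}
    for raw in text.splitlines():
        line = _COMMENT.split(raw, maxsplit=1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        if "=" in line:
            key, value = line.split("=", 1)
        else:
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # a lone token has no value; skip it
            key, value = parts
        key, value = key.strip(), value.strip()
        # Store plain decimal numbers as int, everything else as text.
        params[key] = int(value) if value.isdigit() else value
    return params


if __name__ == "__main__":
    cfg = parse_params(SAMPLE)
    # Buffer duration implied by the sample values, in seconds.
    print(cfg["vbvBufferSize"] / cfg["maxbitrate"])
```

With the values from the quoted file, the VBV buffer holds roughly a third of a second of video at the peak bitrate, which fits the low-latency preset it selects.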

Here is a test from someone where Haswell outperformed NVENC by a huge margin.
https://obsproject.com/forum/threads...-laptop.11868/

I tested with a GTX 760; I'm not sure whether there are differences between NVENC generations like there are between Haswell and Ivy Bridge QuickSync.
AMD VCE exists in two versions, and nearly all cards use version 1.
I think the R9 290 and the R7 260 got version 2, and the rest have version 1. I have an R9 270, but I've already forgotten whether I tested AMD VCE 1 as well.
Old 31st August 2014, 09:11   #27  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
Quote:
Originally Posted by huhn View Post
here is a test from someone where haswell out performed nvenc by a huge margin.
https://obsproject.com/forum/threads...-laptop.11868/

I don't think OBS is a good test platform. I played around with it a while ago, and I don't think it configures the NVENC encoder properly.
Its encodes are much worse than those from other NVENC tools, or even NVIDIA's ShadowPlay recordings.
Old 31st August 2014, 09:16   #28  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
Quote:
Originally Posted by nevcairiel View Post
I don't think OBS is a good test platform. I played around with it a while ago, and I don't think it configures the NVENC encoder properly.
Its encodes are much worse than those from other NVENC tools, or even NVIDIA's ShadowPlay recordings.

I agree. We need a CLI encoder for all three for a proper test.

But it's really strange if OBS is worse than ShadowPlay; both use simple presets, or at least they should. I didn't really check that when I tested it, though; this was a month ago.
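
As a rough idea of what a like-for-like CLI test could look like, here is a sketch that builds one ffmpeg command per encoder with the same source and the same target bitrate. ffmpeg is not what anyone in this thread used, so treat it purely as an assumption: the encoder names (`libx264`, `h264_nvenc`, `h264_qsv`, `h264_amf`) only work if your ffmpeg build and hardware support them, and the file names are hypothetical.

```python
# Label used in this thread -> ffmpeg encoder name (assumed to be available
# in the local ffmpeg build; hardware encoders also need matching drivers).
ENCODERS = {
    "x264": "libx264",
    "NVENC": "h264_nvenc",
    "QuickSync": "h264_qsv",
    "VCE": "h264_amf",
}


def ffmpeg_cmd(label, source, bitrate="8M"):
    """Build an ffmpeg command encoding `source` with one encoder.

    Every encoder gets the same input and target bitrate so that only
    the encoder itself differs between runs.
    """
    return [
        "ffmpeg", "-y",
        "-i", source,
        "-an",                    # drop audio; only video is compared
        "-c:v", ENCODERS[label],
        "-b:v", bitrate,
        f"out_{label}.mp4",
    ]


if __name__ == "__main__":
    for label in ENCODERS:
        print(" ".join(ffmpeg_cmd(label, "source.mkv")))
```

Each encoder's own quality options would still need tuning on top of this; the point is only to keep the input and rate target identical across all of them.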