Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
1st August 2014, 10:30 | #21 | Link |
Registered User
Join Date: May 2006
Posts: 335
|
I've been experimenting this past week with CUDA / NVENC and x264 in both CPU and CPU+OpenCL modes. I have a 4770K and a GTX 770. I understand these results are only valid for my own personal use and are neither ideal nor detailed. I'm just writing up my own experience, based on my own usage and the way I would personally encode things. I didn't even try to emulate general habits; I just did it the way I like to and would do every day.
My results are pretty much what I expected. They are for 1080p encoding of action-packed footage (explosions, high-speed chases, blah blah) from a very detailed / grainy H.264 master with an insane bitrate, using direct file input with CUDA decoding in AME, and DGNV indexing for x264 input. My objective was to find the mean speed of CUDA/NVENC, reproduce it with x264, then compare the output quality.
>> SPEED
- CUDA: ~120 fps. MainConcept, Adobe Media Encoder CC 2014, 1080p 8 Mbps, BluRay Profile.
- NVENC: ~102 fps. NVENC_export, Adobe Media Encoder CC 2014, 1080p 8 Mbps, BluRay Profile.
- x264 CPU: ~92 fps. Profile: superfast. ABR to match the GPU encoders, naturally.
- x264 CPU + OpenCL: ~92 fps. At first I started on superfast and it was a little bit faster. Then I enabled all partitions, trellis 1, mb-tree with rc-lookahead 40 and me umh, which brought the speed down to match the CPU-only run, hopefully with a nice boost in quality. That was a nice surprise.
>> QUALITY
- x264: same quality with and without OpenCL; the speed boost was awesome so, yeah, OpenCL all the way. This was the 10.0, obviously. (I wasn't expecting any other outcome; I just wanted to see how the GPU encoders would take this one on.) Very nice quality and grain retention.
- CUDA: The new CUDA encoder doesn't make me want to tear my eyes out anymore. Nice improvement since I last tested a few years ago. I ended up trying DVDFab and Freemake Video Converter and got the same quality and speed, so, yeah, AME CC14 is not needed at all - it's the same SDK encoder in all of them, it seems. Grain retention is not up to par with the others. I'd give it a 7.0.
- NVENC: A little bit sharper than CUDA. Nicer grain retention. Not awful. At all. 8.0.
Conclusions:
- GPU encoding isn't worth it on a high-end PC, where x264 beats the f*ck out of the GPU encoders without being much slower.
- NVENC seems to be a VERY nice option for people on low-end systems paired with a very inexpensive GTX 750 Ti. CUDA will be slower on low-end cards, so NVENC is the way to go, especially since the 8xx series will have the exact same NVENC chip even on the most basic (830, 820) chips. Better yet, the newer NVENC on the 750 Ti doesn't turn the GPU up at all; it stays in power saving mode, since it has a dedicated ARM core on the Maxwell chip to control it.
- Do you guys remember the Badaboom CUDA H264 encoder from some years ago? That cringe-inducing Atari quality it produced? Yeah, CUDA is definitely not like that anymore. It sets the GPU for maximum output, so it will be more power hungry than NVENC. It still pales in comparison with x264 at similar speed.
I then activated my IGP and tried QuickSync just for kicks. I tried both maximum quality and default quality. Since there was no difference in speed, I kept only the maximum-quality results. It's better than NVENC; not by much, but it is. The speed? Around 160 fps with ~180 fps peaks. I'd seriously consider it as a replacement for x264 on quick non-archival encodes. I'd say 8.5. Unmatched speed, though. |
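For anyone wanting to try something close to the x264 run described in post #21, here is a rough sketch of how those settings might map to an x264 command line. It is not the poster's actual command, which was never given. The file names and the `build_x264_args` helper are made up for illustration, the 8000 kbit/s figure is taken from the 8 Mbps the GPU encoders were set to, and I'm assuming `--mbtree` is accepted as the positive form of `--no-mbtree` and that the build has OpenCL support; check `x264 --fullhelp` on your own build.

```python
import shlex


def build_x264_args(source, output, bitrate_kbps=8000, use_opencl=True):
    """Assemble an x264 argument list approximating the settings in post #21.

    Hypothetical helper: the original post never shows a command line.
    """
    args = [
        "x264",
        "--preset", "superfast",          # starting point named in the post
        "--bitrate", str(bitrate_kbps),   # ABR in kbit/s, to match the 8 Mbps GPU encodes
        "--partitions", "all",            # "all partitions"
        "--trellis", "1",
        "--mbtree",                       # assumed spelling; superfast turns mb-tree off
        "--rc-lookahead", "40",
        "--me", "umh",
    ]
    if use_opencl:
        args.append("--opencl")           # only works on builds compiled with OpenCL
    args += ["--output", output, source]
    return args


if __name__ == "__main__":
    # Placeholder file names; prints the command instead of running it.
    print(shlex.join(build_x264_args("input.avs", "output.264")))
```

Building the list first and printing it makes it easy to compare the CPU-only and OpenCL variants (`use_opencl=False` versus `True`) before handing either one to `subprocess.run`.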
1st August 2014, 14:04 | #22 | Link |
Registered User
Join Date: Oct 2012
Posts: 7,925
|
NVENC uses an ASIC, so it doesn't slow down the GPU much.
In the OBS forum, NVENC and AMD VCE normally win in terms of quality. NVENC has a lot of quality options, and the best ones give good results. Haswell QuickSync is usually not as good as NVENC and AMD VCE. Ivy/Sandy Bridge QuickSync is a very, very bad joke... |
1st August 2014, 18:20 | #23 | Link |
Registered User
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
|
Haswell QuickSync has a lot of quality options, I don't know about NVENC and VCE.
If possible, could you post some quality options that Haswell QS lacks compared to NVENC/VCE?
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1) HEVC decoding benchmarks H.264 DXVA Benchmarks for all |
30th August 2014, 20:55 | #24 | Link | |
Registered User
Join Date: Aug 2013
Posts: 61
|
Sorry for my late response. I've been on holiday for a month without my computer.
Quote:
Dark Eiri, thank you very much for your EXCELLENT REVIEW!!! I've created a spreadsheet to summarize your results: https://docs.google.com/spreadsheets...cds/edit#gid=0 Only one question (if you can remember :P): what software did you use for QuickSync, and with what parameters? For x264, what software? HandBrake? MeGUI? How much memory do you have? 8 GB or 16 GB? DDR3-1600? If anyone else can answer, I'm also interested in this information! Last edited by cr0n=0sTT88; 30th August 2014 at 21:21. |
|
30th August 2014, 20:59 | #25 | Link | |
Registered User
Join Date: Aug 2013
Posts: 61
|
Quote:
Thank you! |
|
31st August 2014, 08:54 | #26 | Link |
Registered User
Join Date: Oct 2012
Posts: 7,925
|
http://on-demand.gputechconf.com/gtc...ncoder-api.pdf
Here is the config file from the SDK: Code:
///////////////////////////////////////////////////////////
// H.264 encoder parameter file
// syntax: [parameter [=] value] [[#|//] comment]
///////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////
// Profile, Level
profile = 100 // 0=auto(=lowest possible profile), 66=Baseline, 77=Main, 100=High, 128 = (High Profile Stereo), 244 = (High Profile 444), 257 = (Constrained High Profile)
level = 40 // 10,11,12,13,20,21,22,30,31,32,40,41,42,50,51=level_idc, 0=auto(=lowest possible level)
///////////////////////////////////////////////////////////
// sequence structure
gopLength = 30 // IDR frame distance
numBFrames = 1 // number of B frames between I and P frames
fieldMode = 0 // 0=prog frame, 1=field
bottomfieldFirst = 0 // 0=top field first, 1=bottom field first
numSlices = 0 // 0=auto, 1..PicHeightInMbs=number of slices per picture
///////////////////////////////////////////////////////////
// Input File Details
width = 1920 // Width of Input Frame
height = 1080 // Height of Input Frame
maxwidth = 1920 // Max Width for resource allocation. Max resolution change during reconfiguration can not go beyond MaxWidth, MaxHeight
maxheight = 1080 // Max Height for resource allocation. Max resolution change during reconfiguration can not go beyond MaxWidth, MaxHeight
inFile =../YUV/1080p/PixelBlur-1920x1080.yuv
// Driver Code Path property
preset = 1 // 0=NV_ENC_PRESET_DEFAULT, 1=NV_ENC_PRESET_LOW_LATENCY_DEFAULT, 2=NV_ENC_PRESET_HP, 3=NV_ENC_PRESET_HQ, 4=NV_ENC_PRESET_BD, 5=NV_ENC_PRESET_LOW_LATENCY_HQ, 6=NV_ENC_PRESET_LOW_LATENCY_HP
interfaceType = 2 // 0=DX9, 1=DX11, 2=CUDA, 3=DX10
syncMode = 1 // 0=AsyncMode, 1=SyncMode
///////////////////////////////////////////////////////////
// bitstream characteristics (Type I HRD)
maxbitrate = 10000000 // Peakbitrate, maximum bitrate (bits/sec), 0=use encoder defined default (level limit)
vbvBufferSize = 3333333 // vbvBufferSize, decoder buffer size (bits), 0=use encoder defined default (level limit)
vbvInitialDelay = 1000000 // vbvInitialDelay, initial decoder buffer fullness (bits), 0=use encoder defined default (50%)
///////////////////////////////////////////////////////////
// rate control
rcMode = 2 // 0=fixed QP, 1=VBR, 2=CBR, 4=VBR_MINQP, 8=Multi pass encoding optimized for image quality, 16= Multi pass encoding optimized for maintaining frame size, 32=Multi pass VBR
bitrate = 10000000 // average bit rate (bits/s) (ignored if RCMode != 1)
enableInitialRCQP = 1
initialQPI = 28 // If enableInitialRCQP set then It will be used as InitialQP value for I slices for RCMode !=0. constant QP value for I slices For RCMode =0.
initialQPP = 28 // If enableInitialRCQP set then It will be used as InitialQP value for P slices for RCMode !=0. constant QP value for P slices For RCMode =0.
initialQPB = 34 // If enableInitialRCQP set then It will be used as InitialQP value for B slices for RCMode !=0. constant QP value for B slices For RCMode =0.
frameRateNum = 30
frameRateDen = 1
///////////////////////////////////////////////////////////
// picture type decision
enablePtd = 1 // 1=picture type decision will be taken by Encoder driver. 0=picture type decision will be taken by application
In the end, x264 looks worlds better. If quality is more important than time, these encoders don't stand a chance. Even for real-time encoding, x264 is still on top. Here is a test from someone where Haswell outperformed NVENC by a huge margin: https://obsproject.com/forum/threads...-laptop.11868/ I tested it with a GTX 760; I'm not sure whether there are differences between GPU generations, like there are between Haswell and Ivy Bridge. AMD VCE has two versions out there, and nearly all cards use version 1. I think the R9 290 and the R7 260 got version 2, and the rest got version 1. I have an R9 270, but I've already forgotten whether I tested AMD VCE 1 too. |
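The syntax line at the top of that parameter file (`[parameter [=] value] [[#|//] comment]`) is simple enough to read with a few lines of script. The sketch below is only my reading of that one line, not the SDK's own parser: `parse_encoder_params` and the shortened sample are hypothetical, and a value that itself contains `#` or `//` would be cut off as a comment.

```python
import re

# "parameter [=] value": a name, then either '=' or plain whitespace, then one token.
_PAIR = re.compile(r"^\s*([A-Za-z_]\w*)(?:\s*=\s*|\s+)(\S+)")


def parse_encoder_params(text):
    """Parse 'parameter [=] value' lines, dropping '#' and '//' comments."""
    params = {}
    for line in text.splitlines():
        # Everything from the first '#' or '//' onward is a comment.
        code = re.split(r"#|//", line, maxsplit=1)[0]
        match = _PAIR.match(code)
        if match is None:
            continue  # blank line or comment-only banner
        name, value = match.groups()
        # Keep whole numbers as int, anything else (e.g. file paths) as str.
        params[name] = int(value) if re.fullmatch(r"-?\d+", value) else value
    return params


# Shortened excerpt of the posted file, used only as sample input.
SAMPLE = """\
///////////////////////////////////////////////////////////
// rate control
rcMode = 2              // 2=CBR
maxbitrate = 10000000   // bits/sec
vbvBufferSize = 3333333 // bits
frameRateNum = 30
frameRateDen = 1
inFile =../YUV/1080p/PixelBlur-1920x1080.yuv
"""

if __name__ == "__main__":
    params = parse_encoder_params(SAMPLE)
    fps = params["frameRateNum"] / params["frameRateDen"]
    # Average bit budget per frame at the peak rate.
    print(params["maxbitrate"] / fps)
    # Seconds of video the VBV buffer holds at the peak rate.
    print(params["vbvBufferSize"] / params["maxbitrate"])
```

With the posted values, the VBV buffer works out to roughly a third of a second at the peak rate, which fits the low-latency preset (`preset = 1`) the file selects.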
31st August 2014, 09:11 | #27 | Link | |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,347
|
Quote:
Its encodes are much worse than those from other NVENC tools, or even NVIDIA's ShadowPlay recordings.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders |
|
31st August 2014, 09:16 | #28 | Link | |
Registered User
Join Date: Oct 2012
Posts: 7,925
|
Quote:
But it's really strange if OBS is worse than ShadowPlay; both use simple presets, or at least should. I didn't really test it thoroughly when I tried it, though; this was months ago. |
|