Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion.

Old 14th June 2020, 17:38   #1  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
NVEnc encoding on RTX2060 Turing GPU

I tried the latest NVEnc from here and compared it to x264 at similar bitrates. NVEnc is very fast, but the quality is still way behind x264, even though I read that Turing GPUs would provide massive quality improvements.

Any secret switches I can use or is x264 still far superior to NVEnc H.264?
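
For anyone wanting to repeat this kind of comparison, a minimal sketch follows. It assumes an ffmpeg build with both libx264 and h264_nvenc enabled; the file names, the 5000k bitrate and the "slow" presets are placeholders, not the settings used in the test described above.

```shell
#!/bin/sh
# Sketch: encode one clip with x264 and with NVENC at the same target bitrate.
# clip.mkv, 5000k and the output names are placeholders (assumptions).
SRC=clip.mkv
BITRATE=5000k

encode_x264() {
    # Software encode with a quality-oriented x264 preset, video only.
    ffmpeg -y -i "$SRC" -an -c:v libx264 -preset slow -b:v "$BITRATE" x264.mkv
}

encode_nvenc() {
    # Hardware encode at the same average bitrate, video only.
    ffmpeg -y -i "$SRC" -an -c:v h264_nvenc -preset slow -b:v "$BITRATE" nvenc.mkv
}

# Run both, then compare the same frames of x264.mkv and nvenc.mkv by eye:
#   encode_x264 && encode_nvenc
```

Matching the average bitrate only makes the file sizes comparable; it says nothing about how each encoder distributes the bits, so the results still need to be judged visually.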
__________________
Groucho's Avisynth Stuff
Old 14th June 2020, 19:09   #2  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,997
Quote:
Originally Posted by Groucho2004 View Post
I tried the latest NVEnc from here and compared it to x264 at similar bitrates. NVEnc is very fast, but the quality is still way behind x264, even though I read that Turing GPUs would provide massive quality improvements.

Any secret switches I can use or is x264 still far superior to NVEnc H.264?
Basically yes, NVEnc is still inferior apart from its blazing speed, but it depends on what you understand by 'far'. It also depends on the bitrate you are willing to spend. NVEncC is normally softer (looks like a denoised version) and loses details. Some people like its 'denoising' effect though - saving a dedicated noise or grain filter ;-)
What is your commandline? Someone might have better setting suggestions.
Old 15th June 2020, 00:00   #3  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
On modern hardware, x264 at a reasonable preset is fast enough that it rarely pays off to use NVEnc for file-to-file encoding; generally an RTX 2060 is going to be coupled with a number of fast, modern cores. What GPU encoding is really good for is things like capturing game footage during gameplay, where the CPU is already heavily taxed, and running x264 as well would hurt FPS.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 15th June 2020, 09:20   #4  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by benwaggoner View Post
On modern hardware, x264 at a reasonable preset is fast enough that it rarely pays off to use NVEnc for file-to-file encoding; generally an RTX 2060 is going to be coupled with a number of fast, modern cores. What GPU encoding is really good for is things like capturing game footage during gameplay, where the CPU is already heavily taxed, and running x264 as well would hurt FPS.
Thanks, very helpful post. I guess I'll keep using x264 and use the GPU for KNLMeansCL de-noising or similar tasks.
Old 16th June 2020, 22:31   #5  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by Groucho2004 View Post
Thanks, very helpful post. I guess I'll keep using x264 and use the GPU for KNLMeansCL de-noising or similar tasks.
GPUs are great for all kinds of signal processing, particularly ones with a one-way "waterfall" style of processing and with highly parallelizable tasks that don't require a lot of inter-process communication.

As it turns out, encoding to a modern video codec is one of the things in digital media LEAST suited to running on a GPU. There are so many ways to encode any given 4x4 block of pixels, and the choices made in one frame or region of a single frame impact lots of others. Lots of fast cores with SIMD, shared cache, and low-latency memory access turn out to be the best fit in the HEVC/AV1 era and beyond.
Old 17th June 2020, 16:57   #6  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,340
Quote:
Originally Posted by benwaggoner View Post
As it turns out, encoding to a modern video codec is one of the things in digital media LEAST suited to running on a GPU.
That's why it doesn't actually run on what you consider the "GPU". The media engine is just part of the GPU die, but it's entirely different hardware, and it shares no characteristics with the GPU itself. It's fully fixed-function hardware. Encoding, or decoding, does not leverage the usual GPU cores.

(PS: Decoding in some cases did, with what we call a "hybrid decoder", which was available before full hardware support was added in the next generation, but it's usually been terrible, and NVDEC still contains a CUDA MPEG2 decoder if one wants to use it - the fixed-function decoder exists as well, of course)
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 17th June 2020 at 17:02.
Old 18th June 2020, 02:06   #7  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by nevcairiel View Post
That's why it doesn't actually run on what you consider the "GPU". The media engine is just part of the GPU die, but it's entirely different hardware, and it shares no characteristics with the GPU itself. It's fully fixed-function hardware. Encoding, or decoding, does not leverage the usual GPU cores.
Right, and the fixed function encoders are even more constrained in making fast branching decisions than something running on CUDA or OpenCL. Even getting B-frame support in an on-die encoder is cause for celebration.

Quote:
(PS: Decoding in some cases did, with what we call a "hybrid decoder", which was available before full hardware support was added in the next generation, but it's usually been terrible, and NVDEC still contains a CUDA MPEG2 decoder if one wants to use it - the fixed-function decoder exists as well, of course)
It's a lot easier to make a performant fixed-function decoder than an encoder. Doing one in OpenCL is somewhat more tricky because there aren't THAT many ways to parallelize decode with H.264 (WPP in HEVC gave another welcome axis).
Old 26th July 2020, 16:17   #8  |  Link
Richard1485
Guest
 
Posts: n/a
Quote:
Originally Posted by Groucho2004 View Post
Any secret switches I can use or is x264 still far superior to NVEnc H.264?
While I completely agree with the consensus in the thread that x264>NVenc, it would still be useful to have some input about which switches might improve quality.

Code:
ffmpeg -i clip.mkv -vcodec h264_nvenc -preset 7 -profile:v 2 -level 41 -rc vbr_hq -2pass 1 -qmin 0 -qmax 51 -maxrate:v 40000k -b:v 25000k -bufsize:v 30000k -bf 3 -refs:v 3 -spatial-aq 0 -temporal-aq 0 -rc-lookahead 0 -surfaces 32 -no-scenecut 0 -nonref_p 1 -strict_gop 1 -coder:v cabac -bluray-compat 1 output.264
Any ideas? (I'm using ffmpeg on Linux.) The settings above are experimental and largely modeled on those that I use with x264.

EDIT: The settings have BD compatibility in mind.

Last edited by Richard1485; 26th July 2020 at 16:20.
Old 26th July 2020, 17:48   #9  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,997
My current settings, using quality based encoding rather than bitrate:
Code:
-c:v h264_nvenc -preset:v bd -profile:v high -level 41 -2pass 1 -maxrate 40000k -bufsize 30000k -b:v 0k -cq 22 -refs 0 -bf 3 -rc vbr_hq -b_adapt 1 -rc-lookahead 32 -surfaces 48 -spatial-aq 1 -temporal-aq 1 -aq-strength 4 -nonref_p 1 -b_ref_mode 2 -g 24 -bluray-compat 1 -pix_fmt yuv420p -out.264
Old 26th July 2020, 17:55   #10  |  Link
Greenhorn
Registered User
 
Join Date: Apr 2018
Posts: 61
Not really an NVEnc expert either, just a user of it:

I'd consider increasing rc-lookahead, enabling temporal-aq, and setting b_ref_mode to middle (2). You'll definitely need to see what those actually do for you, though. I don't THINK any of those would harm BD compatibility, and hopefully bluray-compat would nix them if they did.

With my version of ffmpeg (from July 24), preset 7 corresponds to a "lowlatency" preset; but with a standalone NVENC encoder (which wraps ffmpeg, admittedly) it corresponds to a "quality" preset. Which one are you targeting? (I'm not sure BD would need a low-latency preset, but if you've found it necessary/beneficial you already know more than me.)
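
Since the same number can apparently mean different things depending on the tool, it may be safer to look up what the local build offers and pass a preset by name. A small sketch, assuming ffmpeg is on the PATH; the helper name list_nvenc_options is made up for illustration:

```shell
#!/bin/sh
# Print the private options that this particular ffmpeg build exposes for
# h264_nvenc, including the names and numbers accepted by -preset.
list_nvenc_options() {
    ffmpeg -hide_banner -h encoder=h264_nvenc
}

# Show only the preset section; the context size of 20 lines is a guess:
#   list_nvenc_options | grep -A 20 -e '-preset'
```

With the names in hand, the command line can say something like -preset bd or -preset slow instead of a bare number, provided those names appear in the listing for your build.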

Edit: weighted_pred is incompatible with bframes as Sharc said below, my mistake.

Last edited by Greenhorn; 26th July 2020 at 22:15. Reason: Correction
Old 26th July 2020, 18:41   #11  |  Link
Richard1485
Guest
 
Posts: n/a
Thanks, guys. Unfortunately, I can't switch on temporal-aq or increase rc-lookahead without receiving a "No NVENC capable devices found" warning, which, despite what it sounds like, means that my hardware doesn't support such settings. Nonetheless, they are definitely advisable, so I'll add them in.

In respect of "preset 7", I did indeed think that it meant quality, rather than low latency. Perhaps switching it to 6 is a safer bet. I disable weightp even with x264, but I'll add it to the settings, switch bref to 2, and add -g 24.

Anyway, here's a second attempt:

Code:
ffmpeg -i clip.mkv -vcodec h264_nvenc -preset 6 -profile:v high -level 41 -rc vbr_hq -2pass 1 -qmin 0 -qmax 51 -maxrate:v 40000k -b:v 25000k -bufsize:v 30000k -bf 3 -refs:v 3 -spatial-aq 1 -temporal-aq 1 -aq-strength 4 -b_ref_mode 2 -rc-lookahead 32 -surfaces 48 -no-scenecut 0 -nonref_p 1 -strict_gop 1 -coder:v cabac -g 24 -bluray-compat 1 output.264
Let me know if I've overlooked anything. I wondered if it might be worth setting a minimum bitrate.

EDIT: Removed weighted_pred.

Last edited by Richard1485; 27th July 2020 at 16:00. Reason: Removed weighted_pred,
Old 26th July 2020, 19:43   #12  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,997
You may play with the -aq-strength. I found that setting it to 0 produced slightly better results, using metrics. But this probably depends on the source.....
I am not sure whether -weighted_pred is supported with B-frames (maybe it depends on the HW). I don't know which other settings the -preset 6 (or whatever) defines, so maybe some settings in the commandline are double-stitched.
Set -refs:v 0. The 'no capable devices found' should disappear, and 3 or 4 ref frames will automatically be set.
Also, don't blindly trust the bluray compliance. You'll have to find out what your blu-ray authoring software will eventually accept.
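
One cheap sanity check before handing the file to an authoring tool is to read back what the encoder actually wrote. The sketch below assumes ffprobe is installed alongside ffmpeg; inspect_stream is a hypothetical helper name, and the list of fields is just a starting selection. It only reports stream parameters and does not prove Blu-ray compliance.

```shell
#!/bin/sh
# Print basic parameters of the first video stream so they can be compared
# with what the authoring software expects (profile, level, ref frames, ...).
inspect_stream() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=codec_name,profile,level,width,height,refs,has_b_frames \
        -of default=noprint_wrappers=1 "$1"
}

# Usage:
#   inspect_stream output.264
```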

Last edited by Sharc; 26th July 2020 at 20:16.
Old 26th July 2020, 22:13   #13  |  Link
Richard1485
Guest
 
Posts: n/a
Yeah, aq-strength definitely depends on the source. It's one of the settings that I used to tune when using hcenc.
Quote:
Originally Posted by Sharc View Post
I am not sure whether -weighted_pred is supported with B-frames (maybe it depends on the HW). I don't know which other settings the -preset 6 (or whatever) defines, so maybe some settings in the commandline are double-stitched.
Set -refs:v 0. The 'no capable devices found' should disappear, and 3 or 4 ref frames will automatically be set.
Setting -refs:v 0 on its own doesn't clear the warning. I have to disable rc-lookahead, temporal-aq, and -weighted_pred to accomplish that. But my card's pretty old.
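
Finding out which options a given card rejects can be done one option at a time with a short synthetic encode, rather than by editing the full command line. A rough sketch, assuming an ffmpeg build with h264_nvenc and the lavfi testsrc2 source; probe_nvenc is a made-up helper name, and the 1280x720 at 24 fps test pattern is arbitrary:

```shell
#!/bin/sh
# Encode one second of a generated test pattern with the given extra options
# and report whether the encoder accepted them. Nothing is written to disk.
probe_nvenc() {
    if ffmpeg -hide_banner -loglevel error \
            -f lavfi -i testsrc2=size=1280x720:rate=24 \
            -t 1 -c:v h264_nvenc "$@" -f null - ; then
        echo "ok: $*"
    else
        echo "rejected: $*"
    fi
}

# Usage:
#   probe_nvenc -rc-lookahead 32
#   probe_nvenc -temporal-aq 1
#   probe_nvenc -bf 3 -b_ref_mode 2
```

A "rejected" line only means ffmpeg exited with an error for that combination; the cause could equally be the driver or a missing GPU, so the error message itself is still worth reading.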

A bit of double-stitching is probably inevitable, but it doesn't hurt to spell things out in an exemplar. Users can always tweak the settings to suit.

Quote:
Originally Posted by Sharc View Post
Also, don't blindly trust the bluray compliance.
Agreed. Compliance is a useful starting point but not the be-all and end-all. Thanks for your help.

Last edited by Richard1485; 27th July 2020 at 04:54.
Old 27th July 2020, 02:23   #14  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by Sharc View Post
You may play with the -aq-strength. I found that setting it to 0 produced slightly better results, using metrics.
Given it is a psychovisual optimization, aq-strength tends to improve subjective quality while reducing objective metrics. Thus you can't use objective metrics to determine if it is useful or not.
Old 27th July 2020, 12:57   #15  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,997
Quote:
Originally Posted by benwaggoner View Post
Given it is a psychovisual optimization, aq-strength tends to improve subjective quality while reducing objective metrics. Thus you can't use objective metrics to determine if it is useful or not.
Ah yes, I forgot that --aq-spatial is a psychovisual optimization. Thanks.
From the NVIDIA SDK docs:
Quote:
Although spatial AQ improves the perceptible visual quality of the encoded video, the required bit redistribution results in PSNR drop in most of the cases. Therefore, during PSNR-based evaluation, this feature should be turned off.
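
For a PSNR-based check along those lines, ffmpeg's own psnr filter can compare an encode with its source. A minimal sketch, assuming both files decode to the same resolution and frame count; measure_psnr is a hypothetical helper name, and a raw .264 stream may need remuxing first if the frames do not line up:

```shell
#!/bin/sh
# Compare an encode against its source with the psnr filter.
# $1 is the encode under test, $2 is the reference; the summary is printed
# in ffmpeg's log output at the end of the run.
measure_psnr() {
    ffmpeg -hide_banner -i "$1" -i "$2" -lavfi psnr -f null -
}

# Usage (encode made with -spatial-aq 0 -temporal-aq 0 for the metric run):
#   measure_psnr output.264 clip.mkv
```

The number is only useful for comparing encodes made with AQ off; for AQ settings, looking at the picture remains the test.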

Last edited by Sharc; 27th July 2020 at 13:10.
Old 27th July 2020, 13:25   #16  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,997
Quote:
Originally Posted by Richard1485 View Post
Thanks, guys. Unfortunately, I can't switch on temporal-aq or increase rc-lookahead without receiving a "No NVENC capable devices found" warning, which, despite what it sounds like, means that my hardware doesn't support such settings.

Anyway, here's a second attempt:

Code:
ffmpeg -i clip.mkv -vcodec h264_nvenc -preset 6 -profile:v high -level 41 -rc vbr_hq -2pass 1 -qmin 0 -qmax 51 -maxrate:v 40000k -b:v 25000k -bufsize:v 30000k -bf 3 -refs:v 3 -spatial-aq 1 -temporal-aq 1 -aq-strength 4 -b_ref_mode 2 -rc-lookahead 32 -surfaces 48 -no-scenecut 0 -nonref_p 1 -weighted_pred 1 -strict_gop 1 -coder:v cabac -g 24 -bluray-compat 1 output.264
Let me know if I've overlooked anything. I wondered if it might be worth setting a minimum bitrate.
You should remove (or disable) -weighted_pred. According to NVIDIA's SDK Doc:
Quote:
NVENCODE API supports weighted prediction for HEVC and H.264 starting from Pascal generation GPUs.
Weighted prediction is not supported if the encode session is configured with B frames.
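
That restriction is easy to check mechanically before launching a long encode. The sketch below is plain shell and does not call ffmpeg; check_wp_bframes is a made-up helper that scans an option list for -bf and -weighted_pred and assumes both take integer values.

```shell
#!/bin/sh
# Report a conflict when an option list asks for both B-frames (-bf > 0)
# and weighted prediction (-weighted_pred 1). Other options are ignored.
check_wp_bframes() {
    bf=0
    wp=0
    # Walk the list one word at a time, reading the word after each flag.
    while [ "$#" -gt 1 ]; do
        case "$1" in
            -bf) bf=$2 ;;
            -weighted_pred) wp=$2 ;;
        esac
        shift
    done
    if [ "$bf" -gt 0 ] && [ "$wp" -eq 1 ]; then
        echo "conflict: -weighted_pred 1 with -bf $bf"
        return 1
    fi
    echo "ok"
}

# Usage:
#   check_wp_bframes -bf 3 -refs:v 3 -weighted_pred 1 -g 24
```

For a Blu-ray style encode with -bf 3, that means leaving -weighted_pred out or setting it to 0.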
Old 27th July 2020, 15:58   #17  |  Link
Richard1485
Guest
 
Posts: n/a
Done! We're getting there.