Old 30th April 2019, 15:57   #1  |  Link
kabelbrand
Compression mode: Lousy
 
 
Join Date: Mar 2009
Location: Hamburg, Germany
Posts: 74
NVidia GPU encode performance differences

Hi,

As I understand it, the encode engine is separate from the CUDA cores, so the CUDA core count does not seem to be a (huge) factor.
I suspect there is no big performance difference within a GPU generation and that speed depends only on the number of NVENC engines, see here.
So in theory a Quadro P5000 would be twice as fast as a P4000, and a P2000 would still be as fast as a P4000!?

I am wondering how GPU generations differ in encoding speed, e.g. K2000 vs. M2000 vs. P2000 or P4000 vs. RTX4000.
Does the number of streams per NVENC roughly translate into fractions of real time for a single stream?

Any thoughts are appreciated.

Thank you.

Last edited by kabelbrand; 30th April 2019 at 16:38.
Old 30th April 2019, 19:10   #2  |  Link
Selur
Registered User
 
 
Join Date: Oct 2001
Location: Germany
Posts: 6,440
I don't own any card with multiple encoder (NVENC) chips, but I thought that 'the number of encoders' * 'the number of parallel encodings' would be the number of streams you can encode in parallel with that card,....
(Total # of NVENC) * (Max # of concurrent sessions) = 'number of streams that can be encoded in parallel'

Never thought that multiple chips could be used to speed up the encoding of a single stream,...
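
That formula, sketched in a few lines of Python (the engine and session counts below are hypothetical, not taken from any real card):

```python
def parallel_streams(total_nvenc: int, max_concurrent_sessions: int) -> int:
    """Number of streams that can be encoded in parallel, per the formula above:
    (Total # of NVENC) * (Max # of concurrent sessions)."""
    return total_nvenc * max_concurrent_sessions

# Hypothetical card: 2 NVENC chips, each handling 3 concurrent sessions.
print(parallel_streams(2, 3))  # 6
```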
__________________
Hybrid here in the forum, homepage
Notice: Since email notifications do not work here any more, it might take me quite some time to notice a reply to a thread,..
Old 30th April 2019, 19:13   #3  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,163
I would be surprised if single-stream encoding performance increases on those cards with multiple NVENC engines, but since they are rather expensive, I can't really confirm or deny it officially. Single Stream encoding is already quite fast even on consumer GeForce cards though.

The generation of the card certainly matters, however, even more so in quality than in speed. Turing, for example, made pretty big leaps in quality.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 1st May 2019, 05:04   #4  |  Link
FranceBB
Broadcast Encoder
 
 
Join Date: Nov 2013
Location: Metropolitan City of Milan, Italy
Posts: 1,851
@kabelbrand... don't bother. I have an NVIDIA Quadro M4000, but I don't use it to encode because the quality of the resulting encode is not suitable for production; it falls way behind CPU encoding in SSIM/PSNR tests. I use it to scroll the timeline of my NLE smoothly, and sometimes with OpenCL when I have to encode with x264, that's all.
For everything else, there's CPU-only encoding.
Old 2nd May 2019, 11:05   #5  |  Link
kabelbrand
Compression mode: Lousy
 
 
Join Date: Mar 2009
Location: Hamburg, Germany
Posts: 74
Thank you guys.

Yes, it would not make sense if encoding a single stream used more than one NVENC. When testing with large QuickTime and MXF files I noticed the limiting factor was not the GPU but I/O and audio encoding, so a single NVENC will be sufficient.

In terms of quality, I guess using the latest generation is the way to go. And even if I'm not aiming for quality in my current scenario, it doesn't hurt to get the best possible quality at a given bitrate.

A few years back I did some tests with a Kepler card and wasn't very impressed with the encode quality but the Pascal card I recently used performed much better in terms of quality.

So the Quadro P2000 seems to be the obvious budget choice here and the RTX4000 the encode quality pick. I guess NVidia will also introduce entry level Turing Quadro cards later on.
Old 2nd May 2019, 20:54   #6  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,325
Unfortunately a recently released low end Turing, the 1650, has the older Volta generation NVENC with quality more similar to Pascal's. You have to be pretty careful when buying a GPU for its NVENC generation.

Edit: Why do you only discuss Quadros for encoding? Wouldn't a 1660 Ti be the budget choice? Do you need lots of concurrent sessions? That seems to be the only difference for single-NVENC Quadros, with GeForce limited to 2 sessions while Quadros can run as many at once as you want. Curiously, none of the RTX cards have multiple NVENC chips.
__________________
madVR options explained

Last edited by Asmodian; 2nd May 2019 at 21:07.
Old 2nd May 2019, 22:49   #7  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,163
Quote:
Originally Posted by Asmodian View Post
Unfortunately a recently released low end Turing, the 1650, has the older Volta generation NVENC with quality more similar to Pascal's. You have to be pretty careful when buying a GPU for its NVENC generation.
Wow, wonder what sort of recycling is going on there. Good that I was impatient and got a 1660 a few weeks ago instead of waiting for the 1650.

Regarding concurrent sessions, that is purely a driver limitation. The implication being that if one is crafty enough, it can be lifted....
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 3rd May 2019, 10:42   #8  |  Link
kabelbrand
Compression mode: Lousy
 
 
Join Date: Mar 2009
Location: Hamburg, Germany
Posts: 74
Quote:
Originally Posted by Asmodian View Post
Why do you only discuss Quadros for encoding? Wouldn't a 1660 Ti be the budget choice?
The GTX 1660 is a good choice indeed, but my interest in single-session performance is more out of curiosity, and a 2-session limit would heavily restrict use cases.
Messing around with drivers or firmware is not an option either. Also, I'd prefer a single-slot card since this is supposed to go into a 1U 19" enclosure.
Old 9th November 2020, 20:11   #9  |  Link
kabelbrand
Compression mode: Lousy
 
 
Join Date: Mar 2009
Location: Hamburg, Germany
Posts: 74
https://docs.nvidia.com/video-techno...nc-performance
Old 10th November 2020, 01:51   #10  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,325
It is really interesting that Ampere and Turing dropped interlaced H.264 encoding. The only "N" in that column.

I am happy to see the disappearance of interlaced video but I am surprised that it was worth removing from the new chip.
__________________
madVR options explained
Old 10th November 2020, 09:07   #11  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,851
Quote:
Originally Posted by Asmodian View Post
It is really interesting that Ampere and Turing dropped interlaced H.264 encoding. The only "N" in that column.

I am happy to see the disappearance of interlaced video but I am surprised that it was worth removing from the new chip.
Why happy? Blu-ray compliance requires 'interlaced' for 25 and 29.97fps.
Fake-interlaced is not possible in all cases.
Old 11th November 2020, 21:29   #12  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,325
I took Nvidia dropping support for interlaced video as a sign that interlacing was on its way out, and I was happy about that. No new video created today should be interlaced. Like chroma subsampling, interlaced video is a bad hack to reduce the bandwidth required. We should have stopped using it long before we did.

There is always some pain when dropping an old standard, but interlaced video is bad enough that it would take far more pain than the occasional impossibility of fake-interlacing for Blu-ray compatibility for me to think supporting interlaced video is still important.

In the worst case software encoding for interlaced H.264 is pretty fast.
__________________
madVR options explained
Old 12th November 2020, 18:24   #13  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,981
Quote:
Originally Posted by nevcairiel View Post
I would be surprised if single-stream encoding performance increases on those cards with multiple NVENC engines, but since they are rather expensive, I can't really confirm or deny it officially.
That's validated here. Interesting that some GPUs have multiple NVENC modules: https://docs.nvidia.com/video-techno...nc-performance

Quote:
While Kepler and first-generation Maxwell GPUs had one NVENC engine per chip, certain variants of the second-generation Maxwell, Pascal and Volta GPUs have two/three NVENC engines per chip. This increases the aggregate encoder performance of the GPU. NVIDIA driver takes care of load balancing among multiple NVENC engines on the chip, so that applications don’t require any special code to take advantage of multiple encoders and automatically benefit from higher encoder capacity on higher-end GPU hardware. The encode performance listed in Table 3 is given per NVENC engine. Thus, if the GPU has 2 NVENCs (e.g. GP104, GM204), multiply the corresponding number in Table 3 by the number of NVENCs per chip to get aggregate maximum performance (applicable only when running multiple simultaneous encode sessions). Note that performance with single encoding session cannot exceed performance per NVENC, regardless of the number of NVENCs present on the GPU.
I imagine the scenario is for things like recording game activity and webcam simultaneously, or perhaps encoding multiple bitrates at once for adaptive streaming.
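
The quoted rule can be sketched as a small calculation. The per-engine figure below is made up for illustration, not a number from Table 3:

```python
def max_aggregate_fps(per_engine_fps: int, num_engines: int, sessions: int) -> int:
    """Upper bound on combined encoder throughput per the quoted NVIDIA rule:
    multiple sessions are load-balanced across engines, but a single session
    can never exceed one engine's throughput."""
    # Each session is capped at one engine; total capacity is all engines.
    return per_engine_fps * min(num_engines, sessions)

# Hypothetical GPU with 2 NVENC engines, each good for 600 fps at some preset:
print(max_aggregate_fps(600, 2, 1))  # single session: 600, not 1200
print(max_aggregate_fps(600, 2, 4))  # multiple sessions: 1200 aggregate
```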
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 12th November 2020, 20:57   #14  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,876
Quote:
I took Nvidia dropping support for interlaced video as a sign that interlacing was on its way out
Not in broadcast, it's not. The vast majority of linear HDTV networks are 1080i, and that's not changing any time soon. Heck, there are still TONS of SDTV networks.

Many facilities have large production switchers and baseband routers that are 1.5 Gbps HD-SDI, so even going to 1080p is totally unthinkable without rebuilding the whole facility.

Only newer facilities have 3G-SDI or faster for their baseband.

Interlacing is here to stay.
Old 14th November 2020, 01:03   #15  |  Link
Emulgator
Big Bit Savings Now !
 
 
Join Date: Feb 2007
Location: close to the wall
Posts: 1,058
Quote:
I am happy to see the disappearance of interlaced video
Quote:
Why happy? Blu-ray compliance requires 'interlaced' for 25 and 29.97fps.
And not only that.
A codec/encoder which cannot accommodate yesterday's formats would be unfit for archival.
This would affect millions of hours of recorded material, BTW.

Not being able to read cuneiform: No entrance to library ;-)

rantilein start
Why the authorities back in the D1 times opted to save an interlaced picture as one frame, not as two consecutive fields, will always stick out like a sore thumb to me.
DCT HF coefficients wasted, motion analysis complicated, and with the advent of 4:2:0, chroma destroyed. What a sabotage.
HEVC simply starts to treat interlaced in just the abovementioned way and demands that reinterlacing be done after the fields are decoded, at rendering. Oh well.
rantilein end
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're working on that issue. Synce invntoin uf lingöage..."

Last edited by Emulgator; 14th November 2020 at 01:16.
Old 14th November 2020, 11:28   #16  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,163
If you do long-term quality archiving with a GPU encoder, you are doing it very wrong anyway. The only category in which GPU encoders win is speed, which suits real-time processing, not archival.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 14th November 2020, 13:01   #17  |  Link
zub35
Registered User
 
Join Date: Oct 2016
Posts: 58
@nevcairiel: or fast lossless encoding, where the final size is not so critical.
Old 16th November 2020, 04:38   #18  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,325
Why are you re-encoding this interlaced video archive, though? I assume any interlaced video you are planning to feed an NVENC chip is already encoded as something. If fast-and-dirty interlaced encoding is OK, why not save some time and keep the original video? For the vast majority of customers, interlaced video encoding in hardware is a waste of transistors and dev time.

I do VHS archival and I still capture new tapes, but I capture to MagicYUV. For archival I keep it as interlaced H.264, using very slow software encoding. I don't think hardware should worry about archival video; its use case is real-time video encoding. As improved as hardware encoding is, I still cannot countenance its use for archival.
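
A sketch of that archival workflow as an ffmpeg command line, assembled in Python. The filenames and CRF value are hypothetical, and these x264 settings are only one reasonable choice for slow, interlaced-aware software encoding:

```python
# Hypothetical input: a MagicYUV capture of a VHS tape.
capture = "tape_capture.avi"
archive = "tape_archive.mkv"

cmd = [
    "ffmpeg", "-i", capture,
    "-c:v", "libx264",
    "-preset", "veryslow", "-crf", "16",  # very slow, high-quality software encode
    "-flags", "+ildct+ilme",              # interlaced-aware DCT and motion estimation
    "-top", "1",                          # top field first, typical for PAL captures
    "-c:a", "copy",                       # keep the captured audio untouched
    archive,
]
print(" ".join(cmd))
```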
__________________
madVR options explained
Old 16th November 2020, 10:22   #19  |  Link
excellentswordfight
Lost my old account :(
 
Join Date: Jul 2017
Posts: 206
The best use case I can think of is doing hw-encodes of broadcast DI formats, e.g. AVC-I/XAVC 1080i, which are still rather popular. We also still have distribution and contribution encoding that uses interlaced AVC and needs to be encoded in realtime, where NVENC could be used as a platform.

Quote:
Originally Posted by Asmodian View Post
Why are you re-encoding this interlaced video archive, though? I assume any interlaced video you are planning to feed an NVENC chip is already encoded as something.
Realtime ingest is a thing though, and there are AVC-based DI/mezzanine formats that are in fact very much in use. But I don't think I've seen any NVENC-based encoder that produces those formats; cards like the Matrox M264 do exist for a reason.

But I think the biggest market for NVENC is realtime live streaming (with customers like AWS/Elemental) and game streaming, and that's pretty much exclusively progressive, so I guess that's why they are dropping it.

Last edited by excellentswordfight; 16th November 2020 at 16:07.
Old 16th November 2020, 18:20   #20  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,981
Quote:
Originally Posted by excellentswordfight View Post
But I think the biggest market for NVENC is realtime live streaming (with customers like AWS/Elemental) and game streaming, and that's pretty much exclusively progressive, so I guess that's why they are dropping it.
Yeah, the real limitation is the lack of interlaced support in decoders. Generations of mobile devices shipped with progressive-only decode, so interlaced encodes are largely only delivered as broadcast to set top boxes and TV tuners.

And in my Gen X generation of streaming video engineers, interlaced has been a huge pain point for decades, so we tend to avoid using it at all costs. Interlaced was always a Boomer legacy thing, like Drop Frame timecode.

Interlaced almost got dropped back in ATSC 1.0, and that would have been a much better timeline to live in. At least there's no interlaced UHD. And H.264 was the last mainstream codec to treat interlaced as a first-class citizen, with a well-tuned MBAFF.

A good 1080i30->1080p60 conversion can look great, is much more compatible, isn't reliant on a client-side deinterlacer implementation, and takes maybe 20% more bits max in HEVC, last I tested. Heck, in most cases the total experience is probably better with 1080p60 than 1080i30 at the same bitrate. Some more encoding artifacts can be better than deinterlacing artifacts.
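
A hedged sketch of such a 1080i30 -> 1080p60 conversion, using ffmpeg's yadif deinterlacer in field-rate mode (one output frame per field). The filenames and HEVC CRF value are hypothetical:

```python
cmd = [
    "ffmpeg", "-i", "in_1080i30.ts",   # hypothetical 1080i29.97 source
    "-vf", "yadif=mode=send_field",    # field-rate output: 29.97i -> 59.94p
    "-c:v", "libx265", "-crf", "22",   # HEVC at a hypothetical quality level
    "out_1080p60.mp4",
]
print(" ".join(cmd))
```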
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book