NVEnc as plugin? [Archive]

tormento

4th July 2023, 10:43

I am reading the NVEnc parameters (https://github.com/rigaya/NVEnc/blob/master/NVEncC_Options.en.md) and it offers so many functions in hardware that it is awesome, to say the least.

Unfortunately, the only way to unlock its power now is to encode to a lossless intermediate and then encode that with a software encoder.

What about using the available source to have a powerful plugin for AVS?

It would do tetrahedral 3D LUT transformations, high-quality resizing and so on. It would be similar to z.lib but way faster.

That would be great.

Pinterf? DTL2020? Anyone? :)

DTL

5th July 2023, 06:27

tormento

5th July 2023, 09:07

Possibly the most interesting denoise
Thanks for your experience.

Would it be possible anyway to turn NVEnc into an AVS plugin?

anton_foy

5th July 2023, 09:11

Possibly the most interesting denoise, built on very advanced (?) AI models from NVIDIA, still does not preserve detail well - https://github.com/NVIDIA/MAXINE-VFX-SDK . It also requires at least a Turing GPU or later. Possibly it is meant only for cheap webcams and simple web conferences where top quality is not required.

While it shows loss of details it also gives a prominent color cast. But other features may be interesting and useful though.

Here is a request about degrain/mvtools:
https://github.com/rigaya/NVEnc/issues/243

DTL

5th July 2023, 11:26

"Would it be possible anyway to turn NVEnc into an AVS plugin?"

Yes, there are several possible ways. Since the 'filters' in the NVEnc application are implemented as separate processing units (though only in a fixed order), two approaches are possible:
1. Make a single plugin with the whole filter chain fixed, a single input and a single output, passing one long string of arguments the way NVEnc does.
2. Treat the NVEnc Vpp features as a collection of filters, each extracted and exposed as a separate AVS filter (ported to CUDA).

Solution 1 may be faster to implement and may also run faster when several filters are enabled, because I assume they keep all data in accelerator memory. Also, if new filters are added to NVEnc (including mvtools on CUDA), they could be used more easily in AVS.
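
To make solution 1 a bit more concrete, here is a minimal Python sketch of what "one long argument string, fixed filter order" implies. The stage names, their order and the `stage:key=value` syntax are all hypothetical and are not NVEnc's real options; the only point is that the plugin, not the user, decides the order in which enabled stages run.

```python
# Hypothetical stage names and order; the real NVEnc Vpp stages differ.
FIXED_ORDER = ["deinterlace", "denoise", "resize", "lut3d"]


def parse_args(arg_string):
    """Parse 'stage:key=val,key=val;stage:...' into {stage: {key: val}}."""
    enabled = {}
    for part in arg_string.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, params = part.partition(":")
        name = name.strip()
        if name not in FIXED_ORDER:
            raise ValueError(f"unknown stage: {name}")
        options = {}
        for pair in params.split(","):
            if pair.strip():
                key, _, value = pair.partition("=")
                options[key.strip()] = value.strip()
        enabled[name] = options
    return enabled


def build_chain(arg_string):
    """Return the enabled stages in the plugin's fixed order, not the user's order."""
    enabled = parse_args(arg_string)
    return [(name, enabled[name]) for name in FIXED_ORDER if name in enabled]
```

Even if the user writes `resize` before `denoise` in the string, `build_chain` still returns denoise first. That is the price of the single-plugin design; solution 2 would leave the ordering to the AVS script.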

The only thing I do not like about this project is its strict dependence on a single hardware vendor (and maybe on Windows only too, and possibly not very old Windows). So the few remaining freeware open-source developers for AVS may not be very interested in doing such work. You may try asking Asd-g whether he can put some time into this.

"Unfortunately, the only way to unlock its power now is to encode to a lossless intermediate and then encode that with a software encoder."

To use its output in an x26x encoder, you may try asking the NVEnc or x26x developers to provide a compatible output that can be fed directly into the x26x command line. I think there are many more active developers still around for the x26x projects.

kedautinh12

5th July 2023, 11:47

Asd-g won't make CUDA plugins

DTL

5th July 2023, 11:52

If NVIDIA sees some market benefit in having an AVS plugin, maybe we can get one from professional NVIDIA developers. So a request could be sent to NVIDIA/CUDA development (by e-mail, on their forums and so on).

anton_foy

5th July 2023, 12:14

Solution 1 may be faster to implement and may also run faster when several filters are enabled, because I assume they keep all data in accelerator memory. Also, if new filters are added to NVEnc (including mvtools on CUDA), they could be used more easily in AVS.

This sounds great. Can your version of mvtools make use of this too?

DTL

5th July 2023, 18:03

There were some attempts to put MAnalyse on CUDA - https://github.com/pinterf/AviSynthCUDAFilters/blob/master/KTGMC/MVKernel.cu . Other filters like MCompensate or MDegrainN are much simpler. But it looks like the old developers are gone and nobody else can take it over and continue development. Again, it is something to ask NVIDIA about, as the current, still-alive manufacturer of CUDA hardware.

"Can your version of mvtools make use of this too?"

No.

tormento

5th July 2023, 20:50

Asd-g won't make CUDA plugins
The code is there in Rigaya's repo, almost all that is needed. That said, my programming competence is limited to some Finite Element Analysis in Fortran. ;)

If not a CUDA plugin, perhaps a pipe or something like that, so that nothing has to be written to disk and the output can be used directly inside AVS.

DTL

5th July 2023, 21:42

If NVEnc is meant to be an interface to the onboard hardware encoder, I think it cannot simply bypass the encoder after the Vpp stage and output uncompressed frame data from onboard memory to host memory. So the main part to implement for an AVS plugin is downloading the uncompressed frames after the Vpp chain from the accelerator board into host memory, so they can be presented to the next AVS filter in the GetFrame() function. As for input from AVS, it looks close to being implemented in NVEnc.

FranceBB

5th July 2023, 21:45

perhaps a pipe

Considering that, aside from .avs scripts, it can take y4m pipes, I think adding --y4m support in output wouldn't be such a big deal for Rigaya.
I think it's worth opening a feature request on Rigaya's Github (https://github.com/rigaya/NVEnc/tree/master).

This would lead to:

Avisynth (indexing and post-processing) .avs -> NVEnc (cuda accelerated filtering/LUT application) y4m pipe -> x265 (encoding) .h265 file
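
For anyone wondering what "y4m output" would actually involve, here is a minimal Python sketch of a YUV4MPEG2 writer for 8-bit 4:2:0 frames. The function names are made up for illustration and have nothing to do with NVEnc's code; the stream layout (one text header line, then `FRAME` plus raw Y, U and V planes per frame) is the standard y4m layout.

```python
import io


def write_y4m_header(out, width, height, fps_num, fps_den):
    """Write the single stream header line (progressive, square pixels, 4:2:0)."""
    header = f"YUV4MPEG2 W{width} H{height} F{fps_num}:{fps_den} Ip A1:1 C420\n"
    out.write(header.encode("ascii"))


def plane_sizes_420(width, height):
    """Return (luma_bytes, chroma_bytes_per_plane) for 8-bit 4:2:0."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return width * height, chroma_w * chroma_h


def write_y4m_frame(out, y_plane, u_plane, v_plane):
    """Write one frame: the FRAME marker followed by the raw Y, U and V planes."""
    out.write(b"FRAME\n")
    out.write(y_plane)
    out.write(u_plane)
    out.write(v_plane)


def demo_stream(width=4, height=2, frames=2):
    """Build a tiny all-grey stream in memory and return it as bytes."""
    luma, chroma = plane_sizes_420(width, height)
    out = io.BytesIO()
    write_y4m_header(out, width, height, 24000, 1001)
    for _ in range(frames):
        write_y4m_frame(out, bytes([128]) * luma, bytes([128]) * chroma, bytes([128]) * chroma)
    return out.getvalue()
```

In a real pipe, `out` would be the process's binary stdout instead of a `BytesIO`, and the reader on the other end would be something like x265 started with `--y4m --input -`.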

If NVEnc is meant to be an interface to the onboard hardware encoder, I think it cannot simply bypass the encoder after the Vpp stage and output uncompressed frame data from onboard memory to host memory.

Oh... I didn't know... bummer... :(

As for input from AVS, it looks close to being implemented in NVEnc.

It's already (finally) there, actually. :)
That's what Tormento and I did yesterday and today: straight from Avisynth to NVEnc. ;)

DTL

6th July 2023, 11:37

While it shows loss of details it also gives a prominent color cast. But other features may be interesting and useful though.

Here is a request about degrain/mvtools:
https://github.com/rigaya/NVEnc/issues/243

That request reflects the old state of things, when developers did not want to port the full mvtools to CUDA: at least MSuper (with subsample calculation), plus the fairly complex MAnalyse, plus the fairly simple MDegrainN.

Now NVIDIA provides hardware ME, and it replaces MAnalyse. So developers of a temporal degrain filter only need to implement the much simpler MDegrainN, at least as it stands in version 2.7.45. The interpolated overlap should also not be hard to port to the accelerator, I hope, though it adds some dependence between the shaders processing neighbouring blocks. So in 2023 the NVEnc developers may be less afraid to port just this small part of mvtools to make a fully hardware-accelerated motion-compensated temporal denoise filter on an NVIDIA board. Hardware ME also does not require MSuper with its precalculated subsample data, and a shader example is available showing how to do the subsample shifting required for MDegrainN, so the full precision provided by hardware ME (up to qpel) can be kept.
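
To show why the degrain step is the simple part, here is a rough Python sketch of a per-block temporal blend in the MDegrain style. It is only an illustration under stated assumptions: the weighting formula is written from memory of the mvtools scheme and should be checked against the actual source, the blocks are assumed to be already motion compensated (that is the job of ME), and overlap and subsample handling are left out entirely.

```python
def degrain_weight(th_sad, block_sad):
    """Weight in 0..256 for one reference block; 0 when its SAD reaches thSAD.

    Assumed formula (from memory): 256 * (thSAD^2 - SAD^2) / (thSAD^2 + SAD^2).
    """
    if block_sad >= th_sad:
        return 0
    th2 = th_sad * th_sad
    sad2 = block_sad * block_sad
    return 256 * (th2 - sad2) // (th2 + sad2)


def degrain_block(src, refs, sads, th_sad):
    """Blend a source block with motion-compensated reference blocks.

    src  - list of pixel values of the current block
    refs - list of reference blocks, same length as src each
    sads - SAD of each reference block against src, as reported by ME
    """
    weights = [degrain_weight(th_sad, sad) for sad in sads]
    src_weight = 256  # the current block always gets full weight
    total = src_weight + sum(weights)
    out = []
    for i, pixel in enumerate(src):
        acc = pixel * src_weight
        for weight, ref in zip(weights, refs):
            acc += weight * ref[i]
        out.append((acc + total // 2) // total)  # rounded weighted average
    return out
```

A reference block whose SAD is at or above `th_sad` gets weight 0 and simply drops out of the average, so a bad motion match cannot smear into the result. Each output pixel depends only on the co-located pixels of a few blocks, which is why this part maps so easily onto a shader.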