LAV CUVID Decoder - High Quality Hardware decoding for NVIDIA [Archive] - Page 5

nevcairiel

14th April 2011, 07:09

sadly, no
For the moment (and it's not even certain that this is fixable) LAV CUVID decoder will not output the doubled framerate which results in jerky playback.
You get nice deinterlaced picture but with judder which is not acceptable for sports ... or anything else if you are serious about your quality experience.

Someone that knows more about actual technical details of deinterlacing .. how does the doubled framerate work?
I assume it doesn't simply output the same image twice, the two will be minimally different, right?

If i have 60 fields, how do i create 60 frames out of that?
the CUVID API has some parameters that i can set to get different images out of the post-processor .. but i'm not sure what i really want to get.

I haven't really bothered to look into the technical details of deinterlacing up until now - it also seems to be a well guarded secret. :p

madshi

14th April 2011, 08:07

Someone that knows more about actual technical details of deinterlacing .. how does the doubled framerate work?
I assume it doesn't simply output the same image twice, the two will be minimally different, right?

If i have 60 fields, how do i create 60 frames out of that?
the CUVID API has some parameters that i can set to get different images out of the post-processor .. but i'm not sure what i really want to get.

I haven't really bothered to look into the technical details of deinterlacing up until now - it also seems to be a well guarded secret. :p
You probably know this already, but let me explain, just to be safe:

First of all you must strictly separate between movie and video content.

Movie content was originally shot in 24p (or, in Europe, possibly in 25p). This is then converted to 60i (or 50i) by first interlacing the progressive frames and then repeating some of the interlaced fields where necessary (not needed for 25p -> 50i). For perfect deinterlacing you therefore need to throw away the repeated fields (if any) and then fuse the correct original fields back together to get the original 24p/25p. This is called IVTC. The important point is that all the interlaced fields originate from only 24 (or 25) different points in time.
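
To make the field repetition concrete, here is a toy sketch in Python. It assumes a plain 3:2 cadence and labels each field with the frame it came from; the function names are made up for illustration. A real IVTC filter has to detect repeated fields by comparing pixels, which this does not attempt.

```python
def pulldown_32(frames):
    """Expand progressive frames into a 3:2 field sequence (24p -> 60i).

    Each output entry is (source_frame, parity); parity alternates T/B.
    """
    fields = []
    top = True
    for i, frame in enumerate(frames):
        count = 3 if i % 2 == 0 else 2   # 3 fields, then 2, then 3, ...
        for _ in range(count):
            fields.append((frame, "T" if top else "B"))
            top = not top
    return fields


def ivtc(fields):
    """Toy inverse telecine: keep one frame per run of fields that share
    the same source frame. Relies on the labels, not on pixel data."""
    frames = []
    for source, _parity in fields:
        if not frames or frames[-1] != source:
            frames.append(source)
    return frames


fields = pulldown_32(list("ABCD"))
print(len(fields))      # 4 frames become 10 fields
print(ivtc(fields))     # the 4 original frames come back
```

All ten fields come from only four points in time, which is why weaving the right pairs recovers the original frames exactly.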

Video content was originally shot in 60i or 50i. It's important to note here that every interlaced field comes from a different point in time (!!). There are different kinds of cameras. Some shoot directly interlaced. Others shoot progressive and then throw away (or filter away) half the scanlines to produce interlaced output. Deinterlacing video content to 30p or 25p is VERY wrong, because you cannot simply put together two fields which come from different points in time. Basically the only proper way of deinterlacing video content is to create 60 or 50 progressive frames. The simplest way to do that is to take one field at a time and scale it up 2x vertically. This will give you 60p or 50p with fluid motion, but of course it will not be very sharp. This is called BOB deinterlacing. There are better methods. But one thing is clear: The output *must* be 50p or 60p.
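
A minimal sketch of that simplest BOB in Python, assuming a frame is just a list of rows and doing the 2x vertical scale by repeating lines. Real implementations interpolate and shift the bottom field by one line; this version only shows why the output has twice as many frames as the input.

```python
def split_fields(frame):
    """Return (top_field, bottom_field) of a frame given as a list of rows."""
    return frame[0::2], frame[1::2]


def bob(field):
    """Scale one field 2x vertically by repeating every line."""
    out = []
    for row in field:
        out.append(list(row))
        out.append(list(row))
    return out


def bob_deinterlace(frames, top_field_first=True):
    """Turn N interlaced frames into 2N progressive frames."""
    out = []
    for frame in frames:
        top, bottom = split_fields(frame)
        first, second = (top, bottom) if top_field_first else (bottom, top)
        out.append(bob(first))    # earlier point in time
        out.append(bob(second))   # later point in time
    return out


frame = [[0], [1], [2], [3]]      # 4 lines, 1 pixel wide
print(bob_deinterlace([frame]))   # two output frames for one input frame
```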

I don't suppose either DXVA or CUVID can automatically handle all this for you, but I don't really know. My best guess is that you need to offer users a switch between movie / video content. Then for video content you should make CUVID output 50/60 progressive frames. For movie content you should make it output 24/25 progressive frames. I don't know if the latter is possible. 25p output should be no problem. But for NTSC/ATSC movie content maybe CUVID can only output 30p, I don't know.

Of course things can get even more complicated than this. E.g. broadcasters like to blend *video* fonts / advertisements over movie content. Then you have a funny mixture of video and movie content in one stream. This is why better hardware deinterlacers decide per pixel (not per frame) whether the content is video or movie based. For this kind of content the best output is probably 60p / 50p.

tiny

14th April 2011, 11:46

Regarding the case of deinterlacing and CUVID it might be interesting to look at the (rather lengthy) dialog between Doom9's forum user neuron2 and Nvidia that took place during the development of DGAVCDecNV (http://neuron2.net/dgdecnv/dgdecnv.html).

That dialog is available here: http://neuron2.net/dgdecnv/cuda/cuda.html

yesgrey

14th April 2011, 12:28

I assume it doesn't simply output the same image twice, the two will be minimally different, right?
Right.

If i have 60 fields, how do i create 60 frames out of that?
Regarding the case of deinterlacing and CUVID it might be interesting to look at the (rather lengthy) dialog between Doom9's forum user neuron2 and Nvidia
Here is a quote from it that might answer your question...
"I'm assuming you mean 60Hz (or double frame rate) deinterlacing. This can be implemented by calling cuvidMapVideoFrame() twice: the first time with second_field=0, and the second time with second_field=1. The actual deinterlacing mode is still currently going to be the driver default (may be different on different GPUs and OSes), but it would allow you for example to convert 30Hz interlaced to 60Hz progressive (better than dropping one of the fields)."

nevcairiel

14th April 2011, 12:34

Here is a quote from it that might answer your question...
"I'm assuming you mean 60Hz (or double frame rate) deinterlacing. This can be implemented by calling cuvidMapVideoFrame() twice: the first time with second_field=0, and the second time with second_field=1. The actual deinterlacing mode is still currently going to be the driver default (may be different on different GPUs and OSes), but it would allow you for example to convert 30Hz interlaced to 60Hz progressive (better than dropping one of the fields)."

That was actually one of the things i found in the API that i was going to test, having some confirmation that it might even do what i want it to do, great! :)
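
The two-call pattern from the quote could look roughly like this. It is only a sketch in Python: `map_frame` is a hypothetical stand-in for cuvidMapVideoFrame() with second_field set to 0 or 1, and the timestamp handling (second output half a frame duration later) is an assumption about what a renderer would want, not something the quote spells out.

```python
def double_rate(decoded, map_frame):
    """Produce two progressive outputs per decoded interlaced frame.

    decoded:   iterable of (pic_index, pts, duration), any integer time unit
    map_frame: hypothetical callable(pic_index, second_field) -> image
    """
    out = []
    half = None
    for pic_index, pts, duration in decoded:
        half = duration // 2
        for second_field in (0, 1):
            image = map_frame(pic_index, second_field)
            # the second field is shown half a frame duration later
            out.append((image, pts + second_field * half, half))
    return out


# Fake mapper that just records what it was asked for.
frames = [(0, 0, 400), (1, 400, 400)]
for item in double_rate(frames, lambda idx, second: (idx, second)):
    print(item)
```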

yesgrey

14th April 2011, 12:45

Here is another quote that might be useful. It seems in some situations one problem can arise, and the NVidia guy suggests a way of solving that problem.
I've noticed that one frame remains salted away somewhere when the deinterlacer is enabled and I am unable to flush it. When I do my reset and start pushing NALUs again, the hidden frame from the previous decoding comes out and I cannot find a way to flush it.

Any ideas?

The unflushable frame that I mentioned in my last mail appears to be in the D3D instance somehow, because I can get rid of it by killing the server and re-starting it.

Would you know if there is a way to flush it out with a D3D call? If I kill and remake the D3D instance, I'm going to get hit by the crash on second D3D instantiation problem again.

I've seen this problem as well: what is happening is that when this particular high-end deinterlacing method is used, it unfortunately introduces one field delay, which does suck (not a problem for normal playback, but a big issue for a generic API like NVCUVID).

So far, I haven't been able to get rid of this, but if your only concern is being able to flush it, one way to do so is to start at second_field=1 after a discontinuity.

Ultimately, the best way to get rid of this is to perform the deinterlacing within NVCUVID using a cuda kernel, rather than relying on VMR9, with sometimes some rather obscure side effects, but this requires fairly significant amount of changes in nvcuvid, so it's going to take a while.

Aha, I confirm what you say. It has nothing to do with unsuccessful flushing.

If my field sequence shown in [] delimited frames is:

[a b] [c d] [e f] [g h] [i j] ...

Then when I deinterlace double rate, I get this sequence of bobbed frames:

X a b c d e f g ...

Where X is a leftover (field now a frame) from the end of the last decode sequence. The first field is just garbage.

And yes, your suggested strategy looks good. It's a complication for my random access code so I'll have to implement it carefully.
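
The "X a b c d e f g" sequence above is what a one-field delay produces. A toy Python model, assuming the deinterlacer simply hands back whatever was submitted on the previous call (the class and its behaviour are an illustration, not the actual driver logic, and the second_field=1 workaround is not modelled):

```python
class DelayedDeinterlacer:
    """Toy model: every call returns the field from the previous call."""

    def __init__(self, leftover="X"):
        self.held = leftover          # stale field from the last sequence

    def push(self, field):
        out, self.held = self.held, field
        return out


def run(frames, deint):
    """Feed [first, second] field pairs and collect the bobbed outputs."""
    out = []
    for first, second in frames:
        out.append(deint.push(first))
        out.append(deint.push(second))
    return out


frames = [("a", "b"), ("c", "d"), ("e", "f"), ("g", "h")]
print(" ".join(run(frames, DelayedDeinterlacer())))
```

In this model the first output after a discontinuity is always the stale field, so random access code has to account for one garbage output per seek.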

@neuron2,
I hope you don't mind me putting this quote here. If you do, let me know and I'll remove it.

yesgrey

14th April 2011, 12:47

Another thing worth testing is whether double rate is only achievable with BOB, or whether it can also be achieved with adaptive.

madshi

14th April 2011, 13:19

Adaptive is for video, not for movies. It would be a major design flaw if it couldn't do double rate. The only deinterlacing modes that shouldn't do double rate are IVTC and Weave.

yesgrey

14th April 2011, 15:47

Adaptive is for video, not for movies.
Right.

It would be a major design flaw if it couldn't do double rate.
Not necessarily, it would depend on how the adaptive mode works.
Let's consider the first 6 frames of a 60fps interlaced video:
a b c d e f
I think that deinterlacers typically work outputting half the frame rate by combining the frames in pairs, like this: ab cd ef
However, this is not the best solution. One better solution would be outputting the original frame rate but with frames created using the adjacent frames, like this: ab abc bcd cde def ef

According to what I've quoted in my previous posts, it seems the CUVID deinterlacing works like the first one, grouping the frames in pairs, and if that's the case, I don't think it would be possible to get double rate using other than BOB, because the second frame will not have any info about the next frame. However, without knowing the internals of it I can't say for sure.

Furthermore, neuron2's tool only allows double frame rate using bob. If it would work I don't see why neuron2 didn't do it...

madshi

14th April 2011, 16:00

I think that deinterlacers typically work outputting half the frame rate by combining the frames in pairs
No, they don't. Doing that would simply be broken design, nothing else. No single CE (consumer electronics) hardware deinterlacer I've ever heard of does that.

"Adaptive" probably means "motion adaptive deinterlacing" which means that for moving image parts BOB is used while for static image content WEAVE is used. BOB always outputs 60p/50p, unless you throw away every other field, which would again be broken design.

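A rough per-pixel sketch of that idea in Python: lines present in the current field are copied, and each missing pixel is either woven from the previous (opposite parity) field when nothing moved, or interpolated BOB-style when the same-parity neighbours changed. The motion test and the threshold are assumptions picked for illustration; hardware deinterlacers use more elaborate detection.

```python
def motion_adaptive(cur, prev_opposite, prev_same, cur_is_top, threshold=8):
    """Build one full frame from field `cur` (a list of rows).

    prev_opposite: previous field, opposite parity (source for weave)
    prev_same:     last field with the same parity as `cur` (motion test)
    """
    n = len(cur)
    frame = [None] * (2 * n)
    for i in range(n):
        if cur_is_top:
            above, below = i, min(i + 1, n - 1)
            y_have, y_missing = 2 * i, 2 * i + 1
        else:
            above, below = max(i - 1, 0), i
            y_have, y_missing = 2 * i + 1, 2 * i
        frame[y_have] = list(cur[i])
        line = []
        for x in range(len(cur[i])):
            change = max(abs(cur[above][x] - prev_same[above][x]),
                         abs(cur[below][x] - prev_same[below][x]))
            if change > threshold:
                # moving: interpolate from the current field (BOB)
                line.append((cur[above][x] + cur[below][x]) // 2)
            else:
                # static: take the pixel from the other field (WEAVE)
                line.append(prev_opposite[i][x])
        frame[y_missing] = line
    return frame


cur = [[10, 10], [10, 10]]
other = [[50, 50], [50, 50]]
print(motion_adaptive(cur, other, cur, True))                         # static
print(motion_adaptive(cur, other, [[100, 100], [100, 100]], True))    # moving
```

Because one frame is built per field, the output rate is still 60p/50p regardless of how many pixels end up woven.
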
yesgrey

14th April 2011, 16:50

No single CE (consumer electronics) hardware deinterlacer I've ever heard of does that.
You're right, but I was talking about software deinterlacers, not hardware ones.

BOB always outputs 60p/50p, unless you throw away every other field, which would again be broken design.
I agree. From what I've read on the discussion between neuron2 and nvidia, it seems it's not a broken design, because internally the deinterlacer uses the original frame rate, but then it only outputs half of it unless told otherwise...

madshi

14th April 2011, 17:13

You're right, but I was talking about software deinterlacers, not hardware ones.
Well, I don't know much about HTPC software deinterlacers. If some/many of them do 60i -> 30p for video content then IMHO the deinterlacer devs didn't know what they were doing.

CruNcher

14th April 2011, 19:30

Eh, it's an option, not something you have to do :) 60i->60p takes a lot of resources and bandwidth (depending on resolution), so some people want to save that, and there's nothing wrong with it. Since VPx and UVD on the desktop, though, it can be done at (currently) yadif-matching quality @ 1080p in realtime with shaders, with fairly low system requirements and relatively low power consumption, so only bandwidth is an issue ;)
Sure, doing that on a soccer game to save some bandwidth would be crazy, unless you want to give the game a more film-style look and throw away what the camera actually captured ;)
Might be currently one of the worlds most advanced Software Deinterlacer (http://forum.doom9.org/showthread.php?t=156028)
And here it might be one of the Worlds most advanced Hardware counterpart developed from US Military Research (http://www.hqv.com/index.cfm?page=tech.de-interlacing)

Dogway

14th April 2011, 19:37

Is LAVCUVID suited for Blu-ray playback? I'm doing some tests and it performs laggy, while ffdshow is OK. It's the Apocalypto Blu-ray.

VincAlastor

14th April 2011, 20:09

thank you for LAV CUVID. LAV CUVID doesn't run judder-free on my notebook with nvidia 8400m gt gpu and latest mpc hc, madvr 0.53 and LAV splitter, but thats not so important, to play around with your great tools is fascinating enough :D.
is it possible to fuse LAV CUVID, LAV splitter and madVR into one amazing new high quality avisynth source filter with cuda acceleration and maybe first-time avisynth dxva acceleration if madshi integrates dxva in madVR?

thank you nevcairiel and thank you madshi my videos look fantastic!

madshi

14th April 2011, 20:24

Eh, it's an option, not something you have to do :) 60i->60p takes a lot of resources and bandwidth (depending on resolution), so some people want to save that, and there's nothing wrong with it
Well, if you want to save resources and bandwidth you can also decode only key frames and throw away all other frames. Nothing wrong with that, either... :p

spartan711

14th April 2011, 21:16

I have consistent stuttering every 10 seconds. ~ 8 frames dropped every time. Here is the MediaInfo profile of the media.

Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L4.1
Format settings, CABAC : Yes
Format settings, ReFrames : 5 frames
Codec ID : V_MPEG4/ISO/AVC
Duration : 1h 47mn
Bit rate : 8 062 Kbps
Width : 1 920 pixels
Height : 800 pixels
Display aspect ratio : 2.40:1
Frame rate : 23.976 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.219
Stream size : 5.95 GiB (91%)
Writing library : x264 core 79 r1342 e8501ef
Encoding settings : cabac=1 / ref=5 / deblock=1:0:0 / analyse=0x3:0x113 / me=umh / subme=7 / psy=1 / psy_rd=1.0:0.2 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / chroma_qp_offset=-3 / threads=12 / nr=0 / decimate=1 / mbaff=0 / constrained_intra=0 / bframes=3 / b_pyramid=0 / b_adapt=1 / b_bias=0 / direct=1 / wpredb=1 / wpredp=0 / keyint=250 / keyint_min=25 / scenecut=40 / rc_lookahead=40 / rc=2pass / mbtree=1 / bitrate=8062 / ratetol=1.0 / qcomp=0.60 / qpmin=10 / qpmax=51 / qpstep=4 / cplxblur=20.0 / qblur=0.5 / ip_ratio=1.40 / aq=1:1.00
Language : English

Audio
ID : 2
Format : AC-3
Format/Info : Audio Coding 3
Mode extension : CM (complete main)
Codec ID : A_AC3
Duration : 1h 47mn
Bit rate mode : Constant
Bit rate : 640 Kbps
Channel(s) : 6 channels
Channel positions : Front: L C R, Side: L R, LFE
Sampling rate : 48.0 KHz
Bit depth : 16 bits
Compression mode : Lossy
Stream size : 494 MiB (7%)
Language : English

Text
ID : 3
Format : UTF-8
Codec ID : S_TEXT/UTF8
Codec ID/Info : UTF-8 Plain Text
Language : English

Hardware is
Processor: C2D T5800 2.0 Ghz
RAM: 4GB
Video Card: 9600M GT
OS: 64 bit Windows 7 Pro

Process priority is on high, and I have nothing else running.

Tom Keller

15th April 2011, 01:11

is it possible to fuse LAV CUVID, LAV splitter and madVR into one amazing new high quality avisynth source filter with cuda acceleration and maybe first-time avisynth dxva acceleration if madshi integrates dxva in madVR?
MadVR is great (not just cause it's the only video renderer that works perfect with the LAV CUVID Decoder for me :p )! But since madVR simply IS a video renderer, it has no place in AviSynth, since AviSynth doesn't need any video renderer at all. It "only" outputs uncompressed video to any appropriate application - so AviSynth doesn't need to do video rendering.

However... i guess it should be possible to use the same (or at least: similar) scaling and upsampling algorithms inside AviSynth (for HQ resizing and color space conversion), which are used by madVR.

As to the rest: with DGDecNV one source filter with CUDA acceleration still exists. It's not free though - but it works damn good.

roozhou

15th April 2011, 10:09

However... i guess it should be possible to use the same (or at least: similar) scaling and upsampling algorithms inside AviSynth (for HQ resizing and color space conversion), which are used by madVR.

Avisynth's built-in resizers are CPU only. And more importantly, unlike libswscale and most resizers used in video renderers, Avisynth's built-in resizers do not apply any colorspace conversion.

e.g. You want to watch a 720x480 DVD video fullscreen on a 1920x1080 monitor. MadVR will convert 720x480 YV12 -> 1920x1080 RGB(8bit or 10bit) in one step. The intermediate results are 16bits or higher so rounding errors are minimized. Unfortunately avisynth has to use two filters.
LanczosResize(1920,1080).ConvertToRGB32()
The chroma is upscaled two times, so it is slower and the quality is lower.

VincAlastor

15th April 2011, 11:01

MadVR is great (not just cause it's the only video renderer that works perfect with the LAV CUVID Decoder for me :p )! But since madVR simply IS a video renderer, it has no place in AviSynth, since AviSynth doesn't need any video renderer at all. It "only" outputs uncompressed video to any appropriate application - so AviSynth doesn't need to do video rendering.

However... i guess it should be possible to use the same (or at least: similar) scaling and upsampling algorithms inside AviSynth (for HQ resizing and color space conversion), which are used by madVR.

As to the rest: with DGDecNV one source filter with CUDA acceleration still exists. It's not free though - but it works damn good.

thank you, i thought for dxva avisynth need a videorenderer. and madvr's gpu scaler and gpu upsampler are missing in avisynth. i use DGDecNV!, but avi's can not be decoded, indexing is nerving and an adaptive deinterlacer is missing too. and maybe LAV CUVID is faster....

a new all in one high quality hardware accelerated source filter were good for avisynth, and a strong and fast bundle of LAV splitter, LAV CUVID, ffdshow/ffmpeg and (parts of) madVR could make it possible (i think, i wish) ...if i had studying informatics and experience, damn...

do you think my nvidia 8400m gt is too slow for LAV CUVID and full hd material?

roozhou

15th April 2011, 11:05

avisynth cannot use DXVA and LAV CUVID does not use DXVA. 8400m is enough for LAV CUVID as long as you have 128M+ vmem.

VincAlastor

15th April 2011, 11:15

avisynth cannot use DXVA and LAV CUVID does not use DXVA. 8400m is enough for LAV CUVID as long as you have 128M+ vmem.

i meant madvr with dxva in future, but that's not necessary if avisynth can't use a renderer for dxva. :) but madshi's gpu upsampler and gpu scaler are missing in avisynth, maybe madshi could make a high quality spatio-temporal gpu shader denoiser and sharpener. i think he can when i watch my videos with his renderer.

my 8400m gt has 256mb vmem, but full hd videos are juddering.

yesgrey

15th April 2011, 13:16

i use DGDecNV!, but avi's can not be decoded, indexing is nerving and an adaptive deinterlacer is missing too.
That's not correct. DGDecNV also uses the adaptive deinterlacer, the same used in LAV CUVID.

However, considering this other quote from neuron2's discussion with nvidia guys
Because we currently rely on default DirectX behavior, tweaked for realtime performance,
there is no guarantee for the type of deinterlacing that will be used (to the contrary,
I can guarantee that it's not the best deinterlacing mode available on the GPU)
we might not be able to have the best deinterlacing adaptive mode via CUVID like we get via DXVA...

CruNcher

15th April 2011, 18:48

That's not correct. DGDecNV also uses the adaptive deinterlacer, the same used in LAV CUVID.

However, considering this other quote from neuron2's discussion with nvidia guys
Because we currently rely on default DirectX behavior, tweaked for realtime performance,
there is no guarantee for the type of deinterlacing that will be used (to the contrary,
I can guarantee that it's not the best deinterlacing mode available on the GPU)
we might not be able to have the best deinterlacing adaptive mode via CUVID like we get via DXVA...

It's known that the driver dynamically decides, based on the number of shaders, which cuda core (deinterlacing complexity) to use, though i always had the impression it would be the same Motion Adaptive Deinterlacer (when the GPU is strong enough) for both DXVA and Nvcuvid (yadif-comparable quality, see also the many tests of Avisynth deinterlacers vs DGDecNV's Nvcuvid deinterlacer). You can also see that in one of my earlier posts, where in one driver they accidentally disabled that check and i had the Motion Adaptive Deinterlacer, which they had obviously prepared for the 8 series, available on my 7600 GS. With it i couldn't reach 1080 MPEG-2 30i->60p in realtime anymore, the 7600 GS was just too weak, but the results were much better (i'm pretty sure that was the time when they reached yadif quality). That, i guess, is what he wanted to say with this. The other thing might be that he meant cuda deinterlacing cores that were running in their labs @ the time of the talk with Donald and would come with new drivers (which most probably should be out by now) :D
Though what you read out of this conversation should, i guess, be visually provable using roozhou's DXVA decoder with his framegrabber (if VMR deinterlacing is applied to the captured output?) and some interlaced test samples, be it SD or HD :)
Then compare those results with the output of LAV CUVID, DGDecNV or any other Nvcuvid based app. Also, if you think logically, it makes no sense to use a better deinterlacer (higher complexity, more resource needs) for VMR, where as he said realtime playback matters, and a lower quality one on the Nvcuvid side, where realtime plays no role @ all; you would expect exactly the opposite.
Though it would make sense if you are naughty and don't want 3rd parties to be able to reach, in their products' output, the deinterlacing quality you can achieve in realtime. But i don't see what reason Nvidia would have to do that: they sell hardware, and it would be plain dumb to restrict 3rd party apis that way, hmm, unless you target a more pro area and cards (Quadro) and need some feature to differentiate them from the consumer area ;)

yesgrey

15th April 2011, 20:12

Unfortunately the above quote pretty much explains the results I got some while ago when I compared the CUVID deinterlacers with the same ones via DXVA. At that time I used DGDecNV for CUDA and ffdshow for DXVA. I've found the test images and made a new one with LAV CUVID to be sure, and it gave exactly the same result as DGDecNV, so it's a CUVID limitation.

I agree, it's pretty dumb, but it is what it is. If we want to use the best deinterlacers that our GPUs have we need to use DXVA.

I've also included two images using ffdshow's yadif, and its quality it's pretty close to DXVA... however, I could not catch the same frame, so the comparison is not as good as with all the other deinterlacers. My GPU is a NVidia GT 240.

You can grab my tests results here (http://www.megaupload.com/?d=ZJL57VPH).

@madshi,
it seems that you still need to consider support DXVA in madVR for us to get access to the best deinterlacing algos available on nvidia GPUs (and ATI GPUs too).

nevcairiel

15th April 2011, 20:18

The CUVID driver has a flag with which you can select your preferred decoder. The options are CUVID, DXVA, and CUDA (which seems to fallback to one of the other, because there is no CUDA decoder). I wonder if that can emulate real DXVA mode and give us the full power..

I can provide a test build with it switched, or even add an option for that for testing..

CruNcher

15th April 2011, 20:32

@yesgrey
I'm not sure what your test there is supposed to show, but i picked out only the important ones based on this thread and the quote from nvidia, which are

dgdecnv_double.png
dgdecnv_single.png
ffdshow_adapt.png
lavcuvid_adapt.png

It's virtually impossible to compare them. slicies is a good test, but comparing all of these in different rendering environments (madvr, vmr/evr, avisynth?) is not a good way to get a clean, comparable result. With that blurring on the lavcuvid result, you can't be sure that it isn't coming from the renderer instead of the deinterlacer, compared to the DXVA result on VMR/EVR. You can see though that dgdecnv_single and lavcuvid_adapt match same renderer also for the ffdshow_adapt ?. Visually you can clearly see that the ffdshow_adapt has less aliasing on the football field line, which is NEDI quality :) but without comparing all of them on the same renderer i wouldn't put my hand in the fire that this is the effect of the deinterlacer and not a difference that the renderer introduces.

nevcairiel

15th April 2011, 20:48

I created a quick and dirty option in the GUI that lets you enable the DXVA interop mode of the CUVID decoder. It does look different to me: in a quick comparison on the slicies test, it looks really similar to the output of EVR doing deinterlacing on NV12 content sent by ffdshow with interlaced flags set. Before, the CUVID output looked worse, some of the horizontal lines went missing etc.; now it's basically the same, to my eyes anyway. I didn't do screenshots for comparison or anything.

Here is the test build, just flip the switch and restart the player / reload the file.
http://files.1f0.de/cuvid/LAVCUVID-0.2-dxva-interop.zip

Would be great if you could test that. :)

yesgrey

15th April 2011, 20:48

The CUVID driver has a flag with which you can select your preferred decoder. The options are CUVID, DXVA, and CUDA (which seems to fallback to one of the other, because there is no CUDA decoder). I wonder if that can emulate real DXVA mode and give us the full power..

I can provide a test build with it switched, or even add an option for that for testing..
If you want to, post it and I will give it a try.

You can see though that dgdecnv_single and lavcuvid_adapt match same renderer also for the ffdshow_adapt ?.
The ones to compare are dgdecnv_single, lavcuvid_adapt and ffdshow_adapt. All these three are using the adaptive mode. The differences between ffdshow (dxva) and the other two are so obvious that could not be justified by a different renderer. Furthermore, we are having much higher quality with EVR (ffdshow) than with madVR (LAV cuvid) and exact frame grabbing via avisynth (DGDecNV), so it shows even more that something is not right with the CUVID decoders.

Let's hope that what nevcairiel suggested works...

nevcairiel

15th April 2011, 20:53

See one post above yours ^

yesgrey

15th April 2011, 21:05

See one post above yours ^
Here (http://www.megaupload.com/?d=IWSZOXT5) is the result with the dxva option checked.

Exactly the same as with ffdshow/evr. There is only a slight difference on the red contour, but that's due to different chroma upsampling.

So now we nvidia guys already have the best hardware deinterlacing via madVR. :)

In short: You're da man!!! :thanks:

PS: Now, to be perfect, we only need the double rate method... ;)

Dogway

15th April 2011, 21:18

This is my GPU load:
http://img19.imageshack.us/img19/7361/asgg.th.gif (http://img19.imageshack.us/img19/7361/asgg.gif)

RAM was 830Mb and CPU around 10%

So I don't see a reason the Blu-ray couldn't play; it must be LAVCUVID or some setting on my card related to it, because ffmpeg-mt could play it.

nevcairiel

15th April 2011, 21:23

So now we nvidia guys already have the best hardware deinterlacing via madVR. :)

Great, at least something related to interlacing "just works". :)

madshi

15th April 2011, 21:40

So now we nvidia guys already have the best hardware deinterlacing via madVR. :)
No, not yet. 60i -> 30p is far from good for video content... :p

(sorry)

nevcairiel

15th April 2011, 21:42

PS: Now, to be perfect, we only need the double rate method... ;)

I just did a quick hack, instructing CUVID to give me the same image a second time with the "second_field" flag set, and running a memcmp() on the two resulting buffers - and it did actually produce different results.

This at least lets me hope that i might actually get the second field separately .. i'll implement this soon and see what happens.

I wish there was some easy way to detect the difference between 60i VIDEO to be played at 60p and 60i FILM to be played at 24p. But then again, i don't have much 60i FILM content, so huh! :)

SamuriHL

15th April 2011, 21:50

Just a quick note, Nev. CUVID does in fact play my HD DVD converted to MKV with audio in sync. madVR shows the fps as 29.whatever but it stays in sync. With my AMD machine using HAM mode madVR doesn't have a clue what the fps is but it still works.

madshi

15th April 2011, 22:03

I just did a quick hack, instructing CUVID to give me the same image a second time with the "second_field" flag set, and running a memcmp() on the two resulting buffers - and it did actually produce different results.
Sounds promising!

I wish there was some easy way [...]
Yeah, me too.

nevcairiel

15th April 2011, 22:12

I did some really, really hacky implementation of this, but my eyes are tricking me, and i don't know if its really smoother.

This version will most likely do the weirdest things on progressive content, but if you have some interlaced VIDEO file (50i/60i) where you can directly see a difference in smoothness to the previous version, please do a visual comparison, and give me some hints if i'm on the right track.

No options or anything, just plain frame doubling, for any and all content you throw at it. :)
http://files.1f0.de/cuvid/LAVCUVID-0.2-frame-doubling.zip

BTW, how common are files that are mixed interlaced and progressive content?

yesgrey

15th April 2011, 22:29

No, not yet. 60i -> 30p is far from good for video content... :p
PS: Now, to be perfect, we only need the double rate method... ;)
You missed my PS... However, I was referring to the best adaptive mode available from the GPU, and not to the best possible... ;)

I wish there was some easy way to detect the difference between 60i VIDEO to be played at 60p and 60i FILM to be played at 24p.
Now with Blu-ray I think that is not so critical anymore... but with dvds it would be problematic... and it would be dvds which would benefit the most from the hardware deinterlacing...

I did some really, really hacky implementation of this, but my eyes are tricking me, and i don't know if its really smoother.
Great! I will give it a try. Have you done it with both BOB and adaptive, or only with BOB?

BTW, how common are files that are mixed interlaced and progressive content?
I don't know. I think it happens when you are watching the full Blu-rays or DVDs. However, for me that's not a problem, because I always convert to mkv before watching.

nevcairiel

15th April 2011, 22:30

Great! I will give it a try. Have you done it with both BOB and adaptive, or only with BOB?

Both should work. Hell it will most likely even double the weave frame rate, but i havent tested it. Like i said, its really just an ugly hack to see what images it outputs. ;)

yesgrey

15th April 2011, 22:43

Hell it will most likely even double the weave frame rate, but i havent tested it.
In that case it would give two equal images, because the second field variable is only valid for BOB and adaptive.

pankov

15th April 2011, 23:49

I did some really, really hacky implementation of this, but my eyes are tricking me, and i don't know if its really smoother.

This version will most likely do the weirdest things on progressive content, but if you have some interlaced VIDEO file (50i/60i) where you can directly see a difference in smoothness to the previous version, please do a visual comparison, and give me some hints if i'm on the right track.

No options or anything, just plain frame doubling, for any and all content you throw at it. :)
http://files.1f0.de/cuvid/LAVCUVID-0.2-frame-doubling.zip

Nev,
you are going to give me an eye orgasm if you continue this
;)

I've just tried this version with a few 50i and 60i sports recordings and I must say that you've nailed it.
Using this and latest madVR (0.56) is close to video nirvana
:D
:thanks::thanks::thanks:

yesgrey

15th April 2011, 23:58

I'm getting very jerky results with massive frame dropping. I think my computer might not be able to handle full 1080i... I will give it a try with 480i.

As a side note, are you changing the reported frame rate? It seems you aren't, because madVR still reports it as 29.97, and after the doubling it should be 59.94. I don't know if this would affect negatively the output...
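
For reference, the numbers a decoder would have to report after doubling can be worked out like this. The sketch assumes the frame duration is expressed in 100 ns units, as DirectShow does for AvgTimePerFrame; whether that is the field LAV CUVID needs to change is a guess here, and the helper name is made up.

```python
def avg_time_per_frame(fps_num, fps_den):
    """Frame duration in 100 ns units for a rate of fps_num/fps_den."""
    return round(10_000_000 * fps_den / fps_num)


single = avg_time_per_frame(30000, 1001)    # 29.97 fps
doubled = avg_time_per_frame(60000, 1001)   # 59.94 fps after frame doubling
print(single, doubled)
```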

JarrettH

16th April 2011, 00:01

Is there a benefit to using this versus the Microsoft DTV Decoder? I'm talking strictly for DVDs

yesgrey

16th April 2011, 00:42

OK. I've tested with a DVD NTSC music concert I have and the result is ........................ (sorry, I'm speechless)

No, not yet. 60i -> 30p is far from good for video content...
madshi was so right!... It's a completely new experience watching video with such fluidity... everything is so smooth that we almost forget that we are watching an interlaced source. Of course there is some jagginess, sometimes, on some diagonals, but this was with a 480i source. I'm sure that with a 1080i source the results should be magnificent!
Of course the new madVR path is also a must. In fact, it only works when using madVR's fullscreen exclusive mode, both new and old paths. With windowed mode I got lots of dropped frames, but it might be related to LAV CUVID not yet reporting the correct frame rate to the renderer. madshi might clear this up better.

Another thing that might be useful would be allowing the selection of the top/bottom field order. Some files might work better with different settings.

@nevcairiel,
congratulations and thank you very much once again. What a proficuous day you had today!!!
:thanks::thanks::thanks:

yesgrey

16th April 2011, 00:44

Is there a benefit to using this versus the Microsoft DTV Decoder? I'm talking strictly for DVDs
If you want GPU decoding and deinterlacing with madVR the answer is YES.

yesgrey

16th April 2011, 01:57

A few more tests...

I decided to run timeCodec to measure LAV CUVID performance with my graphics card (nvidia GT 240)

With a WVC1 1080i 29.97 source file:
v0.2: 47.1 dfps
v0.2 dxva-interop: 29.6 dfps
v0.2 frame-doubling: 21.3 dfps

Here's the reason I can't get smooth playback with frame doubling: my GPU is too slow. It seems I would need about 3x the processing power of my current GPU to watch 1080i video smoothly. So the deinterlacing is presumably performed using the shaders, and not by specific hardware on the GPU.
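
The "3x" estimate can be checked with a bit of arithmetic. This assumes the 21.3 dfps figure counts output frames after doubling, which the timeCodec numbers above do not state explicitly.

```python
def required_speedup(target_fps, measured_fps):
    """How many times faster the decoder would need to run."""
    return target_fps / measured_fps


source_fps = 30000 / 1001       # 29.97 interlaced frames per second
doubled_fps = 2 * source_fps    # 59.94 progressive frames per second

print(round(required_speedup(doubled_fps, 21.3), 2))   # 2.81
```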

It would be interesting to see test results from more powerful GPUs to have an idea of what would be needed for smooth playback.

Dogway

16th April 2011, 02:41

I tested over an AVC file with timecodec:

lavcuvid
600fps
109dfps
ffdshow-mt
2700fps
118dfps

I have core2duo 2.53Ghz 1066fsb, and a 9600mgt card. Is my CPU supposed to go faster than the GPU?

nevcairiel

16th April 2011, 06:24

It would be interesting to see test results from more powerful GPUs to have an idea of what would be needed for smooth playback.

I actually had a 1080i/60 file that did not play properly on my GTX570, and i might know why. The execution is not really properly pipelined - right now i get one frame from the source, decode it, post process it, deliver it - wait for the next frame.

This was also mentioned in neuron2's chat with NVIDIA: you should have more frames in the pipeline for much higher efficiency.
Luckily, NVIDIA also gave an example how to do this, so i'll be implementing this as well - hope it makes the difference.
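
Why pipelining helps can be seen from a very idealized model: when every frame goes through all stages before the next one starts, the per-frame cost is the sum of the stage times, while a fully overlapped pipeline is limited only by its slowest stage. The stage timings below are made-up numbers, not measurements of LAV CUVID.

```python
def serial_fps(stage_ms):
    """One frame at a time: decode, post-process, deliver, then repeat."""
    return 1000.0 / sum(stage_ms)


def pipelined_fps(stage_ms):
    """Ideal steady state with all stages overlapped across frames."""
    return 1000.0 / max(stage_ms)


stages = [8.0, 6.0, 3.0]    # hypothetical decode / post-process / deliver, in ms
print(round(serial_fps(stages), 1), round(pipelined_fps(stages), 1))
```

Real gains will be smaller, since stages share the GPU and the overlap is never perfect.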

PS:
My results for my problem file on my high-end card
w/ DXVA and frame doubling

74fps with BOB
59fps with Adaptive

While 59fps comes close, it was not fast enough. In EVR i get around ~50fps running that file.