LAV CUVID Decoder - High Quality Hardware decoding for NVIDIA
nevcairiel
29th March 2011, 07:46
I do not have CoreAVC, but DGDecNV and it works perfectly there. I also remuxed the file, problem is still the same.
By the way, the file where the problem occurs was not recorded from DVB, but encoded by a DSLR and is in a MOV container. I remuxed it to MKV, same problem. The file is 1080p25 (1088 coded). It plays fine with ffdshow.
Can you upload a short sample that shows the problem? Maybe cutting it to 1080 solves it already, but I don't think I have a video with those problems.
Another question: MPEG-2, MPEG-4 ASP and H.264 work for me, but it refuses to accept WMV3 (from wmv/asf file), which is the same as VC-1? Or are there some differences and CUDA cannot decode it?
I suppose I can try feeding it directly into the decoder, claiming it's VC-1, and see what happens..
CruNcher
29th March 2011, 09:23
Nev, this might be interesting: http://forum.doom9.org/showpost.php?p=1488268&postcount=6181 especially the 12-second pattern before the drops start (buffer difference?) for some decoders, including yours :)
Can you upload a short sample that shows the problem? Maybe cutting it to 1080 solves it already, but I don't think I have a video with those problems.
Here it is:
http://www.megaupload.com/?d=F7S8JZO1
Thanks
neoufo51
29th March 2011, 23:50
For some reason, no matter what, this decoder will only work when I play H264 files. It doesn't work at all with any XVID files. I've disabled the MPC-HC decoder but it then just uses the Microsoft mpeg4 DMO.
EDIT: Whoops, my main card only supports featureset B. My mistake.
SamuriHL
29th March 2011, 23:51
That's because it only decodes h.264/vc-1/mpeg2 afaik.
BatKnight
30th March 2011, 00:44
That's because it only decodes h.264/vc-1/mpeg2 afaik.
Nope, LAV CUVID decodes H264, VC-1, MPEG2 and MPEG4-ASP (DivX/Xvid) as long as you own an NVIDIA card that supports the VDPAU Feature Set C.
Cards supporting only Feature Set B or A can't decode MPEG4-ASP (DivX/Xvid).
You may check http://en.wikipedia.org/wiki/Nvidia_PureVideo#Table_of_PureVideo_.28HD.29_GPUs for a complete list of availability.
Nuno
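As a rough sketch of the rule described above, the lookup below maps each feature set to the codecs named in this post. The table is taken from the post as-is and is only an illustration; real hardware may have further per-codec limits, and the names `FEATURE_SET_CODECS` and `can_decode` are made up for this example.

```python
# Codec support per VDPAU feature set, as stated in the post above.
# Assumption: only sets A, B and C are modelled here.
FEATURE_SET_CODECS = {
    "A": {"H.264", "VC-1", "MPEG-2"},
    "B": {"H.264", "VC-1", "MPEG-2"},
    "C": {"H.264", "VC-1", "MPEG-2", "MPEG-4 ASP"},
}


def can_decode(feature_set: str, codec: str) -> bool:
    """Return True if the given feature set lists the codec (hypothetical helper)."""
    return codec in FEATURE_SET_CODECS.get(feature_set.upper(), set())


if __name__ == "__main__":
    # A Feature Set B card falls back to another decoder for Xvid content.
    for fs in ("A", "B", "C"):
        print(fs, "MPEG-4 ASP:", can_decode(fs, "MPEG-4 ASP"))
```

So a card that only reports Feature Set B would pass for H.264 but not for DivX/Xvid, which matches what neoufo51 saw.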
SamuriHL
30th March 2011, 00:54
Nope, LAV CUVID decodes H264, VC-1, MPEG2 and MPEG4-ASP (DivX/Xvid) as long as you own an NVIDIA card that supports the VDPAU Feature Set C.
Cards supporting only Feature Set B or A can't decode MPEG4-ASP (DivX/Xvid).
You may check http://en.wikipedia.org/wiki/Nvidia_PureVideo#Table_of_PureVideo_.28HD.29_GPUs for a complete list of availability.
Nuno
Ah, neat. I didn't know it did mpeg4-asp. Well, obviously if your card supports it. Mine would. Sweet!
neoufo51
30th March 2011, 00:56
Nope, LAV CUVID decodes H264, VC-1, MPEG2 and MPEG4-ASP (DivX/Xvid) as long as you own an NVIDIA card that supports the VDPAU Feature Set C.
Cards supporting only Feature Set B or A can't decode MPEG4-ASP (DivX/Xvid).
You may check http://en.wikipedia.org/wiki/Nvidia_PureVideo#Table_of_PureVideo_.28HD.29_GPUs for a complete list of availability.
Nuno
Crap, that's exactly the problem. I thought my card supported C but I double-checked and it's B. I'll buy new hardware in the near future anyway. Thanks for your help.
JarrettH
30th March 2011, 02:19
Doesn't seem to be loading for me.
Using MPC and EVR-CP I blocked Microsoft DTV-DVD Decoder, made LAVCUDSFDFGFDS decoder preferred, DVD still loads with MPEG-2 Video Decoder (low merit)
ranpha
30th March 2011, 03:06
Doesn't seem to be loading for me.
Using MPC and EVR-CP I blocked Microsoft DTV-DVD Decoder, made LAVCUDSFDFGFDS decoder preferred, DVD still loads with MPEG-2 Video Decoder (low merit)
Yeah, it seems that when using the 'Open DVD' function, LAV CUVID isn't used (it uses the Cyberlink decoder in my case - and surprisingly, there is no Macrovision error with madVR). Only if I open the .vob file directly will the LAV CUVID decoder be used.
JarrettH
30th March 2011, 03:19
I am also wondering what this does that the Microsoft DTV-DVD one does not. The Microsoft one looks pretty great to be honest. :cool:
thuan
30th March 2011, 03:51
@nevcairiel:
Ever since NVIDIA released drivers newer than 258.96, DXVA and CUDA acceleration have been very choppy for me (home config). The worst is with 1080p video: with your decoder it plays like a slideshow (1 frame per 4 or 5 seconds); with CoreAVC CUDA or MPC DXVA I still get it at certain positions in a video and after seeking. It is so annoying that I'm thinking of changing hardware. So can you compile this decoder with an older version of the CUDA SDK so I can use it with the older driver? It would be much appreciated. Also, what is the current state of playback on your newer NVIDIA card (GTS 450, I presume)? Is it butter smooth for any video resolution/bitrate?
Thanks.
nevcairiel
30th March 2011, 06:53
Playback is perfect for me, on both my GTS 450 and my GTX 570, using either the 267.24 or 266.58 driver.
bjd
30th March 2011, 09:20
Yeah, it seems that when using the 'Open DVD' function, LAV CUVID isn't used (it uses the Cyberlink decoder in my case - and surprisingly, there is no Macrovision error with madVR). Only if I open the .vob file directly will the LAV CUVID decoder be used.
MediaType issue:
Open DVD will produce MEDIATYPE_DVD_ENCRYPTED_PACK (not supported)
Open VOB will produce MEDIATYPE_Video
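To illustrate the media type point, here is a minimal sketch of how a decoder's input check could behave. It assumes the major types are spelled MEDIATYPE_Video and MEDIATYPE_DVD_ENCRYPTED_PACK as in the DirectShow headers and uses plain strings in place of GUIDs. `MediaType` and `check_input_type` are hypothetical stand-ins, not the actual LAV CUVID code.

```python
from dataclasses import dataclass

# Plain strings stand in for the DirectShow major-type GUIDs (assumption).
MEDIATYPE_VIDEO = "MEDIATYPE_Video"
MEDIATYPE_DVD_ENCRYPTED_PACK = "MEDIATYPE_DVD_ENCRYPTED_PACK"


@dataclass(frozen=True)
class MediaType:
    major: str
    subtype: str


def check_input_type(mt: MediaType) -> bool:
    """Hypothetical input-pin check: accept only plain video major types."""
    return mt.major == MEDIATYPE_VIDEO


if __name__ == "__main__":
    open_dvd = MediaType(MEDIATYPE_DVD_ENCRYPTED_PACK, "MPEG2_VIDEO")
    open_vob = MediaType(MEDIATYPE_VIDEO, "MPEG2_VIDEO")
    print("Open DVD accepted:", check_input_type(open_dvd))
    print("Open VOB accepted:", check_input_type(open_vob))
```

With a check like this the graph builder moves on to the next decoder for the DVD navigator's output, which would explain why another MPEG-2 decoder ends up in the graph.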
thuan
30th March 2011, 09:39
@nevcairiel: Then I do hope to see you compile a version of your filter that supports older graphics drivers before I lose my patience and jump the gun :D. I don't really want to change GFX card now, as I'm going overseas sometime this year, but this issue has been going on for a while already and it seems to affect others with NVIDIA GTX 260/280 and older cards (saw a few on the guru3d forum).
CruNcher
30th March 2011, 11:22
@nevcairiel:
Ever since NVIDIA released drivers newer than 258.96, DXVA and CUDA acceleration have been very choppy for me (home config). The worst is with 1080p video: with your decoder it plays like a slideshow (1 frame per 4 or 5 seconds); with CoreAVC CUDA or MPC DXVA I still get it at certain positions in a video and after seeking. It is so annoying that I'm thinking of changing hardware. So can you compile this decoder with an older version of the CUDA SDK so I can use it with the older driver? It would be much appreciated. Also, what is the current state of playback on your newer NVIDIA card (GTS 450, I presume)? Is it butter smooth for any video resolution/bitrate?
Thanks.
Hmm, could you post Cuda-Z memory bandwidth results?
Low Power Profile
http://img836.imageshack.us/img836/1623/9800gtsb.png
High Power Profile
http://img690.imageshack.us/img690/1493/9800gtsbhigh.png
Also, which renderer are you using?
Also be careful: the DSP is only optimized up to H.264 1080p 60 fps. If you try to play back something at higher frame rates, you would need the VPx core that supports MVC :)
jj666
30th March 2011, 11:33
Thuan,
I'm using the 9800 GX2 with the 267.60 driver (and previously 267.24), which, according to Wikipedia, should have exactly the same processing unit as the 9800 GT. No problems here...
GeForce 9600 GSO, 9800 GT, 9800 GTX, 9800 GTX+, 9800 GX2 | G92 | VP2 | A | March 2008
Cheers,
-jj-
thuan
30th March 2011, 11:42
@jj666: Your OS is? I will search for and try 267.60
@CruNcher: Here's the screenshot in High Power mode
http://thumbnails35.imagebam.com/12577/4a29e7125762717.jpg (http://www.imagebam.com/image/4a29e7125762717)
My GPU is factory overclocked to 740MHz core and 1850 shader though. The driver is 266.58 at the time of testing.
madVR or EVR-CP, same problem.
And no, I was trying to play simple 1080p 24fps file.
CruNcher
30th March 2011, 12:37
Hmm, quite low (what system memory do you use? Frequency, timings?)
Do these also result in playback issues (maybe even frame drops; use the YouTube info display via right click)?
http://www.youtube.com/watch_popup?v=wdySQHhLXjA&vq=hd1080
http://www.youtube.com/watch_popup?v=qxXf7AJZ73A&vq=hd1080
don't forget to turn hardware acceleration on
nevcairiel
30th March 2011, 12:40
Note that LAV CUVID is only using pinned memory to transfer the image back from the GPU to the System RAM, as pageable memory is known to have performance problems.
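For a sense of scale, the copy-back traffic can be estimated with simple arithmetic. The sketch below assumes NV12 output with no row padding, which is a simplification since real surfaces usually have a pitch larger than the width. The function names are made up for this example.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Size of one NV12 frame: full-size luma plane plus half-size interleaved chroma."""
    return width * height * 3 // 2


def copyback_mb_per_s(width: int, height: int, fps: float) -> float:
    """Approximate GPU-to-system-RAM traffic in MB/s (1 MB = 10**6 bytes)."""
    return nv12_frame_bytes(width, height) * fps / 1e6


if __name__ == "__main__":
    # 1080p content coded as 1920x1088, at 25 fps as in the sample discussed earlier.
    print(nv12_frame_bytes(1920, 1088))                  # bytes per frame
    print(round(copyback_mb_per_s(1920, 1088, 25), 1))   # MB/s
```

For a 1920x1088 frame that is 3,133,440 bytes, or about 78 MB/s at 25 fps, a figure that can be compared against a measured device-to-host bandwidth such as the one Cuda-Z reports.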
Playback is perfect for me, on both my GTS 450 and my GTX 570, using either the 267.24 or 266.58 driver.
Tested again:
MPC internal splitter: problem
Haali Splitter: problem
LAVSplitter: works
The renderer doesn't matter. Since all other decoders I've tested (ffdshow, MPC internal, DivX, Microsoft, DiAVC trial) work with the Haali/MPC splitters, the bug is probably in your decoder.
nevcairiel
30th March 2011, 12:41
Lesson learned: use LAV Splitter! :)
I'll do some tests once I'm working on the decoder again, but from my past experience, there really has been a cycle of bugs surrounding the MPC splitters, the MPC decoders and ffdshow, where they all rely on each other's bugs to work properly, and no one is really doing it just right.
Selur
30th March 2011, 12:53
just wondering,..
would it be possible to use this decoder (on a Windows system) with mplayer/mencoder by modifying the codecs.conf similar to
videocodec lavcuvid
info "LAV CUVID Decoder"
status working
fourcc AVC,avc
fourcc VSSH
fourcc X264,x264
fourcc h264,H264
driver dshow
dll "LAVCUVID.ax"
out YV12
(Since DGDecNV can't output YV12 to stdout and feed e.g. x264 directly, I just thought that using mencoder this way might be a way to get GPU decoding into x264 without the need for AviSynth and DGDecNV.)
Cu Selur
nevcairiel
30th March 2011, 12:56
I have no idea how those setups work, but two things:
- the fourccs are AVC1 and avc1 (FourCC kinda implies it's always 4 :P)
- It can only output NV12, not YV12
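Both points can be sketched in a few lines. The first helper packs a four-character code into the usual little-endian 32-bit value and rejects anything that is not exactly four characters. The second repacks an NV12 frame (Y plane followed by interleaved UV) into YV12 (Y, then V, then U), assuming even dimensions and no row padding. Both function names are made up for this illustration and are not part of mplayer or LAV CUVID.

```python
def fourcc(code: str) -> int:
    """Pack a four-character code into a little-endian 32-bit integer."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError(f"a FourCC must be exactly 4 characters, got {code!r}")
    return int.from_bytes(raw, "little")


def nv12_to_yv12(frame: bytes, width: int, height: int) -> bytes:
    """Repack NV12 (Y, interleaved UV) as YV12 (Y, V plane, U plane).

    Assumes even width/height and tightly packed rows (no pitch padding).
    """
    y_size = width * height
    if len(frame) != y_size * 3 // 2:
        raise ValueError("frame size does not match width x height in NV12")
    y_plane = frame[:y_size]
    uv = frame[y_size:]
    u_plane = uv[0::2]  # NV12 stores U first in each UV pair
    v_plane = uv[1::2]
    return y_plane + v_plane + u_plane


if __name__ == "__main__":
    print(hex(fourcc("avc1")))
    tiny = bytes([1, 2, 3, 4]) + bytes([0x0A, 0x0B])  # 2x2 frame: 4 luma, 1 U, 1 V
    print(nv12_to_yv12(tiny, 2, 2).hex())
```

A three-character entry such as `AVC` fails the length check, which is the mistake in the codecs.conf attempt above.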
Selur
30th March 2011, 13:00
Thanks! Will try a bit if I get this working,.. ;)
thuan
30th March 2011, 13:03
http://thumbnails25.imagebam.com/12577/0cf6a7125768934.jpg (http://www.imagebam.com/image/0cf6a7125768934)
4x2GB Kingmax DDR2-800. Your question reminds me of some BIOS options related to VGA performance in my Gigabyte board. Gonna try tweaking.
I tried these YouTube videos with the latest Flash player and they play fine (the only problem is my internet speed, 1.5Mbps). On an unrelated note, the first video also reminds me how much I dislike trance/house/club/psychedelic music; funny how other ppl in their twenties like it but I simply can't.
CruNcher
30th March 2011, 13:15
Thanks! Will try a bit if I get this working,.. ;)
http://forum.doom9.org/showthread.php?t=141441 <- easier way (also should be more performant than going stdout) :)
@thuan
Memory speed doesn't seem to be the problem, as nev also said he uses pinned memory. Try a new driver with a newer nvcuvid.dll. You can also test different nvcuvid.dll versions by just dropping them into LAV CUVID's main filter directory if you want to separate 3D from video in NVIDIA's driver (this only works reliably if they didn't make major CUDA branch changes that also impact nvcuvid and/or nvcuvenc).
Selur
30th March 2011, 13:30
easier way
but it would require
1. for the .ax file to be registered,..
2. the use of directshow splitters
("also should be more performant than going stdout" -> shouldn't make much of a difference; decoding with mencoder/ffmpeg and piping to x264 was on par with x264 and internal decoding, last time I checked ;))
-> played around a bit with mplayer's codecs.conf, but didn't get it working,.. (maybe I'll try it again some time)
Cu Selur
CruNcher
30th March 2011, 13:37
Hehe, yeah, it would require a controlled DShow environment, else DShow hell is guaranteed. And yeah, I use mencoder and piping myself and have yet to see lower performance; stan and you also use it reliably for your encoding frameworks (old school) ;)
thuan
30th March 2011, 13:51
@CruNcher: Indeed, changing these settings in my BIOS did little for the problem. Setting Robust Graphic Booster to Turbo and setting the memory speed to an integer multiple of my FSB seems to make the problem a "little" less severe, but it is still nowhere near acceptable. I'm using the latest 267.91 now and still have the same problem. I'm thinking Gigabyte did something to this graphics board's BIOS that causes it to conflict with newer NVIDIA drivers. The only configuration I know that works is 258.96 with DXVA or CoreAVC CUDA; unfortunately LAV CUVID is not compiled to be compatible with that driver version, so I can't test it. I also don't think it's a broken Windows installation, as I have tried reinstalling once and still only 258.96 works.
I'm torn between getting a GTS 450 board and waiting for a fix. Considering I'm in a minority (I've seen some others who had this problem too), it will likely go the same way as the ATI chroma upsampling problem that I reported to ATI a few years ago (the respondent said the screenshots I sent them looked fine lol, that guy must have been working with a 14in CRT). We are at the mercy of the manufacturers.
@selur: I don't think that is going to work as this DirectShow filter is only a wrapper of CUDA calls included in a Windows DLL that works with Windows driver.
CruNcher
30th March 2011, 14:05
As you can read in CoreCodec's CoreAVC 2.5 changelog, they use some ASM modifications for the CUDA part. I guess LAV CUVID uses the plain NVIDIA API without any real optimization yet, be it colorspace conversion or maybe other parts. Unfortunately, the issue I'm personally facing with both is not raw-performance related at all, and I haven't found the reason for it yet. :confused:
SamuriHL
30th March 2011, 16:10
GPUDirect 2.0....does that buy any performance?
nevcairiel
30th March 2011, 17:39
As I understand it, GPUDirect is just a smart way to copy data between two CUDA cards, without ever seeing the CPU.
SamuriHL
30th March 2011, 18:21
Ah, I see. Not useful for us then. Thanks!
CruNcher
30th March 2011, 18:42
Finally it's out; that driver's nvcuvenc.dll also fixes a very pesky encoder bug that caused strange moving chroma :)
mark0077
30th March 2011, 19:22
Thanks a lot nevcairiel, very impressive stuff with the decoder. Strangely, and I'm hoping you could explain whether this is something that could be improved upon or not, using this or the CoreAVC decoder with my GTX295 / Core i7 920 setup on Windows 7, I get more frame drops than with the ffdshow ffmpeg software decoder.
I am using madVR and some AviSynth scripts that change the frame rate from ~24 -> 50fps, so I'm not sure if anything in this chain is known to cause a bottleneck somewhere GPU-wise that's causing the frame drops. It's only 1 or 2 frame drops a second, but I get 0 with the software decoders... Not a big deal, but I just thought I should mention it, and I'm here if you need any tests / trials done to see if this can be resolved. I'd love to use your decoder, and even push my AviSynth scripts further (quality-wise) thanks to the freed-up CPU.
I'm using the 270.51 drivers released today. Testing today with a Blu-ray VC-1.
mindbomb
30th March 2011, 19:56
I'm getting persistent stuttering on VC-1 content.
I have an NVIDIA 9400m, using madVR as my renderer, and LAV Splitter.
CruNcher
30th March 2011, 20:25
I'm getting persistent stuttering on VC-1 content.
I have an NVIDIA 9400m, using madVR as my renderer, and LAV Splitter.
Very container dependent; if it's inside PS/TS it won't be a surprise ;)
nevcairiel
30th March 2011, 20:29
I'm getting persistent stuttering on VC-1 content.
I have an NVIDIA 9400m, using madVR as my renderer, and LAV Splitter.
Turn off "Enable VC-1 Timestamp Correction" in the LAV Splitter settings.
That goes for mark0077 as well.
The next version of LAV Splitter will have LAV CUVID in its list of decoders for which it autodetects that the option should be off.
mindbomb
30th March 2011, 21:13
I am still noticing dropped and delayed frames in madVR with VC-1, actually on both my 9400m and GTX 260.
I am also getting poor performance for H.264 on my 2 computers with VP2 cards (a GTX 260 and an 8600 GT), but especially on the 8600 GT; 1080p video was unwatchable.
On the GTX 260, it was almost perfect.
namaiki
31st March 2011, 14:42
mindbomb, how much video ram on all of those cards? Also, you can check statistics for GPU/VRAM usage and load with the latest Forceware with GPU-z.
mindbomb
31st March 2011, 17:11
The 9400m has none since it's integrated, but it's VP3, so I suspect that's why it has very good performance for H.264.
The GTX 260 has 896MB, and the 8600 GT has 256MB.
I don't think it's a video RAM issue on the 8600 GT, since it seems to use 240MB of RAM for all videos, but performance is especially poor for 1080p H.264, while 720p H.264 is fine, and VC-1 isn't perfect, but it's in much better shape than 1080p H.264.
CruNcher
31st March 2011, 19:19
nev, http://forum.doom9.org/showpost.php?p=1488268&postcount=6181 <- I'm almost sure this is somehow timestamp related; changing MainConcept's timestamp setting has several effects on the dropped-frame rate. I'm trying your LAV Splitter and audio decoder again :)
I tried the combination of LAV Audio Decoder + Lavf Splitter + VMR9 Renderless and I still get frame drops after 12 seconds :(
VMR9 Renderless = Drops after 12 seconds
VMR7 Renderless = OK (useless no Shader support)
VMR9 Windowed = Drops after 12 seconds (useless no Shader support)
VMR7 Windowed = OK (useless no Shader support)
So it's affecting only VMR9 on my system, and with Cyberlink's decoder it's OK, so it seems to be renderer related rather than timestamp related, or a combination of both, maybe even driver related. Still, I wonder what Cyberlink does differently that it isn't affected, compared to all the others.
I guess I'll try another player with VMR9 support and see if it happens there too (maybe it has something to do with MPC-HC's sync stuff and GPU buffering).
Hmm, nope, same in GraphStudio connected to Video Mixing Renderer 9: every decoder, DXVA or CUDA, fails with that stream except Cyberlink's decoder. Maybe it has something to do with my stricter Windows timings.
IDLE
ClockRes v2.0 - View the system clock resolution
Copyright (C) 2009 Mark Russinovich
SysInternals - www.sysinternals.com
Maximum timer interval: 15.625 ms
Minimum timer interval: 1.000 ms
Current timer interval: 15.625 ms
Video Playback
ClockRes v2.0 - View the system clock resolution
Copyright (C) 2009 Mark Russinovich
SysInternals - www.sysinternals.com
Maximum timer interval: 15.625 ms
Minimum timer interval: 1.000 ms
Current timer interval: 0.977 ms
It seems that not every decoder can cope with really high-resolution timings on VMR9.
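To put the two ClockRes readings side by side, the sketch below parses output in the format pasted above and expresses a frame period in timer ticks. The regular expression assumes exactly the "… timer interval: N ms" wording shown here; `parse_clockres` and `ticks_per_frame` are made-up helper names.

```python
import re

_LINE = re.compile(
    r"^(Maximum|Minimum|Current) timer interval:\s*([\d.]+)\s*ms",
    re.MULTILINE,
)


def parse_clockres(text: str) -> dict:
    """Extract the three timer intervals (in ms) from ClockRes-style output."""
    return {m.group(1).lower(): float(m.group(2)) for m in _LINE.finditer(text)}


def ticks_per_frame(interval_ms: float, fps: float) -> float:
    """How many timer ticks fit into one frame period at the given frame rate."""
    return (1000.0 / fps) / interval_ms


if __name__ == "__main__":
    playback = """Maximum timer interval: 15.625 ms
Minimum timer interval: 1.000 ms
Current timer interval: 0.977 ms"""
    res = parse_clockres(playback)
    print(res)
    print(round(ticks_per_frame(15.625, 25), 2))          # idle resolution
    print(round(ticks_per_frame(res["current"], 25), 2))  # during playback
```

At 25 fps a frame lasts 40 ms, which is 2.56 ticks at the idle 15.625 ms resolution and roughly 41 ticks at 0.977 ms, so scheduling is far finer-grained during playback.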
thuan
1st April 2011, 09:51
On my 9800GT with 512MB RAM, with madVR and LAV CUVID or EVR-CP and DXVA, playing 1080p video uses around 450MB RAM. The funny thing is that the video decoder engine usage is a mess, jumping up to around 70% at most and sometimes down to 10%, with the memory controller load never going over 20%, but it still stutters like a slideshow on certain high-bitrate videos.
roozhou
1st April 2011, 10:08
I have an 8500 GT with 128MB VRAM. With drivers earlier than 197.45, the VP2 unit on this card is unable to decode H264 video with width >1856. With newer drivers, DXVA1 can be used on 1920x1080 content but CUDA still has the 1856 width limit (both CoreAVC and LAV CUVID).
Found another bug:
LAVCUVID will connect and try to decode unsupported H.264 profiles; tested with lossless (old and new format).
The result is a heavily corrupted image (new lossless format) or no output at the wrong resolution (old lossless format). It would be great if LAVCUVID could detect unsupported H.264 profiles and refuse the connection.
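A check of this kind could look roughly like the sketch below. It assumes the stream's extradata is in avcC form (AVCDecoderConfigurationRecord), where byte 1 carries the profile_idc, and it assumes for illustration that only Baseline, Main and High are acceptable to the hardware. That supported set is a guess, not a statement about what nvcuvid actually handles, and `should_accept` is a hypothetical helper rather than LAV CUVID code.

```python
# profile_idc values as defined in the H.264 specification.
PROFILE_NAMES = {
    66: "Baseline",
    77: "Main",
    88: "Extended",
    100: "High",
    110: "High 10",
    122: "High 4:2:2",
    144: "High 4:4:4 (old, removed from the spec)",
    244: "High 4:4:4 Predictive",
}

# Assumption for this sketch: the hardware path handles only these profiles.
HW_SUPPORTED_PROFILES = {66, 77, 100}


def avcc_profile_idc(extradata: bytes) -> int:
    """Read profile_idc from avcC extradata (byte 0 is the version, byte 1 the profile)."""
    if len(extradata) < 4 or extradata[0] != 1:
        raise ValueError("extradata is not in avcC format")
    return extradata[1]


def should_accept(extradata: bytes) -> bool:
    """Hypothetical connection check: refuse profiles outside the supported set."""
    return avcc_profile_idc(extradata) in HW_SUPPORTED_PROFILES


if __name__ == "__main__":
    for profile in (100, 244, 144):
        header = bytes([1, profile, 0, 41])
        print(PROFILE_NAMES[profile], "->", should_accept(header))
```

Refusing the connection for profile 244 or 144 lets the graph builder fall through to a software decoder instead of showing corrupted output.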
CruNcher
2nd April 2011, 21:45
Yep, that's a big no-go. If you can't decode it, let the next filter in the chain try it; ISVs should learn that too. Though I guess nev just missed parsing for x264 lossless, as nvcuvid knowingly doesn't support it and never will ;)
It also tries to play back MPEG-2 Studio Profile and obviously fails.
PS: I also wanted to add that even benchmarks show that it's slower than CoreCodec's nvcuvid implementation http://forum.doom9.org/showpost.php?p=1489410&postcount=48 but in return it is more stable (more consistent playback results). If you close MPC-HC playback, reopen, close, reopen, CoreAVC's nvcuvid has a tendency to lose efficiency and recovers only very slowly after the 3rd run. LAV CUVID doesn't show this behavior (every run it goes to its full speed of 45 fps), something that DXVAChecker obviously ignores ;)
Xorp
3rd April 2011, 19:02
Any chance this decoder could someday perform inverse telecine on interlaced film content? (ie the occasional 1080i60 movie Blu-ray) Like Dscaler does for MPEG2.
madshi
3rd April 2011, 19:05
@Xorp, nevcairiel says it already does. I don't fully believe it yet, though. I think the hardware should properly detect the cadence and put the right fields together, but I doubt that the hardware will mod the output to 24p. I think the output will be either 30p or 60p. I'm ready to be proven wrong, though... :p
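The difference between "putting the right fields together" and actually outputting 24p can be shown with a toy model. In the sketch below each film frame is just a label, 2:3 pulldown turns every 4 film frames into 5 interlaced frames (pairs of top and bottom field sources), and the inverse keeps each source frame once. Real inverse telecine has to detect the cadence by comparing field content; this model skips that step and assumes unique labels, so it only illustrates the frame counts.

```python
def telecine_23(frames: list) -> list:
    """Apply 2:3 pulldown: 4 film frames -> 5 interlaced (top, bottom) pairs."""
    if len(frames) % 4:
        raise ValueError("this toy model expects a multiple of 4 frames")
    out = []
    for i in range(0, len(frames), 4):
        a, b, c, d = frames[i:i + 4]
        # Field pattern AA BB BC CD DD
        out += [(a, a), (b, b), (b, c), (c, d), (d, d)]
    return out


def inverse_telecine(interlaced: list) -> list:
    """Recover the film frames by dropping repeated field sources (labels must be unique)."""
    film = []
    for top, bottom in interlaced:
        for src in (top, bottom):
            if not film or film[-1] != src:
                film.append(src)
    return film


if __name__ == "__main__":
    one_second = list(range(24))          # 24 film frames
    video = telecine_23(one_second)       # interlaced frames
    print(len(video), len(inverse_telecine(video)))
```

One second of film becomes 30 interlaced frames and goes back to 24 after the inverse step. A decoder that only re-pairs the fields without dropping the repeats would still output 30 frames per second, which is the case madshi describes.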
CruNcher
3rd April 2011, 20:28
Nev, any idea where roughly 40 of 45 fps are disappearing to when rendering with any of the nvcuvid DShow filters (not only yours) on either Haali's or madshi's renderer?
It currently looks like this:
VMR7 = 45 fps
VMR9 = 45 fps
Haali = 4 fps
Madvr = 4 fps
Haali(CPU) = 60 fps
Madvr(CPU) = 60 fps
little heavy that loss, catch them and force them to arrive @ Haali/Madvr ;)