LAV CUVID Decoder - High Quality Hardware decoding for NVIDIA
CruNcher
3rd September 2011, 12:12
Though I guess the only way to get this working with DXVA2 would be to contact Nvidia and Intel and provide them with such a sample so they can fix the behavior. I tried everything, but no Splitter->Decoder->Renderer combination gets this smooth with DXVA2 on the HD2000. In software mode the issue magically fixes itself on the HD2000, either on EVR (normal) or, as funny as it sounds, with DirectVobSub on EVR Custom :P Not sure, though, whether that also works for Nvidia.
mzso
4th September 2011, 13:48
Hi!
After reading the first post it's unclear to me: does CUVID do all of the work on the GPU itself, or does it use a separate decoder chip (which I remember DXVA using)? Or some sort of combination?
Also, does it have the same sort of limitations that DXVA does? For example, not working at all with higher H.264 ref frame counts, or with other video filters in the chain. (I see it works with all renderers, so that's a plus.)
Also, what's the minimum recommended GPU? I'm thinking of replacing my 2600 XT with something like a GTS 450. (Not sure whether it's already the mentioned VP4 hardware or not.)
CruNcher
4th September 2011, 13:54
OK, update: I have now measured the power consumption difference for MPEG-2 playback between Intel's decoder (hardware DXVA2) and LAV Video (software, multithreaded) on a Core i5-2400,
and at least for MPEG-2 I came to the conclusion that the playback problems mostly caused by hardware decoding aren't worth the hassle for a saving of 1 W at best. Sandy Bridge is very efficient here on the CPU side :) (Nothing to cheer about compared to a real SoC, obviously, and neither is the Intel decoder result.)
So 1080p with full 60 fps deinterlacing takes approx. 13 W (idle 5 W + decoding 8 W, which includes parser, Aero, audio (LAV Audio) and renderer (EVR Custom) overhead) on Sandy Bridge, and Intel's decoder doesn't do much better with DXVA2: 12 W (7 W) :/ (Tested via the Microsoft DTV decoder and Intel's own MPEG-2 decoder; both are mostly identical when it comes to DXVA2. I'm still trying to find out how to get software mode working for both, since forcing a renderer change is no good solution for consistent measurement results.)
I'm using Ben Waggoner's TallShip sample for measuring. If someone has a heavier sample it would be nice to see it (but please something that occurs in average user life, not pro, and also not studio 4:2:2) ;)
General
ID : 0 (0x0)
Complete name : H:\theislandoriginal\TallShip_1080i_ATSC.ts
Format : MPEG-TS
File size : 777 MiB
Duration : 5mn 35s
Overall bit rate : 19.4 Mbps
Video
ID : 4096 (0x1000)
Menu ID : 1 (0x1)
Format : MPEG Video
Format version : Version 2
Format profile : Main@High
Format settings, BVOP : Yes
Format settings, Matrix : Custom
Codec ID : 2
Duration : 5mn 35s
Bit rate : 18.0 Mbps
Maximum bit rate : 18.0 Mbps
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate : 29.970 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Interlaced
Scan order : Top Field First
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.289
Stream size : 719 MiB (93%)
Audio
ID : 4097 (0x1001)
Menu ID : 1 (0x1)
Format : AC-3
Format/Info : Audio Coding 3
Mode extension : CM (complete main)
Codec ID : 129
Duration : 5mn 35s
Bit rate mode : Constant
Bit rate : 448 Kbps
Channel(s) : 6 channels
Channel positions : Front: L C R, Side: L R, LFE
Sampling rate : 48.0 KHz
Bit depth : 16 bits
Compression mode : Lossy
Stream size : 17.9 MiB (2%)
Language : English
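As a quick sanity check on the MediaInfo report above, the video stream size and the bits/(pixel*frame) value can be recomputed from the bit rate, duration and resolution. This is only a rough sketch: the exact frame rate of 30000/1001 is an assumption (the report only shows 29.970), and the function names are made up for illustration.

```python
# Figures copied from the MediaInfo report above.
VIDEO_BIT_RATE_BPS = 18.0e6        # "Bit rate : 18.0 Mbps"
DURATION_S = 5 * 60 + 35           # "Duration : 5mn 35s"
WIDTH, HEIGHT = 1920, 1080
FRAME_RATE = 30000 / 1001          # assumed exact value behind "29.970 fps"


def stream_size_mib(bit_rate_bps: float, seconds: float) -> float:
    """Size of a constant-bit-rate stream in MiB."""
    return bit_rate_bps * seconds / 8 / 2**20


def bits_per_pixel_frame(bit_rate_bps: float, w: int, h: int, fps: float) -> float:
    """Average number of bits spent per pixel per frame."""
    return bit_rate_bps / (w * h * fps)


size_mib = stream_size_mib(VIDEO_BIT_RATE_BPS, DURATION_S)
bpp = bits_per_pixel_frame(VIDEO_BIT_RATE_BPS, WIDTH, HEIGHT, FRAME_RATE)
print(f"video stream: {size_mib:.0f} MiB, {bpp:.3f} bits/(pixel*frame)")
```

The size comes out at about 719 MiB, matching the reported stream size, and the bits/(pixel*frame) value lands near 0.29, in line with the reported 0.289 (the small difference is presumably down to rounding of the displayed bit rate).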
sneaker_ger
4th September 2011, 14:12
Hi!
After reading the first post it's unclear to me: does CUVID do all of the work on the GPU itself, or does it use a separate decoder chip (which I remember DXVA using)? Or some sort of combination?
It uses the same decoding unit as DXVA, but the GPU itself for deinterlacing IIRC.
Also, does it have the same sort of limitations that DXVA does? For example, not working at all with higher H.264 ref frame counts, or with other video filters in the chain. (I see it works with all renderers, so that's a plus.)
It doesn't rely on a DXVA compatible renderer and it works fine with up to 16 ref frames. (But 16 ref frames also work fine on DXVA - at least on Vista/7, for pretty much every Nvidia Card and all ATI cards HD4xxx and newer)
CruNcher
4th September 2011, 14:51
Yep, the difference is roughly 1 W. Without audio overhead and with default EVR (0.5 W-1 W, still optimizing the measuring) we are at 5 W (Intel decoder) and 6 W (LAV Video). If you factor in all the issues with DXVA2, such as needing a different renderer, special adaptation of the splitter, etc., I find this 1 W very acceptable ;)
http://software.intel.com/en-us/articles/intel-energy-checker-sdk/ <- Really rocks :)
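To make the arithmetic behind the first set of figures explicit (13 W vs 12 W total, 5 W idle), here is a minimal sketch. The numbers are the ones quoted in the posts; the names and the percentage calculation are just for illustration, not part of any measuring tool.

```python
# Total system power during playback, as quoted in the posts (watts).
IDLE_W = 5.0
SOFTWARE = "LAV Video (software)"
HARDWARE = "Intel decoder (DXVA2)"
totals_w = {SOFTWARE: 13.0, HARDWARE: 12.0}


def decode_overhead_w(total_w: float, idle_w: float = IDLE_W) -> float:
    """Power attributable to playback: total draw minus idle draw."""
    return total_w - idle_w


overheads_w = {name: decode_overhead_w(total) for name, total in totals_w.items()}
saving_w = overheads_w[SOFTWARE] - overheads_w[HARDWARE]
saving_pct = 100.0 * saving_w / totals_w[SOFTWARE]

for name, watts in overheads_w.items():
    print(f"{name}: {watts:.1f} W above idle")
print(f"hardware decoding saves {saving_w:.1f} W ({saving_pct:.1f}% of total draw)")
```

So the hardware path saves 1 W out of 13 W, under 8% of the total draw during playback, which is the basis for calling the difference acceptable.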
nevcairiel
4th September 2011, 14:51
MPEG-2 is a rather trivial format, compare H264 instead. :)
CruNcher
4th September 2011, 15:18
Hehe yeah there it looks much different ;) but i want to improve measuring first :D
mzso
4th September 2011, 17:13
It uses the same decoding unit as DXVA, but the GPU itself for deinterlacing IIRC.
It doesn't rely on a DXVA compatible renderer and it works fine with up to 16 ref frames. (But 16 ref frames also work fine on DXVA - at least on Vista/7, for pretty much every Nvidia Card and all ATI cards HD4xxx and newer)
OK. Thanks for the enlightenment. I never thought that the ref frame limit might be video card specific...
jmone
4th September 2011, 21:55
Out of interest and given there is nothing in the tracker, is LAV CUVID "done", "stable" and feature complete (eg V1.0)?
rica
4th September 2011, 22:44
Not sure, though, whether that also works for Nvidia.
Hi crunch. I made tests on 3840*1080 SBS file.
Here is my HW:
intel 540 on H55 (Clarkdale)
2*2GB RAM (1600 Mhz)
Geforce GTX 550 Ti
SW:
Seven 32 Pro, SP1 Build 7601
nVidia drivers 280.19
And here are my test results:
Splitters:
Haali,
Gabest,
LAV
Video decoders:
ffdshow video dxva,
lav CUVID
Renderer:
EVR Custom.
Average CPU usage is around 20%.
http://img695.imageshack.us/img695/5798/lavsplit3840evr.png (http://imageshack.us/photo/my-images/695/lavsplit3840evr.png/)
With same splitters and video decoders if i use MadVR as renderer, CPU usage climbs to 40% as expected since Madshi says:
known problems / limitations:
- hardware accelerated video decoding (DXVA) is currently not supported
- hardware accelerated deinterlacing (DXVA) is currently not supported
http://img545.imageshack.us/img545/6209/lavsplit3840madvr.png (http://imageshack.us/photo/my-images/545/lavsplit3840madvr.png/)
EDIT: If you ask me "have you given it a go with MadVR only; as decoder+renderer?": YES, the result never changed.
_ _ _ _ _
nevcairiel
5th September 2011, 08:03
Out of interest and given there is nothing in the tracker, is LAV CUVID "done", "stable" and feature complete (eg V1.0)?
Is it called 1.0? No? Right.
Also, what tracker? LAV CUVID doesn't have a public bug tracker.
Chillgurke
5th September 2011, 08:41
I think he meant: are any new features or updates planned?
jmone
5th September 2011, 10:12
Is it called 1.0? No? Right.
Also, what tracker? LAV CUVID doesn't have a public bug tracker.
For some reason I figured it would be part of http://code.google.com/p/lavfilters/
Anyway ... just swapped out my last ATI card for an nvidia one. FYI - you are not the only one getting better results from nvidia over ATI. Cyberlink's PD9 can output AVC 50 or 60p using the nvidia GPU but not with ATI.
CruNcher
5th September 2011, 14:02
Hi crunch. I made tests on 3840*1080 SBS file.
Here is my HW:
intel 540 on H55 (Clarkdale)
2*2GB RAM (1600 Mhz)
Geforce GTX 550 Ti
SW:
Seven 32 Pro, SP1 Build 7601
nVidia drivers 280.19
And here are my test results:
Splitters:
Haali,
Gabest,
LAV
Video decoders:
ffdshow video dxva,
lav CUVID
Renderer:
EVR Custom.
Average CPU usage is around 20%.
http://img695.imageshack.us/img695/5798/lavsplit3840evr.png (http://imageshack.us/photo/my-images/695/lavsplit3840evr.png/)
With same splitters and video decoders if i use MadVR as renderer, CPU usage climbs to 40% as expected since Madshi says:
http://img545.imageshack.us/img545/6209/lavsplit3840madvr.png (http://imageshack.us/photo/my-images/545/lavsplit3840madvr.png/)
EDIT: If you ask me "have you given it a go with MadVR only; as decoder+renderer?": YES, the result never changed.
_ _ _ _ _
I wasn't asking whether DXVA2 works on Nvidia; sure it does. I was more concerned with whether Nvidia's telecine handling gets the 24.30 (evil_tree sample) correct, either with LAV CUVID or via DXVA2 over EVR, or whether it also fails in the end like Quick Sync does with DXVA2.
But as nev already mentioned, this sample is strange, and for such a thing you would need something very adaptive to get it right :)
Though the fact that it works in a full software chain (LAV Splitter->LAV Video->EVR) seems to show at least that Intel's telecine does its job nicely, and Nvidia should not fail here either on EVR with the same software decoding setup.
JanWillem32
5th September 2011, 21:37
For those interested, to quote myself: In regard to the evil trees sample, I've removed the pulldown, the flags were set rather wrong. The damaged parts of the stream are now clearly visible when enforcing a strict mode:-link removed-. That explains why I kept getting pauses during playback with all renderers I tried. The original stream was 24/1.001 fps progressive as 48/1.001 fps weave interlaced.
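For reference, the two rates mentioned above work out as follows. This is a small sketch using exact fractions; the variable names are only illustrative.

```python
from fractions import Fraction

# 24/1.001 fps progressive film rate, kept as an exact fraction.
film_rate = Fraction(24) / Fraction("1.001")

# The same material carried as weaved fields: two fields per frame.
field_rate = 2 * film_rate

print(f"frame rate: {film_rate} = {float(film_rate):.3f} fps")
print(f"field rate: {field_rate} = {float(field_rate):.3f} fields/s")
```

The frame rate is 24000/1001 (about 23.976 fps) and the field rate is 48000/1001 (about 47.952 fields per second).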
rica
5th September 2011, 22:39
I wasn't asking whether DXVA2 works on Nvidia; sure it does. I was more concerned with whether Nvidia's telecine handling gets the 24.30 (evil_tree sample) correct, either with LAV CUVID or via DXVA2 over EVR, or whether it also fails in the end like Quick Sync does with DXVA2.
But as nev already mentioned, this sample is strange, and for such a thing you would need something very adaptive to get it right :)
Though the fact that it works in a full software chain (LAV Splitter->LAV Video->EVR) seems to show at least that Intel's telecine does its job nicely, and Nvidia should not fail here either on EVR with the same software decoding setup.
Crunch, with my nVidia GPU, Lav Splitter > Lav CUVID > EVR gives smooth playing; no dropping frames, no jitter but it shows 39 fps (on quality info) at 3-4% CPU utilization. (with Lav Video: 6%) (don't ask how :) )
But it is a damaged cut, and all my experience says "interlaced short MPEG-2 cuts are always problematic unless they are cut with VideoReDo or VideoReDo TV Suite."
So this sample may never be a trial subject imo.
EDIT: I haven't used Cuttermaran for years, this might be another working way to cut the mpeg2 files correctly.
Xaurus
6th September 2011, 16:41
edit: posted in wrong thread
rica
6th September 2011, 21:52
edit: posted in wrong thread
If it is addressed to me, i just said your sample has been cut with a non-suitable SW (IMO) so it has lots of issues. This is not your fault.
Sorry maybe i posted in wrong thread?
Xaurus
6th September 2011, 23:04
If it is addressed to me, i just said your sample has been cut with a non-suitable SW (IMO) so it has lots of issues. This is not your fault.
Sorry maybe i posted in wrong thread?
It was about the clip, yes, but it was directed at JanWillem who has already answered my questions in another thread. :)
In any case, I will wait for the Blu-ray release of Game of Thrones (December) before I do anything. :D
CruNcher
7th September 2011, 08:25
Crunch, with my nVidia GPU, Lav Splitter > Lav CUVID > EVR gives smooth playing; no dropping frames, no jitter but it shows 39 fps (on quality info) at 3-4% CPU utilization. (with Lav Video: 6%) (don't ask how :) )
But it is a damaged cut, and all my experience says "interlaced short MPEG-2 cuts are always problematic unless they are cut with VideoReDo or VideoReDo TV Suite."
So this sample may never be a trial subject imo.
EDIT: I haven't used Cuttermaran for years, this might be another working way to cut the mpeg2 files correctly.
Sounds good :) (Though you shouldn't concentrate too much on any info displays, but more on the actual motion as seen with your own eyes. As you know, the stream is totally wrong, and depending on the media type flags that are sent, the decoder and renderer will react strangely; if the motion comes out smooth, everything's fine.) Does it fail on EVR Custom? What is the result with madVR?
Also, did you test on EVR with Nvidia's telecine setting in the control panel active or disabled? (Though LAV CUVID should override this, I guess.)
Xaurus: You should also contact the ISV whose software you used to cut this and make them aware of this discussion :)
rica
7th September 2011, 20:59
Hi Crunch.
I cannot see any difference between madVR and EVR.
Whether inverse telecine is enabled or disabled has no effect on the result.
EVR Custom (even though it shows 23.97 fps) fails; it has lots of dropped frames.
But as I said before, this cut cannot be a reference.
Take care.
Kripsy
7th September 2011, 21:29
I've encountered a problem with a MBAFF 25p (flagged as i) AVC Blu-ray file and think I've diagnosed the troubles. The symptoms are video stuttering and are most prominent with madvr, but you can also see them with EVR. The trouble seems to be with the deinterlacing of the progressive material to 50fps even though the stream header says 25fps. If I turn the deinterlacing off or change the deinterlace framerate to 50p/60p then the stuttering disappears.
Sample: http://www.mediafire.com/?5f93gk9iw02tb0f
Mixer73
7th September 2011, 23:37
Hi nevcairiel
I'm a big fan of your projects, and I really love the quality, but I seem to have a performance bottleneck in my system and I was wondering how I might get to the bottom of it.
Decoding MPEG2 1080i streams, in PotPlayer with either EVR, EVR CP or madVR renderers, my system seems at best to drop approx 1 frame per second.
System is an i7/965 at stock clocks with a GTX260 card. It's running Vista 64bit at the moment and is in a dual-head configuration.
I'm planning on a fresh format to Win7 anyway but I'm wondering if my GPU is bottlenecking, and if this is the case whether upgrading to a DDR5 model might help, over the DDR3 RAM in the 260?
nevcairiel
8th September 2011, 07:14
The easiest way to check if there is some obvious performance bottleneck would be to get GPU-Z and check the video engine load, GPU load and memory controller load - as well as GPU memory used.
ney2x
8th September 2011, 07:34
I compared CoreAVC 3.0 to LAV CUVID 0.12; the results are below.
LAV CUVID 0.12
VPU Load = 33%
GPU Load = 12%
Memory Load = 721MB/1024MB
CPU Load = Average 14%
CoreAVC 3.0
VPU Load = 32%
GPU Load = 11%
Memory Load = 738MB/1024MB
CPU Load = Average 20%
*LAV Video 0.35
VPU Load = 0%
GPU Load = 38%
Memory Load = 332MB/1024MB
CPU Load = 30%
I used GPU Observer Gadget (I think it's the same as GPU-Z) to measure the loads. Filters used : LAV Filters 0.35, Reclock, madVR 0.74, MPC-HC. I don't know who is the real winner here but my observation is, faster seeking with LAV CUVID.
Is LAV Video a software decoder like ffdshow? Additionally, I got no glitches in madVR using LAV Video, whereas with CoreAVC and LAV CUVID I get 5-8 glitches. I'm using the default settings of madVR.
Chillgurke
8th September 2011, 07:34
Hi Nev,
is there any update in the pipe ? Or a little roadmap for this great Decoder ?
nevcairiel
8th September 2011, 08:09
Is LAV Video a software decoder like ffdshow? Additionally, I got no glitches in madVR using LAV Video, whereas with CoreAVC and LAV CUVID I get 5-8 glitches. I'm using the default settings of madVR.
Yes it is a pure software decoder. The increased GPU load is probably because the GPU actually goes into a lower power state. CUDA forces it into the high-power mode for some reason..
is there any update in the pipe ? Or a little roadmap for this great Decoder ?
I'm focusing on other projects at this time. I'm not aware of any issues that need immediate attention, everything is working pretty good.
ney2x
8th September 2011, 08:20
Ohh lastly, if you're going to release new version of LAV CUVID, kindly skip 0.13 cause that's not my lucky number :scared:
devil-strike
8th September 2011, 10:34
I want to say, you have done a very GOOD job on this codec. It is a lot better than CoreAVC, which doesn't even work anymore after 2.6 under DVBViewer.
Also, channel changes are a lot faster than with any codec I have seen, so keep up the good work.
Mixer73
8th September 2011, 12:49
The easiest way to check if there is some obvious performance bottleneck would be to get GPU-Z and check the video engine load, GPU load and memory controller load - as well as GPU memory used.
Very interesting.
It shows only 280 MB used of 896 MB, 16% GPU load, 6% memory controller load, 6% video engine load and less than 5% CPU usage, but I am still losing approx. 1 frame every 0.5-1 second...
Testing again it seems to be more related to madVR than LAV CUVID. I think with EVR CP it drops almost no frames.
However if I have any significant CPU utilisation (say video encoding), it will drop more.
Do you have any idea whether being in dual head mode might have an impact?
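The check suggested above (look at video engine, GPU, memory controller load and GPU memory in GPU-Z) can be written down as a tiny sketch. The readings are the ones reported in this post; the 90% threshold and all names are assumptions picked for illustration, not anything GPU-Z itself defines.

```python
# Readings as reported in the post (percent load, and MB of GPU memory).
loads_pct = {
    "gpu_load": 16.0,
    "memory_controller_load": 6.0,
    "video_engine_load": 6.0,
    "cpu_load": 5.0,   # reported as "less than 5%"
}
MEM_USED_MB, MEM_TOTAL_MB = 280.0, 896.0


def find_bottlenecks(loads, mem_used_mb, mem_total_mb, threshold_pct=90.0):
    """Return the names of all metrics at or above the (assumed) threshold."""
    saturated = [name for name, value in loads.items() if value >= threshold_pct]
    if 100.0 * mem_used_mb / mem_total_mb >= threshold_pct:
        saturated.append("gpu_memory")
    return saturated


suspects = find_bottlenecks(loads_pct, MEM_USED_MB, MEM_TOTAL_MB)
print("saturated metrics:", suspects or "none")
```

With these readings nothing comes close to saturation, which fits the conclusion that the dropped frames are not caused by an obvious GPU or memory bottleneck.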
CruNcher
8th September 2011, 13:12
Mixer73, all this frame dropping stuff depends on much more than just one part of the system. Also, Windows is not a real-time OS, so you have to be careful about what is going on in the system (everywhere); any load somewhere can cause frame drops (using an exclusive mode is more secure, but also not a 100% solution). It's not easy for anyone to solve without knowing your exact system and its configuration, and I mean really analyzing it in depth directly on the machine.
If you have a cluttered system and even EVR CP in exclusive mode drops frames, you have to kill everything you know could introduce this. The best approach is really to test on an absolutely clean system installation with no 3rd-party code on it, and if you then put more 3rd-party code on it, you should always check whether it interferes with your playback :(
Mixer73
8th September 2011, 13:41
Mixer73, all this frame dropping stuff depends on much more than just one part of the system. Also, Windows is not a real-time OS, so you have to be careful about what is going on in the system (everywhere); any load somewhere can cause frame drops (using an exclusive mode is more secure, but also not a 100% solution). It's not easy for anyone to solve without knowing your exact system and its configuration, and I mean really analyzing it in depth directly on the machine.
Thanks, I'm gonna do the clean format anyway, this machine is a little long in the tooth, hell maybe it will just do the trick.
I am adding an SSD to the system as well with the format so it can't get worse :P
CruNcher
8th September 2011, 14:15
Thanks, I'm gonna do the clean format anyway, this machine is a little long in the tooth, hell maybe it will just do the trick.
I am adding an SSD to the system as well with the format so it can't get worse :P
Yeah, sometimes that's the best solution. Though if you think you can handle it, you could first look at your Performance Information: since NT 6, Windows has a nice self-diagnostic layer that can be really helpful for finding at least the not-so-deep performance problems, and some of them could also be the cause of overall system issues :)
Boot up/down time analysis codes 100, 102, 103, etc. are the first thing to look at :)
jmonier
9th September 2011, 18:58
I've noticed two anomalies with LAV CUVID which, while they don't seem to be a problem for me, may be of interest. I haven't tested really extensively, but in my setup (which includes Zoomplayer and madVR), they happen reliably with LAV CUVID and not with other decoders.
1) GPU memory usage (as displayed by GPU-Z). Whenever I stop playback and then start it again, the memory usage is higher than it was before. This continues until the upper limit of memory is reached, then it stays at that value. Playback is fine, even after the upper memory limit is reached. The memory only goes back to zero when I exit Zoomplayer.
2) madVR Windowed Mode with interlaced material. This causes constant dropped and delayed frames. It's fine in Full Screen Exclusive mode and also fine with non-interlaced material in both modes.
CruNcher
9th September 2011, 20:27
@nev
http://forum.doom9.org/showthread.php?p=1525110#post1525110 <- here you can see the current efficiency Intel reached themselves with the copy overhead on SB (most probably the driver team)
will be super interesting to compare that vs Lav Cuvid :)
TheShadowRunner
11th September 2011, 00:15
Nevcairiel, thank you very much, between your filters and Madshi's renderer, you've pretty much given a new life to HTPCs.
I'm wondering, do you ever plan to integrate LAV CUVID into Lav Video Dec eventually?
So that 8bit h264 could be CUDA accelerated and 10bit could be software decoded, with a single codec? (since these video streams share the same subtype, setting 2 decoders is pretty much impossible (?))
Also, i take it you've no hints regarding nvidia's ivtc feature available in their driver?
CruNcher
11th September 2011, 10:35
Nope, nev was one of the first who thought about this ;) LAV CUVID supports fallback (it won't allow connections for 10-bit 4:2:2 or 4:4:4 content, and also not for MPEG-2 4:2:0 content); you just have to configure your player right :)
You put LAV CUVID at the top of the chain with H.264 enabled, and LAV Video below it (or any other decoder you prefer for this type of content, which makes it much more flexible in deciding what you want to use). Now if you open 10-bit content it will use LAV Video; the same goes for MPEG-2 Studio if you also select MPEG-2.
Though the MPEG-2 fallback is a special case: it only works combined with LAV Splitter.
TheShadowRunner
11th September 2011, 11:46
Brilliant, thanks for the tip, CruNcher. Now I need time for some testing ;)
Strange this isn't documented anywhere though ^^;
clsid
11th September 2011, 12:46
It is not documented because it is 'normal' behavior. For example the MPC and ffdshow DXVA decoders also deny connection for stuff they do not support to allow fallback to the next available decoder.
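The fallback behavior described above (a decoder refuses the connection for content it does not support, and the player moves on to the next decoder in its list) can be modelled in a few lines. This is only a conceptual sketch, not the DirectShow API: the `Stream` and `Decoder` types, the acceptance rules and the decoder names are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Stream:
    codec: str       # e.g. "h264"
    bit_depth: int   # e.g. 8 or 10
    chroma: str      # e.g. "4:2:0"


@dataclass
class Decoder:
    name: str
    accepts: Callable[[Stream], bool]   # True means "connection allowed"


def hardware_like(stream: Stream) -> bool:
    """Hypothetical rule: only 8-bit 4:2:0 H.264 is accepted."""
    return stream.codec == "h264" and stream.bit_depth == 8 and stream.chroma == "4:2:0"


def software_like(stream: Stream) -> bool:
    """Hypothetical rule: the software decoder takes everything."""
    return True


def pick_decoder(chain: List[Decoder], stream: Stream) -> Optional[str]:
    """Try decoders in priority order; the first one that accepts wins."""
    for decoder in chain:
        if decoder.accepts(stream):
            return decoder.name
    return None


chain = [Decoder("hardware decoder", hardware_like),
         Decoder("software decoder", software_like)]

print(pick_decoder(chain, Stream("h264", 8, "4:2:0")))
print(pick_decoder(chain, Stream("h264", 10, "4:2:0")))
```

The 8-bit stream ends up on the first decoder in the chain, while the 10-bit stream is refused there and falls through to the software decoder, which is the setup described in the posts above.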
mindbomb
11th September 2011, 13:49
how is the mpeg 4 asp decoding in cuvid?
i don't have a compatible graphics card to test it, but dxva for it on the radeon 6000 series using the arcsoft video decoder results in a generally blurrier picture and there are frequently bugs. I'm wondering if similar problems are found with cuvid.
betaking
11th September 2011, 14:00
how is the mpeg 4 asp decoding in cuvid?
i don't have a compatible graphics card to test it, but dxva for it on the radeon 6000 series using the arcsoft video decoder results in a generally blurrier picture and there are frequently bugs. I'm wondering if similar problems are found with cuvid.
The latest Cyberlink video decoder, the DivX decoder, or Daum PotPlayer's built-in decoder can support MPEG-4 ASP DXVA on the Radeon 6000 series!
CUVID is only for Nvidia CUDA graphics cards!
CruNcher
12th September 2011, 10:40
how is the mpeg 4 asp decoding in cuvid?
i don't have a compatible graphics card to test it, but dxva for it on the radeon 6000 series using the arcsoft video decoder results in a generally blurrier picture and there are frequently bugs. I'm wondering if similar problems are found with cuvid.
The difference is that they can choose the PP (post-processing) method they want to apply themselves. MPEG-4 ASP has no standard-defined PP or any kind of in-loop filter; it's up to the implementer whether to use one and how (bitstream-information based, default strength, manual). Everything's possible, and it will most probably differ from implementer to implementer (resulting in different visual output) :D
ArcSoft then seems to use a heavy PP by default to make sure blocks are reduced as much as possible, accepting that it kills fine details (though they could also use a more adaptive approach that only works in their own framework, needing bitstream information).
This, and the fact that the IDCT and Qpel could differ (between encoder and decoder), were the two major reasons for it to fail and be replaced by H.264.
MPEG-4 ASP lacked the commercial stability that was needed for HD DVD, Blu-ray and the more advanced interoperable Internet (cloud) solutions we see today, in combination with a hardware ecosystem that works very well (unthinkable with MPEG-4 ASP; that's why DivX Networks (Inc) initiated their certification: they saw that others would get lost in this catastrophe without any guidance, and that an ecosystem of their own with their own standards was badly needed for those sheep at that time) :)
Andy o
12th September 2011, 10:49
Yes it is a pure software decoder. The increased GPU load is probably because the GPU actually goes into a lower power state. CUDA forces it into the high-power mode for some reason..
Is there only a low and a high power mode, or is this a mid power state like AMD does with many mid-high-end cards like the 5770 (and the 4670)? When UVD is being used, I get 400/900 (locked so there's no DPC latencies). Idle/low state is 157/300 and high is 850/1200. I'm considering swapping to my GTX460 once again to check out your latest developments especially re: deinterlacing.
nevcairiel
12th September 2011, 10:54
There are typically 3 states with NVIDIA cards that are used. P0 is the full 3D power mode, P8 is the "Video" mode at around half speed, and P12 is the 2D low-power mode (which seems to actually be even lower than your low state, just going by clocks).
DXVA uses the P8 mode, unless the 3D load from post-processing is too high and forces P0. But CUDA always forces the P0 mode, unless you use a tool like NVIDIA Inspector to change the modes per-application or by some other rules.
Recently, NVIDIA changed the way the P-state switching is done. However, in the upcoming 285 driver release, they went back to a more "conservative" P-state logic, because it apparently caused quite some issues.
Using my DXVA-HD deinterlacer, it usually starts out in P0 mode, but if i let it run for a while, after 30s or so it goes into P8 (which can be seen in my debug timings, as the values double. :))
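The P-state rules described above can be summarized as a small decision function. This is a sketch of the description in this post only, not of the actual driver logic: the 50% threshold for "3D load too high" is an invented placeholder, and per-application overrides (e.g. via NVIDIA Inspector) are not modelled.

```python
def expected_pstate(uses_cuda: bool, uses_dxva: bool,
                    load_3d_pct: float = 0.0,
                    p0_threshold_pct: float = 50.0) -> str:
    """Return the P-state one would expect per the description above.

    p0_threshold_pct is a hypothetical cut-off for "3D load too high".
    """
    if uses_cuda:
        return "P0"    # CUDA always forces the full 3D power mode
    if uses_dxva:
        # DXVA stays in the "Video" mode unless post-processing load is high
        return "P0" if load_3d_pct > p0_threshold_pct else "P8"
    return "P12"       # 2D low-power mode


print(expected_pstate(uses_cuda=True, uses_dxva=False))
print(expected_pstate(uses_cuda=False, uses_dxva=True, load_3d_pct=20.0))
print(expected_pstate(uses_cuda=False, uses_dxva=True, load_3d_pct=80.0))
print(expected_pstate(uses_cuda=False, uses_dxva=False))
```

This also illustrates the earlier observation in the thread: a pure software decoder leaves the card in a lower power state, so the reported GPU load percentage can look higher even though the card is doing less work per clock.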
pankov
12th September 2011, 11:02
Andy,
yes there are three power states in the recent NVIDIA cards (including the GTX460).
P0 - full 3D power
P8 - video
P12 - idle
There is a nice application called NVIDIA Inspector which has a feature called "Multi Display Power Saver" which allows you to select which apps trigger which state and also at what level (Threshold) of GPU/VPU usage should each level be activated.
I use GTX 460 SE and I'm very glad I changed my 5750 for it because now I can watch my 1080i recordings with perfect deinterlacing and perfect display synchronization thanks to madVR which works perfectly with LAV CUVID Decoder.
Edit:
nevcairiel was quicker
CruNcher
12th September 2011, 11:13
Also, nev could put this multi-display power saver part right into LAV CUVID by additionally using NVAPI and implementing a switch (always keep P8 state on/off).
For most people this would be useless, though, as you need the dynamics of the system, or else you'll always end up with results that are too slow. Nvidia has to find the best dynamics at the driver level, which is hard since they have to take a lot of different scenarios into account (web, games, video, folding); it will never be perfect in performance/power/latency for everything, that's for sure :)
The last time Nvidia engineers tried to lower the latency, most cards couldn't handle the stress and errored out; gamers especially were not happy with that :(
thuan
12th September 2011, 11:13
On my GTX560 card, normally my card has 3 power states like nevcairiel mentioned, but after using CUVID 0.12 my card won't go to P12 and gets stuck at P8 when idle. I need to restart to get the correct behavior back. @Dev: do you have the same problem?
EDIT: DXVA is fine BTW.
nevcairiel
12th September 2011, 11:19
On my GTX560 card, normally my card has 3 power states like nevcairiel mentioned, but after using CUVID 0.12 my card won't go to P12 and gets stuck at P8 when idle. I need to restart to get the correct behavior back. @Dev: do you have the same problem?
EDIT: DXVA is fine BTW.
I don't have that problem, it always goes back into P12 for me eventually.
CruNcher
12th September 2011, 11:30
Also, Windows Vista/7 with Aero is more complex. It should put any application in the background on standby, but who knows what a faulty application could do with your GPU ;)
Best is to check with Process Explorer and the OS GPU state to see whether something in that context could also trigger a P-state switch or keep the P-state constantly high; for example, I guess some changing wallpapers could trigger it.
A JavaScript-driven picture change in the browser already can, for example if you additionally scroll around (if the spike occurs at the same time) ;)
Nvidia is also going to introduce an FPS capper in the driver, which could be used to avoid such things if applications don't come up with such solutions themselves ;)
Andy o
12th September 2011, 11:42
Thanks guys, Nvidia Inspector seems like a pretty useful tool. I'll probably be trying it soon.
Using my DXVA-HD deinterlacer, it usually starts out in P0 mode, but if i let it run for a while, after 30s or so it goes into P8 (which can be seen in my debug timings, as the values double. :))
Does this mean I'd have to use LAV Video instead of CUVID?