Old 7th April 2018, 15:02   #50121  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,921
Quote:
Originally Posted by e-t172 View Post
You do not need to "hold back" any frames to do IVTC if you already know the cadence. At least not on 60p (not sure about 60i). I think I've made that clear.
Because you don't know the cadence without it.
And the real trick of TVs is that they can do both at the same time by displaying at 120 Hz. The transition has to be smooth, and that's important so the switch between movie and CM is not a judder party.
So the real trick is not to switch between modes, it is to stay in one.
And that's most likely the reason 60 Hz Sonys can't do it even though they can display 24p correctly.

Quote:
They often are. Why wouldn't they be? They're coming from the same framebuffer on the source side, and there is no lossy temporal processing in between. (Though I guess maybe dithering steps could throw a wrench into this assumption.)
It's made for broadcast, not for us HTPC users.
And broadcast is far from bit identical, and a simple comparison is not enough.
Quote:
Even if they're not, any IVTC algorithm worth its salt will not assume they are identical. Instead it will compute some kind of difference metric (such as simple RMS) to provide at least some resilience to noise.
And that's why you should buffer frames to make sure you get a better result.

Quote:
And again, please keep in mind that duplicate detection is only used to detect cadence changes. Once you know the cadence, you don't need to do any measurements to know which frames should be discarded and which should be kept.
You have to check for changes. Broadcast changes all the time.

Quote:
I don't see how that's complicated - it's just arithmetic. Detecting the cadence is the complicated part. The decimation part is trivial in comparison. Once the TV knows the cadence (i.e. it knows when the 3:2 pattern starts and when it ends), it knows that it needs to switch to 24p, and it knows which frames need to be displayed and when.
Quote:
The only case where you incur massive input lag is if you want to detect cadence changes in advance so that you can seamlessly transition from one cadence to the next. That indeed requires you to look far into the future. About 10 years ago I wrote my own IVTC filter that did something like that, and I believe that's also how madVR IVTC works to some extent. The video player can afford to do that because it can preprocess frames in advance; the TV can't. But that's only a "nice to have", not a requirement, especially for high-quality 60p. You can do without it as long as you're not after perfect transitions between different cadences, which is really not a problem for the scenario that this discussion is about (stable 24p@60Hz).
Why should a TV use lower quality and not simply add input lag?

TVs mostly display broadcast, which is the opposite of high quality.

Quote:
You can deduce from your own numbers that in the worst case scenario the minimum required input lag is about 6 ms in your very own example. (The worst case scenario is when you're in the middle of the 3:2 pattern, at t+42ms, where you need to display the second 24p frame right now but it will only come at t+48ms - a 6 ms delay.) That's even better than the 16 ms number I mistakenly put forward initially.
If you have duplicated frames that are nearly or even bit perfect you could do it faster, but what end device is outputting something like that except a PC?
Why would you write an algorithm for that?

So yes, you can do it faster by blindly following a cadence pattern, dropping/repeating frames as it pleases and ignoring transitions.
Very limited use case at best.

Quote:
What makes you think that it can't be done in parallel with the rest of image processing? I can very easily imagine an implementation where images can be removed from the internal processing pipeline queue during or after other processing steps have been done.
Because you only start rendering images after the IVTC algo is done with them, to be safe.
The calculations for zone overdrive and so much other stuff need to be made, and they can be totally screwed over if you change the frame that needs to be displayed.
Quote:
Oh so we trust the terminology used in TV OSD menus now? Since when?
This cadence detection is very important for frame interpolation. It's basically the first step and part of it.
Old 7th April 2018, 15:04   #50122  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
Quote:
Originally Posted by huhn View Post
Because you don't know the cadence without it
You can however safely assume that the cadence is not going to change all the time, so you can just analyze the cadence and act on it once you know it - without any buffering. Sure, it may result in the cadence processing taking a second or so to turn on, but that's not a problem in real-world usage - because it's just not changing repeatedly.

No extra latency required. Buffering requires extra hardware, which costs money, hence it's easy to imagine that it's best avoided.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 7th April 2018, 15:07   #50123  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,921
Quote:
Originally Posted by nevcairiel View Post
You don't need to delay anything when decimating 60 fps to 24 fps, you just throw away some frames in between. This is not like encoded 30 fps IVTC content where you get fields that need re-combining into frames. It's full frames on a 3:2 repeat pattern.
You can assume a perfect situation here, but that's very dangerous.

You assume the pattern doesn't change, ever.
This adds at least 8 ms of delay; that's the pure minimum.
Because the 3:2 part has a frame that is held for 48 ms, and within these 48 ms you don't even have the frame you need to display after 42 ms; you have to wait for the frame with the 33 ms length.
Old 7th April 2018, 15:08   #50124  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,921
Quote:
Originally Posted by nevcairiel View Post
You can however safely assume that the cadence is not going to change all the time, so you can just analyze the cadence and act on it once you know it - without any buffering. Sure, it may result in the cadence processing taking a second or so to turn on, but that's not a problem in real-world usage - because it's just not changing repeatedly.

No extra latency required. Buffering requires extra hardware, which costs money, hence it's easy to imagine that it's best avoided.
Because TVs are made for broadcast, which changes all the time.
And you have the buffer for frame interpolation anyway; the hardware is present.
Old 7th April 2018, 15:55   #50125  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Quote:
Originally Posted by huhn View Post
Because you don't know the cadence without it.
Sure, that means that it might take some time for the TV to switch into the proper cadence. Let's say that it takes 6 patterns for it to detect a cadence, i.e. 30 frames. That means it will take around 0.5 sec for the TV to detect a cadence change. That's quite acceptable for most use cases IMHO. People don't spend their time switching between 24p and other types of content constantly, and if they do, they probably don't care about it being silky smooth around transitions.
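
Just to make that concrete, here's a rough sketch of the kind of detector I'm describing - purely illustrative, with made-up names and thresholds, and certainly not what any TV (or madVR) actually implements:

Code:
import numpy as np

def frame_diff(a, b):
    # Simple RMS difference between two 60p frames (uint8 arrays).
    return np.sqrt(np.mean((a.astype(np.float32) - b.astype(np.float32)) ** 2))

def detect_32_cadence(diffs, threshold=1.0, patterns=6):
    # diffs[i] is the RMS difference between 60p frames i and i+1.
    # Clean 3:2 pulldown repeats A A A B B, so the "near duplicate" diffs land
    # on offsets {0, 1, 3} of every 5-diff group. 6 patterns = 30 frames ~ 0.5 s.
    n = 5 * patterns
    if len(diffs) < n:
        return None
    start = len(diffs) - n
    dup = [d < threshold for d in diffs[start:]]
    for phase in range(5):
        if dup == [((start + i + phase) % 5) in (0, 1, 3) for i in range(n)]:
            return phase   # cadence locked: frame i starts a new 24p frame when (i + phase) % 5 is 0 or 3
    return None            # no stable 3:2 pattern over the last ~0.5 s

def decimate(frames, phase):
    # Once the phase is known, keep only the first copy of each 24p frame
    # (offsets 0 and 3 of every 5-frame group) - no look-ahead buffering needed.
    return [f for i, f in enumerate(frames) if ((i + phase) % 5) in (0, 3)]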

Quote:
Originally Posted by huhn View Post
And the real trick of TVs is that they can do both at the same time by displaying at 120 Hz. The transition has to be smooth, and that's important so the switch between movie and CM is not a judder party.
So the real trick is not to switch between modes, it is to stay in one.
And that's most likely the reason 60 Hz Sonys can't do it even though they can display 24p correctly.
I'm sceptical. AFAIK the reason why changing modes takes time is that the HDMI connection is being reset (handshake, etc.). But here there is no HDMI connection - the TV is changing modes internally (not because of the source). Therefore I would expect the TV to be able to seamlessly switch between 24 Hz and 60 Hz in this scenario. Maybe the internal processing pipeline of typical TVs prevents them from doing that, but I honestly wouldn't know. Do you have evidence for these statements?

Quote:
Originally Posted by huhn View Post
It's made for broadcast, not for us HTPC users.
Nowadays I would expect plenty of non-broadcast sources to output 24p over 60 Hz simply because they don't know better or can't be bothered to do better. PCs of course, but also dongles, phones/tablets, streaming apps, etc. My guess is that's why modern TVs pay extra attention to it and try to convert it to proper 24p before display. I don't think they're doing it solely because of broadcast sources. In fact, if you look at the Rtings results you'll notice that most pre-2017 TVs can't do this trick - it's only very recent TVs that are capable of doing this on-the-fly decimation thing. Which is why I find this phenomenon especially interesting and worth discussing now.

Quote:
Originally Posted by huhn View Post
And broadcast is far from bit identical, and a simple comparison is not enough.
You don't need to tell me. Again, I wrote my own IVTC filter specifically designed for broadcast TV. I have first-hand experience of the issues involved, and yes, it's a mess. But we're not dealing with broadcast here, we're dealing with a much cleaner stream.

Quote:
Originally Posted by huhn View Post
And that's why you should buffer frames to make sure you get a better result.
You don't have to buffer frames to achieve a good result. You only need to do that if you want perfect transitions, which is a very specific, less important requirement.

Quote:
Originally Posted by huhn View Post
You have to check for changes. Broadcast changes all the time.
Not really, no. The worst I've seen is a TV broadcast constantly switching between soft telecine and hard telecine (which is bonkers), but even that can be handled without buffering by making the algorithm a bit more clever. And even then I've only seen this kind of craziness with 30i (because interlacing makes everything more fun!). 60p is usually broadcast as a perfectly clean 3:2 pattern. And again, this is not about broadcast.

Quote:
Originally Posted by huhn View Post
Why should a TV use lower quality and not simply add input lag?
Because engineering tradeoffs? Because algorithms that use larger buffers require more memory and are often more complicated? Because TV manufacturers are aware that people use their devices for things other than video playback?

Quote:
Originally Posted by huhn View Post
TVs mostly display broadcast, which is the opposite of high quality.
I think that assumption is becoming weaker and weaker. This is 2018 - it's all about Netflix and friends now.

Quote:
Originally Posted by huhn View Post
If you have duplicated frames that are nearly or even bit perfect you could do it faster, but what end device is outputting something like that except a PC?
It would be faster, but not that much faster. Computing something like an RMS difference between two frames is not much slower than comparing them byte by byte. Both operations are pretty fast compared to other image processing tasks, especially on specialized hardware. In both cases I would expect memory bandwidth to be the bottleneck anyway.

Quote:
Originally Posted by huhn View Post
Why would you write an algorithm for that?
I wouldn't. I would just compute some metric for the difference.

Quote:
Originally Posted by huhn View Post
So yes, you can do it faster by blindly following a cadence pattern, dropping/repeating frames as it pleases and ignoring transitions.
Very limited use case at best.
You don't ignore transitions. There is just a small reaction delay (less than a second) when a transition occurs. Which is very reasonable, even for broadcast.

Quote:
Originally Posted by huhn View Post
Because you only start rendering images after the IVTC algo is done with them, to be safe.
The calculations for zone overdrive and so much other stuff need to be made, and they can be totally screwed over if you change the frame that needs to be displayed.
I'm not sure I agree, but I don't think that discussion is relevant anyway because I've already shown you multiple times that IVTC can be done with negligible delay as long as you're happy with it not being perfect over a short window around transitions.

Quote:
Originally Posted by huhn View Post
This cadence detection is very important for frame interpolation. It's basically the first step and part of it.
I guess that makes sense. It doesn't necessarily mean that enabling cadence detection results in motion interpolation being forcibly enabled, though.

Quote:
Originally Posted by huhn View Post
You assume the pattern doesn't change, ever.
No, we're not assuming that. We're assuming that the user is okay with the cadence being slightly wrong for a short period of time (<1 second) right after a cadence change occurs.

Quote:
Originally Posted by huhn View Post
This adds at least 8 ms of delay; that's the pure minimum.
Because the 3:2 part has a frame that is held for 48 ms, and within these 48 ms you don't even have the frame you need to display after 42 ms; you have to wait for the frame with the 33 ms length.
I had calculated 6 ms, but that's neither here nor there, I think we agree in principle.
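
For the record, both numbers fall out of the same back-of-the-envelope arithmetic (assuming an ideal, uninterrupted 3:2 pattern):

Code:
refresh_60 = 1000 / 60     # ~16.67 ms per 60 Hz refresh
frame_24   = 1000 / 24     # ~41.67 ms per 24p frame

# The second 24p frame is due one 24p period after the first, but its first
# 60 Hz copy only arrives after the three copies of the first frame.
delay_exact   = 3 * refresh_60 - frame_24   # ~8.3 ms (your 8 ms)
delay_rounded = 48 - 42                     # 6 ms with the rounded 16/42/48 ms figures
print(delay_exact, delay_rounded)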

Quote:
Originally Posted by huhn View Post
Because TVs are made for broadcast, which changes all the time.
I think you're exaggerating things. Content type might change every few minutes. It certainly won't change every few seconds.

Quote:
Originally Posted by huhn View Post
And you have the buffer for frame interpolation anyway; the hardware is present.
If you want to perfectly predict cadence changes in advance on noisy TV broadcasts you need a much larger buffer than the one required for frame interpolation. Back in the days when I was writing an IVTC filter for TV broadcasts, I had to buffer at least a dozen frames in advance or something like that.

huhn: I really think this discussion is going nowhere and I'm not sure we'll be able to convince each other unless we start reverse engineering TV processing pipeline internals (which would be, well, hard). I don't even understand why we're having this discussion, because Rtings clearly demonstrated that at least a dozen 2017 TVs from a variety of manufacturers are, in fact, capable of decimating 24p@60Hz on the fly and these TVs don't have more input lag than other models (you can add "Input Lag" columns to the table if you're not convinced). It's not a question of "if they can" - there is a mountain of evidence that they, in fact, can. My original question was "how can we exploit this to simplify or improve our playback systems", which I think is the more relevant question here. Can we get back to that please?

Old 7th April 2018, 16:25   #50126  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,921
Quote:
Originally Posted by e-t172 View Post
huhn: I really think this discussion is going nowhere and I'm not sure we'll be able to convince each other unless we start reverse engineering TV processing pipeline internals (which would be, well, hard). I don't even understand why we're having this discussion, because Rtings clearly demonstrated that at least a dozen 2017 TVs from a variety of manufacturers are, in fact, capable of decimating 24p@60Hz on the fly and these TVs don't have more input lag than other models (you can add "Input Lag" columns to the table if you're not convinced). It's not a question of "if they can" - there is a mountain of evidence that they, in fact, can. My original question was "how can we exploit this to simplify or improve our playback systems", which I think is the more relevant question here. Can we get back to that please?
These numbers are done in gaming mode; none of these TVs is known to be able to do it in gaming modes.
Outside of gaming mode they have something like 40-160 ms input lag, which is exactly my point. And to top it off, they even show you that you have to use a setting from the interpolation settings (I'm not saying you are getting a soap opera on your screen).

I'll say it out loud now: this feature comes from frame interpolation because it is needed for it. They got it for free.

The reason this is hard to use for an HTPC user is, I repeat myself:
high input lag, no PC mode.

And here are some new ones: it will die as soon as you move your mouse. Usually loss of chroma resolution, and lots of other processing quirks.
Quote:
"how can we exploit this to simplify or improve our playback systems"
You set your TV to 60 Hz in Windows and activate the option in the interpolation settings. And live with the problems of it.
Old 7th April 2018, 17:05   #50127  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,407
I know my 2017 LG OLED (touted as low input lag) only has low input lag in PC or Game mode, and in that mode it switches off most processing. It even does worse tone mapping in PC HDR modes, though I think this might be due to not wanting to over-saturate sRGB games rather than not having enough time.

Measuring input lag in game or PC mode but then using other modes to test things like 24p in 60Hz can give misleading impressions.

In its other modes my TV's input lag is very high; the mouse cursor feels like it is connected by long rubber bands. I haven't measured it, but it is too high to use it as a monitor.
__________________
madVR options explained
Old 7th April 2018, 17:34   #50128  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Quote:
Originally Posted by huhn View Post
These numbers are done in gaming mode; none of these TVs is known to be able to do it in gaming modes.
Rtings measures TV input lag in a dozen different modes. Not all are gaming modes.

My point was not really that TVs can do this with no input lag, just that the non-gaming-mode input lag is similar between TVs that have this "24p@60Hz" feature and those that do not. I should have been clearer about this.

Quote:
Originally Posted by huhn View Post
Outside of gaming mode they have something like 40-160 ms input lag, which is exactly my point.
Okay. Let me rephrase your point as follows:

"You will only get true 24p from 60 Hz with these TVs if you use them outside low-input-lag mode. Therefore you cannot use that approach and have low input lag at the same time."

I agree with that statement. I will add, however, that this only matters if you care about input lag.

(That said, I have not seen actual evidence that TVs are unable to recover the 24p stream when running in low-input-lag mode. I agree that it makes sense though, so I'm happy to accept this assumption.)

Can we move past this now?

Quote:
Originally Posted by huhn View Post
And to top it off, they even show you that you have to use a setting from the interpolation settings (I'm not saying you are getting a soap opera on your screen).
Sure.

Quote:
Originally Posted by huhn View Post
I'll say it out loud now: this feature comes from frame interpolation because it is needed for it. They got it for free.
Okay.

Quote:
Originally Posted by huhn View Post
The reason this is hard to use for an HTPC user is, I repeat myself:
high input lag, no PC mode.
This is not something everyone necessarily cares about, but sure, I agree.

Quote:
Originally Posted by huhn View Post
And here are some new ones: it will die as soon as you move your mouse.
I think moving a very small blob of pixels like the mouse is unlikely to confuse the decimation process. Also the issue will disappear as soon as the mouse stops moving. But I get your point - any use of the UI can cause this problem. On the other hand, you wouldn't want to mess with the UI when playing a movie anyway - at native 24 Hz that would be painful too. The only scenarios where the UI is perfectly usable during playback are when running at 48/72/120/etc Hz or when using Smooth Motion.

Quote:
Originally Posted by huhn View Post
Usually loss of chroma resolution, and lots of other processing quirks.
Agreed.

I think we're going somewhere
Old 7th April 2018, 17:42   #50129  |  Link
zaemon
Registered User
 
Join Date: Mar 2016
Posts: 27

Do you guys think it is worth investing in a CPU that can (software) decode HEVC 4K in order to leave some room for madVR GPU processing? I tested software decoding with my Threadripper and it handled it very easily, so I think a Coffee Lake i5 or i7 could handle software decoding without costing a huge pile of money. It's probably better to get a faster GPU, but I already have a GTX 1080, so there's not much room in that regard (until GTX 11xx maybe). Also, I'm a big fan of software decoding as I feel it provides a smoother experience.
Old 7th April 2018, 18:14   #50130  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Asmodian View Post
I still think using 2.4x would be more appropriate, direct 4x does take a little more GPU power than 2x and it would be consistent. Upscaling to 4x and then downscaling to 2.05x is not ideal.
So 2.4x, regardless of whether we're talking about "direct quad" or "double twice"?

Quote:
Originally Posted by Warner306 View Post
FSE is now required for 10-bit HDR passthrough, from madVR to GPU to display? I didn't know FSE was a requirement.
FSE has always been a requirement to allow madVR to pass 10bit to the GPU driver (and further to the display), in all OSs. Doesn't matter if it's SDR or HDR. There's only one exception to this rule: When you activate the OS HDR switch, 10bit suddenly is also possible in windowed mode - but only if fullscreen.

The reason for all this is that without FSE everything runs through DWM (desktop window manager), and DWM is limited to 8bit. However, if you turn the OS HDR switch on, suddenly DWM runs in 10bit. Whether or not you switch the GPU control panel to 12bit doesn't make any difference here. The limitation to 8bit is in DWM, switching GPU control panel options doesn't help with that.

That said, dithered 8bit should work very well, even for HDR.
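
To illustrate why (a toy example only - nothing like madVR's actual dithering algorithms):

Code:
import numpy as np

# One scanline of a smooth 10-bit ramp (codes 200..260) quantized down to 8 bit.
ramp10 = np.linspace(200.0, 260.0, 1920)

truncated = np.round(ramp10 / 4)                        # plain rounding: ~16 coarse bands
noise     = np.random.uniform(-0.5, 0.5, ramp10.shape)
dithered  = np.round(ramp10 / 4 + noise)                # same 8-bit values, but the steps
                                                        # turn into fine noise

print(np.unique(truncated).size)                        # ~16 visible steps across the screen
# Local averages of the dithered line still follow the original 10-bit ramp,
# which is why properly dithered 8 bit shows no banding:
blocks = dithered.reshape(-1, 32).mean(axis=1) * 4
print(np.abs(blocks - ramp10.reshape(-1, 32).mean(axis=1)).max())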

Quote:
Originally Posted by Warner306 View Post
And one more thing...I assume the Windows OS is sending the metadata to the display untouched and isn't altering it to change the tone mapping, just like the private APIs do?
I don't know, I've no way to verify that.

IMHO the OS HDR switch is pure evil, and should be avoided at all cost, until Microsoft gets off their high horse and finally learns how to do things right. But that's just my personal opinion, of course.

Quote:
Originally Posted by Warner306 View Post
This user wants to know if you can somehow send the PQ transfer function to the display to trigger HDR mode and use madVR's adjustable and sometimes superior tone mapping to replace the display's tone mapping? Sounds technically impossible, but would be nice.

Edit: I forgot about "process HDR content by using pixel shader math" as pointed out by Asmodian. How far would that get him? Are all of the YCbCr values tone mapped and stripped of illegal color values and then sent to the display to be tone mapped again? Does the metadata remain the same? Is this recommended?
"process HDR content by using pixel shader math" will tone map according to the settings you've chosen, and then pass the tone mapped result further to the display - still in HDR format, with updated metadata. As a result the display should switch to HDR mode, and hopefully disable its own tone mapping. However, many TVs are rather dumb, so they might not disable their internal tone mapping, even if it's no longer needed, which may result in some further gamma alterations. I can't predict the exact result, because it heavily depends on the display's behaviour, so every user has to simply try.

Quote:
Originally Posted by BetA13 View Post
Were there some performance improvements?
I changed some "let madVR decide" settings for better performance, but the algos themselves shouldn't have changed their speed. If you can use NGU Medium now instead of Low, that's probably an improvement in the GPU drivers, I would guess...

Quote:
Originally Posted by sauma144 View Post
I'm gonna have some more crazy dreams about this new algorithm.


Quote:
Originally Posted by Grimsdyke View Post
I have an issue with a few PAL-SD files in the following configuration:
-> MPC-BE + LAV (Hardware device to use: Automatic(Native)) + MadVR => File plays normally for a few seconds then screen turns completely green !! File keeps playing and I hear sound.
-> MPC-BE + LAV (Hardware device to use: (GPU selected)) + MadVR => File plays normally
Unfortunately I can't keep this configuration because the drop in performance is too big for 4K. Not that important but maybe you could look into this when you have the time. Thx !!
Quote:
Originally Posted by zaemon View Post
Same problem for me with 10-bit files. Disabling HW acceleration is not an option with HEVC for me unfortunately. And I still got this pink/green blinking at start when HW decoding is running.
D3D11 native support is still in somewhat "rough" shape. I wish I had more time for madVR development. I'm planning to improve D3D11 native support "soon".

Quote:
Originally Posted by e-t172 View Post
A general question about achieving proper 24p cadence:

I recently realized after discussing it with someone else that neither 24 Hz nor Smooth Motion is required to achieve proper cadence with modern TVs/projectors. That's because they are capable of automatically recovering a 24p signal from a 24p@60Hz (3:2 pulldown) input, i.e. they are capable of IVTC/decimation.
I think you are a bit too optimistic here. You make it sound as if all modern TVs/projectors could do that. Probably some can, but I highly doubt *all* can do that. I also don't know how reliable that mode is. E.g. my JVC projector is pretty bad even at basic things like detecting video mode vs film mode deinterlacing. When I watch soccer and there are overlays, everything starts juddering. It's awful! So I'm not very confident that such a 60p -> 24p decimation algorithm in the TVs will be perfect.

Anyway...

Quote:
Originally Posted by e-t172 View Post
Also, I seem to remember madshi saying at some point that madVR is not designed to generate a "perfect" 3:2 pulldown cadence when playing 24p@60Hz, which would prevent this solution from working. Is that true?
Yes, that is still the case. It's been on my to do list for ages to tune madVR to produce a repeatable 3:2 pattern when smooth motion is turned off, but I still haven't found the time to do that yet.

Quote:
Originally Posted by zaemon View Post
Do you guys think it is worth investing in a CPU that can (software) decode HEVC 4K in order to leave some room for madVR GPU processing? I tested software decoding with my Threadripper and it handled it very easily, so I think a Coffee Lake i5 or i7 could handle software decoding without costing a huge pile of money. It's probably better to get a faster GPU, but I already have a GTX 1080, so there's not much room in that regard (until GTX 11xx maybe). Also, I'm a big fan of software decoding as I feel it provides a smoother experience.
I've also been a long time fan of software decoding. However, 4K HEVC decoding is *really* hard on the CPU, and it will depend on the bitrate and framerate, too. Low-bitrate 24fps HEVC is much easier to decode than high-bitrate and/or 60fps HEVC. I don't know which kind of CPU you need to decode even the most difficult HEVC videos. Maybe fast CPUs can do that today, I've no idea. But you'll probably get a bigger bang for the buck if you keep a budget CPU and upgrade your GPU instead.

Decoding on the GPU is usually done on a dedicated hardware circuit, so it shouldn't slow down pixel shader processing at all. The only problem with hardware decoding atm is that DXVA native decoding has all sorts of technical limitations, and D3D11 native decoding is still not in great shape in madVR. But hopefully D3D11 native decoding will improve in a future madVR build.
Old 7th April 2018, 18:47   #50131  |  Link
Warner306
Registered User
 
Join Date: Dec 2014
Posts: 1,127
Quote:
Originally Posted by madshi View Post
FSE has always been a requirement to allow madVR to pass 10bit to the GPU driver (and further to the display), in all OSs. Doesn't matter if it's SDR or HDR. There's only one exception to this rule: When you activate the OS HDR switch, 10bit suddenly is also possible in windowed mode - but only if fullscreen.

The reason for all this is that without FSE everything runs through DWM (desktop window manager), and DWM is limited to 8bit. However, if you turn the OS HDR switch on, suddenly DWM runs in 10bit. Whether or not you switch the GPU control panel to 12bit doesn't make any difference here. The limitation to 8bit is in DWM, switching GPU control panel options doesn't help with that.

That said, dithered 8bit should work very well, even for HDR.
Now I am further confused. In one update, you said 10-bit output was now possible with windowed mode in Windows 10. I am only outputting in 8-bits, so I have never tested this.

I was talking to one user who has HDR passthrough set like this with an AMD card:

madVR (10-bits, HDR passthrough) -> GPU (10-bits) -> projector

His projector reports it is receiving 10-bits, RGB, HDR. If he changes the GPU to 12-bits, it reports 12-bits, RGB, HDR. So what is going wrong with this signal chain? He doesn't appear to be having any issues with banding, either.

Also, is it now possible to output 8-bit HDR passthrough with AMD cards, or are they still forced to use a complete 10-bit pipeline to get the HDR signal to the display?

I find I am always providing advice at two other forums, so it would be good to get this clear. Problems with FSE make windowed mode necessary for some users.
Old 7th April 2018, 19:09   #50132  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Warner306 View Post
Now I am further confused. In one update, you said 10-bit output was now possible with windowed mode in Windows 10.
Oh wait, my bad. I got confused myself for a moment. I guess it's been too long since I last worked on madVR.

So yes, 10bit output is supposed to be working with Windows 10 in windowed mode, regardless of whether the OS HDR switch is on or off. However, it only works if madVR is in fullscreen mode, because then the OS switches the GPU driver into "direct scanout" mode, which bypasses DWM. Thanks to that, the DWM 8bit limitation is no longer an issue. This "direct scanout" mode is currently only supported by Nvidia and AMD GPU drivers, though, but not by Intel GPUs, AFAIK.
Old 7th April 2018, 20:35   #50133  |  Link
sauma144
Registered User
 
Join Date: Sep 2016
Posts: 89
It seems there are some bugs in madVR 0.92.11 and 0.92.12.
Ranpha downgraded madVR in his latest LAV Filters Megamix setup.
https://www.videohelp.com/software/L...comments#13925
Old 7th April 2018, 20:53   #50134  |  Link
Siso
Soul Seeker
 
Join Date: Sep 2013
Posts: 714
Quote:
Originally Posted by sauma144 View Post
It seems there are some bugs in madVR 0.92.11 and 0.92.12.
Ranpha downgraded madVR in his latest LAV Filters Megamix setup.
https://www.videohelp.com/software/L...comments#13925
What sort of bugs?
Old 7th April 2018, 21:00   #50135  |  Link
sauma144
Registered User
 
Join Date: Sep 2016
Posts: 89
I don't know, I want to know if madshi is aware.
Old 7th April 2018, 21:12   #50136  |  Link
Warner306
Registered User
 
Join Date: Dec 2014
Posts: 1,127
Quote:
Originally Posted by madshi View Post
So yes, 10bit output is supposed to be working with Windows 10 in windowed mode, regardless of whether the OS HDR switch is on or off. However, it only works if madVR is in fullscreen mode, because then the OS switches the GPU driver into "direct scanout" mode, which bypasses DWM. Thanks to that, the DWM 8bit limitation is no longer an issue. This "direct scanout" mode is currently only supported by Nvidia and AMD GPU drivers, though, but not by Intel GPUs, AFAIK.
You might want to look into the banding problems with Nvidia cards when set to HDR passthrough with 12-bits at the GPU. This could be the display, driver or madVR.

The only AMD user I've spoken to says 10-bit HDR passthrough seems to be fine without any banding, but I think he needs to run some tests to be certain.

Old 7th April 2018, 21:57   #50137  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,407
Quote:
Originally Posted by madshi View Post
So 2.4x, regardless of whether we're talking about "direct quad" or "double twice"?
Yes. I do not believe it is enough better to quadruple and then downscale to e.g. 2.3x, compared to doubling and then upscaling, to justify the performance hit. The small upscale looks very good after all, and the upscaling algorithm selector scales very well with GPU power (e.g. the luma quality selected).

I do see the rationale of being more willing to use direct 4x, compared to double twice. However, the performance difference is still significant enough that "if any upscaling required" is not a great default. On my GPU (a Titan XP) the time difference between direct 4x and 2x is similar to that between double twice and direct 4x. It is easy to tune a profile that works for doubling 1080p and quadrupling 720p, but then doesn't work for quadrupling slightly cropped 1080p. SSIM 1D100 downscaling exacerbates this issue.

Maybe set double again to 3.0x for "let madVR decide" to maintain the delta? Not that consistency is necessarily a bad thing.
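
As a sketch, the decision I have in mind looks roughly like this (illustrative only - the 2.4x value is my proposal from above, the doubling threshold below is just a placeholder, and this is not madVR's actual code):

Code:
def let_madvr_decide(scale_factor, quad_threshold=2.4, double_threshold=1.2):
    # "direct quad" vs "double twice" is a separate choice once quadrupling is picked.
    if scale_factor >= quad_threshold:
        return "quadruple (4x), then downscale the overshoot to the target size"
    if scale_factor >= double_threshold:
        return "double (2x), then upscale the small remainder"
    return "plain upscaling, no image doubling"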

Quote:
Originally Posted by zaemon View Post
Do you guys think it is worth investing in a CPU that can (software) decode HEVC 4K in order to leave some room for madVR GPU processing?
No, I have an i9-7900K @ 4.7 GHz and I use software decoding for everything except 4K HEVC. It is easy for me to keep the decoding queues full watching UHD Blu-rays, using either 32-bit or 64-bit LAV 0.71.0. The 140 Mbps jellyfish sample takes 32-50% CPU with 32-bit and 25-33% CPU with 64-bit. With 64-bit even the 400 Mbps jellyfish sample only uses 40-60% total CPU.

However, hardware decoding adds almost nothing to the GPU's workload and the power/heat difference is significant. Also, both startup and seeking are more responsive when using hardware decoding, even without "delay playback start until render queue is full." On modern GPUs, like Nvidia's 10 series, 10-bit HEVC has pure hardware decoding, dedicated silicon for everything required, so the shaders madVR uses are idle. The only resources shared are the PCI bus and some GPU memory bandwidth, both of which are usually not the bottleneck for madVR. Checking just now, it seems my rendering times are identical between the two.

Quote:
Originally Posted by Warner306 View Post
You might want to look into the banding problems with Nvidia cards when set to HDR passthrough with 12-bits at the GPU. This could be the display, driver or madVR.

The only AMD user I've spoken to says 10-bit HDR passthrough seems to be fine without any banding, but I think he needs to run some tests to be certain.
I noticed this too but I assumed it was a problem with my TV when given content dithered to >8-bit. I do not see it even when sending 12-bit RGB when madVR is set to 8 bit. I also need to do more testing, the TV isn't off the hook yet.

I should also mention that the OSD reports it is using my SDR BT.2020 3DLUT when using "passthrough HDR content to the display" but it is not actually doing so (changing the 3DLUT does not change the output).
__________________
madVR options explained

Old 8th April 2018, 09:56   #50138  |  Link
stefanelli73
Registered User
 
Join Date: Jul 2016
Posts: 52
I noticed it yesterday too, watching the film ALLIED: in the opening desert scene, looking at the blue sky, I see banding even though I'm outputting at 12 bit... I use an AMD RX480. Something else: whenever subtitles appear, the image slows down and the two rendering times rise in value?
Old 8th April 2018, 10:03   #50139  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by sauma144 View Post
It seems there are some bugs in madVR 0.92.11 and 0.92.12.
Ranpha downgraded madVR in his latest LAV Filters Megamix setup.
I don't know which bugs he's referring to.

Quote:
Originally Posted by Warner306 View Post
You might want to look into the banding problems with Nvidia cards when set to HDR passthrough with 12-bits at the GPU. This could be the display, driver or madVR.
I've just tested this. With the Nvidia GPU driver set to 12bit, if I disable FSE mode and then switch MPC-HC to fullscreen mode, with the "smallramp.ytp" test pattern stretched to fill the whole screen, madVR OSD displays "D3D11 fullscreen windowed (10 bit)", and there are no banding problems. Then, if I move the mouse down to show the MPC-HC seekbar, I can see banding for a second, until madVR detects that playback is not fullscreen anymore. In that moment madVR switches back to "D3D11 fullscreen windowed (8 bit)" and the banding goes away. If I move the mouse back up, the seekbar disappears, madVR one second later switches back to "D3D11 fullscreen windowed (10 bit)", and there's no banding.

So as far as I can tell, it works perfectly on my PC. This is with 390.65 drivers. Maybe there's a bug in newer drivers? I don't know.

P.S: I've only tested this with my SDR test pattern. Maybe things are different in HDR passthrough mode? I don't really know how to test it there, though, because banding problems are easy to see with my test pattern, but harder to see with true HDR content.
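
(If someone without my test patterns wants to try a similar check, a quick stand-in can be generated like this - the range and filename are arbitrary, and it's a plain 16-bit PNG rather than a .ytp pattern:)

Code:
import numpy as np
from PIL import Image

# A 1920x1080 horizontal gray ramp covering only a narrow brightness range.
# Shallow ramps like this make 8-bit quantization steps (banding) easy to spot.
w, h = 1920, 1080
ramp = np.linspace(0.40, 0.60, w)                    # 20% of full range across the screen
row  = np.round(ramp * 65535).astype(np.uint16)      # keep 16-bit precision in the file
Image.fromarray(np.tile(row, (h, 1))).save("gray_ramp_16bit.png")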

Quote:
Originally Posted by Asmodian View Post
Yes. I do not believe it is enough better to quadruple and then downscale to e.g. 2.3x, compared to doubling and then upscaling, to justify the performance hit. The small upscale looks very good after all, and the upscaling algorithm selector scales very well with GPU power (e.g. the luma quality selected).

I do see the rationale of being more willing to use direct 4x, compared to double twice. However, the performance difference is still significant enough that "if any upscaling required" is not a great default. On my GPU (a Titan XP) the time difference between direct 4x and 2x is similar to that between double twice and direct 4x. It is easy to tune a profile that works for doubling 1080p and quadrupling 720p, but then doesn't work for quadrupling slightly cropped 1080p. SSIM 1D100 downscaling exacerbates this issue.

Maybe set double again to 3.0x for "let madVR decide" to maintain the delta? Not that consistency is necessarily a bad thing.
Ok.

Quote:
Originally Posted by Asmodian View Post
I should also mention that the OSD reports it is using my SDR BT.2020 3DLUT when using "passthrough HDR content to the display" but it is not actually doing so (changing the 3DLUT does not change the output).
Looks like a bug, thanks.

Quote:
Originally Posted by stefanelli73 View Post
I noticed it yesterday too, watching the film ALLIED: in the opening desert scene, looking at the blue sky, I see banding even though I'm outputting at 12 bit... I use an AMD RX480.
Could be anything, could be hard coded into the movie. Try setting madVR to 8bit, does the banding go away?
Old 8th April 2018, 10:20   #50140  |  Link
stefanelli73
Registered User
 
Join Date: Jul 2016
Posts: 52
Actually I've only tried 12 bit; today I will try 10 bit and 8 bit. madshi, do you have a solution to my subtitle problem? In practice, when forced subtitles come up in a scene, or when I use subtitles normally, the image slows down and the two rendering times increase from 30 ms up to 110 ms; when the subtitles disappear it all returns to normal. My player is JRiver.
