#53301 | 17th October 2018, 02:02 | huhn
HDR conversion is relatively free in dumb mode.

the MeasureLuminance/StretchRect shaders are significantly faster with D3D11 native. is there a reason for that? i'm talking about a factor of 2-3x faster.
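
(For context: "copyback" means each decoded frame is copied out of GPU memory into system RAM and then uploaded to the renderer again, while "native" keeps it in VRAM the whole time, so the extra PCIe round trip is one plausible explanation for the gap. A minimal D3D11 sketch of what the readback step looks like in general - illustrative only, not madVR's or LAV's actual code:)

Code:
#include <d3d11.h>

// Copy a decoded frame (e.g. an NV12 decoder surface) into a CPU-readable
// staging texture and map it. This is the "copyback" step in general terms;
// every frame crosses the bus once here and once more on re-upload.
HRESULT ReadbackFrame(ID3D11Device* dev, ID3D11DeviceContext* ctx,
                      ID3D11Texture2D* decoded)
{
    D3D11_TEXTURE2D_DESC desc;
    decoded->GetDesc(&desc);
    desc.Usage          = D3D11_USAGE_STAGING;   // CPU-accessible copy
    desc.BindFlags      = 0;
    desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
    desc.MiscFlags      = 0;

    ID3D11Texture2D* staging = nullptr;
    HRESULT hr = dev->CreateTexture2D(&desc, nullptr, &staging);
    if (FAILED(hr)) return hr;

    ctx->CopyResource(staging, decoded);         // GPU -> staging copy
    D3D11_MAPPED_SUBRESOURCE mapped;
    hr = ctx->Map(staging, 0, D3D11_MAP_READ, 0, &mapped); // stalls until the copy is done
    if (SUCCEEDED(hr)) {
        // ... memcpy mapped.pData row by row into system memory ...
        ctx->Unmap(staging, 0);
    }
    staging->Release();
    return hr;
}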
#53302 | 17th October 2018, 08:08 | madshi (Registered Developer)
Thanks guys!

Quote:
Originally Posted by Manni:
Isn't black bar detection essential for many? You lose that with native.
Yes, and yes.

Quote:
Originally Posted by Warner306:
At 1080p, a GTX 1050 Ti can't do tone mapping for 60fps content with highlight recovery enabled. Highlight recovery is the killer, as it pushes rendering times way over 16ms. This is with scale chroma separately enabled.
Is that with copyback? Does using D3D11 native decoding change anything?

Quote:
Originally Posted by Asmodian:
Yes!
This is without highlight recovery, though, I guess? What happens if you enable that, too?

And while we're at it, which NGU Upscaling quality (Luma only, chroma set to "normal") can the 1050 Ti do for 1080 -> 4K? Medium or High?
#53303 | 17th October 2018, 08:25 | huhn
a 960 is very similar to a 1050 Ti and it can do 1080p23 with NGU High easily. it uses about ~20ms for NGU High alone.
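
(For scale: at 23.976fps the frame budget is 1000 / 23.976 ≈ 41.7ms per frame, so ~20ms for NGU High alone still leaves roughly half the budget for everything else; 60fps content only gets 1000 / 60 ≈ 16.7ms, which is why it is so much harder to drive.)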

and be aware that the 1050 3GB is on paper faster than the 1050 Ti and cheaper at the same time. it's just a matter of whether you can work with the 3GB it has.

not a lot of versions to choose from, but the price is great: https://geizhals.de/?cat=gra16_512&x...+(3GB%2F96bit)
#53304 | 17th October 2018, 08:47 | mytbyte
Quote:
Originally Posted by Warner306:

Were you trying to find a target nits value that perfectly tracked BT.2390 all the way to 100% output? The only commonality I see is that the brightness does seem to get to white too slowly or too quickly at the top. Maybe that is something that is lost in translation to SDR? And maybe it is close enough?
No, I want to work with measured nits, but I am wondering why madVR's generated BT.2390 curve (as measured) drifts gradually (and significantly) away from HCFR's reference BT.2390 curve, then returns to being spot-on at 70%, but then gets clipped at 80-90% stimulus, when we know the formula is defined in the papers and should be the same in HCFR and madVR. Is the 80-90% clipping part of the formula, to maintain some HDR effect with low peak brightness?

If translation to SDR were the cause, I would expect the PQ curve tracking to drift as well, but it doesn't.
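
(For reference, the BT.2390 EETF that both HCFR and madVR are presumably implementing is a simple Hermite-spline roll-off in normalized PQ space. A sketch of the published formula - the spec's math, not madVR's actual source, which may add its own tweaks:)

Code:
// BT.2390 EETF roll-off. Works on PQ-encoded signal values normalized so
// that 0.0 = source black and 1.0 = source peak white; maxLum is the
// display peak, PQ-encoded and normalized the same way.
double Bt2390Eetf(double e1, double maxLum)
{
    const double ks = 1.5 * maxLum - 0.5;   // knee start ("shoulder")
    if (e1 < ks) return e1;                 // below the knee: track exactly

    // Hermite spline compressing [ks, 1.0] into [ks, maxLum].
    const double t  = (e1 - ks) / (1.0 - ks);
    const double t2 = t * t, t3 = t2 * t;
    return (2.0 * t3 - 3.0 * t2 + 1.0) * ks
         + (t3 - 2.0 * t2 + t) * (1.0 - ks)
         + (-2.0 * t3 + 3.0 * t2) * maxLum;
}

Everything below the knee passes through unchanged and everything above it is compressed toward the display peak, so deviations that concentrate near the top of the stimulus range are expected with a low peak-brightness display; whether the 80-90% clipping you measured is this spline or a madVR-specific choice is something only madshi can confirm.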

Quote:
Originally Posted by Warner306:
That part doesn't make sense. Only the 8-bit pattern should display correctly, but you shouldn't fail the black clipping test if you are clipping correctly to 175 nits. The gradient should go from right to left until bar 16.
hmm... there is no bar 16; there is a clip with bars 64-80 and there is a clip with a C64-C111 gradient (0.0-0.07 nits). In the former I can, with great effort and in a pitch-black room, notice that bars 76-80 are flashing, almost unnoticeably. In the gradient clip I can see the gradient becoming VERY slightly lighter, from left to right, than the bottom black part, so I am guessing there is no crush, but also that near-black gradation is only barely noticeable - I don't know if that's a problem and it should be more noticeable.
#53305 | 17th October 2018, 09:32 | RainyDog
Quote:
Originally Posted by ryrynz:
He's saying clock speed is dynamic, and ideally you need to log GPU speed and load for a good minute or so and calculate the average from that. Simpler tasks can show higher render times since the GPU can clock a lot lower.

Madshi, any chance we could get an average rendering statistic in the OSD? Ideally being able to see clock speeds on the OSD would be useful too.

D3D11 is quite a bit more efficient on my 1060.
Or set your GPU to its maximum power state; then the render times will be comparable, on a level playing field.

For some reason I can't get D3D11 native to work in 10-bit exclusive mode on my 1060. It just freezes when I try going fullscreen. DXVA2 native is fine, though.
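
(A minimal sketch of the logging ryrynz describes in the quote above, using NVIDIA's public NVML API - assumes an NVIDIA card at index 0 and linking against nvml.lib:)

Code:
#include <nvml.h>
#include <cstdio>
#include <chrono>
#include <thread>

int main()
{
    nvmlInit();
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);        // GPU 0 assumed

    unsigned long long clockSum = 0, loadSum = 0;
    const int samples = 60;                     // one sample per second
    for (int i = 0; i < samples; ++i) {
        unsigned int clockMHz = 0;
        nvmlUtilization_t util = {};
        nvmlDeviceGetClockInfo(dev, NVML_CLOCK_GRAPHICS, &clockMHz);
        nvmlDeviceGetUtilizationRates(dev, &util);
        clockSum += clockMHz;
        loadSum  += util.gpu;
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
    printf("avg clock: %llu MHz, avg load: %llu%%\n",
           clockSum / samples, loadSum / samples);
    nvmlShutdown();
    return 0;
}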
#53306 | 17th October 2018, 09:55 | ryrynz
The usual: uninstall/reinstall the graphics drivers and reset madVR. Does it work with 8-bit?

For anyone downscaling 4K HDR to SDR HD, make sure you have 'scale chroma separately' checked.
Also, 'Are you nuts!?' highlight recovery for HDR does wonders for clouds and other highlights and looks nicer than 'very high'; you really want this option enabled at either of those settings, I think.

SSIM 2D downscaling for 60fps 4K content is just out of reach for a 1060 6GB, and my 1080 can't quite do full madVR 4K HDR -> 1080p SDR (SSIM 2D downscaling, NGU AA chroma, ED2 dithering, highlight recovery, etc.); I have to enable scale chroma separately.

#53307 | 17th October 2018, 10:20 | chros
Quote:
Originally Posted by huhn:
and be aware that the 1050 3GB is on paper faster than the 1050 Ti and cheaper at the same time. it's just a matter of whether you can work with the 3GB it has.
I know it was discussed a couple of times, but can you remind me what it can't do with 3GB? I'm interested in <=30fps content only.
Does anyone use a 3GB card here?
Maybe I can upgrade my laptop to another one with the 1060 3GB version...
#53308 | 17th October 2018, 10:30 | huhn
i don't have a 3GB card so i can't say.

but i can use more than 3GB of VRAM with madVR by using NGU on chroma and luma while leaving the rest of madVR at normal settings.

because 2GB doesn't easily cope with UHD plus subtitles, and since we are on PC, you'd just recommend 4GB because that's more common.
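
(If someone wants to measure rather than guess: NVML can report device-wide VRAM usage while madVR is playing, to see whether a given settings profile would fit in 3GB. Minimal sketch, same assumption of an NVIDIA card at index 0:)

Code:
#include <nvml.h>
#include <cstdio>

int main()
{
    nvmlInit();
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);   // GPU 0 assumed

    nvmlMemory_t mem = {};
    nvmlDeviceGetMemoryInfo(dev, &mem);    // device-wide usage, in bytes
    printf("VRAM used: %llu MB / %llu MB total\n",
           mem.used >> 20, mem.total >> 20);

    nvmlShutdown();
    return 0;
}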
#53309 | 17th October 2018, 10:36 | ryrynz
You won't have any problems with 1080p or below on a 2GB card.
#53310 | 17th October 2018, 10:39 | mytbyte
Quote:
Originally Posted by ryrynz:

For anyone downscaling 4K HDR to SDR HD, make sure you have 'scale chroma separately' checked.
Also, 'Are you nuts!?' highlight recovery for HDR does wonders for clouds and other highlights and looks nicer than 'very high'; you really want this option enabled at either of those settings, I think.
I protest - at least there is a standard now, and we still seek to tweak it.
#53311 | 17th October 2018, 10:42 | ryrynz
Enabling it absolutely destroys performance that's better used in other places; I do prefer to not have it ticked, though. I'd rather not trade quality where possible -
it's definitely the last thing to untick from that section.
#53312 | 17th October 2018, 10:57 | Manni
Quote:
Originally Posted by huhn:
that's more than odd. there is nothing that should be able to push your CPU to full load when hardware decoding is used.
Thanks, you were absolutely right, something was wrong when I took my power readings: I didn't notice that TeamViewer was running in the background. That's why the CPU load was abnormally high. Because it didn't impact my rendering times, I didn't think of it.

The GPU clock was maxed though, and the task manager performance monitor was correct, by the way (checked with GPU-Z and CPU-Z). It's usually fairly reliable here.

I did more tests without TeamViewer running in the background (!), using Pacific Rim as a worst-case scenario (as it's a 16:9 UHD HDR movie), and the readings are normal (I think). I also used a Kill A Watt meter to measure the actual power use for each mode (rough average). Idle use is 93W.

DXVA2 NT: 18ms CPU 12% GPU 50% 230W
DXVA2 CB: 21ms CPU 30% GPU 75% 320W

D3D11 NT: 18ms CPU 15% GPU 55% 270W
D3D11 CB: 21ms CPU 30% GPU 75% 320W

I wish it were possible to use DXVA2 native, as it's clearly the most power-efficient mode, but if I remember correctly there are banding issues with it. Both CB modes seem to use roughly the same amount of power, so unless there is a very strong reason to use DXVA2 CB, it looks like I'm going to use D3D11 native with manual picture shift, as I'll lose black bar detection. I can't really justify 50W just for that convenience. I'll just have to select the lens memory on my iPad for each film; that's not too bad.

I have everything enabled with pixel shader processing: peak brightness measurements, restore highlights, no compromise, NGU High chroma upscaling, plus a 3D LUT and some enhancements, so it's a maxed-out scenario. Still, Asmodian's times with a 1050 Ti seem significantly lower, so I might try disabling some enhancements later to see if I get similar results.

Thanks again for pointing this out; it doesn't make any difference in real use, but my results were wrong, as you pointed out.

Quote:
Originally Posted by ryrynz:
He's saying clock speed is dynamic, and ideally you need to log GPU speed and load for a good minute or so and calculate the average from that. Simpler tasks can show higher render times since the GPU can clock a lot lower.

Madshi, any chance we could get an average rendering statistic in the OSD? Ideally being able to see clock speeds on the OSD would be useful too.

D3D11 is quite a bit more efficient on my 1060.
Thanks for the translation/explanation. I'm aware that clocks are dynamic, and I had checked that; it wasn't the explanation.

I agree that an average rendering stat in the OSD would be great.

Not sure why your D3D11 mode is more efficient; there is little to no difference here with CB, and DXVA2 is significantly more efficient in native.
#53313 | 17th October 2018, 11:07 | huhn
black bar detection still runs even on a movie that doesn't need it, so your comparison is not fair.

and don't you think it's odd that 70% GPU load gives 21ms while 50-55% gives 18ms?

with DXVA native even measure peak brightness is disabled, which costs a couple of ms when it is used.

because i can't look into your system, you have to make sure all modes use the same settings and that no setting is used that the other mode can't do.
#53314 | 17th October 2018, 11:23 | Manni
Quote:
Originally Posted by huhn:
black bar detection still runs even on a movie that doesn't need it, so your comparison is not fair.

and don't you think it's odd that 70% GPU load gives 21ms while 50-55% gives 18ms?

with DXVA native even measure peak brightness is disabled, which costs a couple of ms when it is used.

because i can't look into your system, you have to make sure all modes use the same settings and that no setting is used that the other mode can't do.
I wasn't trying to do a fair comparison, only to show the results I get with my current settings if I simply change the mode.

I didn't use a 16:9 movie because I thought black bar detection wouldn't be active; I used it because it has more pixels to handle than a 2.40:1 title, hence a worst-case scenario for render times (as I said).

I'm only posting results for the settings I was currently using. I'll adapt settings depending on whether I decide to use native or copyback.

Measure peak brightness isn't disabled when running DXVA2 native; I'm not sure where that comes from.

You are correct, though, that if I were to disable the features used in CB that cannot be used in native, the render times in CB would go down. I guess I'll have to test that.
#53315 | 17th October 2018, 11:40 | huhn
you can not call one mode more efficient if you didn't test them fairly.

and did madshi change this in a new test build?
https://abload.de/img/itcomesfromherec1cmv.png
#53316 | 17th October 2018, 12:37 | Manni
Quote:
Originally Posted by huhn:
you can not call one mode more efficient if you didn't test them fairly.

and did madshi change this in a new test build?
https://abload.de/img/itcomesfromherec1cmv.png
Fair enough. The only thing I can say is that whatever CB is doing with my settings, I can't justify the power use it causes, as I only need it for black bar detection. So it's more efficient for me, unless/until I see something I lose in picture quality or functionality apart from black bar detection.

I'm using ordered dithering, and although disabling black bar detection shaves 2ms and drops the CPU use from 25% to 15% (MPC-BE only), that doesn't explain the whole difference.

My picture enhancements cost me around 2ms.

Re the DXVA limitation, I don't know if it's effective or not; I can only report that the measured brightness is displayed with DXVA native as well as with D3D11 (I checked). Maybe madshi can explain whether the text in the settings still applies and, if it does, how come the OSD shows the measurements with DXVA native.
#53317 | 17th October 2018, 12:48 | huhn
it doesn't run with version 17. easily checked with the advanced OSD.
#53318 | 17th October 2018, 12:57 | mytbyte
The plot thickens! How come the 100% stimulus HDR pattern is reported as 10000 nits measured/average by madVR when the content metadata says the maximum content light level is 1000 nits?
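
(Presumably because a 100% PQ stimulus decodes to 10000 nits by definition of the ST.2084 EOTF; MaxCLL metadata describes the content, not the encoding. A sketch of the published EOTF, with its standard constants:)

Code:
#include <algorithm>
#include <cmath>

// ST.2084 (PQ) EOTF: maps a [0,1] signal value to absolute luminance.
// A signal of 1.0 decodes to exactly 10000 nits, whatever MaxCLL says.
double PqToNits(double signal)
{
    const double m1 = 2610.0 / 16384.0;         // 0.1593017578125
    const double m2 = 2523.0 / 4096.0 * 128.0;  // 78.84375
    const double c1 = 3424.0 / 4096.0;          // 0.8359375
    const double c2 = 2413.0 / 4096.0 * 32.0;   // 18.8515625
    const double c3 = 2392.0 / 4096.0 * 32.0;   // 18.6875

    const double p = std::pow(signal, 1.0 / m2);
    const double y = std::pow(std::max(p - c1, 0.0) / (c2 - c3 * p), 1.0 / m1);
    return 10000.0 * y;                         // PqToNits(1.0) == 10000.0
}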
#53319 | 17th October 2018, 12:57 | Manni
Quote:
Originally Posted by huhn:
it doesn't run with version 17. easily checked with the advanced OSD.
I have no idea what you are talking about. As I said, measure peak brightness works just as well with DXVA2 as it does with D3D11, at least according to the OSD. Both the measured brightness and the histograms show up. I'm using v17 with the latest test build.

https://imgur.com/a/ASXPoRq
#53320 | 17th October 2018, 13:01 | huhn
create an empty folder called ShowRenderSteps in the madVR folder.

but judging from the limited information in your OSD screenshot, it is clearly working with that version.