Old 20th September 2019, 03:07   #57441  |  Link
Megalith
Registered User
 
Join Date: Mar 2011
Posts: 110
How does madVR's 1080p > 4K upscaling compare to a legitimate 4K release? I'm starting to suspect I wasted my time with these enormous files...
Old 20th September 2019, 03:58   #57442  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 4,322
Upscaled 1080p -> 2160p does not have HDR. So native UHD is, by nature, almost always going to look better than even a well-upscaled 1080p. Having said that, depending on what settings you use, e.g. NGU Sharp High or Very High for luma upscaling, the upscale looks absolutely fantastic. Still, by and large, I prefer the native UHD when it's available.
__________________
HTPC: Windows 10, I9 9900k, RTX 2070 Founder's Edition, Pioneer Elite VSX-LX303, LG C8 65" OLED
Old 20th September 2019, 04:10   #57443  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,138
Quote:
Originally Posted by SamuriHL View Post
Still, by and large, I prefer the native UHD when it's available.
Like Oblivion?
Old 20th September 2019, 04:34   #57444  |  Link
SamuriHL
Registered User
 
Join Date: May 2004
Posts: 4,322
Haven't seen that one.
Old 20th September 2019, 11:13   #57445  |  Link
mclingo
Registered User
 
Join Date: Aug 2016
Posts: 872
Hi guys, just wondering, at what point is a PCI Express 2.0 slot going to be a bottleneck for madVR with a new PCI Express 4.0 card like a 5700 XT, if at all?
__________________
OLED 4k HDR EF950-YAM RX-V685-WIN10 444 RGB 60hz-AMD RX 5700 19.11.3 KODI DS - MAD/LAV 92.17/0.74.1 - 3D MVC / FSE:ON / MADVR 10bit
Old 20th September 2019, 16:15   #57446  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 5,964
it's currently not working properly.

PCIe 3.0 can barely do UHD60 and the PCIe 4.0 interface is only double the speed, so it shouldn't be able to do FUHD60.

so PCIe 2.0 is already a bottleneck. that's where native hardware decoding comes into the mix, and that doesn't care: even PCIe 1.0 is total overkill for that.
Old 20th September 2019, 16:33   #57447  |  Link
mclingo
Registered User
 
Join Date: Aug 2016
Posts: 872
so PCI Express bandwidth is currently irrelevant, it makes no difference whether you put a 4.0 PCI Express card in a 2.0, 3.0 or 4.0 slot, for madVR?
Old 20th September 2019, 16:39   #57448  |  Link
Klaus1189
Registered User
 
Join Date: Feb 2015
Location: Bavaria
Posts: 763
Quote:
Originally Posted by huhn View Post
it's currently not working properly.

PCIe 3.0 can barely do UHD60 and the PCIe 4.0 interface is only double the speed, so it shouldn't be able to do FUHD60.

so PCIe 2.0 is already a bottleneck. that's where native hardware decoding comes into the mix, and that doesn't care: even PCIe 1.0 is total overkill for that.
I don't understand why PCIe 3.0 or PCIe 2.0 would be a bottleneck for that purpose.
HDMI 2.1 is capable of 48 GBit/s, which is about 6 GByte/s -> sufficient for 8K 60fps, but only with DSC video compression
PCIe 2.0 is about 8 GByte/s for 16 lanes
PCIe 3.0 is about 15.x GByte/s for 16 lanes -> sufficient for uncompressed 8K 60fps 10 bit 4:4:4 (11.2 GByte/s needed)
PCIe 4.0 is about 31.x GByte/s for 16 lanes -> sufficient for uncompressed 8K 60fps 10 bit 4:4:4 (11.2 GByte/s needed)

But only if you have a mainboard which has the full 16 lanes on the PCIe slot.
Some mainboards use a full-size slot but only half of the pins are physically there. This can be an issue.
Here is a calculator for uncompressed video stream sizes per second: https://www.extron.de/product/videotools.aspx

Note:
Mind the difference between bit and byte
Mind the number of PCIe lanes: 1 / 4 / 8 / 16
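
The arithmetic behind these figures can be sketched in a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, the PCIe numbers are theoretical per-direction link rates after line-code overhead (real-world throughput is lower), and the video numbers count active pixels only, with no blanking or padding.

```python
# Theoretical PCIe link parameters: (transfer rate in GT/s per lane, line-code efficiency).
PCIE_GENERATIONS = {
    "1.0": (2.5, 8 / 10),      # 8b/10b encoding
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
    "4.0": (16.0, 128 / 130),  # 128b/130b encoding
}


def pcie_gbytes_per_s(generation, lanes=16):
    """Theoretical one-direction link rate in GByte/s (decimal)."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    return gt_per_s * efficiency * lanes / 8  # bits -> bytes


def video_gbytes_per_s(width, height, fps, bits_per_sample, samples_per_pixel=3.0):
    """Uncompressed video rate in GByte/s; 4:4:4 has 3 samples per pixel."""
    bytes_per_frame = width * height * samples_per_pixel * bits_per_sample / 8
    return bytes_per_frame * fps / 1e9


if __name__ == "__main__":
    for gen in ("2.0", "3.0", "4.0"):
        print(f"PCIe {gen} x16: {pcie_gbytes_per_s(gen):.2f} GByte/s")
    # 8K 60fps 4:4:4, once with tightly packed 10-bit samples and once with
    # each 10-bit sample stored in a 16-bit word.
    print(f"8K60 packed 10 bit:   {video_gbytes_per_s(7680, 4320, 60, 10):.2f} GByte/s")
    print(f"8K60 10 bit in 16 bit: {video_gbytes_per_s(7680, 4320, 60, 16):.2f} GByte/s")
```

Under these assumptions 8K 60fps 4:4:4 comes out at about 7.5 GByte/s with tightly packed 10-bit samples and about 11.9 GByte/s when each sample occupies 16 bits, so a 16-lane PCIe 3.0 link covers either case on paper. Whether it does so in practice is a separate question.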
Old 20th September 2019, 17:26   #57449  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 5,964
Quote:
Originally Posted by mclingo View Post
so PCI Express bandwidth is currently irrelevant, it makes no difference whether you put a 4.0 PCI Express card in a 2.0, 3.0 or 4.0 slot, for madVR?
as long as native hardware decoding is used.
subtitles use a lot too.

Quote:
I don't understand why PCIe 3.0 or PCIe 2.0 would be a bottleneck for that purpose.
because they are in real life tests. back in the day madshi created a tool to measure "copyback" speed on the GPU. we got ~120 FPS with PCIe 3.0 with 1080p. the infamous AMD nnedi3 copyback bug (and they were still massively faster).
there is still the issue of how the video is encoded/stored:
https://docs.microsoft.com/en-us/win...-video-formats

Quote:
The 16-bit representations described here use little-endian WORD values for each channel. The 10-bit formats also use 16 bits for each channel, with the lowest 6 bits set to zero, as shown in the following diagram.
and well, subtitles: i have no clue if they are compressed or not.
UHD 60 is currently as far as you can go with copyback. maybe it is the latency, we don't know. going over UHD 60 causes huge problems.
Quote:
But only if you have a mainboard which has the full 16 lanes on the PCIe slot.
Some mainboards use a full-size slot but only half of the pins are physically there. This can be an issue.
the 16x bus has been directly connected to the CPU for a long time now.
the PCIe 16x slots with mechanical 8x and usually electrical 4x are usually connected to the chipset. for many years now you would have to look really hard to find a proper new board that doesn't have a direct CPU 16x slot.
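
The 16-bit storage described in the quoted Microsoft text can be illustrated with a short Python sketch. The helper names are hypothetical; only the layout itself (a 10-bit value in the upper bits of a little-endian 16-bit word, lowest 6 bits zero) comes from the quote.

```python
import struct


def pack_sample(sample_10bit):
    """Place a 10-bit sample in the upper bits of a 16-bit word (low 6 bits zero)."""
    if not 0 <= sample_10bit <= 0x3FF:
        raise ValueError("sample must fit in 10 bits")
    return sample_10bit << 6


def unpack_sample(word_16bit):
    """Recover the 10-bit sample by dropping the 6 zero padding bits."""
    return word_16bit >> 6


def to_bytes(samples_10bit):
    """Serialize samples as little-endian 16-bit words, as they sit in memory."""
    words = [pack_sample(s) for s in samples_10bit]
    return struct.pack("<%dH" % len(words), *words)


# Every 10 bits of picture data occupy 16 bits in memory and on the bus.
STORAGE_OVERHEAD = 16 / 10

if __name__ == "__main__":
    print(to_bytes([0, 1, 1023]).hex())  # prints 00004000c0ff
    print(STORAGE_OVERHEAD)              # prints 1.6
```

So if frames really travel in this layout, a "10 bit" stream costs 1.6 times the bandwidth that a tightly packed 10-bit estimate would suggest.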
Old 20th September 2019, 17:43   #57450  |  Link
el Filou
Registered User
 
Join Date: Oct 2016
Posts: 546
I'm convinced the real bottleneck is in the system platform (RAM) and/or the software, not the bus interface itself which is way faster than any uncompressed video you could send over it. Of course it's no help if you're stuck with PCIe 2.0 (and old system platform), but I'd be very interested in a benchmark of copyback with a 5700 on X470 and X570, to see if the bottleneck is the bus or the system RAM. Has anyone done that?

I hope madshi will implement black bar detection and film mode on the GPU now that Envy is coming out, that would help it immensely not having to do copyback on there (there would still be the problem of deinterlacing with D3D11 native on the PC version, and software support like that JRiver UHD menu thing).

Edit: that talk about future-proofing and "going over UHD60" presumes that 8K is going to become common anytime soon. I don't see that happening in the near future, not on the PC anyway. I'd rather watch a UHD Blu-ray than a bit-starved 8K online stream.
__________________
HTPC: Windows 10 1809, MediaPortal 1, LAV Filters, ReClock, madVR. DVB-C TV, Panasonic GT60, 6.0 speakers Denon 2310, Core 2 Duo E7400, GeForce 1050 Ti

Last edited by el Filou; 20th September 2019 at 17:51.
Old 20th September 2019, 17:52   #57451  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 5,964
i switched from PCIe 3.0 with DDR3 1600 MHz to PCIe 4.0 with DDR4 3200 MHz and i got worse results for copyback.
Old 20th September 2019, 17:59   #57452  |  Link
el Filou
Registered User
 
Join Date: Oct 2016
Posts: 546
That is really disappointing.
Old 20th September 2019, 18:02   #57453  |  Link
Klaus1189
Registered User
 
Join Date: Feb 2015
Location: Bavaria
Posts: 763
Quote:
Originally Posted by huhn View Post
i switched from PCIe 3.0 with DDR3 1600 MHz to PCIe 4.0 with DDR4 3200 MHz and i got worse results for copyback.
What DDR3 and DDR4 RAM modules did you compare?
Exact model number for tech specs. I want to compare on paper.
What mainboard did you use for each system?
Old 20th September 2019, 18:13   #57454  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 5,964
wow wow chill.
it's a freaking Zen 2 with a broken Navi card and the test was with AGESA 1.0.0.3a. come back in a couple of months when they have ironed the bugs out.

i mean the newest driver update managed to stop the BSODs when playing hardware decoded videos and now only the driver dies, they are making huge steps forward!

i don't even have the card installed anymore.
Old 20th September 2019, 18:25   #57455  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,833
The problem with CopyBack is also that it has to copy the image twice, once from the GPU to the system, and then back from the system to the GPU. On some GPUs, the download step is also rather slow (AMD used to historically have trouble there, no clue how recent hardware changed). But it can also stress the system RAM, especially on dual-channel memory mainstream systems.

I did a quick test on my system (which isn't a good example, since it has fast quad-channel RAM and everything else high-end as well, but regardless) with a random 4K 10-bit test clip I had at hand:
With DXVAChecker and naive EVR playback testing

DXVA2-Native, ~380 FPS
DXVA2-CopyBack, ~104 FPS
Software Decoding, ~196 FPS

The native test is close to what the hardware decoder can achieve, it was at ~95% usage most of the time. CopyBack definitely takes quite a toll on 4K. Interestingly on 1080p the overhead from CopyBack is generally extremely minimal.
Interesting is also software decoding. Granted you need a CPU that can actually decode this fast, and it was decode-limited at this point, but uploading the image alone is not bottlenecking the decoder yet.

Since I could upload at 196 fps at least (and probably more), I did another test, DXVA2-CopyBack, Decode only - which means it'll only download the image from the GPU, but not re-upload it. That yielded ~232 FPS.
Clearly the doubled use from download and upload creates the real bottleneck ... somewhere. It's not entirely clear where the real bottleneck is. Clearly the software upload path in the renderer can handle more than ~104 FPS. Clearly the download path in LAV Video can as well. PCIe is full-duplex, which means it should be capable of sending and receiving at the same time. System memory is more complex in regards to that... but my quad-channel memory should have plenty of bandwidth to accommodate this here.
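
To put the frame rates above into bytes per second, here is a rough Python sketch. It rests on assumptions that are not stated in the test itself: that the "4K" clip is 3840x2160, that frames are transferred as P010 (4:2:0, each 10-bit sample in a 16-bit word), and that there is no row padding. The function names are illustrative only.

```python
def p010_frame_bytes(width, height):
    """Size of one P010 frame: full-res luma plus interleaved half-res chroma."""
    luma = width * height * 2                      # one 16-bit word per luma sample
    chroma = (width // 2) * (height // 2) * 2 * 2  # Cb + Cr, one 16-bit word each
    return luma + chroma


def gbytes_per_s(fps, width=3840, height=2160):
    """Data rate in GB/s (decimal) for one transfer direction at a given frame rate."""
    return fps * p010_frame_bytes(width, height) / 1e9


# Frame rates taken from the measurements above.
MEASURED_FPS = {
    "DXVA2-CopyBack": 104,
    "Software Decoding": 196,
    "DXVA2-CopyBack, Decode only": 232,
}

if __name__ == "__main__":
    for name, fps in MEASURED_FPS.items():
        print(f"{name}: ~{gbytes_per_s(fps):.2f} GB/s per direction")
```

Under those assumptions ~104 FPS corresponds to roughly 2.6 GB/s in each direction and ~232 FPS to roughly 5.8 GB/s, both well below the theoretical rate of a 16-lane PCIe 3.0 link (about 15.75 GB/s). That fits the impression that the raw bus rate alone does not explain the slowdown.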

What I don't know is if the EVR used in this example uses a different thread for uploading the video, or if it's on the same thread that LAV Video uses to deliver the image - which might explain why it's slowing down so much, since it does two things on the same thread. madVR, at least, uses a separate thread for uploading, so it wouldn't be affected by that.
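
The same-thread idea can be checked against the numbers with a very simple model, sketched here in Python. It is a simplification with hypothetical function names: it treats download and upload as two stages with fixed per-frame costs, uses the ~232 FPS "Decode only" figure for download (which also includes decode time) and the ~196 FPS software decoding figure as a lower bound for upload speed (that test was decode-limited).

```python
def serialized_fps(*stage_fps):
    """Frame rate when every stage runs one after another on a single thread."""
    return 1.0 / sum(1.0 / fps for fps in stage_fps)


def overlapped_fps(*stage_fps):
    """Upper bound when stages run on separate threads: the slowest stage wins."""
    return min(stage_fps)


if __name__ == "__main__":
    download_fps = 232  # "DXVA2-CopyBack, Decode only" figure from above
    upload_fps = 196    # software decoding figure, only a lower bound for upload

    print(round(serialized_fps(download_fps, upload_fps), 1))  # prints 106.2
    print(overlapped_fps(download_fps, upload_fps))            # prints 196
```

The serialized estimate of about 106 FPS lands close to the measured ~104 FPS, which would be consistent with download and upload sharing a thread in the EVR test. It does not prove it, since contention on memory or the bus could produce a similar number.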
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 20th September 2019, 20:39   #57456  |  Link
Crimson Wolf
Registered User
 
Join Date: Dec 2014
Posts: 44
Quote:
Originally Posted by Stereodude View Post
Like Oblivion?
Just FYI: Oblivion is fake 4k (mastered in 2K or 1080p then upscaled)
Old 20th September 2019, 21:54   #57457  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 5,964
we can keep going on with copyback issues here:
https://forum.doom9.org/showthread.php?t=176642
Old 21st September 2019, 02:24   #57458  |  Link
seiyafan
Registered User
 
Join Date: Feb 2014
Posts: 161
In the hardware device option for copyback, is there any difference between the GPU card and the iGPU in the Intel processor?
Old 21st September 2019, 08:07   #57459  |  Link
littleD
Registered User
 
Join Date: Aug 2008
Posts: 306
Quote:
Originally Posted by nevcairiel View Post
The problem with CopyBack is also that it has to copy the image twice, once from the GPU to the system, and then back from the system to the GPU. On some GPUs, the download step is also rather slow (AMD used to historically have trouble there, no clue how recent hardware changed). But it can also stress the system RAM, especially on dual-channel memory mainstream systems.

I did a quick test on my system (which isn't a good example, since it has fast quad-channel RAM and everything else high-end as well, but regardless) with a random 4K 10-bit test clip I had at hand:
With DXVAChecker and naive EVR playback testing

DXVA2-Native, ~380 FPS
DXVA2-CopyBack, ~104 FPS
Software Decoding, ~196 FPS

The native test is close to what the hardware decoder can achieve, it was at ~95% usage most of the time. CopyBack definitely takes quite a toll on 4K. Interestingly on 1080p the overhead from CopyBack is generally extremely minimal.
Interesting is also software decoding. Granted you need a CPU that can actually decode this fast, and it was decode-limited at this point, but uploading the image alone is not bottlenecking the decoder yet.

Since I could upload at 196 fps at least (and probably more), I did another test, DXVA2-CopyBack, Decode only - which means it'll only download the image from the GPU, but not re-upload it. That yielded ~232 FPS.
Clearly the doubled use from download and upload creates the real bottleneck ... somewhere. It's not entirely clear where the real bottleneck is. Clearly the software upload path in the renderer can handle more than ~104 FPS. Clearly the download path in LAV Video can as well. PCIe is full-duplex, which means it should be capable of sending and receiving at the same time. System memory is more complex in regards to that... but my quad-channel memory should have plenty of bandwidth to accommodate this here.

What I don't know is if the EVR used in this example uses a different thread for uploading the video, or if it's on the same thread that LAV Video uses to deliver the image - which might explain why it's slowing down so much, since it does two things on the same thread. madVR, at least, uses a separate thread for uploading, so it wouldn't be affected by that.
Not sure if I'm adding anything new, but computers have pretty much always been designed with data flowing in one direction. For PC games to reach high frame rates, the system needs fast CPU>GPU memory transfers. The backward direction was always several times slower because no applications needed it. That started to matter in the GPGPU era, even before OpenCL, because general-purpose processing needs to re-upload data many times. Since then the GPU>CPU path has been steadily improved, but it is still slower.
RAM speed doesn't have much impact on CPU>GPU transfers since that direction is native to the PC architecture. Uploading to the GPU, decoding (texture/image) and rendering is typical for a PC game. Quad channel might have an advantage in software decoding, with its many CPU<>RAM memory transfers. So RAM speed might matter in software decoding.
In your example, downloading the image alone (decode only) looks fast anyhow. Re-uploading (playback) runs contrary to PC design; even with fast RAM, the slow speed may be a hardware or software limitation (DXVA design?). There are some small tools to benchmark PCIe GPU>CPU transfers.
Old 21st September 2019, 17:27   #57460  |  Link
TechnoPeasant
Registered User
 
Join Date: Apr 2019
Posts: 16
Has the latest Windows 10 update screwed the levels for anyone? I haven't touched my madVR setup in ages, the only thing that's changed is Windows 10, and now my levels are screwed and everything looks washed out.

Tags
direct compute, dithering, error diffusion, madvr, ngu, nnedi3, quality, renderer, scaling, uhd upscaling, upsampling
