madVR - high quality video renderer (GPU assisted) - Page 807

Sneals2000 · 10th December 2012, 02:00

Quote:

Originally Posted by turbojet

Are you guys sure 702 is actually the correct width of dvd's? I've always heard (mostly from an electronics engineer) 703 PAL and 706 NTSC which is typically rounded to 704.

Absolutely certain about 576/50i (aka "PAL") being 702 samples wide. Comes directly from 13.5MHz sampling of a 52us active line.
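The arithmetic behind that figure can be sketched in a few lines. This is only an illustration of the multiplication, using the 13.5MHz and 52us values quoted above; the function name is made up for the example.

```python
# Sketch: active samples per line = sampling rate x active line duration.
# Working in MHz and microseconds lets the units cancel (MHz * us = samples).

def active_samples(sampling_mhz: float, active_line_us: float) -> float:
    """Number of samples that fall inside the active part of one line."""
    return sampling_mhz * active_line_us

# Figures quoted in the post: 13.5MHz sampling, 52us active line.
pal_samples = active_samples(13.5, 52.0)
print(pal_samples)
```

With those two inputs the product comes out at 702, which is where the "702 samples wide" claim comes from.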

I don't know about NTSC - as there were all sorts of issues with 480 vs 486 lines. (Modern digital "NTSC" is based on 480 active lines, but analogue NTSC and digital broadcast video originally worked on the basis of 486 active lines)

(NB PAL changed from analogue 575 to digital 576 active lines - but didn't change picture height. How? Originally analogue "PAL" was deemed to be 575/50i active lines - with each 575 line frame made up from two fields each of 287.5 lines - one had a half-line at the top, one had a half-line at the bottom. In digital "PAL" these half-lines are padded to full lines creating 2 x 288 line fields or a 576 line frame. You often still see the half-lines at the top and bottom of frame when viewing 576/50i content scaled to 1920x1080 and viewed with no overscan)

6233638 · 10th December 2012, 02:30

So I've gone back and done some testing with hardware accelerated decoding and interlaced DVD playback, as something didn't look right when doing that aspect ratio testing earlier.

If I use DXVA2 Native decoding in LAV Video, I am only able to select Video as a source type, Film is not available - Video appears twice when hitting the keyboard shortcut. It doesn't appear to be a problem with the label either, it's definitely not being deinterlaced as film content.

If I use DXVA2 Copy-back decoding in LAV Video, Film-type deinterlacing works correctly.

If I use CUVID, Film-type deinterlacing is available, but not enabled by default. When CUVID is set to 25/30p output, it is detected as having a 2:2 cadence, and 4:4 when set to 50/60p output, but the output is aliased/low resolution as if the wrong type of deinterlacing is being applied.
I see that CUVID no longer has the option to force a specific type of deinterlacing, and suspect that it's only using "Adaptive" now, which is probably the cause of problems there; Film-type content needs to use "None (Weave)" option to look correct.

EDIT: Unchecking the "Enable Adaptive HW Deinterlacing" option with CUVID also looks correct. I don't know if the only difference now is that the "bob" option has been eliminated. When you uncheck that option, the "High-Quality Processing" option is greyed out, but I don't know if that would have been used with the "None (Weave)" option anyway.

So only Software decoding, CUVID with "Adaptive HW Deinterlacing" disabled, or DXVA2 Copy-Back are working correctly with interlaced film-type content. (I only have film-type content to test with)

In case it makes a difference, this was PAL film-type content. (as you probably guessed from the 2:2/4:4 cadence)

Quote:

Originally Posted by Sneals2000

Hmm - yet the SD grabs are slightly narrower with black samples either side. Annoying. It's as if the standard is being partially followed...

It looks like the image is the correct aspect ratio when scaled from 720x576 to 1024x576, but with black bars on either side of it. If I stretch it to eliminate the bars, the image looks distorted.

pie1394 · 10th December 2012, 03:41

Old CRTs have an overscan area. The signal is analog, and the electron beam scans continuously, unlike flat panel devices (PDP, LCD, DMD-DLP, etc.). So without accurate calibration, no two CRTs show exactly the same aspect ratio, whether 4:3 or 16:9. But only very few people can tell the minor difference.

In my country, most DVB-T broadcasters send out an MPEG-2 704x480i30 signal for SDTV programs, but a few of them send out MPEG-2 720x480i30.

Yet the DAR (Display Aspect Ratio) is described in the MPEG-2 stream's header. It does cause issues on calculating PAR (Pixel Aspect Ratio) to prevent unnecessary horizontal scaling for MPEG-2 720x480, 720x576 DVD / SDTV contents.

At that time I decided to use MPEG-4 PAR to describe the image pixel aspect ratio. I also made the MPEG-2 decoder send out the video frame with fake PAR information to the video scaling processor. So all 720x480 and 720x576 combinations always get the same PAR as 704x480 and 704x576.

If the output signal is NTSC / PAL / SDTV D1, then for 704x480 and 704x576 content at a 100% zoom rate, just let the video player put the video display window at (8,0)-(704, 480/576) of the video frame buffer w/o any scaling.
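As a rough sketch of that placement: the function below centres 704-wide content in a 720-wide frame buffer. It assumes "(8,0)-(704, 480/576)" means an origin of (8,0) and a window size of 704 by 480 or 576; the function name and the tuple layout are hypothetical.

```python
# Sketch: place unscaled content horizontally centred in a wider frame buffer.
# Returns (x, y, width, height) of the display window.

def centered_window(buffer_width: int, content_width: int, height: int):
    if content_width > buffer_width:
        raise ValueError("content is wider than the frame buffer")
    x_offset = (buffer_width - content_width) // 2  # 16 spare columns -> 8 per side
    return (x_offset, 0, content_width, height)

print(centered_window(720, 704, 480))
print(centered_window(720, 704, 576))
```

The 8-pixel offset is simply half of the 16 columns that separate 704 from 720.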

pie1394 · 10th December 2012, 06:25

Quote:

Originally Posted by madshi

FWIW, if your CPU doesn't support SSE4.1 then madVR is not doing copyback internally, so the behaviour is different to what CUDA/DXVA-copyback in LAV Video Decoder do. In theory native DXVA should perform better in your situation, but with a small hit on chroma image quality.

I am just curious which SSE4.1 instructions are needed in your operation...

Any consideration about AVX?

It is supported on the 2nd and newer Core i processors. Not only the integer parts, it also does provide some boosts over SSE3 on FP32 / FP64 intensive calculations. The Altivec-type 3-operand instruction formats also get some benefits on saved code size. But 256-bit AVX packed shuffle / sum-of-2 instruction logics are somewhat different from 128-bit SSE3 ones on the same sized element.

turbojet · 10th December 2012, 06:29

It's wiki, so not to be taken as scientific fact, but https://en.wikipedia.org/wiki/Oversc...olution_issues proves both Sneals2000 (analog) and EE guy (digital) correct. It's all above my head, but he explained it to me a few years ago: something about the color carrier and a bunch of numbers, and it came out to 703 and 706 respectively. Bored me to death. But it's 1 pixel, and no naked eye could tell the difference. However, avisynth has issues with resolutions that aren't mod4 (green lines and repeated lines); dunno if that's a problem with madVR.

Mangix · 10th December 2012, 07:36

Request: add a text file to the madvr.zip file that states miscellaneous stuff.

In particular the peculiarities of DXVA2 scaling, decoding, deinterlacing, w/e. It's confusing as not only must this information be chased down, it's subject to change. And if anyone does not follow the hundreds of pages in this thread, important notes can be missed.

ryrynz · 10th December 2012, 07:55

Whilst madVR has been stable for quite some time, I still believe Madshi considers it an "in development" project and is still interested in implementing a few more features.
Full documentation will likely be supplied once it gets close to or hits 1.0.

nevcairiel · 10th December 2012, 09:06

Quote:

Originally Posted by pie1394

I am just curious which SSE4.1 instructions are needed in your operation..

The important instruction is MOVNTDQA, a special instruction to read memory more efficiently from write-combining memory, as used by GPUs. Without this, a Copy-Back implementation is most likely very inefficient.

AVX does not provide an improvement over this, AVX2 will however, AFAIK.

egur · 10th December 2012, 10:21

Quote:

Originally Posted by nevcairiel

The important instruction is MOVNTDQA, a special instruction to read memory more efficiently from write-combining memory, as used by GPUs. Without this, a Copy-Back implementation is most likely very inefficient.

AVX does not provide an improvement over this, AVX2 will however, AFAIK.

I implemented an AVX2 copy back function which works fine on Haswell. It doesn't provide any performance improvements over SSE4.1. To date I haven't found an implementation faster than the QuickSync implementation (either SSE4.1 or AVX2).
Copy back should be done on 2 threads using SSE aligned addresses.

You must use the VS2012 compiler or Intel compiler 12.1 (maybe 13.0, didn't check) or newer.
The instructions used are:
VMOVNTDQA: streaming load 256bit. Used via the _mm256_stream_load_si256 intrinsic function.
VMOVDQA: aligned store 256bit. Used via the _mm256_store_si256 intrinsic function.

See the code here (gpu_memcpy_avx2 function).

pie1394 · 10th December 2012, 10:49

Quote:

Originally Posted by egur

I implemented an AVX2 copy back function which works fine on Haswell. It doesn't provide any performance improvements over SSE4.1. To date I haven't found an implementation faster than the QuickSync implementation (either SSE4.1 and AVX2).
Copy back should be done on 2 threads using SSE aligned addresses.

Does it mean the Haswell is still restricted with maximum 64 bytes read per PCI-e transaction?

Without the buffered DMA unit + internal CPU data ram, it is quite inefficient for CPU to fetch / write-back large stride-based contiguous non-cacheable memory like video frame buffer or peripheral memory. Yet it is often only provided on the video processors, not general-purposed one like x86...

egur · 10th December 2012, 11:05

Quote:

Originally Posted by pie1394

Does it mean the Haswell is still restricted with maximum 64 bytes read per PCI-e transaction?

Without the buffered DMA unit + internal CPU data ram, it is quite inefficient for CPU to fetch / write-back large stride-based contiguous non-cacheable memory like video frame buffer or peripheral memory. Yet it is often only provided on the video processors, not general-purposed one like x86...

I haven't tried reading from a PCI-E device. I used CB to copy from the integrated GPU. The iGPU communicates with the CPU via a ring bus, not PCI-E; they sit on the same silicon die. There's no DMA copy going on.
For discrete GPUs, I'm not sure how CB actually works, this is not my cup of tea...
I can guess that the surface is copied via DMA when it's locked via the LockSurface function, but maybe not.

madshi · 10th December 2012, 11:05

Quote:

Originally Posted by turbojet

Ok, worth considering the labels I mentioned? hddvd/hd-dvd could be fixed with *hd* having a higher priority than *dvd*

I haven't decided on the labels yet, but I don't think I'll support either DVD or HDDVD/HD-DVD. I'll rather support specific tags to switch specific things.

Quote:

Originally Posted by turbojet

if par=4:3 and width <721 matrix=601
if par=16:9 and width <721 matrix=601
if width <721 matrix =601 else matrix=709
would this work?

No, because custom h264 PAL encodes are often 1024x576.
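To make the objection concrete, here is a minimal sketch of the proposed rule. The first two lines of the quoted pseudocode are subsumed by the third, so the rule collapses to a width check. The function name is hypothetical, and the reading that 1024x576 PAL encodes would be wrongly treated as BT.709 is an interpretation of the reply above, not something it states.

```python
# Sketch of the quoted heuristic: pick the colour matrix from the width alone.

def guess_matrix(width: int) -> str:
    """Return "601" for SD-sized widths and "709" otherwise."""
    return "601" if width < 721 else "709"

print(guess_matrix(720))   # DVD-sized frame
print(guess_matrix(1024))  # custom PAL encode resized to square pixels
```

A 1024x576 encode fails the `width < 721` test and so lands in the 709 branch, even though it may have been made from an SD source.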

Quote:

Originally Posted by THX-UltraII

madVR also tells me I get 1 dropped frame every 21 minutes. This would mean approx. 3 per hour. But the strange thing is that I get 20-30 dropped frames per hour. You might have an idea what could be the cause of this?

Well, when you get those frame drops, are any of the queues near empty (see madVR debug OSD -> Ctrl+J)?

Quote:

Originally Posted by 6233638

Unfortunately, I have made it a habit to get rid of the DVD version of a film as soon as I get the Blu-ray release, so I don't have many left to compare.

Sample 1:

Blu-ray
DVD

Sample 2:

Blu-ray
DVD

I had to shift the position on them slightly as they didn't match identically, but it appears that 720x576 to 1024x576 is what studios assume.

Thanks! So it is as I feared: There's no way to know which scaling/AR is needed for any given content...

Quote:

Originally Posted by Sneals2000

Yep - it comes as a surprise to many - even though it's an inherent part of ITU 601 - the core SD digital video standard. It's a direct result of 13.5MHz sampling. Sorry if it seems a bit pedantic - but when I saw 1050x576 being slightly dissed, I had to pipe up.

It's not pedantic at all. I'm very interested in such things.

Quote:

Originally Posted by Sneals2000

Yes - though NTSC is slightly more confusing to get a grip on because of the change from 486 to 480 active lines with the introduction of DV and MPEG2 (though ISTR D1/D2 and I think D3 captured 486 lines). I've not worked extensively with NTSC but I believe it is based on 711x486 (but with 6 lines cropped when analogue NTSC was digitised I'm not sure what has happened). It's a lot less clear though

Ouch.

Quote:

Originally Posted by Sneals2000

At the end of the day it's going to be a very small geometric distortion, and I suspect there are significant numbers of DVDs mastered in both ways. However ITU 601 is really clear - and all SD digital video based on 720x576 should be ITU 601 compliant - and anything handling ITU 601 content should really follow the standards I would suggest. Just because others don't doesn't mean everyone shouldn't...

From my point of view, following the standards is very important. However, making the majority of content look right is even more important. So for me the key questions are:

(1) Do users today play more content which needs 720x576 -> 1024x576 scaling, or do they play more content which needs 720x576 -> 1050x576 scaling?
(2) What do the majority of DVD studios do today?
(3) What do the majority of broadcasters do today?

(these are not questions I expect you to be able to answer)

It seems to me that the majority of newly released DVDs these days probably need 720x576 -> 1024x576 scaling. This is an educated guess, though, and would need to be confirmed first. I'm not sure about (3). There's a good chance that the majority of broadcast material might need 720x576 -> 1050x576 scaling. But it's hard to know for sure.

So what all this leaves us in is a total mess...

I think it would be a bad idea if I just scaled all 720x576 content to 1050x576 now because my impression is that it would produce stretched images for many newer DVDs. But then simply ignoring the whole issue might not be such a good idea, either. Oh well. Not really sure what to do now...

Quote:

Originally Posted by 6233638

If I use DXVA2 Native decoding in LAV Video, I am only able to select Video as a source type, Film is not available - Video appears twice when hitting the keyboard shortcut. It doesn't appear to be a problem with the label either, it's definitely not being deinterlaced as film content.

madVR's IVTC algorithm currently still runs on the CPU (SSE2 code). Because of that it can't be activated when using Native DXVA2 decoding. I might move the IVTC algorithm to the GPU at some point in the future, but not too soon.

Quote:

Originally Posted by Mangix

Request: add a text file to the madvr.zip file that states miscellaneous stuff.

In particular the peculiarities of DXVA2 scaling, decoding, deinterlacing, w/e. It's confusing as not only must this information be chased down, it's subject to change. And if anyone does not follow the hundreds of pages in this thread, important notes can be missed.

Which exact peculiarities are you interested in? DXVA2 processing in madVR should behave mostly similar to other renderers.

Quote:

Originally Posted by egur

I implemented an AVX2 copy back function which works fine on Haswell. It doesn't provide any performance improvements over SSE4.1. To date I haven't found an implementation faster than the QuickSync implementation (either SSE4.1 and AVX2).
Copy back should be done on 2 threads using SSE aligned addresses.

You must use the VS2012 compiler or Intel compiler 12.1 (maybe 13.0, didn't check) or newer.
The instructions used are:
VMOVNTDQA: streaming load 256bit. Used via the _mm256_stream_load_si256 intrinsic function.
VMOVDQA: aligned store 256bit. Used via the _mm256_store_si256 intrinsic function.

See the code here (gpu_memcpy_avx2 function).

FWIW, I'm using _mm_stream_si128 instead of _mm_store_si128. That's because I'm not copying from GPU -> CPU RAM. Instead I'm directly copying GPU -> GPU.

kasper93 · 10th December 2012, 11:26

@madshi
madVR doesn't work well with the MPC and Microsoft DXVA decoders, at least on an HD5870 with the 12.10 driver.

Playback starts fine, but madVR freezes after a seek: see https://dl.dropbox.com/u/16282309/mad/madVR-freeze.report.mpc.dxva.7z
and log: https://dl.dropbox.com/u/16282309/mad/madVR.log.mpc.dxva.7z

ffdshow DXVA and LAV don't have the seeking problem. The MPC and Microsoft DTV-DVD Video Decoders do.

alexacolor · 10th December 2012, 11:45

Sorry if this was already discussed: when running on an old video card (Radeon 550, PS2.0) I get just a black screen. There are no errors; all interfaces are created without errors. How can I detect that the video cannot be played (black screen)?

DragonQ · 10th December 2012, 12:09

Quote:

Originally Posted by Sneals2000

Yep - that's what should happen with properly mastered SD video that is mastered to the international standard (i.e. ITU 601). An HD 16:9 1920x1080 or 1280x720 source should be downscaled to 702x576 and sit in the middle of the 720x576 frame - as can be seen by BBC One HD and BBC One SD. I'd hope Channel Four and ITV1 HD and SD were similar. (Particularly as Channel Four is played out by the same people as play out BBC channels - Red Bee Media - formerly BBC Broadcast)

That may be what should happen with regards to the broadcast, but the software isn't treating it properly.

Maybe it should be an auto-detection thing - if the outer x number of columns are blank, assume it's ITU 601 and scale accordingly. If they're not blank, assume the whole frame is meant to be 16:9 and scale accordingly.
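A minimal sketch of that idea, assuming a frame is available as rows of 8-bit luma values. The column count (8), the black threshold (20, a little above video black at 16) and the function names are all placeholder assumptions, not anything madVR does.

```python
# Sketch: treat a frame as ITU 601 style if its outermost columns are all black.

def outer_columns_blank(frame, columns=8, black_level=20):
    """True if the leftmost and rightmost `columns` columns are at or below black_level."""
    width = len(frame[0])
    edge_columns = list(range(columns)) + list(range(width - columns, width))
    return all(row[x] <= black_level for row in frame for x in edge_columns)

def scaled_width(frame):
    """Pick the 16:9 target width for a 720x576 frame from the edge check."""
    return 1050 if outer_columns_blank(frame) else 1024

def make_row(width, border):
    """Test row: black borders (luma 16) around mid-grey picture (luma 128)."""
    return [16] * border + [128] * (width - 2 * border) + [16] * border

pillarboxed = [make_row(720, 9) for _ in range(4)]  # 702 active samples
full_width = [make_row(720, 0) for _ in range(4)]   # picture fills all 720
print(scaled_width(pillarboxed), scaled_width(full_width))
```

A single dark frame would also pass this check, so a real detector would presumably need to look at many frames before deciding.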

pandy · 10th December 2012, 12:14

Quote:

Originally Posted by madshi

Create an empty file with the name "YCbCr" in the madVR folder...

Big THX Madshi! i love undocumented features!

6233638 · 10th December 2012, 12:47

Quote:

Originally Posted by madshi

So what all this leaves us in is a total mess...

I think it would be a bad idea if I just scaled all 720x576 content to 1050x576 now because my impression is that it would produce stretched images for many newer DVDs. But then simply ignoring the whole issue might not be such a good idea, either. Oh well. Not really sure what to do now...

It's still less than ideal, and I don't know if it would be possible, but the best solution I can think of, would be to have an option such as:

Automatically activate ITU scaling when pillarboxing is detected

if in doubt, activate scaling
if in doubt, deactivate scaling

Of course, this assumes you are able to figure out some way to detect when there is a small amount of pillarboxing on the side of the image.

A simpler solution would simply be to have a global preference for what you want to do with all SD content, and a keyboard shortcut to toggle between the two though.
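The first option, with its "if in doubt" fallback, could look roughly like this. The border measurements are assumed to come from some detector that is not shown, and the thresholds (7 to 11 black columns per side counts as pillarboxed, 0 counts as full width) are invented for the example.

```python
from typing import Optional

def detect_itu(left_border: int, right_border: int) -> Optional[bool]:
    """True = pillarboxed, False = full width, None = in doubt."""
    if 7 <= left_border <= 11 and 7 <= right_border <= 11:
        return True   # close to the 9 columns per side of 702-in-720
    if left_border == 0 and right_border == 0:
        return False
    return None       # anything else is ambiguous

def use_itu_scaling(left_border: int, right_border: int, when_in_doubt: bool) -> bool:
    """Apply the user's 'if in doubt' preference only when detection is unsure."""
    verdict = detect_itu(left_border, right_border)
    return when_in_doubt if verdict is None else verdict

print(use_itu_scaling(9, 9, when_in_doubt=False))
print(use_itu_scaling(3, 0, when_in_doubt=True))
```

The global preference plus keyboard shortcut mentioned last would amount to skipping `detect_itu` and using the `when_in_doubt` value directly.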

pandy · 10th December 2012, 12:55

Quote:

Originally Posted by Sneals2000

Something that a LOT of people get wrong...

Quote:

Originally Posted by Sneals2000

Absolutely certain about 576/50i (aka "PAL") being 702 samples wide. Comes directly from 13.5MHz sampling of a 52us active line.

OK - let's start with some math. ITU-R BT.1700 (which supersedes BT.470)
tells us that active video for a 625 line system can be 52.3us long (64 - 11.7us).

Systems based on BT.601/656 use 13.5MHz sampling, i.e. 1 pixel is 1/13.5MHz long, so the active video length in digital pixels is 52.3/(1/13.5) = approx. 706 pixels. So not 702, 704, 708, 710 or 720, but 706 pixels. Thus the story is even more complicated, and it should be 1044 wide video, not 1050.

Why not 52us? In the digital world timing control is very precise, and many vendors offer hardware capable of producing 52.3us rather than 52us video, which is allowed by the video standard.
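Both sets of numbers in this discussion can be reproduced with a short sketch. It assumes the active part of the line is meant to fill 1024 square pixels (16:9 at 576 lines), so the full 720-sample frame scales to 720 * 1024 / active samples; the function names are made up for the example.

```python
# Sketch: active samples for a given line duration, and the width the full
# 720-sample frame scales to if the active part should end up 1024 wide.

def active_samples(sampling_mhz: float, active_line_us: float) -> float:
    return sampling_mhz * active_line_us  # MHz * us = samples

def full_frame_width(active: float, frame_width: int = 720,
                     target_active_width: int = 1024) -> float:
    return frame_width * target_active_width / active

for active_us in (52.0, 52.3):
    samples = active_samples(13.5, active_us)
    print(active_us, round(samples), round(full_frame_width(samples)))
```

Under that assumption a 52us active line leads to the 1050 figure and a 52.3us active line to the 1044 figure.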

THX-UltraII · 10th December 2012, 13:44

Quote:

Originally Posted by madshi

Well, when you get those frame drops, are any of the queues near empty (see madVR debug OSD -> Ctrl+J)?

I just raised the option 'show frames in advance' from 4 to 16 and now I get 2-5 dropped frames during a 2 hour movie! That's pretty nice, I think, when you keep in mind that I bitstream.

THX-UltraII · 10th December 2012, 13:46

When you use an ISF calibrated digital front projector (like my Sony HW50), do you just check the option 'disable calibration for this display'? (can't remember the exact name of that option, but it is the first radio button you can choose)

