Old 26th January 2014, 14:44   #21941  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Quote:
Originally Posted by madshi View Post
Random dithering is currently done before drawing the OSD. Error diffusion is done even later, after drawing the OSD. Both are done after scaling and after smooth motion FRC. As nevcairiel said, it must be that way. Of course, when using smooth motion FRC, the number of rendered frames increases, so that will drive up error diffusion cost.

I've read somewhere that when using NVidia OpenCL, the CPU consumption goes up because for some reason NVidia OpenCL just works that way. I think the thread waiting for OpenCL to finish its work is probably running at 100%. Don't ask me why. I don't know if this happens with AMD or Intel, too, haven't really tested that.
Okay, thanks for clarifying. Next question would be if there is a way to disable OpenCL dither with Smooth Motion via profiles? It seems we are missing a boolean value for 'is Smooth Motion active'? Would it also be possible to add codec rules for image formats? If I had a boolean for PNG, I could work around Issue #142 on your bug tracker.

Quote:
Originally Posted by madshi View Post
madVR871b
...
@cyber, this should also fix the settings delay.
Confirmed

Last edited by cyberbeing; 26th January 2014 at 15:05.
Old 26th January 2014, 15:09   #21942  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by ryrynz View Post
Given the GPU performance requirement for error diffusion and the slight improvement in quality when it is done on the CPU, what are the chances of this ever being an option that could be enabled on the CPU in madVR?
Not a chance, because it would require heavy copyback operations. Much more so than a DXVA2 decoder does.

Quote:
Originally Posted by James Freeman View Post
I disagree.
87.1b works much faster than even 86.11.
Quote:
Originally Posted by James Freeman View Post
Edit: I forgot to turn off Debanding in 87.1b, which means it's even faster than what's in this image.
What is weird is that in one of your images the max numbers are smaller than the average numbers. Not sure how that could happen.

Quote:
Originally Posted by nevcairiel View Post
These numbers look unrealistic. I seriously doubt their accuracy.
I agree that such a big difference looks suspicious. However, I did do one important change in v0.87.0 (see also changelog): In all older versions sometimes when the backbuffer queue was full, rendering performance slowed down dramatically. IIRC, some users reported that they cut the rendering queue size down so that the backbuffer queue never gets full to solve this problem, or something like that.

I've worked around this problem in v0.87.x simply by telling D3D9 to allocate one backbuffer more than needed. I can imagine that James Freeman runs into the original problem with v0.86.11, so that his rendering times are much worse than his GPU is really capable of. And the workaround in v0.87.x could have solved this issue and thus restored the original/full speed of the GPU. This is just a guess, though, I could be wrong. But it could explain the improvement he's seeing. I did have high hopes for the workaround I added. Had one situation on my PC where I could reproduce the original problem, and the workaround fully fixed it.

Quote:
Originally Posted by huhn View Post
these are all screens from 86.11

ivtc doesn't really care about queues but deinterlacing... how to judge performance this way.

even if amd in this case is changing the deinterlacing mode because the number of buffered frames is too low, why are the rendertimes affected?
How long did you let rendering run when you made these images? Is it possible that with the bigger queue size the rendering times are only so much worse directly at the start of video playback? Maybe after a couple of seconds they slowly come down and then at some point match the rendering times with smaller queue sizes? (With v0.87.1b at least?)

Quote:
Originally Posted by hannes69 View Post
Just tested 87.1b on AMD HD4550 (quite old, quite slow). Everything is working fine so far. I didn't test everything (e.g. deinterlacing), but for my typical usage scenario everything works. No black or bluescreens, freezes, stuttering or other unwanted behaviour.
Good to hear - thanks for bringing good news for a change...

Quote:
Originally Posted by noee View Post
madshi:
With my VC1 1080i test file, .87b makes substantial perf gains for deint:

DXVA Native (FSE Mode):
.86: ~7.3ms, full queues, perfect playback
.871: ~18.2ms, starved render/backbuffer, many drops
.871b: ~4.10ms, full queues, perfect playback

DXVA CB (FSE Mode):
.871: ~7.5ms, full queues, perfect playback
.871b: ~2.75ms, full queues, perfect playback

HD6570
Looks great! Can you confirm that frame dropping behaviour matches these rendering times, just to be safe? Meaning: Can you actually use even more demanding scaling algorithms now compared to v0.86.11? At least the same, without getting frame drops? I'm not sure how much I can really trust the rendering times, although they're usually at least roughly right.

Quote:
Originally Posted by DragonQ View Post
All of this testing is on my laptop using an Intel HD4000, as I've said several times (although I understand if you haven't been following the past dozen pages). I don't/can't use MadVR on my HTPC since MediaPortal doesn't support it.
Ok. So it seems your performance issue is probably the last one remaining. I'm not sure if you said this in the past already, but does your performance issue only occur when some kind of deinterlacing is involved? You did say it had to do with smooth motion FRC. But does interlacing play any role? DXVA deinterlacing? Or film mode? Both or none? Thanks.

Quote:
Originally Posted by cyberbeing View Post
Okay, thanks for clarifying. Next question would be if there is a way to disable OpenCL dither with Smooth Motion via profiles? It seems we are missing a boolean value for 'is Smooth Motion active'? Would it also be possible to add codec rules for image formats? If I had a boolean for PNG, I could work around Issue #142 on your bug tracker.
We can look at possible additions to the profile variables once all the other issues are resolved. Please let's delay this until then, though. Generally I'm absolutely open to add further profile variables. Anything that makes sense and doesn't cost performance.
Old 26th January 2014, 15:18   #21943  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
Quote:
Originally Posted by madshi View Post
How long did you let rendering run when you made these images? Is it possible that with the bigger queue size the rendering times are only so much worse directly at the start of video playback? Maybe after a couple of seconds they slowly come down and then at some point match the rendering times with smaller queue sizes? (With v0.87.1b at least?)
87.1b is at least awesome just have a look at my posting http://forum.doom9.org/showpost.php?p=1664365&postcount=21948

when i switch from 5/5 to 16/8 it takes about 2 sec and the avg deint times are at 2 ms again, still worlds better than 86.11.
at first i thought i had broken 86.11 again but James Freeman got the same...

and the screens for 86.11 were running for at least 18 sec, so no, they don't get any better
Old 26th January 2014, 15:21   #21944  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by huhn View Post
87.1b is at least awesome just have a look at my posting http://forum.doom9.org/showpost.php?p=1664365&postcount=21948

when i switch from 5/5 to 16/8 it takes about 2 sec and the avg deint times are at 2 ms again, still worlds better than 86.11.
at first i thought i had broken 86.11 again but James Freeman got the same...

and the screens for 86.11 were running for at least 18 sec, so no, they don't get any better
Yes, but even with 871b your screenshots still show lower rendering times for 5/5 than for 16/8. I'm wondering if these rendering times get nearer to each other if you let the video run for a longer time? You could also zoom in and back out again during playback to reset the average rendering times.
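
To illustrate why resetting the averages matters, here is a minimal Python sketch. It is not madVR's code: the `RenderStats` class and the trace values are hypothetical, chosen only to show how a few slow start-up frames keep a running average inflated until the statistics are reset.

```python
class RenderStats:
    """Running average and maximum of render times in milliseconds."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Forget everything measured so far.
        self.count = 0
        self.total = 0.0
        self.maximum = 0.0

    def add(self, ms):
        self.count += 1
        self.total += ms
        self.maximum = max(self.maximum, ms)

    @property
    def average(self):
        return self.total / self.count if self.count else 0.0


# Hypothetical trace: 50 slow start-up frames, then 200 settled frames.
startup = [9.0] * 50
settled = [2.0] * 200

stats = RenderStats()
for ms in startup + settled:
    stats.add(ms)
average_with_startup = stats.average  # (50*9 + 200*2) / 250 = 3.4

# Resetting after the start-up phase measures only the settled frames.
stats.reset()
for ms in settled:
    stats.add(ms)
average_after_reset = stats.average  # 2.0
```

With these made-up numbers the average still reads 3.4 ms long after the per-frame time has settled at 2 ms, which is why two screenshots are only comparable if both were taken after a reset.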
Old 26th January 2014, 15:27   #21945  |  Link
noee
Registered User
 
Join Date: Jan 2007
Posts: 530
Quote:
Originally Posted by madshi
Meaning: Can you actually use even more demanding scaling algorithms now compared to v0.86.11? At least the same, without getting frame drops?
Yes, I tested for that too on some SD interlaced video I have. With .86, I could run Jinc3/AR luma no problems at all. Same now with .871b on same interlaced video (GPU load is actually a touch lower, ~8% IIRC), but I didn't actually record the numbers, I just watched for a few minutes and monitored frame drops (of which there were none). Software decoding for this SD testing.

If I try NNEDIx2 for the SD interlaced, I have to drop down to NN/Bilin and 16 neurons. It appears I have just reached the limits of this card, unless there is some kind of dramatic perf increase with the next OpenCL driver update from our friends at AMD. I would guess that the OpenCL<>D3D9 interop has room for improvement. Whether or not that translates to more headroom for these older, lowly cards is another question.
Old 26th January 2014, 15:30   #21946  |  Link
DragonQ
Registered User
 
Join Date: Mar 2007
Posts: 934
Quote:
Originally Posted by madshi View Post
Ok. So it seems your performance issue is probably the last one remaining. I'm not sure if you said this in the past already, but does your performance issue only occur when some kind of deinterlacing is involved? You did say it had to do with smooth motion FRC. But does interlacing play any role? DXVA deinterlacing? Or film mode? Both or none? Thanks.
Yes, it's definitely only "video mode" deinterlacing that makes this problem manifest itself. It's made dramatically worse when Smooth Motion is on, but like I said above, this might just be a red herring since the GPU is at 100% load and can't cope.

Watching progressive videos and/or enabling film mode results in performance on par with 0.86.11 (or slightly better).
__________________
TV Setup: LG OLED55B7V; Onkyo TX-NR515; ODroid N2+; CoreElec 9.2.7
Old 26th January 2014, 15:33   #21947  |  Link
James Freeman
Registered User
 
Join Date: Sep 2013
Posts: 919
Quote:
Originally Posted by madshi View Post
Yes, but even with 871b your screenshots still show lower rendering times for 5/5 than for 16/8. I'm wondering if these rendering times get nearer to each other if you let the video run for a longer time? You could also zoom in and back out again during playback to reset the average rendering times.
Thanks for the tip.
Play with the volume (mouse wheel) for instant refresh of everything else.

After a few minutes the stable times are (87.1b 1080i30 SM /On):

Average:
Deint 0.46
split 2.94
rendering 3.33
present 0.10

Max Stats:
deint 1.03
split 4.12
rendering 4.06
present 0.16

Quote:
Originally Posted by madshi
What is weird is that in one of your images the max numbers are smaller than the average numbers. Not sure how that could happen.
I took the photo too quickly.
It does not matter, 87.1b is still the fastest release yet.
__________________
System: i7 3770K, GTX660, Win7 64bit, Panasonic ST60, Dell U2410.

Last edited by James Freeman; 26th January 2014 at 15:39.
Old 26th January 2014, 15:36   #21948  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
nope, 16/8 is at ~2 ms and 5/5 at 0.65 ms; over a minute should be more than enough. deinterlacing is and was on automatic for all tests. deint starts at 9 ms and falls to 2 ms after some time, but gpu load is at ~35% for both.

but there is one thing that doesn't make sense at first look: the ivtc rendertimes are at ~9 ms and deint is at 4.5 ms...
but that's simple and makes rendertimes kinda worthless, because in ivtc mode the gpu was at 157/300 mhz - 500/300 mhz, so at idle most of the time, but with deint it was at max powerstate 850/1300 mhz.

so to compare different versions we need the powerstate too...
Old 26th January 2014, 15:44   #21949  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Ah yes, powerstate. Probably makes sense to fix the powerstate to a specific value (max?) while benchmarking rendering times...
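
One way to keep such comparisons honest is to log the GPU clocks next to every timing and refuse to compare samples taken at different clocks. The Python sketch below is only an illustration: the `Sample` type and `comparable` helper are hypothetical names, the first two samples reuse the figures huhn reported above (taking the upper end of the ivtc clock range), and the third sample is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    label: str
    render_ms: float
    core_mhz: int
    memory_mhz: int


def comparable(a, b):
    """True only if both samples were taken at the same GPU clocks."""
    return (a.core_mhz, a.memory_mhz) == (b.core_mhz, b.memory_mhz)


# Figures from the post above; the GPU idled during ivtc playback.
ivtc = Sample("ivtc", 9.0, 500, 300)
deint = Sample("deint", 4.5, 850, 1300)
# Invented sample at the same clocks as `deint`, for illustration only.
deint_other_build = Sample("deint, other build", 2.0, 850, 1300)

ivtc_vs_deint = comparable(ivtc, deint)                # False
deint_vs_other = comparable(deint, deint_other_build)  # True
```

This does not try to normalise timings by clock speed, since nothing in the thread suggests render time scales linearly with clocks. It only flags which pairs of numbers should not be compared directly.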
Old 26th January 2014, 15:51   #21950  |  Link
pie1394
Registered User
 
Join Date: May 2009
Posts: 212
On my HTPC, 0.87.1b improves GPU deinterlacing performance, although it is not on par with 0.86.10. You have done a good job restoring some of the lost performance.

[720x480i60]
0.86.10 : 0.83 ms
0.87.1: 1.64 ms
0.87.1b: 1.38 ms

[1440x1080i60]
0.86.10 : 1.44 ms
0.87.1: 4.21 ms
0.87.1b: 2.12 ms

Note that the averages are not exactly the same on every run, but repeated tests give similar values.
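
For a quick sense of scale, the deinterlacing times listed above can be expressed relative to 0.86.10. This is just arithmetic on the posted numbers; the `relative_to_baseline` helper is a hypothetical name used for this sketch.

```python
# Average deinterlacing times in ms, copied from the lists above.
deint_ms = {
    "720x480i60": {"0.86.10": 0.83, "0.87.1": 1.64, "0.87.1b": 1.38},
    "1440x1080i60": {"0.86.10": 1.44, "0.87.1": 4.21, "0.87.1b": 2.12},
}


def relative_to_baseline(times, baseline="0.86.10"):
    """Return each version's time as a multiple of the baseline time."""
    base = times[baseline]
    return {version: round(ms / base, 2) for version, ms in times.items()}


ratios = {mode: relative_to_baseline(times) for mode, times in deint_ms.items()}
# 720x480i60:   0.87.1 -> 1.98x, 0.87.1b -> 1.66x
# 1440x1080i60: 0.87.1 -> 2.92x, 0.87.1b -> 1.47x
```

So on this machine 0.87.1b recovers much of what 0.87.1 lost, but still takes roughly 1.5x to 1.7x as long as 0.86.10.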

For my HD7970, it always seems to go up to 925 MHz from 300MHz with madVR playback. The only difference is the fan speed level under different GPU loading.


--
Core i5-3570K + Z77 + dual-ch DDR3-2400 + HD7970@925MHz Catalyst 13.12 + Win7x64SP1 + MPC-BE 1.3.0.3 + LavFilter 0.60.1 (DXVA)

Last edited by pie1394; 26th January 2014 at 15:58.
Old 26th January 2014, 16:04   #21951  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
Quote:
Originally Posted by madshi View Post
Ah yes, powerstate. Probably makes sense to fix the powerstate to a specific value (max?) while benchmarking rendering times...
at least this is impossible with my card: when dxva decoding is used it goes to 500/~800 and stays there even with jinc 8 ar. if i force the high powerstate the player will crash with dxva decoding, and i'm pretty sure the 6770 isn't the only card with that "awesome" powerstate design. so we have to exclude dxva decoding and maybe other dxva parts like deinterlacing if this is crippled on other cards too.
Old 26th January 2014, 16:08   #21952  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
Quote:
Originally Posted by huhn View Post
at least this is impossible with my card: when dxva decoding is used it goes to 500/~800 and stays there even with jinc 8 ar. if i force the high powerstate the player will crash with dxva decoding, and i'm pretty sure the 6770 isn't the only card with that "awesome" powerstate design. so we have to exclude dxva decoding and maybe other dxva parts like deinterlacing if this is crippled on other cards too.
A long-standing issue with AMD cards: as soon as you use DXVA, the card forces itself into the "video" power state, and it won't like anything you do to change that.

My assumption is that it also controls the clock of the decoder, and it crashes if it's used at too high a clock, but that's just guessing.
Might be worth testing whether just using DXVA deint also has that effect, or if it's only decoding.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 26th January 2014, 16:18   #21953  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,436
Question (hopefully an easy one instead of all the bug reports)... If I'm playing a VFR mkv that has 30000/1001fps and 24000/1000fps content, should I see the "movie frame interval" in the OSD update during playback as the frame rate changes? Also, the OSD says "smooth motion is off (settings)". The file starts with 30000/1000fps content so that might be normal, but should I see it turn on when the framerate in the mkv drops to 24000/1000fps? madVR is set to enable smooth motion frame rate conversion (only if there would be motion judder without it...) and when I play back 24000/1000fps cfr files the OSD says it is on.
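
For reference, the frame intervals behind those rates are easy to work out. The Python sketch below uses the rates exactly as written in the question. The `is_integer_multiple` check is only a guess at what "motion judder" could mean on a 60 Hz display; it is a hypothetical helper with an arbitrary tolerance, not madVR's actual rule.

```python
from fractions import Fraction


def frame_interval_ms(fps):
    """Duration of one frame in milliseconds for an exact frame rate."""
    return float(1000 / fps)


def is_integer_multiple(refresh_hz, fps, tolerance=Fraction(1, 100)):
    """Assumed judder test: does each frame span a whole number of refreshes?"""
    ratio = Fraction(refresh_hz) / fps
    return abs(ratio - round(ratio)) <= tolerance


ntsc_30 = Fraction(30000, 1001)
film_24 = Fraction(24000, 1000)

interval_30 = frame_interval_ms(ntsc_30)  # about 33.37 ms
interval_24 = frame_interval_ms(film_24)  # about 41.67 ms

fits_30 = is_integer_multiple(60, ntsc_30)  # True: ~2 refreshes per frame
fits_24 = is_integer_multiple(60, film_24)  # False: 2.5 refreshes per frame
```

Under that assumption the 30000/1001 section fits a 60 Hz display without help, while the 24 fps section does not, which would match the OSD reporting smooth motion as off at the start of the file.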
Old 26th January 2014, 16:22   #21954  |  Link
noee
Registered User
 
Join Date: Jan 2007
Posts: 530
madshi:
Not bad news, but, fwiw, I spoke too soon on the perf note before. I was set at Spline64/AR when I tested .86 and .87(1)(b) for deint. All is good, but just to pass on that .86 and .871b both only allow me to go max Spline64/AR image with 480i material. If I go to Jinc3 on either version, I get drops, so at least perf, in this case, appears to be about the same from .86->.871b for 480i->1080 with deint.

Quote:
Originally Posted by nevcairiel
My assumption is that it also controls the clock of the decoder, and it crashes if it's used at too high a clock, but that's just guessing.
Good guess, it's been an issue as the DPM and VDPAU stuff has been implemented in later linux kernels/drivers. Quite quirky depending on asic and they don't seem to really know the fix or have time for it because it's probably different per asic. Hell, in the last dri pull, they disabled default DPM on Turks, Cayman, Bonaire?. because of this quirkiness. That said, I run the DPM (Turks) but I don't use the UVD and it works great over there.
Old 26th January 2014, 16:32   #21955  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Quote:
Originally Posted by Stereodude View Post
If I'm playing a VFR mkv...
madshi has yet to add any VFR detection code to madVR. It may happen someday, but he said it was rather complex to get right, and he wouldn't want to implement such a thing unless he could make it behave reliably.
Old 26th January 2014, 16:33   #21956  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
Quote:
Originally Posted by Stereodude View Post
Question (hopefully an easy one instead of all the bug reports)... If I'm playing a VFR mkv that has 30000/1001fps and 24000/1000fps content, should I see the "movie frame interval" in the OSD update during playback as the frame rate changes? Also, the OSD says "smooth motion is off (settings)". The file starts with 30000/1000fps content so that might be normal, but should I see it turn on when the framerate in the mkv drops to 24000/1000fps? madVR is set to enable smooth motion frame rate conversion (only if there would be motion judder without it...) and when I play back 24000/1000fps cfr files the OSD says it is on.
i played a lot with vfr files; i suggest using 60 hz and forcing smooth motion.

vfr detection is not very good in madvr and the splitter (most likely thanks to missing information in the file itself)
Old 26th January 2014, 16:57   #21957  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,436
Quote:
Originally Posted by cyberbeing View Post
madshi has yet to add any VFR detection code to madVR. It may happen someday, but he said it was rather complex to get right, and he wouldn't want to implement such a thing unless he could make it behave reliably.
Where does the timecode information that's muxed into the .mkv go? madVR can't access that?
Old 26th January 2014, 16:59   #21958  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
intel hd deint:

86.11 left first 2 and right 87.1b only screen...


long story short 87.1 plus intel = issue

render and deint times are pretty unstable so don't take them too seriously; the gpu usage is very stable.

86.11 high queue got max power state at ~82%
86.11 low queue got max power state at ~79%
87.1b queue doesn't matter max powerstate ~92%

the problems start with upload already. the pc i used here got 1333 mhz ram (so i set it up wrong in the bios... good to know)

Quote:
My assumption is that it also controls the clock of the decoder, and it crashes if it's used at too high a clock, but that's just guessing.
Might be worth testing whether just using DXVA deint also has that effect, or if it's only decoding.
my card only cares about decoding; deint works at max powerstate. but i think it's possible that amd messed this up on some cards too, we are talking about amd here.
Old 26th January 2014, 17:01   #21959  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,923
Quote:
Originally Posted by Stereodude View Post
Where does the timecode information that's muxed into the .mkv go? madVR can't access that?
the timecode info reaches madvr, but smoothmotion doesn't engage on 29p or 23p files
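
As a rough illustration of what detecting a rate change from timestamps involves, here is a naive Python sketch. It is not how madVR or any splitter works: `split_by_frame_interval`, its tolerance and the sample timestamps are all hypothetical, and real files add jitter, dropped frames and mixed cadences that this ignores.

```python
def split_by_frame_interval(timestamps_ms, tolerance_ms=1.0):
    """Group consecutive frames whose spacing stays within tolerance_ms.

    Returns a list of (start_index, interval_ms) tuples, one per run,
    where interval_ms is the first spacing seen in that run.
    """
    segments = []
    for i in range(1, len(timestamps_ms)):
        interval = timestamps_ms[i] - timestamps_ms[i - 1]
        # Start a new run when the spacing departs from the current run's.
        if not segments or abs(interval - segments[-1][1]) > tolerance_ms:
            segments.append((i - 1, interval))
    return segments


# Hypothetical timestamps: ~33.37 ms spacing, then ~41.67 ms spacing.
timestamps = [0.0, 33.37, 66.73, 100.10, 141.77, 183.43, 225.10]
segments = split_by_frame_interval(timestamps)
# Two runs: one starting at frame 0, one starting at frame 3.
```

Even this toy version needs a tolerance to cope with rounded timestamps, which hints at why reliable VFR detection is described above as complex to get right.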
Old 26th January 2014, 17:42   #21960  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
The b test build fixes both the settings delay and the restart/next file problem for me. (Didn't test test build a.)

There has also been an mpc-hc commit about something similar, it seems:
https://github.com/mpc-hc/mpc-hc/com...501f04e3eb59f3

Last edited by sneaker_ger; 26th January 2014 at 17:51.