Old 15th March 2014, 15:38   #24981  |  Link
noee
Registered User
 
Join Date: Jan 2007
Posts: 530
Quote:
Originally Posted by Ceremony View Post
Have there been performance tests with AMD's A10-7850K yet? NNEDI is probably out of the question, but what about Jinc resizing? Can the integrated graphics chips handle that?
I'm considering building such an HTPC for about 400-500€.
I had an opportunity to test madVR on a Kaveri system, but only for a very limited time, as I messed up the build and had to deliver it under deadline. But yes, Jinc3 AR was fine with the SD material I tested.

Unfortunately, I had no other material to check and I did not verify NNEDI3 doubling.

I am hoping to build another here soon and do a more thorough analysis.
__________________
Win7Ult || RX560/4G || Ryzen 5
Old 15th March 2014, 15:44   #24982  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Werewolfy View Post
I tried API Monitor; I hope I have what you need, as it's the first time I've used this software. I only checked "Audio and Video" and "Graphics and Gaming"; here are the results. I included the capture file and the same content as text files in case you can't read the capture file (like I said, I don't know this software).

https://www.mediafire.com/?25ploj5619ohh50
Did I reply to this one? I think not, but I'm not sure. Just to be safe:

Unfortunately the log doesn't show how Direct3D changes the display mode. Which means that Direct3D does it through an API which API Monitor didn't hook/watch... So back to square one: I can probably only do something about this if I'm able to reproduce it on my PC, which currently I can't.
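
For context, a display mode switch on Windows can be triggered through more than one API surface, so a monitor that hooks only one of them will miss a switch made through another. A minimal sketch (illustration only, not madVR's or the driver's actual code; the helper names are made up):

Code:
// Illustration only: two separate Windows APIs that can change the display
// mode. Hooking one tells you nothing about switches made via the other.
#include <windows.h>
#include <d3d9.h>

// Route 1: the classic GDI mode switch.
void SwitchModeViaGdi(DWORD width, DWORD height)
{
    DEVMODE dm = {};
    dm.dmSize       = sizeof(dm);
    dm.dmPelsWidth  = width;
    dm.dmPelsHeight = height;
    dm.dmFields     = DM_PELSWIDTH | DM_PELSHEIGHT;
    ChangeDisplaySettingsEx(NULL, &dm, NULL, CDS_FULLSCREEN, NULL);
}

// Route 2: Direct3D 9 performs its own mode switch internally when a device
// is reset into fullscreen exclusive mode - no GDI call to hook.
void SwitchModeViaD3D9(IDirect3DDevice9* dev, UINT width, UINT height, HWND hwnd)
{
    D3DPRESENT_PARAMETERS pp = {};
    pp.Windowed         = FALSE;                    // fullscreen exclusive
    pp.BackBufferWidth  = width;
    pp.BackBufferHeight = height;
    pp.BackBufferFormat = D3DFMT_X8R8G8B8;
    pp.SwapEffect       = D3DSWAPEFFECT_DISCARD;
    pp.hDeviceWindow    = hwnd;
    dev->Reset(&pp);                                // driver changes the mode
}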
Old 15th March 2014, 15:55   #24983  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Quote:
Originally Posted by madshi View Post
On my PC with an HD7770 I get the following results:

Code:
D3D9 StretchRect: 1834 fps
D3D9 HLSL PixelShader: 2407 fps

Error Diffusion DirectCompute: 347 fps
Error Diffusion DirectCompute interop: 316 fps
Error Diffusion DirectCompute interop 2: 297 fps
If that DirectCompute error diffusion kernel is anything like the one in madVR, it confirms my previous suspicion that dither compute is definitely much slower on NVIDIA than on AMD. With my GTX770 I get the following results (321.10 driver), after re-compiling the binary with the OpenCL components removed:

Code:
D3D9 StretchRect: 7083 fps
D3D9 HLSL PixelShader: 7765 fps

Error Diffusion DirectCompute: 171 fps
Error Diffusion DirectCompute interop: 166 fps
Error Diffusion DirectCompute interop 2: 164 fps
Old 15th March 2014, 16:21   #24984  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by cyberbeing View Post
If that DirectCompute error diffusion kernel is anything like the one in madVR, it confirms my previous suspicion that dither compute is definitely much slower on NVIDIA than on AMD. With my GTX770 I get the following results (321.10 driver), after re-compiling the binary with the OpenCL components removed:

Code:
D3D9 StretchRect: 7083 fps
D3D9 HLSL PixelShader: 7765 fps

Error Diffusion DirectCompute: 171 fps
Error Diffusion DirectCompute interop: 166 fps
Error Diffusion DirectCompute interop 2: 164 fps
That's quite interesting. The StretchRect/PixelShader results are great! The DirectCompute results not so much. Would be interesting to test the new Maxwell GPUs, maybe they do better at DirectCompute?
Old 15th March 2014, 16:27   #24985  |  Link
6233638
Registered User
 
Join Date: Apr 2009
Posts: 1,019
Quote:
Originally Posted by G_M_C View Post
Mainly I've used chapter 4 of my Blu-ray of Samsara. This scene provides the best test I know. It goes from the bleak greyish landscape of Tibet (a contrast/greyscale test) to a scene where monks lay out a mandala with pure primary-coloured and black/white sand, where you can see individual grains of sand (a colour/contrast/detail test).
...
Setting dithering to 'change every frame' results in a visible layer of slight noise, resembling faint film grain.
I think you're just seeing film grain that's in the source, or something that is a result of your compression. I don't really see any difference if I turn dither on or off, and certainly nothing switching between static/dynamic dither.

I do sometimes notice a slight flicker/pulsing brightness in the smooth areas of the image (e.g. the sky) which always has me questioning whether it's in the source or due to the fact that dither is not linked to the framerate. (it's most likely the source, but now that idea is in my head, I notice it all the time)
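
To illustrate what the 'change dither every frame' option amounts to, here is a minimal CPU sketch using plain random noise (madVR's actual ordered/error-diffusion dithering is far more sophisticated; this only shows why a per-frame pattern can read as faint, moving grain):

Code:
#include <cstdint>
#include <cstddef>
#include <random>

// Quantize a float image (values in [0,1]) to 8 bit with random-noise dither.
// Static dither reuses the same noise pattern every frame; "change every
// frame" reseeds per frame, so the pattern moves and the eye averages it
// away over time.
void DitherFrame(const float* src, uint8_t* dst, size_t count,
                 uint32_t frameIndex, bool changeEveryFrame)
{
    std::mt19937 rng(changeEveryFrame ? frameIndex : 0u);
    std::uniform_real_distribution<float> noise(0.0f, 1.0f);
    for (size_t i = 0; i < count; ++i) {
        float v = src[i] * 255.0f + noise(rng);  // add dither before truncation
        dst[i] = static_cast<uint8_t>(v > 255.0f ? 255.0f : v);
    }
}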
Old 15th March 2014, 16:45   #24986  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
Quote:
Originally Posted by madshi View Post
That's quite interesting. The StretchRect/PixelShader results are great! The DirectCompute results not so much. Would be interesting to test the new Maxwell GPUs, maybe they do better at DirectCompute?
On a GTX 750 (Maxwell):

Code:
D3D9 StretchRect: 2312 fps
D3D9 HLSL PixelShader: 2258 fps
Error Diffusion DirectCompute: 276 fps
Error Diffusion DirectCompute interop: 374 fps
Error Diffusion DirectCompute interop 2: 350 fps
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 15th March 2014, 16:55   #24987  |  Link
sajara
Registered User
 
Join Date: Jan 2013
Posts: 18
Just for the sake of seeing how a low-end GPU, a Radeon 5730M (still beefy enough for everything bar NNEDI), would perform, here are my results:

Quote:
D3D9 StretchRect: 772 fps
D3D9 HLSL PixelShader: 333 fps
OpenCL copy: 820 fps
OpenCL kernel: 700 fps
OpenCL copy interop: 231 fps
OpenCL kernel interop: 231 fps

Error Diffusion OpenCL: 101 fps
Error Diffusion OpenCL interop: 46 fps
Error Diffusion OpenCL interop 2: 74 fps
Error Diffusion DirectCompute: 56 fps
Error Diffusion DirectCompute interop: 54 fps
Error Diffusion DirectCompute interop 2: 52 fps
The ED OpenCL vs. ED DirectCompute results are interesting, but I can vouch that in practical terms OCL ED was ~56% on my GPU and DC ED ~34%, so I don't know how to interpret these numbers.

Last edited by sajara; 15th March 2014 at 17:06.
Old 15th March 2014, 17:00   #24988  |  Link
*Touche*
Registered User
 
Join Date: May 2008
Posts: 84
Quote:
Originally Posted by Ceremony View Post
Have there been performance tests with AMD's A10-7850K yet? NNEDI is probably out of the question, but what about Jinc resizing? Can the integrated graphics chips handle that?
I'm considering building such an HTPC for about 400-500€.
An A10-5800K @ 1000 MHz can do 720p24 -> 1080p with no 'trade quality for performance' options checked, with debanding, Jinc3 AR luma/chroma upscaling, and error diffusion dithering. NNEDI is out of the question.
Old 15th March 2014, 17:05   #24989  |  Link
iSunrise
Registered User
 
Join Date: Dec 2008
Posts: 496
Quote:
Originally Posted by nevcairiel View Post
On a GTX 750 (Maxwell):

D3D9 StretchRect: 2312 fps
D3D9 HLSL PixelShader: 2258 fps
Error Diffusion DirectCompute: 276 fps
Error Diffusion DirectCompute interop: 374 fps
Error Diffusion DirectCompute interop 2: 350 fps
That looks promising. Only about 1/10 fewer fps in the worst case. Definitely a lot better than cyberbeing's results.
Old 15th March 2014, 17:33   #24990  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
Quote:
Originally Posted by iSunrise View Post
That looks promising. Only about 1/10 less FPS at the worst case. Definitely a lot better than cyberbeing's results.
That is assuming it increases as raw performance increases.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 15th March 2014, 17:36   #24991  |  Link
sajara
Registered User
 
Join Date: Jan 2013
Posts: 18
For some reason my first run produced pretty low scores on the D3D9 tests; maybe the card was idling and the power states did not jump up fast enough, I really don't know. But here are the final scores after 4 runs:

Old 15th March 2014, 17:38   #24992  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by nevcairiel View Post
On a GTX 750 (Maxwell):

D3D9 StretchRect: 2312 fps
D3D9 HLSL PixelShader: 2258 fps
Error Diffusion DirectCompute: 276 fps
Error Diffusion DirectCompute interop: 374 fps
Error Diffusion DirectCompute interop 2: 350 fps
Thanks! That looks really promising. Though I wonder why interop seems to speed things up in your case!? That's rather surprising. Is that reproducible every time?

Quote:
Originally Posted by sajara View Post
Just for the sake of seeing how a low-end GPU, a Radeon 5730M (still beefy enough for everything bar NNEDI), would perform, here are my results:

The ED OpenCL vs. ED DirectCompute results are interesting, but I can vouch that in practical terms OCL ED was ~56% on my GPU and DC ED ~34%, so I don't know how to interpret these numbers.
FWIW, I think the error diffusion shaders don't match. Meaning the DC ED is probably a newer algorithm, doing more work, while the OCL ED is the very first version, doing less work. The purpose of the test tool was not to compare OCL ED vs. DC ED, but to compare interop costs in OCL vs. DC. Because of that I didn't invest any time in making sure that the ED algorithms were identical in OCL and DC.
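
For reference, classic Floyd-Steinberg error diffusion on the CPU looks roughly like the sketch below (illustration only, not the OCL or DC kernel from the test tool). It also shows why ED maps poorly to GPUs: every pixel depends on errors diffused from previously processed pixels, which serializes work that a plain pixel shader does in parallel.

Code:
#include <vector>

// CPU reference of Floyd-Steinberg error diffusion, one channel in [0,1],
// quantized to 1 bit for clarity (real dithering targets 8/10 bit).
void FloydSteinberg(std::vector<float>& img, int w, int h)
{
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float old = img[y * w + x];
            float quantized = (old >= 0.5f) ? 1.0f : 0.0f;
            float err = old - quantized;
            img[y * w + x] = quantized;
            // Push the quantization error onto not-yet-processed neighbours.
            if (x + 1 < w)     img[y * w + x + 1]       += err * 7.0f / 16.0f;
            if (y + 1 < h) {
                if (x > 0)     img[(y + 1) * w + x - 1] += err * 3.0f / 16.0f;
                               img[(y + 1) * w + x]     += err * 5.0f / 16.0f;
                if (x + 1 < w) img[(y + 1) * w + x + 1] += err * 1.0f / 16.0f;
            }
        }
    }
}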

Quote:
Originally Posted by nevcairiel View Post
That is assuming it increases as raw performance increases.
I think it should.
Old 15th March 2014, 17:49   #24993  |  Link
kasper93
MPC-HC Developer
 
Join Date: May 2010
Location: Poland
Posts: 586
And my poor HD5870

Code:
---------------------------
speed measurements:
---------------------------
D3D9 StretchRect: 2245.072021 fps
D3D9 HLSL PixelShader: 4834.187012 fps
OpenCL copy: 4395.217773 fps
OpenCL kernel: 3370.635010 fps
OpenCL copy interop: 91.733017 fps
OpenCL kernel interop: 98.520515 fps
Error Diffusion OpenCL: 401.759674 fps
Error Diffusion OpenCL interop: 13.235280 fps
Error Diffusion OpenCL interop 2: 52.057819 fps
Error Diffusion DirectCompute: 228.971298 fps
Error Diffusion DirectCompute interop: 218.831314 fps
Error Diffusion DirectCompute interop 2: 214.933136 fps
The same card without Aero:
Code:
---------------------------
speed measurements:
---------------------------
D3D9 StretchRect: 5446.919922 fps
D3D9 HLSL PixelShader: 5224.660645 fps
OpenCL copy: 4317.416504 fps
OpenCL kernel: 3300.330078 fps
OpenCL copy interop: 91.844238 fps
OpenCL kernel interop: 97.938782 fps
Error Diffusion OpenCL: 401.779083 fps
Error Diffusion OpenCL interop: 13.242332 fps
Error Diffusion OpenCL interop 2: 51.752998 fps
Error Diffusion DirectCompute: 227.469162 fps
Error Diffusion DirectCompute interop: 218.710220 fps
Error Diffusion DirectCompute interop 2: 213.650116 fps
Sadly OpenCL interop is VERRRY slow ;/ It would be really great if they improved that at the driver level.

Last edited by kasper93; 15th March 2014 at 18:09.
Old 15th March 2014, 17:50   #24994  |  Link
Ver Greeneyes
Registered User
 
Join Date: May 2012
Posts: 447
Quote:
Originally Posted by nevcairiel View Post
That is assuming it increases as raw performance increases.
Could be that the maximum fps (on the StretchRect and shader tests) is capped by memory bandwidth or bus speed, whereas the DirectCompute stuff is capped by processing power. In that case the Maxwell 750 is still something like 276 / 171 ≈ 60% faster than the Kepler 770. Of course that's purely speculation. It seems a bit weird that the versions with interop are faster on the Maxwell, though.
Old 15th March 2014, 17:51   #24995  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Quote:
Originally Posted by madshi View Post
Interested? I don't think it would be of help for madVR development because I don't think it's madVR's fault and there's probably nothing I can do about it. But I'd be willing to look into a debug log to help you. Although I think what I'll find will probably be an unexplained period where either the GPU rendered very slowly, or where madVR got no CPU time. Do you have any software running which might be using the GPU? Try closing that. Are there any background processes or services which might stress the CPU every once in a while? Try stopping them, too. Also try giving the media player process a higher priority. Maybe any of that helps?
Hey madshi,

So I did what you suggested and then some: I used Process Monitor to find if any other processes were doing anything while the frame drops happen. I disabled all background processes I could find in the resulting traces. Then I tried disabling as much hardware as possible (unplugging all USB except mouse and keyboard, unplugging network, disabling onboard devices in BIOS, switching SATA controllers, etc.). I switched all power options (GPU and Windows) to maximum performance. I changed MPC-HC process priority to High. None of that makes any difference.

I tried going back to NVidia driver 321.10. I suspect it's slightly better but nevertheless, I still get frame drops.

Then I tried using XPerf, and while I'm still unable to fix the problem, I managed to observe an interesting phenomenon:



It appears that every ~7 seconds, a madVR thread (blue spiky line) is blocked for a significant amount of time (10-15 milliseconds). Stack information indicates that during these spikes the vast majority of the time is spent in GetRasterStatus(). madshi: does that make sense to you? This happens all the time, with both NVidia drivers 321.10 and 335.23. Reading the documentation for GetRasterStatus(), I am confused as to why any code would spend so much time in this seemingly lightweight, non-blocking method, especially with such weird periodicity.

Now this might or might not be related to the frame drops, but I suspect it is because every time a frame drop happens it appears to be time-aligned with one of these spikes. In fact with 335.23 (and maybe with 321.10, I'm not sure) in some (not all) frame drop instances it can get much worse:



The blue line is the madVR thread with the aforementioned spike. The green spike is a massive CPU spin in a kernel thread with a nvlddmkm.sys (NVidia driver) stack trace. It lasts for an enormous amount of time (230 ms in this instance), during which an entire CPU core is hosed and madVR seems to freeze, resulting in a burst of 5+ frame drops. What's interesting there is that the green spike seems to directly follow the blue spike, and the correlation doesn't seem to be a coincidence (the blue spike happens every 7 seconds and the green one follows less than 100 milliseconds after that - or maybe even less due to the sampling resolution). Technically that's probably a NVidia driver bug as it's never supposed to do that but maybe some behavior in madVR is triggering it.

madshi: here are some logs. The "try15" one is with 335.23 and relates to the graph above. The drop happens at timestamp 00439135 in the log. The "try16" one is with 321.10 and the drop happens at timestamp 01562318. In both runs, care has been taken to disable all unnecessary hardware and background processes to make the test as "pure" as possible. All madVR options are set to defaults except for Smooth Motion which is enabled. Sorry for the giant "try16" log - I sometimes have to wait a long time before the issue shows up again.

I really hope you can make sense of this, because I'm *really* out of options now. My only possible next steps are nuclear options like a Windows reinstall or even replacing the GPU or motherboard...

By the way, feature request: it would be nice if madVR could log the current system time in human-readable form every second or so, as that would make it easier to correlate the log entries with traces from external tools like Process Monitor or XPerf.
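
A sketch of what such a feature could look like (assumption: Win32 and plain C stdio logging; this is not actual madVR code and the function name is made up):

Code:
#include <windows.h>
#include <cstdio>

// Emit a wall-clock timestamp into the log at most once per second, so log
// entries can be correlated with Process Monitor / XPerf traces.
void MaybeLogWallClock(FILE* log)
{
    static DWORD lastTick = 0;
    DWORD now = GetTickCount();
    if (now - lastTick >= 1000) {
        lastTick = now;
        SYSTEMTIME st;
        GetLocalTime(&st);
        fprintf(log, "--- system time %04u-%02u-%02u %02u:%02u:%02u.%03u ---\n",
                st.wYear, st.wMonth, st.wDay,
                st.wHour, st.wMinute, st.wSecond, st.wMilliseconds);
    }
}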

Last edited by e-t172; 15th March 2014 at 19:14.
Old 15th March 2014, 17:59   #24996  |  Link
iSunrise
Registered User
 
Join Date: Dec 2008
Posts: 496
Quote:
Originally Posted by nevcairiel View Post
That is assuming it increases as raw performance increases.
Yeah, we'll have to wait and see.

Last edited by iSunrise; 15th March 2014 at 18:05.
Old 15th March 2014, 18:16   #24997  |  Link
markanini
Registered User
 
Join Date: Apr 2006
Posts: 299
HD7770, Catalyst Beta:

Old 15th March 2014, 19:22   #24998  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by kasper93 View Post
And my poor HD5870

Code:
OpenCL kernel: 3351.992920 fps
OpenCL copy interop: 91.797607 fps

Error Diffusion OpenCL: 402.034302 fps
Error Diffusion OpenCL interop: 13.219771 fps
Seriously!?!? This is not funny anymore. 402 fps = 2.5ms rendering time; 13.22 fps = 75.6ms rendering time. So on your PC the interop cost per 1080p frame is 73.1ms. That's insane.
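
The arithmetic, spelled out: per-frame cost in milliseconds is 1000/fps, and the interop overhead is the difference between the two variants.

Code:
#include <cstdio>

int main()
{
    const double edFps        = 402.0;  // Error Diffusion OpenCL
    const double edInteropFps = 13.22;  // Error Diffusion OpenCL interop
    const double edMs        = 1000.0 / edFps;         // ~2.5 ms per frame
    const double edInteropMs = 1000.0 / edInteropFps;  // ~75.6 ms per frame
    // Interop overhead per 1080p frame: ~73.1 ms.
    printf("interop cost: %.1f ms per frame\n", edInteropMs - edMs);
    return 0;
}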

Quote:
Originally Posted by e-t172 View Post
It appears that every ~7 seconds, a madVR thread (blue spiky line) is blocked for a significant amount of time (10-15 milliseconds). Stack information indicates that during these spikes most of the time is spent in GetRasterStatus(). madshi: does that make sense to you? This happens all the time, with both NVidia drivers 321.10 and 335.23. Reading the documentation for GetRasterStatus(), I am confused as to why any code would spend so much time in this seemingly lightweight, non-blocking method, especially with such weird periodicity.
Yes, it does make sense. madVR calls GetRasterStatus() all the time to decide at which exact moment to present frames. Especially in windowed and overlay modes, the information collected from GetRasterStatus() is extremely important. In FSE mode it's a bit less important, but it still matters.

I agree with you, there's no possible reason why a call to GetRasterStatus() would block for any serious amount of time. Furthermore, madVR intentionally uses a separate D3D9 device to call GetRasterStatus() to avoid any conflicts with the rendering and presentation devices! So there's basically one D3D9 device which is used for nothing else but calling GetRasterStatus() once every millisecond (Sleep(1)).
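
The rough shape of what's described above, as a sketch (not madVR source; the names are made up):

Code:
#include <windows.h>
#include <d3d9.h>

// A dedicated D3D9 device whose only job is to poll the raster position
// about once per millisecond, feeding the presentation scheduler.
void RasterPollLoop(IDirect3DDevice9* pollDevice, volatile bool& quit)
{
    while (!quit) {
        D3DRASTER_STATUS rs = {};
        // Normally this returns immediately; the report above suggests it
        // can occasionally block for 10-15 ms inside the driver.
        if (SUCCEEDED(pollDevice->GetRasterStatus(0, &rs))) {
            // rs.InVBlank / rs.ScanLine indicate where the raster beam is,
            // i.e. how close the next vsync is.
        }
        Sleep(1);  // poll roughly once per millisecond
    }
}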

Such a block for 10-15ms could cause problems in windowed/overlay mode, but I believe FSE mode should not even notice that, unless the driver internally blocks, too, and fails to flip to an already planned video frame via VSync hardware interrupt. That would then be one situation which would end up as a presentation glitch.

Quote:
Originally Posted by e-t172 View Post
The blue line is the madVR thread with the aforementioned spike. The green spike is a massive CPU spin in a kernel thread with a nvlddmkm.sys (NVidia driver) stack trace. It lasts for an enormous amount of time (230 ms in this instance), during which an entire CPU core is hosed and madVR seems to freeze, resulting in a burst of 5+ frame drops. What's interesting there is that the green spike seems to directly follow the blue spike, and the correlation doesn't seem to be a coincidence (the blue spike happens every 7 seconds and the green one follows less than 100 milliseconds after that - or maybe even less due to the sampling resolution). Technically that's probably a NVidia driver bug as it's never supposed to do that but maybe some behavior in madVR is triggering it.
I can't say if madVR is triggering it, at least I wouldn't know which madVR behaviour would/could be responsible for that. But one thing is probably important to mention: There are quite a lot of madVR users, many of them posting regularly in this thread, and I think the majority of them have NVidia GPUs, but it seems you're the only one who has this specific problem (at least that I'm aware of). That indicates that there is probably something special about your PC. Have you tried updating the GPU BIOS? Not sure if that would help, but it might be worth a try. It could also be a weird GPU hardware problem. I don't really know. Have you tried reinstalling the OS (ouch!)? Not sure what else I could suggest... The easiest test might be to get another (cheap?) NVidia GPU just to double check whether replacing the GPU would solve the problem or not. Or you could try plugging your GPU into a different PCIe slot, if your mainboard has more than one.

Quote:
Originally Posted by e-t172 View Post
here are some logs. The "try15" one is with 335.23 and relates to the graph above. The drop happens at timestamp 00439135 in the log. The "try16" one is with 321.10 and the drop happens at timestamp 01562318. In both runs, care has been taken to disable all unnecessary hardware and background processes to make the test as "pure" as possible. All madVR options are set to defaults except for Smooth Motion which is enabled. Sorry for the giant "try16" log - I sometimes have to wait a long time before the issue shows up again.
I appreciate the time you took to create clean logs etc, but unfortunately driver problems like this (threads getting stuck in NVidia drivers) are not really an ideal candidate for logs. The logs help mostly when the problem is inside madVR. I've had a quick look at them, and the only thing I can see is that shortly before the frame drops occur there's nothing in the logs for e.g. 200ms or so, which is a dramatically long time. And I can't see from the logs *why* nothing is happening; it just looks like madVR isn't getting any CPU time, or something like that...

Quote:
Originally Posted by e-t172 View Post
I really hope you can make sense of this, because I'm *really* out of options now. My only possible next steps are nuclear options like a Windows reinstall or even replacing the GPU or motherboard...
I'm not sure if there are other options than that, to be honest... Do you have an old GPU lying around somewhere? That would be the easiest and quickest way to double check if replacing the GPU would help.

Quote:
Originally Posted by e-t172 View Post
By the way, feature request: it would be nice if madVR could log the current system time in human-readable form every second or so, as it makes it easier to correlate the entries in the log with external tools like Process Monitor or XPerf.
True. Though usually users don't really know how to interpret logs, and I personally don't have much use for such timestamps from users' PCs because I've never yet received logs from other tools from a user.
Old 15th March 2014, 19:25   #24999  |  Link
XMonarchY
Guest
 
Posts: n/a
madshi, I uninstalled madVR using uninstall.bat from ALL locations, then did the same with restore default settings.bat, then I removed all the files from all locations, ran several file cleaners and registry cleaners, restarted, re-downloaded the latest madVR, ran restore default settings.bat again, opened madTPG and the screen turned gray again (not due to 16-235 range)...

Here is what happens when I open madTPG.exe. At first Windows shows me the Security Warning. I press RUN. Then madTPG appears and it has a valid black screen. Then I get the madHcCtrl.exe Security Warning. As soon as I press RUN, the madTPG screen goes from black to gray and then I get the Windows Access problem... so I presume it is something in madHcCtrl.exe (not saying it's madVR's issue), a setting or something... yet everything is at default, no old settings used. I really have no idea... Maybe the latest 335 NVidia drivers did this...
Old 15th March 2014, 19:29   #25000  |  Link
Eiffel
Registered User
 
Join Date: Jun 2011
Location: London, UK
Posts: 10
Quote:
Originally Posted by madshi View Post
Ok, but you didn't answer any of the questions I asked you. Sorry, can't help you if you ignore my questions.
I presume you were looking for answers to the following questions:

Quote:
Originally Posted by madshi View Post
You mean you can't "unpress" both buttons at the same time?
Yes, I cannot do that with the recent builds.

Quote:
Originally Posted by madshi View Post
Works just fine on my PC. Are you talking about madTPG running on its own? Or while ArgyllCMS and/or HCFR are doing measurements through madTPG?
This issue occurs with madTPG running on its own and with HCFR (I'm not sure how to test while ArgyllCMS is doing automated measurements)...

Quote:
Originally Posted by madshi View Post
In the moment when ArgyllCMS/HCFR take control of madTPG, they have the "power" to enforce these buttons to be pressed. However, as long as no other software is remote controlling madTPG, it should be possible to unpress both of those buttons at the same time.
While not a question: I can't unpress both of these buttons (at least one of the buttons will be blue no matter what I try). I suspect it has to do with the logic controlling the buttons... when 'disable VideoLUTs' is grey (i.e. the VideoLUT is enabled) and 'disable 3dlut' is blue (i.e. disabled), pressing the 'disable 3dlut' button toggles the state of both buttons.

Tags
direct compute, dithering, error diffusion, madvr, ngu, nnedi3, quality, renderer, scaling, uhd upscaling, upsampling

