15th March 2014, 01:14 | #24962
Registered User
Join Date: Jun 2011
Location: London, UK
Posts: 10

Code:
dispcal.exe -v2 -dmadvr -c1 -yn -qh -m "-w0.3127,0.329" -G2.4 -f0 -k0 -A4.0
dispwin.exe -v -dmadvr -c
dispread.exe -v -dmadvr -c1 -yn -K
colprof.exe -v -qh -ax -bl -C -M madVR -D
collink -v -qh -G -iaw -r65 -n -3m -et -Et -IB:2.4 -a
15th March 2014, 03:49 | #24965
Registered User
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,407

If you want to test with the VideoLUTs enabled, use "-H" in collink instead of "-a". VideoLUTs will be forced on if you enable a 3DLUT created with "-H".

edit: Or you could use neither "-a" nor "-H" and be able to enable and disable the VideoLUTs at will. The 3DLUT would only be correct with the VideoLUT enabled, though.

edit2: I just noticed you also use "-K" in dispread; you might get slightly more accurate results with "-k" if you create the 3DLUT without using "-a".

Last edited by Asmodian; 15th March 2014 at 04:10.
15th March 2014, 03:53 | #24966
Registered User
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,407

I understand that Aero forces a v-sync at the composition rate, so you might try the "disable desktop composition" option if you are experiencing issues caused by the composition rate.
15th March 2014, 08:38 | #24967
Registered User
Join Date: Jan 2014
Posts: 216

I've been using 1.5 sharpness, 0.05 clamp, and pattern 3. I'd really like to understand what makes this pattern better, what the optimal settings are in general, and the best way to test them.

Last edited by StinDaWg; 15th March 2014 at 08:41.
15th March 2014, 09:02 | #24969
Registered User
Join Date: Feb 2006
Posts: 1,076
Hello madshi,

I've been testing madVR 87.7 (and the test build before that). For my tests I used my Blu-ray of Howl's Moving Castle and a 10-bit test encode (x264) I made from it. But mainly I used chapter 4 of my Blu-ray of Samsara. That scene provides the best test I know: it goes from the bleak, greyish landscape of Tibet (a contrast/greyscale performance test) to a scene where monks lay out a mandala with pure primary-coloured and black/white sand, in which you can see individual grains of sand (a colour, contrast and detail performance test).

For my tests I chose NNEDI3 with 32 neurons for chroma upsampling, for two reasons:
- As far as I understand, chroma upsampling normally involves doubling (4:2:x to 4:4:x involves a doubling, afaik).
- Based on the screenshots you posted when you presented NNEDI3, my feeling was that it provided the most accurate upsampling of all the scalers, even with the odd artifact.

Based on my tests I found that error diffusion on the low setting gives the best result (ED option 2). I also use chroma dithering with that option, because it lowers luma artifacts, and human eyes are more sensitive to luma than to chroma; by my reasoning, the fewer luma artifacts the better. Setting dithering to 'change every frame' results in a visible layer of slight noise, resembling light film grain.

Using another scaler improves performance notably, but gives slightly less detail and colour accuracy in the Samsara tests. For my use, and to my eyes, combining NNEDI3-32 for chroma upsampling with ED-2 plus chroma dithering gives a very natural-looking image. On my calibrated plasma it creates the impression of 'looking through a very clean window', so I've chosen to stick with these settings. Maybe in the future I can use more neurons for chroma upsampling, but I don't know if that would give even better results.

On my PC I can use pixel shaders in combination with MPC-HC and madVR, and it does seem to work.
But I have no idea what the difference is between adding a shader before or after resizing. For instance, does 'before' mean 'before madVR does its chroma upsampling / magic'? I'd be interested to know.

Last edited by G_M_C; 15th March 2014 at 12:50.
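madVR's error diffusion implementation isn't public, but the general idea the post relies on can be sketched. Below is a minimal 1D simplification (real image dithering like Floyd-Steinberg diffuses error in 2D; the function name and parameters here are hypothetical, for illustration only):

```python
def error_diffuse_row(row, levels=2):
    """Quantize a row of 0..1 values to `levels` output levels,
    pushing each pixel's quantization error onto the next pixel
    (a 1D simplification of error diffusion dithering)."""
    step = 1.0 / (levels - 1)
    out = []
    carry = 0.0  # quantization error carried forward to the next pixel
    for v in row:
        v = v + carry
        q = min(max(round(v / step) * step, 0.0), 1.0)  # nearest level, clamped
        carry = v - q
        out.append(q)
    return out

# A flat 10% grey row dithered to 1-bit: each pixel becomes 0 or 1,
# but the average over the row stays close to the original 0.1
row = error_diffuse_row([0.1] * 100, levels=2)
```

Averaged over a region, the dithered output preserves the original level, which is why dithering down to a lower bit depth can avoid visible banding at the cost of slight noise.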
15th March 2014, 10:28 | #24970
Registered User
Join Date: Apr 2012
Posts: 16
Madshi, can you describe the AMD interop bug? I managed to get in touch with an AMD representative who is eager to fix the issue; I just need some more info. If you can PM me the details or post them here, that would be great!
15th March 2014, 12:11 | #24973
Registered Developer
Join Date: Sep 2006
Posts: 9,140
Oh, interesting. Somewhat strange, though: I have the Intel stuff installed on my PC, too, and it doesn't seem to cause any problems here...

http://madshi.net/OpenClSpeedTest.zip

On my PC with an HD7770 I get the following results:

Code:
D3D9 StretchRect: 1834 fps
D3D9 HLSL PixelShader: 2407 fps
OpenCL copy: 1986 fps
OpenCL kernel: 2117 fps
OpenCL copy interop: 427 fps
OpenCL kernel interop: 424 fps
Error Diffusion OpenCL: 466 fps
Error Diffusion OpenCL interop: 134 fps
Error Diffusion OpenCL interop 2: 192 fps
Error Diffusion DirectCompute: 347 fps
Error Diffusion DirectCompute interop: 316 fps
Error Diffusion DirectCompute interop 2: 297 fps

According to these tests (and my own experience), AMD's OpenCL implementation works well and fast as long as you use only native OpenCL image objects. But as soon as I try to integrate OpenCL into my D3D9 rendering pipeline, speed suffers quite noticeably. A simple 1920x1080 frame copy slows down from ~2000 fps to ~425 fps. Or worse: error diffusion slows down from 466 fps to just 134 fps. 466 fps means one frame takes about 2.1ms to process; 134 fps means one frame takes about 7.5ms. So the interop between OpenCL and Direct3D costs about 5ms per 1080p frame on my PC.

And I think this test is making the cost "look good"; in real-world usage it seems to be rather higher. Some users are reporting interop costs of about 10ms per frame, which is A LOT, considering that for 1080p60 playback each frame must be fully processed and displayed within 16.7ms. And we're talking about 1080p here; I don't even want to think about 4K.

In contrast, I can do DirectCompute interop with my Direct3D9 rendering pipeline without any noticeable interop cost at all, which is how it should be, IMHO. I don't see why there should be any noticeable interop cost. FWIW, neither NVidia's nor Intel's D3D9 <-> OpenCL interop seems to cost much performance; AMD is the only one with a noticeable interop penalty. Another key problem is that the D3D9 <-> OpenCL interop cost does *not* seem to depend on GPU speed: the very fastest and the very slowest AMD GPUs both seem to have interop costs of about 5-10ms per 1080p frame.
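The per-frame arithmetic here is easy to verify (a quick sketch; the inputs are the HD7770 error diffusion figures from the table, not new measurements):

```python
def frame_time_ms(fps):
    """Per-frame processing time in milliseconds for a given throughput."""
    return 1000.0 / fps

# Error diffusion on the HD7770, without and with D3D9 interop
native = frame_time_ms(466)    # ~2.1 ms per 1080p frame
interop = frame_time_ms(134)   # ~7.5 ms per 1080p frame
overhead = interop - native    # ~5.3 ms of pure interop cost

# 1080p60 budget: each frame must be processed and displayed in ~16.7 ms
budget = frame_time_ms(60)
```

Roughly a third of the 1080p60 frame budget goes to interop alone, before any actual rendering work; a reported 10 ms cost would consume more than half of it.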
If this could be fixed, that would be great!

Edit: Here's a user report with interop costs of about 73ms per 1080p frame (!!), with an HD5870: http://forum.doom9.org/showpost.php?p=1673651&postcount=25002

Last edited by madshi; 15th March 2014 at 19:23.
15th March 2014, 12:14 | #24974
Registered User
Join Date: Nov 2012
Posts: 167
Also, VirusTotal is a pretty neat site that lets you upload files to scan with ~50 different antivirus/antimalware apps. Here are the results for the newest madVR: https://www.virustotal.com/en/file/0...is/1394854647/
15th March 2014, 13:27 | #24975
Registered User
Join Date: Feb 2014
Posts: 162

This is on my R9 270x:

Code:
D3D9 StretchRect: 2832 fps
D3D9 HLSL PixelShader: 2133 fps
OpenCL copy: 3050 fps
OpenCL kernel: 3612 fps
OpenCL copy interop: 400 fps
OpenCL kernel interop: 462 fps

The difference is even greater!
15th March 2014, 14:21 | #24977
Registered Developer
Join Date: Sep 2006
Posts: 9,140
It wasn't meant to be used on NVidia GPUs; I wrote it just for AMD, to demonstrate the interop problem. It requires support for the official OpenCL 1.2 D3D9 interop extension, which NVidia doesn't support atm.
15th March 2014, 14:29 | #24978
Registered User
Join Date: Oct 2012
Posts: 179
Ah, sorry about that. I got confused by your mention of the NVidia interop not being badly impacted performance-wise (I thought you must have measured it via the tool, and was curious what the figures would be on my system).
15th March 2014, 15:14 | #24979
Registered User
Join Date: Jan 2014
Posts: 93
Have there been any performance tests with AMD's A10-7850K yet? NNEDI3 is probably out of the question, but what about Jinc resizing? Can the integrated graphics handle that?

I'm considering building such an HTPC for about 400-500€.