MPC-HC tester builds for internal renderer fixes


ts1
10th October 2015, 11:18
No, I switched to lanczos3 recently.

JanWillem32
14th October 2015, 21:23
I have been debugging a bit and I would like to verify if the VMR9 (renderless) with default settings + X8R8G8B8 (a.k.a. RGB32) input bug reported by ts1 happens with all vendors and drivers. I suspect it's a driver bug. I can verify that the bug is there with my AMD HD7970. If you own an Nvidia or Intel GPU, please test VMR9 (renderless) with default settings with a sample with nothing but X8R8G8B8 (a.k.a. RGB32) temporarily enabled in the filter output of your decoder of choice (both ffdshow tryouts and LAV filters support this on their video output configuration tabs). If the bug is there in multiple configurations I'll have to deal with a bug in the VMR-9, and try to find a solution for it.

JanWillem32
15th October 2015, 21:15
I've renewed the intermediate tester builds.
The VMR-9 bug reported by ts1 is solved (in a very odd way, but it works).
I've written a new dithering method. It's not yet integrated, but volunteers can test it.
To test it, enable the renderer in a sort of debug mode:
- enable 32-bit floating-point surfaces
- disable initial color mixing shaders
- under color management, enable color passthrough mode
- disable the internal ditherer (set levels to 0)
- disable pre-resize shaders
- enable the "block-based error diffusion dithering" shader post-resize as the only shader
- set the zoom factor to 100%
The renderer will then render nothing more than that shader in its pipeline. (Forgive the lack of chroma up-sampling.)
If there is enough interest after users have tested this dithering method, I can add this shader as an internal option for the renderer.

XRyche
16th October 2015, 00:49
It looks better on my old circa 2007 TN monitor than the coloured dithering. I can't pick out any dithering pattern at all, and I was sitting right at my monitor. At higher levels of colour dithering I can start seeing the dithering pattern. Considering it's a 6-bit TN monitor, I think this does make a difference.

I would really like to know what others think as well. I've been known to fall prey to the "placebo effect" before. Honestly though, I don't think I'm just seeing an improvement because I want to. Thanks for making this available for testing JanWillem32.

I did disable frame interpolation as well.

ts1
18th October 2015, 10:48
Do you plan to upgrade to 1.7? 1.6 unfortunately doesn't support stdin.

JanWillem32
19th October 2015, 17:40
ts1, I'll have to discuss some things with the MPC-HC staff.
If I were to merge the renderer fixes onto the latest version of the code I could work miracles. However, it will come at the cost of VMR-7 (renderless) and EVR Sync. If people are willing to accept that, I can finally merge the renderer fixes into the main branch.

New shaders (note that the 5×5 version will not work in the current build, but will in the next):

// (C) 2015 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// 5×5 block-based error diffusion dithering
// This shader can be run as a screen space pixel shader.
// This shader requires compiling with ps_2_b, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// This shader will dither down an image using block-based error diffusion.

#define sa(A, B, C) float3 A = tex2D(s0, tex+float2(B, C)*c1).rgb;
#define Quantization 255.

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float2 n = frac(tex*c0*.2);
tex -= (n*5.-.5)*c1;

// original pixels
sa(s1, 0., 0.) sa(s2, 1., 0.) sa(s3, 2., 0.) sa(s4, 3., 0.) sa(s5, 4., 0.)
sa(s16, 0., 1.) sa(s17, 1., 1.) sa(s18, 2., 1.) sa(s19, 3., 1.) sa(s6, 4., 1.)
sa(s15, 0., 2.) sa(s24, 1., 2.) sa(s25, 2., 2.) sa(s20, 3., 2.) sa(s7, 4., 2.)
sa(s14, 0., 3.) sa(s23, 1., 3.) sa(s22, 2., 3.) sa(s21, 3., 3.) sa(s8, 4., 3.)
sa(s13, 0., 4.) sa(s12, 1., 4.) sa(s11, 2., 4.) sa(s10, 3., 4.) sa(s9, 4., 4.)

float3 f1 = frac(s1*Quantization);
float3 f2 = frac(s2*Quantization);
float3 f3 = frac(s3*Quantization);
float3 f4 = frac(s4*Quantization);
float3 f5 = frac(s5*Quantization);
float3 f6 = frac(s6*Quantization);
float3 f7 = frac(s7*Quantization);
float3 f8 = frac(s8*Quantization);
float3 f9 = frac(s9*Quantization);
float3 f10 = frac(s10*Quantization);
float3 f11 = frac(s11*Quantization);
float3 f12 = frac(s12*Quantization);
float3 f13 = frac(s13*Quantization);
float3 f14 = frac(s14*Quantization);
float3 f15 = frac(s15*Quantization);
float3 f16 = frac(s16*Quantization);
float3 f17 = frac(s17*Quantization);
float3 f18 = frac(s18*Quantization);
float3 f19 = frac(s19*Quantization);
float3 f20 = frac(s20*Quantization);
float3 f21 = frac(s21*Quantization);
float3 f22 = frac(s22*Quantization);
float3 f23 = frac(s23*Quantization);
float3 f24 = frac(s24*Quantization);
float3 f25 = frac(s25*Quantization);

float3 current;
float3 mf1 = -f1;
float3 uf1 = 1.-f1;
[flatten] if(f1.r > .5) f1.r = uf1.r;
else f1.r = mf1.r;
[flatten] if(f1.g > .5) f1.g = uf1.g;
else f1.g = mf1.g;
[flatten] if(f1.b > .5) f1.b = uf1.b;
else f1.b = mf1.b;
s1 += f1/Quantization;
current = -f1;

float3 cu2 = current+f2;
float3 mf2 = -f2;
float3 uf2 = 1.-f2;
[flatten] if(cu2.r > .5) f2.r = uf2.r;
else f2.r = mf2.r;
[flatten] if(cu2.g > .5) f2.g = uf2.g;
else f2.g = mf2.g;
[flatten] if(cu2.b > .5) f2.b = uf2.b;
else f2.b = mf2.b;
s2 += f2/Quantization;
current -= f2;

float3 cu3 = current+f3;
float3 mf3 = -f3;
float3 uf3 = 1.-f3;
[flatten] if(cu3.r > .5) f3.r = uf3.r;
else f3.r = mf3.r;
[flatten] if(cu3.g > .5) f3.g = uf3.g;
else f3.g = mf3.g;
[flatten] if(cu3.b > .5) f3.b = uf3.b;
else f3.b = mf3.b;
s3 += f3/Quantization;
current -= f3;

float3 cu4 = current+f4;
float3 mf4 = -f4;
float3 uf4 = 1.-f4;
[flatten] if(cu4.r > .5) f4.r = uf4.r;
else f4.r = mf4.r;
[flatten] if(cu4.g > .5) f4.g = uf4.g;
else f4.g = mf4.g;
[flatten] if(cu4.b > .5) f4.b = uf4.b;
else f4.b = mf4.b;
s4 += f4/Quantization;
current -= f4;

float3 cu5 = current+f5;
float3 mf5 = -f5;
float3 uf5 = 1.-f5;
[flatten] if(cu5.r > .5) f5.r = uf5.r;
else f5.r = mf5.r;
[flatten] if(cu5.g > .5) f5.g = uf5.g;
else f5.g = mf5.g;
[flatten] if(cu5.b > .5) f5.b = uf5.b;
else f5.b = mf5.b;
s5 += f5/Quantization;
current -= f5;

float3 cu6 = current+f6;
float3 mf6 = -f6;
float3 uf6 = 1.-f6;
[flatten] if(cu6.r > .5) f6.r = uf6.r;
else f6.r = mf6.r;
[flatten] if(cu6.g > .5) f6.g = uf6.g;
else f6.g = mf6.g;
[flatten] if(cu6.b > .5) f6.b = uf6.b;
else f6.b = mf6.b;
s6 += f6/Quantization;
current -= f6;

float3 cu7 = current+f7;
float3 mf7 = -f7;
float3 uf7 = 1.-f7;
[flatten] if(cu7.r > .5) f7.r = uf7.r;
else f7.r = mf7.r;
[flatten] if(cu7.g > .5) f7.g = uf7.g;
else f7.g = mf7.g;
[flatten] if(cu7.b > .5) f7.b = uf7.b;
else f7.b = mf7.b;
s7 += f7/Quantization;
current -= f7;

float3 cu8 = current+f8;
float3 mf8 = -f8;
float3 uf8 = 1.-f8;
[flatten] if(cu8.r > .5) f8.r = uf8.r;
else f8.r = mf8.r;
[flatten] if(cu8.g > .5) f8.g = uf8.g;
else f8.g = mf8.g;
[flatten] if(cu8.b > .5) f8.b = uf8.b;
else f8.b = mf8.b;
s8 += f8/Quantization;
current -= f8;

float3 cu9 = current+f9;
float3 mf9 = -f9;
float3 uf9 = 1.-f9;
[flatten] if(cu9.r > .5) f9.r = uf9.r;
else f9.r = mf9.r;
[flatten] if(cu9.g > .5) f9.g = uf9.g;
else f9.g = mf9.g;
[flatten] if(cu9.b > .5) f9.b = uf9.b;
else f9.b = mf9.b;
s9 += f9/Quantization;
current -= f9;

float3 cu10 = current+f10;
float3 mf10 = -f10;
float3 uf10 = 1.-f10;
[flatten] if(cu10.r > .5) f10.r = uf10.r;
else f10.r = mf10.r;
[flatten] if(cu10.g > .5) f10.g = uf10.g;
else f10.g = mf10.g;
[flatten] if(cu10.b > .5) f10.b = uf10.b;
else f10.b = mf10.b;
s10 += f10/Quantization;
current -= f10;

float3 cu11 = current+f11;
float3 mf11 = -f11;
float3 uf11 = 1.-f11;
[flatten] if(cu11.r > .5) f11.r = uf11.r;
else f11.r = mf11.r;
[flatten] if(cu11.g > .5) f11.g = uf11.g;
else f11.g = mf11.g;
[flatten] if(cu11.b > .5) f11.b = uf11.b;
else f11.b = mf11.b;
s11 += f11/Quantization;
current -= f11;

float3 cu12 = current+f12;
float3 mf12 = -f12;
float3 uf12 = 1.-f12;
[flatten] if(cu12.r > .5) f12.r = uf12.r;
else f12.r = mf12.r;
[flatten] if(cu12.g > .5) f12.g = uf12.g;
else f12.g = mf12.g;
[flatten] if(cu12.b > .5) f12.b = uf12.b;
else f12.b = mf12.b;
s12 += f12/Quantization;
current -= f12;

float3 cu13 = current+f13;
float3 mf13 = -f13;
float3 uf13 = 1.-f13;
[flatten] if(cu13.r > .5) f13.r = uf13.r;
else f13.r = mf13.r;
[flatten] if(cu13.g > .5) f13.g = uf13.g;
else f13.g = mf13.g;
[flatten] if(cu13.b > .5) f13.b = uf13.b;
else f13.b = mf13.b;
s13 += f13/Quantization;
current -= f13;

float3 cu14 = current+f14;
float3 mf14 = -f14;
float3 uf14 = 1.-f14;
[flatten] if(cu14.r > .5) f14.r = uf14.r;
else f14.r = mf14.r;
[flatten] if(cu14.g > .5) f14.g = uf14.g;
else f14.g = mf14.g;
[flatten] if(cu14.b > .5) f14.b = uf14.b;
else f14.b = mf14.b;
s14 += f14/Quantization;
current -= f14;

float3 cu15 = current+f15;
float3 mf15 = -f15;
float3 uf15 = 1.-f15;
[flatten] if(cu15.r > .5) f15.r = uf15.r;
else f15.r = mf15.r;
[flatten] if(cu15.g > .5) f15.g = uf15.g;
else f15.g = mf15.g;
[flatten] if(cu15.b > .5) f15.b = uf15.b;
else f15.b = mf15.b;
s15 += f15/Quantization;
current -= f15;

float3 cu16 = current+f16;
float3 mf16 = -f16;
float3 uf16 = 1.-f16;
[flatten] if(cu16.r > .5) f16.r = uf16.r;
else f16.r = mf16.r;
[flatten] if(cu16.g > .5) f16.g = uf16.g;
else f16.g = mf16.g;
[flatten] if(cu16.b > .5) f16.b = uf16.b;
else f16.b = mf16.b;
s16 += f16/Quantization;
current -= f16;

float3 cu17 = current+f17;
float3 mf17 = -f17;
float3 uf17 = 1.-f17;
[flatten] if(cu17.r > .5) f17.r = uf17.r;
else f17.r = mf17.r;
[flatten] if(cu17.g > .5) f17.g = uf17.g;
else f17.g = mf17.g;
[flatten] if(cu17.b > .5) f17.b = uf17.b;
else f17.b = mf17.b;
s17 += f17/Quantization;
current -= f17;

float3 cu18 = current+f18;
float3 mf18 = -f18;
float3 uf18 = 1.-f18;
[flatten] if(cu18.r > .5) f18.r = uf18.r;
else f18.r = mf18.r;
[flatten] if(cu18.g > .5) f18.g = uf18.g;
else f18.g = mf18.g;
[flatten] if(cu18.b > .5) f18.b = uf18.b;
else f18.b = mf18.b;
s18 += f18/Quantization;
current -= f18;

float3 cu19 = current+f19;
float3 mf19 = -f19;
float3 uf19 = 1.-f19;
[flatten] if(cu19.r > .5) f19.r = uf19.r;
else f19.r = mf19.r;
[flatten] if(cu19.g > .5) f19.g = uf19.g;
else f19.g = mf19.g;
[flatten] if(cu19.b > .5) f19.b = uf19.b;
else f19.b = mf19.b;
s19 += f19/Quantization;
current -= f19;

float3 cu20 = current+f20;
float3 mf20 = -f20;
float3 uf20 = 1.-f20;
[flatten] if(cu20.r > .5) f20.r = uf20.r;
else f20.r = mf20.r;
[flatten] if(cu20.g > .5) f20.g = uf20.g;
else f20.g = mf20.g;
[flatten] if(cu20.b > .5) f20.b = uf20.b;
else f20.b = mf20.b;
s20 += f20/Quantization;
current -= f20;

float3 cu21 = current+f21;
float3 mf21 = -f21;
float3 uf21 = 1.-f21;
[flatten] if(cu21.r > .5) f21.r = uf21.r;
else f21.r = mf21.r;
[flatten] if(cu21.g > .5) f21.g = uf21.g;
else f21.g = mf21.g;
[flatten] if(cu21.b > .5) f21.b = uf21.b;
else f21.b = mf21.b;
s21 += f21/Quantization;
current -= f21;

float3 cu22 = current+f22;
float3 mf22 = -f22;
float3 uf22 = 1.-f22;
[flatten] if(cu22.r > .5) f22.r = uf22.r;
else f22.r = mf22.r;
[flatten] if(cu22.g > .5) f22.g = uf22.g;
else f22.g = mf22.g;
[flatten] if(cu22.b > .5) f22.b = uf22.b;
else f22.b = mf22.b;
s22 += f22/Quantization;
current -= f22;

float3 cu23 = current+f23;
float3 mf23 = -f23;
float3 uf23 = 1.-f23;
[flatten] if(cu23.r > .5) f23.r = uf23.r;
else f23.r = mf23.r;
[flatten] if(cu23.g > .5) f23.g = uf23.g;
else f23.g = mf23.g;
[flatten] if(cu23.b > .5) f23.b = uf23.b;
else f23.b = mf23.b;
s23 += f23/Quantization;
current -= f23;

float3 cu24 = current+f24;
float3 mf24 = -f24;
float3 uf24 = 1.-f24;
[flatten] if(cu24.r > .5) f24.r = uf24.r;
else f24.r = mf24.r;
[flatten] if(cu24.g > .5) f24.g = uf24.g;
else f24.g = mf24.g;
[flatten] if(cu24.b > .5) f24.b = uf24.b;
else f24.b = mf24.b;
s24 += f24/Quantization;
current -= f24;

float3 cu25 = current+f25;
float3 mf25 = -f25;
float3 uf25 = 1.-f25;
[flatten] if(cu25.r > .5) f25.r = uf25.r;
else f25.r = mf25.r;
[flatten] if(cu25.g > .5) f25.g = uf25.g;
else f25.g = mf25.g;
[flatten] if(cu25.b > .5) f25.b = uf25.b;
else f25.b = mf25.b;
s25 += f25/Quantization;
//current -= f25;

[flatten] if(n.y > .8) {
[flatten] if(n.x > .8) return s9.rgbb;
else [flatten] if(n.x > .6) return s10.rgbb;
else [flatten] if(n.x > .4) return s11.rgbb;
else [flatten] if(n.x > .2) return s12.rgbb;
else return s13.rgbb;}
else [flatten] if(n.y > .6) {
[flatten] if(n.x > .8) return s8.rgbb;
else [flatten] if(n.x > .6) return s21.rgbb;
else [flatten] if(n.x > .4) return s22.rgbb;
else [flatten] if(n.x > .2) return s23.rgbb;
else return s14.rgbb;}
else [flatten] if(n.y > .4) {
[flatten] if(n.x > .8) return s7.rgbb;
else [flatten] if(n.x > .6) return s20.rgbb;
else [flatten] if(n.x > .4) return s25.rgbb;
else [flatten] if(n.x > .2) return s24.rgbb;
else return s15.rgbb;}
else [flatten] if(n.y > .2) {
[flatten] if(n.x > .8) return s6.rgbb;
else [flatten] if(n.x > .6) return s19.rgbb;
else [flatten] if(n.x > .4) return s18.rgbb;
else [flatten] if(n.x > .2) return s17.rgbb;
else return s16.rgbb;}
else {
[flatten] if(n.x > .8) return s5.rgbb;
else [flatten] if(n.x > .6) return s4.rgbb;
else [flatten] if(n.x > .4) return s3.rgbb;
else [flatten] if(n.x > .2) return s2.rgbb;
else return s1.rgbb;}
}
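
For anyone who wants to follow the logic without reading the unrolled code, here is a rough CPU-side model of the 5×5 shader above in Python, for a single channel. It is only a sketch: `spiral_order` and `dither_block` are names made up for this illustration, and it assumes the block is walked in the same clockwise spiral as s1 to s25, with the rounding error carried from one pixel to the next and reset for every block.

```python
import math


def spiral_order(n):
    """Return (x, y) block coordinates in clockwise spiral order, starting top-left."""
    left, top, right, bottom = 0, 0, n - 1, n - 1
    order = []
    while left <= right and top <= bottom:
        for x in range(left, right + 1):          # top row, left to right
            order.append((x, top))
        for y in range(top + 1, bottom + 1):      # right column, downwards
            order.append((right, y))
        if top < bottom:
            for x in range(right - 1, left - 1, -1):  # bottom row, right to left
                order.append((x, bottom))
        if left < right:
            for y in range(bottom - 1, top, -1):      # left column, upwards
                order.append((left, y))
        left, top, right, bottom = left + 1, top + 1, right - 1, bottom - 1
    return order


def dither_block(block, levels=255.0):
    """Quantize one square block (rows of floats in 0..1) to multiples of 1/levels.

    The error is diffused along the spiral path only, like the shader's
    'current' variable, and is not passed on to neighbouring blocks.
    """
    out = [row[:] for row in block]
    carried = 0.0  # accumulated error, in quantization steps
    for x, y in spiral_order(len(block)):
        scaled = block[y][x] * levels
        f = scaled - math.floor(scaled)            # frac(s*Quantization)
        # Round up when the fraction plus the carried error exceeds half a step.
        adjust = 1.0 - f if carried + f > 0.5 else -f
        out[y][x] = (scaled + adjust) / levels
        carried -= adjust
    return out
```

In this model the carried error always stays between -0.5 and 0.5 quantization steps, so the sum over a block ends up within half a step of the original sum.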

JanWillem32
19th October 2015, 17:42
// (C) 2015 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// 4×4 block-based error diffusion dithering
// This shader can be run as a screen space pixel shader.
// This shader requires compiling with ps_2_a, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// This shader will dither down an image using block-based error diffusion.

#define sa(A, B, C) float3 A = tex2D(s0, tex+float2(B, C)*c1).rgb;
#define Quantization 255.

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float2 n = frac(tex*c0*.25);
tex -= (n*4.-.5)*c1;

// original pixels
sa(s1, 0., 0.) sa(s2, 1., 0.) sa(s3, 2., 0.) sa(s4, 3., 0.)
sa(s12, 0., 1.) sa(s13, 1., 1.) sa(s14, 2., 1.) sa(s5, 3., 1.)
sa(s11, 0., 2.) sa(s16, 1., 2.) sa(s15, 2., 2.) sa(s6, 3., 2.)
sa(s10, 0., 3.) sa(s9, 1., 3.) sa(s8, 2., 3.) sa(s7, 3., 3.)

float3 f1 = frac(s1*Quantization);
float3 f2 = frac(s2*Quantization);
float3 f3 = frac(s3*Quantization);
float3 f4 = frac(s4*Quantization);
float3 f5 = frac(s5*Quantization);
float3 f6 = frac(s6*Quantization);
float3 f7 = frac(s7*Quantization);
float3 f8 = frac(s8*Quantization);
float3 f9 = frac(s9*Quantization);
float3 f10 = frac(s10*Quantization);
float3 f11 = frac(s11*Quantization);
float3 f12 = frac(s12*Quantization);
float3 f13 = frac(s13*Quantization);
float3 f14 = frac(s14*Quantization);
float3 f15 = frac(s15*Quantization);
float3 f16 = frac(s16*Quantization);

float3 current;
float3 mf1 = -f1;
float3 uf1 = 1.-f1;
[flatten] if(f1.r > .5) f1.r = uf1.r;
else f1.r = mf1.r;
[flatten] if(f1.g > .5) f1.g = uf1.g;
else f1.g = mf1.g;
[flatten] if(f1.b > .5) f1.b = uf1.b;
else f1.b = mf1.b;
s1 += f1/Quantization;
current = -f1;

float3 cu2 = current+f2;
float3 mf2 = -f2;
float3 uf2 = 1.-f2;
[flatten] if(cu2.r > .5) f2.r = uf2.r;
else f2.r = mf2.r;
[flatten] if(cu2.g > .5) f2.g = uf2.g;
else f2.g = mf2.g;
[flatten] if(cu2.b > .5) f2.b = uf2.b;
else f2.b = mf2.b;
s2 += f2/Quantization;
current -= f2;

float3 cu3 = current+f3;
float3 mf3 = -f3;
float3 uf3 = 1.-f3;
[flatten] if(cu3.r > .5) f3.r = uf3.r;
else f3.r = mf3.r;
[flatten] if(cu3.g > .5) f3.g = uf3.g;
else f3.g = mf3.g;
[flatten] if(cu3.b > .5) f3.b = uf3.b;
else f3.b = mf3.b;
s3 += f3/Quantization;
current -= f3;

float3 cu4 = current+f4;
float3 mf4 = -f4;
float3 uf4 = 1.-f4;
[flatten] if(cu4.r > .5) f4.r = uf4.r;
else f4.r = mf4.r;
[flatten] if(cu4.g > .5) f4.g = uf4.g;
else f4.g = mf4.g;
[flatten] if(cu4.b > .5) f4.b = uf4.b;
else f4.b = mf4.b;
s4 += f4/Quantization;
current -= f4;

float3 cu5 = current+f5;
float3 mf5 = -f5;
float3 uf5 = 1.-f5;
[flatten] if(cu5.r > .5) f5.r = uf5.r;
else f5.r = mf5.r;
[flatten] if(cu5.g > .5) f5.g = uf5.g;
else f5.g = mf5.g;
[flatten] if(cu5.b > .5) f5.b = uf5.b;
else f5.b = mf5.b;
s5 += f5/Quantization;
current -= f5;

float3 cu6 = current+f6;
float3 mf6 = -f6;
float3 uf6 = 1.-f6;
[flatten] if(cu6.r > .5) f6.r = uf6.r;
else f6.r = mf6.r;
[flatten] if(cu6.g > .5) f6.g = uf6.g;
else f6.g = mf6.g;
[flatten] if(cu6.b > .5) f6.b = uf6.b;
else f6.b = mf6.b;
s6 += f6/Quantization;
current -= f6;

float3 cu7 = current+f7;
float3 mf7 = -f7;
float3 uf7 = 1.-f7;
[flatten] if(cu7.r > .5) f7.r = uf7.r;
else f7.r = mf7.r;
[flatten] if(cu7.g > .5) f7.g = uf7.g;
else f7.g = mf7.g;
[flatten] if(cu7.b > .5) f7.b = uf7.b;
else f7.b = mf7.b;
s7 += f7/Quantization;
current -= f7;

float3 cu8 = current+f8;
float3 mf8 = -f8;
float3 uf8 = 1.-f8;
[flatten] if(cu8.r > .5) f8.r = uf8.r;
else f8.r = mf8.r;
[flatten] if(cu8.g > .5) f8.g = uf8.g;
else f8.g = mf8.g;
[flatten] if(cu8.b > .5) f8.b = uf8.b;
else f8.b = mf8.b;
s8 += f8/Quantization;
current -= f8;

float3 cu9 = current+f9;
float3 mf9 = -f9;
float3 uf9 = 1.-f9;
[flatten] if(cu9.r > .5) f9.r = uf9.r;
else f9.r = mf9.r;
[flatten] if(cu9.g > .5) f9.g = uf9.g;
else f9.g = mf9.g;
[flatten] if(cu9.b > .5) f9.b = uf9.b;
else f9.b = mf9.b;
s9 += f9/Quantization;
current -= f9;

float3 cu10 = current+f10;
float3 mf10 = -f10;
float3 uf10 = 1.-f10;
[flatten] if(cu10.r > .5) f10.r = uf10.r;
else f10.r = mf10.r;
[flatten] if(cu10.g > .5) f10.g = uf10.g;
else f10.g = mf10.g;
[flatten] if(cu10.b > .5) f10.b = uf10.b;
else f10.b = mf10.b;
s10 += f10/Quantization;
current -= f10;

float3 cu11 = current+f11;
float3 mf11 = -f11;
float3 uf11 = 1.-f11;
[flatten] if(cu11.r > .5) f11.r = uf11.r;
else f11.r = mf11.r;
[flatten] if(cu11.g > .5) f11.g = uf11.g;
else f11.g = mf11.g;
[flatten] if(cu11.b > .5) f11.b = uf11.b;
else f11.b = mf11.b;
s11 += f11/Quantization;
current -= f11;

float3 cu12 = current+f12;
float3 mf12 = -f12;
float3 uf12 = 1.-f12;
[flatten] if(cu12.r > .5) f12.r = uf12.r;
else f12.r = mf12.r;
[flatten] if(cu12.g > .5) f12.g = uf12.g;
else f12.g = mf12.g;
[flatten] if(cu12.b > .5) f12.b = uf12.b;
else f12.b = mf12.b;
s12 += f12/Quantization;
current -= f12;

float3 cu13 = current+f13;
float3 mf13 = -f13;
float3 uf13 = 1.-f13;
[flatten] if(cu13.r > .5) f13.r = uf13.r;
else f13.r = mf13.r;
[flatten] if(cu13.g > .5) f13.g = uf13.g;
else f13.g = mf13.g;
[flatten] if(cu13.b > .5) f13.b = uf13.b;
else f13.b = mf13.b;
s13 += f13/Quantization;
current -= f13;

float3 cu14 = current+f14;
float3 mf14 = -f14;
float3 uf14 = 1.-f14;
[flatten] if(cu14.r > .5) f14.r = uf14.r;
else f14.r = mf14.r;
[flatten] if(cu14.g > .5) f14.g = uf14.g;
else f14.g = mf14.g;
[flatten] if(cu14.b > .5) f14.b = uf14.b;
else f14.b = mf14.b;
s14 += f14/Quantization;
current -= f14;

float3 cu15 = current+f15;
float3 mf15 = -f15;
float3 uf15 = 1.-f15;
[flatten] if(cu15.r > .5) f15.r = uf15.r;
else f15.r = mf15.r;
[flatten] if(cu15.g > .5) f15.g = uf15.g;
else f15.g = mf15.g;
[flatten] if(cu15.b > .5) f15.b = uf15.b;
else f15.b = mf15.b;
s15 += f15/Quantization;
current -= f15;

float3 cu16 = current+f16;
float3 mf16 = -f16;
float3 uf16 = 1.-f16;
[flatten] if(cu16.r > .5) f16.r = uf16.r;
else f16.r = mf16.r;
[flatten] if(cu16.g > .5) f16.g = uf16.g;
else f16.g = mf16.g;
[flatten] if(cu16.b > .5) f16.b = uf16.b;
else f16.b = mf16.b;
s16 += f16/Quantization;
//current -= f16;

[flatten] if(n.y > .75) {
[flatten] if(n.x > .75) return s7.rgbb;
else [flatten] if(n.x > .5) return s8.rgbb;
else [flatten] if(n.x > .25) return s9.rgbb;
else return s10.rgbb;}
else [flatten] if(n.y > .5) {
[flatten] if(n.x > .75) return s6.rgbb;
else [flatten] if(n.x > .5) return s15.rgbb;
else [flatten] if(n.x > .25) return s16.rgbb;
else return s11.rgbb;}
else [flatten] if(n.y > .25) {
[flatten] if(n.x > .75) return s5.rgbb;
else [flatten] if(n.x > .5) return s14.rgbb;
else [flatten] if(n.x > .25) return s13.rgbb;
else return s12.rgbb;}
else {
[flatten] if(n.x > .75) return s4.rgbb;
else [flatten] if(n.x > .5) return s3.rgbb;
else [flatten] if(n.x > .25) return s2.rgbb;
else return s1.rgbb;}
}

// (C) 2015 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// 3×3 block-based error diffusion dithering
// This shader can be run as a screen space pixel shader.
// This shader requires compiling with ps_2_a, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// This shader will dither down an image using block-based error diffusion.

#define sa(A, B, C) float3 A = tex2D(s0, tex+float2(B, C)*c1).rgb;
#define Quantization 255.

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float2 n = frac(tex*c0/3.);
tex -= (n*3.-.5)*c1;

// original pixels
sa(s1, 0., 0.) sa(s2, 1., 0.) sa(s3, 2., 0.)
sa(s8, 0., 1.) sa(s9, 1., 1.) sa(s4, 2., 1.)
sa(s7, 0., 2.) sa(s6, 1., 2.) sa(s5, 2., 2.)

float3 f1 = frac(s1*Quantization);
float3 f2 = frac(s2*Quantization);
float3 f3 = frac(s3*Quantization);
float3 f4 = frac(s4*Quantization);
float3 f5 = frac(s5*Quantization);
float3 f6 = frac(s6*Quantization);
float3 f7 = frac(s7*Quantization);
float3 f8 = frac(s8*Quantization);
float3 f9 = frac(s9*Quantization);

float3 current;
float3 mf1 = -f1;
float3 uf1 = 1.-f1;
[flatten] if(f1.r > .5) f1.r = uf1.r;
else f1.r = mf1.r;
[flatten] if(f1.g > .5) f1.g = uf1.g;
else f1.g = mf1.g;
[flatten] if(f1.b > .5) f1.b = uf1.b;
else f1.b = mf1.b;
s1 += f1/Quantization;
current = -f1;

float3 cu2 = current+f2;
float3 mf2 = -f2;
float3 uf2 = 1.-f2;
[flatten] if(cu2.r > .5) f2.r = uf2.r;
else f2.r = mf2.r;
[flatten] if(cu2.g > .5) f2.g = uf2.g;
else f2.g = mf2.g;
[flatten] if(cu2.b > .5) f2.b = uf2.b;
else f2.b = mf2.b;
s2 += f2/Quantization;
current -= f2;

float3 cu3 = current+f3;
float3 mf3 = -f3;
float3 uf3 = 1.-f3;
[flatten] if(cu3.r > .5) f3.r = uf3.r;
else f3.r = mf3.r;
[flatten] if(cu3.g > .5) f3.g = uf3.g;
else f3.g = mf3.g;
[flatten] if(cu3.b > .5) f3.b = uf3.b;
else f3.b = mf3.b;
s3 += f3/Quantization;
current -= f3;

float3 cu4 = current+f4;
float3 mf4 = -f4;
float3 uf4 = 1.-f4;
[flatten] if(cu4.r > .5) f4.r = uf4.r;
else f4.r = mf4.r;
[flatten] if(cu4.g > .5) f4.g = uf4.g;
else f4.g = mf4.g;
[flatten] if(cu4.b > .5) f4.b = uf4.b;
else f4.b = mf4.b;
s4 += f4/Quantization;
current -= f4;

float3 cu5 = current+f5;
float3 mf5 = -f5;
float3 uf5 = 1.-f5;
[flatten] if(cu5.r > .5) f5.r = uf5.r;
else f5.r = mf5.r;
[flatten] if(cu5.g > .5) f5.g = uf5.g;
else f5.g = mf5.g;
[flatten] if(cu5.b > .5) f5.b = uf5.b;
else f5.b = mf5.b;
s5 += f5/Quantization;
current -= f5;

float3 cu6 = current+f6;
float3 mf6 = -f6;
float3 uf6 = 1.-f6;
[flatten] if(cu6.r > .5) f6.r = uf6.r;
else f6.r = mf6.r;
[flatten] if(cu6.g > .5) f6.g = uf6.g;
else f6.g = mf6.g;
[flatten] if(cu6.b > .5) f6.b = uf6.b;
else f6.b = mf6.b;
s6 += f6/Quantization;
current -= f6;

float3 cu7 = current+f7;
float3 mf7 = -f7;
float3 uf7 = 1.-f7;
[flatten] if(cu7.r > .5) f7.r = uf7.r;
else f7.r = mf7.r;
[flatten] if(cu7.g > .5) f7.g = uf7.g;
else f7.g = mf7.g;
[flatten] if(cu7.b > .5) f7.b = uf7.b;
else f7.b = mf7.b;
s7 += f7/Quantization;
current -= f7;

float3 cu8 = current+f8;
float3 mf8 = -f8;
float3 uf8 = 1.-f8;
[flatten] if(cu8.r > .5) f8.r = uf8.r;
else f8.r = mf8.r;
[flatten] if(cu8.g > .5) f8.g = uf8.g;
else f8.g = mf8.g;
[flatten] if(cu8.b > .5) f8.b = uf8.b;
else f8.b = mf8.b;
s8 += f8/Quantization;
current -= f8;

float3 cu9 = current+f9;
float3 mf9 = -f9;
float3 uf9 = 1.-f9;
[flatten] if(cu9.r > .5) f9.r = uf9.r;
else f9.r = mf9.r;
[flatten] if(cu9.g > .5) f9.g = uf9.g;
else f9.g = mf9.g;
[flatten] if(cu9.b > .5) f9.b = uf9.b;
else f9.b = mf9.b;
s9 += f9/Quantization;
//current -= f9;

[flatten] if(n.y > 2./3.) {
[flatten] if(n.x > 2./3.) return s5.rgbb;
else [flatten] if(n.x > 1./3.) return s6.rgbb;
else return s7.rgbb;}
else [flatten] if(n.y > 1./3.) {
[flatten] if(n.x > 2./3.) return s4.rgbb;
else [flatten] if(n.x > 1./3.) return s9.rgbb;
else return s8.rgbb;}
else {
[flatten] if(n.x > 2./3.) return s3.rgbb;
else [flatten] if(n.x > 1./3.) return s2.rgbb;
else return s1.rgbb;}
}
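
The first two lines of main() in these shaders decide which block the output pixel belongs to and where in that block it sits. Below is a small Python model of that addressing. It assumes that c0 holds the surface size in pixels and c1 its reciprocal (an assumption, the posts do not spell out what the constants contain), and `block_position` is a hypothetical helper name.

```python
import math


def block_position(px, py, width, height, size=3):
    """Model of the addressing at the top of main() for a size-by-size block.

    Assumes c0 = (width, height) and c1 = (1/width, 1/height).
    Returns the texture coordinate of the centre of the block's top-left
    pixel, and the (column, row) of pixel (px, py) inside its block.
    """
    # Pixel centre in normalised texture coordinates, as the shader receives it.
    tex = ((px + 0.5) / width, (py + 0.5) / height)
    c0 = (width, height)
    c1 = (1.0 / width, 1.0 / height)
    # n = frac(tex*c0/size): fractional position inside the block, in 0..1.
    n = tuple(math.modf(t * c / size)[0] for t, c in zip(tex, c0))
    # tex -= (n*size - .5)*c1: step back to the centre of the top-left pixel.
    origin = tuple(t - (f * size - 0.5) * c for t, f, c in zip(tex, n, c1))
    # The if/else chains at the end of the shader compare n against k/size;
    # for pixel centres that amounts to taking the integer part of n*size.
    col, row = (int(f * size) for f in n)
    return origin, (col, row)
```

For example, `block_position(7, 4, 1920, 1080)` places the pixel in the middle of its 3×3 block, which corresponds to the `return s9.rgbb` branch of the 3×3 shader.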

JanWillem32
19th October 2015, 19:11
For the fans of the 8-bit mode, here's a ditherer specifically to render with 256 colors:

// (C) 2015 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// 5×5 block-based error diffusion dithering to 256-color mode
// This shader can be run as a screen space pixel shader.
// This shader requires compiling with ps_2_b, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// This shader will dither down an image using block-based error diffusion to 256 color mode.

#define sa(A, B, C) float3 A = tex2D(s0, tex+float2(B, C)*c1).rgb;
// highest level per channel: 8 levels of red, 8 of green and 4 of blue give 8*8*4 = 256 colors
#define QuantizationRed 7.
#define QuantizationGreen 7.
#define QuantizationBlue 3.

static const float3 Quantization = float3(QuantizationRed, QuantizationGreen, QuantizationBlue);
sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float2 n = frac(tex*c0*.2);
tex -= (n*5.-.5)*c1;

// original pixels
sa(s1, 0., 0.) sa(s2, 1., 0.) sa(s3, 2., 0.) sa(s4, 3., 0.) sa(s5, 4., 0.)
sa(s16, 0., 1.) sa(s17, 1., 1.) sa(s18, 2., 1.) sa(s19, 3., 1.) sa(s6, 4., 1.)
sa(s15, 0., 2.) sa(s24, 1., 2.) sa(s25, 2., 2.) sa(s20, 3., 2.) sa(s7, 4., 2.)
sa(s14, 0., 3.) sa(s23, 1., 3.) sa(s22, 2., 3.) sa(s21, 3., 3.) sa(s8, 4., 3.)
sa(s13, 0., 4.) sa(s12, 1., 4.) sa(s11, 2., 4.) sa(s10, 3., 4.) sa(s9, 4., 4.)

float3 f1 = frac(s1*Quantization);
float3 f2 = frac(s2*Quantization);
float3 f3 = frac(s3*Quantization);
float3 f4 = frac(s4*Quantization);
float3 f5 = frac(s5*Quantization);
float3 f6 = frac(s6*Quantization);
float3 f7 = frac(s7*Quantization);
float3 f8 = frac(s8*Quantization);
float3 f9 = frac(s9*Quantization);
float3 f10 = frac(s10*Quantization);
float3 f11 = frac(s11*Quantization);
float3 f12 = frac(s12*Quantization);
float3 f13 = frac(s13*Quantization);
float3 f14 = frac(s14*Quantization);
float3 f15 = frac(s15*Quantization);
float3 f16 = frac(s16*Quantization);
float3 f17 = frac(s17*Quantization);
float3 f18 = frac(s18*Quantization);
float3 f19 = frac(s19*Quantization);
float3 f20 = frac(s20*Quantization);
float3 f21 = frac(s21*Quantization);
float3 f22 = frac(s22*Quantization);
float3 f23 = frac(s23*Quantization);
float3 f24 = frac(s24*Quantization);
float3 f25 = frac(s25*Quantization);

float3 current;
float3 mf1 = -f1;
float3 uf1 = 1.-f1;
[flatten] if(f1.r > .5) f1.r = uf1.r;
else f1.r = mf1.r;
[flatten] if(f1.g > .5) f1.g = uf1.g;
else f1.g = mf1.g;
[flatten] if(f1.b > .5) f1.b = uf1.b;
else f1.b = mf1.b;
s1 += f1/Quantization;
current = -f1;

float3 cu2 = current+f2;
float3 mf2 = -f2;
float3 uf2 = 1.-f2;
[flatten] if(cu2.r > .5) f2.r = uf2.r;
else f2.r = mf2.r;
[flatten] if(cu2.g > .5) f2.g = uf2.g;
else f2.g = mf2.g;
[flatten] if(cu2.b > .5) f2.b = uf2.b;
else f2.b = mf2.b;
s2 += f2/Quantization;
current -= f2;

float3 cu3 = current+f3;
float3 mf3 = -f3;
float3 uf3 = 1.-f3;
[flatten] if(cu3.r > .5) f3.r = uf3.r;
else f3.r = mf3.r;
[flatten] if(cu3.g > .5) f3.g = uf3.g;
else f3.g = mf3.g;
[flatten] if(cu3.b > .5) f3.b = uf3.b;
else f3.b = mf3.b;
s3 += f3/Quantization;
current -= f3;

float3 cu4 = current+f4;
float3 mf4 = -f4;
float3 uf4 = 1.-f4;
[flatten] if(cu4.r > .5) f4.r = uf4.r;
else f4.r = mf4.r;
[flatten] if(cu4.g > .5) f4.g = uf4.g;
else f4.g = mf4.g;
[flatten] if(cu4.b > .5) f4.b = uf4.b;
else f4.b = mf4.b;
s4 += f4/Quantization;
current -= f4;

float3 cu5 = current+f5;
float3 mf5 = -f5;
float3 uf5 = 1.-f5;
[flatten] if(cu5.r > .5) f5.r = uf5.r;
else f5.r = mf5.r;
[flatten] if(cu5.g > .5) f5.g = uf5.g;
else f5.g = mf5.g;
[flatten] if(cu5.b > .5) f5.b = uf5.b;
else f5.b = mf5.b;
s5 += f5/Quantization;
current -= f5;

float3 cu6 = current+f6;
float3 mf6 = -f6;
float3 uf6 = 1.-f6;
[flatten] if(cu6.r > .5) f6.r = uf6.r;
else f6.r = mf6.r;
[flatten] if(cu6.g > .5) f6.g = uf6.g;
else f6.g = mf6.g;
[flatten] if(cu6.b > .5) f6.b = uf6.b;
else f6.b = mf6.b;
s6 += f6/Quantization;
current -= f6;

float3 cu7 = current+f7;
float3 mf7 = -f7;
float3 uf7 = 1.-f7;
[flatten] if(cu7.r > .5) f7.r = uf7.r;
else f7.r = mf7.r;
[flatten] if(cu7.g > .5) f7.g = uf7.g;
else f7.g = mf7.g;
[flatten] if(cu7.b > .5) f7.b = uf7.b;
else f7.b = mf7.b;
s7 += f7/Quantization;
current -= f7;

float3 cu8 = current+f8;
float3 mf8 = -f8;
float3 uf8 = 1.-f8;
[flatten] if(cu8.r > .5) f8.r = uf8.r;
else f8.r = mf8.r;
[flatten] if(cu8.g > .5) f8.g = uf8.g;
else f8.g = mf8.g;
[flatten] if(cu8.b > .5) f8.b = uf8.b;
else f8.b = mf8.b;
s8 += f8/Quantization;
current -= f8;

float3 cu9 = current+f9;
float3 mf9 = -f9;
float3 uf9 = 1.-f9;
[flatten] if(cu9.r > .5) f9.r = uf9.r;
else f9.r = mf9.r;
[flatten] if(cu9.g > .5) f9.g = uf9.g;
else f9.g = mf9.g;
[flatten] if(cu9.b > .5) f9.b = uf9.b;
else f9.b = mf9.b;
s9 += f9/Quantization;
current -= f9;

float3 cu10 = current+f10;
float3 mf10 = -f10;
float3 uf10 = 1.-f10;
[flatten] if(cu10.r > .5) f10.r = uf10.r;
else f10.r = mf10.r;
[flatten] if(cu10.g > .5) f10.g = uf10.g;
else f10.g = mf10.g;
[flatten] if(cu10.b > .5) f10.b = uf10.b;
else f10.b = mf10.b;
s10 += f10/Quantization;
current -= f10;

float3 cu11 = current+f11;
float3 mf11 = -f11;
float3 uf11 = 1.-f11;
[flatten] if(cu11.r > .5) f11.r = uf11.r;
else f11.r = mf11.r;
[flatten] if(cu11.g > .5) f11.g = uf11.g;
else f11.g = mf11.g;
[flatten] if(cu11.b > .5) f11.b = uf11.b;
else f11.b = mf11.b;
s11 += f11/Quantization;
current -= f11;

float3 cu12 = current+f12;
float3 mf12 = -f12;
float3 uf12 = 1.-f12;
[flatten] if(cu12.r > .5) f12.r = uf12.r;
else f12.r = mf12.r;
[flatten] if(cu12.g > .5) f12.g = uf12.g;
else f12.g = mf12.g;
[flatten] if(cu12.b > .5) f12.b = uf12.b;
else f12.b = mf12.b;
s12 += f12/Quantization;
current -= f12;

float3 cu13 = current+f13;
float3 mf13 = -f13;
float3 uf13 = 1.-f13;
[flatten] if(cu13.r > .5) f13.r = uf13.r;
else f13.r = mf13.r;
[flatten] if(cu13.g > .5) f13.g = uf13.g;
else f13.g = mf13.g;
[flatten] if(cu13.b > .5) f13.b = uf13.b;
else f13.b = mf13.b;
s13 += f13/Quantization;
current -= f13;

float3 cu14 = current+f14;
float3 mf14 = -f14;
float3 uf14 = 1.-f14;
[flatten] if(cu14.r > .5) f14.r = uf14.r;
else f14.r = mf14.r;
[flatten] if(cu14.g > .5) f14.g = uf14.g;
else f14.g = mf14.g;
[flatten] if(cu14.b > .5) f14.b = uf14.b;
else f14.b = mf14.b;
s14 += f14/Quantization;
current -= f14;

float3 cu15 = current+f15;
float3 mf15 = -f15;
float3 uf15 = 1.-f15;
[flatten] if(cu15.r > .5) f15.r = uf15.r;
else f15.r = mf15.r;
[flatten] if(cu15.g > .5) f15.g = uf15.g;
else f15.g = mf15.g;
[flatten] if(cu15.b > .5) f15.b = uf15.b;
else f15.b = mf15.b;
s15 += f15/Quantization;
current -= f15;

float3 cu16 = current+f16;
float3 mf16 = -f16;
float3 uf16 = 1.-f16;
[flatten] if(cu16.r > .5) f16.r = uf16.r;
else f16.r = mf16.r;
[flatten] if(cu16.g > .5) f16.g = uf16.g;
else f16.g = mf16.g;
[flatten] if(cu16.b > .5) f16.b = uf16.b;
else f16.b = mf16.b;
s16 += f16/Quantization;
current -= f16;

float3 cu17 = current+f17;
float3 mf17 = -f17;
float3 uf17 = 1.-f17;
[flatten] if(cu17.r > .5) f17.r = uf17.r;
else f17.r = mf17.r;
[flatten] if(cu17.g > .5) f17.g = uf17.g;
else f17.g = mf17.g;
[flatten] if(cu17.b > .5) f17.b = uf17.b;
else f17.b = mf17.b;
s17 += f17/Quantization;
current -= f17;

float3 cu18 = current+f18;
float3 mf18 = -f18;
float3 uf18 = 1.-f18;
[flatten] if(cu18.r > .5) f18.r = uf18.r;
else f18.r = mf18.r;
[flatten] if(cu18.g > .5) f18.g = uf18.g;
else f18.g = mf18.g;
[flatten] if(cu18.b > .5) f18.b = uf18.b;
else f18.b = mf18.b;
s18 += f18/Quantization;
current -= f18;

float3 cu19 = current+f19;
float3 mf19 = -f19;
float3 uf19 = 1.-f19;
[flatten] if(cu19.r > .5) f19.r = uf19.r;
else f19.r = mf19.r;
[flatten] if(cu19.g > .5) f19.g = uf19.g;
else f19.g = mf19.g;
[flatten] if(cu19.b > .5) f19.b = uf19.b;
else f19.b = mf19.b;
s19 += f19/Quantization;
current -= f19;

float3 cu20 = current+f20;
float3 mf20 = -f20;
float3 uf20 = 1.-f20;
[flatten] if(cu20.r > .5) f20.r = uf20.r;
else f20.r = mf20.r;
[flatten] if(cu20.g > .5) f20.g = uf20.g;
else f20.g = mf20.g;
[flatten] if(cu20.b > .5) f20.b = uf20.b;
else f20.b = mf20.b;
s20 += f20/Quantization;
current -= f20;

float3 cu21 = current+f21;
float3 mf21 = -f21;
float3 uf21 = 1.-f21;
[flatten] if(cu21.r > .5) f21.r = uf21.r;
else f21.r = mf21.r;
[flatten] if(cu21.g > .5) f21.g = uf21.g;
else f21.g = mf21.g;
[flatten] if(cu21.b > .5) f21.b = uf21.b;
else f21.b = mf21.b;
s21 += f21/Quantization;
current -= f21;

float3 cu22 = current+f22;
float3 mf22 = -f22;
float3 uf22 = 1.-f22;
[flatten] if(cu22.r > .5) f22.r = uf22.r;
else f22.r = mf22.r;
[flatten] if(cu22.g > .5) f22.g = uf22.g;
else f22.g = mf22.g;
[flatten] if(cu22.b > .5) f22.b = uf22.b;
else f22.b = mf22.b;
s22 += f22/Quantization;
current -= f22;

float3 cu23 = current+f23;
float3 mf23 = -f23;
float3 uf23 = 1.-f23;
[flatten] if(cu23.r > .5) f23.r = uf23.r;
else f23.r = mf23.r;
[flatten] if(cu23.g > .5) f23.g = uf23.g;
else f23.g = mf23.g;
[flatten] if(cu23.b > .5) f23.b = uf23.b;
else f23.b = mf23.b;
s23 += f23/Quantization;
current -= f23;

float3 cu24 = current+f24;
float3 mf24 = -f24;
float3 uf24 = 1.-f24;
[flatten] if(cu24.r > .5) f24.r = uf24.r;
else f24.r = mf24.r;
[flatten] if(cu24.g > .5) f24.g = uf24.g;
else f24.g = mf24.g;
[flatten] if(cu24.b > .5) f24.b = uf24.b;
else f24.b = mf24.b;
s24 += f24/Quantization;
current -= f24;

float3 cu25 = current+f25;
float3 mf25 = -f25;
float3 uf25 = 1.-f25;
[flatten] if(cu25.r > .5) f25.r = uf25.r;
else f25.r = mf25.r;
[flatten] if(cu25.g > .5) f25.g = uf25.g;
else f25.g = mf25.g;
[flatten] if(cu25.b > .5) f25.b = uf25.b;
else f25.b = mf25.b;
s25 += f25/Quantization;
//current -= f25;

[flatten] if(n.y > .8) {
[flatten] if(n.x > .8) return s9.rgbb;
else [flatten] if(n.x > .6) return s10.rgbb;
else [flatten] if(n.x > .4) return s11.rgbb;
else [flatten] if(n.x > .2) return s12.rgbb;
else return s13.rgbb;}
else [flatten] if(n.y > .6) {
[flatten] if(n.x > .8) return s8.rgbb;
else [flatten] if(n.x > .6) return s21.rgbb;
else [flatten] if(n.x > .4) return s22.rgbb;
else [flatten] if(n.x > .2) return s23.rgbb;
else return s14.rgbb;}
else [flatten] if(n.y > .4) {
[flatten] if(n.x > .8) return s7.rgbb;
else [flatten] if(n.x > .6) return s20.rgbb;
else [flatten] if(n.x > .4) return s25.rgbb;
else [flatten] if(n.x > .2) return s24.rgbb;
else return s15.rgbb;}
else [flatten] if(n.y > .2) {
[flatten] if(n.x > .8) return s6.rgbb;
else [flatten] if(n.x > .6) return s19.rgbb;
else [flatten] if(n.x > .4) return s18.rgbb;
else [flatten] if(n.x > .2) return s17.rgbb;
else return s16.rgbb;}
else {
[flatten] if(n.x > .8) return s5.rgbb;
else [flatten] if(n.x > .6) return s4.rgbb;
else [flatten] if(n.x > .4) return s3.rgbb;
else [flatten] if(n.x > .2) return s2.rgbb;
else return s1.rgbb;}
}
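
The step repeated above for each sample, and the final selection on n.x/n.y, can be restated compactly. The Python sketch below is only an illustration and is not part of the shader pack. It assumes that each fN holds the fractional remainder of a pixel after scaling to the target bit depth, and that n is the position inside the 5×5 block; neither is visible in the part of the shader shown here. The function names are made up for the example.

```python
def diffuse(fractions, error=0.0):
    """Round each fraction up or down while carrying the running error.

    Mirrors the repeated shader step: cu = current + f; the pixel is rounded
    up (adjustment 1 - f) when cu > 0.5, otherwise down (adjustment -f), and
    the adjustment is then subtracted from the running error.
    """
    adjustments = []
    for f in fractions:
        adj = (1.0 - f) if (error + f) > 0.5 else -f
        adjustments.append(adj)
        error -= adj
    return adjustments, error


def spiral_order(size=5):
    """Map (x, y) block positions to the visiting order 1..size*size.

    x index 0 stands for n.x <= .2 and y index 4 for n.y > .8, so the
    result reproduces the s1..s25 selection at the end of the shader.
    """
    order = {}
    x, y, dx, dy = 0, 0, 1, 0
    for k in range(1, size * size + 1):
        order[(x, y)] = k
        nx, ny = x + dx, y + dy
        if not (0 <= nx < size and 0 <= ny < size) or (nx, ny) in order:
            dx, dy = -dy, dx  # turn 90 degrees and continue inward
            nx, ny = x + dx, y + dy
        x, y = nx, ny
    return order
```

With four equal fractions of 0.25, diffuse() rounds exactly one of the four pixels up, so the block average is preserved. spiral_order() shows that the samples are visited along an inward spiral that ends at the centre pixel (s25).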

JanWillem32
19th October 2015, 19:16
I patched the problem with the pixel shader compiler not being able to compile some complicated shaders. (The two 5×5 shaders above are good examples.)
I optimized some of the (re-)initialization sequences, specifically for the internal pixel shaders.

XRyche
19th October 2015, 22:32
ts1, I'll have to discuss some things with the MPC-HC staff.
If I were to merge the renderer fixes onto the latest version of the code I could work miracles. However, it will come at the cost of VMR-7 r. and EVR Sync. If people are willing to accept that, I can finally merge the renderer fixes into the main branch.

new shaders (note that the 5×5 version will not work in the current build, but will in the next):

Neither VMR-7 r. nor EVR Sync is used anymore, so why should dropping them be an issue? The main branch of MPC-HC would benefit much more from your modifications than from hanging onto obsolete renderers. Just my opinion; I don't know the ins and outs of what goes on in development.

I patched the problem with the pixel shader compiler not being able to compile some complicated shaders. (The two 5×5 shaders above are good examples.)
I optimized some of the (re-)initialization sequences, specifically for the internal pixel shaders.
x64 AVX: http://www.mediafire.com/download/wwkalp293q99thd/mpc-hc64_AVX_tester_im.7z
x64: http://www.mediafire.com/download/ka9773wwgiei3v9/mpc-hc64_tester_im.7z
x86 AVX: http://www.mediafire.com/download/fcdfqjdcd941cr5/mpc-hc_AVX_tester_im.7z
x86 SSE2: http://www.mediafire.com/download/hbxae9muw0312p4/mpc-hc_SSE2_tester_im.7z

I assume this means that the new block error diffusion shaders will work under normal circumstances, apart from having to turn off the internal dithering function. I also assume these shaders should be dead last in the shader chain.

P.S. I am picking up the grid pattern in dark areas when sitting right at my monitor. It is, of course, progressively more apparent with 4x4 and then 3x3. 5x5 block error diffusion seems to be the most tolerable one for me. When using 4x4 or 3x3 the grid pattern is too apparent, and higher levels of random colour dithering look better. With 5x5 block error diffusion the only downside is seeing the grid in dark areas. I can't see the grid in very bright scenes or scenes of one solid colour (e.g. blue skies and ocean). With random colour dithering I could see the dithering patterns at higher levels regardless of the scene's colour composition.

JanWillem32
20th October 2015, 17:44
I made new builds that can utilize the new ditherers. (On top of that, these are now capable of running Y'CbCr (through back-tracking through the mixer output video R'G'B'), LMS, display RGB and display R'G'B' shaders. For example, for the anaglyph 3D shaders getting access to the display R'G'B' stage is important.) The shaders are in the "finalpass replacement" folder of the archives.
The "Enable Color Passtrough Mode" is now separate from the other color management settings and the combination of these settings is now relevant.

XRyche
21st October 2015, 18:00
With the "substitute colour management" shader I am getting two errors when trying to compile. They are: "error X3004: undeclared identifier 'Mm'" and "error X3014: incorrect number of arguments to numeric-type constructor".

JanWillem32
21st October 2015, 18:09
Those don't matter to the renderer's compiler, which sets the M? macros correctly.

XRyche
21st October 2015, 18:19
It's telling me that it failed to compile, so I am lost.

On another note, if I want to use your XLRCAM for LMS colour management scripts I need to place them between the "Substitute colour management" and the "block diffusion dithering" scripts. Is that correct?

JanWillem32
21st October 2015, 18:35
The pixel shader compiler user interface doesn't set macros, only the renderer does. If you feed the "substitute color management" shader to the renderer, it will work. Note that color management settings are fed through the macros, so these settings are relevant while compiling the pixel shader. The pixel shader is not automatically updated when color settings change, so if you change settings, re-compile the shader by disabling and re-enabling the post-resize pixel shader stage.
The XLRCAM shader needs LMS input. It can work after the initial pass, before resizing, and just after resizing. It should not be placed in RGB or R'G'B' stages.
The display R'G'B' stage in between the "substitute color management" and the "block-based error diffusion dithering" shaders is pretty much only useful to the anaglyph 3D and the "3LCD panel software alignment"-type shaders.
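
To illustrate the point about macros: the Python sketch below imitates how the #if Mc / #ifdef Mm chain in the revised "substitute color management" shader (posted further down in this thread) picks its conversion matrix. It is an illustration only. The function name and the returned labels are invented for the example, and treating an undefined macro as 0 in #if is an assumption carried over from C preprocessor behaviour, which the shader appears to rely on.

```python
def select_matrix(macros):
    """Return a label for the matrix branch the preprocessor would keep.

    `macros` is a dict of the macros that are defined at compile time,
    e.g. {"Mc": 2} when the renderer compiles the shader, or {} when an
    external pixel shader compiler is used.
    """
    mc = macros.get("Mc", 0)  # assumption: undefined macro compares as 0
    if mc == 2:
        return "SMPTE C (NTSC) display RGB"
    if mc == 3:
        return "EBU 3213 (PAL/SECAM) display RGB"
    if mc == 1:
        return "XYZ for the 3D LUT"
    if "Mm" in macros:
        return "display RGB matrix supplied through Mm"
    return "identity (no conversion)"
```

With an empty macro set the identity branch is selected, which matches the shader comment "do no conversion in external pixel shader compilers". Because the choice is made at compile time, a change in the color settings only takes effect after the shader is compiled again.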

XRyche
21st October 2015, 18:47
The pixel shader compiler user interface doesn't set macros, only the renderer does. If you feed the "substitute color management" shader to the renderer, it will work. Note that color management settings are fed through the macros, so these settings are relevant while compiling the pixel shader. The pixel shader is not automatically updated when color settings change, so if you change settings, re-compile the shader by disabling and re-enabling the post-resize pixel shader stage.

Okay, I think I understand. So every time I change, for instance, the ambient light settings in the internal colour management, I need to disable and then re-enable the post-resize shaders.

The XLRCAM shader needs LMS input. It can work after the initial pass, before resizing, and just after resizing. It should not be placed in RGB or R'G'B' stages.
The display R'G'B' stage in between the "substitute color management" and the "block-based error diffusion dithering" shaders is pretty much only useful to the anaglyph 3D and the "3LCD panel software alignment"-type shaders.

So I chain the shaders in post-resize as follows:

- XLRCAM for LMS
- Substitute colour management
- block error diffusion

Of course disabling the internal coloured random dithering.
This would be correct?

JanWillem32
21st October 2015, 18:49
That would work fine. (Do set the "Enable Color Passtrough Mode" option and optionally one of the color management options.)

JanWillem32
21st October 2015, 18:58
I revised the "substitute color management" shader, as I forgot one line (it only really matters for the 3DLut-enabled mode):
// (C) 2015 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// substitute color management
// This shader must be run as a screen space pixel shader.
// This shader requires compiling with ps_2_0, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// This is a renderer-specific shader.
// To use this shader, set the render options "Color Management" to "Enable Color Passtrough Mode" and "Dithering Levels" to "0 Rounding".
// After this shader one of the dithering shaders must be used.
// Simple display RGB shaders can be injected in the middle of this shader, complicated display R'G'B'-type shaders can be chained in between this shader and the dithering shader
// Don't forget to force re-compiling of this shader when changing renderer color settings.

#if Mc == 1
sampler LUT3Drg : register(s2);
sampler LUT3Db : register(s3);
static const float LUT3Dsize = Ms, unitlength = LUT3Dsize*.875-1., unitlength3 = pow(unitlength, 3);// to allow more chromatic changes to be encoded, the XYZ channels as stored in the LUT are extended by one eighth past the interval [0, 1]
#endif
static const float3x3 mat = transpose(float3x3(
#if Mc == 2
763198626475360358257813253./147994540717374853866816250., -6373975142277374689886058001./1479945407173748538668162500., 221934284697519645976087971./1479945407173748538668162500., -605343464019237999732424841./518184319406495060865758750., 12063681187144308402184664197./5181843194064950608657587500., -828403352886977796202828287./5181843194064950608657587500., 27361298410291889828212027./523137322083995952425108750., -1196635193693957989329923759./5231373220839959524251087500., 6154395430430998615298890989./5231373220839959524251087500.// convert to SMPTE C (NTSC) display RGB
#elif Mc == 3
5889610677243193337436144851./1005872293755806060844662500., -50563246336667155788637763367./10058722937558060608446625000., 1725862501793283022722939857./10058722937558060608446625000., -156299752813513892312443129./118586087319713310233175625., 2972168594434151167364839293./1185860873197133102331756250., -223310193101879141908651753./1185860873197133102331756250., 41217384068187972753550441./995689156108134584983951250., -1937830199233462093713839397./9956891561081345849839512500., 11482547919632928216017847487./9956891561081345849839512500.// convert to EBU 3213 (PAL/SECAM) display RGB
#elif Mc == 1
118340000000./61951306817., -68897700000./61951306817., 12509006817./61951306817., 22981000000./61951306817., 38970700000./61951306817., -393183./61951306817., 0., 0., 1.// convert to XYZ, but keep the white point adaptation that was applied in the initial pass
#else
#ifdef Mm
Mm// convert to display RGB
#else
1., 0., 0., 0., 1., 0., 0., 0., 1.// do no conversion in external pixel shader compilers
#endif
#endif
)
#if Mr
*65535./32767.// restore to full range, first part
#endif
);

sampler s0;

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 s1 = tex2D(s0, tex).rgb;// output default pass-through
s1 = s1.r*mat[0]+s1.g*mat[1]+s1.b*mat[2];// convert to XYZ or display RGB
#if Mr
s1 -= 16384./32767.;// restore to full range, second part
#endif
// simple display RGB shaders can be injected here

s1 = max(0., s1);// discard negative RGB/XYZ values
#if Mc != 1
#ifdef Mg// do no conversion in external pixel shader compilers
s1 = pow(s1, 100./(float(Mg)+100.));// apply non-linear gamma correction
#endif
#endif
#if Mc == 1
float3 dcolor = pow(s1, 1./3.);// apply cube root gamma correction on the XYZ channels
float3 n = dcolor*unitlength;// to allow more chromatic changes to be encoded, the XYZ channels as stored in the LUT are extended by one eighth past the interval [0, 1]
float3 ntrunc = n-frac(n);
float3 ntrunc2 = ntrunc*ntrunc;
float3 lweight = (s1*unitlength3-ntrunc2*ntrunc)/(3.*(ntrunc2+ntrunc)+1.);// do the interpolation in linear space, optimization of (s1-pow(ntrunc/unitlength, 3))/(pow((ntrunc+1.)/unitlength, 3)-pow(ntrunc/unitlength, 3))
dcolor = (ntrunc+lweight)/(LUT3Dsize-1.);
s1 = dcolor*(LUT3Dsize-1.)/LUT3Dsize+.5/LUT3Dsize;// make the sampling position line up with an exact voxel coordinate
s1 = float3(tex3D(LUT3Drg, s1).rg, tex3D(LUT3Db, s1).r);// sample from the LUT3D
#elif Mc == 2 || Mc == 3
s1 = s1.r*float3(21827./85000., -16744./112965., 112./255.)+s1.g*float3(42851./85000., -32872./112965., -65744./178755.)+s1.b*float3(4161./42500., 112./255., -4256./59585.)+(float2(32., 1.)/510.).xyy;// BT.601 R'G'B' to Y'CbCr and compress ranges
s1 = s1.r+float3(0., -25251./73375., 1.772)*s1.g+float3(1.402, -209599./293500., 0.)*s1.b;// BT.601 Y'CbCr to R'G'B'
#elif Mc == 4
s1 = s1.r*float3(77599./425000., -119056./1182945., 112./255.)+s1.g*float3(32631./53125., -133504./394315., -133504./334645.)+s1.b*float3(26353./425000., 112./255., -40432./1003935.)+(float2(32., 1.)/510.).xyy;// BT.709 R'G'B' to Y'CbCr and compress ranges
s1 = s1.r+float3(0., -1674679./8940000., 1.8556)*s1.g+float3(1.5748, -4185031./8940000., 0.)*s1.b;// BT.709 Y'CbCr to R'G'B'
#endif
#if Mr
s1 = s1*32767/65535.+16384/65535.;// convert to limited range
#endif
return s1.rgbb;
}
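
Two pieces of arithmetic in the shader above can be checked independently. The Python sketch below is an illustration under stated assumptions, not the author's code: it treats the two "#if Mr" steps as a per-channel scalar mapping (the shader folds the scale factor into the matrix), and it verifies the optimization claimed in the lweight comment with exact fractions. All function names are invented for the example.

```python
from fractions import Fraction


def to_limited(v):
    """Final '#if Mr' step: s1*32767/65535 + 16384/65535."""
    return v * Fraction(32767, 65535) + Fraction(16384, 65535)


def to_full(v):
    """Inverse mapping: scale by 65535/32767, then subtract 16384/32767."""
    return v * Fraction(65535, 32767) - Fraction(16384, 32767)


def lut_weight_fast(v, ntrunc, unitlength):
    """The form used in the shader for lweight."""
    return (v * unitlength**3 - ntrunc**3) / (3 * (ntrunc**2 + ntrunc) + 1)


def lut_weight_direct(v, ntrunc, unitlength):
    """The unoptimized form quoted in the shader comment."""
    lo = Fraction(ntrunc, unitlength) ** 3
    hi = Fraction(ntrunc + 1, unitlength) ** 3
    return (v - lo) / (hi - lo)
```

The two weight functions agree because (n+1)^3 - n^3 = 3*(n^2 + n) + 1, so multiplying numerator and denominator by unitlength^3 removes both pow() calls from the direct form.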

XRyche
21st October 2015, 19:20
That would work fine. (Do set the "Enable Color Passtrough Mode" option and optionally one of the color management options.)

Done. Thank you for explaining it in detail.

I revised the "substitute color management" shader, as I forgot one line (it only really matters for the 3DLut-enabled mode).

It's compiling just fine now. No errors.

I am not seeing any discernible grid patterns at all now with the 5x5 dithering shader. I'll use the same file I used when I saw the grid pattern in darker areas before and check again.
I assume I was seeing the pattern in the first place because I wasn't using the substitute colour management script and colour passthrough mode on the last build.

XRyche
21st October 2015, 19:25
I just checked. I can't see any pattern at all when sitting right at my monitor using 5x5 block error diffusion dithering, even in the dark areas where I was seeing it before. It looks very nice. You made some changes to the OSD graphs and the D3D FS progress bar as well recently, didn't you?

JanWillem32
21st October 2015, 19:43
No, they're the same. I just don't have a post-OSD/subtitles/stats screen shader slot. The colors are wrong because the final pass is missing when you override it with post-resize shaders at the moment.
I should probably re-order the current post-resize shaders to the post-OSD/subtitles/stats screen shader slot. I see no harm in that.

XRyche
21st October 2015, 19:49
No, they're the same. I just don't have a post-OSD/subtitles/stats screen shader slot. The colors are wrong because the final pass is missing when you override it with post-resize shaders at the moment.

Ah, that explains it. It's not a big deal. Still is functional and everything else is working as it should.

Hera
21st October 2015, 23:23
Cursor doesn't go away on non-exclusive fullscreen (alt-enter)?

JanWillem32
22nd October 2015, 11:45
Do you know since which version this happens?

JanWillem32
22nd October 2015, 12:00
I already fixed the problem.

XRyche
23rd October 2015, 01:42
The latest build also seems to have fixed the long-standing cursor problem when using D3D FS.

The block error diffusion method, at least when using 5x5, has made a huge difference in quality on my 6-bit TN over random colour dithering. I honestly didn't think it would make such an impact. I expected it would improve quality, at least a little, but not this much. What level of random colour dithering would be comparable to 5x5 block error diffusion dithering?

Anima123
23rd October 2015, 02:26
So I chain the shaders in post-resize as follows:

- XLRCAM for LMS
- Substitute colour management
- block error diffusion

Of course disabling the internal coloured random dithering.
This would be correct?

Would you mind elaborating in more detail on how to use customized shaders and how to chain them?

XRyche
23rd October 2015, 04:41
Would you mind elaborating in more detail on how to use customized shaders and how to chain them?

The shaders I listed were all post-process shaders. You want to make sure the "block-based error diffusion dithering" shader is at the very end of the chain. Right before it, you want the "Substitute colour management" shader. The only type of shader you would put between these two would be one that works in the R'G'B' colourspace, like JanWillem32's anaglyph 3D shaders and the "3LCD panel software alignment" type that JanWillem32 mentioned before. To my understanding, the average user shouldn't really need to put anything between the "Substitute colour management" and the "block-based error diffusion dithering" shaders. Any shaders that deal with sharpening or colour controls would always go before these two.

Also, just like JanWillem32 said, you have to set Colour Management to "Enable Colour Passthrough Mode" in Renderer Settings.

If you don't put them in the correct order it's not going to work.

I hope this helped. I'm sure JanWillem32 will be able to better elaborate on how to chain these new shaders though.

JanWillem32
23rd October 2015, 11:07
What level of random colour dithering would be comparable to 5x5 block error diffusion dithering?

The random ditherer is actually just a noise generator that spans multiple levels to hide banding with heavy noise. Error diffusion dithering is single-level (more comparable with the other two single-level ditherers). It's less noisy than any of the other ditherers because of the way it works.

I hope this helped. I'm sure JanWillem32 will be able to better elaborate on how to change these new shaders though.

You are mostly correct. The only thing missing is that you also have to set the internal ditherer to "0 (Rounding)" to enable the pass-through mode.

Hera
24th October 2015, 03:20
I already fixed the problem.

Almost perfect!

With this new build,

If I move my cursor at least once when in full screen mode - it disappears.

If I don't move my cursor when in full-screen - cursor remains on the screen.

Can anyone reproduce?

Anima123
24th October 2015, 04:51
Almost perfect!

With this new build,

If I move my cursor at least once when in full screen mode - it disappears.

If I don't move my cursor when in full-screen - cursor remains on the screen.

Can anyone reproduce?

The same here.

ts1
24th October 2015, 12:13
4k 60fps VP9 video is very laggy with display stats enabled. It plays the same with or without display stats on trunk MPC-HC. Test video: https://www.sendspace.com/file/o0oe5t

Hera
24th October 2015, 16:01
4k 60fps VP9 video is very laggy with display stats enabled. It plays the same with or without display stats on trunk MPC-HC. Test video: https://www.sendspace.com/file/o0oe5t

Same here - lags with stats on.

XRyche
24th October 2015, 20:24
Almost perfect!

With this new build,

If I move my cursor at least once when in full screen mode - it disappears.

If I don't move my cursor when in full-screen - cursor remains on the screen.

Can anyone reproduce?

I can confirm this behavior as well. This only happens in Fullscreen mode not D3D FS mode. In D3D FS the cursor disappears after only a second or two without any user interaction.

JanWillem32
25th October 2015, 20:03
The stats screen of the trunk build uses a method that is less CPU-intensive than mine when text is being shown. I'll take a look at optimizing the text part a bit later on by not using GDI text drawing on every frame. I assume that the problem isn't there in the stats screen mode with only the graph visible?
As for the cursor hiding problem, I can probably copy the D3D FS exclusive mode cursor hiding methods to those of the windowed fullscreen mode. I'll have to test that a bit. But I will not be home for a few days, so have patience.

XRyche
4th November 2015, 02:57
JanWillem32, frame interpolation is the very last step your renderer takes before actually outputting the video, correct?

JanWillem32
4th November 2015, 10:59
No, frame interpolation is in between the pre-resize shaders and the resizing steps since the last few versions.

XRyche
5th November 2015, 06:14
Oh, that explains a lot of the recent improvements I've been seeing with frame interpolation aliasing artifacts. On well-encoded HD media (BD and digital BD files mostly) I can barely see any aliasing artifacts, even when sitting right at the monitor. The aliasing artifacts are still there with DVD and well-encoded SD files, but much more toned down. Even with a little bit of post-resize sharpening (Fine-sharp HLSL script tweaked to work with SD) the aliasing artifacts aren't that bad. Badly encoded or damaged SD files can still have some heavy aliasing artifacts, but that is to be expected.

JanWillem32
6th November 2015, 19:18
I changed the fullscreen mouse hider, specifically when opening in windowed fullscreen. It's a bit hackish, but it seems to work for me.
I changed the stats screen font drawing system. It should be much faster now.
x64 AVX: http://www.mediafire.com/download/wwkalp293q99thd/mpc-hc64_AVX_tester_im.7z
x64: http://www.mediafire.com/download/ka9773wwgiei3v9/mpc-hc64_tester_im.7z
x86 AVX: http://www.mediafire.com/download/fcdfqjdcd941cr5/mpc-hc_AVX_tester_im.7z
x86 SSE2: http://www.mediafire.com/download/hbxae9muw0312p4/mpc-hc_SSE2_tester_im.7z

ts1
6th November 2015, 20:57
Still same lags. Also with only the graph visible and in D3D fullscreen mode. CPU load is only 75-80% though (100% in stats, trunk mpc-hc 67-72%) and GPU ~20. And with VMR9 player hangs after 1st lag.

Edit: Without stats CPU usage is 65-69%.

XRyche
7th November 2015, 06:31
I only get about a 2% increase, if that, in cpu usage when having the full stats screen enabled. No lags or anything. This is using the modified EVR-CP renderer. GPU increase is negligible if at all when using the full stats and I use a lot of your custom HLSL scripts and Motion adaptive-High motion frame interpolation as well. I'm not having any issues with it at all on my setup and cosmetically it looks the same, which is good.

Hera
7th November 2015, 19:36
I think just having the graph on, with no other text, lags the output.
Oh and the default resizer now is Mitchell-Netravali Spline 4?

XRyche
8th November 2015, 03:07
Oh and the default resizer now is Mitchell-Netravali Spline 4?

What was the default resizer before? Mitch-Nat Spline 4 does work great for damaged video though :) .

Hera
8th November 2015, 04:57
What was the default resizer before? Mitch-Nat Spline 4 does work great for damaged video though :) .

I thought it was bilinear by default. Hm... now that you ask I am not sure that's correct.

EDIT: And, sadly, cursor doesn't disappear unless moved.

Also, the option menu could use a little dimension tweaking at high DPI though.

JanWillem32
8th November 2015, 22:38
Mitchell-Netravali Spline 4 is indeed the standard resizer. It's a balanced resizer. You can choose Catmull-Rom for sharper images (but with mild anisotropy and ringing artifacts) and B-Spline for more noise resistance (but unsharp images). See https://de.wikipedia.org/wiki/Mitchell-Netravali-Filter for a good overview in German. (The image is quite easy to read. B-Spline 4 is at B=1, C=0, Mitchell-Netravali Spline 4 is at B=1/3, C=1/3, Catmull-Rom spline is at B=0, C=1/2. Robidoux filters are somewhere near Mitchell-Netravali Spline 4 on the dotted line.)
I'll have to experiment some more with the cursor handling code for launching in windowed fullscreen mode, it seems. The problem is, it works properly for me. I don't know if I can fix this issue. It's not really code I usually edit at all. (I dislike editing the GUI elements. That also includes editing the menus.)
I can see that the current stats screen method could lag a bit when displaying at 60 frames per second on a 4k screen with high CPU usage. I balanced the CPU/GPU usage compared to the trunk build. (Somewhat more CPU usage, less GPU usage.) I mostly wanted to eliminate the dependency on the extra D3DX9 library (in the dll).
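For readers who want to see what the B and C values above actually change, here is a minimal Python sketch of the generic Mitchell-Netravali (BC-spline) kernel. It is only an illustration of the published formula, not the renderer's shader code; the names bc_spline, tap_weights and PRESETS are made up for this example.

```python
def bc_spline(x, b, c):
    """Mitchell-Netravali (BC-spline) kernel value at distance x (in pixels)."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x ** 3
                + (-18 + 12 * b + 6 * c) * x ** 2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x ** 3
                + (6 * b + 30 * c) * x ** 2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0  # the kernel only covers two pixels on each side


def tap_weights(t, b, c):
    """Weights of the 4 neighbouring source pixels for a sample that lies
    a fraction t (0 <= t < 1) past the second of them."""
    return [bc_spline(t + 1.0, b, c), bc_spline(t, b, c),
            bc_spline(1.0 - t, b, c), bc_spline(2.0 - t, b, c)]


# (B, C) pairs as listed in the post above; the labels are illustrative.
PRESETS = {
    "B-Spline 4": (1.0, 0.0),
    "Mitchell-Netravali Spline 4": (1.0 / 3.0, 1.0 / 3.0),
    "Catmull-Rom": (0.0, 0.5),
}

if __name__ == "__main__":
    for name, (b, c) in PRESETS.items():
        print(name, [round(w, 4) for w in tap_weights(0.5, b, c)])
```

Catmull-Rom gives the centre pixel a weight of exactly 1 at t=0 (it interpolates, hence the sharpness and the ringing from its negative lobes), while B-Spline 4 never has negative weights and always blends in the neighbours (hence the softness and noise resistance). Mitchell-Netravali sits in between.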
I'm not familiar with lags in VMR-9 r.. I'll have to test it a bit. ts1, do you have a sample and condition which always seems to get stuck with VMR-9 r.? Note that I didn't change anything timing related in the VMR-9 mixer (it doesn't allow it anyway).

XRyche
8th November 2015, 23:48
I can see that the current stats screen method could lag a bit when displaying at 60 frames per second on a 4k screen with high CPU usage.

Wouldn't that be expected, though? Sixty FPS on a 4k screen is no small feat in and of itself.

Hera
9th November 2015, 01:18
I have no problems with it. TBH I can't tell the difference between the resizers most of the time.

Well I double click on the video file, micro-pause, MPC launches while I don't move the mouse, then I press ALT-ENTER.

For better or worse, most of the issues left for me are GUI-related, like poor HiDPI support and the mouse cursor not disappearing. I also think I had the mouse cursor disappear when over the controls in full screen non-exclusive mode, which shouldn't happen.

CPU usage climbed to ~30% (~25% medium) at 3.2 GHz, which is a lot for my CPU. Then I put my computer on the high performance plan and... no lag from the stats screen. High performance: 20% to 30% CPU at 3.6 GHz, using a VP90 3840x2160 60fps video clip. Most of the CPU usage is from decoding the video, though...

ts1
9th November 2015, 13:49
VMR9 is not affected, this video played once till the end with stats enabled ~the same as on trunk mpc-hc. So lags only with evr custom + stats.
do you have a sample and condition which always seems to get stuck with VMR-9 r.?
Try disabling DXVA, setting the player to a lower priority, and running, for example, 7-Zip at high priority on Ultra with LZMA2. It always hangs for me on any video with VMR9 r.

v0lt
14th November 2015, 05:01
@JanWillem32
Do you plan to make a frame downscaling like madVR or VirtualDub (VirtualDub-1.10.4-src.7z\VirtualDub\source\f_resize.vdshaders)?
The smaller the image is scaled, the more reference points the shader uses.
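A rough Python sketch of that idea, assuming the usual approach of stretching the resize kernel by the inverse of the scale factor when shrinking. This is not taken from madVR or the VirtualDub shaders; tap_count and its radius parameter are hypothetical names for illustration.

```python
import math


def tap_count(scale, radius=2.0):
    """Source pixels sampled per output pixel along one axis.

    scale  -- output size divided by input size (0.5 = half size)
    radius -- kernel radius in pixels at 1:1 (2.0 for a 4-tap spline)
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    # When shrinking, the kernel is widened by 1/scale so that every
    # source pixel still contributes; when enlarging it stays as it is.
    stretch = max(1.0, 1.0 / scale)
    return math.ceil(2.0 * radius * stretch)


if __name__ == "__main__":
    for s in (2.0, 1.0, 0.5, 0.25):
        print(s, tap_count(s))
```

With a 4-tap kernel this gives 4 taps at 100% or larger, 8 taps at 50% and 16 taps at 25%, per axis. The weights for the extra taps would come from evaluating the kernel at (distance * scale) and renormalizing so they sum to 1.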

XRyche
26th November 2015, 22:13
I've run into a problem with the ISR and the new AMD Crimson Beta Drivers. Entire subtitle lines get skipped unless you set the sub picture buffer to 0. This is with .ass subtitles; I haven't tested with vobsub.

I found a solution to this. If I create a game profile for MPC-HC Ex EVR and set the new shader cache to off, ISR works as it should. It's strange to me though since I don't think the ISR uses any shaders. Oh well, it works so idc.