
View Full Version : Video pixel shader pack



XRyche
11th July 2013, 16:51
XRyche, the RGB to Y'CbCr conversion shader is fairly basic. I assume you mean the chroma up-sampling shaders? These will distort the chroma if the values were altered before. Sharp chroma borders such as red on black and blue on black will have the worst artifacts.
I'll see what I can write today and tomorrow. Implementing this chain as three passes of pixel shaders should be easy enough.



No, I didn't mean the chroma up-sampling shaders. I already understand now that I don't need them since I have an Nvidia GPU. I meant that I wanted you to create 2 "sharpen complex, deband, denoise and color controls for SD&HD video input" shaders: one with 2 radial sharpening levels and one with 1 radial sharpening level, similar to what you have for the Linear RGB sharpen complex, deband and denoise shaders in your Video pixel shader pack. I happen to prefer Y'CbCr to RGB and I would like to use the "RGB to Y'CbCr for SD&HD video input for floating point surfaces" shader in conjunction with the Y'CbCr "sharpen complex, deband, denoise and color controls for SD&HD video input" shaders with all my content. This includes some 1200p content which is too much for my system to use with your Y'CbCr "r=4, sharpen complex, deband, medium denoise and color controls for SD&HD video input" shader. :thanks:

edit: I don't mean to sound demanding, if that's how it sounds.

JanWillem32
13th July 2013, 22:24
XRyche, I'm still experimenting a bit with some shader stages. The shaders are not that hard to combine, but getting a combination of good performance and amiable effects isn't easy. On top of that, these are pretty much the most complex shaders I've written for processing video. I'm pretty sure I should simplify some parts, but I'll just have to try some things and hope the effect gets better.

mhourousha, I took some time to analyze what kind of effect you were trying to apply. I saw that it only increases colorfulness. The bulk of the code in your shader is the rather involved process of RGB to HSL conversion and back again. From what I could make out of the articles about the HSV and HSL color models, they are RGB representations. They suffer from the same problem as RGB models: without values encoded outside the nominal color interval, it is impossible to represent the gamut of human vision. http://en.wikipedia.org/wiki/CIE_1931_color_space
(Note that even when rendering in a color space that does accommodate the gamut of human vision, saturation isn't used. Various filtering steps may just shift color data in and out of visible and invisible areas of the color space. That just happens when rendering. When dealing with a renderer you have to design things that can handle ranges up to 100 just as easily as ranges up to 1 in most stages.)
HSV and HSL mostly seem to be convenient when doing or comparing shifts in the hue, which is a more complicated matter in RGB models. (It's not that hard if you can handle matrix transforms. I added a few shaders as an example.)
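As a rough illustration of the matrix approach, here is a CPU-side Python sketch (not HLSL) that builds the same 3x3 matrix as `hueshiftmat` in the hue shift shaders of this post. It is a rotation about the gray axis (1, 1, 1). The function names are made up for this sketch, and the white point adaptation that the shaders apply around the matrix multiply is left out.

```python
import math

def hue_shift_matrix(degrees):
    """Rotation about the gray axis (1, 1, 1), laid out like hueshiftmat."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    cosp = (1.0 - c) / 3.0            # huecosp
    sinp = math.sqrt(1.0 / 3.0) * s   # huesinp
    base = cosp + c                   # huebase (diagonal)
    dera = cosp + sinp                # huedera
    ders = cosp - sinp                # hueders
    return [[base, dera, ders],
            [ders, base, dera],
            [dera, ders, base]]

def mul_row_vector(v, m):
    """Row vector times matrix, the same convention as HLSL mul(vector, matrix)."""
    return [sum(v[i] * m[i][j] for i in range(3)) for j in range(3)]

if __name__ == "__main__":
    # A 120 degree shift cycles the three components.
    print(mul_row_vector([1.0, 0.0, 0.0], hue_shift_matrix(120)))
```

Every column of the matrix sums to one, so a gray input (equal components) comes out unchanged for any angle.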
I already had a colorfulness shader, but I didn't mind making another type for the occasion. Note that you can do a lot more complex transforms than one simple multiply in the line "s1.rgb = (s1.rgb-inptot)*colorfulness+inptot;".
I saved quite a lot of instructions on not doing the complex color space transforms. In terms of (expensive) branching, I only had to use one to prevent a division by zero.
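For reference, the arithmetic of the basic colorfulness shader can be followed on the CPU with a small Python sketch. This is only an illustration of the per-pixel math under the assumption that a pixel is three plain floats; `adjust_colorfulness` is a name invented for this sketch.

```python
def adjust_colorfulness(rgb, amount=1.5):
    """Scale the distance of each component from the component average."""
    mean_in = sum(rgb) / 3.0                  # inptot
    if mean_in == 0.0:                        # the single branch: avoid dividing by zero
        return list(rgb)
    scaled = [(c - mean_in) * amount + mean_in for c in rgb]
    mean_mid = sum(scaled) / 3.0              # intermtot
    # Rescale so the output keeps the component average of the input.
    return [c * mean_in / mean_mid for c in scaled]

if __name__ == "__main__":
    print(adjust_colorfulness([0.2, 0.4, 0.6]))
```

An amount of 1 returns the input unchanged, and values above 1 push the components away from their average.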
You specified that you wanted to cater to the 16-bit internal precision of the Nvidia NV30, NV31 and NV34 models introduced ten years ago (which were never DirectX 9.0 compliant because of the precision issue).
http://en.wikipedia.org/wiki/Machine_epsilon
The precision of the 16-bit floating point type is rather bad, its machine epsilon is 0.0009765625. The value you used was 0.0001. That value covers 0.2048 bits of maximum precision loss, assuming the interval [.5, 1).
The precision of the 24-bit floating point type is a lot better, its machine epsilon is 0.0000152587890625. The value 0.0001 covers 13.1072 bits of maximum precision loss, assuming the interval [.5, 1).
For the execution of this shader, none of that matters though: in the cases where applying epsilon was assumed useful, it actually was not. None of the cases here have to deal with precision loss due to the limited number of mantissa bits.
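The figures above can be reproduced with a few lines of Python. The sketch assumes 10 stored mantissa bits for the 16-bit type and 16 for the 24-bit type, which is what the quoted epsilon values imply, and it reads the "bits of maximum precision loss" as the number of representable steps that 0.0001 spans in [.5, 1). The helper names are invented for this sketch.

```python
def machine_epsilon(mantissa_bits):
    """Gap between 1.0 and the next representable value."""
    return 2.0 ** -mantissa_bits

def steps_in_upper_half(value, mantissa_bits):
    """How many representable steps `value` spans for numbers in [0.5, 1)."""
    step = machine_epsilon(mantissa_bits) / 2.0   # spacing is halved below 1.0
    return value / step

if __name__ == "__main__":
    for name, bits in (("16-bit", 10), ("24-bit", 16)):
        print(name, machine_epsilon(bits), steps_in_upper_half(0.0001, bits))
```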
Hardware assembly execution speed and amount of D3D asm instructions are not tied 1:1 indeed. Modern processor architectures are superscalar. Everything that can be simultaneously executed without stalling will generally be executed at the same time as other instructions. Note that the rules vary per architecture. The D3D asm instruction count is still a good indication of how fast a shader will run on average.
In regard to the fourth component on the output of a pixel shader, I can be clear on that. The performance mode uses X8R8G8B8 to store color, thus not saving the fourth component at all. The quality modes use A16B16G16R16F, A16B16G16R16 and A32B32G32R32F. These can store the fourth component. If you want to design a specific chain of multiple shaders for some effect, you are welcome to put data in the fourth component for every pixel. However, none of the renderer components themselves will ever use that channel. I advise treating the fourth component as discarded for most pixel shaders.
Other renderer stages that handle textures with valid alpha do require processing the fourth component. For blending the OSD and subtitles two special pixel shaders are used in combination with alpha blending. The renderer has always been this way in regards to this aspect, I don't expect changes either.

Sample shaders (the XYZ types can only properly function with the renderer in quality mode):
// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// very basic colorfulness control

#define colorfulness 1.5// default is one, this should not be zero

sampler s0 : register(s0);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = tex2D(s0, tex);
float inptot = dot(1/3., s1.rgb);// component average

[branch] if (inptot) {// prevent division by zero
s1.rgb = (s1.rgb-inptot)*colorfulness+inptot;
float intermtot = dot(1/3., s1.rgb);
s1.rgb *= inptot;
s1.rgb /= intermtot;}
return s1;
}



// colorfulness control for XYZ rendering

// white point in xyY, default is TV-type D65 {.3127, .3290, 1}, the most basic is E {1/3., 1/3., 1}
#define wpx .3127
#define wpy .3290// this cannot be zero

#define colorfulness 1.5// default is one, this should not be zero

sampler s0 : register(s0);
static const float wpyr = 1./wpy;
static const float wpX = wpx*wpyr;
static const float wpZ = wpyr-wpX-1.;
static const float3 wpXYZ = {wpX, 1, wpZ};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = tex2D(s0, tex);
float inptot = dot(1/3., s1.rgb);// component average

[branch] if (inptot) {// prevent division by zero
s1.rgb /= wpXYZ;// adapt white point
s1.rgb = (s1.rgb-inptot)*colorfulness+inptot;
float intermtot = dot(1/3., s1.rgb);
s1.rgb *= inptot;
s1.rgb /= intermtot;
s1.rgb *= wpXYZ;}// revert white point adaptation
return s1;
}



// hue shift for XYZ rendering

// white point in xyY, default is TV-type D65 {.3127, .3290, 1}, the most basic is E {1/3., 1/3., 1}
#define wpx .3127
#define wpy .3290// this cannot be zero

#define hue 180// in degrees, default is zero

sampler s0 : register(s0);
static const float wpyr = 1./wpy;
static const float wpX = wpx*wpyr;
static const float wpZ = wpyr-wpX-1.;
static const float3 wpXYZ = {wpX, 1, wpZ};
static const float huecos = cos(radians(hue));
static const float huesin = sin(radians(hue));
static const float huecosp = 1/3.-huecos/3.;
static const float huesinp = sqrt(1/3.)*huesin;
static const float huebase = huecosp+huecos;
static const float huedera = huecosp+huesinp;
static const float hueders = huecosp-huesinp;
static const float3x3 hueshiftmat = {
huebase, huedera, hueders,
hueders, huebase, huedera,
huedera, hueders, huebase};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = tex2D(s0, tex);
s1.rgb /= wpXYZ;// adapt white point
s1.rgb = mul(s1.rgb, hueshiftmat);
s1.rgb *= wpXYZ;// revert white point adaptation
return s1;
}



// variable hue shift for XYZ rendering

// white point in xyY, default is TV-type D65 {.3127, .3290, 1}, the most basic is E {1/3., 1/3., 1}
#define wpx .3127
#define wpy .3290// this cannot be zero

#define hue c0.w// in radians, default is zero

sampler s0 : register(s0);
float4 c0 : register(c0);
static const float wpyr = 1./wpy;
static const float wpX = wpx*wpyr;
static const float wpZ = wpyr-wpX-1.;
static const float3 wpXYZ = {wpX, 1, wpZ};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float huecos = cos(hue);
float huesin = sin(hue);
float huecosp = 1/3.-huecos/3.;
float huesinp = sqrt(1/3.)*huesin;
float huebase = huecosp+huecos;
float huedera = huecosp+huesinp;
float hueders = huecosp-huesinp;
float3x3 hueshiftmat = {
huebase, huedera, hueders,
hueders, huebase, huedera,
huedera, hueders, huebase};

float4 s1 = tex2D(s0, tex);
s1.rgb /= wpXYZ;// adapt white point
s1.rgb = mul(s1.rgb, hueshiftmat);
s1.rgb *= wpXYZ;// revert white point adaptation
return s1;
}

// colorfulness control for XYZ rendering on 16-bit integer surfaces

// white point in xyY, default is TV-type D65 {.3127, .3290, 1}, the most basic is E {1/3., 1/3., 1}
#define wpx .3127
#define wpy .3290// this cannot be zero

#define colorfulness 1.5// default is one, this should not be zero

sampler s0 : register(s0);
static const float wpyr = 1./wpy;
static const float wpX = wpx*wpyr;
static const float wpZ = wpyr-wpX-1.;
static const float3 wpXYZ = {wpX, 1, wpZ};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = tex2D(s0, tex);
s1.rgb -= 16384/65535.;// remove input interval [16384/65535, 49151/65535] offset to black point
float inptot = dot(1/3., s1.rgb);// component average

[branch] if (inptot) {// prevent division by zero
s1.rgb /= wpXYZ;// adapt white point
s1.rgb = (s1.rgb-inptot)*colorfulness+inptot;
float intermtot = dot(1/3., s1.rgb);
s1.rgb *= inptot;
s1.rgb /= intermtot;
s1.rgb *= wpXYZ;}// revert white point adaptation
s1.rgb += 16384/65535.;// re-apply black point offset
return s1;
}



// hue shift for XYZ rendering on 16-bit integer surfaces

// white point in xyY, default is TV-type D65 {.3127, .3290, 1}, the most basic is E {1/3., 1/3., 1}
#define wpx .3127
#define wpy .3290// this cannot be zero

#define hue 180// in degrees, default is zero

sampler s0 : register(s0);
static const float wpyr = 1./wpy;
static const float wpX = wpx*wpyr;
static const float wpZ = wpyr-wpX-1.;
static const float3 wpXYZ = {wpX, 1, wpZ};
static const float huecos = cos(radians(hue));
static const float huesin = sin(radians(hue));
static const float huecosp = 1/3.-huecos/3.;
static const float huesinp = sqrt(1/3.)*huesin;
static const float huebase = huecosp+huecos;
static const float huedera = huecosp+huesinp;
static const float hueders = huecosp-huesinp;
static const float3x3 hueshiftmat = {
huebase, huedera, hueders,
hueders, huebase, huedera,
huedera, hueders, huebase};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = tex2D(s0, tex);
s1.rgb -= 16384/65535.;// remove input interval [16384/65535, 49151/65535] offset to black point
s1.rgb /= wpXYZ;// adapt white point
s1.rgb = mul(s1.rgb, hueshiftmat);
s1.rgb *= wpXYZ;// revert white point adaptation
s1.rgb += 16384/65535.;// re-apply black point offset
return s1;
}



// variable hue shift for XYZ rendering on 16-bit integer surfaces

// white point in xyY, default is TV-type D65 {.3127, .3290, 1}, the most basic is E {1/3., 1/3., 1}
#define wpx .3127
#define wpy .3290// this cannot be zero

#define hue c0.w// in radians, default is zero

sampler s0 : register(s0);
float4 c0 : register(c0);
static const float wpyr = 1./wpy;
static const float wpX = wpx*wpyr;
static const float wpZ = wpyr-wpX-1.;
static const float3 wpXYZ = {wpX, 1, wpZ};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float huesin, huecos;
sincos(hue, huesin, huecos);
float huecosp = 1/3.-huecos/3.;
float huesinp = sqrt(1/3.)*huesin;
float huebase = huecosp+huecos;
float huedera = huecosp+huesinp;
float hueders = huecosp-huesinp;
float3x3 hueshiftmat = {
huebase, huedera, hueders,
hueders, huebase, huedera,
huedera, hueders, huebase};

float4 s1 = tex2D(s0, tex);
s1.rgb -= 16384/65535.;// remove input interval [16384/65535, 49151/65535] offset to black point
s1.rgb /= wpXYZ;// adapt white point
s1.rgb = mul(s1.rgb, hueshiftmat);
s1.rgb *= wpXYZ;// revert white point adaptation
s1.rgb += 16384/65535.;// re-apply black point offset
return s1;
}

detmek
14th July 2013, 10:05
I am using one or two shaders for small corrections during playback, usually LumaSharpen or Sharpen Complex, and Vibrance for anime. But sometimes a video has banding, usually my old encodes that were denoised with FFT3DFilter.

Is there a pure deband shader that I can use to replace FFDShow Deband filter?

JanWillem32
16th July 2013, 18:58
The pure debanding shaders I tried were terrible. For the limit-based types, the worst cases were borders (even some pretty sharp ones). If the transition pixels on borders get blurred without compensating for the borders, aliasing occurs. For the gradual types, the main problem was blurring of pretty much everything.
So, I tried to use a gradual type, limit it, and additionally adapt a typical unsharp mask sharpening effect to prevent some of the visible artifacts. For these adaptive shaders, the sharpening can be made stronger to accentuate contrast (without making banding and noise worse like other sharpen effects). I personally don't care about the sharpening effect, as I don't like seeing the sharpening halos at all. Though I know that if I set it too low or off, the other artifacts will be visible.
Every shader I wrote claiming to be able to deband and denoise has a notice on how to disable the sharpening effect. (Though I'm not sure how effective r=1 and r=2 types can be at debanding and denoising.) For the larger types, the sharpen factor can even be set for every layer.
In the meantime, I wrote a multi-pass sharpen, deband, denoise and color controls filter that should be better in terms of performance, while still doing a reasonably good job at debanding larger areas (the most costly part, as it requires a lot of pixels). If wanted, I can share it, but it's not quite finished yet.

PetitDragon
17th July 2013, 00:37
.... If wanted, I can share it, but it's not quite finished yet.

Yes please. We need a new shader pack for XYZ rendering.
:script::thanks:

turbojet
17th July 2013, 20:24
F3kdb has set my bar really high for debanding but I'm always interested in trying new methods. Will it be separate from sharpener, denoising and color controls?

JanWillem32, would you be interested in writing an f3kdb shader? The developer of the avisynth plugin mentioned this a few months ago: http://forum.doom9.org/showthread.php?p=1621484#post1621484

jerrymh
23rd July 2013, 21:32
Where can I find a shader to put scanlines or a CRT grille on screen?

detmek
23rd July 2013, 22:32
Sure. I will be interested to try it. It's easier to use a shader than loading the FFDShow RAW filter.

JanWillem32
25th July 2013, 02:34
About the f3kdb shader, I'm quite interested. There's a limit to what I can do, though. Shaders work very differently compared to many other graphics filters. In most cases, translating to shaders is rather hard. I'll just have to try and be creative. It may just as well be easier than the combination effect shader chain I'm trying to write for XRyche. Where do I start?

jerrymh, that kind of effect is easy. I made this effect two-pass. If you want something more specific (such as a better quality lowpass), I can change a few parts. Depending on the input/output resolution ratio, you might need to blur a bit more or less.
// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// horizontal blur

sampler s0 : register(s0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
return (tex2D(s0, tex+float2(-3.*c1, 0))+tex2D(s0, tex+float2(-2.*c1, 0))+tex2D(s0, tex+float2(-c1, 0))+tex2D(s0, tex)+tex2D(s0, tex+float2(c1, 0))+tex2D(s0, tex+float2(2.*c1, 0))+tex2D(s0, tex+float2(3.*c1, 0)))/7.;// blur and output
}



// old CRT scan lines

#define scanlines 480// 480 for NTSC, 576 for PAL/SECAM, fractions, either decimal or not are allowed
#define gamma 1// higher is brighter, fractions, either decimal or not are allowed

sampler s0 : register(s0);
float2 c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = (tex2D(s0, tex+float2(0, -3.*c1.y))+tex2D(s0, tex+float2(0, -2.*c1.y))+tex2D(s0, tex+float2(0, -c1.y))+tex2D(s0, tex)+tex2D(s0, tex+float2(0, c1.y))+tex2D(s0, tex+float2(0, 2.*c1.y))+tex2D(s0, tex+float2(0, 3.*c1.y)))/7.;// blur input

float br = 1.-pow(abs(frac(abs(tex.y*scanlines-.5*scanlines))*2.-1.), gamma);// generate scan lines
return s1*br;// modulate brightness and output
}



// old CRT scan lines for XYZ rendering on 16-bit integer surfaces

#define scanlines 480// 480 for NTSC, 576 for PAL/SECAM, fractions, either decimal or not are allowed
#define gamma 1// higher is brighter, fractions, either decimal or not are allowed

sampler s0 : register(s0);
float2 c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = (tex2D(s0, tex+float2(0, -3.*c1.y))+tex2D(s0, tex+float2(0, -2.*c1.y))+tex2D(s0, tex+float2(0, -c1.y))+tex2D(s0, tex)+tex2D(s0, tex+float2(0, c1.y))+tex2D(s0, tex+float2(0, 2.*c1.y))+tex2D(s0, tex+float2(0, 3.*c1.y)))/7.;// blur input

float br = 1.-pow(abs(frac(abs(tex.y*scanlines-.5*scanlines))*2.-1.), gamma);// generate scan lines
return (s1-16384/65535.)*br+16384/65535.;// modulate brightness and output
}
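The brightness term `br` in the two scan line shaders above is a triangle wave over the vertical coordinate. A small Python sketch of that one expression, with `tex.y` assumed to run from 0 to 1 over the picture height (`scanline_brightness` is a name invented here):

```python
import math

def scanline_brightness(y, scanlines=480, gamma=1.0):
    """Mirror of: 1.-pow(abs(frac(abs(y*scanlines-.5*scanlines))*2.-1.), gamma)"""
    phase = abs(y * scanlines - 0.5 * scanlines)
    frac = phase - math.floor(phase)          # HLSL frac()
    return 1.0 - abs(frac * 2.0 - 1.0) ** gamma

if __name__ == "__main__":
    # Sample one scan line period around the vertical center.
    for k in range(5):
        y = 0.5 + (k / 4.0) / 480
        print(round(y, 6), scanline_brightness(y))
```

The value is 0 at the dark boundary between two lines and 1 at the line center. A higher `gamma` widens the bright part of each line, which matches the "higher is brighter" comment.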

turbojet
25th July 2013, 06:10
About the f3kdb shader, I'm quite interested. There's a limit to what I can do, though. Shaders work very differently compared to many other graphics filters. In most cases, translating to shaders is rather hard. I'll just have to try and be creative. It may just as well be easier than the combination effect shader chain I'm trying to write for XRyche. Where do I start?


Maybe try messaging SAPikachu, the developer of f3kdb.dll; he was open to helping in the message linked earlier. I have almost no programming experience, so I couldn't be of help.

XRyche
27th July 2013, 04:47
JanWillem32, first off, thank you for working on the hybrid shaders I've requested. I doubt I would be able to find anyone else so willing to do that. Second, I have been doing some experimenting with different methods for cleaning up the image quality on a lot of my old XVID/DIVX/AVI files and it seems that they benefit more from deblocking (ffdshow's raw filter mplayer deblocking) than denoising. I suppose since most of the files are old VHS to AVI TV rips and old TV card rips that makes sense. Anyway, would you be open to possibly doing an adjustable deblocking shader? I've read that ATI used to (I don't know if they still do) use its shader core to do deblocking, so I assume (whether correctly or not) it's possible to do it with an HLSL script. I don't have a clue what the math involved would be like, so if you can't, it's understandable.

turbojet
27th July 2013, 06:34
MPEG4 ASP is notorious for banding (which could be mistaken for blocks on flat surfaces and faces). Are you sure it's not banding?

f3kdb is a must for me with ASP, much less so for any decent AVC encode or MPEG2. Have you tried it through ffdshow's avisynth interface? Make sure to use setmemorymax(128 or more) to stop the leakage.

jerrymh
28th July 2013, 05:09
About the f3kdb shader, I'm quite interested. There's a limit to what I can do, though. Shaders work very differently compared to many other graphics filters. In most cases, translating to shaders is rather hard. I'll just have to try and be creative. It may just as well be easier than the combination effect shader chain I'm trying to write for XRyche. Where do I start?

jerrymh, that kind of effect is easy. I made this effect two-pass. If you want something more specific (such as a better quality lowpass), I can change a few parts. Depending on the input/output resolution ratio, you might need to blur a bit more or less.

Thank you very much, :thanks::thanks:

How about an aperture grille like this one in Final Burn Alpha? It feels like a real old CRT, or an LG plasma.


http://img10.imageshack.us/img10/4519/6qhx.jpg

Or scanlines at 95%?

XRyche
28th July 2013, 08:19
@turbojet...Yes, there is some banding, but some of JanWillem32's shaders plus his modified EVR-CP already help with that (and madVR all but eliminates banding). The blocking is still there without using ffdshow's raw filter, though. Not that the raw filter is bad, I just would like to eliminate it from my playback chain. If JanWillem32 can kindly make a deblocking shader that does as good a job as the raw filter, or better, I would much rather use that.

Most of my problem video files are from old VHS recordings of TV shows converted to .avi files, as well as some TIVO-type files and early PC TV card recordings, so blocking is kind of a given, as well as massive banding ;) . Considering that madVR doesn't do deblocking (it actually accentuates the blocking on some of my files), using madVR for these is sort of a no-no without the raw filter or a shader script (one that madVR will not neuter because of gamma manipulation or such).

JanWillem32
28th July 2013, 14:17
Deblocking is mostly decoder territory. Many video codecs don't use the typical macroblocks at all. For those that do, you need the general blocking info for the luma, chroma and interlacing to deal with it. For h.264 (and some newer codecs) organized (de)blocking is mandatory for both encoder and decoder. The custom shader stages of the video renderer are a bit late in the rendering chain to properly work on blocking and such. I'm not sure if I can write a normal shader that can help with deblocking.

jerrymh, that picture mostly shows hand-drawn pixel art. No decent video will convert nicely to high contrast, low quantization images like that. I can approximate the effect by combining a few techniques, but note that posterization is a really messy effect (even in common 8-bit video and worst of all, it's everywhere).
It's a two-pass shader chain again. The warning for "should be divisible by 4" isn't too strict, the few artifacts are hard to see. For common resolutions such as 720- and 1080-line systems I can also adapt special shaders to compensate for this issue. I can also try to boost some of the contrast or colorfulness before posterization as well, but I didn't see much improvement with those effects enabled on the samples I used.
// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// horizontal 4-pixel averaging
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 4

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float pos = (trunc(tex.x*c0*.25)+.125)*c1*4.;// calculate the left position of the current set of pixels
return (tex2D(s0, float2(pos, tex.y))+tex2D(s0, float2(pos+c1, tex.y))+tex2D(s0, float2(pos+2*c1, tex.y))+tex2D(s0, float2(pos+3*c1, tex.y)))*.25;// blur and output
}
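The position calculation in the horizontal averaging shader above can be checked on the CPU. This Python sketch assumes that `c0` holds the input width in pixels and `c1` its reciprocal, which is how the shader uses them. `group_left_position` is a name invented here.

```python
import math

def group_left_position(tex_x, width, group=4):
    """Texture coordinate of the center of the leftmost pixel in the current group."""
    pixel_width = 1.0 / width                       # c1 (assumed)
    index = math.trunc(tex_x * width / group)       # which group of pixels we are in
    return (index + 0.5 / group) * pixel_width * group

if __name__ == "__main__":
    # 8 pixels wide: pixels 4..7 all map to the center of pixel 4.
    for px in range(8):
        print(px, group_left_position((px + 0.5) / 8, 8))
```

With `group=4` the offset `0.5 / group` equals the `.125` constant in the shader, and with `group=5` it equals the `.1` used by the 5-pixel variant.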



// vertical 4-pixel averaging, dithering, posterizing and old CRT scan lines
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 4

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.03125, .03125, .25);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.125)*c1.y*4.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y)))*.25;// blur input
#if posterizedegamma
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*br;// modulate brightness and output
#else
s1 = round(s1*quantize+dithers);// dither and posterize
return s1*(br*quantizer);// shrink interval back to normal after posterization, modulate brightness and output
#endif
}
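The posterization step with `posterizedegamma` enabled can be sketched per component in Python. Two things here are assumptions of this sketch and not taken from the shader: rounding is done as floor(x + 0.5), and `abs()` is applied before the square root so that negative inputs stay defined on the CPU. The `dither` argument stands in for one entry of `smalldithermap`.

```python
import math

def posterize(value, bits=4, dither=0.0):
    """Quantize in a square-root ("dirty de-gamma") domain, then square back."""
    levels = 2 ** bits - 1                       # 'quantize' in the shader
    sign = (value > 0) - (value < 0)             # 'signbits'
    root = math.sqrt(abs(value))                 # abs() is an assumption of this sketch
    step = math.floor(root * levels + dither + 0.5)
    return sign * (step / levels) ** 2

if __name__ == "__main__":
    for v in (0.0, 0.1, 0.25, 0.5, 1.0):
        print(v, posterize(v))
```

With the default of 4 bits an input range of 0 to 1 collapses to 16 output levels, spaced more densely near black than near white.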



// vertical 4-pixel averaging, dithering, posterizing and old CRT scan lines for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 4

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.03125, .03125, .25);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.125)*c1.y*4.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y)))*.25;// blur input
#if posterizedegamma
s1 = s1*65535/32767.-16384/32767.;
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*(br*32767/65535.)+16384/65535.;// modulate brightness and output
#else
s1 = round((s1*65535/32767.-16384/32767.)*quantize+dithers);// dither and posterize
return s1*(br*quantizer*32767/65535.)+16384/65535.;// shrink interval back to normal after posterization, modulate brightness and output
#endif
}



// horizontal 5-pixel averaging
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 5

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float pos = (trunc(tex.x*c0*.2)+.1)*c1*5.;// calculate the left position of the current set of pixels
return (tex2D(s0, float2(pos, tex.y))+tex2D(s0, float2(pos+c1, tex.y))+tex2D(s0, float2(pos+2*c1, tex.y))+tex2D(s0, float2(pos+3*c1, tex.y))+tex2D(s0, float2(pos+4*c1, tex.y)))*.2;// blur and output
}



// vertical 5-pixel averaging, dithering, posterizing and old CRT scan lines
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 5

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.025, .025, .2);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.1)*c1.y*5.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y)))*.2;// blur input
#if posterizedegamma
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*br;// modulate brightness and output
#else
s1 = round(s1*quantize+dithers);// dither and posterize
return s1*(br*quantizer);// shrink interval back to normal after posterization, modulate brightness and output
#endif
}



// vertical 5-pixel averaging, dithering, posterizing and old CRT scan lines for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 5

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.025, .025, .2);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.1)*c1.y*5.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y)))*.2;// blur input
#if posterizedegamma
s1 = s1*65535/32767.-16384/32767.;
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*(br*32767/65535.)+16384/65535.;// modulate brightness and output
#else
s1 = round((s1*65535/32767.-16384/32767.)*quantize+dithers);// dither and posterize
return s1*(br*quantizer*32767/65535.)+16384/65535.;// shrink interval back to normal after posterization, modulate brightness and output
#endif
}
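The "XYZ rendering on 16-bit integer surfaces" variants differ from the plain ones only in a pair of linear mappings wrapped around the posterization step. Reading the constants off the code (this is an interpretation of the listing, not something stated in the post), a stored surface value v in [0, 1] is expanded to a nominal value x before quantization and compressed again on output:

```latex
x = \frac{65535\,v - 16384}{32767},
\qquad
v = \frac{32767\,x + 16384}{65535}
% v = 16384/65535 maps to x = 0 and v = 49151/65535 maps to x = 1;
% the full stored range v \in [0, 1] covers x \in [-16384/32767,\ 49151/32767] \approx [-0.5,\ 1.5]
```

In the listings the scan line brightness `br` multiplies x before the second mapping is applied, so the stored offset of 16384/65535 is left unscaled.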

jerrymh
28th July 2013, 22:35
Deblocking is mostly decoder territory. Many video codecs don't use the typical macroblocks at all. For those that do, you need the general blocking info for the luma, chroma and interlacing to deal with it. For h.264 (and some newer codecs) organized (de)blocking is mandatory for both encoder and decoder. The custom shader stages of the video renderer are a bit late in the rendering chain to properly work on blocking and such. I'm not sure if I can write a normal shader that can help with deblocking.

jerrymh, that picture mostly shows hand-drawn pixel art. No decent video will convert nicely to high-contrast, low-quantization images like that. I can approximate the effect by combining a few techniques, but note that posterization is a really messy effect (even in common 8-bit video, and worst of all, it's everywhere).
It's a two-pass shader chain again. The "should be divisible by 4" warning isn't too strict; the few artifacts are hard to see. For common resolutions such as 720- and 1080-line systems I can also adapt special shaders to compensate for this issue. I could also boost some of the contrast or colorfulness before posterization, but I didn't see much improvement with those effects enabled on the samples I used.

// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// horizontal 4-pixel averaging
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 4

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float pos = (trunc(tex.x*c0*.25)+.125)*c1*4.;// calculate the left position of the current set of pixels
return (tex2D(s0, float2(pos, tex.y))+tex2D(s0, float2(pos+c1, tex.y))+tex2D(s0, float2(pos+2*c1, tex.y))+tex2D(s0, float2(pos+3*c1, tex.y)))*.25;// blur and output
}



// vertical 4-pixel averaging, dithering, posterizing and old CRT scan lines
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 4

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.0625, .0625, .25);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z-.5), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.125)*c1.y*4.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y)))*.25;// blur input
#if posterizedegamma
s1 = pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*br;// modulate brightness and output
#else
s1 = round(s1*quantize+dithers);// dither and posterize
return s1*(br*quantizer);// shrink interval back to normal after posterization, modulate brightness and output
#endif
}



// vertical 4-pixel averaging, dithering, posterizing and old CRT scan lines for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 4

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.0625, .0625, .25);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z-.5), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.125)*c1.y*4.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y)))*.25;// blur input
#if posterizedegamma
s1 = pow(round(sqrt(s1*65535/32767.-16384/32767.)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*(br*32767/65535.)+16384/65535.;// modulate brightness and output
#else
s1 = round((s1*65535/32767.-16384/32767.)*quantize+dithers);// dither and posterize
return s1*(br*quantizer*32767/65535.)+16384/65535.;// shrink interval back to normal after posterization, modulate brightness and output
#endif
}


Maybe you could try to draw only the mask grille, not the other effects (just draw a mask in front of the video).

Anyway, I found the source code for the mask shader, but I'm :( when it comes to code.

https://github.com/libretro/common-shaders/blob/master/crt/crt-geom-flat.cg

I also found these variants of the shader:
http://emulation-general.wikia.com/wiki/CRT_Geom

The image should look like this:

http://images3.wikia.nocookie.net/__cb20130723004104/emulation-general/images/thumb/4/4c/Retroarch_2013-07-22_17-21-17-60.png/1000px-Retroarch_2013-07-22_17-21-17-60.png
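
For reference, a mask-only pass of the kind asked for here might look roughly like the sketch below. It is not taken from crt-geom-flat.cg; it simply dims two of the three colour channels in every pixel column, cycling red, green and blue, and leaves everything else untouched. The `maskdarken` parameter is a hypothetical name, and `c0` is assumed to carry the input width and height in pixels, the way the shaders in this thread use it.

```hlsl
// aperture grille style mask only, no blur, posterization or scan lines
// a rough sketch under the assumptions named above, not a port of the CRT-Geom shader

#define maskdarken .7// hypothetical parameter: multiplier for the two channels that are dimmed in each column, 1 disables the mask

sampler s0 : register(s0);
float2 c0 : register(c0);// assumed: input width and height in pixels

float4 main(float2 tex : TEXCOORD0) : COLOR
{
	float column = floor(tex.x*c0.x);// integer index of the current pixel column
	float phase = column-3.*floor((column+.5)/3.);// column modulo 3, the +.5 keeps the floor away from rounding edges

	float3 mask = maskdarken;// start with all three channels dimmed
	if(phase < .5) mask.r = 1.;// columns 0, 3, 6, ... keep red at full level
	else if(phase < 1.5) mask.g = 1.;// columns 1, 4, 7, ... keep green
	else mask.b = 1.;// columns 2, 5, 8, ... keep blue

	return tex2D(s0, tex)*float4(mask, 1.);// apply the mask, leave alpha alone
}
```

Because each column keeps only one channel at full level, the picture gets darker overall; raising `maskdarken` towards 1 trades mask visibility for brightness.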

JanWillem32
29th July 2013, 00:27
The host renderer for the shaders your link points to is organized very differently from the one used for the shaders over here.
The first shaders I posted actually only blur and apply the scan line effect, and the results are not stellar. The second version also properly degrades the picture to low resolution and low quantization. It won't come close to pixel art like in that picture, but it will do a reasonable job on most typical video sources.
The default quantization in the shader is rather high compared to that picture. If it's a 256-color mode, try quantizationbits at 8/3. with posterizedegamma 0 (and probably a different gamma for the scan line effect) for the renderer in 8-bit mode, or quantizationbits at 17/6. with posterizedegamma 1 in quality mode. (Quality mode wastes a few percent at the top of the usual [0, 1] interval for two of the three channels.)
Note that I edited my previous post to fix a few bugs in the code with dithering and negative inputs.
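
As a sketch of how the suggested settings would read in the parameter block of the vertical shaders (only the two affected defines are shown; the layout with a commented-out alternative is an illustration, not part of the posted shaders):

```hlsl
// renderer in 8-bit mode, 256-colour look
#define posterizedegamma 0
#define quantizationbits 8/3.// pow(2, 8/3.) is about 6.35 levels per channel, and three channels at 8/3 bits each add up to 8 bits, or 256 colours

// renderer in quality mode: use this pair instead of the one above
//#define posterizedegamma 1
//#define quantizationbits 17/6.
```

The fractional values work because `quantize` is computed as `pow(2, quantizationbits)-1`, which accepts any positive amount rather than only whole bit counts.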

turbojet
29th July 2013, 07:20
XRyche: can you post a short clip?

JanWillem32
29th July 2013, 08:36
Here are some extra shaders for larger pixels. I also edited the previous post because the forum has a maximum text length limit.

// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// horizontal 8-pixel averaging
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 8

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float pos = (trunc(tex.x*c0*.125)+.0625)*c1*8.;// calculate the left position of the current set of pixels
return (tex2D(s0, float2(pos, tex.y))+tex2D(s0, float2(pos+c1, tex.y))+tex2D(s0, float2(pos+2*c1, tex.y))+tex2D(s0, float2(pos+3*c1, tex.y))+tex2D(s0, float2(pos+4*c1, tex.y))+tex2D(s0, float2(pos+5*c1, tex.y))+tex2D(s0, float2(pos+6*c1, tex.y))+tex2D(s0, float2(pos+7*c1, tex.y)))*.125;// blur and output
}



// vertical 8-pixel averaging, dithering, posterizing and old CRT scan lines
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 8

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.015625, .015625, .125);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.0625)*c1.y*8.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y)))*.125;// blur input
#if posterizedegamma
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*br;// modulate brightness and output
#else
s1 = round(s1*quantize+dithers);// dither and posterize
return s1*(br*quantizer);// shrink interval back to normal after posterization, modulate brightness and output
#endif
}



// vertical 8-pixel averaging, dithering, posterizing and old CRT scan lines for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 8

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.015625, .015625, .125);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.0625)*c1.y*8.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y)))*.125;// blur input
#if posterizedegamma
s1 = s1*65535/32767.-16384/32767.;
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*(br*32767/65535.)+16384/65535.;// modulate brightness and output
#else
s1 = round((s1*65535/32767.-16384/32767.)*quantize+dithers);// dither and posterize
return s1*(br*quantizer*32767/65535.)+16384/65535.;// shrink interval back to normal after posterization, modulate brightness and output
#endif
}



// horizontal 10-pixel averaging
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 10

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float pos = (trunc(tex.x*c0*.1)+.05)*c1*10.;// calculate the left position of the current set of pixels
return (tex2D(s0, float2(pos, tex.y))+tex2D(s0, float2(pos+c1, tex.y))+tex2D(s0, float2(pos+2*c1, tex.y))+tex2D(s0, float2(pos+3*c1, tex.y))+tex2D(s0, float2(pos+4*c1, tex.y))+tex2D(s0, float2(pos+5*c1, tex.y))+tex2D(s0, float2(pos+6*c1, tex.y))+tex2D(s0, float2(pos+7*c1, tex.y))+tex2D(s0, float2(pos+8*c1, tex.y))+tex2D(s0, float2(pos+9*c1, tex.y)))*.1;// blur and output
}



// vertical 10-pixel averaging, dithering, posterizing and old CRT scan lines
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 10

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.0125, .0125, .1);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.05)*c1.y*10.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y))+tex2D(s0, float2(tex.x, pos+8*c1.y))+tex2D(s0, float2(tex.x, pos+9*c1.y)))*.1;// blur input
#if posterizedegamma
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*br;// modulate brightness and output
#else
s1 = round(s1*quantize+dithers);// dither and posterize
return s1*(br*quantizer);// shrink interval back to normal after posterization, modulate brightness and output
#endif
}



// vertical 10-pixel averaging, dithering, posterizing and old CRT scan lines for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 10

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.0125, .0125, .1);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.05)*c1.y*10.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y))+tex2D(s0, float2(tex.x, pos+8*c1.y))+tex2D(s0, float2(tex.x, pos+9*c1.y)))*.1;// blur input
#if posterizedegamma
s1 = s1*65535/32767.-16384/32767.;
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(s1)*quantize+dithers)*quantizer, 2);// dither and posterize
return s1*(br*32767/65535.)+16384/65535.;// modulate brightness and output
#else
s1 = round((s1*65535/32767.-16384/32767.)*quantize+dithers);// dither and posterize
return s1*(br*quantizer*32767/65535.)+16384/65535.;// shrink interval back to normal after posterization, modulate brightness and output
#endif
}
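
The 4-, 5-, 8- and 10-pixel horizontal passes all follow one pattern: find the centre of the leftmost pixel of the current block, then average that many neighbouring samples. A minimal sketch of that pattern with the block width as a single define could look like this. The name `blocksize` is hypothetical, and the sketch assumes the compiler unrolls the constant-bound loop, which it needs to do for the older shader profiles.

```hlsl
// horizontal N-pixel averaging, generalized from the fixed-size passes above
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 'blocksize'

#define blocksize 6// hypothetical parameter: block width in pixels, an integer of 2 or more

sampler s0 : register(s0);
float c0 : register(c0);// input width in pixels, as in the passes above
float c1 : register(c1);// width of one pixel in texture coordinates, as in the passes above

float4 main(float2 tex : TEXCOORD0) : COLOR
{
	float pos = (trunc(tex.x*c0/blocksize)+.5/blocksize)*c1*blocksize;// calculate the left position of the current set of pixels
	float4 sum = 0;
	for(int i = 0; i < blocksize; ++i) sum += tex2D(s0, float2(pos+i*c1, tex.y));// accumulate one sample per pixel in the block
	return sum/blocksize;// blur and output
}
```

With `blocksize` set to 4 the position expression reduces to `(trunc(tex.x*c0*.25)+.125)*c1*4.`, the same as in the 4-pixel pass.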

fagoatse
29th July 2013, 09:34
The shaders jerrymh posted are meant to be used with emulators (RetroArch/Libretro in this case) and, as far as I know, they are tailored for a specific resolution. RetroArch supports up to 8 passes, and you can build https://github.com/libretro/libretro-ffmpeg if you wish to test them in a video playback scenario.

jerrymh
4th August 2013, 07:11
Here are some extra shaders for larger pixels. I also edited the previous post because the forum has a maximum text length limit.

// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// horizontal 8-pixel averaging
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 8

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float pos = (trunc(tex.x*c0*.125)+.0625)*c1*8.;// calculate the left position of the current set of pixels
return (tex2D(s0, float2(pos, tex.y))+tex2D(s0, float2(pos+c1, tex.y))+tex2D(s0, float2(pos+2*c1, tex.y))+tex2D(s0, float2(pos+3*c1, tex.y))+tex2D(s0, float2(pos+4*c1, tex.y))+tex2D(s0, float2(pos+5*c1, tex.y))+tex2D(s0, float2(pos+6*c1, tex.y))+tex2D(s0, float2(pos+7*c1, tex.y)))*.125;// blur and output
}



// vertical 8-pixel averaging, dithering, posterizing and old CRT scan lines
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 8

#define gamma 1// higher is brighter, fractions, either decimal or not are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.015625, .015625, .125);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.0625)*c1.y*8.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y)))*.125;// blur input
#if posterizedegamma
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(abs(s1))*quantize+dithers)*quantizer, 2);// dither and posterize, abs() keeps sqrt() defined for negative inputs
return s1*br;// modulate brightness and output
#else
s1 = round(s1*quantize+dithers);// dither and posterize
return s1*(br*quantizer);// shrink interval back to normal after posterization, modulate brightness and output
#endif
}
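
The posterizedegamma branch above quantizes in a square-root domain and squares the result again. A rough Python sketch of that idea follows. The function name is hypothetical, taking the absolute value before the square root is an assumption made so that negative inputs keep their sign, and round-half-up stands in for the GPU's round().

```python
import math

def posterize(v, bits=4, dither=0.0):
    """Sign-preserving posterization in a square-root ("dirty de-gamma") domain."""
    quantize = 2 ** bits - 1
    # abs() before sqrt is an assumption here, so negative inputs keep their sign
    level = math.floor(math.sqrt(abs(v)) * quantize + dither + 0.5)  # round half up (assumed)
    return math.copysign((level / quantize) ** 2, v)

# count the distinct output levels for inputs spread over [0, 1]
levels = sorted({posterize(i / 1000) for i in range(1001)})
print(len(levels))
print(posterize(-0.36))
```

Because the levels are spaced evenly in the square-root domain, they end up denser near black after squaring, which is what keeps dark gradients from collapsing into a few steps.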



// vertical 8-pixel averaging, dithering, posterizing and old CRT scan lines for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 8

#define gamma 1// higher is brighter, fractions, either decimal or not, are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.015625, .015625, .125);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.0625)*c1.y*8.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y)))*.125;// blur input
#if posterizedegamma
s1 = s1*65535/32767.-16384/32767.;
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(abs(s1))*quantize+dithers)*quantizer, 2);// dither and posterize, abs() keeps sqrt() defined for negative inputs
return s1*(br*32767/65535.)+16384/65535.;// modulate brightness and output
#else
s1 = round((s1*65535/32767.-16384/32767.)*quantize+dithers);// dither and posterize
return s1*(br*quantizer*32767/65535.)+16384/65535.;// shrink interval back to normal after posterization, modulate brightness and output
#endif
}
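
The "16-bit integer surfaces" variants differ from the plain ones only in the offset encoding applied on input and output. A small Python sketch of that mapping, using the constants from the listing above (the function names encode and decode are made up for illustration):

```python
SCALE = 32767 / 65535    # nominal [0, 1] is squeezed into roughly half the surface range
OFFSET = 16384 / 65535   # nominal 0 is stored at roughly a quarter of the surface range

def encode(x):
    """Nominal value -> value stored on the integer surface (as a [0, 1] float)."""
    return x * SCALE + OFFSET

def decode(v):
    """Stored surface value -> nominal value, as in s1*65535/32767.-16384/32767."""
    return v * 65535 / 32767 - 16384 / 32767

# the surface limits 0 and 1 decode to values just outside [-0.5, 1.5]
print(decode(0.0), decode(1.0))
```

The headroom on both sides is what lets an unsigned integer surface carry values below 0 and above 1.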



// horizontal 10-pixel averaging
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 10

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float pos = (trunc(tex.x*c0*.1)+.05)*c1*10.;// calculate the left position of the current set of pixels
return (tex2D(s0, float2(pos, tex.y))+tex2D(s0, float2(pos+c1, tex.y))+tex2D(s0, float2(pos+2*c1, tex.y))+tex2D(s0, float2(pos+3*c1, tex.y))+tex2D(s0, float2(pos+4*c1, tex.y))+tex2D(s0, float2(pos+5*c1, tex.y))+tex2D(s0, float2(pos+6*c1, tex.y))+tex2D(s0, float2(pos+7*c1, tex.y))+tex2D(s0, float2(pos+8*c1, tex.y))+tex2D(s0, float2(pos+9*c1, tex.y)))*.1;// blur and output
}



// vertical 10-pixel averaging, dithering, posterizing and old CRT scan lines
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 10

#define gamma 1// higher is brighter, fractions, either decimal or not, are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.0125, .0125, .1);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.05)*c1.y*10.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y))+tex2D(s0, float2(tex.x, pos+8*c1.y))+tex2D(s0, float2(tex.x, pos+9*c1.y)))*.1;// blur input
#if posterizedegamma
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(abs(s1))*quantize+dithers)*quantizer, 2);// dither and posterize, abs() keeps sqrt() defined for negative inputs
return s1*br;// modulate brightness and output
#else
s1 = round(s1*quantize+dithers);// dither and posterize
return s1*(br*quantizer);// shrink interval back to normal after posterization, modulate brightness and output
#endif
}
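
The scan line term br in the shader above depends only on the row's position inside its block. A Python sketch of the brightness profile for one 10-row block with the default settings, assuming the shader is evaluated at pixel centers (the function name is hypothetical):

```python
def scanline_brightness(row, n=10, darken=0.5, gamma=1.0):
    """Mirror of br = 1 - pow(abs(frac*2*darken - darken), gamma) for one row."""
    f = ((row % n) + 0.5) / n  # frac(basepos.z) at the pixel center (assumed sampling point)
    return 1.0 - abs(f * 2.0 * darken - darken) ** gamma

# brightness of each of the 10 rows in a block
profile = [round(scanline_brightness(r), 3) for r in range(10)]
print(profile)
```

The profile is symmetric, brightest in the middle rows and darkest at the block edges. Since the base of the power is below 1, a larger gamma shrinks the subtracted term, which matches the "higher is brighter" note on the define.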



// vertical 10-pixel averaging, dithering, posterizing and old CRT scan lines for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a vertical resolution that is evenly divisible by 10

#define gamma 1// higher is brighter, fractions, either decimal or not, are allowed
#define scanlinebasedarken .5// the default of .5 will darken outer pixels a bit on each set of vertical pixels to appear like old CRT scan lines, higher values will narrow the scan line beam
#define posterizedegamma 1// 0 or 1, apply dirty de-gamma for posterization, useful to preserve realistic gradients in low gamma modes
#define quantizationbits 4// posterization level, note that 'quantize' can actually take any amount, not just those based on powers of two

sampler s0 : register(s0);
float2 c0 : register(c0);
float2 c1 : register(c1);
static const float quantize = pow(2, quantizationbits)-1;
static const float quantizer = 1./quantize;
static const float qm = .0078125*quantizer;
static const float smalldithermap[8][8] = {
{-63*qm, qm, -47*qm, 17*qm, -59*qm, 5*qm, -43*qm, 21*qm},
{33*qm, -31*qm, 49*qm, -15*qm, 37*qm, -27*qm, 53*qm, -11*qm},
{-39*qm, 25*qm, -55*qm, 9*qm, -35*qm, 29*qm, -51*qm, 13*qm},
{57*qm, -7*qm, 41*qm, -23*qm, 61*qm, -3*qm, 45*qm, -19*qm},
{-57*qm, 7*qm, -41*qm, 23*qm, -61*qm, 3*qm, -45*qm, 19*qm},
{39*qm, -25*qm, 55*qm, -9*qm, 35*qm, -29*qm, 51*qm, -13*qm},
{-33*qm, 31*qm, -49*qm, 15*qm, -37*qm, 27*qm, -53*qm, 11*qm},
{63*qm, -qm, 47*qm, -17*qm, 59*qm, -5*qm, 43*qm, -21*qm}};

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 basepos = tex.xyy*c0.xyy*float3(.0125, .0125, .1);
float3 basefrac = frac(basepos);
float2 lookups = basefrac.xy*8.;
float dithers = smalldithermap[lookups.x][lookups.y];

float br = 1.-pow(abs(basefrac.z*2.*scanlinebasedarken-scanlinebasedarken), gamma);// generate scan lines

float pos = (basepos.z-basefrac.z+.05)*c1.y*10.;// calculate the top position of the current set of pixels
float4 s1 = (tex2D(s0, float2(tex.x, pos))+tex2D(s0, float2(tex.x, pos+c1.y))+tex2D(s0, float2(tex.x, pos+2*c1.y))+tex2D(s0, float2(tex.x, pos+3*c1.y))+tex2D(s0, float2(tex.x, pos+4*c1.y))+tex2D(s0, float2(tex.x, pos+5*c1.y))+tex2D(s0, float2(tex.x, pos+6*c1.y))+tex2D(s0, float2(tex.x, pos+7*c1.y))+tex2D(s0, float2(tex.x, pos+8*c1.y))+tex2D(s0, float2(tex.x, pos+9*c1.y)))*.1;// blur input
#if posterizedegamma
s1 = s1*65535/32767.-16384/32767.;
float4 signbits = sign(s1);
s1 = signbits*pow(round(sqrt(abs(s1))*quantize+dithers)*quantizer, 2);// dither and posterize, abs() keeps sqrt() defined for negative inputs
return s1*(br*32767/65535.)+16384/65535.;// modulate brightness and output
#else
s1 = round((s1*65535/32767.-16384/32767.)*quantize+dithers);// dither and posterize
return s1*(br*quantizer*32767/65535.)+16384/65535.;// shrink interval back to normal after posterization, modulate brightness and output
#endif
}

Thanks, long time without internet. :mad:

jerrymh
4th August 2013, 17:49
I found this in a libretro ffmpeg video shader. It really looks like an old CRT monitor, but I don't know if there is any build for Windows.

https://photos-2.dropbox.com/t/0/AAB0TgdA87Z8o0z--fUDPR3sdcXuJyIHpmvfdSaYd3L2Lg/12/149537/png/32x32/3/1375639200/0/2/RetroArch-0719-182234.png/0vm6SKy7KZQJmOAzkTFZ8f7lNzBL7vzYcDfVtaXnYqU%2C_iLphXBTqHEvh5k1JTJPn3pjPVrjYF6islARCB6XHlI?size=1280x960

JanWillem32
11th August 2013, 22:30
XRyche, I made a three-stage chain that might work. It's currently rather restricted and I'll probably need to change a few more parameters, but it's a good start. I only made one chain, meant for the combination of HD video with the renderer settings on 16-bit integer surfaces with the disable initial pass shaders option enabled. I can add more shaders later, if these work well.

// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// R'G'B' to Y'CbCr for HD video input for XYZ rendering on 16-bit integer surfaces

sampler s0;

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 s1 = tex2Dlod(s0, float4(tex, 0, 0)).rgb;// original pixel
return ((s1.rrr*float3(.2126, -.1063/.9278, .5)+s1.ggg*float3(.7152, -.3576/.9278, -.3576/.7874)+s1.bbb*float3(.0722, .5, -.0361/.7874))*32767/65535.+float3(16384/65535., 32767/65535., 32767/65535.)).rgbb;// HD RGB to Y'CbCr output
}
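
The matrix in the shader above can be sanity-checked with a few lines of Python. The coefficients are copied from the listing. The function names are made up for illustration, and the second helper only mirrors the offset encoding used for the 16-bit integer surface.

```python
def rgb_to_ycbcr_hd(r, g, b):
    """R'G'B' -> Y'CbCr with the HD coefficients used in the shader above."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    cb = (-0.1063 * r - 0.3576 * g) / 0.9278 + 0.5 * b
    cr = 0.5 * r - (0.3576 * g + 0.0361 * b) / 0.7874
    return y, cb, cr

def to_surface(y, cb, cr):
    """Offset encoding: luma zero near 1/4, chroma zero near 1/2 of the surface range."""
    s = 32767 / 65535
    return y * s + 16384 / 65535, cb * s + 32767 / 65535, cr * s + 32767 / 65535

print(rgb_to_ycbcr_hd(1.0, 1.0, 1.0))  # white: luma 1, both chroma channels near 0
print(rgb_to_ycbcr_hd(1.0, 0.0, 0.0))  # pure red: Cr sits at its maximum of 0.5
```

Each chroma row sums to zero, so any gray input gives neutral chroma, and the .5 entries put pure blue and pure red at the edge of the [-.5, .5] chroma interval.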



// horizontal pass sharpen complex, deband and denoise for HD video input for XYZ rendering on 16-bit integer surfaces

#define SharpenLimitLuma 2// valid interval [0, 10], luma-specific sharpening limit, 0 is disabled, lower numbers will allow more sharpening on contours
#define SharpenLimitChroma 2// valid interval [0, 10], chroma-specific sharpening limit, 0 is disabled, lower numbers will allow more sharpening on contours
#define LumaDetectionFactor 64// valid interval (65535/32767., 250], luma-specific detection factor, if set to the lowest amount no contours can be detected, higher numbers will shift the detection on color difference intervals of debanding to noise detection limit to minimum sharpening to maximum sharpening toward more sharpening
#define ChromaDetectionFactor 64// valid interval (65535/32767., 250], chroma-specific detection factor, if set to the lowest amount no contours can be detected, higher numbers will shift the detection on color difference intervals of debanding to noise detection limit to minimum sharpening to maximum sharpening toward more sharpening
#define NoiseThreshold .0078125// valid interval [0, 32767/65535.), banding threshold, higher numbers mean stronger deband and denoise

sampler s0 : register(s0);
float2 c1 : register(c1);
#define sp(a) tex2Dlod(s0, float4(tex+c1*float2(a, 0), 0, 0)).rgb
static const float3 slimits = float3(-SharpenLimitLuma, -SharpenLimitChroma, -SharpenLimitChroma);
static const float3 dfactors = float3(LumaDetectionFactor, ChromaDetectionFactor, ChromaDetectionFactor);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 n, p, s1 = sp(0);// original pixel
{
float3 s2 = sp(-1);
float3 af = 1.;// accumulated amount of colors from the samples
float3 ac = s1;// accumulate color
float3 cd = abs(s1-s2);// color difference
float3 rcd = max(slimits, 1.-dfactors*cd);// factor for both base and multiplicand is 1.0, the output will be in the interval (-inf, 1]
// invert interval on sharpening
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s2*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {// continue if all channels are below the noise threshold
float3 s3 = sp(-2);
cd = abs(s1-s3);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s3*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s4 = sp(-3);
cd = abs(s1-s4);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s4*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s5 = sp(-4);
cd = abs(s1-s5);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s5*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s6 = sp(-5);
cd = abs(s1-s6);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s6*rcd;
}
}
}
}
n = ac/af;
}
{
float3 s2 = sp(1);
float3 af = 1.;// accumulated amount of colors from the samples
float3 ac = s1;// accumulate color
float3 cd = abs(s1-s2);// color difference
float3 rcd = max(slimits, 1.-dfactors*cd);// factor for both base and multiplicand is 1.0, the output will be in the interval (-inf, 1]
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s2*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {// continue if all channels are below the noise threshold
float3 s3 = sp(2);
cd = abs(s1-s3);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s3*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s4 = sp(3);
cd = abs(s1-s4);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s4*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s5 = sp(4);
cd = abs(s1-s5);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s5*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s6 = sp(5);
cd = abs(s1-s6);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s6*rcd;
}
}
}
}
p = ac/af;
}
return ((n+p)*.5).rgbb;
}
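
To see how the per-neighbor weight rcd in the pass above reacts to a color difference, here is a Python sketch of just that weight for one channel, with the default limit of 2 and detection factor of 64. It mirrors the statements as written in the listing. The sample accumulation and the noise-threshold early-out are left out, and the function name is hypothetical.

```python
def neighbour_weight(cd, limit=2.0, factor=64.0):
    """Weight given to a neighbor whose color differs from the center by cd."""
    rcd = max(-limit, 1.0 - factor * cd)  # 1 for identical colors, clamped at -limit
    if rcd < 0:
        rcd = limit - abs(rcd)            # the "invert interval on sharpening" step
    return rcd

# a few differences chosen as exact binary fractions
for cd in (0.0, 1 / 128, 1 / 64, 2 / 64, 3 / 64, 0.5):
    print(cd, neighbour_weight(cd))
```

With these defaults the weight falls from 1 to 0 as the difference grows to 1/64, and differences of 3/64 or more get weight 0, so such neighbors no longer contribute to the average.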



// vertical pass sharpen complex, deband, denoise and color controls for HD video input for XYZ rendering on 16-bit integer surfaces

#define SharpenLimitLuma 2// valid interval [0, 10], luma-specific sharpening limit, 0 is disabled, lower numbers will allow more sharpening on contours
#define SharpenLimitChroma 2// valid interval [0, 10], chroma-specific sharpening limit, 0 is disabled, lower numbers will allow more sharpening on contours
#define LumaDetectionFactor 64// valid interval (65535/32767., 250], luma-specific detection factor, if set to the lowest amount no contours can be detected, higher numbers will shift the detection on color difference intervals of debanding to noise detection limit to minimum sharpening to maximum sharpening toward more sharpening
#define ChromaDetectionFactor 64// valid interval (65535/32767., 250], chroma-specific detection factor, if set to the lowest amount no contours can be detected, higher numbers will shift the detection on color difference intervals of debanding to noise detection limit to minimum sharpening to maximum sharpening toward more sharpening
#define NoiseThreshold .0078125// valid interval [0, 32767/65535.), banding threshold, higher numbers mean stronger deband and denoise

// YCbCrColorControls, 0 is disabled, 1 is enabled
#define YCbCrColorControls 0
// Brightness, interval [-10, 10], default 0
#define Brightness 0
// Contrast, interval [0, 10], default 1
#define Contrast 1
// GrayscaleGamma and ColorfulnessGamma, interval (0, 10], default 1
#define GrayscaleGamma 1
#define ColorfulnessGamma 1
// Hue, interval [-180, 180], default 0
#define Hue 0
// Saturation, interval [0, 10], default 1
#define Saturation 1
// VideoRedGamma, VideoGreenGamma and VideoBlueGamma, interval (0, 10], default 2.4, the video gamma input factors used to convert between the video input RGB and linear RGB
#define VideoRedGamma 2.4
#define VideoGreenGamma 2.4
#define VideoBlueGamma 2.4

sampler s0 : register(s0);
float2 c1 : register(c1);
#define sp(a) tex2Dlod(s0, float4(tex+c1*float2(0, a), 0, 0)).rgb
static const float3 slimits = float3(-SharpenLimitLuma, -SharpenLimitChroma, -SharpenLimitChroma);
static const float3 dfactors = float3(LumaDetectionFactor, ChromaDetectionFactor, ChromaDetectionFactor);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float3 n, p, s1 = sp(0);// original pixel
{
float3 s2 = sp(-1);
float3 af = 1.;// accumulated amount of colors from the samples
float3 ac = s1;// accumulate color
float3 cd = abs(s1-s2);// color difference
float3 rcd = max(slimits, 1.-dfactors*cd);// factor for both base and multiplicand is 1.0, the output will be in the interval (-inf, 1]
// invert interval on sharpening
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s2*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {// continue if all channels are below the noise threshold
float3 s3 = sp(-2);
cd = abs(s1-s3);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s3*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s4 = sp(-3);
cd = abs(s1-s4);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s4*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s5 = sp(-4);
cd = abs(s1-s5);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s5*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s6 = sp(-5);
cd = abs(s1-s6);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s6*rcd;
}
}
}
}
n = ac/af;
}
{
float3 s2 = sp(1);
float3 af = 1.;// accumulated amount of colors from the samples
float3 ac = s1;// accumulate color
float3 cd = abs(s1-s2);// color difference
float3 rcd = max(slimits, 1.-dfactors*cd);// factor for both base and multiplicand is 1.0, the output will be in the interval (-inf, 1]
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s2*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {// continue if all channels are below the noise threshold
float3 s3 = sp(2);
cd = abs(s1-s3);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s3*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s4 = sp(3);
cd = abs(s1-s4);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s4*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s5 = sp(4);
cd = abs(s1-s5);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s5*rcd;
[branch] if(max(max(cd.x, cd.y), cd.z) < NoiseThreshold) {
float3 s6 = sp(5);
cd = abs(s1-s6);
rcd = max(slimits, 1.-dfactors*cd);
if(rcd.x < 0) rcd.x = SharpenLimitLuma-abs(rcd.x);
if(rcd.y < 0) rcd.y = SharpenLimitChroma-abs(rcd.y);
if(rcd.z < 0) rcd.z = SharpenLimitChroma-abs(rcd.z);
af += abs(rcd);
ac += s6*rcd;
}
}
}
}
p = ac/af;
}
float3 t0 = (n+p)*.5;
t0 = t0*65535/32767.-float3(16384/32767., 32767/65535.+.5, 32767/65535.+.5);
#if YCbCrColorControls == 1
t0.yz = mul(t0.yz, float2x2(cos(radians(Hue)), sin(radians(Hue)), -sin(radians(Hue)), cos(radians(Hue))));// process hue
t0.xyz *= float3(Contrast, 2*Saturation, 2*Saturation);// process contrast and saturation, extend the chroma interval from [-.5, .5] to [-1, 1] for gamma processing
t0.x += Brightness;// process brightness
// preserve the sign bits of Y'CbCr values
float3 sby = sign(t0);
t0 = sby*pow(abs(t0), float3(GrayscaleGamma, ColorfulnessGamma, ColorfulnessGamma));// gamma processing
t0 = t0.rrr+float3(0, -.5*.1674679/.894, .5*1.8556)*t0.ggg+float3(.5*1.5748, -.5*.4185031/.894, 0)*t0.bbb;// HD Y'CbCr to RGB, compensate for the chroma ranges
#else
t0 = t0.rrr+float3(0, -.1674679/.894, 1.8556)*t0.ggg+float3(1.5748, -.4185031/.894, 0)*t0.bbb;// HD Y'CbCr to RGB
#endif
// preserve the sign bits of RGB values
float3 sbl = sign(t0);
t0 = sbl*pow(abs(t0), float3(VideoRedGamma, VideoGreenGamma, VideoBlueGamma));// linear RGB gamma correction
t0 = mul(t0, float3x3(0.3786675215, 0.1952504408, 0.0177500401, 0.3283428626, 0.6566857251, 0.1094476209, 0.1657219631, 0.0662887852, 0.8728023391))*32767/65535.+16384/65535.;
return t0.rgbb;// XYZ output
}

JanWillem32
11th August 2013, 22:33
The "contour color expose banding" shader is useful for denoise and deband testing purposes.

// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// contour color expose banding for XYZ rendering on 16-bit integer surfaces
// This shader can be run as a screen space pixel shader.
// This shader requires compiling with ps_2_0, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// Use this shader to add a color contoured effect to an image.

sampler s0;
float2 c1 : register(c1);
#define sp(a, b, c) float4 a = tex2D(s0, tex+c1*float2(b, c));

float4 main(float2 tex : TEXCOORD0) : COLOR
{
sp(s2, -1, -1) sp(s3, 0, -1) sp(s4, 1, -1) sp(s5, -1, 0) sp(s6, 1, 0) sp(s7, -1, 1) sp(s8, 0, 1) sp(s9, 1, 1)// sample surrounding pixels
return smoothstep(.0625, 0, abs(s2+s3+s4-s7-s8-s9)+abs(s2+s5+s7-s4-s6-s9)+abs(s2+s3+s5-s6-s8-s9)+abs(s3+s4+s6-s5-s7-s8))*32767/65535.+16384/65535.;// color contour output
}

JanWillem32
12th August 2013, 08:23
I just wrote some simple shaders for usage as a third pass, after the "vertical x-pixel averaging, dithering, posterizing and old CRT scan lines"-type shaders. These shaders separate RGB channels of the input video into multiple real pixels, imitating aperture grilles that use rectangular masks. (Imitating the other common shadow mask pattern would be a lot harder to program. I'm not sure it's worth the effort.) The warnings about divisibility in these shaders are not that important. The artifacts are barely visible if the input isn't evenly divisible.

// horizontal 4-pixel RGB separation
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 4

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.25;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .25) mask = float4(1, 0, 0, 0);
else if(posdif < .5) mask = float4(0, 1/3., 2/3., 0);
else if(posdif < .75) mask = float4(0, 2/3., 1/3., 0);
else mask = float4(0, 0, 1, 0);
return tex2D(s0, tex)*mask;// mask and output
}



// horizontal 5-pixel RGB separation
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 5

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.2;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .2) mask = float4(1, 0, 0, 0);
else if(posdif < .4) mask = float4(0, 2/3., 1/3., 0);
else if(posdif < .6) mask = float4(0, 1, 0, 0);
else if(posdif < .8) mask = float4(0, 1/3., 2/3., 0);
else mask = float4(0, 0, 1, 0);
return tex2D(s0, tex)*mask;// mask and output
}



// horizontal 8-pixel RGB separation
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 8

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.125;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .25) mask = float4(1, 0, 0, 0);
else if(posdif < 0.375) mask = float4(0, 2/3., 1/3., 0);
else if(posdif < 0.625) mask = float4(0, 1, 0, 0);
else if(posdif < .75) mask = float4(0, 1/3., 2/3., 0);
else mask = float4(0, 0, 1, 0);
return tex2D(s0, tex)*mask;// mask and output
}



// horizontal 10-pixel RGB separation
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 10

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.1;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .3) mask = float4(1, 0, 0, 0);
else if(posdif < .4) mask = float4(0, 1/3., 2/3., 0);
else if(posdif < .6) mask = float4(0, 1, 0, 0);
else if(posdif < .7) mask = float4(0, 2/3., 1/3., 0);
else mask = float4(0, 0, 1, 0);
return tex2D(s0, tex)*mask;// mask and output
}



// horizontal 4-pixel RGB separation for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 4

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.25;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .25) mask = float4(1, 0, 0, 0);
else if(posdif < .5) mask = float4(0, 1/3., 2/3., 0);
else if(posdif < .75) mask = float4(0, 2/3., 1/3., 0);
else mask = float4(0, 0, 1, 0);
return (tex2D(s0, tex)-16384/65535.)*mask+16384/65535.;// mask and output
}



// horizontal 5-pixel RGB separation for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 5

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.2;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .2) mask = float4(1, 0, 0, 0);
else if(posdif < .4) mask = float4(0, 2/3., 1/3., 0);
else if(posdif < .6) mask = float4(0, 1, 0, 0);
else if(posdif < .8) mask = float4(0, 1/3., 2/3., 0);
else mask = float4(0, 0, 1, 0);
return (tex2D(s0, tex)-16384/65535.)*mask+16384/65535.;// mask and output
}



// horizontal 8-pixel RGB separation for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 8

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.125;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .25) mask = float4(1, 0, 0, 0);
else if(posdif < 0.375) mask = float4(0, 2/3., 1/3., 0);
else if(posdif < 0.625) mask = float4(0, 1, 0, 0);
else if(posdif < .75) mask = float4(0, 1/3., 2/3., 0);
else mask = float4(0, 0, 1, 0);
return (tex2D(s0, tex)-16384/65535.)*mask+16384/65535.;// mask and output
}



// horizontal 10-pixel RGB separation for XYZ rendering on 16-bit integer surfaces
// this shader only works properly on inputs that have a horizontal resolution that is evenly divisible by 10

sampler s0 : register(s0);
float c0 : register(c0);
float c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float scaletexx = tex.x*c0*.1;
float prepos = trunc(scaletexx);// calculate the left position of the current set of pixels
float posdif = scaletexx-prepos;
float4 mask;// create RGB mask based on the pixel location
if(posdif < .3) mask = float4(1, 0, 0, 0);
else if(posdif < .4) mask = float4(0, 1/3., 2/3., 0);
else if(posdif < .6) mask = float4(0, 1, 0, 0);
else if(posdif < .7) mask = float4(0, 2/3., 1/3., 0);
else mask = float4(0, 0, 1, 0);
return (tex2D(s0, tex)-16384/65535.)*mask+16384/65535.;// mask and output
}
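
As a CPU-side illustration of how the separation shaders above pick a mask, the sketch below reproduces the `scaletexx`/`prepos`/`posdif` arithmetic in Python. It assumes that `c0` holds the texture width in pixels and that `tex.x` is sampled at the pixel centre; neither is stated in the listings. The function names are hypothetical and only the 5-pixel thresholds are modelled.

```python
import math

def group_position(pixel_index, width, group_size):
    """Return posdif, the fractional position of a pixel inside its group.

    Mirrors scaletexx - trunc(scaletexx) from the shaders, ASSUMING that c0 is
    the texture width in pixels and that tex.x is taken at the pixel centre.
    """
    tex_x = (pixel_index + 0.5) / width       # normalized pixel-centre coordinate
    scaletexx = tex_x * width / group_size    # same role as tex.x*c0*(1/group_size)
    return scaletexx - math.trunc(scaletexx)

def branch_5_pixel(posdif):
    """Index (0..4) of the branch the 5-pixel shader takes for this posdif."""
    thresholds = (0.2, 0.4, 0.6, 0.8)
    for branch, limit in enumerate(thresholds):
        if posdif < limit:
            return branch
    return len(thresholds)

# Ten neighbouring pixels on a 1920 pixel wide surface (1920 is divisible by 5).
branches = [branch_5_pixel(group_position(i, 1920, 5)) for i in range(10)]
```

Under these assumptions every pixel centre lands in the middle of a threshold interval, so each group of five pixels walks through the five masks in order. A width that is not divisible by the group size would leave an incomplete group at the right edge, which is presumably why the comments require divisibility.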

leeperry
11th November 2013, 23:54
Hi Jan,

So following your advice in another thread that was asking for a "film grain" PS script, I've played around with your "semi-random grayscale noise.txt" which looks quite good but do you think it would be possible to make a PS script version of GrainFactory3() (http://forum.doom9.org/showpost.php?p=1191292&postcount=30)?

It allows you to choose the size, strength and sharpness of grain depending on dark/mid-tone/bright areas (whose limits can also be defined) and it can really be fine-tuned either for deblocking purposes, grain-based EE or artistic effects meant to mimic reel grain.

The problem with Didée's script is that it quickly becomes a CPU hog and Avisynth works in 8-bit only, whereas the idea would be to process it in 32fp after scaling to Jinc3AR in mVR... so if there is any way you could work your magic to do the same within a PS script, this would be too good to be true :)

:thanks: you very much in advance for even considering it,

JanWillem32
12th November 2013, 05:37
What I could find out about GrainFactory3 was: "noise generator that tries to simulate the behaviour of silver grain on film".
I already wrote some basic noise effect shaders, but maybe I could get closer to the look of silver grain on film.
When I start coding to create an effect, I start with looking at examples on images. I don't just duplicate/imitate other filters. When I've gathered enough research materials, I just start writing out possible parameters for methods in an effect. After that, I try a few methods. These are just calculations that spring to mind, and I usually copy a lot of previously written methods, too. After a bit of tinkering, I usually get the desirable effect from a shader. When transforming the prototype shader to a final type, I optimize first, and add comments. After that, I extract the set of constants, give them names and offer them as user-configurable parameters.
The reason I'm telling this is simple: I can practically guarantee that once I've finished something that resembles silver grain on film, the effect will not have a grand total of 19 user input variables like GrainFactory3.
On the other hand, I don't see any options in GrainFactory3 for using color. I would probably add an option or options for this type of filter related to color, for example; to use the properties for sepia toning instead of silver. (This possibly requires separate filters, though.)

To start off simple, the pictures I could find of real film used in cinemas varied strongly over the decades. The most evident changes were the transition from grayscale or toned video to color. The form and amount of grain on film, and the cinema equipment varied, too. Some effects are available: projector film drive scratches, projector film dust, projector film lamp vignette, projector film sepia toning for SD&HD video input, grayscale, projector film shaking, semi-random colored surface noise and semi-random grayscale noise.
What era are you targeting for this kind of vintage look? Which effects are currently missing to complete the illusion of such a look? Please specify with some true vintage cinematic examples and name some very specific factors.

leeperry
12th November 2013, 18:17
Hi Jan, thanks for the swift reply.

Well, it would appear that to simulate silver film grain you'd need the ability to set different chunk sizes for dark/mid/bright pixels as grain would appear to look thicker in dark areas for instance. I think 21 grams (http://www.google.com/search?q=21+grams&tbm=isch) is a good example of what excessive reel grain can do, of course I want to keep it less intrusive.

My real-world use of GrainF3 was to set very low values in order to deblock (which tends to increase the subjective pop-effect IME), add some subtle grain-based EE, and give a DLP/silver reel look to sanitized "flat looking" digital movies, the same way DLP videoprojectors look pretty grainy in dark areas due to their very fast rotating mirrors (http://www2.hesston.edu/Physics/TelevisionDisplays/IMAGES/DLP.JPG) for instance.

I don't think I would be interested in chroma grain, but what I would need is the ability to:
-choose the size and strength of grain depending on dark/mid/bright areas
-choose the limits for dark/mid/bright, like in GrainF3
-choose the grain pattern or make it random, much like what foxyshadis explained here (http://forum.doom9.org/showpost.php?p=1403328&postcount=40). Once you find your favorite grain patterns for dark/mid/bright areas, you can use them permanently and the "movie grain" won't be quite random anymore as it'll be finetuned to your liking.
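
To make the first two wishes in the list above concrete, here is a minimal Python sketch of grain whose strength depends on whether a pixel is dark, mid-tone or bright. It is not GrainFactory3's algorithm and not one of Jan's shaders; the function names, limits and strengths are all hypothetical, and grain size and grain pattern are not modelled at all.

```python
import random

def grain_strength(luma, dark_limit=0.25, bright_limit=0.75,
                   dark=0.06, mid=0.03, bright=0.01):
    """Pick a grain amplitude from the brightness zone of a pixel.

    luma is a normalized brightness in [0, 1]. The limits and amplitudes are
    made-up illustration values, not settings taken from any real filter.
    """
    if luma < dark_limit:
        return dark
    if luma < bright_limit:
        return mid
    return bright

def add_grain(luma, rng):
    """Add zone-dependent uniform noise to one brightness value and clamp it."""
    noise = rng.uniform(-1.0, 1.0)
    return min(1.0, max(0.0, luma + noise * grain_strength(luma)))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
grainy_mid_gray = add_grain(0.5, rng)
```

The hard zone borders would probably need smoothing in a real filter to avoid visible steps in the noise level; that is left out to keep the sketch short.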

The intent of GrainF3 is not to mimic a 1930's sepia looking semi-busted projector but simply to add (possibly very subtle) silver looking grain. Once fine-tuned, it can really look great as adding grain does wonders for deblocking purposes IME. Of course it can also be used with excessive settings for artistic means.

The more I talk about it, the more I wanna play around with GrainF3 all over again but I would really rather have it done via a PS script for all the aforementioned reasons :)

PS: ouh, this looks impressive too: Cinema Film Grain Plugin for FCPX (http://www.fcpeffects.com/products/cinema-film-grain) :cool:

leeperry
9th December 2013, 18:17
Hi again Jan, so I've tried my old GrainF3() calls that looked so great on CRT/DLP but they look darn noisy on LCD....would have to rework them again from scratch with much more subtle settings I guess.

Anyway, I've got a question for you :)

Here's another Avisynth script that has always impressed me: SmoothLevels() (http://forum.doom9.org/showthread.php?t=154971)

It's gone DLL but it used to be an .AVS script and here are a few older versions of it: SmoothLevels.rar (11 KB) (https://mega.co.nz/#!K0AlXLCK!VDlGv12GxSIiqSLgStdGQHXl2sEacjiVcvQanTI0oG8)

Using it to convert TV to PC, I've always found its resulting picture to look "deeper" and simply more 3D-looking than whatever ffdshow or madVR could offer.

I mentioned it to madshi a while ago who told me that its error diffusion was technically more advanced than the dithering currently done in mVR and that PS scripts couldn't process Floyd-Steinberg.

I've made screenshots comparisons on a gray ramp and a REC709 test pattern available at SmoothL2.rar (15.3 MB) (https://mega.co.nz/#!f9wCUa5a!KdRNjFkmCuL9p_xz2yPcnWuN8uZ6iBVjwLZfvvhhc68)

I'm colorblind so judging on banding and colorimetry is a very tedious task for me, but I've been told that banding didn't look any better than in mVR and that it seemed to make colorimetry shifts?

Comparing screenshots back and forth it would appear to me that SmoothL is doing some sort of colorimetry-based EE? The borders of the color squares in that REC709 pattern look quite different with SmoothL, don't they?

My point is that for instance in this very short sample Prisoners.mkv (22.9 MB) (https://mega.co.nz/#!HtwVTBYa!ZYBMzxAkSPBBxcmyr06Hp1heUW8uqMuFPcgzfYFWUuQ) when comparing SmoothLevels(preset="tv2pc",HQ=true) to anything else, that animal in the woods seems to appear much deeper in the picture and the passing cars on the road feel less "flat" and far more natural to me :cool:

"HQ" would stand for "HQ interpolation" BTW.

All this said, do you agree that the picture looks deeper with that sample when using SmoothL? Is that because its error diffusion is more advanced than what mVR can do, or because of some -as I'm suspecting- colorimetry-based EE? If so, could you possibly provide the same kind of trick with a PS script?

My problem is that SmoothL outputs 8bit from ffdshow to mVR, it's a real CPU hog(especially in HQ mode on 1.78 1080p) and I'm totally hooked to its 3D look :o

Hope you can look into it, :thanks: in advance!

foxyshadis
13th December 2013, 18:05
Hi Jan! Excellent work with the new filters. Can I request that the main archive be updated to include the last couple of years' work, as well? I know some aren't 100% finished, but it'd be nice to have them all handy for quick downloading onto the various systems I have set up for media.

vBm
13th December 2013, 21:32
Would be quite nice to have centralized repository for all the shaders.

JanWillem32
17th December 2013, 15:41
leeperry, sorry that I took a while to give a status update. I've tried several prototype shaders for the grain/noise effects. None are really finished yet, and it's going to take some more development. I'm currently struggling with getting larger, well-shaped, random grain particles efficiently in the output images. The prototype shaders are currently hardly any better than the simple types I already made before (projector film dust, projector film drive scratches, semi-random colored surface noise and semi-random grayscale noise). The shaders that do generate larger grain particles generate nasty artifacts, and the shaders that generate the medium to fine noise are very per-pixel randomized.

As for "chroma grain", I'm not going to use any Y'CbCr channels for this shader, neither luma nor chroma. I do occasionally use the chroma parameter of CIECAM02 for some color-related filtering, but in this case that's not relevant at all. For the grayscale, toned (such as sepia) and three-layered film types, the grain/noise modifies the intensity of each tinted chemical applied on the film. At the moment I unfortunately still have to guess which xyY primary color (or color function) each of these tinted chemicals governs for both the recording and the playback phases, but I'll research that some more. If anybody knows these details, please notify me.

I'm not going to add support for textured patterns. It's extremely impractical from the renderer's perspective with the current interfaces, hardware support and file management. Of course the renderer does have easy access to internal resources. I'll elaborate more on that point later.

Multiple general shapes and sizes for the grain in the shaders should be possible, but it's currently not that easy to configure with the prototypes.

I've already written quite a few shaders that do color corrections. These all need to be edited though. The descriptions in these shaders are sub-par. I also wrote some new prototypes. I'll publish some implementations of these at a later date.

Dithering is a complicated transformation done in the absolute final filtering pass of a video renderer. I'm not going to write a dithering filter outside the context of exactly that.
Floyd–Steinberg dithering is due to its filtering kernel indeed unsuitable for pretty much any modern dithering method, including the dithering done in pixel shading passes. There are plenty of good alternatives in ordered dithering and such. I already wrote three modes to dither for a final filtering pass, and I'll happily add more modes if another suitable method pops up, but for now, I'm not going to write more dithering filters.

As for banding, when using either of the two advanced video renderers currently available for filtering; it's pretty much always a problem in the video source. For video filters that don't keep much quality in quantization and dynamics during filter transformations, additional banding will indeed occur. Debanding the source is quite a difficult problem. I've written several filters that can more or less deband a bit. (Don't expect much from the debanding filters with only a small filtering radius.)

As for the other effects you name that you attribute to SmoothLevels(), there's only so much you can transform in a width-height-color-time domain. A different ditherer isn't going to make things "appear much deeper" or "less flat". That's something generally achieved by sharpening, contrast enhancements or even gamma controls. I've written several of these filters already.

foxyshadis and vBm, as I mentioned in the main MPC-HC thread here on the software players board: The Pixel Shader Pack v1.5 will need quite a lot of extra work. I don't like the blocks of comments below the title of the shaders. It still mentions "screen space pixel shader" and such, which is just outdated. I also want to split the entire set of shaders per rendering color space+format. I'll have to add a 'readme' to properly document what to use in what renderer for that. Most of the pixel shaders are already available as multi-version (see the last few pages of the shader thread for examples).
Some shaders need to be scrapped for the simple reason that they should never be used (or I'll just put them in the junk folder called 'development'). I also wrote new shaders and corrected some wrong methods in older ones.
My main problem is actually the set of newly developed shaders. These hardly contain any comments, and some of them are just not user-friendly. I just don't have time to edit and test over 350 files. I just write a lot of, well... crap. There are a lot of prototype and 'finished' shaders that I should really edit to make them decent. Some other shaders need to be dumped in the 'development' junk folder with correct comments inside. And again some others can just be deleted. Just collecting all shaders in their current state isn't going to help at all.

leeperry
17th December 2013, 16:30
As for the other effects you name that you attribute to SmoothLevels(), there's only so much you can transform in a width-height-color-time domain. A different ditherer isn't going to make things "appear much deeper" or "less flat". That's something generally achieved by sharpening, contrast enhancements or even gamma controls. I've written several of these filters already.
Hi Jan, thanks for the reply!

I've played around with SmoothL quite a bit lately and I still suspect it to be processing chroma-based EE, isn't that possible whatsoever?

Could you please try the aforementioned sample and SmoothL Avisynth call and see for yourself what it does? It looks like EE to me, but not the usual halo-based luma EE, there is no halo as far as I can tell and yet things look a bit like a cell-shade cartoon to me...there's more to it than what it would appear IMO. Or maybe it does mess with gamma on high contrast edges if that's even possible? I wish I could find proper test patterns to find out wth it does :sly:

All this said, it outputs 8bit to mVR so noise quickly becomes an issue and it's a CPU hog when OTOH mVR is able to process the TV>PC conversion with lower banding off a low GPU load at that.

Also, its subjective EE is impressive at first, much like Samsung's DNIE but they both kinda look "artificial" to me....if you could find out what the trick is, you could possibly allow us to finetune it using much weaker settings :)

FWIW, SmoothL supports a "debug=true" argument that shows all kinds of histograms in real time.

JanWillem32
18th December 2013, 19:50
leeperry, I just tested SmoothLevels().
I'm sorry but I can't really see much difference beyond the obviously clipped levels outside the {[16, 235], [16, 240], [16, 240]} intervals in synthetic tests. I tested "SmoothLevels(preset="tv2pc",HQ=true)" with the "Prisoners.mkv" sample, with input intervals set to [0, 255] for the conversion to RGB. I compared it with the regular conversion without SmoothLevels() and the regular {[16, 235], [16, 240], [16, 240]} input intervals setting.
Note that for some reason the "Prisoners.mkv" sample with a resolution of 1280×720 pixels was set to an anamorphic display ratio of 1279:720. I removed the anamorphic ratio before testing to prevent distortions.
Here's a link containing the two screenshots, taken of frame 600 of the sample, and the absolute difference result: - .
The absolute difference in terms of R'G'B' is mostly 0. The maximum difference is 6./255., in the form of some completely blue differences on some of the sides of a tree.
The differences are mostly scattered R'G'B' pixels, but some groups can be seen on the sides of the trees and there are some patches on the brighter areas. The difference map also shows that only rarely combinations of R', G' and B' differences are found in the same pixel, pointing that there's not much change in brightness/luminance/contrast/luma/lightness.
My analysis is that all of this is mostly due to different dithering and dithering twice over, and not much else. Did I do something wrong during testing?
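
The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration of the measurement (per-channel absolute difference, its maximum, and how many pixels differ in more than one channel); the two tiny "frames" below are made-up values, not data from the Prisoners.mkv sample, and the function names are hypothetical.

```python
def absolute_difference(image_a, image_b):
    """Per-channel absolute difference of two images.

    Each image is a list of rows, each row a list of (R, G, B) integer tuples.
    """
    return [[tuple(abs(a - b) for a, b in zip(pixel_a, pixel_b))
             for pixel_a, pixel_b in zip(row_a, row_b)]
            for row_a, row_b in zip(image_a, image_b)]

def max_difference(diff):
    """Largest single-channel difference in the whole difference map."""
    return max(channel for row in diff for pixel in row for channel in pixel)

def multi_channel_pixels(diff):
    """Count pixels where more than one channel differs."""
    return sum(1 for row in diff for pixel in row
               if sum(1 for channel in pixel if channel) > 1)

# Made-up 1x2 pixel frames, for illustration only.
frame_a = [[(10, 20, 30), (40, 50, 60)]]
frame_b = [[(10, 20, 36), (41, 50, 60)]]
diff = absolute_difference(frame_a, frame_b)
```

A difference map where almost no pixel changes in several channels at once, as in this toy example, is what one would expect from independent per-channel dithering rather than from a brightness or contrast change.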

leeperry
19th December 2013, 02:52
Ah....So what did you compare "SmoothLevels(preset="tv2pc",HQ=true)" to exactly? mVR's TV to PC conversion?

Yes, the sides of the trees look different to me with SmoothL and the difference I'm seeing is definitely not placebo as I was able to DBT it with the help of a friend :o

I just recompared the REC709 test patterns from SmoothL2.rar (15.3 MB) (https://mega.co.nz/#!f9wCUa5a!KdRNjFkmCuL9p_xz2yPcnWuN8uZ6iBVjwLZfvvhhc68) and it seems clear that SmoothL outputs a higher level of dithering(I disabled dithering in mVR when using SmoothL), but maybe my brain likes it better because it's more advanced than what mVR is able to achieve...madshi made it clear that it's technically impossible to implement error diffusion with PS but that it might be possible with OpenCL/CUDA(which mVR does not support yet).

So you did write PS scripts for dithering? Would they be more efficient than what mVR currently does? No error diffusion I guess?

:thanks:

JanWillem32
19th December 2013, 07:04
I didn't use a video renderer. I compared it with the regular dithered 8-bit R'G'B' output of ffdshow tryouts' conversion. The internal {[16, 235], [16, 240], [16, 240]} intervals conversion to full range and that of SmoothLevels() don't differ all that much. The only consequence of letting SmoothLevels() do the expansion is that the image is converted to YV12 in between expanding the levels and converting to R'G'B'. The result is dithering twice over, which is visible, but not really a good thing (I don't like adding noise at all).
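
For reference, the luma part of the limited range to full range expansion discussed here can be written as a short Python sketch. It uses the common 8-bit mapping of [16, 235] onto [0, 255]; whether ffdshow or SmoothLevels() compute it exactly this way internally is an assumption, and the function name is hypothetical.

```python
def expand_luma(y):
    """Map limited range 8-bit luma [16, 235] onto full range [0, 255].

    Values outside the nominal interval are clipped, which is the clipping
    visible in synthetic tests. The result is generally fractional.
    """
    value = (y - 16) * 255.0 / 219.0
    return min(255.0, max(0.0, value))

black = expand_luma(16)     # nominal black maps to 0
white = expand_luma(235)    # nominal white maps to 255
step = expand_luma(17)      # one input step is about 1.16 output steps
```

Because one input step becomes a fractional output step, every time the expanded result is stored as 8-bit integers it has to be rounded or dithered again, which is how a chain with an intermediate 8-bit format ends up dithering twice.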
The easiest 'dither' to implement is a random dither. You need to apply a lot of it so that it actually works, but it's not an elegant way of handling the issue. Example: http://caca.zoy.org/wiki/libcaca/study/1
Next are the static and random ordered dithering, which use a matrix. These require some attention on implementing, but these are very efficient single-level dithering methods. Example: http://caca.zoy.org/wiki/libcaca/study/2
Error diffusion and other methods are either pixel-progressive or multi-pass. This makes them very undesirable in modern systems. I'm not even going to attempt to implement one of those. Examples: http://caca.zoy.org/wiki/libcaca/study/3 and http://caca.zoy.org/wiki/libcaca/study/4
In general, if you want better dithering, just use a larger matrix for random ordered dithering. The other options make no sense at all because of their performance problems.
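
A minimal Python sketch of static ordered dithering, to show why it suits per-pixel processing: each output pixel depends only on its own value and its own position. The standard 4×4 Bayer matrix is used purely as an illustration; it is an assumption and not necessarily the matrix (or the random variant) used in any of the renderers discussed here.

```python
import math

# Standard 4x4 Bayer threshold matrix (illustrative choice).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]

def ordered_dither(value, x, y, max_level=255):
    """Quantize one pixel to an integer level with a position-based threshold.

    value is already scaled to output levels (for example 100.5 of 255).
    No neighbouring pixel is read, so all pixels can be processed in parallel.
    """
    threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) / 16.0
    base = math.floor(value)
    fraction = value - base
    return min(max_level, base + (1 if fraction > threshold else 0))

# A flat area exactly halfway between levels 100 and 101.
tile = [ordered_dither(100.5, x, y) for y in range(4) for x in range(4)]
```

Over one 4×4 tile half of the pixels round up and half round down, so the average of the tile reproduces the in-between value.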
In terms of comparing my work to madVR, that's not easy. Because madVR has had a good many years of active development, it has many options. I however can't compare the internal code. I generally write really efficient, but wonderfully complicated code/assembly, that's for sure.

leeperry
19th December 2013, 19:18
Oh, I'm only parroting what madshi said: http://forum.doom9.org/showpost.php?p=1594415&postcount=14334
Screenshots are always converted to 8bit fullrange RGB (0-255), using error diffusion, which is a higher quality dithering algorithm compared to what madVR does during playback.
http://forum.doom9.org/showpost.php?p=1594420&postcount=14336
Error diffusion can't be done with GPU pixel shaders because error diffusion processes one pixel at a time, using the result of the previous pixel calculation for the next pixel. GPU pixel shaders more or less work on all pixels at the same time, which is the opposite of what error diffusion needs. *Maybe* it might be possible to do error diffusion using OpenCL/CUDA, but it will be difficult to do and likely not perform very well. For screenshots error diffusion is easy because madVR is doing the screenshot processing via CPU instead of GPU.
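
The serial dependency described in the quote can be seen in a simplified one-dimensional Python sketch, where the whole rounding error of a pixel is pushed onto the next pixel in the row. This is not Floyd–Steinberg (which spreads the error over four neighbours across two rows) and not madVR's code; it is only meant to show why pixel N cannot be computed before pixel N-1. The function name is hypothetical.

```python
import math

def error_diffusion_row(values, max_level=255):
    """Quantize one row, carrying each pixel's rounding error to the next pixel.

    values are already scaled to output levels (for example 100.25 of 255).
    """
    output = []
    error = 0.0
    for value in values:
        wanted = value + error                  # needs the previous pixel's error
        level = min(max_level, max(0, math.floor(wanted + 0.5)))
        error = wanted - level                  # carried into the next iteration
        output.append(level)
    return output

row = error_diffusion_row([100.25, 100.25, 100.25, 100.25])
```

For this flat row one pixel in four is rounded up, so the row average equals the input value of 100.25, but the loop has to run strictly left to right to get there.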
IIRC He told me that SmoothL does use error diffusion and that it wouldn't be possible in mVR without resorting to OpenCL/CUDA.

So you're saying that SmoothL does double dithering, interesting! I know many ppl used to enjoy my MT("GrainFactory_MT2 (http://pastebin.com/H6fDkSBp)(3,5,100,100,1.0,0.7,0,0,0,96,0)",4) call as much as I did, so I guess my brain likes grain after all..I might just try some of your scripts for moar dither "grain" :)

romulous
29th December 2013, 04:51
Hi JanWillem32,

I was looking for a rotation PS (to rotate video files captured via mobile devices in portrait mode for example) - I just tried your "flip and rotate sampling direction for RGB" (for your v1.4 pack) in MPC-HC, and I finally managed to get it to load. Question - is this what the output is supposed to look like?
http://i.imgur.com/m3cNrQ3.jpg

If so, I have badly misunderstood what the PS is meant to do (I was looking for one that can rotate an entire video 90 degrees, 180 degrees, 270 degrees etc depending on how the person who filmed it was holding their camera at the time).

This is the same frame in the actual video itself for comparison with the PS:
http://i.imgur.com/pE0VuVR.jpg (as you can see, all this video requires is a simple 90 degree clockwise rotation)

Thanks.

JanWillem32
29th December 2013, 17:33
That shader performs an effect to flip and rotate each R, G and B channel every frame on square textures.
You are probably looking for the renderer rotation effects. These are listed in the menus for resizing, rotation, pan and scan, et cetera, if available. Not all video renderers support this feature. (I wrote one with only partial support.) If that doesn't work, I can write a shader that does 90 degrees rotation, horizontal flipping and vertical flipping.
Note that geometry changes due to rotation usually require manual compensation, as the video renderer host usually doesn't compensate for it automatically.

romulous
30th December 2013, 02:48
You are probably looking for the renderer rotation effects.


Correct - I'm looking for a way to rotate entire videos. These will mainly be videos shot on mobile devices where the user has the camera not pointing the right way up. Here's an example posted here on Doom9 previously for this very thing:
http://www39.zippyshare.com/v/28485654/file.html


Not all video renderers support this feature. (I wrote one with only partial support.)


Heh - I wish I had asked here first before spending the entire day yesterday working on this (worked with your PS, various playing software, and a freely available DS rotation filter and that's the conclusion I came to).

The player I use is Zoom Player, and we have had an increasing number of people asking for video rotation, so I went looking for a solution. The only native Windows player so far that I have found that can do video rotation is MPC-BE, and only when using its own EVR CP (which doesn't help folks wanting the feature in other players such as Zoom).


If that doesn't work, I can write a shader that does 90 degrees rotation, horizontal flipping and vertical flipping.
Note that geometry changes due to rotation usually require manual compensation, as the video renderer host usually doesn't compensate for it automatically.

That would be great if it isn't too much trouble :) Zoom doesn't support PS as yet, but Blight is willing to add it as madVR already supports them (so only player support is required). I don't actually create mobile videos myself, but from what I've seen from samples various folk have posted, most seem to require a 90 degrees clockwise rotation, though I don't suppose you could really rule out a 90 degrees counter-clockwise rotation or maybe even a full 180 (how many different ways can you hold a camera that is not the correct way up?).

JanWillem32
31st December 2013, 02:47
The only native Windows player so far that I have found that can do video rotation is MPC-BE, and only when using its own EVR CP (which doesn't help folks wanting the feature in other players such as Zoom).

In my honest opinion, rotation options should be implemented natively in the video renderer's mixer or resizer filtering passes. A separate pixel shader can do rotations, but this is not ideal: it's neither efficient nor user-friendly. The efficient way of implementing these transforms is by using custom vertices for sampling the source texture (which in turn can be fed to any pixel shading pass in a single, combined transform). A separate pixel shader can't properly adjust the global resizing factor to fit an image nicely on screen (except when you only use the flip horizontal and flip vertical options which don't change geometry).
I'm well aware that there is no DirectShow interface to regulate video rotation, and integrated renderer filters such as this one are difficult to implement. I'll try to write a basic rotation shader today.

I don't actually create mobile videos myself, but from what I've seen from samples various folk have posted, most seem to require a 90 degrees clockwise rotation, though I don't suppose you could really rule out a 90 degrees counter-clockwise rotation or maybe even a full 180 (how many different ways can you hold a camera that is not the correct way up?).

180 degrees rotation is done by setting both 'flip horizontal' and 'flip vertical' options, 270 degrees rotation is done by combining the diagonal flip with one of the other two options. I'll put that in the comments.

romulous
31st December 2013, 07:30
In my honest opinion, rotation options should be implemented natively in the video renderer's mixer or resizer filtering passes.

Indeed, Blight agrees - which is why he asked madshi to add it to madVR a while back. It's low on his to-do list though (and at the moment, madshi is not doing feature requests at all, so this will be a long time coming I'm afraid).


I'm well aware that there is no DirectShow interface to regulate video rotation, and integrated renderer filters such as this one are difficult to implement. I'll try to write a basic rotation shader today.

Thanks, at least that will be something in the meantime. For anyone wondering, this is the rotate filter I was testing:
http://videoprocessing.sourceforge.net/#rotate

Freeware and open source. It only works in RGB24 or RGB32, so you do have to load the colour space converter filter as well. 180 degree rotation works fine, as does vertical and horizontal (though vertical seems to not actually change the image in any way). 90 degrees, 270 degrees and diagonal all produce garbled images though:
http://i.imgur.com/4riVR5p.jpg

I'm told that would be because none of the video renderers support dynamic resolution changing, and that test clip is not the same width as it is height (meaning when you rotate it in certain ways, the resolution will change).

pirlouy
31st December 2013, 13:37
@JanWillem32: I'm quite sure you won't be interested, but maybe curiosity will persuade you.

Samsung TV offers a setting called "Dynamic Contrast" in options (not "CE dimming"), which changes original image, but I find it to be "nice looking" sometimes. You have 3 settings, but "low" is the best, 2 others change image too much.

Do you think you could write a shader for this, or would it be more complicated for the GPU ?
I suppose it changes pixel like this:
(post-resized by renderer)
Old Pixel: R=50; G=40; B=230
New Pixel:
R= 50+((50-128)*5%) = 46 (rounded)
G= 40+((40-128)*5%) = 36 (rounded)
B= 230+((230-128)*5%) = 235 (rounded)
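
The proposed formula can be checked with a few lines of Python. This only reproduces the arithmetic written out above (push each 8-bit channel away from 128 by 5%); whether Samsung's "Dynamic Contrast" actually works this way is the poster's guess, and the function name is hypothetical.

```python
def stretch_from_gray(value, amount=0.05, center=128):
    """Push an 8-bit channel value away from mid-gray by a fixed fraction."""
    return value + (value - center) * amount

old_pixel = (50, 40, 230)
new_pixel = tuple(round(stretch_from_gray(channel)) for channel in old_pixel)
```

With the example pixel this gives the same rounded values as listed in the post. The sketch does not clamp, so inputs near 0 or 255 would leave the 8-bit range.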

Do you think my reasoning is stupid ?:confused:
Is there a simple way to test this algorithm with MPC ?

fagoatse
31st December 2013, 14:15
@JanWillem32: I'm quite sure you won't be interested, but maybe curiosity will persuade you.

Samsung TV offers a setting called "Dynamic Contrast" in options (not "CE dimming"), which changes original image, but I find it to be "nice looking" sometimes. You have 3 settings, but "low" is the best, 2 others change image too much.

Do you think you could write a shader for this, or would it be more complicated for the GPU ?
I suppose it changes pixel like this:
(post-resized by renderer)
Old Pixel: R=50; G=40; B=230
New Pixel:
R= 50+((50-128)*5%) = 46 (rounded)
G= 40+((40-128)*5%) = 36 (rounded)
B= 230+((230-128)*5%) = 235 (rounded)

Do you think my reasoning is stupid ?:confused:
Is there a simple way to test this algorithm with MPC ?

AFAIK the dynamic contrast option works by measuring ambient light in one's room so I guess it's impossible to simulate it correctly using shaders alone.

JanWillem32
1st January 2014, 00:00
romulous, this shader will work for now. I can also write a variant that masks the artifacts that occur when using diagonal flipping on rectangular textures.

pirlouy, that's a static contrast method. It pushes values away from gray, and will crush near-black and near-white in the process. I don't like static or dynamic contrast filters, as these always cause distortion. (I've written several static contrast methods already, nonetheless.)
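
The crushing can be illustrated with a small CPU-side Python sketch on normalized values. It uses the same multiply-add form as the example shader further down in this post (sample times one plus contrast, minus the scaled center value) and assumes that the output surface clamps to [0, 1], which holds for integer surfaces but is an assumption here. The function names are hypothetical.

```python
def contrast_mad(sample, contrast=0.05, center=0.5):
    """Static contrast as a single multiply-add: s*(1+c) - center*c."""
    return sample * (1.0 + contrast) - center * contrast

def clamp01(value):
    """Clamp to the nominal [0, 1] interval, as an integer surface would."""
    return min(1.0, max(0.0, value))

near_black = clamp01(contrast_mad(0.01))   # detail just above black
near_white = clamp01(contrast_mad(0.99))   # detail just below white
mid_tone = contrast_mad(0.2)
```

The multiply-add form is algebraically the same as sample + (sample - center)*contrast. With a contrast of 0.05 around 0.5, every input below roughly 0.024 ends up at 0 and every input above roughly 0.976 ends up at 1 after clamping, so detail in those ranges is lost.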
It's possible to do correct environment light adaptations (see the CIECAM02 transforms for reference). These filters require a lot of parameters, but are reasonably simple otherwise. These filters don't offer 'low'/'medium'/'high' options by the way. There's no place for user preferences in such filters. The ambient light factors can both be measured and estimated (again, see the CIECAM02 shader).

Example shader code (a funny one, as it only requires sampling a pixel and one multiply-add operation):
#define CenterValueRed .5
#define CenterValueGreen .5
#define CenterValueBlue .5
#define ContrastRed .05
#define ContrastGreen .05
#define ContrastBlue .05
static const float4 Contrast = 1.+float4(ContrastRed, ContrastGreen, ContrastBlue, 0.);
static const float4 ScaledCenterValue = float4(CenterValueRed*ContrastRed, CenterValueGreen*ContrastGreen, CenterValueBlue*ContrastBlue, 0.);

sampler s0 : register(s0);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float4 s1 = tex2D(s0, tex);// original pixel
return s1*Contrast-ScaledCenterValue;// process contrast and output
}

// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// flip and rotate
// This shader should be run as a screen space pixel shader when enabling diagonal flipping, else both modes will work.
// This shader requires compiling with ps_2_0, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// If possible, avoid compiling with the software emulation modes (ps_?_sw). Pixel shaders require a lot of processing power to run in real-time software mode.
// This shader can flip and rotate full images.
// Note that when enabling diagonal flipping, the resizing factor has to be lowered in advance to allow the full image to be visible.

// FlipHorizontal, FlipVertical and FlipDiagonal: 0 is disabled, 1 is enabled
// To rotate by a quarter clockwise, enable FlipHorizontal and FlipDiagonal.
// To rotate by a half, enable FlipHorizontal and FlipVertical.
// To rotate by a quarter counter-clockwise, enable FlipVertical and FlipDiagonal.
#define FlipHorizontal 0
#define FlipVertical 0
#define FlipDiagonal 0

sampler s0;
#if FlipDiagonal
float2 c0;
float2 c1;
#endif

float4 main(float2 tex : TEXCOORD0) : COLOR
{
tex -= .5;
#if FlipHorizontal && FlipVertical
tex = -tex;
#elif FlipHorizontal
tex.x = -tex.x;
#elif FlipVertical
tex.y = -tex.y;
#endif
#if FlipDiagonal
tex = tex.yx*c1*c0.yx;
#endif
return tex2D(s0, tex+.5);// sample and output
}
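
The rotation recipes in the shader's comments can be checked by composing the coordinate maps. This is a sketch under stated assumptions: H, V and D are labels chosen here for FlipHorizontal, FlipVertical and FlipDiagonal, (x, y) is the texture coordinate after the shift by .5 with y pointing down, and the aspect scaling by c1*c0.yx is left out. Each map gives the source coordinate that an output pixel reads from.

```latex
\begin{align*}
H &: (x, y) \mapsto (-x, y), \qquad V : (x, y) \mapsto (x, -y), \qquad D : (x, y) \mapsto (y, x) \\
D \circ H &: (x, y) \mapsto (y, -x) && \text{quarter turn clockwise} \\
V \circ H &: (x, y) \mapsto (-x, -y) && \text{half turn} \\
D \circ V &: (x, y) \mapsto (-y, x) && \text{quarter turn counter-clockwise}
\end{align*}
```

For example, under D after H the output's top-right corner (.5, -.5) reads the source's top-left corner (-.5, -.5), so the top-left of the source ends up at the top-right of the output, which is a clockwise quarter turn.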

romulous
1st January 2014, 01:46
romulous, this shader will work for now. I can also write a variant that masks the artifacts that occur when using diagonal flipping on rectangular textures.

Hi Jan,

Thanks - just trying it out in MPC-HC now (using madVR). Just testing the 90 degrees clockwise flip to begin with. On some of the videos, I see this (I tested 5 videos, these occurred on 4 of them - the remaining video was fine):
http://i.imgur.com/AcbOWwi.png

Are those the artifacts that you were referring to, or are these different ones?

Thanks!

JanWillem32
1st January 2014, 04:06
This shader is a little bit heavier, but can mask artifacts:
// (C) 2013 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// flip and rotate with background mask
// This shader should be run as a screen space pixel shader when enabling diagonal flipping, else both modes will work.
// This shader requires compiling with ps_2_0, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// If possible, avoid compiling with the software emulation modes (ps_?_sw). Pixel shaders require a lot of processing power to run in real-time software mode.
// This shader can flip and rotate full images.
// Note that when enabling diagonal flipping, the resizing factor has to be lowered in advance to allow the full image to be visible.

// FlipHorizontal, FlipVertical and FlipDiagonal: 0 is disabled, 1 is enabled
#define FlipHorizontal 0
#define FlipVertical 0
#define FlipDiagonal 0
// BackgroundMask: red, green, blue and alpha vector to return on border values when FlipDiagonal is activated, intervals [0, 1]
#define BackgroundMask float4(0., 0., 0., 0.)

sampler s0;
#if FlipDiagonal
float2 c0;
float2 c1;
#endif

float4 main(float2 tex : TEXCOORD0) : COLOR
{
tex -= .5;
#if FlipDiagonal
float2 geometryswap = c1*c0.yx;
float2 border = .5*geometryswap;
if((abs(tex.x) > border.x) || (abs(tex.y) > border.y)) return BackgroundMask;// draw background color on background pixels
#endif
#if FlipHorizontal && FlipVertical
tex = -tex;
#elif FlipHorizontal
tex.x = -tex.x;
#elif FlipVertical
tex.y = -tex.y;
#endif
#if FlipDiagonal
tex = tex.yx*geometryswap;
#endif
return tex2D(s0, tex+.5);// sample and output
}

romulous
1st January 2014, 04:28
Yes, thanks - that seems to remove the artifacts pretty well. It's much appreciated - you make writing these things look so easy! :) Is how 'heavy' the shader is based on your video card, or video card+cpu, or overall system?

I think Blight was wondering what pixel shader profile was required - is that the "ps_2_0" listed in the header?

JanWillem32
1st January 2014, 05:55
This shader really isn't that heavy compared to complex filters, such as some forms of resizing, debanding, denoising, sharpening and frame interpolation (and it was indeed really easy to write as well). An old, low-end GPU/IGP might choke on executing this shader in some cases, but enabling a few shaders is usually not a problem. This particular shader indeed requires the Direct3D 9 minimum of PS 2.0 support from the GPU, as indicated. Not that it really matters, as a video renderer can simply auto-detect the ps_2_0, ps_2_a, ps_2_b and ps_3_0 levels from the D3D9 support caps report and use the highest level available. If a shader fails, that particular shader's title can simply be reported back to the user (already a required feature, as a video renderer supporting custom shaders has to deal with truly faulty shaders as well). There are also not that many active users left with a GPU that supports PS 2.0 or higher but not PS 3.0.
For MPC-HC I advised some time ago to no longer even store the pixel shader compiling level with every shader. The interface to the video renderers doesn't need this parameter at all. The pixel shader menus can simply default to PS 3.0 every time, and offer the other three modes for testing purposes only.
For MadVR the case is even easier. MadVR notes in the list of system requirements "graphics card with full D3D9 / PS3.0 hardware support". There is no use in compiling pixel shaders at a lower level than that.

James Freeman
1st January 2014, 11:44
@JanWillem32

Can you please make "Sharpen Complex 2" only for Chroma?

I am on to something big here (huge improvement to chroma upscaling resolution).
I can almost make 4:2:0 to look like 4:4:4.


I'll explain:

First I captured a Belle-Nuit (http://www.belle-nuit.com/test-chart) test chart in 4:4:4 (rgb32) video.
Then I ran it through x264 to make a 4:2:0 video (and lose chroma information).
Then I opened both videos side by side and activated your pixel shader called "chroma for SD&HD video input" to show only chroma information on both of them.
Then I activated "sharpen complex 2" (pre-resize) on the 4:2:0 video and carefully tweaked it to look like the 4:4:4 video (yes, it can even do that).

The idea is quite simple: Sharpen the chroma before upscaling.
The sharpened 4:2:0 chroma results are very nice, almost like the original 4:4:4 (giving the saturated colors back to high frequency details).
Now I only need to put the sharpened chroma together with the untouched luma for a perfect 4:2:0 -> 4:4:4 conversion.
That's why I need the Sharpen Complex 2 only for Chroma, leaving the Luma untouched.
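
To make the request concrete, here is a minimal sketch of sharpening restricted to the chroma channels. It is not the pack's "sharpen complex 2": it swaps in a plain four-neighbour unsharp mask, and it makes several assumptions. The input surface is assumed to hold Y'CbCr in .xyz (luma in .x, chroma in .yz), c1.xy is assumed to carry 1/width and 1/height of the input texture, and ChromaSharpen is a hypothetical strength parameter.

```hlsl
// Hypothetical sketch: unsharp mask on chroma only, luma passed through.
// Assumption: the input surface holds Y'CbCr in .xyz (luma in .x, chroma in .yz).
// Assumption: c1.xy holds 1/width and 1/height of the input texture.

// ChromaSharpen: hypothetical strength, 0 is disabled
#define ChromaSharpen .5

sampler s0 : register(s0);
float2 c1 : register(c1);

float4 main(float2 tex : TEXCOORD0) : COLOR
{
	float4 s1 = tex2D(s0, tex);// original pixel
	// average chroma of the four direct neighbours
	float2 blur = (tex2D(s0, tex+float2(-c1.x, 0.)).yz
		+tex2D(s0, tex+float2(c1.x, 0.)).yz
		+tex2D(s0, tex+float2(0., -c1.y)).yz
		+tex2D(s0, tex+float2(0., c1.y)).yz)*.25;
	s1.yz += (s1.yz-blur)*ChromaSharpen;// boost chroma detail, luma in .x stays untouched
	return s1;// output Y'CbCr with the original alpha
}
```

Note that this runs on a surface where chroma already has the same resolution as luma, so it only illustrates leaving luma alone. It does not reproduce sharpening the 4:2:0 chroma planes before they are up-sampled.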

Hope that captures your interest.

Happy New Year !