Video pixel shader pack [Archive] - Page 5

markanini

14th March 2011, 18:13

I'm excited about these scripts but I can't get my head around how to load them. Will any of these scripts be added to MPC-HC core?

Qaq

14th March 2011, 19:05

Open the shader editor by going to View > Shader Editor. In the title textbox at the top, enter the name of the shader, and select ps_3_0 from the drop-down list on the right. Open the shader file in a text editor, then copy and paste its contents into the shader editor. Close MPC. The shader will then be found under the Play > Shaders menu.

JanWillem32

28th March 2011, 19:51

For the latest pack I took some extra time to look up assembly math. It was pretty hard to improve my original shaders, but it was worth it. I made improvements to almost all shaders, and added a few new ones.
Color gates: These are a bit tricky to get working (the numerical intervals are narrow), but it's a popular filter. I've even seen this effect installed on an older smartphone (as a camera effect filter). It's an effect filter that allows an interval of colors to pass through unmodified, to highlight them, and changes the rest of the picture to grayscale.
Because the floating point surfaces behave differently compared to the integer types, it's sometimes necessary to make two different types of shaders for the same function.
To accommodate a common multi-pass function I added "optimized path for up-sampling floating point surfaces". It has 3 passes to up-sample a 4:2:0 source, and 2 passes to up-sample a 4:2:2 source. It's actually easy to merge multiple shaders that have only a single input into a single pass. To show how it's done, I also added all (optional) color control functions to step 3.
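
As a rough illustration of the color gate idea described above, here is a minimal Python sketch that works on a single pixel. It is not the shader from the pack: the gate is assumed to be an interval of hue, the grayscale is assumed to use Rec. 709 luma weights, and the names color_gate, hue_lo and hue_hi are made up for this example.

```python
import colorsys

# Rec. 709 luma weights. This is an assumption; the pack's shaders may differ.
LUMA = (0.2126, 0.7152, 0.0722)

def color_gate(rgb, hue_lo, hue_hi):
    """Pass a pixel unmodified if its hue lies in [hue_lo, hue_hi], else make it gray.

    rgb is an (r, g, b) tuple with components in [0, 1].
    hue_lo and hue_hi are fractions of a full turn in [0, 1] (hypothetical parameters).
    """
    r, g, b = rgb
    hue, _, _ = colorsys.rgb_to_hsv(r, g, b)
    if hue_lo <= hue <= hue_hi:
        return rgb
    y = LUMA[0] * r + LUMA[1] * g + LUMA[2] * b
    return (y, y, y)

# A gate around green (hue near 1/3): green passes, red turns gray.
green = (0.1, 0.8, 0.2)
red = (0.9, 0.1, 0.1)
gated_green = color_gate(green, 0.25, 0.42)
gated_red = color_gate(red, 0.25, 0.42)
```

A narrow interval such as the one above also shows why the gates are tricky to tune: a small shift in hue_lo or hue_hi decides whether a color is kept or grayed out.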

Deshi

29th March 2011, 08:45

Hi JanWillem32 !

I want to try your spline6 resizers with your "optimisation trick".
I mean modifying the height and width shaders as follows (for my 720p TV) :
float c1;
to
#define c1 (1/1280.)
float2 c1;
to
#define c1 float2(1/1280., 1/720.)
float c0;
to
#define c0 1280
float2 c0;
to
#define c0 float2(1280, 720)

But I was wondering about 2 things :
1) Do I have to modify the line : #define Magnify (4/3.) ?
2) If my input video is in 1080p, will it still work ? Because I would put these shaders in screen space.

Thanks

JanWillem32

29th March 2011, 09:48

1) Do I have to modify the line : #define Magnify (4/3.) ?
Yes, the normal scalers get input numbers for the scaling and positioning and use custom surface sizes to work with. External shaders don't have that luxury.
2) If my input video is in 1080p, will it still work ? Because I would put these shaders in screen space.
Another quote from those two shaders:
// This shader should be run as a screen space pixel shader if you are up-scaling.
// This shader should not be run as a screen space pixel shader if you are down-scaling.

I know, I really should get started to get 2-pass scaling working in the rendering engine. Additional scalers could have been integrated months ago if that item wasn't bugged.
Speaking of bugs, I messed up brightness, gamma and noise detection parameters for "sharpen complex v3 + deband + denoise" and "sharpen complex v2 + deband + denoise". These cause banding in low-light near-pure red, green or blue gradients. After some fine-tuning of those two I'll release another shader pack.

Deshi

29th March 2011, 10:44

Thanks for the quick reply !

So if I understand correctly I should put this :
#define Magnify (16/9.) with the fixed values for the "floats".

Does it mean that all video will be output in 16/9, thus altering the aspect ratio ?

Does your last comment mean that we should soon be able to choose the spline resizer instead of bicubic in the renderer settings of MPC-HC ?

One last for the road : :rolleyes:

Is there a benefit in putting deband + denoise before the resize is made ?

Thanks again.

JanWillem32

29th March 2011, 11:38

The correct item would be #define Magnify (2/3.) , as (1280/1920.) and (720/1080.) resolve to that fraction.
If I manage to get 2-pass scaling to work, I can probably get spline, Lanczos and other kinds of scalers integrated. The shader part is easy, but the (CPU-based) vertex setup of the 2-pass scalers is bad. It can't handle rotation at all and the current code uses incorrect math.
Sharpening kernels generally benefit from higher resolutions, but it depends on what you like. Just try some settings, but remember that most of the RGB-type shaders (including the default scalers) are written to be used on linear RGB input, so it's usually beneficial to include gamma conversion shaders in the filter chain.
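
The arithmetic behind the (2/3.) value can be checked with a few lines of Python. This is only a sketch of the calculation, not part of any shader; the function name magnify_factors is made up for the example.

```python
from fractions import Fraction

def magnify_factors(src_w, src_h, dst_w, dst_h):
    """Return the exact (width, height) scaling factors from source size to target size."""
    return Fraction(dst_w, src_w), Fraction(dst_h, src_h)

# 1920x1080 source scaled to a 1280x720 screen.
fw, fh = magnify_factors(1920, 1080, 1280, 720)
print(fw, fh)      # the factor for each dimension
print(fw == fh)    # equal factors leave the aspect ratio unchanged
```

Both factors reduce to 2/3 here, which is the value to put in the Magnify definition of the width shader and of the height shader.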

Deshi

29th March 2011, 13:30

The correct item would be #define Magnify (2/3.) , as (1280/1920.) and (720/1080.) resolve to that fraction.
To be honest I don't understand why I have to put (2/3.) if I give values to #define c1 (1/1280.), #define c1 float2(1/1280., 1/720.), #define c0 1280 and #define c0 float2(1280, 720).
But since I can't use the shader for both up/down-scaling it's not really an issue. For the down-scaling part it would be #define Magnify (3/2.), right ?
I've no idea how to get it in "shaders" or "screen space shaders" depending on resolution of the input...
Just for info : will it change the aspect ratio of the input ?

If I manage to get 2-pass scaling to work, I can probably get spline, Lanczos and other kinds of scalers integrated. The shader part is easy, but the (CPU-based) vertex setup of the 2-pass scalers is bad. It can't handle rotation at all and the current code uses incorrect math.
Knowing that the work is in progress is still good news :cool:
If you solve this, does it mean that the included renderer of MPC will be able to perform as MadVR ?
In a utopian setting, will it be possible to have a renderer in which you can choose/set options like resizer, sharpen, denoise, deband, deringing... ?

Sharpening kernels generally benefit from higher resolutions, but it depends on what you like. Just try some settings, but remember that most of the RGB-type shaders (including the default scalers) are written to be used on linear RGB input, so it's usually beneficial to include gamma conversion shaders in the filter chain.
The conversion of gamma was implied, but in my question I was deliberately referring only to deband + denoise, with all the sharpening values in the shader set to 0.
But then, is there a benefit to use the sharpen + deband + denoise in screen space ? Before re-converting to video RGB of course...

JanWillem32

29th March 2011, 16:22

To be honest I don't understand why I have to put (2/3.) if I give values to #define c1 (1/1280.), #define c1 float2(1/1280., 1/720.), #define c0 1280 and #define c0 float2(1280, 720).
The original frame is 1920×1080 and it's scaled to 1280×720, that's a factor of 2/3 in both dimensions.
#define c1 (1/1280.), #define c1 float2(1/1280., 1/720.), #define c0 1280 and #define c0 float2(1280, 720) only work in screenspace. For down-scaling I advise using the default registrations of c0 and c1. That's because a lot of 1080p source material isn't 1920×1080 at all. I have videos that use 1440×1080 and 1920×800, for example.
But since I can't use the shader for both up/down-scaling it's not really an issue. For the down-scaling part it would be #define Magnify (3/2.), right ?
(2/3.)
I've no idea how to get it in "shaders" or "screen space shaders" depending on resolution of the input...
I use .REG files to switch profiles (only works if the program isn't running). It's a manual method, but it's a lot faster than setting up shaders in the combine menus.
An example, for setting a preset of some shaders in both video resolution space and screenspace:
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Gabest\Media Player Classic\Shaders]
"Combine"="RGBtYCC|CRS5Ch420|CRS5Ch422|sc10a1.5"
"CombineScreenSpace"="CRS6h1080|CRS6w1080"
"Initialized"=dword:00000001

[HKEY_CURRENT_USER\Software\Gabest\Media Player Classic\Settings]
"ShaderListScreenSpace"="CRS6h1080|CRS6w1080|"
"Shaders List"="RGBtYCC|CRS5Ch420|CRS5Ch422|sc10a1.5|"
"ToggleShader"=dword:00000000
"ToggleShaderScreenSpace"=dword:00000000

(Yes, even I don't use those huge names for the shaders internally.)
Just for info : will it change the aspect ratio of the input ?
That only happens when the magnification factors for the height and width shaders are different.
Knowing that the work is in progress is still good news :cool:
If you solve this, does it mean that the included renderer of MPC will be able to perform as MadVR ?
In a utopian setting, will it be possible to have a renderer in which you can choose/set options like resizer, sharpen, denoise, deband, deringing... ?
I'll just try to work my magic in the limited amount of time I have. I'd love to work on a lot of things, but as I'm the only developer currently working on the video renderers, it's pretty hard to get everything done. We simply lack developers that can write C++, do some management in the bug tracker, maintain this forum section or be available on the IRC channel. Integrating some shaders to work like the final pass shader (color management+dithering) would be very nice, but it takes a lot of time to develop that kind of system.
The conversion of gamma was implied, but in my question I was deliberately referring only to deband + denoise, with all the sharpening values in the shader set to 0.
But then, is there a benefit to use the sharpen + deband + denoise in screen space ? Before re-converting to video RGB of course...
The higher the resolution you place the filter in, the lower the chance of aliasing becomes, but it also adds to the risk of blurring some low-contrast details that are already blurred by up-scaling. Higher resolutions use relatively more processing power than lower ones.
Filtering in the lower of the two resolutions will often blur fewer details, with the exception of fine checkerboard patterns and similar things (those just scale badly).
The "sharpen complex v3 + deband + denoise" and "sharpen complex v2 + deband + denoise" shaders can blur a lot if EdgeSharpen (detection limit) is disabled. Be careful with that parameter. The sharpen parameters should cause only very minor blurring when set to 0. Good luck in finding the settings you like.
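
For anyone who keeps several such profiles, the text of the .REG example above can also be generated. The following Python sketch only builds the text and writes nothing to the registry. It assumes that other profiles follow exactly the layout of the example (including the trailing "|" in the two list values under Settings); the function name reg_profile is made up.

```python
def reg_profile(shaders, screenspace_shaders):
    """Build .REG text shaped like the example profile (layout assumed from that example)."""
    base = r"HKEY_CURRENT_USER\Software\Gabest\Media Player Classic"
    combine = "|".join(shaders)
    combine_ss = "|".join(screenspace_shaders)
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + base + r"\Shaders]",
        f'"Combine"="{combine}"',
        f'"CombineScreenSpace"="{combine_ss}"',
        '"Initialized"=dword:00000001',
        "",
        "[" + base + r"\Settings]",
        # The list values in the example end with a trailing "|".
        f'"ShaderListScreenSpace"="{combine_ss}|"',
        f'"Shaders List"="{combine}|"',
        '"ToggleShader"=dword:00000000',
        '"ToggleShaderScreenSpace"=dword:00000000',
    ]
    return "\n".join(lines) + "\n"

profile = reg_profile(
    ["RGBtYCC", "CRS5Ch420", "CRS5Ch422", "sc10a1.5"],
    ["CRS6h1080", "CRS6w1080"],
)
print(profile)
```

The shader names passed in must match the names under which the shaders were saved in the player, as in the example.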

For Pixel Shader Scripts, test 24, I updated the "sharpen complex v3 + deband + denoise" and "sharpen complex v2 + deband + denoise" shaders. They now work properly again. I also added "optimized path for up-sampling chroma on integer surfaces" and cleaned up a bit of code.

CruNcher

29th March 2011, 17:07

JanWillem32, I'm not that familiar with the post-XP media architecture, but especially for Vista/7, wouldn't utilizing DirectCompute be much more powerful and also allow more flexibility for the player's post-processing (scaling): lower CPU overhead, less PCI-E transfer overhead and theoretically lower power consumption :) ?

So for the best efficiency, wouldn't it be better to utilize 2 different paths ?

Windows XP = Shader via VMR9 Renderless (DirectX 9, supported on most Nvidia/ATI/Intel GPUs)
Vista/7/8 = DirectCompute shader (any renderer ?) (DirectX 10/11, newer GPU generations, starting already with G92 for Nvidia, even earlier for AMD/ATI)

I haven't benchmarked Vista/7 video efficiency compared to XP yet, but according to Microsoft they improved rendering efficiency quite a lot. I'm still happy with XP performance though :)

It seems there are no real benchmarks of the differences in the media part, comparing Vista/7 with XP either with Aero or without it (and with DWM entirely disabled).

Though seeing that Intel, for example, is going Vista/7-exclusive with their Media SDK 2.0 is a little frightening (no official XP support anymore, though they are the only ones; it might also be an attempt to push their Hollywood-approved DRM ecosystem onto as many systems as possible faster, as users need to change to Vista/7 to utilize it and their partners' 1080p services).

"Specifically, DirectCompute technology helps accelerate the performance across several scenarios that historically took a long time to complete when just using the CPU. CyberLink's support of DirectCompute in their latest applications results in higher performance across these scenarios - something that our joint customers have asked for."

I guess Arcsoft already utilizes it for their Sim3D and SimHD. On Windows XP they are known to do it with either CUDA, OPENCL (mainly for ATI compatibility) or the CPU, the same as Cyberlink does for their Truetheater PP or Corel/Intervideo for WinDVD.

JanWillem32

31st March 2011, 21:34

I like DirectCompute too, but for graphics with synchronization for the threads it requires a DirectX 11 video card and a supported OS. DirectX 10 hardware can still use DirectCompute in offline mode, but I don't know if a DirectCompute shader on DirectX 10 hardware can be modified to render images.
In the future, many developers hope to indeed be able to set up a full rendering engine to be managed by the GPU. Rendering paths currently use the CPU at every step of drawing operations to set up the GPU and start rendering objects, while the input and output objects themselves are nearly always loaded in the video memory and cache.
I saw this article a few days ago, it seems that the manufacturers are ready to move to low-level control: http://www.bit-tech.net/hardware/graphics/2011/03/16/farewell-to-directx/ .

For more efficiency in any custom renderer, it would be best to drop the VMR-9 and EVR mixers and use a custom mixer (even for the DirectX 9 platform). A renderer written in DirectX 11 that's written to use level 4.0 instructions will also work on DirectX 10 and 10.1 hardware.
I would most certainly like to help writing a DirectX 11 renderer, but only once a custom mixer is done and the EVR CP-sync merge is completed (both are long overdue). That's going to require some more C++ developers. If anyone's interested to write code or manage communication for the project, they're most welcome.

Vista's WMP features EVR. If you compare that to the previous VMR-9 (windowed) renderer, it's indeed a big improvement, but it's still inferior to the quality provided in custom renderers.

Aero is a nice GUI renderer, it's not really that heavy on the CPU, GPU and the two kinds of RAM. The only problem I have with it is the forceful synchronization method in windowed mode (sometimes even in exclusive mode) that can cause tearing.

I don't know anything about Intel's Media SDK 2.0, but I do know that Intel never provided support for their GPU's DXVA on Windows systems older than Vista, so I'm not that surprised.

I'd definitely like to look at the code by Arcsoft. It's a lot of work to write and maintain code for three types of APIs. I've also never written anything with CUDA or OPENCL.

I fixed some things in the colorfulness gamma processing for the color control shaders. No new functions were added this time.

CruNcher

1st April 2011, 16:59

direct link: http://www.mediafire.com/?azp7ak75uy2u8z9

Yep, I see that what Huddy said did indeed start a lot of discussion over on beyond3d, but most see it as a step back from clean APIs to old-school direct hardware coding. Surely Microsoft won't let this happen, and many devs don't really want it either ;)
The Rendering Architect of the Frostbite 2.0 Engine even included it in his GDC presentation http://publications.dice.se/attachments/GDC11_DX11inBF3_Public.pdf ;)

About the Aero (DWM) thing: I saw this, though I couldn't confirm it myself and it might be FUD, but it seems plausible if these functions are really missing in WDDM 1.1. I wonder whether they were added back in WDDM 2.0, though ?

http://www.youtube.com/watch?v=ToFgYylqP_U
http://www.youtube.com/watch?v=ay-gqx18UTM

When WDDM 2.0 was announced back in 2006 at WinHEC it sounded really good, but we're still not there: http://forum.beyond3d.com/showthread.php?t=59068

http://download.microsoft.com/download/5/b/9/5b97017b-e28a-4bae-ba48-174cf47d23cd/pri103_wh06.ppt <- WinHEC 2006 presentation.

I tried some experiments on Win XP, capturing the Windows screen (via the bitblit function, which is slow on XP and said to be faster on Vista/7 because it goes through Direct3D), encoding it on the GPU and comparing that to a CPU framework. I could save some energy so far, but the frame drops on Direct3D 9 are heavy and the CPU overhead is still too high (I'm trying to capture game content, but with a mixed GPU/CPU kernel; I'm preparing a GPU-kernel-only test), and unmanaged is a pain (video capturing works quite well, especially if it's accelerated).

Though under XP and VMR9 I'm currently experiencing another playback problem, and I'm not sure where it comes from. I suspect Windows high-resolution kernel timing and a decoder that doesn't support it, but I'm not entirely sure: http://forum.doom9.org/showpost.php?p=1488882&postcount=144 . The Cyberlink guys seem to do it quite differently than other ISVs: http://forum.doom9.org/showpost.php?p=1488268 .

JanWillem32

2nd April 2011, 04:57

The sheets from that presentation are quite informative for people that can already develop for a DirectX-driven engine. It lacks the full presentation text or whitepaper, so it's not completely clear to me on all points. I did see that they are aiming for the level 5.0 instructions. I would be careful with that, as it will not work with DirectX 10 (level 4.0) and 10.1 (level 4.1) hardware at all. DirectX 11 hardware is still a bit rare in the general mid-to-high-end section of the market they are aiming for.
I still hope it will be possible to insert execution-level assembly code in the GPU as a processor unit in the future. Inserting execution-level assembly code without an API or driver translation has been possible for CPUs since they were first created (some assembly :p of the CISC to RISC elements required with the ancient x86+extensions instruction set).
For the parts with dependent-level instructions on the GPU (the typical vertex and pixel shaders), the assembly code is already nearly directly injected into the GPU, so those parts are really efficient already.

I don't know much about WDDM, as I've never worked on fundamental GUI elements. Of course, it also becomes a lot less of an obstacle in exclusive mode. I'll probably have to gain some experience with it to get windowed mode applications I'm working on to work properly.

Bitblit functions are slow in general for operations on the backbuffers, no matter the DirectX version (or OpenGL for that matter). It's much faster to make a screenspace copy of the last item in the actual render chain. The penalty is that if you incur an anti-aliasing filter or more items in the transfer step to a back-buffer, your screenshot of a rendered image will not have those filters.
Video capturing is also heavy because of the encoding load. Anyone that has done capturing, mastering and editing on high-quality digital video content (not the 8-bit consumer-grade junk), will acknowledge that converting more than a terabyte of raw images to JPEG2000 on even a good RAID storage system will take like forever. The 8-bit consumer-grade junk is heavy to encode in real-time conditions too, of course. Since most consumers and professionals lack the storage capacity and sufficient writing speed on the medium to write out raw images, it's often required to write out encoded video. Unless you use a dedicated external encoding solution, there will always be a big performance hit for capturing with a video codec active on the same processor.

When it comes to timing, I hope to finally get rid of the EVR and VMR mixers for the common renderer in MPC-HC. Neither will signal the allocator correctly about the timings, so a lot of correction is required to make both work (for convenience, I omit the cases with badly encoded video and the VC-1 timestamps problem in MPEG transport streams).
Maybe if developers come along, willing to help with the mixer and renderers in general, the project can finally improve in quality and performance. There has already been talk about changing the font rendering engine for usage in the subtitle, OSD and stats screen rendering. I hope we can gather enough resources and people to get that done.

burfadel

2nd April 2011, 07:57

@Janwillem32

The v25 shaders are great. The optimised path for up-sampling chroma on floating point surfaces works very nicely. The previous version I was using (v23?) had issues, so it's great that it's been resolved! I must admit, I have gone off the sharpen complex + deband + denoise filters; to me, it seems they are only ideal on almost perfect sources (using light denoise). Not just perfect encodes of ordinary sources!

I'm now using the optimised path for floating point surfaces, then the unsharp filter, which gives a nice result across all sources (not just across HQ source and encoding like the sharpen complex filters), and I add the greyscale noise right at the end. It's a very nice randomisation of the greyscale noise.

On the point of the greyscale filter, I found the strength a little high; I had to change it from 7 back to around 2 to look good on the TV. For me, the ideal amount of noise is just a little, as it appears at viewing distance. It does a good job of hiding banding, whilst bringing out other details and making the picture actually look sharper! Is it possible to have the deband shader by itself, instead of combined with the sharpen and denoise filter? I do realise that if you use the noise shader, it has to be placed after the sharpen/deband/denoise shader!

The worst thing about all these shaders is putting them into mpc-hc: you have to triple check to make sure all of them have been entered correctly, and you can't modify the shaders unless a video is playing (which is very silly IMO).

Anyways, thanks for the great work, its very much appreciated!

Edit:
I should point out that I do like the concept of the sharpen+deband+denoise filter, but for me, as suggested in the info, it did seem to pick the banding up more, and on some dark scenes it oversharpened some noise, which of course made it look not so good. Of course, a heavier denoise filter helps, but that could possibly remove some picture detail.

I think on high bitrate, 10 bit encodes of good quality digital sources, this filter chain would work really well (light denoise only).

I should also point out that even for normal material, it's a shader that seems to work well or not on a per-video basis!

The floating point optimised chroma upsample, then unsharp (I upped the strength slightly), then greyscale noise (I reduced the strength as said previously) works well with all material I tested it with. On a TV show I had recorded from HDTV, I've had two other people ask why it looked so much better on my TV than it did on their new LED HDTVs (watched live, plus replay from their own recordings). The funny thing about this is, the TV is a Toshiba CRT HDTV (the fact it's CRT I think helps), but since it doesn't have a HDMI output I have been using the Svideo input - 1024x768! I must admit the picture quality even with SDTV stuff looks amazing on this TV. Further to their astonishment, I had already encoded it to X264 CRF 18, and reduced the resolution to just below SDTV. I used FFDSHOW to resize the output to 1024x768.

I would like to know how to use the resize filters (I believe they're really resampling filters?), in terms of designating a fixed output size. Having fixed output sizes such as 1024x768, 1680x1050, 1920x1080 etc for screen sizes, I think would be quite beneficial, as I have no idea how to use them currently! If the aspect ratio comes out wrong, maybe setting the aspect override in MPC should correct for it...?

JanWillem32

2nd April 2011, 09:59

Thank you, burfadel. I've also noted that the "sharpen complex + deband + denoise"-type shaders are indeed sensitive to the input they get. A lot of the general color controls affect the normal debanding capacity a lot, and like many shaders, they are meant for linear RGB.
// This shader benefits from converting to linear RGB, instead of using video gamma input directly.
Maybe I have to write a better description for that. It just means that this type of shader expects linear RGB by means of pre-processing by one of the two "gamma conversion of video RGB to linear RGB"-type shaders and post-processing by the color management function on the linear input setting, or one of the two "gamma conversion of linear RGB to video RGB"-type shaders.
Using a filter for linear inputs is quite common. It's often used in Photoshop, and many other renderers use it, too. The bilinear and bicubic scalers in MPC-HC assume a linear input, as well.
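
To show why filtering on linear RGB matters, here is a minimal Python sketch that averages a black and a white value, once directly on the video-gamma values and once in linear light. It assumes a pure power-law gamma of 2.2, which is a simplification; the actual gamma conversion shaders in the pack may use a different transfer function, and the function names are made up.

```python
GAMMA = 2.2  # assumed pure power-law video gamma; real transfer curves differ

def to_linear(v):
    """Convert a video-gamma value in [0, 1] to linear light (assumed power law)."""
    return v ** GAMMA

def to_video(v):
    """Convert a linear-light value in [0, 1] back to video gamma (assumed power law)."""
    return v ** (1.0 / GAMMA)

def blend_linear(a, b):
    """Average two video-gamma values in linear light, then re-encode the result."""
    return to_video((to_linear(a) + to_linear(b)) / 2.0)

black, white = 0.0, 1.0
naive = (black + white) / 2.0          # averaging the encoded values directly
correct = blend_linear(black, white)   # convert, average, convert back
print(naive, round(correct, 3))
```

The directly averaged value comes out darker than the one averaged in linear light, which is the kind of error a blur, sharpen or scaling kernel makes when it is fed video gamma input.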

The recent builds I've been posting on the main thread have a modified version of "semi-random colored surface noise" as an option for the dithering. It features contour detection, to exclude source areas that are already noisy, and the quantized noise ranges from 3 to 31 levels (generally lower than the standard noise shaders). The "sharpen complex + deband + denoise"-type shaders are detail detection shaders for multiple levels in multiple directions. The processing load from executing the blur and sharpen filters is very little compared to this detection they use. The sharpen filters can be disabled (instructions are included what numbers to set), but they can be a bit blurry without the balancing of the sharpening parts. Just try some settings, and test to get the image processing you want.

I don't like how the shaders are switched, either. VMR-9 (renderless) even has a bug that disallows access to the screenspace shaders menu. The easiest way to set shaders without having to start a video, is currently by using .REG profiles. It might also be better if shaders could be loaded from a "shaders" folder, instead of a .INI file or the registry. That would make inserting and removing a lot easier. I'll have to look if I can make that possible.

The resizers are prototypes for integration. I currently use the spline6 types, as you can see in the .REG example of post #209. I made a few presets with different scaling and noise filtering amounts. The scalers are actually quite simple: disable all internal scaling of MPC-HC, set a fixed scaling amount for the horizontal and vertical dimensions and scale the picture. The amounts are completely fixed, so I have to use different ones for all input resolutions. If I wanted to change the output resolution, I would have to make a completely new set of shaders and registry presets for that resolution, too.

Deshi

6th April 2011, 16:01

@burfadel

Hi, I'm trying to understand your chain of shaders, but since I'm not so good at English...
Do you use Y'CbCr instead of RGB shaders simply because of your CRT output ?

So your chain looks like that :
Shaders,
- 1. RGB to Y'CbCr for SD&HD video input for floating point surfaces (in the optimized folder)
- 2. special 4÷2÷0 to 4÷2÷2 intermediate Catmull-Rom spline5 chroma up-sampling (in the optimized folder)
- 3. special 4÷2÷2 Catmull-Rom spline5 chroma up-sampling and color controls (in the optimized folder)
- unsharp luma mask for SD&HD video (higher settings)
Screenspace Shaders :
- sharpen + deband + mild denoise (sharpen at 0 ?)
- semi-random grayscale noise (lower settings)

Am I right ? Hopefully... :rolleyes:

burfadel

6th April 2011, 16:26

I'm not so sure about the Y'CbCr, JanWillem32 might be able to answer that better for you! I'm actually outputting from the graphics card as s-video, so 1024x768. I disabled mpc-hc's internal h264 DXVA and ffmpeg decoders, as well as the Xvid decoder and using ffdshow instead. I use the deband filter in ffdshow, with strength of around 1.8, and the radius at the default 16, as well as ffdshow's post-processing filter. I use the ffdshow resizer set to 1024x768, using spline resizer for both luma and chroma, and have some luma sharpening and a little chroma gaussian blur+sharpen when resizing.

For the semi-random greyscale noise I actually had to decrease the strength as 7 ended up with a very noisy picture, I have it set at around 2 instead. The semi-random greyscale noise shouldn't be run as a screenspace shader either, according to the comments in the shader file.

My shader chain:
- 1. RGB to Y'CbCr for SD&HD video input for floating point surfaces (in the optimized folder)
- 2. special 4÷2÷0 to 4÷2÷2 intermediate Catmull-Rom spline5 chroma up-sampling (in the optimized folder)
- 3. special 4÷2÷2 Catmull-Rom spline5 chroma up-sampling and color controls (in the optimized folder)
- 16-235 to 0-256 for SD&HD video input
- unsharp luma mask for SD&HD video (higher settings)
- semi-random grayscale noise (lower settings)

The 16-235 to 0-256 for SD&HD video input is highlighted because I was told it shouldn't be needed! However, for me anyway, the picture does look better both on the computer and TV with it :)

JanWillem32

7th April 2011, 00:25

@Deshi: Y'CbCr, xyY and other luma-chroma systems are just another way of encoding visibly representable colors. They can always be transformed back and forth to the R'G'B' values by matrices. The matrix and the following gamma correction to linear RGB depends on the specification of the input image. The choice of calculating in a luma-chroma, R'G'B' or linear RGB system depends on the filter's transformation type. With the exception of "0-256 to 16-235 for SD&HD video output" (memo: needs proper naming), color controls, color management and dithering, all filters will only depend on the specification of the input image. It's perfectly normal to have filtering with luma-chroma, R'G'B' and linear RGB systems in a single processing chain.

@burfadel: The "16-235 to 0-256 for SD&HD video input" shader is a correction shader for when the mixer fails (very rare). On normal video it breaks the white point, because of the chroma transformation. That's generally very undesirable. If the white point should be adapted, it should be done in linear RGB space with "brightness, contrast and gamma control for RGB", or more conveniently, with the color controls provided in "3. special 4÷2÷2 Catmull-Rom spline5 chroma up-sampling and color controls".
The other point is dynamics: that shader removes 13% of the normal intervals. That's a lot of clipping in the white and black ranges. If for example a night scene is encoded at the bottom [0, .125] R'G'B' interval (quite normal), this shader chain will crush half of the bottom brightness range in that scene to black.
The usual method of adjusting the color controls is to use grayscale gamma and colorfulness gamma to balance the picture.
If that doesn't work out, try colorfulness gamma, hue, saturation and the three RGB gamma controls instead.
Lastly, if that's not enough, try the colorfulness gamma, hue, saturation and the nine RGB gamma, brightness and contrast controls instead. The brightness and contrast controls will damage dynamic ranges (black and white crushing) and will change the white point if the values for red, green and blue are not the same.
At least with this method you have direct control over what kind of color transformations are performed.
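The black crushing described above can be checked with a little arithmetic. The following Python sketch is not code from the shader pack: it assumes the expansion is the plain linear map (x - 16/255)*255/219 clamped to [0, 1], and the helper name expand_limited_to_full is made up for illustration. The real shader works on Y'CbCr and its exact constants are not shown in this thread.

```python
def expand_limited_to_full(x):
    """Hypothetical helper: map [16/255, 235/255] onto [0, 1] and clamp the result."""
    y = (x - 16.0 / 255.0) * 255.0 / 219.0
    return min(max(y, 0.0), 1.0)

# A night scene that only occupies the bottom [0, .125] interval of full-range data.
samples = [i / 1000.0 for i in range(0, 126)]  # 0.000 ... 0.125
crushed = [x for x in samples if expand_limited_to_full(x) == 0.0]
fraction_crushed = len(crushed) / len(samples)
print(f"{fraction_crushed:.0%} of the night-scene levels end up at black")
```

Under these assumptions every level below 16/255 (about .063) collapses to 0, which is half of the [0, .125] interval used by the example scene.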

Deshi

7th April 2011, 15:31

@burfadel & @JanWillem32

Thanks for the answers

@JanWillem32

What are the differences between floating point surfaces and integer surfaces ?

JanWillem32

7th April 2011, 21:11

What are the differences between floating point surfaces and integer surfaces ?
DirectX 9 specifies a lot of different formats for projecting colors on a surface: http://msdn.microsoft.com/en-us/library/bb172558%28v=VS.85%29.aspx . It's basically a setting that allows the output of 32-bit floating point math of RGBA colors to be rounded and stored into memory or a file format. Note that old GPUs may have 16- or 24-bit internal processing pipelines, but even low-budget video cards have been internally 32-bit for years now.
Currently available formats in MPC-HC for the surface, backbuffer and display modes: X8R8G8B8/A8R8G8B8 and A2R10G10B10.
X8R8G8B8/A8R8G8B8 is the 8-bit RGB mode, 24 integer bits are used for RGB, 8 bits are discarded. (The subtitle and OSD screen do use the alpha transparency channel of A8R8G8B8 to make overlays over the main video.)
A2R10G10B10 is the 10-bit mode, 30 integer bits are used for RGB, 2 bits are discarded. It requires enabling the 10-bit RGB Output mode to activate it on the surface and backbuffer. Setting 10-bit output on the display mode requires Windows 7 or Server 2008 R2 and activation of the D3D Fullscreen Mode.
A16B16G16R16F and A32B32G32R32F are floating point surfaces, activated by Half Floating Point Processing and Full Floating Point Processing, respectively. These modes will override 10-bit output mode on the working surfaces mode, but not on the backbuffer and display mode.
Normal range RGB color data has an interval of [0, 1]. The integer types map to that range by dividing the stored integer by 255 or 1023 for the 8- and 10-bit formats, respectively. Rounding a value to the nearest of those integer steps when it is stored is called quantization. The floating point formats can store data in that interval natively, because of the available exponent bits.
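As a rough illustration of that rounding step, here is a Python sketch (not code from MPC-HC or the shader pack; the function name quantize is made up) that stores a normalized value in 8 and 10 bits and reads it back:

```python
def quantize(x, bits):
    """Round a normalized [0, 1] value to the nearest integer code and scale it back."""
    levels = (1 << bits) - 1        # 255 for 8-bit, 1023 for 10-bit
    code = int(x * levels + 0.5)    # round to the nearest integer code
    return code, code / levels

x = 0.5
code8, stored8 = quantize(x, 8)
code10, stored10 = quantize(x, 10)
print(code8, stored8)
print(code10, stored10)
```

Neither format can hold .5 exactly, but the 10-bit surface lands closer to it than the 8-bit one.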
For the floating point formats (bit counts per pixel for A16B16G16R16F and A32B32G32R32F respectively, listed in the sign, exponent, significand order of the IEEE floating point data format);
3 sign bits are used to allow RGB values to be negative, this modifies a format's interval to [-1, 1];
15 or 24 bits are used for exponent data to scale the significand to very small binary fractions, such as i/4096 or to very large ones, such as i*8192, this modifies the format's interval to [-∞, ∞];
30 or 69 significand (mantissa) bits are used for most of the normal range RGB data;
16 or 32 bits (the alpha channel) are discarded.
A basic overview of the technical data on floating point specifications: http://en.wikipedia.org/wiki/IEEE_754-2008 .
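To make the bit counts above concrete, this Python sketch splits a single half-precision channel into its fields, using the standard struct module's binary16 format. One channel of A16B16G16R16F has 1 sign bit, 5 exponent bits and 10 significand bits, which gives the 3, 15 and 30 bits listed above for the three color channels. The function name half_fields is made up for illustration.

```python
import struct

def half_fields(value):
    """Split a value stored as IEEE 754 binary16 into (sign, exponent, significand) bit fields."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    sign = bits >> 15                # 1 bit
    exponent = (bits >> 10) & 0x1F   # 5 bits, stored with a bias of 15
    significand = bits & 0x3FF       # 10 bits
    return sign, exponent, significand

print(half_fields(1.0))
print(half_fields(-0.5))
```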

The reason why I have to write different shaders for integer and floating point formats, is mainly because of the sign bit. The signed values allow the normal Y'CbCr intervals to be used: [0, 1], [-.5, .5] and [-.5, .5]. Integer surfaces require an offset of .5 to compensate for that.
Another issue is exponentiation of negative values by even exponents. For example: .25² = (-.25)² = .0625 . For color data with negative values, it's undesirable for them to become positive like that, so compensation code to keep negative values negative is required.
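A minimal sketch of such compensation, in Python rather than HLSL and not taken from the shader pack: raise the absolute value to the power and copy the sign of the input back onto the result. In a shader the same idea would typically be written as sign(x)*pow(abs(x), e). The name signed_pow is made up for illustration.

```python
import math

def signed_pow(x, exponent):
    """Raise |x| to a power and give the result the sign of x."""
    return math.copysign(abs(x) ** exponent, x)

print((-0.25) ** 2)          # plain squaring loses the sign
print(signed_pow(-0.25, 2))  # the sign of the input is kept
```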

Fluffbutt

11th April 2011, 16:08

Just out of interest, do the shaders in MPC_HC not work in the x64 version?

Mine are switched off, and the options to turn them on are greyed out and untouchable.

G_M_C

12th April 2011, 07:50

@ Fluffbutt: Did you try another renderer? EVR-CP is the one I use, and it all works for me.

@JanWillem32: I only want to use your shaders for up-sampling chroma. I use full fp processing and I use 10 bit output.

In 'combine shaders' I have set '4÷2÷0 to 4÷2÷2 intermediate Catmull-Rom spline5 chroma up-sampling for SD&HD video input' followed by '4÷2÷2 Catmull-Rom spline5 chroma up-sampling for SD&HD video input'.
But now I read you have 'optimized shaders' for chroma up-sampling. Should I use those?

My question: Most things we watch are x264 encoded / 4:2:0 video. I only want to use your shaders for up-sampling chroma, with full fp processing and 10 bit output, to get the best colors (image) I can on my setup. Could you write down how I can achieve that, and which shaders to use?

Fluffbutt

12th April 2011, 09:28

G_M_C !!!!

Thank you!!

It works perfectly - the last time I tried EVR-CP (on 7 x32 install) it complained about missing entries in about 3 dll files (evr.dll, or one)

I didn't think to try it on the new install of 7 x64 because of that.

<I bow to your erudition>

G_M_C

13th April 2011, 16:24

Did you fall off the WWW, JanWillem?
;)

JanWillem32

16th April 2011, 13:56

Sorry I didn't respond earlier.
Anyway, for both x86 and x64: EVR CP, EVR Sync and VMR-9 (renderless) have shader support. VMR-9 (renderless) needs to have a bug fixed for the grayed out screen space shaders item. Also, I'd like the shader menus to work in offline mode, instead of only during rendering.

@G_M_C: The optimized path uses a set of 3 shaders to up-sample 4:2:0 input (1, 2 and 3) or two shaders to up-sample 4:2:2 input (1 and 3). Just chain the shaders as the very first items. The last shader also features full color controls and a linear gamma output option. For the rest, it's up to you. There are a lot of possible shader chains, and what people like to render with them is all very different.
Renderer settings I use:
D3D Full Screen Mode, 10-bit RGB Output, Full Floating Point Processing, Disable desktop composition (Aero), Flush GPU after Present
(No VSync is required on a CRT in D3D Full Screen Mode, my projector does need VSync, but it features native 24/1.001 and 24 Hz modes to make it a lot easier.)

Shader chain:
1. RGB to Y'CbCr for SD&HD video input for floating point surfaces
2. special 4÷2÷0 to 4÷2÷2 intermediate Catmull-Rom spline5 chroma up-sampling for SD&HD video input
3. special 4÷2÷2 Catmull-Rom spline5 chroma up-sampling for SD&HD video input (linear output enabled, sometimes I also set colorfulness gamma to lower a high colorfulness on some video sources)
sharpen complex, deband and denoise, r=5 (denoise filtering strength dependent on the source)

Screenspace shaders: (I hope I can integrate some new scalers soon...)
Catmull-Rom spline6 height resizer (with correct scaling preset)
Catmull-Rom spline6 width resizer (with correct scaling preset)
final pass: color management with an ICC profile installed system-wide and random ordered dithering

I've re-written the sharpen complex, deband and denoise combination, and I'm going to rename it to something more correct. There's one dilemma: the type of composing code for the detection method has two good candidates, dot and length. I'd like users to try out which they like most, so I included both types. Both types are about equally heavy in processing, give about the same amount of sharpening, have the same controls for the parameters, are now more gamma-corrected than their predecessors and have correct math.
To directly see the difference between the dot and the length methods, I've included modified RGB grayscale shaders, featuring both methods. Linear gamma conversion shaders are also included, as all these shaders require linear RGB input.
Please tell me what you see, what I can improve, and what you like or dislike about these shaders. I'll update the main shader package with some bug fixes and new items soon.
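The pack's actual composing code is not shown in this thread, so the following Python sketch is only one plausible reading of "dot" versus "length" for the grayscale test shaders, and both formulas are assumptions: the dot variant as dot(rgb, 1/3), which is the arithmetic mean, and the length variant as length(rgb)/sqrt(3), which is the root mean square. The function names are made up for illustration.

```python
import math

def gray_dot(rgb):
    """Assumed 'dot' variant: dot(rgb, 1/3), i.e. the arithmetic mean of the channels."""
    return sum(c / 3.0 for c in rgb)

def gray_length(rgb):
    """Assumed 'length' variant: length(rgb)/sqrt(3), i.e. the root mean square."""
    return math.sqrt(sum(c * c for c in rgb) / 3.0)

for rgb in ((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), (0.2, 0.6, 0.4)):
    print(rgb, gray_dot(rgb), gray_length(rgb))
```

With these definitions both methods agree on neutral grays, while the length form gives more weight to saturated colors.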

G_M_C

21st April 2011, 10:25

direct link: http://www.mediafire.com/?wouof3og84n97va

Thx for the tips JanWillem; I've been running the upsampling shaders 1-2-3 for a short while now, and it seems to work perfectly. GPU usage seems to be significantly lower (25~30 with 1080p on my HD5770, which switches to 400 GPU/900 Mem when running the shaders) with these three, as opposed to the 'regular'/non-optimized shaders.

JanWillem32

25th May 2011, 17:50

After a few final touches, I've packed the v1.0 of Video pixel shader pack.
Compared to the previous version, I've updated all shaders. A lot of bugfixes, optimizations and a few new filter types were added. See the new opening post of this thread for reference.

For "sharpen complex, deband and denoise" I've been trying to get gamma+brightness correction sorted out. Especially small gradients with patterns are difficult to blur and sharpen. A dark scene makes that even more difficult.
I've been using a video gamma to linear factor of 2.6 to compensate a bit for that (the normal shader uses 2.4). Near-black source banding is unfortunately very common (and inevitable with current standards) in consumer-grade video. Bending the gamma curve to a higher value does make the picture generally quite dark in most videos, but is in my case quite effective for keeping darker scenes with about the same quality as lighter ones.
I haven't found settings/filters yet that can really work with a lower gamma (2.2 to 2.4), without using a synthetic video input in the renderer (those simply don't have the source banding problem). I still need to work on that.
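To show how much the choice of exponent matters near black, here is a small Python sketch. It assumes a pure power-law conversion to linear light; the exact transfer function used by the shaders is not shown in this thread, and the name to_linear is made up.

```python
def to_linear(v, gamma):
    """Pure power-law conversion from a video-gamma value in [0, 1] to linear light."""
    return v ** gamma

for v in (0.05, 0.10, 0.20, 0.50):
    print(v, to_linear(v, 2.4), to_linear(v, 2.6))
```

Black and white stay where they are with either exponent, but every value in between comes out darker with 2.6 than with 2.4, and the relative difference is largest for the darkest inputs.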

G_M_C

26th May 2011, 20:16

Jan, i want to report this:
I'm using R25 of your optimized shaders for chroma upsampling on floating point surfaces. I have a HD5770 running on Win7-64. I use MPC-HC 32 bit, your test version R2964 (where ctrl-j still worked as it should).

When I enable RGB color controls (#define RGBColorControls 1) and set blue brightness slightly higher (#define BlueBrightness 0.1) I get a fully blue screen, as if BlueBrightness were defined at 10 instead of 0.1.

Jan-Willem; this problem still remains with v1.00

JanWillem32

26th May 2011, 22:45

That could be something hardware-specific, as I can't replicate it. I'll make a few variants that use other assembly structures to test.
Try for example:
s1.b += BlueBrightness;
instead of the complete line that processes contrast and brightness for RGB.

Video pixel shader pack v1.1 changelog:
"brightness, contrast and gamma control for RGB": corrected brightness calculation.
"cubic B-spline6 width resizer" and "cubic B-spline6 height resizer": corrected naming for variables.

G_M_C

30th May 2011, 17:21

Jan-Willem; Have not found time to test the changes, sorry.

Will do testing as soon as i can, using newest drivers/version of your test-builds etc.

Roco

17th June 2011, 15:29

JanWillem32,
I'm trying to write a shader for MPC-HC. Is it possible to take two different parts of the screen (e.g. the first third and the last third along the x axis) and put them side by side to compose a new frame, or a tile? Alternatively, can you point me to where I can ask this question? Or is there a relevant example?

JanWillem32

17th June 2011, 16:45

That one's not so hard. Note that this shader will only use point sampling/nearest neighbor to move the parts around. Sub-pixel interpolation for scaling and moving items requires adding code for interpolation. If you want to construct something like the sphere or wave shaders, point sampling just won't look good.
sampler s0;

float4 main(float2 tex : TEXCOORD0) : COLOR
{
return (tex.x < 1/6. || tex.x > 5/6.)? float4(0, 0, 0, 1)// black borders
: tex2D(s0, tex+float2((tex.x > .5)? 1/6. : -1/6., 0));// slide 1/3. horizontal parts inward
}
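The coordinate logic of this shader can be tried out on the CPU. The Python sketch below is an illustration, not part of the shader pack: it generalizes the same idea to a table of destination rectangles with a source origin each, using normalized [0, 1] coordinates. The names remap and TILES are made up, and the table reproduces the two horizontal thirds moved by the shader above (apart from which side the exact .5 column falls on).

```python
def remap(tex, tiles):
    """Return the source coordinate to sample for output coordinate tex, or None for black.

    tiles is a list of (dst_x0, dst_y0, dst_x1, dst_y1, src_x0, src_y0) tuples.
    """
    x, y = tex
    for dx0, dy0, dx1, dy1, sx0, sy0 in tiles:
        if dx0 <= x < dx1 and dy0 <= y < dy1:
            # Same offset inside the tile, measured from the source origin instead.
            return (x - dx0 + sx0, y - dy0 + sy0)
    return None

# First third and last third of the source, placed side by side in the middle.
TILES = [
    (1 / 6, 0.0, 0.5, 1.0, 0.0, 0.0),
    (0.5, 0.0, 5 / 6, 1.0, 2 / 3, 0.0),
]

print(remap((0.25, 0.5), TILES))
print(remap((0.75, 0.5), TILES))
print(remap((0.05, 0.5), TILES))
```

In a pixel shader each table row would become one branch of a conditional like the one above, with tex2D sampling at the returned coordinate.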

Roco

18th June 2011, 11:12

JanWillem32, thank you very much! :)
I asked this question to understand the fundamentals by example without bothering you too much. This is 100% clear and I've learned a lot, but I'm having a hard time extending it to move an arbitrary part of the screen to a desired position along both the x and y axes. I currently don't need borders, but it was very useful to know how, thanks!

I just want to arrange parts of the original frame to compose a new frame which will be smaller, so this final composition would be scaled to cover the whole frame. I'm working on it and although I have some progress, it's more difficult than I thought -it's like trying to solve an ancient mind puzzle with minimum input data! I know I should be reading books instead -and I usually do, lots of them, but spending a month to study 1000 things in order to use just 1, is not very efficient, especially when there is no spare time left.

I don't want to mess up with your thread, but if you can provide me just a hint about how to take 3 arbitrary parts (x, y) and put them into arbitrary positions (x, y) to build a new borderless frame, it would be of great help -I'd greatly appreciate it.

EDIT: Movement at pixel units is all I need in my case. About scaling, I might have to think about interpolation next, thanks!

JanWillem32

18th June 2011, 12:57

Don't worry about a messy thread. I'm fine with this, as long as it draws some attention.
Randomization code is expensive, but the two noise effect shaders have examples of how to implement it, if you really need it.
Ordered switching is a lot less expensive. A nice example of that is the random ordered dithering code. With a bit of editing, you can also use that method on other textures, too:
// (C) 2011 Jan-Willem Krans (janwillem32 <at> hotmail.com)
// This file is part of Video pixel shader pack.
// This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
// This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

// flip and rotate sampling direction for RGB
// This shader can be run as a screen space pixel shader.
// This shader requires compiling with ps_2_0, but higher is better, see http://en.wikipedia.org/wiki/Pixel_shader to look up what PS version your video card supports.
// If possible, avoid compiling with the software emulation modes (ps_?_sw). Pixel shaders require a lot of processing power to run in real-time software mode.
// This shader will flip and rotate the red, green and blue components individually every 32 frames.
// Note that this shader is only accurate in rotating squares.

sampler s0;
float3 c0;

float4 main(float2 tex : TEXCOORD0) : COLOR
{
float ct = frac(c0.z/256.+.0625);// 8-state counter

float2 rs, gs, bs, txi = tex.yx;
if(ct < .125) {rs = tex*float2(-1, 1); gs = txi*float2(-1, 1); bs = tex*-1;}
else if(ct < .25) {rs = txi*float2(1, -1); gs = txi*-1; bs = tex;}
else if(ct < .375) {rs = tex*float2(1, -1); gs = txi; bs = tex*float2(-1, 1);}
else if(ct < .5) {rs = txi*float2(-1, 1); gs = tex*-1; bs = txi*float2(1, -1);}
else if(ct < .625) {rs = txi*-1; gs = tex; bs = tex*float2(1, -1);}
else if(ct < .75) {rs = txi; gs = tex*float2(-1, 1); bs = txi*float2(-1, 1);}
else if(ct < .875) {rs = tex*-1; gs = txi*float2(1, -1); bs = txi*-1;}
else {rs = tex; gs = tex*float2(1, -1); bs = txi;}

return float4(tex2D(s0, frac(rs)).r, tex2D(s0, frac(gs)).g, tex2D(s0, frac(bs)).ba);// sample RGB positions and output
}
It takes a while before you get used to programming for SIMD vectors. Playing with the basic tex.xy, tex.yx, and the 1-x or 1-y variants is a nice start. Note that if you set the U and V sampler states on the input to wrap mode, you can remove the frac() parts for the sampling position. The same goes for the c0 register: 3 out of its 4 numbers aren't used in this case.
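The 8-state counter at the top of main() can be followed frame by frame with this Python sketch. It assumes c0.z is a counter that increases by one per rendered frame, which matches the "every 32 frames" comment in the shader; the function name state is made up.

```python
def state(frame):
    """Index 0..7 of the branch taken for a given frame counter value (c0.z in the shader)."""
    ct = (frame / 256.0 + 0.0625) % 1.0   # frac(c0.z/256. + .0625)
    return int(ct * 8)                    # ct < .125 -> 0, ct < .25 -> 1, and so on

for frame in (0, 15, 16, 47, 48, 239, 240):
    print(frame, state(frame))
```

The .0625 offset shifts the cycle by half a state, so the first switch happens after 16 frames and every later one 32 frames apart, wrapping around after 256 frames.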

Roco

19th June 2011, 14:39

Good example JanWillem32,
with your two posts and a little experimentation, I finally managed to have results! :)
Thank you very much!

JanWillem32

19th June 2011, 20:07

You're very welcome. If you need any more code examples or optimization, please tell me. If you've finished something, I can also include a commented and copyrighted version of the code in a next release.

Qaq

23rd June 2011, 17:19

JanWillem32, I'm trying to figure out the best way to fix that "Fellowship Of The Ring" green tint:
http://www.nerdsociety.com/unboxing-lotr-ee-bluray-green-tint-issue/
Is it hard to make a special shader for green color correction? And thanks for your work.

JanWillem32

23rd June 2011, 17:55

I don't think a hue or saturation shift is useful for this problem ("brightness, contrast, grayscale gamma, colorfulness gamma, hue and saturation control for SD&HD video input").
For "brightness, contrast and gamma control for RGB" the green controls are all separate. "GreenGamma" would be the first to experiment with, as this one doesn't affect the dynamic range. If that's not enough, you can try Brightness and Contrast settings.
Because the optimized forms of the 4:2:2 chroma up-sampling shader were easy to combine with all color controls, they are also available in those shaders.
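The difference between a gamma control and the brightness and contrast controls can be sketched in a few lines of Python. This is an illustration, not code from the pack: the additive brightness matches the "s1.b += BlueBrightness;" line quoted earlier in the thread, while the multiplicative contrast and the clamp to [0, 1] are assumptions, and the function names are made up.

```python
def clamp01(v):
    """Clamp a value to the displayable [0, 1] range."""
    return min(max(v, 0.0), 1.0)

def apply_gamma(v, gamma):
    """Power-law control: 0 stays 0 and 1 stays 1, so nothing is clipped."""
    return v ** gamma

def apply_brightness_contrast(v, brightness, contrast):
    """Assumed linear form v*contrast + brightness, clamped to the displayable range."""
    return clamp01(v * contrast + brightness)

for v in (0.0, 0.5, 0.95, 1.0):
    print(v, apply_gamma(v, 1.2), apply_brightness_contrast(v, 0.1, 1.0))
```

With a brightness of .1 the inputs .95 and 1 both end up at 1, which is the white crushing mentioned above. The gamma control only bends the curve between the fixed end points.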

JanWillem32

6th July 2011, 07:32

I've added and fixed a few things, so I'm releasing a new version.

log:
added "3LCD panel software alignment, Catmull-Rom spline6 interpolated" shaders
added "r=6, sharpen complex, deband and denoise" and "r=6 blur" shaders
corrected "sharpen complex, deband and denoise" shaders for bad performance during texture sampling stages and a clipping artifacts problem on very sharp contours
added "flip and rotate sampling direction for RGB"

mindbomb

8th July 2011, 05:06

im getting artifacts when using chroma blur, usually they happen horizontally across the bottom of the screen.

im wondering if im using it right, do you just copy and paste the text into the shader editor?

im using a 4350, windows 7, mpc hc 1.5.2.3329 (sse2), cat 11.6

edit: i think my hardware was just crappin out on me, i overclocked the card and now it is fine.

JanWillem32

8th July 2011, 06:19

GPU-Z is usually a good tool to log statistics of the GPU for diagnostics. If GPU usage hits 100%, frames start to drop, and if temperatures are high, the chances of getting artifacts increase.

mindbomb

8th July 2011, 07:13

it seems to drop frames when it gets to around 85% actually. idk, maybe im bandwidth limited or something.

JanWillem32

8th July 2011, 07:39

It's probably the memory bus and memory speed. Many low-end cards have enough memory, but lack the ability to really fill it. A HD4350 only has a 64-bit memory bus and usually DDR2 memory.

alph@

8th July 2011, 18:37

I have tried your version of MPC (1.5.2.3329). Why is the image so sharp with the internal MPC H.264 decoder (ffmpeg)?
http://www.zimagez.com/miniature/jahn.png (http://www.zimagez.com/zimage/jahn.php)

MPC-HC, JanWillem32 build, internal H.264 decoder (ffmpeg).

http://www.zimagez.com/miniature/pot57.png (http://www.zimagez.com/zimage/pot57.php)

potplayer-diavc-ffdshow(just resize)

JanWillem32

8th July 2011, 19:36

Well, that looks like EVR CP with default settings in the ATi CCC video tab: http://forum.doom9.org/showthread.php?p=1512344#post1512344 (what a coincidence).
First, let's try to set those filters to something you like. Next, you'll probably want to enable the "Touch Window From Inside" and "Keep Aspect Ratio" settings under "View", "Video Frame" to correct scaling a bit.
That should clean things up quite a bit.
After that, you can try out the new scalers under the "View", "Options", "Output" tab. (There are only a few options right now, but I'll add more later on.) For pixel shaders, don't forget the chroma up-sampling shaders. On the ATi platform there's no chroma up-sampling without them if you enable 10-bit RGB output or full or half floating point processing.
Depending on what filters you enable, you can make the picture in MPC-HC look like the bottom one, or whatever you like. To stay a bit impartial, all regular pixel shaders in the pack should work fine in PotPlayer, too.

alph@

8th July 2011, 20:23

Yes, you are right, the edge enhancement is active (45). This is the first time I've seen it work :). I had already disabled the other filters (noise, deblocking ...). As for the ratio and the output level, do not worry, everything is OK for display on my Sony 46 HX 700 LCD. I do not use shaders with PotPlayer because I use madVR. I use your shaders with tokplayer. It's very convenient to use: a dialog box lists the shaders, the change is instantaneous, and it can be done while the player is paused.
http://www.zimagez.com/miniature/tok0.png (http://www.zimagez.com/zimage/tok0.php)
shader selection in tokplayer

Are the chroma up-sampling shaders necessary? I do not see much difference.
thanks.

JanWillem32

8th July 2011, 20:56

The chroma up-sampling shaders only work on point sampled chroma in the renderer. It's something that the shaders I wrote can take advantage of when a different color format than X8R8G8B8 is used on ATi hardware. (10-bit RGB output or full or half floating point processing modes change surfaces to A2R10G10B10, A32B32G32R32F and A16B16G16R16F respectively.) The default chroma up-sampling by the display drivers is only a bilinear kernel, so I've been trying to disable it universally and allow the user to also select other filters instead (with little success yet, unfortunately).

G_M_C

8th July 2011, 21:17

probably offtopic;

ITC processing is a feature that enables display processors to use the appropriate pixel data processing algorithms based on specific content type to ensure video quality.
With ITC processing, the graphics driver enables the display to use its own video quality processing algorithms for movies played in full-screen mode on HDMI™ displays.
[...]
Select—Enables ITC processing for HDMI displays that are capable of the feature. When movies are played in full-screen mode, the display’s processors can be used to ensure video quality.
Clear—Disables ITC processing. Video quality is ensured by the graphics driver for all types of contents displayed.

I wonder, when reading this: Does enabling ITC processing on ATi disable the interference of CCC? And does it force the GPU to just output data as-is, leaving processing up to an external processor?

JanWillem32

8th July 2011, 22:03

Chroma up-sampling is a required feature of any video player. There's no possibility of setting something else than full range 4:4:4 RGB on the back buffer. On top of that, transmission through the video ports limits the output accuracy (that's why the output should be dithered). The signal is only good enough to go straight to the display panel(s) without any further digital processing at all. (And even then, I really hate being limited to 8-bit and 10-bit outputs. Projection systems with 12-bit panels and better have been around for a bit more than a decade now.)
If the external processor has access to the source file and handles decoding, rendering, color management, dithering and the execution stages to the display's analog controls with equal or better DCI compliant studio formats, I'd be willing to use it. Else, it's just another obstruction that the software rendering stages can do much better.
http://forum.doom9.org/showthread.php?p=1507282#post1507282
It's already difficult for software developers to get along with the internal hardware of PCs (drivers, interaction with other software, CPU/GPU performance and so on). External hardware is at a whole other level. I believe that the CRT TV my parents bought in 1994 behaved a lot better than most of the current TVs in the same price class today. Although it was limited to D-SUB and lesser analog connections, it behaved quite similarly to a monitor after geometry adjustment.

TheElix

8th July 2011, 22:14

Projection systems with 12-bit panels and better have been around for a bit more than a decade now.)
12-bit output on projectors?! For real?