View Full Version : Video pixel shader pack
JanWillem32
16th November 2010, 11:48
I've looked up "÷", it is even part of the old ASCII extended character table (even non-Unicode filesystems can use those).
The reserved characters for both Unicode and non-Unicode filesystems in Windows are: < (less than), > (greater than), : (colon), " (double quote), / (forward slash), \ (backslash), | (vertical bar or pipe), ? (question mark) and * (asterisk).
Even very old filesystems support the ASCII extended character table for file names.
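A quick way to rule out the file name itself is to check it against the reserved characters listed above. This is only a minimal sketch: the function name is made up for illustration, and it checks a single name component, not a full path.

```python
# The nine characters Windows reserves in file names, as listed above.
RESERVED = set('<>:"/\\|?*')

def reserved_chars_in(name):
    """Return the reserved characters found in one file name, in order of first appearance."""
    found = []
    for ch in name:
        if ch in RESERVED and ch not in found:
            found.append(ch)
    return found

print(reserved_chars_in("sharpen ÷ 2.txt"))  # "÷" is not in the reserved set
print(reserved_chars_in("what?.txt"))
```

If the first call reports nothing, the "÷" in the name is not what stops the unpacking.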
What happens if you unpack in a simple folder such as c:\temp\ , or in another folder with a very short file name structure?
It could also be WinRAR: does it support reading LZMA2-compressed data? It's quite a new method for storing data inside 7-zip files.
TheElix
16th November 2010, 13:05
Heh, you were right. A newer version of WinRar did the job for me. I'm sorry.
JanWillem32
18th November 2010, 04:47
I've made some good progress on "test 5, sharpen complex v3 + deband". It rarely causes banding on gradients anymore, and the standard sharpen values are much more usable than those in the test 3 version. The other pixel shaders only needed a little bit of comment cleanup.
Only "sharpen chroma for SD&HD video input" could use a stronger version to accompany the current one. It's rather hard to decide how much stronger that version should be, as it depends a lot on the quality of up-sampling by the renderer, and the DXVA or software decoder. I do not think anyone would use this shader on actual 4:4:4 images, but 4:2:0 and 4:2:2 images can be up-sampled by many different methods.
This complete batch can be considered final, if no errors are found in the code.
The hybrid is not the most practical shader to implement. It's much easier to understand how to use the two separate shaders. The reason I made this shader is to illustrate that the stacking method is wrong for heavy shaders. There isn't an option to use a complete frame in-between shaders, unless you use one shader as a non-screenspace shader, and one as a screenspace shader. That's why the hybrid is lighter on resources than the two separate shaders stacked (see the test in the opening post).
clsid
18th November 2010, 18:18
A bit off-topic, but wouldn't it also be a good idea to change how the shaders are stored?
I would suggest storing them as separate files instead of as strings in Registry/INI. That would have the following advantages:
- Makes it easier to add/remove shaders. Simply add or remove a file.
- Shaders can be modified with any text editor.
- The internal editor could be removed.
- No need to embed any default shaders in the executable.
- No bugs with storing large shaders in INI anymore.
Files could be placed in a subdir called "Shaders" or "PixelShaders" and files given the extension ".shader".
The minimum required PixelShader version could for example be placed on the first line of the shader file.
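To make the first-line idea concrete, here is a rough sketch of how a loader could read it. Everything about the format is an assumption for illustration: the "// ps_2_0" comment style, the function names and the ordering list are hypothetical, and nothing like this exists in MPC-HC as far as this thread shows.

```python
import re

# Profiles named in this thread, ordered from least to most capable.
PROFILE_ORDER = ["ps_2_0", "ps_2_a", "ps_3_0"]

# Hypothetical convention: the first line is a comment naming the minimum profile,
# for example "// ps_2_a".
PROFILE_RE = re.compile(r"^\s*//\s*(" + "|".join(PROFILE_ORDER) + r")\s*$")

def minimum_profile(shader_text):
    """Return the profile named on the first line, or None if no profile is stated."""
    lines = shader_text.splitlines()
    first_line = lines[0] if lines else ""
    match = PROFILE_RE.match(first_line)
    return match.group(1) if match else None

def can_run(shader_text, device_profile):
    """True if the device profile is at least the shader's stated minimum."""
    needed = minimum_profile(shader_text)
    if needed is None:
        return True  # nothing stated: assume the lowest profile is enough
    return PROFILE_ORDER.index(device_profile) >= PROFILE_ORDER.index(needed)

sample = "// ps_2_a\nfloat4 main(float2 tex : TEXCOORD0) : COLOR { return 0; }"
print(minimum_profile(sample), can_run(sample, "ps_2_0"), can_run(sample, "ps_3_0"))
```

The player could then grey out shaders that the installed video card cannot compile.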
@JanWillem32
Are you also a C++ programmer? If so, would you be able to implement the above, and perhaps also the automatic shader version functionality that was previously discussed?
burfadel
18th November 2010, 20:27
I would also add better portability between computers. If you copy MPC-HC to another computer, the above suggestion would save setting the shaders up again for each installation.
The internal shaders should still be just that, stored in the programme, with only new or altered shaders needing to be placed using the method clsid suggested. In the menu system, the additional shaders can show as a submenu to the current shader list.
JanWillem32
19th November 2010, 07:02
This sounds interesting. Let me add a few more things that might be improved:
- The ability to switch pixel shaders in the standard menu, not only the right-click menu.
- The ability to switch pixel shaders without having to start a video. (I use the D3D fullscreen mode.)
- Automatically setting the highest available pixel shader version for the resizer.
- Allowing a pixel shader to use the output from the resizer, but without the binding to the screen refresh rate like the screen space pixel shaders do. (My CRT can do 160 Hz for a smoother output, but no normal pixel shader can use that many frames per second.)
I've taken a look at the source code, but I could not find anything but the C++ and pixel shader codes for the resizers. I will definitely need someone to guide me through it, else I'll be searching for a very long time to find all the right items.
I focused from the beginning on the Y′CbCr-type shaders and "sharpen complex v3 + deband", because these needed the most work. The other default shaders could also lose quite a few instructions and have some comments inserted. Shall I work on those too?
I am also looking for new, usable shaders and maybe some pixel shader resizers that can do a bit more than the bicubic resizer.
If the pixel shaders are to be loaded from a separate folder, I think it would be best if they are stored as .txt files, so that users can simply open them and read the comments in notepad.
CiNcH
19th November 2010, 19:51
I am also looking for new, usable shaders and maybe some pixel shader resizers that can do a bit more than the bicubic resizer.
Cool. Is changing the texture size possible without renderer interaction?
Are sharpen complex v3 and deband also available as standalone shaders?
PetitDragon
19th November 2010, 22:17
Yea, standalone versions of Sharpen Complex v3 and Deband will be very helpful.:)
JanWillem32
20th November 2010, 12:35
Changing the output texture size can only be handled by a pixel shader if it's embedded into the main program code, but that's not much of a problem.
"Sharpen complex v3 + deband" can be altered to change or disable functions. Disabling functions doesn't have much of an impact on the processing requirements, because the detection method that is used for both requires the most processing.
I added a few comments to the test 6 version to make editing easier. I also changed the sharpen function to use smoothstep (Hermite spline function), instead of linear sharpen adaptation.
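For anyone unfamiliar with smoothstep, here is a small sketch of the difference with a linear ramp. It uses the standard Hermite formula; the edge values are arbitrary and this is not the code from the shader itself.

```python
def linear_step(edge0, edge1, x):
    """Linear ramp from 0 at edge0 to 1 at edge1, clamped outside that interval."""
    t = (x - edge0) / (edge1 - edge0)
    return min(max(t, 0.0), 1.0)

def smoothstep(edge0, edge1, x):
    """Hermite ramp: same end points as linear_step, but with zero slope at both edges."""
    t = linear_step(edge0, edge1, x)
    return t * t * (3.0 - 2.0 * t)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, linear_step(0.0, 1.0, x), smoothstep(0.0, 1.0, x))
```

Near the edges the Hermite curve changes more slowly than the linear ramp, so the sharpening amount fades in and out gradually instead of switching on at a hard corner.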
I cleaned up more comments to avoid confusion with the analog YUV&Y′PbPr and the digital Y′CbCr.
CiNcH
20th November 2010, 14:35
Just for my understanding... Do the ATi PostProcessing algorithms that can be enabled within CCC work on the YUV image? Is color space conversion done afterwards and is it triggered by the renderer (as the connection format between the decoder and the renderer is a YUV format)? Custom MPC shaders work on the converted RGB image, right?
JanWillem32
20th November 2010, 15:10
MPC-HC shaders work on RGBA with a range of <0,1> (or rather <0,1], because of the conversion with integers) for both output and input. There is no access to any Y′CbCr data, that's why it's easy to fail at the conversion to Y′CbCr in a pixel shader. Pixel shaders that work on Y′CbCr data directly, do exist, by the way.
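As an illustration of the conversion a shader has to do itself, here is a minimal sketch of a full-range R'G'B' to Y′CbCr round trip. The BT.709 coefficients are an assumption for HD material (BT.601 would use 0.299 and 0.114), and the function names are only for this example.

```python
# BT.709 luma coefficients (assumed; BT.601 uses KR = 0.299 and KB = 0.114).
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB

def rgb_to_ycbcr(r, g, b):
    """R'G'B' in <0,1> to Y' in <0,1> and Cb, Cr in <-.5,.5>."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, using the same coefficients."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b

print(rgb_to_ycbcr(1.0, 0.0, 0.0))                 # pure red
print(ycbcr_to_rgb(*rgb_to_ycbcr(0.2, 0.5, 0.8)))  # round trip
```

Using the wrong coefficient set for the material (BT.601 on HD or the reverse) is the usual way such a conversion goes wrong.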
Driver-based decoders (DXVA), driver-based renderers, software-based decoders and software-based renderers can all access Y′CbCr and RGB data, depending on what stage they are processing in. It is even possible for some filters to work in the temporal space, to make frame-by-frame comparisons.
CiNcH
20th November 2010, 15:25
Driver-based decoders (DXVA), driver-based renderers, software-based decoders and software-based renderers can all access Y′CbCr and RGB data, depending on what stage they are processing in.
I am just trying to understand which stage is introducing the chroma upsampling error with ATi. Standard renderers don't have this problem with ATi DXVA, so it seems to be the custom presenters that are at fault. On the other hand, nVIDIA DXVA does not suffer from this problem either when using the custom presenters.
JanWillem32
20th November 2010, 19:05
The ATi driver can deliver various formats to the renderer, if the renderer supports it. On my computer (with Catalyst 10.5a), there isn't a problem with chroma up-sampling when transferring video in a Y′CbCr 4:2:0 or 4:2:2 format to EVR CP, as the receiving renderer does the up-sampling. However, when RGB is transferred, the driver doesn't up-sample chroma as it should during the Y′CbCr to RGB conversion. On top of that, a lot of the processing/filtering options in CCC don't work if the output is RGB. There's no direct way for the renderer to "know" either that the input RGB data is faulty.
The advantage of the RGB mode is that it can do 10-bit integer per component, while all other formats are stuck with 8-bit integer (or less, when in a planar format like NV12). Not all of my videos are stored 8-bit, Y′CbCr, and 4:2:0 sub-sampled. Transferring them to the renderer as such would compromise quality.
To take it even further, the complete software chain can be opened to work with fp16 and fp32 (16-bit and 32-bit floating point) in DirectX 9 and 10 modes. DirectX 11 can finally handle fp64 too. DirectX 9 can output up to 10-bit integer RGB to the video card's output frame, while DirectX 10 and 11 can use the output from the processing format without changing it. The video card then uses the color LUT to convert the colors of the output frame and (preferably) inputs that at the highest acceptable bit depth format for the digital connection, or passes it through a DAC to output analog signals. Such processing chains are completely normal for full-screen DirectX games with HDR, but for a video player with that many dependencies on external hardware/software decoders and renderers, I can imagine it would be very hard to create such a chain. I do believe better quality modes can be achieved in some time, as the standards and capable processors are already available.
CiNcH
21st November 2010, 01:15
there isn't a problem with chroma up-sampling when transferring video in a Y′CbCr 4:2:0 or 4:2:2 format to EVR CP
So you are talking about the connection format between decoder and renderer (as the presenter component itself always works on RGB)? YV12 and YUY2 for example? The colorspace conversion is done by the mixer component, right? Guess that GPU colorspace conversion and the other Catalyst post processing algorithms only kick in when using NV12 colorspace (through DXVA2).
However, when RGB is transferred, the driver doesn't up-sample chroma as it should during the Y′CbCr to RGB conversion.
So this is the case when using DXVA and NV12 colorspace, right? But both, EVR Standard and EVR Custom, trigger DXVA with NV12 colorspace. Only EVR Custom suffers from the bad chroma upsampling however.
de66ka
21st November 2010, 09:30
Hello JanWillem32,
I'm using your "brightness, contrast,....control" script adjusting the output of my renderer. What I'm missing is a feature to adjust the "gamma".
Is there the possibility to implement such a control in this script?
Thanks in advance
de66ka
JanWillem32
21st November 2010, 13:37
@CiNcH I only have bad chroma up-sampling when I force 10-bit RGB input. The "Display Stats" screen then also shows RGB32 input. (The internal format is R10G10B10X2.) The ATi driver doesn't mind converting the processing format to 10-bit RGB, but fails at up-sampling while doing so. Do you have an example of instances when up-sampling fails with other input formats?
@de66ka That was easy. It is however a RGB color control function, unlike the other functions that work on Y′CbCr components.
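In its simplest form, an RGB gamma control is just a power function per component. The sketch below shows that idea only; it is not the code from the script, and the function name is made up.

```python
def apply_gamma(rgb, gamma):
    """Raise each R'G'B' component (clamped to <0,1>) to the power 1/gamma."""
    return tuple(min(max(c, 0.0), 1.0) ** (1.0 / gamma) for c in rgb)

# gamma > 1 lifts the midtones, gamma < 1 darkens them; 0 and 1 stay where they are.
print(apply_gamma((0.25, 0.0, 1.0), 2.0))
print(apply_gamma((0.25, 0.0, 1.0), 0.5))
```

Because it works on R, G and B separately, it can shift saturation a little, which is the difference with the controls that work on Y′CbCr components.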
CiNcH
21st November 2010, 13:58
I am not using 10-bit at all.
I did the following tests now:
Source: recording on ATV with red logo on black background (MPEG-2 576i) (chroma_test.ts (http://members.inode.at/762450/chroma_test.ts))
Filter chain: ffdshow -> EVR Custom (MPC-HC)
ffdshow YUY2 output: chroma not interpolated
ffdshow NV12 output: chroma not interpolated
ffdshow RGB32 output: chroma is properly upsampled within ffdshow
I verified this with the CyberLink decoder.
Filter chain: CyberLink Video/SP Decoder (PDVD10) -> EVR Custom (MPC-HC)
CyberLink DXVA NV12: chroma not interpolated
CyberLink SW YUY2: chroma not interpolated
I tried the same thing with Standard EVR.
Filter chain: ffdshow -> EVR
ffdshow YUY2 output: chroma is properly upsampled
ffdshow NV12 output: chroma is properly upsampled
JanWillem32
21st November 2010, 15:35
That is really odd. I can't reproduce the problem at all. DXVA testing for SD content is not possible with Catalyst 10.5a. All software decoders, including the internal one, are rendered okay. I only had to disable 10-bit RGB input mode for testing, because it is incompatible with the current software decoders.
What driver version are you using? I haven't used any other versions after 10.5a, because they fail at DXVA decoding with 10-bit RGB output. Maybe newer drivers have a few other glitches, as well.
CiNcH
21st November 2010, 15:43
This (http://forum.doom9.org/showthread.php?p=1456318#post1456318) is what it looks like when chroma upsampling fails.
I am using Catalyst 10.11 with a Radeon HD 3650. I always had this problem with the MPC-HC Custom Presenter.
JanWillem32
21st November 2010, 16:20
A HD 3650 is one of the UVD+ chips. That is a bit hard to compare to my card that supports UVD 2. There is a difference in the amount of work that the video decoder, mixer and renderer offload to the GPU. Now I'm wondering too if the hardware, driver or software is to blame in your case. In my case it is clear that the driver skips the up-sampling step when converting colorspaces. In your case, the output from the video decoder is in a planar format, but the mixer/renderer skips up-sampling. That has never happened to me.
Well, at least your GPU can easily handle the up-sampling scripts.
CiNcH
21st November 2010, 16:34
Question is how standard EVR handles it properly..
JanWillem32
21st November 2010, 19:01
On top of that, how differently does MPC-HC treat the three EVR types? I always thought that the only difference was the subtitle overlay and the different synchronization clock, but for example, the "Display Stats" screens are different too.
I've made some progress on editing the standard shaders that were not in my releases. I only found one problem. The denoise shader is doubly defective. It works by sampling 128 pixels in an eight-pointed star-shape, instead of a normal circle. It does nothing more than adding up the values, so it only blurs and brightens. It does not compare any color, brightness or anything that would make it a real denoise shader. Does anyone know some shaders that can do a real denoise? Preferably one that has the same quality as the versions in the nVidia and ATi control panel.
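To show what "comparing" would add, here is a one-dimensional sketch that contrasts a plain average with a thresholded average that only includes neighbours close in value to the centre pixel. It is only an illustration of the principle, with made-up names and values, not a proposal for the actual shader.

```python
def box_blur(row, i, radius):
    """Plain average of the window around index i: blurs noise and edges alike."""
    window = row[max(0, i - radius):min(len(row), i + radius + 1)]
    return sum(window) / len(window)

def threshold_average(row, i, radius, threshold):
    """Average only the neighbours whose value is within threshold of the centre."""
    centre = row[i]
    window = row[max(0, i - radius):min(len(row), i + radius + 1)]
    similar = [v for v in window if abs(v - centre) <= threshold]
    return sum(similar) / len(similar)  # never empty: the centre always qualifies

# A dark, slightly noisy area followed by a bright one (an edge after index 2).
row = [0.10, 0.12, 0.11, 0.90, 0.91, 0.89]
print(box_blur(row, 2, 2))                 # pulled towards the bright side
print(threshold_average(row, 2, 2, 0.05))  # stays with the dark side
```

The plain average drags the pixel next to the edge towards the bright side, while the thresholded version smooths only the noise within the dark area.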
de66ka
22nd November 2010, 06:55
@JanWillem32
Thank you for implementing the gamma switches in your script. Works like a charm. So I'm able to correct the output without changing the driver settings. For my taste most of the HD videos are way too dark.
Thank you, de66ka
JanWillem32
23rd November 2010, 13:11
I agree that many Blu-ray movies are mastered quite a bit darker than what I've seen in theaters. A few seem a bit brighter as well (Alice In Wonderland and some anime productions for example).
Outside of the ugly teal-and-orange, loudness war and dirty tape problems, that are present in the studio master, Blu-ray mastering seems to be difficult, too. Setting the normal gamma from 2.6 in the studio master for Digital Cinema to the two gamma functions in BT.709-5 for Blu-ray doesn't always turn out that well.
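A simplified sketch of that gamma step, for one normalized channel value: decode the 2.6 power curve to linear light, then encode with the BT.709 transfer function. Reading "the two gamma functions" as the linear and the power segment of BT.709 is my assumption, and the sketch leaves out the color primaries, the white point and the Digital Cinema peak-luminance scaling.

```python
def dci_decode(v):
    """Digital Cinema style pure power curve (gamma 2.6) to linear light."""
    return v ** 2.6

def bt709_oetf(l):
    """BT.709 transfer function: linear segment near black, power segment above it."""
    if l < 0.018:
        return 4.5 * l
    return 1.099 * l ** 0.45 - 0.099

def dci_to_bt709(v):
    """Re-encode one normalized channel value from gamma 2.6 to BT.709."""
    return bt709_oetf(dci_decode(v))

for v in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(v, round(dci_to_bt709(v), 4))
```

The two curves differ most near black, which is where small mastering choices have the largest visible effect.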
Many consumer-grade LCD panels have very bad dark-to-black accuracy, that's a common problem. I've even seen a TV totally crush dark scenes to black by dimming the back lights. My old CRT may not be the brightest anymore (at 89.8 cd/m² maximum) and the D-sub analog connection is leaky (I can easily see left-to-right bleeding artifacts), but it still outperforms my very expensive projector when it comes to displaying a very dark scene (not having a back light has its advantages). The general image that my projector makes is less "hard", it's easier to watch for a longer period, unless it projects a very, very bright picture that makes my eyes hurt because of the total amount of light.
Considering the great difference in consumer-grade viewing systems/screens, it's hard to choose the perfect color layout for a Blu-ray. I just hope that people mastering Blu-rays will choose to feed the raw Digital Cinema master (with the standard mathematical conversion to BT.709-5) with a lot higher and more consistent quality than they have done before.
I need more inspiration to make "denoise", but all other shaders are finished. Maybe I will make an intermediate version "sharpen complex v2 + deband" on PS 2.0a minimum to fill the gap between v1 and v3.
Phaser
25th November 2010, 12:04
JanWillem32, is it possible to make a separate Deband shader v3 (without sharpen complex)?
JanWillem32
25th November 2010, 12:44
I'm looking if I can make the version of "test 4, sharpen complex v3 + deband" that will only compile on DirectX 10/11, more DirectX 9 friendly, as that one can also do denoise, but the instruction limit is a big problem (it's actually faster, even with more instructions). I made new versions of almost all shaders to make use of some speed-up tricks, but I really want to make "sharpen complex v2 + deband", a revision of "sharpen complex v3 + deband" and at least something that can denoise for a next release.
At the moment "sharpen complex v3 + deband" can become deband-only if you define "SharpenFull" and "SharpenPartial" 0.
toniash
25th November 2010, 13:23
@JanWillem32
What values would you recommend for SD?
Thanks for your work
PetitDragon
25th November 2010, 14:42
but I really want to make "sharpen complex v2 + deband", a revision of "sharpen complex v3 + deband"
can't wait to see that!:)
CiNcH
25th November 2010, 16:48
I'm looking if I can make the version of "test 4, sharpen complex v3 + deband" that will only compile on DirectX 10/11, more DirectX 9 friendly,...
ps_3_0 is DirectX 9c, isn't it?
G_M_C
25th November 2010, 19:04
I'm following this thread with great interest. I've tested hybrid v3 with 720p/1080p film, and it works fine on my HD5770. The GPU gets warm, sure, but it easily keeps up with the framerate. I found the hybrid/debanding shader very useful on encodes of live-concert performances (like Muse's Seaside rendezvous), where the shader seemed to help reduce the banding often seen in light beams created by the spotlights.
If you make another hybrid, say v4, would it not also be a good idea to make the hybrid components available separately?
(Great work, Jan Willem, very useful!)
JanWillem32
26th November 2010, 01:08
@toniash For which of the 31 scripts do you need values? I did a lot of guesswork on most shaders to get them at least functional.
@PetitDragon v1 and v2 will be made by weakening the v3 version, so I'm going to finish that one first. I'm planning to bind v3 to the limits of PS 3.0, v2 to the limits of PS 2.0a and v1 to the limits of PS 2.0.
@CiNcH That's right. http://en.wikipedia.org/wiki/Pixel_shader
@G_M_C Spotlights can illuminate a lot of dust in the air. Many automatic sharpening filters in cameras and after-effect filters will over-sharpen because of that. It's good to know that debanding can work to soften things up. My newer shaders can also be modified to compensate for noise, that may help as well. A HD5770 can easily execute hybrid test 6 or 7 on its idle clocks, maybe it will even heat up a bit less with those. Hybrid versions are just for solving a performance problem, by the way. Once the "combine shaders" method is improved to the point that it doesn't take a huge performance hit anymore, the hybrids are useless (see my test in the opening post). (Also, I was glad to do it, although it took considerably more time than I had anticipated.)
The "test 7, sharpen complex v3 + deband + denoise" shader was made by altering the DirectX 10/11 version, it is a lot different from the test 6 version (also included). Adjusting values was a real disaster, and it still needs some work to get decent parameters for the outer radial layers. It can at least do a decent job on dithering, grainy and mosquito noise. Sharpening detection is somewhat limited (a circle radius of 2.5 pixels), but it performs quite well. On top of that it is a lot lighter in general, unless it denoises and debands a completely blurry or plain frame.
Other shaders have not been improved a lot, there's only a minor speedup for some. I lowered the default sharpening values for most of the sharpening shaders. That's because the original shaders horribly over-sharpen and I don't want to adjust the output of my versions to match that anymore. Altering the sharpening values to your preferences should be easy enough if you read the comments.
CiNcH
26th November 2010, 13:10
Are shaders > ps_3_0 even compiled? I thought that the custom presenter within MPC was DirectX9.
JanWillem32
29th November 2010, 20:06
I use a separate program that can use a decent font size, automation, optimize functions and syntax highlighting for writing scripts. For testing purposes, I often compile something for DirectX 10. If I like what I see, I can then try to simplify the code to suit the limits of the lower pixel shader versions.
The pixel shaders in MPC-HC are all DirectX9, with either a PS 2.0, 2.0a or 3.0 profile.
I made a test 8 release. Only the sharpening shaders have been changed from the test 7 versions. I'm very happy with the performance of "test 8, sharpen complex v3 + deband + denoise". Although the sharpening is a bit mild, it does apply a very good "clean-up" effect, even when zooming in on an image. On top of that, it performs well on very high contrasts, so even with higher sharpening values, it doesn't tend to over-sharpen.
I'm looking forward to making a final release soon, that includes v1 and v2 versions of "sharpen complex v3 + deband + denoise" and updated functions for the regular "edge sharpen" and "sharpen" functions.
To reply to my own opening post; I believe all of my shaders, except for the hybrid, are good enough to replace the complete set of shaders included currently in MPC-HC. I'm looking forward to making a request soon to include them in a future MPC-HC build. If any other program can use my shaders, whether it's for video, still images or 3-D rendering, I will be happy to submit my shaders for those programs too. Please inform me which programs could make good use of them.
burfadel
30th November 2010, 03:54
In the hybrid script, you repeated this line in the comments at the top:
// For video processing, an above average GPU is required, often even on full GPU and memory clock speeds, this is written in 2010.
It seems to work well though!
JanWillem32
30th November 2010, 04:43
You are right, so I cleaned up a few comment lines.
TheElix
30th November 2010, 18:28
Shaders Disabled
http://rghost.ru/3435311/thumb.png (http://rghost.ru/3435311.view)
Hybrid test 8 enabled
http://rghost.ru/3435325/thumb.png (http://rghost.ru/3435325.view)
Ugh... In photography it's called excessive digital noise reduction. Not a good thing.
JanWillem32
30th November 2010, 18:53
I wish that I could provide automatic noise adaptation. Unfortunately, that will be quite hard. I can however see if the sharpening to contrast level adaptation can be improved, without damaging the dynamic range.
This scene can still be properly modified if you lower the "NoiseLevel" factor.
The default:
// NoiseLevel; <.5,4>, detection noise factor, .5 is for very plain, lossless, synthetic images, 1 is normal for digital lossy video and images, higher values will help counter grainy noise on surfaces and gradients, at the cost of the detail level
#define NoiseLevel 1
TheElix
30th November 2010, 19:32
I understand that at brighter scenes and in other source videos your shader will work differently. However, if we take sharpen complex v.2 for example it will show consistency at various conditions. Anyhow, keep up the good work!
JanWillem32
30th November 2010, 23:18
The original sharpen complex 2 is not consistent either. When it processes images, it compares the calculated edge detection value to the edge sharpening limit. That edge detection value can never reach the edge sharpening limit on low brightness images, because it doesn't correct for the brightness.
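A small numeric sketch of why a fixed limit fails on dark images: an absolute difference between two pixels can never exceed the brighter of the two, so in a dark scene it stays below the limit no matter how sharp the edge is. Dividing by the local brightness is one possible correction, shown here as an assumption and not as what either shader actually does; the limit of 0.2 is an arbitrary example value.

```python
def edge_absolute(a, b):
    """Edge strength as a plain difference; it scales down with brightness."""
    return abs(a - b)

def edge_relative(a, b, floor=1.0 / 255.0):
    """Edge strength relative to the local mean brightness (hypothetical correction)."""
    mean = 0.5 * (a + b)
    return abs(a - b) / max(mean, floor)

LIMIT = 0.2  # arbitrary example threshold

bright_pair = (0.40, 0.80)  # one pixel twice as bright as its neighbour
dark_pair = (0.05, 0.10)    # the same ratio, in a dark scene

for name, (a, b) in (("bright", bright_pair), ("dark", dark_pair)):
    print(name, edge_absolute(a, b) >= LIMIT, edge_relative(a, b) >= LIMIT)
```

Both pairs have the same contrast ratio, but only the bright pair passes the limit with the absolute measure.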
To solve that in the test 9 version of my shader, I added a bit of brightness detection (full detection is much too taxing on registers). It does cost a bit of debanding quality and it is just under the maximum complexity limit for PS 3.0, so I can't really add more functions. I do like how it handles a certain scene with a night sky. The brightness and contrast adaptation is more balanced when compared with the test 8 version.
Edit:
I managed to make a small improvement in the distribution of the sampling area, brightness calculation and debanding quality for the test 10 version.
TheElix
2nd December 2010, 22:53
Sorry for the delay. Just downloaded and tested your v10 Hybrid shader. Here we go:
Shaders Disabled
http://rghost.ru/3462453/thumb.png (http://rghost.ru/3462453.view)
Hybrid v10 Shader
http://rghost.ru/3462463/thumb.png (http://rghost.ru/3462463.view)
Sharpen Complex v2 Shader
http://rghost.ru/3462491/thumb.png (http://rghost.ru/3462491.view)
Well... Much better than v8! It sharpens the image in quite a different way than Sharpen Complex v2. Although the digital noise grain becomes more distinctive too. Let's look at another picture:
Shaders Disabled
http://rghost.ru/3462510/thumb.png (http://rghost.ru/3462510.view)
Hybrid v10 Shader
http://rghost.ru/3462512/thumb.png (http://rghost.ru/3462512.view)
Here I can't say the shader is beneficial to the picture. Especially in the dark areas of space. But maybe I got the wrong idea as to what this shader does.
nurbs
2nd December 2010, 22:58
What's the problem with the second set of pictures? I'm looking at them on a CRT here and the dark areas of space look fine.
TheElix
3rd December 2010, 00:36
The transitions between widely differing colors are sharper and thus more evident.
JanWillem32
3rd December 2010, 11:04
The first example has a lot of noise, of different types. It will be hard to remove most of it, while keeping the details. I think that it could use a "NoiseLevel" setting of 1.5 or 1.75 to clean things up a bit.
The second example is synthetic. I use a "NoiseLevel" setting of .75 on those because the shader doesn't need to clean up anything else but quantization and lossy compression noise. If you think it over-sharpens too, you can lower the sharpening amounts.
CiNcH
3rd December 2010, 18:14
YUV <16,235> to any RGB conversion is a bit lossy, so if the RGB range is only <16,235>, floating points get truncated more than with a range of <0,255>.
What is the better trade-off? The loss of precision due to truncation or the banding that is introduced when expanding levels?
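To put a number on the banding side of that trade-off, here is a small sketch for 8-bit integer output. It uses the usual linear stretch with rounding; a real renderer may round or dither differently.

```python
def expand(code):
    """Stretch an 8-bit limited-range code (16..235) to full range (0..255)."""
    return max(0, min(255, round((code - 16) * 255 / 219)))

# Every legal limited-range code, expanded.
expanded = sorted({expand(c) for c in range(16, 236)})
unused = 256 - len(expanded)

print(len(expanded), "output codes used,", unused, "never produced")
```

The 220 input codes are spread over 256 output codes, so some output codes are never produced. Without dithering or a wider intermediate format, those gaps are what shows up as extra banding.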
JanWillem32
4th December 2010, 01:36
First of all, all shaders in MPC-HC output "R32G32B32A32_FLOAT" (128 bits in total), even if the "Full Floating Point Processing" option is turned off. I advise to use that option, as it allows less degradation from the different rendering stages, even if input and output are limited to 8- or 10-bit integer formats.
The video file to display path in Windows requires Y′CbCr data to be converted to RGB, even if data is converted to Y′CbCr for DisplayPort, HDMI or dual link HD-SDI transport. This website specifies that all DirectX color formats are full range RGB:
http://msdn.microsoft.com/en-us/library/ff471325%28VS.85%29.aspx
For BT.601 [SD] and BT.709 [HD], the obligated Y′CbCr conversion to Wide Gamut RGB color space or sRGB color space changes the gamma functions, chromaticities of the primary colors and the white point. This means Windows, DirectX, drivers and the video card will process these in a wrong way if the input from the video renderer is limited range RGB.
That's why I've been busy making shaders that can correct these functions. I do want to remind everyone that conversion to Y′CbCr with <16,235> or <64,940> luma ranges, is really a thing for the video card and drivers to handle. The best I can do, is provide shaders that can compensate functions for the <16,235> and <64,940> luma ranges inside of full range RGB. However, it is far from correct and all further changes to the picture after those shaders will further lower the color quality. (I added warnings in the comments of the new shaders.)
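For reference, the luma part of such a compensation comes down to a scale and an offset. The sketch below works in integer code values to avoid any argument about the normalization factor; it covers only the luma ranges mentioned above, and the function names are made up for this example.

```python
def compress_luma(v, bits=8):
    """Map full-range luma v in <0,1> to the limited-range code value for the given bit depth."""
    scale = 1 << (bits - 8)  # 1 for 8-bit, 4 for 10-bit
    return 16 * scale + v * 219 * scale

def expand_luma(code, bits=8):
    """Inverse mapping: limited-range code value back to <0,1>."""
    scale = 1 << (bits - 8)
    return (code - 16 * scale) / (219 * scale)

print(compress_luma(0.0), compress_luma(1.0))          # 8-bit: 16 and 235
print(compress_luma(0.0, 10), compress_luma(1.0, 10))  # 10-bit: 64 and 940
```

Doing this inside full range RGB only shifts the levels; it does not make the rest of the chain aware that the data is limited range.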
Once I'm finished with those shaders, I will post them, but it's really a lot of work to make and comment them.
I know that I didn't quite answer your question, but I think I made it clear that using a limited range RGB format is problematic, as it doesn't relate very well to the limited range Y′CbCr formats.
Edit:
I decided to reduce the amount of code, to reduce the size of the shaders (without compromising any quality), and I corrected some small errors.
To illustrate the RGB gamma differences between the common formats, I included basic gamma transformation shaders. (To complete them, the CIE-type color transformation matrix for both input and output has to be added as well, but that's more something for someone who's programming a video renderer or mixer.) These are not for regular use, so I marked the file names with a "~". I also included a text file with the standard Y'CbCr and RGB conversion codes for developers. The "detect even or odd coordinates" shader was used to verify some code of the chroma up-sampling shaders, it might also be useful for other developers.
Edit:
I resolved a few mistakes in a few shaders. I added basic sharpen complex "v2", "v1" versions and "v3" with 5 different noise level presets (in alphabetical order).
Edit:
Because I wanted to speed up some instructions on some shaders, I tried to use partial precision mode on a few. However, I never saw any performance improvement. I now know why.
I read this article, and tried the "PSPrecision" application:
http://ixbtlabs.com/articles2/ps-precision/
The application stated:
Device: ATI Radeon HD 4800 Series
Driver: aticfx32.dll
Driver version: 8.17.10.24
Registers precision:
Rxx = s23e8 (temporary registers)
Cxx = s23e8 (constant registers)
Txx = s23e8 (texture coordinates)
Registers precision in partial precision mode:
Rxx = s23e8 (temporary registers)
Cxx = s23e8 (constant registers)
Txx = s23e8 (texture coordinates)
32-bit precision for floating-point numbers is not bad at all (unless you need doubles or larger for iterative functions or scientific calculations), but it seems partial precision mode for pixel shaders is simply ignored by my video card. As the examples state in the article, it could have been a lot worse. The only thing that makes shaders run faster on my hardware is by compiling them with the highest available PS version.
CiNcH
6th December 2010, 23:00
For BT.601 [SD] and BT.709 [HD], the obligated Y′CbCr conversion to Wide Gamut RGB color space or sRGB color space changes the gamma functions, chromaticities of the primary colors and the white point. This means Windows, DirectX, drivers and the video card will process these in a wrong way if the input from the video renderer is limited range RGB.
So this means that the ATi CCC color pixel format 'RGB 4:4:4 Studio' is basically useless as some component within the video display chain (DirectX, Windows Color Management) will always distort the result in some way due to full range assumption?
How about good old Overlay? It bypasses all these "layers", doesn't it? Too bad it expands levels by itself.
JanWillem32
7th December 2010, 01:37
There are two ways of compressing the ranges, the cheap way is by lowering the RGB contrast to a direct range of <16,235> or <64,940>. That will work on displays that simply stretch RGB linearly up again to <0,255>, <0,1023> or more, depending on the display's processor calculation bit depth, followed by the step of direct output to the display (it's only impractical to first compress and then expand again if it's not necessary). However, many devices expect perfect Y′CbCr with BT.601, BT.709, or BT.1361 characteristics, even when RGB is received (but the first Y′CbCr to RGB conversion step is omitted). That will distort the color matrix and gamma. A common example: the basic red is much purer for Wide Gamut RGB, sRGB, BT.709 and BT.1361 than for BT.601. If a TV expects BT.601 but receives red from a Wide Gamut RGB color space or sRGB color space, it will crush it to the maximum, even if the format is limited range RGB.
I designed my shaders to work by converting to Y′CbCr, compressing the ranges from <0,1>, <-.5,.5> and <-.5,.5> to <1/8,235/256>, <-15/32,15/32> and <-15/32,15/32> (as the Y′CbCr standards dictate for signed floating points), and then outputting RGB again without gamma or color matrix correction. Those corrections only distorted the picture even more when I tested Y′CbCr limited range on my projector against full range RGB (display color profile disabled for this test).
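The Y′CbCr approach can be sketched roughly as follows. This is not the shader code itself, only a Python illustration under stated assumptions: BT.709 luma coefficients are used (the post does not say which matrix applies), the function names are hypothetical, and the default range limits are the fractions exactly as written in the post, passed as parameters so other limits can be substituted.

```python
# BT.709 luma coefficients (assumption: the post does not name the matrix used).
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb_to_ycbcr(r, g, b):
    """Full-range RGB in [0,1] to Y' in [0,1] and Cb, Cr in [-0.5,0.5]."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr; no gamma or extra matrix correction is applied."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def compress_ranges(r, g, b, y_min=1 / 8, y_max=235 / 256, c_max=15 / 32):
    """Compress Y' to [y_min, y_max] and Cb/Cr to [-c_max, c_max], then return RGB.

    The default limits are the fractions quoted in the post.
    """
    y, cb, cr = rgb_to_ycbcr(r, g, b)
    y = y_min + y * (y_max - y_min)
    cb *= c_max / 0.5
    cr *= c_max / 0.5
    return ycbcr_to_rgb(y, cb, cr)


if __name__ == "__main__":
    print(compress_ranges(0.0, 0.0, 0.0))  # black
    print(compress_ranges(1.0, 1.0, 1.0))  # white
```

Neutral grays carry no chroma, so only the luma scaling affects them: black lands on `y_min` and white on `y_max` in all three channels.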
I do not know what functions the drivers use to convert the input RGB data. I do know it requires full range RGB input to function in the first place.
Any video card has to receive a full range, full resolution RGB texture/frame before it will output, because there are no other formats it will take, even if the driver converts it to another format for the output port connection. The overlay renderer still has a color mixer inside, otherwise it couldn't output any decent information to the system.
There's no way you can use a pure Y′CbCr path on a PC; other hardware units will often convert Y′CbCr to RGB and vice versa too, to apply some filters. Banding is quite a big problem because the input formats are often not that great, and many processing formats are limiting as well. Neither my old CRT nor my new (and expensive) projector can do any debanding. Things started to improve when I used a calibration device on both, and used some software filtering to improve the picture. I replaced my stand-alone Blu-ray player because it couldn't come close to my PC in either audio or video processing.
By the way, those who use display calibration correctly don't have to worry too much about Wide Gamut RGB, sRGB, BT.601, BT.709, BT.1361 color spaces and gamma functions, as these are all converted to the display's own capabilities by RGB to RGB transfer functions. If banding is in the source, the best thing you can do is to keep the digital pipeline from source file to display panel as wide as possible, and maybe filter a bit in-between.
Deshi
8th December 2010, 11:50
Hi everyone,
I'm having trouble testing your PS.
When I compile them in MPC-HC I can't get over 125 lines of code, meaning that most of your PS won't work.
My VGA supports PS 4, but in MPC-HC I can only choose PS 3 or PS 3sw at most.
Is that the problem, or did I miss something somewhere in the process of creating the PS from the start?
Thanks for the help.
PetitDragon
8th December 2010, 13:45
Hi everyone,
I'm having trouble testing your PS.
When I compile them in MPC-HC I can't get over 125 lines of code........
Uncheck "store settings to .ini file" in your mpc-hc options sheet.
JanWillem32
8th December 2010, 13:56
That is a very common problem if you save settings to a .INI file. I know it's nice to have portability for your settings, but it can't save big shaders (yet). I personally stopped using the saving to .INI setting.
If you want to make a copy of your settings from the registry:
- run regedit
- expand the tree to "HKEY_CURRENT_USER\Software\Gabest\Media Player Classic"
- right-click the "Media Player Classic" key
- click the export option
It produces an executable .REG file that you can use as a back-up or to transfer settings to another PC. I use several smaller .REG files to switch between shader presets, too.
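The same export can probably be scripted with the `reg export` command that ships with Windows. The Python sketch below is only an illustration: the helper names and the output file name are hypothetical, and the key path is the one given in the steps above, written with the `HKCU` abbreviation.

```python
import subprocess
import sys

# Registry key from the steps above (HKCU is short for HKEY_CURRENT_USER).
MPC_KEY = r"HKCU\Software\Gabest\Media Player Classic"


def build_export_command(reg_file, key=MPC_KEY):
    """Return the argument list for `reg export`; /y overwrites an existing file."""
    return ["reg", "export", key, reg_file, "/y"]


def export_settings(reg_file):
    """Export the MPC settings key to reg_file (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("reg.exe is only available on Windows")
    subprocess.run(build_export_command(reg_file), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe anywhere.
    print(" ".join(build_export_command("mpc-backup.reg")))
```

Calling `export_settings("mpc-backup.reg")` on Windows should produce the same kind of .REG file as the regedit export.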
Maybe I should add a warning inside my shaders to avoid using a "sw" (software) mode, as it often doesn't work at all. Shader code is heavy to run on even a reasonably new CPU, as it's meant to run in parallel on many pixels at once. GPUs don't mind working on over a million parallel threads, but CPUs have far fewer cores to compute those.