Old 28th September 2013, 15:06   #281  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by madshi View Post
I've been working on a debanding algorithm for madVR. I've taken the core algorithm idea of flash3kyuu_deband (which is really extremely simple), but modified it a bit. Wanted to let you know what I did, so you can implement the same thing in flash3kyuu_deband, if you like. Basically there's one point where the algorithm decides whether to use the original pixel value or the average of the 4 reference pixels. I've improved this decision making which allowed me to increase the thresholds a little bit. This results in stronger debanding, with hopefully not much more detail loss.

Of course the additional checks will eat up quite a bit of performance (and might be hard to implement with SSE?), so I'm not sure if you want to do this, but that's your choice, of course. Some of the checks might be a bit redundant, I'm not sure. But I thought I'd rather add a few more checks to make sure the higher thresholds don't come with too many negative side effects...

Basically my decision making looks like this:

Code:
// orgPixel = original pixel value
// refPixel = one of the 4 reference pixels selected by the algorithm
// surPixel = one of the 8 pixels directly surrounding the original pixel
// refPixelsAvg = simple mean average of the 4 refPixels
// localContrast = max dif between the 8 surPixels and the orgPixel
// surroundContrast = max dif between the 4 * 9 surPixels left, top, right and bottom of the "localContrast" 9 pixel block
// maxRefPixelsDif = max dif between the 4 refPixels and the orgPixel
// refPixelsDifSum = sum of the absolute dif between each refPixel and the orgPixel

float3 result = ( (abs(refPixelsAvg - orgPixel) > 2.0 / 255.0) ||
                  (localContrast                > 2.5 / 255.0) ||
                  (surroundContrast             > 3.5 / 255.0) ||
                  (maxRefPixelsDif              > 3.5 / 255.0) ||
                  (refPixelsDifSum              > 6.5 / 255.0)    ) ? orgPixel : refPixelsAvg;

// in pixel shaders one 8bit step is 1.0 / 255.0
Here's the end result, with no grain and no dithering as part of the debanding algorithm. madVR processes in high bitdepth and applies TPDF random dithering as a last step, anyway, so grain/dithering is not needed as part of the algorithm:
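
For readers who want to experiment outside a pixel shader, here is a minimal Python sketch of the decision rule quoted above, for a single channel. The function name and argument layout are hypothetical, and surround_contrast is passed in precomputed because the quoted code does not show how it is derived. It is not madVR's or f3kdb's actual implementation.

```python
def deband_decision(org, refs, sur, surround_contrast=0.0):
    """Return the original pixel or the mean of the reference pixels.

    org  -- original pixel value in 0..1 (one 8-bit step is 1/255)
    refs -- the 4 reference pixel values
    sur  -- the 8 pixels directly surrounding the original pixel
    surround_contrast -- precomputed by the caller (assumption of this sketch)
    """
    step = 1.0 / 255.0
    refs_avg = sum(refs) / len(refs)
    local_contrast = max(abs(s - org) for s in sur)
    max_ref_dif = max(abs(r - org) for r in refs)
    ref_dif_sum = sum(abs(r - org) for r in refs)

    # Any single check exceeding its threshold keeps the original pixel.
    keep_original = (
        abs(refs_avg - org) > 2.0 * step
        or local_contrast > 2.5 * step
        or surround_contrast > 3.5 * step
        or max_ref_dif > 3.5 * step
        or ref_dif_sum > 6.5 * step
    )
    return org if keep_original else refs_avg
```

With all four reference pixels one 8-bit step above the original, every check passes and the average is used; with references ten steps away, the first check fails and the original is kept.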

Thanks madshi, the result looks good to me. I will try to implement this when I have time later. In the meantime, can you explain a bit more about surroundContrast? I don't quite understand how to calculate it; does it work like http://imgur.com/M2JShFi ?
__________________
f3kdb 1.5.1 / MP_Pipeline 0.18

ffms2 builds with 10bit output hack:
libav-9a60b1f / ffmpeg-1e4d049 / FFmbc-0.7.1
Built from ffms2 6e0d654 (hack a9fe004)

Mirrors: http://bit.ly/19TwDD3
Old 28th September 2013, 15:29   #282  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by SAPikachu View Post
In the meantime, can you explain a bit more about surroundContrast? I don't quite understand how to calculate it; does it work like http://imgur.com/M2JShFi ?
Yes, almost like that. I've done this in 2 passes. In the first pass I'm calculating the "localContrast" for every pixel and store that in a helper texture/buffer. In the second pass I'm reading the "localContrast" for the pixels (x+3,y), (x-3,y), (x,y+3) and (x,y-3) and calculate the max of those 4 contrasts. The result is the "surroundContrast".

As I wrote before I'm not sure if we really need to do so many checks. Maybe the whole 2-pass thing and "surroundContrast" is overkill. Maybe you could skip the whole thing and get similar results. The key thing is to at least add *some* more checks because the default check is not good enough if you increase the thresholds (at least I thought so after testing that). In my first try I only checked "abs(refPixelsAvg - orgPixel)" and "maxRefPixelsDif" and that already worked quite nicely. But I found one case where detail suffered due to the increased thresholds, so I added some more checks to reduce the detail loss. I'm not even sure how much the added checks help, to be honest. Maybe you can try first without the "surroundContrast" check. Maybe that already works well enough...
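
The two passes described above can be sketched in plain Python on a single-channel image stored as a list of rows. The function names and the clamping at the image border are assumptions of this sketch; the post does not say how borders are handled.

```python
def local_contrast_map(img):
    """Pass 1: max abs difference between each pixel and its 8 neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    # Assumption: clamp coordinates at the image border.
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    best = max(best, abs(img[ny][nx] - img[y][x]))
            out[y][x] = best
    return out


def surround_contrast_map(local, distance=3):
    """Pass 2: max of the local contrast at (x+3,y), (x-3,y), (x,y+3), (x,y-3)."""
    h, w = len(local), len(local[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            taps = (
                (x + distance, y),
                (x - distance, y),
                (x, y + distance),
                (x, y - distance),
            )
            # Same border clamping assumption as in pass 1.
            out[y][x] = max(
                local[min(max(ty, 0), h - 1)][min(max(tx, 0), w - 1)]
                for tx, ty in taps
            )
    return out
```

Storing pass 1 in a helper buffer means each pixel's 8-neighbour scan is done once instead of four times per output pixel, which is the point of splitting the work into two passes.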
Old 29th September 2013, 01:48   #283  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by madshi View Post
Yes, almost like that. I've done this in 2 passes. In the first pass I'm calculating the "localContrast" for every pixel and store that in a helper texture/buffer. In the second pass I'm reading the "localContrast" for the pixels (x+3,y), (x-3,y), (x,y+3) and (x,y-3) and calculate the max of those 4 contrasts. The result is the "surroundContrast".

As I wrote before I'm not sure if we really need to do so many checks. Maybe the whole 2-pass thing and "surroundContrast" is overkill. Maybe you could skip the whole thing and get similar results. The key thing is to at least add *some* more checks because the default check is not good enough if you increase the thresholds (at least I thought so after testing that). In my first try I only checked "abs(refPixelsAvg - orgPixel)" and "maxRefPixelsDif" and that already worked quite nicely. But I found one case where detail suffered due to the increased thresholds, so I added some more checks to reduce the detail loss. I'm not even sure how much the added checks help, to be honest. Maybe you can try first without the "surroundContrast" check. Maybe that already works well enough...
Thanks for your explanation. I think I will try the algorithm without "surroundContrast" first, since it seems it would have a big performance impact. If the result is fine without it, that will be great.
Old 29th September 2013, 09:35   #284  |  Link
turbojet
Registered User
 
Join Date: May 2008
Posts: 1,840
Looking forward to it. If there's one thing I dislike about f3kdb, it's the added grain at times, but I'll take it over the banding. It would be nice to see some comparison pics of a real-world source to see how much detail is lost, or I could just wait.

Speaking of comparisons, it's unfortunate the second post of this thread hasn't been updated with pics of current f3kdb. Going by them, f3kdb is the worst performer to me. That was once true, and I abandoned it because of it, but things changed at some point and now it's on top imo.
__________________
PC: FX-8320 GTS250 HTPC: G1610 GTX650
PotPlayer/MPC-BE LAVFilters MadVR-Bicubic75AR/Lanczos4AR/Lanczos4AR LumaSharpen -Strength0.9-Pattern3-Clamp0.1-OffsetBias2.0
Old 29th September 2013, 18:48   #285  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
I've done some more testing and I think we can forget the whole "surroundContrast" thing. It did reduce detail loss ever so slightly, but the difference was really small, and I think the performance loss isn't worth it. I've removed this check from madVR now, too.

If you want to check for detail loss, I can suggest this image. I've also created a small comparison image which shows the difference between f3kdb(Y=128,Cb=128,Cr=128), GradFun3(smode=2,thr=0.7) and madVR. The debanding smoothness is comparable between all three with these settings. But f3kdb loses a lot of details in this case. With the improved checks madVR loses much less detail and is on a similar level as GradFun3 with these test images.

However, when using the default f3kdb and GradFun parameters, the extra checks I added don't seem to help much. With the default parameters, f3kdb and GradFun3 produce comparable results, but both leave quite a bit of banding in the test images. So the extra checks are mainly useful for users who want to use higher thresholds.

@turbojet, I'd recommend turning on error diffusion (Floyd-Steinberg) in f3kdb and setting grain to 0. IMHO when using error diffusion, there's no need to add grain, too.
Old 30th September 2013, 02:29   #286  |  Link
mandarinka
Registered User
 
Join Date: Jan 2007
Posts: 729
Quote:
Originally Posted by madshi View Post
I've done some more testing and I think we can forget the whole "surroundContrast" thing. It did reduce detail loss ever so slightly, but the difference was really small, and I think the performance loss isn't worth it.
Well, for encoding-time filtering, it might still make sense, because performance cost might not be a problem for many users and those details can be precious.
Old 30th September 2013, 07:11   #287  |  Link
turbojet
Registered User
 
Join Date: May 2008
Posts: 1,840
madshi: Thanks for the comparison. I'm mainly paying attention to the real-life source, as I rarely see banding half as bad as in the animated pic. madVR definitely looks good for both. So does gradfun3, but in realtime gradfun3 gives me some nasty edge artifacts, less so with smode=2 but still very visible. Sharpening may have something to do with this, but f3kdb and no debanding don't show the issue. My CPU can't handle smode=2 in realtime on HD sources; SD is fine, however. A few questions:

1. Is that an encode or realtime comparison?
2. Where do the 64 and 128 come from in f3kdb? The max value of the first param, range, is 31, and I couldn't find anything besides grainc/grainy that allows such values, but the comparison doesn't look grainier, just stronger or blurred.

As for the unwanted grain at times, setting grainc=0, grainy=0 definitely resolves the issue but also reveals blocks and ringing the grain was masking. Reducing grain from 64 to 16 seems to have mostly fixed the issue without the side effect and also changing random algo from uniform to gaussian made a positive impact. This is what I settled with on the movie I'm currently watching, an older, grainy source BD: f3kdb(grainy=8, grainc=8,blur_first=false,random_algo_ref=2,random_algo_grain=2). Has anyone played with the grain parameters?

Last edited by turbojet; 30th September 2013 at 07:15.
Old 30th September 2013, 08:54   #288  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by mandarinka View Post
Well, for encoding-time filtering, it might still make sense, because performance cost might not be a problem for many users and those details can be precious.
Well, we're talking about *significantly* slower speed for a benefit which is barely noticeable (even when pixel peeping), so I don't think it's worth it, even for encoding-time filtering. But of course that's only my personal opinion. I don't mind if SAPikachu implements it.

Yesterday I also played with trying to analyze & compare the gradient direction/angles, as another check to reduce detail loss. The speed penalty of doing this is even worse, and the benefit once more barely visible, so at some point I stopped and gave up.

Quote:
Originally Posted by turbojet View Post
So does gradfun3 but in realtime, gradfun3 gives me some nasty edge artifacts, less so with smode=2 but still very visible. Sharpening may have something to do with this but f3kdb and no debanding doesn't show the issue. My cpu can't handle smode=2 in realtime on HD sources, SD is fine however.
Ah, ok. Didn't see such edge artifacts with GradFun3 here, but I've only worked with a limited set of samples, and didn't sharpen afterwards.

Quote:
Originally Posted by turbojet View Post
1. Is that an encode or realtime comparison?
I've tested on still images, so I don't have to frame step to a suitable frame all the time. But what I can say is that madVR's algorithm consumes about 15ms render time per 1080p frame on my Intel HD4000. So no problem for 24p, but probably a bit too slow for 60p with the HD4000. Hmmmm... Maybe I can remove the extra checks for the "low" setting (there will be a low and a high setting in madVR), to make it 60p capable, I'll give that a try... I haven't tested f3kdb and GradFun3 for speed, only for quality.
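
As a rough illustration of why 15ms per frame is comfortable at 24p but tight at 60p, the per-frame time budget is simply 1000 ms divided by the frame rate. The helper names below are hypothetical, and the sketch ignores every other rendering step (scaling, dithering, presentation), which also has to fit into the same budget.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def headroom_ms(render_ms, fps):
    """Budget left for all other rendering steps (negative means too slow)."""
    return frame_budget_ms(fps) - render_ms


for fps in (24, 60):
    print(
        f"{fps}p: budget {frame_budget_ms(fps):.1f} ms, "
        f"headroom after 15 ms debanding {headroom_ms(15.0, fps):.1f} ms"
    )
```

At 24p this leaves about 26.7 ms of the 41.7 ms budget; at 60p only about 1.7 ms of the 16.7 ms budget remains for everything else.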

Quote:
Originally Posted by turbojet View Post
2. Where does the 64 and 128 come from in f3kdb?
Sorry for being unclear, that's the value for the three "Y, Cb, Cr" thresholds. Increasing these thresholds in f3kdb improves banding removal strength, but also reduces detail and if you go too high, it introduces artifacts around edges. The detail loss is noticeably reduced by the checks I've added and the edge artifacts are totally gone.

Quote:
Originally Posted by turbojet View Post
As for the unwanted grain at times, setting grainc=0, grainy=0 definitely resolves the issue but also reveals blocks and ringing the grain was masking.
Well, f3kdb is meant to reduce banding, not ringing. But then, the madVR dithering is somewhat similar to the grain which f3kdb adds. So when using madVR the end result should be roughly comparable to using f3kdb with grain.

Quote:
Originally Posted by turbojet View Post
changing random algo from uniform to gaussian made a positive impact.
Hmmmm... Not sure right now, does the Gaussian change only improve the grain, or does it also help even if you turn grain off in f3kdb?
Old 30th September 2013, 12:49   #289  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by madshi View Post
Well, we're talking about *significantly* slower speed for a benefit which is barely noticeable (even when pixel peeping), so I don't think it's worth it, even for encoding-time filtering. But of course that's only my personal opinion. I don't mind if SAPikachu implements it.

Yesterday I also played with trying to analyze & compare the gradient direction/angles, as another check to reduce detail loss. The speed penalty of doing this is even worse, and the benefit once more barely visible, so at some point I stopped and gave up.
I think I will make it togglable and let user choose.

Quote:
Originally Posted by madshi View Post
Hmmmm... Not sure right now, does the Gaussian change only improve the grain, or does it also help even if you turn grain off in f3kdb?
Just FYI, random_algo_grain controls the grain distribution; it has no effect when grainy/grainc are turned off. On the other hand, random_algo_ref affects reference pixel selection and always has an effect. @turbojet, which one did you mean?
Old 1st October 2013, 00:09   #290  |  Link
turbojet
Registered User
 
Join Date: May 2008
Posts: 1,840
I had madVR dither off last night; enabling it definitely changed things. I ran across a scene last night from an old SD Xvid encode (where banding is most prevalent) that didn't look good at all with the previously mentioned settings, and enabling dither didn't help much. It was improved by either increasing grain or increasing strength. I prefer the former on this particular scene but the latter on other scenes; defaults are really effective on this one. Is there a way to use a lot more grain on flat surfaces, like walls, than on everything else?

Both reference and grain were changed to Gaussian, but the former made the bigger impact on the old BD. On this other scene, it doesn't make much of a difference. I'm not sure what Gaussian does (the math is well above me), but changing LumaSharpen to it made a noticeable improvement and removed a uniform pattern I was starting to notice often. So I decided to try it in f3kdb, a filter I had been meaning to tweak for over a year and finally found a good reason to.

E: Does random_algo_grain matter when dynamic_grain=false (default)?

Last edited by turbojet; 1st October 2013 at 06:12.
Old 1st October 2013, 08:17   #291  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Do you have a sample where the difference between gaussian "reference" and default is clearly visible? I'd be interested in testing that...
Old 1st October 2013, 08:48   #292  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by turbojet View Post
I had madVR dither off last night; enabling it definitely changed things. I ran across a scene last night from an old SD Xvid encode (where banding is most prevalent) that didn't look good at all with the previously mentioned settings, and enabling dither didn't help much. It was improved by either increasing grain or increasing strength. I prefer the former on this particular scene but the latter on other scenes; defaults are really effective on this one. Is there a way to use a lot more grain on flat surfaces, like walls, than on everything else?
Not possible with f3kdb (at least for now). You could try masktools to filter detail and flat areas separately, but performance may suffer...

Quote:
Originally Posted by turbojet View Post
E: Does random_algo_grain matter when dynamic_grain=false (default)?
Yes. dynamic_grain only controls whether the noise pattern is different for each frame. The noise pattern is generated in the same way in both cases, so it is affected by random_algo_grain too.
Old 3rd October 2013, 00:52   #293  |  Link
turbojet
Registered User
 
Join Date: May 2008
Posts: 1,840
madshi: Sorry, I'm having some problems concentrating lately, too much stress/things on my mind. I compared the 2 algos: on clean HD sources there is very little difference, but when going SD -> HD there's a lot more grain with uniform, which could be either negative or positive. Here's a photo comparison and video. The scripts used were f3kdb() and f3kdb(random_algo_ref=2,random_algo_grain=2). Concerning the edge artifacts with gradfun3(smode=0): they're pretty noticeable on SD -> HD but otherwise not; smode=3 is just as fast with fewer edge artifacts. A few questions:

1. What was the percentage gain on gpu (from 10 to 40% being 400%) of madvr's debanding on HD4000?
2. Does it require a delay to keep a/v sync like avisynth does? This is an issue with live tv.

SAPikachu: Thanks for the tip on masktools I haven't tried it yet but will soon to see if it's usable in realtime.
Old 3rd October 2013, 08:00   #294  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by turbojet View Post
I compared the 2 algos: on clean HD sources there is very little difference, but when going SD -> HD there's a lot more grain with uniform, which could be either negative or positive. Here's a photo comparison and video. The scripts used were f3kdb() and f3kdb(random_algo_ref=2,random_algo_grain=2)
Hmmmm... That's quite interesting. Will have a look at this later...

Quote:
Originally Posted by turbojet View Post
1. What was the percentage gain on gpu (from 10 to 40% being 400%) of madvr's debanding on HD4000?
Gain compared to what? CPU you mean? I haven't done any speed comparisons. My current development PC happens to have a quite fast quad-core CPU but a quite slow GPU. So maybe f3kdb() via AviSynth might even have been faster on my PC than running it on the GPU. But that's not the point of madVR. GPU speed seems to climb much faster than CPU speed, so it makes sense to run everything on the GPU. And upgrading the GPU to a much faster model is easier than upgrading your CPU to a much faster model. I haven't compared CPU vs GPU. I can only tell you that my HD4000 needs 7ms per 1080p frame for debanding in "low" setting (comparable to f3kdb() default parameters). Of course with a fast GPU the time will be much lower. Probably with e.g. a GeForce 660 (which people often recommend for madVR) it would be under 1ms per 1080p frame. And for SD frames the time will be much lower still.

Quote:
Originally Posted by turbojet View Post
2. Does it require a delay to keep a/v sync like avisynth does? This is an issue with live tv.
No, of course not. No madVR algorithm requires a delay, or ever will require a delay. That said, I don't know what happens with live TV if you enable the "delay playback start until all queues are full" madVR setting. Might be that playback is delayed a little then, but that's an optional feature, and off by default.
Old 3rd October 2013, 09:30   #295  |  Link
turbojet
Registered User
 
Join Date: May 2008
Posts: 1,840
Quote:
Originally Posted by madshi View Post
Hmmmm... That's quite interesting. Will have a look at this later...
On that same sample, range=31 masks a vertical strip of visible macroblocks much more than 16 (default); it's to the right of the picture and above the paper. dynamic_grain=true makes the area above the paper on the right more dynamic, which is less distracting to my eyes in this case. I haven't compared these to defaults on other samples but have watched a couple of HD and SD shows with them. Other options that I plan to experiment with when I get some time are seed and random_param_ref / random_param_grain. Can someone simply explain these options beyond the txt?

Quote:
Gain compared to what? CPU you mean? I haven't done any speed comparisons. My current development PC happens to have a quite fast quad-core CPU but a quite slow GPU. So maybe f3kdb() via AviSynth might even have been faster on my PC than running it on the GPU. But that's not the point of madVR. GPU speed seems to climb much faster than CPU speed, so it makes sense to run everything on the GPU. And upgrading the GPU to a much faster model is easier than upgrading your CPU to a much faster model. I haven't compared CPU vs GPU. I can only tell you that my HD4000 needs 7ms per 1080p frame for debanding in "low" setting (comparable to f3kdb() default parameters). Of course with a fast GPU the time will be much lower. Probably with e.g. a GeForce 660 (which people often recommend for madVR) it would be under 1ms per 1080p frame. And for SD frames the time will be much lower still.
I meant GPU usage with madVR's deband disabled compared to when it's enabled. The presentation time increase would vary a lot depending on GPU and settings, wouldn't it? The relative percentage shouldn't vary as much, or maybe it would. I mostly agree on the GPU, but there are situations where an option to offload to the CPU would be beneficial (48/50/60/120 fps, IGP-only HTPC, etc.), though I'm not asking for it.

Quote:
No, of course not. No madVR algorithm requires a delay, or ever will require a delay. That said, I don't know what happens with live TV if you enable the "delay playback start until all queues are full" madVR setting. Might be that playback is delayed a little then, but that's an optional feature, and off by default.
That's good to know. 720i/p MPEG2 at 6-7 Mbps could really use debanding, but that's about the worst-case scenario.

Last edited by turbojet; 3rd October 2013 at 09:32.
Old 9th November 2013, 04:47   #296  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by madshi View Post
Ok, after a lot of user testing and fine tuning, here's the "final" version of the algorithm. There have been a number of changes.

First some definitions:

Code:
// orgPixel = original pixel value
// refPixel = one of the 4 reference pixels selected by the algorithm
// refPixelPair = 2 refPixels build a pair; if you connect the two refPixels of a pair with a straight line, the orgPixel lies right in the middle of the line
// avgDif = dif between the orgPixel and the simple mean average of all 4 refPixels
// maxDif = max dif between the 4 refPixels and the orgPixel
// midDif = dif between the orgPixel value and the average value of a refPixelPair
Now we have 3 thresholds we can check: "avgDif", "maxDif" and "midDif", where "midDif" must be checked twice, once for each "refPixelPair". The logical purpose of the "midDif" is that debanding should really be done mostly for gradients. If you look at a refPixelPair, the orgPixel should have a value which is between the two values of the refPixelPair. If it's outside, it's less likely to be a banded pixel, but instead it's more likely to be e.g. a highlight. Because of this the "midDif" check is a useful check to separate detail from banding.

Instead of doing a simple binary check for each threshold I'm now doing "fuzzy logic". Here's the code I'm using:

Code:
      float3 avg = (pix1 + pix2 + pix3 + pix4) / 4.0;
      float3 avgDif = abs(avg - orgPix);
      float3 maxDif = max(abs(pix1 - orgPix), max(abs(pix2 - orgPix), max(abs(pix3 - orgPix), abs(pix4 - orgPix))));
      float3 midDif1 = abs(pix1 + pix3 - 2 * orgPix);
      float3 midDif2 = abs(pix2 + pix4 - 2 * orgPix);
      float3 factor = pow(saturate(3.0 * (1.0 - avgDif  / threshAvgDif)) *   // "saturate" clips to 0..1 range
                          saturate(3.0 * (1.0 - maxDif  / threshMaxDif)) *
                          saturate(3.0 * (1.0 - midDif1 / threshMidDif)) *
                          saturate(3.0 * (1.0 - midDif2 / threshMidDif)), 0.1);
      result = orgPix + (avg - orgPix) * factor;

// in pixel shaders one 8bit step is 1.0 / 255.0
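
A single-channel Python port of the "fuzzy logic" blend above may be handy for testing thresholds offline. The function name and scalar (rather than float3) arguments are assumptions of this sketch. Following the shader code, pix1/pix3 and pix2/pix4 are treated as the two refPixelPairs, and note that midDif as written is twice the distance between orgPix and the pair mean.

```python
def saturate(v):
    """Clip to the 0..1 range, like the HLSL intrinsic of the same name."""
    return min(max(v, 0.0), 1.0)


def fuzzy_deband(org, pix1, pix2, pix3, pix4, thresh_avg, thresh_max, thresh_mid):
    """Blend org towards the mean of the 4 reference pixels by a soft factor."""
    avg = (pix1 + pix2 + pix3 + pix4) / 4.0
    avg_dif = abs(avg - org)
    max_dif = max(abs(p - org) for p in (pix1, pix2, pix3, pix4))
    mid_dif1 = abs(pix1 + pix3 - 2.0 * org)
    mid_dif2 = abs(pix2 + pix4 - 2.0 * org)
    # Each term is 1 well below its threshold and falls to 0 at the threshold.
    factor = (
        saturate(3.0 * (1.0 - avg_dif / thresh_avg))
        * saturate(3.0 * (1.0 - max_dif / thresh_max))
        * saturate(3.0 * (1.0 - mid_dif1 / thresh_mid))
        * saturate(3.0 * (1.0 - mid_dif2 / thresh_mid))
    ) ** 0.1
    return org + (avg - org) * factor
```

Because the four terms are multiplied, a single check at or beyond its threshold forces the factor to 0 and the original pixel is returned unchanged. The 0.1 exponent pushes any nonzero product towards 1, so the blend stays close to full strength until one check gets near its limit.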
And because this is not complicated enough yet, I've added two more tricks:

(1) gradient angle check

This is quite complicated and requires an extra algorithm step - resulting in a noticeable performance drop. But it does result in a nice improvement in the "debanding smoothness vs detail loss" ratio. We try to estimate the gradient angle for every pixel. This is done by reading the 8 surround pixels and doing some math on them - but instead of reading the directly surrounding pixels I'm using a read distance of 20 pixels (!!). This is done to give us a better estimate of the real gradient angle. Here's the code:

Code:
      float3 p00 = tex2Dlod(SourceSampler, float4(Tex.x - xPixSize * 20, Tex.y - yPixSize * 20, 0, 0));
      float3 p10 = tex2Dlod(SourceSampler, float4(Tex.x,                 Tex.y - yPixSize * 20, 0, 0));
      float3 p20 = tex2Dlod(SourceSampler, float4(Tex.x + xPixSize * 20, Tex.y - yPixSize * 20, 0, 0));
      float3 p01 = tex2Dlod(SourceSampler, float4(Tex.x - xPixSize * 20, Tex.y,                 0, 0));
      float3 p21 = tex2Dlod(SourceSampler, float4(Tex.x + xPixSize * 20, Tex.y,                 0, 0));
      float3 p02 = tex2Dlod(SourceSampler, float4(Tex.x - xPixSize * 20, Tex.y + yPixSize * 20, 0, 0));
      float3 p12 = tex2Dlod(SourceSampler, float4(Tex.x,                 Tex.y + yPixSize * 20, 0, 0));
      float3 p22 = tex2Dlod(SourceSampler, float4(Tex.x + xPixSize * 20, Tex.y + yPixSize * 20, 0, 0));
      float3 gx = (p20 + p21 * 2.0 + p22) - (p00 + p01 * 2.0 + p02);
      float3 gy = (p00 + p10 * 2.0 + p20) - (p02 + p12 * 2.0 + p22);
      float3 angle = (abs(gx) < 0.01 / 255.0) ? 1.0 : (atan(gy / gx) / PI + 0.5);
The output will be a value between 0.0 and 1.0, telling us the estimated gradient angle of the pixel. Now in the main algorithm step for each "refPixel" we now also read the corresponding gradient angle. Then we compare the estimated gradient angle of each "refPixel" to the estimated gradient angle of the "orgPixel". If the max difference of those 4 "refPixel" angles does not exceed a specific limit (new threshold named "maxAngle") then we apply a boost to the debanding strength. This results in a nice debanding smoothness improvement without harming detail (in most cases). We apply the boost simply by increasing the other 3 thresholds (avgDif, maxDif, midDif) by a specified factor. This factor I'm calling "angleBoost" in madVR.
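
The angle estimate and the boost decision can be sketched in Python for a single-channel image stored as a list of rows. Several things here are assumptions of the sketch and not taken from madVR: the function names, the border clamping, and the unit of max_angle, which is compared directly against the 0..1 angle values (the post does not say how the preset values map onto that range). The sketch also ignores that 0.0 and 1.0 describe nearly the same direction.

```python
import math


def gradient_angle(img, x, y, distance=20):
    """Estimate the gradient angle (0..1) at (x, y) from 8 far-apart taps."""
    h, w = len(img), len(img[0])

    def tap(dx, dy):
        # Assumption: clamp coordinates at the image border.
        ty = min(max(y + dy * distance, 0), h - 1)
        tx = min(max(x + dx * distance, 0), w - 1)
        return img[ty][tx]

    p00, p10, p20 = tap(-1, -1), tap(0, -1), tap(1, -1)
    p01, p21 = tap(-1, 0), tap(1, 0)
    p02, p12, p22 = tap(-1, 1), tap(0, 1), tap(1, 1)
    gx = (p20 + p21 * 2.0 + p22) - (p00 + p01 * 2.0 + p02)
    gy = (p00 + p10 * 2.0 + p20) - (p02 + p12 * 2.0 + p22)
    if abs(gx) < 0.01 / 255.0:
        return 1.0
    return math.atan(gy / gx) / math.pi + 0.5


def boosted_thresholds(org_angle, ref_angles, thresholds, max_angle, angle_boost):
    """Multiply the thresholds by angle_boost if all refPixel angles agree."""
    if max(abs(a - org_angle) for a in ref_angles) <= max_angle:
        return tuple(t * angle_boost for t in thresholds)
    return tuple(thresholds)
```

On a purely horizontal ramp gy is 0 and the estimate is 0.5; on a vertical ramp or a flat area gx is below the cutoff and the function returns 1.0, matching the shader's special case.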

(2) fade in/out detection

In my experience, banding artifacts are especially annoying when there's a fade to/from white or black, because in such situations the bands start to move around, creating false image edges, which I find extremely distracting. Because of that I've added a detection for when there's a fade from/to white or black. During the fade I'm using a stronger debanding strength (higher thresholds). If you need more detail on how I'm detecting a fade in/out, let me know...

The biggest problem for me is to find good values for all those new thresholds (avgDif, maxDif, midDif, angleBoost and maxAngle). I'm still in the process of researching this, with the help of some madVR users. Here's what I'm currently using, but this is subject to change:

Code:
low:    avgDif = 0.6 / 255; maxDif = 1.9 / 255; midDif = 1.2 / 255; angleBoost = 1.9; maxAngle = 10;
medium: avgDif = 1.8 / 255; maxDif = 4.0 / 255; midDif = 2.0 / 255; angleBoost = 1.6; maxAngle = 22;
high:   avgDif = 3.4 / 255; maxDif = 6.8 / 255; midDif = 3.3 / 255; angleBoost = off; maxAngle = off;
The medium preset is roughly comparable to the current default f3kdb() settings, with maybe slightly higher debanding quality and slightly higher detail preservation. But of course this differs, depending on which exact video you test with. Sometimes the new thresholds not only preserve more detail, but also more banding. The default madVR configuration is currently "low" for normal scenes and "high" when there's a fade in/out. But none of these thresholds are set in stone yet. The algorithm itself is not likely to change, anymore, now, I hope.
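
For anyone porting this, the presets and the fade rule described above could be held in a small table like the following. The dictionary layout and the function name are hypothetical; "off" is represented as None, and whether a frame is inside a fade must come from a detector that is not described in the post.

```python
STEP = 1.0 / 255.0  # one 8-bit step in pixel shader units

# Values copied from the preset list above; None stands for "off".
PRESETS = {
    "low": {"avgDif": 0.6 * STEP, "maxDif": 1.9 * STEP, "midDif": 1.2 * STEP,
            "angleBoost": 1.9, "maxAngle": 10},
    "medium": {"avgDif": 1.8 * STEP, "maxDif": 4.0 * STEP, "midDif": 2.0 * STEP,
               "angleBoost": 1.6, "maxAngle": 22},
    "high": {"avgDif": 3.4 * STEP, "maxDif": 6.8 * STEP, "midDif": 3.3 * STEP,
             "angleBoost": None, "maxAngle": None},
}


def preset_for_frame(in_fade, normal="low", fade="high"):
    """Pick the threshold set: 'low' normally, 'high' during a fade in/out."""
    return PRESETS[fade if in_fade else normal]
```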
Thanks very much madshi. I will look into it as soon as I have time (but this won't happen fast because I have been really busy recently..).
Old 13th December 2013, 09:43   #297  |  Link
vood007
Registered User
 
Join Date: Jan 2011
Posts: 84
Is there anything I can do to speed up the initial flash3kyuu_deband.dll loading? I use it for realtime watching in MPC-HC/ffdshow, but the additional 2 sec delay when opening a video almost kills the fun. I tried loading the dll from a ramdisk, but it did not help at all. Any suggestions?
Old 13th December 2013, 14:30   #298  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
Not trying to keep SAPikachu from answering, but did you try out the current madVR test build? It features debanding based on flash3kyuu's algorithm. You can find the option in the settings, "processing">"artifact removal".
Old 14th December 2013, 00:21   #299  |  Link
turbojet
Registered User
 
Join Date: May 2008
Posts: 1,840
Try lowering the buffer. You shouldn't need to keep any past frames, and I was able to get away with 3 future frames; any less and it would drift out of sync. So 0 3 in ffdshow.
Old 14th December 2013, 02:57   #300  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
f3kdb itself shouldn't take 2 secs to load (in my test it took less than 1 sec for 1920x1080 video), so I guess you need to tweak some of your ffdshow settings.