Filter prototype...
defunkt
21st January 2005, 00:25
The problem I find with the filters available for de-noising source video is that they invariably destroy good detail as they remove rubbish. I've been experimenting recently with MaskTools, creating masks that preserve the good and discard the bad when used with Overlay() to merge information from the original sharp source and a heavily smoothed version of the frame. This has yielded some good results based on enhancing midtones and deprecating very dark/light areas, and also on enhancing the dark side of edges while deprecating the light side of edges, against which mosquito noise tends to be much more visible.
What I couldn't find a way to do with MaskTools was differentiate between foreground objects (which should stay sharp & crisp) & background areas like sky (which benefit from heavy smoothing). This prompted me to knock up my own filter (my first) and try some ideas.
After a promising start with one method I'm about ready to abandon it. However I can't help but think that somebody with better math, better C++ or a better understanding of UV values might see merit in the basic idea and that it's just that my implementation is naive. So...
Given a frame like...
http://www.ionwerks.net/defunkt/source.png
... if the filter I've knocked up (http://www.ionwerks.net/defunkt/myfilter.zip) is passed this frame (I use the smoothed version) with the UV planes intact and the Y plane turned into an edge mask like...
smooth=Deen(UnFilter(source, -100, -75), "a2d", 2, 10, 12)
result=YV12Convolution(smooth, "1 1 1 0 1 1 1", "1 1 1 0 1 1 1")
result=YV12LUTXY(smooth, result, "y x - abs")
result=MergeLuma(smooth, result)
result=MyFilter(result)
http://www.ionwerks.net/defunkt/edges.png
... it will return the mask below...
http://www.ionwerks.net/defunkt/result.png
... which is basically what I set out to do. What it does is create a look-up table, indexed by each possible U and V value, which contains the average value of the correlated Y plane pixels, essentially the average incidence of edges. These values are then written into the Y plane of the result.
The problem is that while the basic shape of the histogram (of derived values) is generally quite consistent, and any given result represents the relative importance of image areas in the frame, it's prone to spikes and largish variations between very small chroma changes, which makes the whole mask quite 'liquid'.
I'm just wondering if some math whizz might be able to suggest how to deal with these variations using some sort of normal density function or the like?
:confused:
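One possible way to tame the spikes, offered as a sketch rather than a tested fix: smooth the 256-entry table across neighbouring chroma values with a small bell-shaped (Gaussian) kernel, weighting each entry by the number of pixels that contributed to it so that sparsely populated chroma values cannot dominate. The function name, the radius and the sigma below are assumptions for illustration only; none of them come from the filter itself.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of the posted filter.
// sum[i]   = total of the edge-mask values seen for chroma value i
// count[i] = number of samples seen for chroma value i
// Returns a table of averages in which each entry is blended with its
// neighbours using Gaussian weights multiplied by the sample counts.
std::array<double, 256> SmoothChromaTable(const std::array<uint32_t, 256>& sum,
                                          const std::array<uint32_t, 256>& count,
                                          int radius = 4, double sigma = 2.0) {
    std::array<double, 256> out{};
    for (int i = 0; i < 256; ++i) {
        double weightedSum = 0.0, weightTotal = 0.0;
        for (int d = -radius; d <= radius; ++d) {
            const int j = i + d;
            if (j < 0 || j > 255) continue;   // stay inside the table
            const double g = std::exp(-(d * d) / (2.0 * sigma * sigma));
            weightedSum += g * sum[j];        // equals g * count[j] * average[j]
            weightTotal += g * count[j];
        }
        out[i] = (weightTotal > 0.0) ? weightedSum / weightTotal : 0.0;
    }
    return out;
}
```

With this weighting, a chroma value seen only once but surrounded by well-populated neighbours is pulled towards their average instead of standing out as a spike.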
E-Male
21st January 2005, 00:56
interesting, i've been waiting for something like this
i won't be able to be much help for you
but i'll try it
could you make a version that only outputs the mask?
thx
Mug Funky
21st January 2005, 13:31
that mask looks like just the sort of thing the filmlook thread needs - at least on that one frame it seems to distinguish between in-focus and out-of-focus, meaning we could adapt that to make a depth-of-field faker.
the idea seems pretty good. i don't fully understand what it's doing though :)
could make a good artefact-protection for other filters that do good, but also remove too much (like temporal median, or removedirt).
Didée
21st January 2005, 14:03
Welcome onboard, defunkt!
Honestly speaking, I don't fully understand what exactly you are doing there. Could you please elaborate a lil' more what's going on in the process you're doing, for a slow dumb*** like me. The idea of getting measurements of what is (&should be) sharp, and what is supposed to be flat, is clear. It's just the details of operation that escape me ...
Especially the chroma thingy irritates me. At one point I understand you're using the chroma planes to store lookups, at another point I understand you're doing some luma<-->chroma correlation evaluation. So it seems in fact I understand nothing :eek:
But there's a weak probability I could pull some methods out of the drawer - I've some minor experience in creating spatial masks through avisynth :)
Originally posted by Mug Funky
... artefact-protection for other filters that do good, but also remove too much (like temporal median, or removedirt).
Mug, *did* you have a look at kassandro's RemoveGrain/Repair (http://www.removegrain.de.tf/) combo scripts, a.k.a. "RemoveDust()"? Quite powerful in itself, and get even better with some small additions ;)
(But the doc is a PITA to read)
Richard Berg
24th January 2005, 18:48
DeGrain (http://bag.hotmail.ru/degrain/degrainmedian.dhtml) is also a good/similar filter for that sort of noise.
Back to the thread subject, though -- I think this foreground/background idea would make a great plugin, or addition to MaskTools, however it's implemented.
defunkt
25th January 2005, 19:41
Hello again. My apologies for not explaining the idea more fully, it's actually very simple (probably too much so) but possibly only once adequately explained. It is based on the following two assumptions...
1. Chroma planes contain useful (if imperfect) information about discrete objects within a frame. For the frame above, the sky has one incidence of UV values (blue), the tree another (green) etc. etc.
2. Areas which have a lot of edges tend to also have a lot of texture between the edges which is worth keeping sharp (and often in the foreground). And vice versa.
Here is the source...
PVideoFrame __stdcall MyFilter::GetFrame(int n, IScriptEnvironment* env) {
    PVideoFrame source = child->GetFrame(n, env);
    PVideoFrame result = env->NewVideoFrame(vi);
    const unsigned char *sourceY, *sourceU, *sourceV;
    unsigned char *resultY;
    unsigned int ystride = source->GetPitch(PLANAR_Y);
    unsigned int uextent = source->GetRowSize(PLANAR_U);
    unsigned int uheight = source->GetHeight(PLANAR_U);
    unsigned int ustride = source->GetPitch(PLANAR_U);
    // One slot per possible 8-bit chroma value (0..255), hence 256 entries.
    unsigned int u_count[256] = {0}, u_value[256] = {0};
    unsigned int v_count[256] = {0}, v_value[256] = {0};
    unsigned int x, y, store;
    if (vi.IsYV12()) {
        sourceY = source->GetReadPtr(PLANAR_Y);
        sourceU = source->GetReadPtr(PLANAR_U);
        sourceV = source->GetReadPtr(PLANAR_V);
        // Pass 1: per chroma value, count occurrences and total the luma (edge mask).
        for (y = 0; y < uheight; y++) {
            for (x = 0; x < uextent; x++) {
                u_count[sourceU[x]]++;
                v_count[sourceV[x]]++;
                store = sourceY[x * 2]
                      + sourceY[(x + 1) * 2]
                      + sourceY[(x * 2) + ystride]
                      + sourceY[((x + 1) * 2) + ystride];
                u_value[sourceU[x]] += store;
                v_value[sourceV[x]] += store;
            }
            sourceY += ystride * 2;
            sourceU += ustride;
            sourceV += ustride;
        }
        // Turn the totals into averages, skipping chroma values that never occurred.
        for (int i = 0; i < 256; i++) {
            if (u_count[i]) u_value[i] /= u_count[i];
            if (v_count[i]) v_value[i] /= v_count[i];
        }
        // Pass 2: write the mask from the two look-up tables.
        resultY = result->GetWritePtr(PLANAR_Y);
        sourceU = source->GetReadPtr(PLANAR_U);
        sourceV = source->GetReadPtr(PLANAR_V);
        for (y = 0; y < uheight; y++) {
            for (x = 0; x < uextent; x++) {
                resultY[x] = __min((u_value[sourceU[x]] + v_value[sourceV[x]]), 255);
            }
            resultY += ystride;
            sourceU += ustride;
            sourceV += ustride;
        }
        return result;
    } else {
        return source;
    }
}
For each chroma plane two arrays are maintained, indexed by U or V pixel values. Every time a given U or V value is encountered in the source its count is incremented and the value of the correlated pixels in the Y plane (the edge mask) is added to a running total. This total is afterwards divided by the count, yielding the average incidence of edge pixels for any given U & V value. This average is then used to write the mask, again for each incidence of U & V in the source.
The above scene from SPR was what prompted me to start this and it illustrates the goal well. Other striking examples...
http://www.ionwerks.net/defunkt/149636.png
http://www.ionwerks.net/defunkt/183817.png
...however it's not unusual for it to return frames like...
http://www.ionwerks.net/defunkt/070038.png
http://www.ionwerks.net/defunkt/087610.png
...neither of which are actually flawed for my purposes, namely choosing the mix from sharp/smooth versions. The first indicates that there is little detail to maintain anywhere (favour the smoothed version), the second that all areas are detailed (favour the sharp version) though it still manages to deprecate the little bit of sky.
And for any given frame it yields a useful estimation. The problem is the fluctuations from frame to frame. When applied as a mask with any sort of weight it's inclined to make the detail crawl in a small, but undesirable manner. I still think there's the potential for an incredibly useful filter but until I can resolve the variations (have had some small success with one technique) I don't see it as fit for any purpose.
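One generic way to damp frame-to-frame fluctuation is to blend each frame's look-up table into a running table with an exponential moving average. This is only an illustrative sketch and is not the technique alluded to above, which is not described; the class name and the alpha parameter are assumptions.

```cpp
#include <array>

// Hypothetical helper: blends this frame's table into a running table.
// alpha = 1.0 means "use the new frame only"; smaller values react more slowly.
class TemporalTable {
public:
    explicit TemporalTable(double alpha) : alpha_(alpha) {}

    const std::array<double, 256>& Update(const std::array<double, 256>& current) {
        if (!primed_) {               // first frame: nothing to blend with yet
            state_ = current;
            primed_ = true;
        } else {
            for (int i = 0; i < 256; ++i)
                state_[i] += alpha_ * (current[i] - state_[i]);
        }
        return state_;
    }

private:
    double alpha_;
    bool primed_ = false;
    std::array<double, 256> state_{};
};
```

A real filter would also need to cope with frames being requested out of order and would probably want to reset the running table at scene changes; neither is handled here.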
Didée
26th January 2005, 13:26
Thanks for clarifying your thoughts, defunkt. Now I understand you much better :)
Let me say that I'm currently fiddling with a very similar problem, only less for denoising, but more for sharpening tasks. It's a rather complex enhancement script, where I need a reliable exclusion of "flat" areas, in order to avoid enhancing the crap out of them - and that's even needed two times, in two different contexts. But since the major CPU time (should) go into the other parts, that exclusion thingy has to be as fast as possible. Currently I'm using a relative simple, luma-only, area-min-max-evaluation, with some additional twists. It doesn't work all that bad, but surely needs further refinement.
However you see: my interest is there (though I can't help you a bit on the coding side ;) )
Your approach of chroma evaluation really has its strong points, especially in separating "objects" apart. The first two of above screenshots seem quite nice and usable.
However, I think you can't rely on chroma information solely. You'll come across enough scenarios where this approach, on its own, falls short. Especially the third of above pictures shows a big problem: most parts of the frame have rather similar chroma values, resulting in "one big block of everything". But still, there is much detail in the foreground, which shouldn't get (masked-ly) mixed with the similar-colored background.
Also, especially dark areas need special treatment. In dark areas, chroma information often is poor, but still much detail might be there that should be preserved.
Imagine, in the 4th of above screenshots, all those dark areas (the uniforms) would undergo strrrong blurring - that would not be the wanted effect.
So, I don't see your current approach as flawed or something, not at all. But it definitely needs some assistance of other, additional techniques.
Readers, let the ideas bubble! ;)
IanB
26th January 2005, 14:44
defunkt,
Maybe some transform of U & V like to Hue and Saturation (or just Hue alone) could lead to interesting possibilities?
There may be some mileage in remembering that U & V are really signed values offset by 128. i.e.
U= 16, V= 16 -> u=-1.0, v=-1.0 == Green(ish)
U=240, V=128 -> u=+1.0, v=0.0 == Blue(ish)
U=128, V=240 -> u=0.0, v=+1.0 == Red(ish)
Perhaps crunching the number of bits for U & V values down to 7 or even 6 bits or perhaps a nonlinear map (lookup table).
Just some random musings
IanB
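The two suggestions above (treating U & V as signed values, and crunching them down to fewer bits) can be sketched as a few small helpers. The function names are made up for illustration, and the normalisation assumes the nominal 16..240 chroma range implied by the examples in the post.

```cpp
#include <cstdint>

// Map a stored 8-bit chroma sample to its signed value: 128 -> 0.
inline int SignedChroma(uint8_t stored) {
    return static_cast<int>(stored) - 128;
}

// Normalise assuming the nominal video range 16..240 (+/-112 around 128),
// which is what makes U=240 come out as +1.0 and U=16 as -1.0.
inline double NormalisedChroma(uint8_t stored) {
    return SignedChroma(stored) / 112.0;
}

// Drop the low bits so nearby chroma values share one table slot.
// bits must be in 1..8; bits = 6 gives 64 slots, bits = 7 gives 128.
inline unsigned ChromaBucket(uint8_t stored, unsigned bits) {
    return static_cast<unsigned>(stored) >> (8u - bits);
}
```

Indexing the count and total arrays with ChromaBucket(value, 6) instead of the raw value would give each slot roughly four times as many samples, at the cost of coarser colour discrimination.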
defunkt
26th January 2005, 23:08
Actually I finally bumped into what I had suspected existed, a dumb pointer error *blush*. Working much better now, still inclined to fluctuate but I'm quietly confident of getting on top of this. Have updated the source/pic's above to reflect the change.
@E-Male: Will post a working version to try soon.
@IanB: Thank you. Your 'random musings' exactly match most of mine, though I'll have to try them all again now. I'm particularly interested in boiling it down to Hue. I doubted my implementation here, the code example I found on the net (Foley, van Dam, et al.) was definitely influenced by luma which seems wrong to me and a look at the source for Tweak suggested it should be possible to evaluate hue without regard for luma. Can anyone (DG?) point me to code which will derive Hue 0-359 (or higher resolution) given U & V?
@Didée: This is destined to be the 3rd component of the mask I use for merging smooth/sharp. I'd take issue with your point about not smoothing dark areas. In fact the base component of my mask is built by accentuating midtones and deprecating light & dark areas. This is after all what XVID's AQ is all about is it not? Here is what I do to build the mask already, I find it very effective against my pet hate Mosquito Noise...
smooth=Deen(UnFilter(source, -100, -75), "a2d", 2, 10, 12)
m_tone=YV12LUT(smooth, "x 127 > 255 x - x ? 2 *")
m_blur=YV12Convolution(smooth, "1 1 1 0 1 1 1", "1 1 1 0 1 1 1")
m_edge=YV12LUTXY(smooth, m_blur, "y x - 127 +")
weight=YV12LUTXY(m_tone, m_edge, "y x + 127 -")
return Overlay(smooth, source, 0, 0, weight)
Given a frame like this (http://www.ionwerks.net/defunkt/144594_base.png) it returns this (http://www.ionwerks.net/defunkt/144594_done.png). I suspect the 4th component of the mask will end up being some sort of protection for very fine lines.
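For anyone not fluent in reverse Polish notation, the three LUT expressions in the script can be read as the per-pixel arithmetic below. This is an illustrative sketch only: the function names are invented, and clipping of results to 0..255 is assumed to match what the LUT filters do.

```cpp
#include <algorithm>
#include <cstdint>

// Clip to the 8-bit range (assumed to match the behaviour of the LUT filters).
inline uint8_t Clip8(int v) {
    return static_cast<uint8_t>(std::min(255, std::max(0, v)));
}

// m_tone: "x 127 > 255 x - x ? 2 *"  ->  ((x > 127) ? 255 - x : x) * 2
// Peaks in the midtones and falls to 0 at black and white.
inline uint8_t Tone(uint8_t x) {
    return Clip8(((x > 127) ? 255 - x : x) * 2);
}

// m_edge: "y x - 127 +"  ->  blurred - smooth + 127
// Above 127 where the pixel is darker than its neighbourhood.
inline uint8_t Edge(uint8_t smooth, uint8_t blurred) {
    return Clip8(blurred - smooth + 127);
}

// weight: "y x + 127 -"  ->  edge + tone - 127
inline uint8_t Weight(uint8_t tone, uint8_t edge) {
    return Clip8(edge + tone - 127);
}
```

Since the weight is passed as the Overlay mask with the sharp source on top, higher values keep more of the original and lower values favour the smoothed clip.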
morsa
27th January 2005, 00:36
Really good results!!!
You could also think about upsampling the chroma. Maybe I'm crazy, but using something like the scale2X algorithm, which is really fast, could give you (I don't know exactly) a higher resolution chroma to work with...
Just a thought
Here comes another WAY TOO CRAZY thought:
What would happen if we were using this green mask to do the block matching in Manao's MVTools to get a better frame interpolation?
Last time I tested MVTOOLS for interpolating frames it performed far better when I first made some kind of "image segmentation" thru avisynth filter chain....
PS: After replying to an E-male post on "getting a film look" this other question came to my mind.
What would happen if you were using HSV or HSI colorspace?
The H channel (hue) would have far better info than the U and V channels in YUV. Or maybe use H and I...
IanB
27th January 2005, 15:12
defunkt,
Originally posted by defunkt
Can anyone (DG?) point me to code which will derive Hue 0-359 (or higher resolution) given U & V?
Hue is related to atan2(u, v) (atan2 is the 4 quadrant arctan).
A fast implementation could go something like this:
const int Kr=16; // choose to suit precision required
const unsigned K180=(180*binary_degrees_factor);
unsigned short reciprocal[128]; // Load with R[i]=(1<<Kr)/i * scale factor to suit atanTable
unsigned short atanTable[128]; // Load with A[i]=K180/2*arctan(i*scale_factor)
BYTE u, v;
unsigned U, V;
if ((u >= 128) && (v >= 128)) { // Quadrant 1
    U=u-128;
    V=v-128;
    hue=atanTable[(U*reciprocal[V])>>Kr];
}
else if ((u >= 128) && (v < 128)) { // Quadrant 2
    U=u-128;
    V=128-v;
    hue=K180 - atanTable[(U*reciprocal[V])>>Kr];
}
else if ((u < 128) && (v < 128)) { // Quadrant 3
    U=128-u;
    V=128-v;
    hue=K180 + atanTable[(U*reciprocal[V])>>Kr];
}
else /* if ((u < 128) && (v >= 128)) */ { // Quadrant 4
    U=128-u;
    V=v-128;
    hue=2*K180 - atanTable[(U*reciprocal[V])>>Kr];
}
You can adjust the size of the tables and the various scale factors to suit the required precision you want. Probably 512 or 256 binary-degrees per unit circle will be sufficient for this application.
Load the tables in the constructor, include all scale factors or offsets in the table calculations so the main grunt loops only have to do an array reference. Hard hack any table values that are indeterminate like R[0] and tan(pi/2) to give an overall acceptable result.
IanB
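As a plain floating-point reference for the table-driven approach, hue can be computed directly with std::atan2. The sketch below follows the same quadrant layout as the code in the post (angle measured from the +V axis towards +U) and could be used to fill a lookup table once in the constructor or to check a fixed-point version. The function name and the choice of 0 for grey pixels are assumptions.

```cpp
#include <cmath>
#include <cstdint>

// Reference hue in degrees, 0 <= hue < 360, independent of luma.
// The angle is atan2(u, v) on the signed (offset-removed) chroma values.
inline double HueDegrees(uint8_t U, uint8_t V) {
    const double kPi = 3.14159265358979323846;
    const double u = static_cast<double>(U) - 128.0;
    const double v = static_cast<double>(V) - 128.0;
    if (u == 0.0 && v == 0.0) return 0.0;   // grey: hue is undefined, pick 0
    double deg = std::atan2(u, v) * 180.0 / kPi;
    if (deg < 0.0) deg += 360.0;            // fold (-180, 0) into (180, 360)
    return deg;
}
```

Because the result depends only on U and V, a 256x256 table of precomputed hues would reduce the per-pixel work to a single array lookup.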
Didée
27th January 2005, 17:18
Hey defunkt, that doesn't look too bad indeed! Reminds me a little of VagueDenoiser.
Not much time for my usual blabla, so rather brief:
That little script works already good, but has its weaknesses with faint detail, dim areas, and dark frames. You know.
During today's lunchbreak, I fiddled a detail protection mask. Based on FineEdge() (some might know it from iiP), which has already good response to detail, and is quite robust against noise.
Modulated it so, that even weak detail gets masked strongly, while still being robust against noise. (I particularly like weak detail ;) )
Take a look at the following's output:
EdgeSense = 8
MoreNoise = false
o = last
ox = o.width
oy = o.height
o2 = MoreNoise ? o.DEdgeMask(0,255,0,255,"1 2 1 2 1 2 1 2 1",U=2,V=2) : o
edge = o2.FineEdge(EdgeSense)
edgemax = edge.expand.reduceby2().expand.bicubicresize(ox,oy,1.0,.0)
edge2 = yv12lutxy( edge,edgemax,
\ yexpr="x y / 255 * y y 32 + / * x x 64 + / * "+string(EdgeSense/2)+" *",
\ U=-128,V=-128)
#stackvertical(o,edge2)
interleave(o,edge2)
return( last )
#----------------------------------------
function FineEdge( clip clp, int "div" )
{ logic( clp.DEdgeMask(0,255,0,255,"8 16 8 0 0 0 -8 -16 -8", divisor=div)
\ ,clp.DEdgeMask(0,255,0,255,"8 0 -8 16 0 -16 8 0 -8", divisor=div)
\ , "max", Y=3,U=2,V=2 )
}
Adjust EdgeSense to your liking & the actual noise level. For stronger noise, activate "MoreNoise".
If you mix that mask with your "weight" mask through "logic("max")", then even more detail is retained.
***
However, I'd think of a (new) plugin that averages over big radii, where the weighting of each pixel (how much it may contribute to the averaging) is modulated as follows:
- the more similar a pixel's color is to the center pixel (from your initial idea), the more it's included
- the higher the edge value (of above's script), the less it's included
Rather complex, but basically should work out, I think.
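The two weighting rules above could be combined into a single per-neighbour weight along the following lines. This is a hypothetical sketch of the idea, not an existing plugin: the function name, the Gaussian fall-off and the tolerance value are all assumptions.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical weight in [0, 1] for one neighbour pixel.
// du, dv    : chroma difference between the neighbour and the centre pixel
// edge      : edge-mask value at the neighbour (0 = flat, 255 = strong edge)
// chromaTol : assumed tolerance controlling how fast similarity falls off
inline double NeighbourWeight(int du, int dv, uint8_t edge, double chromaTol = 8.0) {
    const double dist2 = static_cast<double>(du) * du + static_cast<double>(dv) * dv;
    // More similar colour -> closer to 1.
    const double similarity = std::exp(-dist2 / (2.0 * chromaTol * chromaTol));
    // Higher edge value -> closer to 0.
    const double flatness = 1.0 - edge / 255.0;
    return similarity * flatness;
}
```

The smoothed pixel would then be the sum of weight times value over the big radius, divided by the sum of the weights.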
kassandro
27th January 2005, 23:48
Originally posted by defunkt
store = sourceY[x * 2]
+ sourceY[(x + 1) * 2]
+ sourceY[(x * 2) + ystride]
+ sourceY[((x + 1) * 2) + ystride];
Shouldn't it be
store = sourceY[x * 2]
+ sourceY[x * 2 + 1]
+ sourceY[(x * 2) + ystride]
+ sourceY[(x * 2) + 1 + ystride];
instead?
for (int i = 0; i < 256; i++) {
if (u_count[i]) u_value[i] /= u_count[i];
if (v_count[i]) v_value[i] /= v_count[i];
}
To have true averaging, you should take
for (int i = 0; i < 256; i++) {
if (u_count[i]) u_value[i] /= (4*u_count[i]);
if (v_count[i]) v_value[i] /= (4*v_count[i]);
}
instead.
If you use true averaging as suggested above, then clipping is not necessary and you can replace
resultY[x] = __min((u_value[sourceU[x]] + v_value[sourceV[x]]), 255);
by
resultY[x] = (u_value[sourceU[x]] + v_value[sourceV[x]]) / 2;
Because no information is clipped away, the resulting mask should also be more informative.
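Putting the three corrections together (the 2x2 luma block, true averaging, and averaging the U and V results instead of clipping their sum), the whole computation can be sketched on plain buffers. This is a standalone illustration without AviSynth types; the function name and the tightly packed plane layout are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumes a YV12-style 4:2:0 layout: the Y plane is 2*cw x 2*ch, the U and V
// planes are cw x ch, and every plane is tightly packed (pitch == width).
// Returns a cw x ch mask.
std::vector<uint8_t> ChromaEdgeMask(const std::vector<uint8_t>& Y,
                                    const std::vector<uint8_t>& U,
                                    const std::vector<uint8_t>& V,
                                    int cw, int ch) {
    const int yw = cw * 2;
    uint32_t uCount[256] = {0}, uSum[256] = {0};
    uint32_t vCount[256] = {0}, vSum[256] = {0};

    for (int y = 0; y < ch; ++y) {
        for (int x = 0; x < cw; ++x) {
            const uint8_t* p = &Y[(2 * y) * yw + 2 * x];
            // The 2x2 luma block that shares this chroma sample.
            const uint32_t block = p[0] + p[1] + p[yw] + p[yw + 1];
            const uint8_t u = U[y * cw + x], v = V[y * cw + x];
            uCount[u]++; uSum[u] += block;
            vCount[v]++; vSum[v] += block;
        }
    }
    // True average per luma pixel: four luma samples per chroma sample.
    for (int i = 0; i < 256; ++i) {
        if (uCount[i]) uSum[i] /= 4 * uCount[i];
        if (vCount[i]) vSum[i] /= 4 * vCount[i];
    }
    // Both tables are now in 0..255, so their mean needs no clipping.
    std::vector<uint8_t> mask(static_cast<std::size_t>(cw) * ch);
    for (int i = 0; i < cw * ch; ++i)
        mask[i] = static_cast<uint8_t>((uSum[U[i]] + vSum[V[i]]) / 2);
    return mask;
}
```

Any scaling needed to make the mask usable can then be applied as a separate, explicit step.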
Originally posted by Didée
Mug, *did* you have a look at kassandro's RemoveGrain/Repair (http://www.removegrain.de.tf/) combo scripts, a.k.a. "RemoveDust()"? Quite powerful in itself, and get even better with some small additions ;)
(But the doc is a PITA to read)
If you have some suggestions for improving it, I truly would like to include them in the documentation. I'm planning a new release in about 10 days.
defunkt
31st January 2005, 09:54
Haven't had much joy dealing with the fluctuations in the mask my filter creates though I haven't had a lot of time to spend on it. Will keep plugging away at it as time allows and post again if it turns into something undeniably useful.
@IanB: Cheers, I'd never have worked that out by myself. Initial results using Hue not great but not exhaustively tested.
@Didée: That script produces a very interesting mask though it seemed to me to outline fine lines rather than mask the actual lines, and in the process reintroduces edge artifacts. Maybe some sort of logical AND with itself offset by say 1 pixel leaving only the bits in common?
@Kassandro: Quite right regarding the pixel offset. True averaging might preserve a more informative mask but in order to be of use the values would have to be scaled up, the most obvious method being a simple multiplication which will of course re-introduce clipping.
Thanks all for your input.
708145
31st January 2005, 14:17
I couldn't follow all of your discussion and I'm still an avs n00b.
I just wondered if blurring the mask would help :confused:
Some kind of small gaussian blur? Or do you already do that?
bis besser,
Tobias
joshbm
17th February 2005, 03:29
Yes, if you have followed at all the PAL Movie thread, they have talked about creating a "fake DOF". It appears that if we can utilize something like this to detect objects and their relative distance to the camera... voila! You have a nice mask for a fake DOF.
What would this do? For us DV to Big screen movie type enthusiasts this would create a wonderful image that has a similar depth of field to that of a movie camera, without the need to buy new Mini-35mm adapters for our DV cameras.
Quite exciting! If you can get a method for tracing objects and their distance, this definitely could be used as a "fake DOF".
Regards,
Joshbm
E-Male
17th February 2005, 03:53
well, as i see it, it wouldn't really detect the depth, but it doesn't have to
it detects the sharpness of areas, so we can just sharpen the sharp ones and blur the unsharp ones and so get closer to the film-like in-/out-of-focus effect
defunkt
17th February 2005, 05:57
I didn't actually do any more with this, mainly because I wasn't getting anywhere. But if you think it might be of some use you can download the filter, source & a short readme here:
http://www.ionwerks.net/defunkt/edgefreqmask.zip