
View Full Version : New plugin TNLMeans


AVIL
19th May 2006, 08:24
Hi all,

A new plugin is born (silently?).

Name: TNLMeans.
Age: v1.0 Beta 1
Parents: Tritical
Job: 3D Denoiser

I'm surprised. First by the quality of the denoising. Second by the absence of news about it on this forum. I advise trying it. My only complaint: it's slow.

Thanks Tritical.

foxyshadis
19th May 2006, 12:13
I read the papers linked to the filter on his site (http://www.missouri.edu/~kes25c/), and it looks like an awesome little denoiser. I always prefer to know how something works and comparisons to existing methods before I dig in. Anyway, I'm interested regardless of speed and I'll report back once I've given it a whirl. I know Tritical has been quite busy lately though, cool that he's managed to make this.

buzzqw
19th May 2006, 12:13
wow...

where ?

BHH

AVIL
19th May 2006, 12:52
At buzzqw:

on his site (http://www.missouri.edu/~kes25c/),

buzzqw
19th May 2006, 12:54
ok, found on Tritical's site! thanks foxyshadis

just one note... it's very, very slow :eek: (from 9.3 fps down to 0.7... and it seems to break multithreading in x264... the second core is nearly asleep now... not sure what's going on there)

BHH

foxyshadis
19th May 2006, 13:24
It doesn't break multithreading; it's just that the x264 part of the processing (the second core) is only doing 10% of what it was, because fetching the avisynth frame is so much slower. If you used the MT plugin, you'd probably see both cores at full steam (and an amazing 1.2 fps, heh). I guess that's why it's labeled beta, probably mostly unoptimized. xD

Oh, and if you think it's slow now, try enabling temporal processing (az=1+). Three times as slow at radius 1 as at radius 0. =D

tritical
19th May 2006, 19:06
Yeah, I didn't post anything about this filter because for the moment it is really too slow to be usable, though the method does produce nice results. The reason it is so slow is that for every pixel in the frame you do a convolution of size (2*Ax+1)*(2*Ay+1) (the search window), and for every pixel in the search window you have to calculate the gaussian-weighted sum of squared differences of its support window (Sx*2+1)*(Sy*2+1) to the center pixel's support window. In the paper they suggest using a search window of 21x21 (Ax=10,Ay=10) and a support window of 7x7 (Sx=3,Sy=3)... which takes about 100 seconds per frame for a 720x480 frame on my comp (1.6 GHz Pentium M).

In the paper they suggest two methods to speed the process up: 1.) process blocks instead of single pixels, and 2.) a multiscale version that uses Ax=Ay=1 at the original resolution and Ax/2, Ay/2, Sx/2, Sy/2 at 1/2 the original resolution. I have implemented the block-based approach (Bx/By), and it does cut computation time by roughly a factor of (2*Bx+1)*(2*By+1). However, the results really start to suffer with Bx/By greater than 2. The multiscale version should speed things up by about 16x, but I haven't implemented it yet and don't know how much it impairs the output. Possibly the combination of a multiscale version and Bx=By=1 could get it into the fps range instead of the spf range.
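For anyone curious, the per-pixel procedure tritical describes can be sketched in a few lines of NumPy (naive and unoptimized, with edge clamping; everything besides the ax/ay/sx/sy/h/a parameter names is illustrative, not the plugin's actual code):

```python
import numpy as np

def nlmeans_pixel(img, x, y, ax=2, ay=2, sx=1, sy=1, h=1.5, a=1.0):
    """Naive NL-means for one pixel: for every pixel in the
    (2*ax+1)x(2*ay+1) search window, compute the gaussian-weighted SSD
    between its (2*sx+1)x(2*sy+1) support window and the center pixel's
    support window, turn that into a weight, and average."""
    hgt, wid = img.shape
    # gaussian weights over the support window (std controlled by 'a')
    ys, xs = np.mgrid[-sy:sy + 1, -sx:sx + 1]
    gw = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * a * a))
    gw /= gw.sum()

    def support(cx, cy):
        # clamp indices so windows near the frame edge still exist
        rows = np.clip(np.arange(cy - sy, cy + sy + 1), 0, hgt - 1)
        cols = np.clip(np.arange(cx - sx, cx + sx + 1), 0, wid - 1)
        return img[np.ix_(rows, cols)]

    ref = support(x, y)
    wsum = vsum = 0.0
    for j in range(max(0, y - ay), min(hgt - 1, y + ay) + 1):
        for i in range(max(0, x - ax), min(wid - 1, x + ax) + 1):
            ssd = float((gw * (support(i, j) - ref) ** 2).sum())
            w = np.exp(-ssd / (h * h))
            wsum += w
            vsum += w * img[j, i]
    return vsum / wsum
```

On a flat frame every weight is 1 and the pixel comes back unchanged; the per-pixel cost is exactly the search-window times support-window product described above, which is why the full frame takes minutes.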

Fizick
19th May 2006, 20:06
tritical's filters are always high quality.
Has he beaten my early fft3d in slowness here? :)
I will try it, thanks.

But I really like this:

TO DO LIST:

- add multiscale version
- Release a v1.0 final


Interesting, which of tritical's filters will be the first "final v1.0" filter. :)

tsp
19th May 2006, 20:19
tritical: What about using an FFT to speed up the convolution? Something like the true gaussian function in VariableBlur?

tritical
21st May 2006, 13:41
The weights for the search window around each pixel are dependent on the local image structure and aren't known until the gaussian weighted ssd for the neighborhood of each pixel inside the search window is calculated. Therefore, the fft approach wouldn't help (I think anyways). I did realize one simple optimization for the non-block based method which takes advantage of the fact that the weights are symmetric (if pixel A is assigned weight y in pixel B's window then pixel B will also be assigned weight y in pixel A's window). Therefore, as you process the image you can buffer results and cut the computation time in half. I also implemented a multi-scale version and it works reasonably well. I had to make some adjustments to the method in the paper because it tended to mess up fine details/lines... so the overall speed increase isn't quite as much as expected. Some timings on my comp for a 720x480 frame with ax=ay=10 and sy=sx=3:

no multiscale, bx=by=0: ~69 seconds
no multiscale, bx=by=1: ~15 seconds
no multiscale, bx=by=2: ~5 seconds
no multiscale, bx=by=3: ~3 seconds
multiscale, bx=by=0: ~10 seconds
multiscale, bx=by=1: ~2 seconds
multiscale, bx=by=2: ~1 second

For comparison, tbilateral with diameterL and diameterC set to 21 takes about 1.5 seconds per frame on my comp.

Anyways, I'll hopefully have a new version up in a few days with the multiscale option and the 2x speed-up for bx=by=0. As far as results go, I would say that nl-means is superior to bilateral filtering for zero-mean gaussian noise, but it seems to be less effective than bilateral filtering for typical mpeg compression artifacts.
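The symmetric-weight buffering tritical mentions is easiest to see in a toy 1-D version (single-pixel differences stand in for the support-window SSDs, and all names are illustrative): each unordered pair of positions is evaluated once, and the weight is credited to both accumulators, halving the distance computations.

```python
import numpy as np

def nlmeans_1d_symmetric(sig, rad=3, h=1.0):
    """Toy 1-D NL-means exploiting w(a,b) == w(b,a): visit each
    unordered pair once and credit the weight to both pixels."""
    n = len(sig)
    wsum = np.ones(n)                 # each pixel weights itself by 1
    vsum = sig.astype(float).copy()
    for a in range(n):
        for b in range(a + 1, min(n, a + rad + 1)):   # only pairs with b > a
            w = float(np.exp(-((float(sig[a]) - float(sig[b])) ** 2) / (h * h)))
            wsum[a] += w; vsum[a] += w * sig[b]
            wsum[b] += w; vsum[b] += w * sig[a]       # reuse the same weight
    return vsum / wsum
```

The same bookkeeping carries over to 2-D: as the image is scanned, each computed SSD is buffered and applied to both pixels' windows.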

Mug Funky
21st May 2006, 14:21
i can't seem to access the site. uni of missouri (huh, rhymes) gives me a 404 and invites me to check out their athletics program among other things :(

could some kind soul mirror this so i, and similarly download-handicapped lurkers, could try it out? sounds promising.

Wilbert
21st May 2006, 15:03
@tritical,

Is your homepage down?

edit: i should have read Mug's post :)

Serbianboss
21st May 2006, 16:10
Any mirror for download?

Backwoods
21st May 2006, 18:43
4th on the request.

buzzqw
21st May 2006, 19:07
sorry i have now only the dll
www.64k.it/andres/TNLMeans.dll

BHH

( i hope tomorrow to recover the full zip file)

Guest
21st May 2006, 19:10
This link works fine for me:

http://www.missouri.edu/~kes25c/

Maybe it just came back up.

tritical
21st May 2006, 22:54
There was a power problem of some kind on campus yesterday which affected a few computer systems, so that might be the reason it was down. Also, if anyone is trying the filter, a few faster settings to try would be:

tnlmeans(ax=3,ay=3,sx=2,sy=2,bx=1,by=1) or
tnlmeans(ax=4,ay=4,sx=3,sy=3,bx=2,by=2)

those should give pretty good results without being unbearably slow (ax=3,ay=3,sx=2,sy=2,bx=1,by=1 will probably be the new defaults). The 'b' parameter that I added to limit the minimum amount of weight that the center pixel can be given seems to be set too low, and impedes more than it helps. Try setting it to a really large value ~1000000.0 to disable its effect.

Isochroma
23rd May 2006, 04:36
Thanks tritical, this is a very interesting filter, and I'm sure it will improve rapidly...

Here are some nice screenshots made with virgin transport streams @ 1920x1080p.

For the fft3d versions, I used:

FFT3DFilter(sigma=4,plane=4)

and for the TNL versions:

tnlmeans(ax=3,ay=3,az=10,sx=2,sy=2,bx=1,by=1)

-------------------------------------------------
Gladiator frame 8408, original (http://chromasubs.com/Misc/NLMeans/Gladiator-8408.png)
Gladiator frame 8408, fft3d (http://chromasubs.com/Misc/NLMeans/Gladiator-8408-fft3d-1.png)
Gladiator frame 8408, TNL (http://chromasubs.com/Misc/NLMeans/Gladiator-8408-TNL-1.png)
Gladiator frame 8408, Neat Image (http://chromasubs.com/Misc/NLMeans/Gladiator-8408-NI-2.png)

Its biggest problem is areas that are not finely detailed, i.e. vague/blurry. These areas are blurred a lot, kind of like what Neat Image does. Check these images. Note how it retains detail nicely, but the vague area in the lower-right corner gets severely blurred:

Gladiator frame 41923, original (http://chromasubs.com/Misc/NLMeans/Gladiator-41923.png)
Gladiator frame 41923, fft3d (http://chromasubs.com/Misc/NLMeans/Gladiator-41923-fft3d.png)
Gladiator frame 41923, TNL (http://chromasubs.com/Misc/NLMeans/Gladiator-41923-TNL.png)
Gladiator frame 41923, Neat Image (http://chromasubs.com/Misc/NLMeans/Gladiator-41923-NI-2.png)

The image could be broken down into blocks and those blocks split until similarity yields enough advantage for cross-window averaging to have maximum profitability with minimum penalty.

Small pixel windows could be expanded until decreasing similarity makes further expansion unprofitable.

The window is compared to other windows located on a rectangular grid defined by window width/height? Overlap or exhaustive search might yield significantly better matches, even if its range is shorter.

Use motion vectors to calculate rotations and other non-shift translations between frames, then apply before window search, could create much more and better same- and cross-frame matches.

tritical
23rd May 2006, 10:23
Have you tried decreasing the h parameter? The h parameter controls the strength of the denoising (higher = stronger, lower = weaker). The default of 3.0 is definitely way too much for most sources.

The image could be broken down into blocks and those blocks split until similarity yields enough advantage for cross-window averaging to have maximum profitability with minimum penalty.

Small pixel windows could be expanded until decreasing similarity makes further expansion unprofitable.
I don't understand exactly which windows you are talking about. The search window (ax,ay), support window (sx,sy), or base window (bx,by)? Generally, bx=by=0 (so that the base window is a single pixel) with ax/ay as large as possible will yield the best results. The size of the support window and the standard deviation 'a' of the gaussian weights for the ssd calculation are where there is no easy rule. The larger the support window, the better fine texture will be preserved, but less noise will be removed. Likewise, the smaller 'a' is, the better fine texture will be preserved, but less noise will be removed. When bx/by are > 0, I've coded it so that the weights for the support window are all 1.0 for pixels up to the base window size and then start to drop off according to the 'a' parameter. So using bx/by > 0 is somewhat like using bx=by=0 with 'a' set to a larger value.
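One plausible reading of that support-window weight profile, as a sketch (the exact falloff inside TNLMeans may differ; this just illustrates the flat-center-plus-gaussian-tail shape described above):

```python
import numpy as np

def support_weights(sx, sy, bx, by, a):
    """Support-window weights: 1.0 for offsets inside the base window,
    gaussian falloff (std 'a') for the distance beyond it."""
    ys, xs = np.mgrid[-sy:sy + 1, -sx:sx + 1]
    # how far each offset lies beyond the base window edge (0 inside it)
    dx = np.maximum(np.abs(xs) - bx, 0)
    dy = np.maximum(np.abs(ys) - by, 0)
    return np.exp(-(dx ** 2 + dy ** 2) / (2.0 * a * a))
```

With bx=by=0 this degenerates to a plain gaussian, which is why larger bx/by behaves like a larger 'a'.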

The window is compared to other windows located on a rectangular grid defined by window width/height? Overlap or exhaustive search might yield significantly better matches, even if its range is shorter.
I assume you are talking about the base window and bx or by being greater than 0? In this case the frame is indeed split into blocks, but when comparing neighboring blocks to be averaged, the comparisons are done to all possible blocks of that size inside the search window, not just to other bx/by blocks. Though in Beta 1 there was a bug that caused it to miss 2*Bx+2*By blocks (increase ax/ay by the values of bx/by as a workaround).

Use motion vectors to calculate rotations and other non-shift translations between frames, then apply before window search, could create much more and better same- and cross-frame matches.
It could, but it would make the current version look like the fastest filter ever :).

Isochroma
23rd May 2006, 17:17
Thanks for the reply!

I tried increasing the ax/ay parameters; they did increase the denoising, but only slightly when going from 3 to 6, at the expense of ax*ay time.

It was far more effective to increase the az parameter, but beyond 10 frames further noise decreases were also only slight.

It seems this filter is off to a really good start, and I like its low memory usage. Some optimization could yield better speed without any quality loss, such as caching window matches between frames and within the same frame. After all, it uses 1/4 the memory of fft3d, so it can still bloat up significantly and remain quite usable.

I read all three papers and was very impressed with the entropy-inverse post-denoise image comparisons for the various denoisers. It struck me as excellent that the TNL-means algorithm left an almost random EI-PDI. This gives me hope that your implementation can improve to reach the quality level demonstrated in the paper. Specifically, they mention that noise removal doesn't harm fine detail, and from the nature of the EI-PDIs, I would assume that it can be the best possible denoiser. Since denoisers remove high frequencies more than low ones, there is no particular reason why they must be so hard on vague areas, which contain few high frequencies.

Last night I also tested Neat Image at medium and aggressive noise-reduction settings on the two images above. Check the added links in the previous post. The aggressive setting was indistinguishable from the TNL-means version and took about 1/8th the time to generate. Neat Image uses a highly optimized wavelet transform to remove noise, and has been independently rated as the best still-image denoiser.

Alain2
23rd May 2006, 18:23
those should give pretty good results without being unbearably slow (ax=3,ay=3,sx=2,sy=2,bx=1,by=1 will probably be the new defaults)
True it's faster, but it's clearly not as good as current default settings, at least on anime I tried on..

Dreassica
23rd May 2006, 19:00
True it's faster, but it's clearly not as good as current default settings, at least on anime I tried on..

But at speeds like 0.17 fps it's practically impossible to make encodes with it. Better to wait for more optimised versions. For filtering a few pictures, though, it's perfectly workable.

Isochroma
23rd May 2006, 20:48
Very strange... vague areas especially are shifted, yes, spatially shifted, like a halo that shifts. This does not occur at all in the fft3d images. Just flick between these new sample images and you'll see the effect. There is also a lot of smearing in vague areas.

Gladiator, frame 2565 (http://chromasubs.com/Misc/NLMeans/Gladiator-2565.png)
Gladiator, frame 2565, fft3d (http://chromasubs.com/Misc/NLMeans/Gladiator-2565-fft3d.png)
Gladiator, frame 2565, TNL (http://chromasubs.com/Misc/NLMeans/Gladiator-2565-TNL.png)

Yes, they are the same frame!

Some points to note about these images:

1. Upper background 'halos' and the halo is shifted to the left.

2. Peak brightness decreases significantly; notice the highlights in the grass center-bottom and especially bottom left corner. In the original and fft3d images, they are a light yellow, while in TNL they have darkened to brown.

3. Ringing is removed. Check the top edge of the hand, bright area. fft3d does not touch the ringing, TNL removes it completely.

AVIL
28th May 2006, 11:42
Applying the TNLMeans filter to a video with a 748x564 frame size (mod 4), VirtualDub can't load the script. The error that arises is:

Resize: YV12 must be multiple of 4

After adding borders up to 768x576, the script runs flawlessly.

tritical
29th May 2006, 01:04
For multiscale it does require mod 8, since it uses avisynth's internal resizers to create the half-scale image. Even though multiscale defaults to false, I seem to have messed up the check that decides whether or not to create the resized clip, and it ends up always being created unless "ms=false" is explicitly specified. I'll put up a fixed version later today (the actual filter operation is not affected by the bug).

Egh
3rd July 2006, 01:12
Looks like this filter is the first one from tritical to be released as "1.0 final". Note that 1.0.1 was released shortly afterwards ^^

gizmotech
27th July 2006, 19:31
Well, I'll say this: the filter has very nice results. Unfortunately, the number of artifacts it generates is so staggering at low settings that I wonder if it will ever be usable.

tritical, do you have any further plans for development on the filter?

Gizmo.

tritical
27th July 2006, 21:11
What settings are you using that generate lots of artifacts?

Atm I don't plan on developing TNLMeans any further. The main reason being that there are already better denoising methods that make use of the same general idea as nl-means... so I would probably just work on developing a filter around one of those. To me the most interesting is using the block matching idea in the transform domain (dft for instance). The general idea is to break the frame into overlapping blocks, for each block you find a set of blocks that are similar after performing a 2d transform with hard or soft thresholding. You then stack those similar 2d blocks and perform a 3d transform with hard or soft thresholding and then perform the inverse 3d transform to get estimates of the pixel values at each location. Of course that is leaving out some of the details... here's a link to a paper describing this idea http://www.cs.tut.fi/~foi/3D-DFT/BM3DDEN_article.pdf. Their results show that their method beats the exemplar-based method which is basically a much more complicated version of nl-means. I made a filter based on that paper, but for some reason the results I got were never that great so I stopped working on it.

Fizick
27th July 2006, 22:40
Some great results and info about speed from this article:

At http://www.cs.tut.fi/~foi/3D-DFT, we provide a collection of the original and denoised test images that
were used in our experiments, together with the algorithm implementation (as C++ and MATLAB functions)
which produced all reported results. With the mentioned parameters, the execution time of the whole algorithm
is less than 9 seconds for an input image of size 256 x 256 on a 3 GHz Pentium machine.


A 3D fft is good, of course ;) but it is not true 3D.
The article applies to photos.
We have video; why must we work per-frame?

But some time ago I did think (a little :)) about how to apply the MVTools block-matching denoising method to single-frame copies :)

tritical
27th July 2006, 23:11
That method could be extended to use other frames in exactly the same way that nl-means can be. Specifically, you can just extend the search area in which you are looking for similar 2d blocks to include other frames. That way you don't need to include any motion-compensation... of course motion compensation could help in the case that the search area is not wide enough to cover the amount of motion.

MfA
27th July 2006, 23:39
How well does the algorithm work if you use the plain block SSD instead of the gaussian weighted one? There are many more ways of speeding up the former.

Fizick
28th July 2006, 16:14
The same authors have method for video.
http://www.cs.tut.fi/sgn/lf3d/video/

But the algorithm kills too many details IMHO.

Egh
28th July 2006, 22:48
It would be interesting if some guru (like tritical) developed a denoiser/enhancer filter *specially* designed for anime-style material. We have a variety of filters, but most of them are not suited to anime-style video.

I'd say that for the last year or so TNLMeans is the only new generic filter which is good for anime (other filters are either not new, or not really good for anime).

gizmotech
29th July 2006, 00:24
720x480 video stream

ax3,ay3,h=0.3

Generates small 2x2 blocks of solid white or black at random, all over the video source. ax2 and ay2 w/ h0.5 also produced a similar result.

It seemed to occur along intersecting edges of strong colors, where a whitish edge would come in contact w/ a hard black, or a solid red would come in contact w/ a solid green. This was tested on anime.

Other than the speed (and these random artifacts) I had no problems w/ the filter. 0.67 fps isn't that slow when you think about how I used to have to achieve this level of fine-detail cleaning in the past.

Gizmo.

Alain2
29th July 2006, 01:43
It would be interesting if some guru (like tritical) developed a denoiser/enhancer filter *specially* designed for anime-style material. We have a variety of filters, but most of them are not suited to anime-style video.

I'd say that for the last year or so TNLMeans is the only new generic filter which is good for anime (other filters are either not new, or not really good for anime).
I would say that for anime content, motion-compensated denoisers are very good; maybe the approach is more than a year old, I'm not sure, but it's been really improved within the last year, with scripts like removenoisemc (for film content it maybe kills too much detail, depending on the source).
And frfun7 is also really well adapted to anime content, not to mention fft3dfilter, which is adapted to everything.

foxyshadis
29th July 2006, 02:16
CelForground/CelBackground is what I've been pining for ever since the prereleases showed up. The former is exceedingly effective, but at one frame a second it isn't terribly usable (plus the background gets mulched without CelBackground). ;_;

Frfun7 tends to smear thin lines, but I still use it.

tritical
29th July 2006, 04:24
gizmotech, could you try this build TNLMeans.dll (http://bengal.missouri.edu/~kes25c/TNLMeans.dll) and see if it fixes the artifact problem.

gizmotech
30th July 2006, 00:28
Will do,

I'll throw it into a huffy over night (yay 5 hours for 4k frames :P)

Gizmo.

Update:
With an updated setting (trying to speed it up a bit) I haven't seen any of the previous errors. I will be running the huffy later tonight.

gizmotech
30th July 2006, 14:25
We're all good! Updated dll seems to have fixed the artifacts.

Thanks :)

tritical
31st July 2006, 02:22
k, I'll put up a new version with the changes. What was happening was that some of the calculated weights for unique blocks would get extremely small (right at the limit of double precision), but wouldn't be zero. That caused some instability when the final pixel values were calculated.

MfA
2nd August 2006, 01:56
Just like with Bilateral filtering and Susan denoising NL-means has a very close cousin ... TLS denoising (http://www.accidentalmark.com/research/papers/Hirakawa05TLSDenoiseICASSP.pdf). Although in this case it was just independent invention, rather than insufficient knowledge of what came before.

tritical
2nd August 2006, 20:48
Thanks for the link, I'll definitely take a look. For the moment I decided to just make a simplified version of the bm3ddft idea. It first breaks the image into overlapping blocks of nxn. For each block it checks the similarity of all blocks within the search area (Ax x Ay x Az) based on sse with subtracted means (not gaussian weighted). It keeps the best 'nb' blocks with a normalized difference less than 'md' (nb and md are adjustable parameters). It then performs a 3d dft on those blocks and does wiener filtering on the resulting coefficients. Finally, it does the inverse transform and gives a weight to each pixel based on the distance from the center of the nxn window and the total variance. All of those estimates/weights are then stored, and, after all blocks have been processed, the final pixel values are computed. So far the results seem pretty good.
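Two of the pieces tritical mentions (the mean-subtracted SSE block distance, and Wiener filtering of the transform coefficients) might look roughly like this as a sketch; the actual filter's formulas may differ:

```python
import numpy as np

def block_dist(a, b):
    """SSE with subtracted means: insensitive to a constant brightness
    offset between the two blocks."""
    a = a - a.mean()
    b = b - b.mean()
    return float(((a - b) ** 2).sum())

def wiener_shrink(coef, sigma):
    """Empirical Wiener shrinkage of transform coefficients: scale each
    coefficient by |c|^2 / (|c|^2 + sigma^2), so coefficients near the
    noise level are attenuated and strong ones pass almost unchanged."""
    p = np.abs(coef) ** 2
    return coef * p / (p + sigma ** 2)
```

In the scheme described above, block_dist would drive the selection of the best 'nb' blocks under 'md', and wiener_shrink would be applied to the 3d dft coefficients of the stacked group before the inverse transform.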

MfA
4th August 2006, 12:50
I'm not too enamored with transform-domain linear filtering... it can't handle outliers very well; you have to be very conservative with that md.

How well does it do on curved edges in cartoons/anime?

For a pixel based approach ... I think you could use local geometric moments (maybe invariants, so you can make use of symmetries) to quickly find potential matches, and then use bilateral filtering with those matches. I might give it a try.

tritical
4th August 2006, 21:52
It handles curved edges alright, but overall it is nothing great. I think nl-means could still be improved by adaptively setting the std of the gaussian for sse based on whether the current pixel is in a uniform or non-uniform area. Some example pics:

original image:
http://bengal.missouri.edu/~kes25c/orig.png

tnlmeans(h=2.2,ax=10,ay=10,bx=0,by=0,sx=3,sy=3,a=1.0)
http://bengal.missouri.edu/~kes25c/nl1.png

tnlmeans(h=2.2,ax=10,ay=10,bx=0,by=0,sx=3,sy=3,a=1000.0)
http://bengal.missouri.edu/~kes25c/nl2.png

and for comparison, dfttest(sigma=0.005,max2dblocks=25,max2ddiff=100.0,bsize=16,sa=10)
http://bengal.missouri.edu/~kes25c/dft1.png

MfA
5th August 2006, 00:33
NL doesn't do so well where the wires cross the pole.

Terka
27th March 2008, 23:27
any news about TNLMeans?

sho_t
30th September 2008, 18:36
A while ago, I asked tritical by email to make a "TNLMeans on GPU".
But his answer was: "I don't have the motivation or the time to do it. Maybe in the future."
If it is not impolite, could someone make it, please?

Please forgive my poor English. This is my first post on the Doom9 forum; I hope that I didn't make a mistake.

Nightshiver
30th September 2008, 20:34
You didn't, but Terka just gravedug a 2 year old thread.

Dogway
27th May 2010, 22:46
Hello. Normally I use tnlmeans with MT with no problems, but I switched on multiscale to speed things up and it didn't work. Is it incompatible with multithreading? Can someone confirm this? Thanks.
Ah, by the way, just out of curiosity, is multiscale the same as halving ax/ay, or is there more to it than that?

kedautinh12
11th April 2021, 03:49
New version: TNLMeans 1.1 by pinterf
https://github.com/pinterf/TNLMeans

Emulgator
13th April 2021, 22:32
Wow, thanks pinterf, again and again!