non-ringing Lanczos scaling
NicolasRobidoux
6th June 2011, 03:04
Quick note: LBB has two versions; I used the less smooth one. LBB-Nohalo has two versions; I used the smoother one. In my limited testing, each is the best of the two alternatives.
madshi
6th June 2011, 08:24
clown:
standard Lanczos4: http://madshi.net/clownLanczos4.png
non-ringing Lanczos4: http://madshi.net/clownNonRingingLanczos4.png
modified ICBI: http://madshi.net/clownICBI.png
Progressive Refinement: http://madshi.net/clownPR.png
NNEDI3: http://madshi.net/clownNNEDI3.png
Imagiris: http://madshi.net/clownImagiris.png
lighthouse:
standard Lanczos4: http://madshi.net/lighthouseLanczos4.png
non-ringing Lanczos4: http://madshi.net/lighthouseNonRingingLanczos4.png
modified ICBI: http://madshi.net/lighthouseICBI.png
Progressive Refinement: http://madshi.net/lighthousePR.png
NNEDI3: http://madshi.net/lighthouseNNEDI3.png
Imagiris: http://madshi.net/lighthouseImagiris.png
@Nicolas, can you do the lighthouse in 400% instead of 200%, please?
NicolasRobidoux
6th June 2011, 19:04
...
@Nicolas, can you do the lighthouse in 400% instead of 200%, please?
Done. I put them at the same place (removed the 200% ones).
Note: LBBNohalo, for example, preserves more of the moire present in the house siding.
Does it make sense to you (or *.mp4_guy) that we actually do a 400% of the original lighthouse image (pre-downsampling with Lanczos 3)? It's available on the http://www.general-cathexis.com/interpolation/index.html site.
We could call them "biglighthouse" or something like that?
NicolasRobidoux
6th June 2011, 19:18
The results make me wonder:
1) How well Non-ringing Clamped EWA Jinc Lanczos 3 would turn out.
2) Whether Clamped EWA Jinc Lanczos 3 (or a non-ringing version) should replace LBB as finishing scheme for Nohalo subdivision.
madshi
6th June 2011, 19:18
@Nicolas, if you think it's worthwhile we can do that (biglighthouse, I mean).
@Anybody, is anybody able to provide 400% NNEDI3 screenshots of those 4 images (black&white, parking, clown, lighthouse)? I'm an AviSynth noob, unfortunately. The originals can be found in Nicolas' download links.
LoRd_MuldeR
6th June 2011, 19:38
@Anybody, is anybody able to provide 400% NNEDI3 screenshots of those 4 images (black&white, parking, clown, lighthouse)? I'm an AviSynth noob, unfortunately. The originals can be found in Nicolas' download links.
This should do the trick:
ImageSource("C:\Foo\Bar.png")
NNEDI3_rpow2(rfactor=4)
NicolasRobidoux
6th June 2011, 19:50
I've looked at the image which http://www.general-cathexis.com/interpolation/index.html states the "small" lighthouse was obtained from, namely http://www.general-cathexis.com/interpolation/BigLight.png, and it's pretty clear to me that this image too saw some significant processing. Too much haloing and aliasing, for example, to be a reasonably unprocessed image. It actually looks to me like it was obtained by downsampling with Lanczos 2 from an even larger original (or else, by some sort of smoothing followed by USM).
I still think it's a good test image, so I'll proceed to enlarge "biglighthouse" 4x anyway. But I wanted this fine print out in the open.
NicolasRobidoux
6th June 2011, 20:06
Here are the "biglighthouse" enlargements: http://web.cs.laurentian.ca/nrobidoux/misc/halodemo/biglighthouse/
madshi
6th June 2011, 20:12
Here's another good candidate for testing:
http://img69.imageshack.us/img69/6127/z3hs1n8ns1t.png
madshi
6th June 2011, 20:33
This should do the trick:
ImageSource("C:\Foo\Bar.png")
NNEDI3_rpow2(rfactor=4)
Thanks, that works.
NicolasRobidoux
6th June 2011, 21:26
Here are the "castle" enlargements: http://web.cs.laurentian.ca/nrobidoux/misc/halodemo/castle/.
(Note: The "castle" is another test image with haloing in the original.)
madshi
6th June 2011, 21:33
I've added NNEDI3 screenshots to all the previous comparisons from the last 1-2 days. And here are the castle and biglighthouse comparisons:
castle:
standard Lanczos4: http://madshi.net/castleLanczos4.png
non-ringing Lanczos4: http://madshi.net/castleNonRingingLanczos4.png
modified ICBI: http://madshi.net/castleICBI.png
Progressive Refinement: http://madshi.net/castlePR.png
NNEDI3: http://madshi.net/castleNNEDI3.png
Imagiris: http://madshi.net/castleImagiris.png
big lighthouse:
standard Lanczos4: http://madshi.net/biglighthouseLanczos4.png
non-ringing Lanczos4: http://madshi.net/biglighthouseNonRingingLanczos4.png
modified ICBI: http://madshi.net/biglighthouseICBI.png
Progressive Refinement: http://madshi.net/biglighthousePR.png
NNEDI3: http://madshi.net/biglighthouseNNEDI3.png
Imagiris: http://madshi.net/biglighthouseImagiris.png
NicolasRobidoux
6th June 2011, 21:38
Without a doubt, the "Robidoux" EWA filter is out of the running.
VSQBS is nice but too blurry, I would guess, for most people (although it may work well with "pixel art," noisy, or jpeg compressed images, and it's even cheaper than a bicubic scheme: it's a "clean" linear scheme with an 8 point stencil in 2D).
And I consistently find that LBB has more "jaggies" than LBB-Nohalo.
On my "side", this leaves (no surprise here) LBB-Nohalo and Clamped EWA Jinc Lanczos 3, the two schemes I presented at Libre Graphics Meeting 2011 (http://river-valley.tv/better-and-faster-image-resizing-and-resampling/).
NicolasRobidoux
6th June 2011, 21:45
Unless someone suggests more test pictures, I think call-in's open.
madshi
6th June 2011, 22:36
So here are my quality judgements. I'm leaving nip2vsqbs and standard Lanczos4 out of these judgements because nip2vsqbs is way too soft and Lanczos4 has way too many ringing artifacts.
black&white: NNEDI3 > ICBI > PR > EWA LanczosSharp > nonRingingLanczos4 > EWA Robidoux > LBB > LBB-Nohalo > Imagiris
parking: NNEDI3 > Imagiris = ICBI > nonRingingLanczos4 > PR > LBB-Nohalo > LBB > EWA LanczosSharp > EWA Robidoux
clown: NNEDI3 > Imagiris > ICBI > EWA LanczosSharp > nonRingingLanczos4 > PR > LBB-Nohalo > LBB > EWA Robidoux
lighthouse: NNEDI3 > ICBI > Imagiris > EWA LanczosSharp > nonRingingLanczos4 > EWA Robidoux > PR > LBB-Nohalo = LBB
castle: NNEDI3 > Imagiris > nonRingingLanczos4 > EWA LanczosSharp > PR > LBB = LBB-Nohalo > EWA Robidoux > ICBI
biglighthouse: NNEDI3 > Imagiris > ICBI > PR > EWA LanczosSharp > nonRingingLanczos4 > LBB-Nohalo > EWA Robidoux > LBB
Here is my opinion about those scaling algorithms one by one, sorted by quality:
(1) NNEDI3: clear winner for me. Not as sharp as Imagiris, but always artifact free, always aliasing free, always halo free, always smooth and pretty, big congrats to tritical
(2) Imagiris: best in sharpness, but too many artifacts to win, probably not a good choice for video, due to artifacts
(3) ICBI: somewhat comparable to NNEDI3 in looks, but a clear step down, also shows some minor artifacting sometimes
(4) EWA LanczosSharp + nonRingingLanczos4: EWA LanczosSharp has noticeably lower aliasing levels than nonRingingLanczos4, but has more halos and is a bit softer
(5) PR: weird algorithm, sometimes produces quite nice results, sometimes rather bad results, sometimes weird artifacts, not a good choice for video, due to artifacts
(6) LBB-Nohalo: Too much aliasing for my taste. But on the positive side no halos (surprise!)
(7) LBB: In many cases almost identical to LBB-Nohalo. Sometimes a tiny bit worse.
(8) EWA Robidoux: Too much aliasing and softer than LBB(-Nohalo).
Of course that's just my personal opinion. Anyone else willing to do a detailed judgement?
Malcolm
30th August 2011, 22:40
(1) NNEDI3: clear winner for me. Not as sharp as Imagiris, but always artifact free, always aliasing free, always halo free, always smooth and pretty, big congrats to tritical
imho: NNEDI3 has some artifacts in this region: http://img718.imageshack.us/img718/1903/lighthousennedi3crop.th.jpg (http://imageshack.us/photo/my-images/718/lighthousennedi3crop.jpg/)
madshi
31st August 2011, 08:45
imho: NNEDI3 has some artifacts in this region
Ok, have to agree with that. But then, these artifacts don't fall into the "nasty" category, IMHO. I consider the aliasing visible in most other screenshots worse.
Anyway, what is your preference? Which algorithm do you like best overall?
nevcairiel
31st August 2011, 13:18
Imagiris looks like some kind of oil-painting filter from Photoshop, horrible imho. A lot of detail looks just washed out. Quite obvious on the castle image, not as pronounced on big lighthouse.
How did you make it #2?
I agree with your #1, NNEDI3 seems to be the best in all of the sample images.
madshi
31st August 2011, 14:05
Imagiris looks like some kind of oil-painting filter from Photoshop, horrible imho. A lot of detail looks just washed out. Quite obvious on the castle image, not as pronounced on big lighthouse.
How did you make it #2?
Well, I do like the Imagiris sharpness, and aliasing is nicely low, too. Yes, sometimes it looks a bit like oil painting, that's true. But which algorithm would you select as #2? ICBI? It's not bad, but it does sometimes still show artifacts, too, and is much softer than Imagiris. Of course 400% is also rather big. The Imagiris oil painting look is less pronounced at lower scaling factors. But anyway, for my taste NNEDI3 is far ahead of everything else. I'd really like to know how the following algorithm compares, though:
http://www.resampling.narod.ru/
tritical
31st August 2011, 16:27
No enlargement method will be completely free of artifacts (including NNEDI3) simply because the missing information makes it ambiguous in some cases as to what the pixel value in the enlarged image should be. I do think NNEDI3 hides its artifacts well though... mainly because the output it predicts tends to look like image structure it saw during training, and its output does not change substantially for small changes of the input (which is a major problem with a lot of heuristic edge directed methods).
However, IMO using NNEDI3 for enlarging will give you an image that really should be sharpened.
The downsizing process for a gamma corrected image (in a good software implementation - and pretty close to what you get with a camera) is basically:
1.) undo gamma correction
2.) low pass filter (gaussian blur)
3.) apply gamma correction
4.) sample or interpolate
3 and 4 can be switched without significant change (none if simply sampling). So the inverse steps would be:
1.) enlarge with NNEDI3
2.) undo gamma correction
3.) attempt to invert the gaussian blur (sharpen)
4.) apply gamma correction
A while ago I actually trained some mixture of experts models to invert different amounts of gaussian blur (both in linear light and gamma corrected) in an attempt to improve NNEDI3 enlargements. It worked quite well. I can't remember why I never released it. The only downside was that the user would have to select between the different models (trained to invert different amounts of gaussian blur) and choose what looked best. There was no attempt to automatically choose which inverse to apply.
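A minimal sketch of the downsizing steps listed above, in Python, on a single row of values in [0, 1]. It assumes a plain power-law gamma of 2.2 rather than the exact sRGB curve, a fixed 2x factor, and edge replication at the borders; all function names are made up for illustration. The `sharpen_in_linear_light` helper covers steps 2 to 4 of the inverse process with an unsharp mask, which is only a crude stand-in for a real gaussian blur inversion and is not the trained models mentioned above. The NNEDI3 enlargement itself (step 1) is assumed to have been done already.

```python
import math

GAMMA = 2.2  # assumption: plain power law, not the exact sRGB transfer curve


def to_linear(v):
    """Undo gamma correction for a value in [0, 1]."""
    return v ** GAMMA


def to_gamma(v):
    """Apply gamma correction to a linear-light value, clamping to [0, 1] first."""
    return min(max(v, 0.0), 1.0) ** (1.0 / GAMMA)


def gaussian_kernel(sigma, radius):
    """Normalized 1D gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve(row, kernel):
    """1D convolution with edge replication."""
    radius = len(kernel) // 2
    last = len(row) - 1
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * row[j]
        out.append(acc)
    return out


def downsize_2x(row, sigma=1.0):
    linear = [to_linear(v) for v in row]                    # 1) undo gamma correction
    blurred = convolve(linear, gaussian_kernel(sigma, 3))   # 2) low pass filter
    encoded = [to_gamma(v) for v in blurred]                # 3) apply gamma correction
    return encoded[::2]                                     # 4) sample


def sharpen_in_linear_light(row, sigma=1.0, amount=1.0):
    linear = [to_linear(v) for v in row]                    # undo gamma correction
    blurred = convolve(linear, gaussian_kernel(sigma, 3))
    # Unsharp mask: add back the difference between the signal and its blur.
    sharpened = [v + amount * (v - b) for v, b in zip(linear, blurred)]
    return [to_gamma(v) for v in sharpened]                 # apply gamma correction
```

Here `sigma` and `amount` play the role of the user-selected blur amount: a wrong guess just gives a sharper or softer result.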
madshi
31st August 2011, 17:04
its output does not change substantially for small changes of the input
That's quite important.
However, IMO using NNEDI3 for enlarging will give you an image that really should be sharpened.
Agreed.
The downsizing process for a gamma corrected image (in a good software implementation - and pretty close to what you get with a camera) is basically:
1.) undo gamma correction
2.) low pass filter (gaussian blur)
3.) apply gamma correction
4.) sample or interpolate
3 and 4 can be switched without significant change (none if simply sampling). So the inverse steps would be:
1.) enlarge with NNEDI3
2.) undo gamma correction
3.) attempt to invert the gaussian blur (sharpen)
4.) apply gamma correction
A while ago I actually trained some mixture of experts models to invert different amounts of gaussian blur (both in linear light and gamma corrected) in an attempt to improve NNEDI3 enlargements. It worked quite well. I can't remember why I never released it. The only downside was that the user would have to select between the different models (trained to invert different amounts of gaussian blur) and choose what looked best. There was no attempt to automatically choose which inverse to apply.
Sounds interesting. What happens if you use wrong blur parameters?
I've had two thoughts about how NNEDI3 could eventually be improved:
(1) NNEDI3 is trained to find "missing" lines (mainly for deinterlacing), correct? And when "enlarging" the image, you don't modify the original pixels, you just find new pixels for the "missing" lines, correct? But in a downsampled image every pixel has already been filtered in some way. So wouldn't it be better to rewrite *all* output pixels, instead of just filling in the missing pixels? Shouldn't that allow you to produce sharper results?
(2) Have you thought about trying some different criteria during NNEDI3 training? E.g. it seems that your new version using the absolute error works better than the squared error. Here my thinking is that the human eye is most sensitive to edges, but neither absolute nor square error take that into account. Wouldn't it make sense to find an error measurement which puts more priority on edge quality than on just the average error? Maybe you could try using things like "PSN&ER", as described here:
The base of the suggested algorithm (M-spline) lies in application of the new PSN&ER metrics (peak to peak signal to noise and edges ratio) which significantly more accurate (compared with PMS metrics) agrees with the visual valuation of similarity.
Does that make any sense to you?
NicolasRobidoux
31st August 2011, 17:16
I believe that many people consider MAE (mean absolute error a.k.a. l1 or L1 error) to be a better measure than MSE/PSNR/RMS (a.k.a. l2 or L2) error in image processing.
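To make the difference concrete, here is a small Python sketch (the sample values are invented for illustration). Both test signals have the same total absolute error, but in one it is spread evenly and in the other it sits in a single sample. MAE cannot tell them apart, while the squared-error measure punishes the concentrated error much harder.

```python
import math


def mae(reference, test):
    """Mean absolute error (L1)."""
    return sum(abs(r - t) for r, t in zip(reference, test)) / len(reference)


def rmse(reference, test):
    """Root mean squared error (L2)."""
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference))


reference = [0.0] * 16
spread = [0.1] * 16          # small error on every sample
spike = [0.0] * 15 + [1.6]   # same total absolute error, all in one sample

print("spread:", mae(reference, spread), rmse(reference, spread))
print("spike: ", mae(reference, spike), rmse(reference, spike))
```

Which behaviour matches perceived quality better is a separate question; the sketch only shows how the two measures weigh errors differently.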
madshi
31st August 2011, 17:23
Interesting, didn't know that.
tritical
31st August 2011, 18:24
(2) Have you thought about trying some different criteria during NNEDI3 training? E.g. it seems that your new version using the absolute error works better than the squared error. Here my thinking is that the human eye is most sensitive to edges, but neither absolute nor square error take that into account. Wouldn't it make sense to find an error measurement which puts more priority on edge quality than on just the average error? Maybe you could try using things like "PSN&ER", as described here:
I've thought about it before. In the past I've tried SSIM. The biggest problem is that for online training - the only way to be able to train on lots of data in a reasonable amount of time - I need a quickly differentiable error metric that is independent between patterns. That essentially limits me to Lp norms (absolute error, squared error, etc...). It would be possible to use a second metric for fine tuning after a good initial solution is found, and that optimization could be done in a batch manner without needing derivatives. However, I have not found a metric that improves significantly over L1. One thing to remember is that the prediction network is only trained on parts of the image that pass the pre-screener... so effectively it is concentrating on edge areas.
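As a rough illustration of why Lp norms suit online training, here is a Python sketch with a toy one-weight model. It is not the NNEDI3 training code, and every name in it is hypothetical. The point is only that the loss and its derivative depend on a single pattern at a time and cost a couple of arithmetic operations, so the weight can be updated after each pattern.

```python
def l1_loss_and_grad(prediction, target):
    """Absolute error and its derivative with respect to the prediction."""
    error = prediction - target
    sign = (error > 0) - (error < 0)  # subgradient 0 when the error is exactly 0
    return abs(error), float(sign)


def l2_loss_and_grad(prediction, target):
    """Squared error and its derivative with respect to the prediction."""
    error = prediction - target
    return error * error, 2.0 * error


def sgd_step(weight, x, target, loss_and_grad, learning_rate=0.1):
    """One online update of the toy model prediction = weight * x."""
    prediction = weight * x
    _, dloss_dpred = loss_and_grad(prediction, target)
    return weight - learning_rate * dloss_dpred * x  # chain rule: d(pred)/d(weight) = x
```

A metric like SSIM, by contrast, is computed over windows of neighbouring pixels, so it does not decompose into independent per-pattern terms this simply.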
The "PSN&ER" reference also caught my attention when I read the page you linked to, but they don't explain anything about it - and a search with google turned up nothing. One metric that I have looked into is PSNR-HVS-M. So far I have not done this 'fine tuning' of the NNEDI3 models using a more complex metric. It is a good idea though.
tritical
31st August 2011, 18:33
(1) NNEDI3 is trained to find "missing" lines (mainly for deinterlacing), correct? And when "enlarging" the image, you don't modify the original pixels, you just find new pixels for the "missing" lines, correct? But in a downsampled image every pixel has already been filtered in some way. So wouldn't it be better to rewrite *all* output pixels, instead of just filling in the missing pixels? Shouldn't that allow you to produce sharper results?
I don't think NNEDI3 should modify the original pixels. It is trained to recreate missing pixels at the same scale and sharpness as the input (and it is trained in gamma corrected values). It essentially corresponds to exactly what step 1 of the enlargement process should be IMO. I do, however, think the original pixels (and the predicted pixel values) should be modified when performing the blur inversion (sharpening).
Sounds interesting. What happens if you use wrong blur parameters?
Nothing really. You would just end up with a sharper or blurrier version of the image. It is basically user preference anyway. In a metric evaluation, choosing the incorrect one would result in objectively worse quality than choosing the right one. In my experience, current objective metrics are quite horrible for judging image enlargement (shrink an image (usually using a very poor method), enlarge, compare to original).
madshi
31st August 2011, 18:55
Ah, that's interesting!
Looking forward to what you'll be cooking up next! :)
NicolasRobidoux
31st August 2011, 18:58
I've thought about it before. In the past I've tried SSIM...
My student Adam Turcotte and I have devised an improved SSIM which is smoother with respect to variations in image size, alignment and boundary features and which runs much much faster. It otherwise gives pretty much the same answer.
We have unpublished code which relies on the VIPS library. It is lightning fast. (For one thing, we figured out how to eliminate two of six filtering steps.)
We may be interested in working with you to tweak it for training use.
You can access preliminary versions through my ohloh web site:
www.ohloh.net/accounts/NicolasRobidoux
(We're unfortunately behind in putting stuff on the web/packaging things so they are useable by others, but we could up the priority.)
NicolasRobidoux
31st August 2011, 19:18
Then again, if you are only using SSIM on small patches of fixed size <= 256x256 or so which are not the result of downsampling and are not too small, our improvements probably don't matter.
IanB
31st August 2011, 22:31
(2) Have you thought about trying some different criteria during NNEDI3 training? E.g. it seems that your new version using the absolute error works better than the squared error. Here my thinking is that the human eye is most sensitive to edges, but neither absolute nor square error take that into account. Wouldn't it make sense to find an error measurement which puts more priority on edge quality than on just the average error? ...
A cheapish implementation could be to scale the absolute error values by the 1st derivative of the image, e.g. E*(mx'+c); the current model corresponds to m=0 and c=1. I would start with conservative values like m=1/16th.
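A minimal 1D sketch of that weighting in Python. It assumes that x' means the magnitude of a central difference taken on the reference (target) row, that pixel values are on a 0 to 255 scale, and that rows have at least two samples; the function names are invented for illustration.

```python
def gradient_magnitude(row):
    """|x'| by central differences, one-sided at the two ends (needs len(row) >= 2)."""
    n = len(row)
    grads = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        grads.append(abs(row[hi] - row[lo]) / (hi - lo))
    return grads


def weighted_abs_error(reference, predicted, m=1.0 / 16.0, c=1.0):
    """Mean of E * (m * x' + c), where E is the absolute error per sample."""
    grads = gradient_magnitude(reference)
    total = 0.0
    for r, p, g in zip(reference, predicted, grads):
        total += abs(r - p) * (m * g + c)
    return total / len(reference)
```

With m=0 and c=1 this reduces to the plain mean absolute error; with m>0 the same miss costs more next to an edge than in a flat area.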
Thoughts?
NicolasRobidoux
31st August 2011, 22:39
Note: If I remember correctly, MAE already gives more weight to "boundary tracking" errors (near rapid changes) than RMSE.
NicolasRobidoux
31st August 2011, 22:44
Opinion: Already in the current NNEDI, the most noticeable errors come from creating sharp interfaces where there are none. Given that derivative computations are greatly affected by noise, my guess is that NNEDI will not be improved by using weights involving gradients, at least not on "unclean" natural images.
Rumbah
4th September 2011, 14:37
Would it be possible to release the training program and some interesting work units and let the community use their horse power to execute the training?
That way you can experiment with different settings without having to run the training itself as it seems very time consuming. I guess a lot of people would help and contribute to get a "better" nnedi.
redfordxx
4th November 2011, 19:35
I read only a few posts here, not all, so I don't know if this contributes something or whether it is already used, known or implemented.
Recently I was interested in resizing and/or sharpening and generally got annoyed by ringing... so I got an idea.
Here:
Ringing occurs when the second derivative changes sign (convex changes into concave)
So if that could be limited, there will be no more ringing.
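One way to read this idea, sketched in Python on a single row of pixels: count how often the discrete second difference flips sign. The function names and the sample rows are invented for illustration. Note that even a clean edge has one such flip (its inflection point), so the extra flips are what would indicate ringing.

```python
def second_difference(row):
    """Discrete second derivative at the interior samples."""
    return [row[i - 1] - 2 * row[i] + row[i + 1] for i in range(1, len(row) - 1)]


def curvature_sign_changes(row, tolerance=1e-9):
    """Count places where the second difference flips sign (convex <-> concave)."""
    changes = 0
    previous = 0
    for d in second_difference(row):
        sign = (d > tolerance) - (d < -tolerance)
        if sign != 0:
            if previous != 0 and sign != previous:
                changes += 1
            previous = sign
    return changes


clean_edge = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
ringing_edge = [0.0, 0.0, 0.05, -0.1, 1.1, 0.95, 1.0, 1.0]  # over- and undershoot

print(curvature_sign_changes(clean_edge), curvature_sign_changes(ringing_edge))
```

The `tolerance` parameter hints at the noise problem: on noisy data the second difference flips sign everywhere, so some threshold or pre-smoothing would be needed.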
I also tried to make a script which does it, but got stuck on another thing: noise...:-(
NicolasRobidoux
29th November 2011, 03:14
As documented in http://forum.doom9.org/showthread.php?p=1541448#post1541448, I have produced sharper versions of the Robidoux, Jinc-windowed Jinc 3-lobe and 4-lobe EWA (Elliptical Weighted Averaging) filters for ImageMagick. I have posted the results under the names RobidouxSharp.png, JincJinc3blur0p88549061701764.png and JincJinc4blur0p88451002338585141.png alongside my earlier results in the folders found at http://web.cs.laurentian.ca/nrobidoux/misc/halodemo/.
madshi
29th November 2011, 10:04
Hmmmmm... The new images are definitely sharper, but I'm still not sure if I prefer them over your old LanczosSharp. The new images have noticeably more ringing and more aliasing in some of the test images. E.g. check out the roofs on the "biglighthouse" and "castle" images. Your old LanczosSharp method shows some ringing there, but at least it's relatively faint and not aliased. Both new methods show stronger and more aliased ringing there. The wood wall in the left house in "biglighthouse" also shows a lot more ringing (if what I see there is ringing) compared to the old LanczosSharp image. In the "parking" image, one of your new LanczosSharp variations shows much stronger ringing at the left side of the parking case, the other one weaker ringing, compared to your original LanczosSharp implementation. Weird. Both new algorithms have more ringing and aliasing artifacts within most of the texts in the "parking" image. Your original LanczosSharp produced more natural and artifact free images, IMHO, but also softer.
Just my 2 cents, of course. And to be honest, I've no clue about the math behind your old and new algorithms. So I can only compare the image results with my eyes.
What do you think?
*.mp4 guy
29th November 2011, 12:28
Most of the test images are more aliased than would generally be expected. This results in a marked bias in favor of more aggressive filters; other images would likely show the opposite result.
NicolasRobidoux
29th November 2011, 14:02
Hmmmmm... The new images are definitely sharper, but I'm still not sure if I prefer them over your old LanczosSharp.
What do you think?
@madshi: Thank you very much for your feedback Mathias.
-----
In a ("mathematical") sense, the new Jinc-windowed Jinc filters are the sharpest Jinc-windowed Jinc EWAs possible with the given number of lobes. No surprise, they are more aliased.
I agree that with this collection of test images, this is a Goldilocks situation: "classic" LanczosSharp appears to be a bit too soft, and the new ones too sharp.
It turns out that I had done my preliminary testing with a group of images which had little or no haloing and were generally softer. With those, the new sharper versions were, to my eyes, right at the edge of "too sharp." With this group of test images, they appear to me to have crossed the line. I myself find them too aliased. And I don't like that the existing halos are sometimes noticeably amplified.
What this suggests to me is that the "best Jinc-windowed Jinc EWAs" have a support that sits between the "new" and the "old." In http://imagemagick.org/discourse-server/viewtopic.php?f=22&t=19636&start=30#p78406, I suggested rescaling the Jinc-windowed Jinc EWA filters so that the radius is exactly the number of lobes, which is basically the case for orthogonal Sinc-windowed Sinc Lanczos. I don't have a mathematical justification for this choice. But this produces a scheme which sharp Goldilocks may approve of without any of the halo-free bears complaining too loudly.
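A sketch in Python of what that rescaling amounts to numerically, assuming the unscaled n-lobe filter ends at the n-th positive zero of Jinc, i.e. of J1(pi*x), and that the rescaling is a single multiplicative factor on the radius. The Bessel function is evaluated from its standard integral representation so that only the standard library is needed; the function names are made up for illustration.

```python
import math


def bessel_j1(x, steps=400):
    """J1(x) = (1/pi) * integral over [0, pi] of cos(t - x*sin(t)) dt, trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for k in range(1, steps):
        t = k * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def jinc_zero(k, iterations=60):
    """k-th positive zero of J1(pi*x), bracketed in (k, k + 0.5), refined by bisection.

    The bracket was checked for the small k used here (1 to 4).
    """
    lo, hi = float(k), k + 0.5
    f_lo = bessel_j1(math.pi * lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j1(math.pi * mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def radius_equals_lobes_scale(lobes):
    """Factor that shrinks the support from the lobes-th Jinc zero down to lobes."""
    return lobes / jinc_zero(lobes)


for lobes in (3, 4):
    print(lobes, jinc_zero(lobes), radius_equals_lobes_scale(lobes))
```

For three lobes the factor comes out at roughly 0.926, which sits between 1 (no rescaling) and the sharper blur value that appears in the JincJinc3 file name.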
This particular group of test images made me realize that, for example, the 4-lobe version is not as good across the board as I thought.
madshi
29th November 2011, 15:06
@*.mp4 guy, you may be right, I'm not sure. Here on doom9 we're usually more concerned about video than still photo, and video is often softer than these still photos. But then there's a big difference between DVDs and good Blu-Rays. Some Blu-Rays are very detailed and would probably also result in aliasing and definitely in haloing with too sharp resampling filters.
@Nicolas, have you thought about adding some anti-ringing logic to your resamplers? E.g. if you compare your EWA Lanczos results (any of them) of the "castle" image to my "non-ringing Lanczos", mine has noticeably less ringing at the roof. Maybe if you used a similar technique in your algorithms you could improve quality further? Just a thought, though... Personally, I'm tending to go totally non-linear at the moment. I think we're at the limit of what "simple" linear resampling can do. If we want another significant jump in image quality, I think non-linear algorithms will be the way to go. Which is confirmed by the NNEDI3 results, which I still find noticeably superior to any other algorithm tested in this thread. As a start I'm planning to look into improving my non-ringing Lanczos algorithm, maybe by adjusting the resampling weights to match the edge directions or something like that. I'll probably use a more trial-and-error approach instead of science and math, though, because I'm not too great in the latter areas... ;)
NicolasRobidoux
29th November 2011, 15:53
...
@Nicolas, have you thought about adding some anti-ringing logic to your resamplers? E.g. if you compare your EWA Lanczos results (any of them) of the "castle" image to my "non-ringing Lanczos", mine has noticeably less ringing at the roof. Maybe if you used a similar technique in your algorithms you could improve quality further? Just a thought, though... Personally, I'm tending to go totally non-linear at the moment. I think we're at the limit of what "simple" linear resampling can do. If we want another significant jump in image quality, I think non-linear algorithms will be the way to go. Which is confirmed by the NNEDI3 results, which I still find noticeably superior to any other algorithm tested in this thread. As a start I'm planning to look into improving my non-ringing Lanczos algorithm, maybe by adjusting the resampling weights to match the edge directions or something like that. I'll probably use a more trial-and-error approach instead of science and math, though, because I'm not too great in the latter areas... ;)
@Mathias:
There is a lot here.
Initial answer:
It's only been about a year (I believe) that anyone got a truly good EWA scheme working. So, I'm still exploring linear EWA. One way of modifying weights is by changing the windowing function. In quick experiments documented on the ImageMagick forums, I found that Jinc is probably the best windowing function for Jinc. But I'm not sure. And it does lead to lots of haloing sometimes.
What seems to be tricky is to side step the zero sum. Weights/kernels can be modified to reduce haloing (say), but then some other annoying artifact sticks its head out.
My novel schemes LBB and LBB-Nohalo are nonlinear. LBB is specifically designed to prevent haloing. LBB really can be described as bicubic with very strong ringing suppression. LBB-Nohalo does that, and attempts to straighten/sharpen diagonal lines. But they don't work well with images which contain haloing to start with, and they are aliased with "old skool CG." A Master's student of mine (Chantal Racette) is defending this Friday, with a core component of the thesis a discussion of these two methods. (She actually was involved in the initial ImageMagick fixing of EWA, but this part of her work did not really make it into her thesis.)
I have, on the back of an envelope, a less "halo averse" bicubic method which should be an improvement on LBB. A more balanced scheme if you wish. It is nonlinear, although less so than LBB. I have high hopes for this one. It is based on a biquadratic method discussed in the thesis which works fairly well in 1D but is not very good in 2D.
NNEDI3 is simply amazing. Although I wonder how it does with video, because it's clear that it "invents" attractive pieces where other schemes would add artifacts, and it would not surprise me that this type of strong nonlinearity would make details "jump" from one frame to the next. I also, personally, do not particularly like its "glossiness". Naturally messy things like grass or conifer branches don't look natural to me. It's almost as if NNEDI3 creates a "plastic world" ("I'm a Barbie girl, in a Barbie world..."). It makes me think of what comes out of using a Jensen filter.
I do wonder if a nonlinear face or vertex split subdivision method which explicitly takes into account what we "expect" to see at higher resolution with a collection of archetypal data patches could almost match NNEDI3. But I have no time for that now.
I can use math to guide me, but math is limited in the guidance it provides. My opinion is that designing a good image resampling method is a craft. I trust my eyes more than numbers alone. I do find that there are some mathematical properties that provide useful guidance or tuning values. But this whole thing still appears to me hugely complicated, and I don't have a clear vision of what is "the" way.
Good luck. Keep documenting what you try, please.
madshi
29th November 2011, 16:12
NNEDI3 is simply amazing. Although I wonder how it does with video, because it's clear that it "invents" attractive pieces where other schemes would add artifacts, and it would not surprise me that this type of strong nonlinearity would make details "jump" from one frame to the next.
NNEDI3 was originally invented for deinterlacing, so video is the primary purpose. And it seems to work quite well there. The problem you're anticipating can definitely happen with certain algorithms but it seems NNEDI3 is not one of them.
I also, personally, do not particularly like its "glossiness". Naturally messy things like grass or conifer branches don't look natural to me.
NNEDI3 decides for which parts of the image to use the NNEDI3 algorithm and for which parts simple Bicubic. Usually grass and trees end up being upscaled with Bicubic. I've once tried forcing NNEDI3 on for all pixels and it didn't look good for grass and trees at all, IMHO. It looked more fractal like than natural there. The real purpose of the Bicubic switch was to save performance. Luckily it also seems to avoid many of those fractal artifacts. Other non-linear algorithms have similar problems. E.g. this one easily beats NNEDI3, from what I can see:
http://resampling.narod.ru/
But it doesn't look good for grass and trees, either. I think these kinds of problems should be solvable, though. Grass and trees are usually image areas with a lot of random directions. It should be possible to detect such areas and use a different algorithm for them.
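One possible detector, sketched in Python: the coherence of the structure tensor over a patch. This is a swapped-in textbook technique, not something any algorithm in this thread is known to use, and the function name and test patches are made up. The value is close to 1 when the gradients in the patch share one dominant direction (a clean edge) and close to 0 when they point in all directions, as they tend to in grass or foliage.

```python
import math


def orientation_coherence(patch):
    """Structure-tensor coherence (l1 - l2) / (l1 + l2) of a 2D patch, in [0, 1]."""
    h, w = len(patch), len(patch[0])
    jxx = jyy = jxy = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = 0.5 * (patch[y][x + 1] - patch[y][x - 1])  # horizontal gradient
            gy = 0.5 * (patch[y + 1][x] - patch[y - 1][x])  # vertical gradient
            jxx += gx * gx
            jyy += gy * gy
            jxy += gx * gy
    trace = jxx + jyy
    if trace == 0.0:
        return 0.0  # perfectly flat patch: no direction at all
    return math.sqrt((jxx - jyy) ** 2 + 4.0 * jxy * jxy) / trace


# A vertical edge: every gradient points the same way.
edge_patch = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
# A radially symmetric bowl: gradients point in all directions equally.
bowl_patch = [[float((x - 4) ** 2 + (y - 4) ** 2) for x in range(9)] for y in range(9)]

print(orientation_coherence(edge_patch), orientation_coherence(bowl_patch))
```

A resampler could then blend towards a plain linear filter where the coherence is low; the threshold for "low" would have to be tuned by eye.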
I believe this "unnatural" or "fractal" like look of grass/trees usually happens if you have an algorithm which tries to find edge directions in the image and interpolate them "correctly". Doing so often works *great* for most image content, just not for mother nature (grass, trees).
I can use math to guide me, but math is limited in the guidance it provides. My opinion is that designing a good image resampling method is a craft. I trust my eyes more than numbers alone. I do find that there are some mathematical properties that provide useful guidance or tuning values. But this whole thing still appears to me hugely complicated, and I don't have a clear vision of what is "the" way.
Yeah, I think logical thinking and good ideas might work better for image resampling than pure math. Oh well, probably a good combination of both would be ideal. Maybe I should polish my math up... :o
Keep documenting what you try, please.
Will do. You, too, please.
NeuralSpline
30th May 2012, 17:04
Dear experts, I would like to invite discussion of the results of my NeuralSpline2 algorithm and a comparison with other methods. Here is its result on the "lighthouse" test image:
http://resampling.narod.ru/image/lighthouse_x4.jpg
madshi
31st May 2012, 15:11
Hi Vitaly,
your algorithm looks quite promising, but I wish I could test it myself. Maybe you could make a test exe available which would allow us to test your algorithm with any images we like? You could watermark the upscaled images, if you're afraid of misuse.
FWIW, compared to NNEDI3 your algorithm seems to produce noticeably sharper results. You also get a couple edges with difficult angles reproduced better than NNEDI3. On the other hand, your algorithm also produces a couple more artifacts compared to NNEDI3. I think you improved the problem with false geometrical/fractal structures in random image areas (like grass) compared to earlier versions of your algorithm? I still see a slight tendency towards that, though. E.g. there's a geometrical pattern to the grain in the cloud on the right side of the light tower top in your image.
NeuralSpline
31st May 2012, 17:19
Hi, madshi!
I read earlier about the problem with grass and trees. My algorithm has no fallback to bicubic. From a mathematical point of view, grass and similar structures come close to white noise. My algorithm's output is the result of training on a large number of test images. The answer to the input data is the most probable solution. For such difficult structures there can be so many possible solutions that the most probable one looks blurry. Conversely, tuning the algorithm to produce a beautiful solution is self-deception, and the solution will be unstable. When upscaling video, it would then feel as if there were a lot of ants in the grass. I really do spend a lot of time on this problem, but restoring lost information is very, very difficult.
There are still small artifacts, but this is still an early version and it is far from ideal. Work continues. I will present a new version of the NeuralSpline2 algorithm soon.
There is no demo version because I have no time for it, though I understand that it would be needed.
If you have interesting tests, I can show my algorithm's results on them, to hear the critical remarks that I need.
madshi
31st May 2012, 17:57
It's interesting that your algorithm is also using a trained neural network. My impression about how your algorithm works was quite different. So it seems your algorithm is somewhat similar in concept to NNEDI3? What I can say is that NNEDI3 switches to Bicubic for some parts of the image. It does that to save performance, but without that switch to Bicubic, NNEDI3 would produce quite bad (artificial) looking grass.
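To illustrate the general idea of such a switch (not NNEDI3's actual prescreener, which is itself a neural network), here is a rough 1-D sketch. It doubles a row of samples, uses a cheap Catmull-Rom cubic midpoint wherever the four neighbouring samples are nearly flat, and only calls the expensive predictor elsewhere. The names `double_row`, `cubic_midpoint`, `predictor` and the `threshold` value are all assumptions made for this example.

```python
def cubic_midpoint(p0, p1, p2, p3):
    # Catmull-Rom cubic evaluated halfway between p1 and p2.
    return (-p0 + 9.0 * p1 + 9.0 * p2 - p3) / 16.0

def double_row(row, predictor, threshold=8.0):
    """Insert one new sample between each pair of neighbours.

    Returns (new_row, costly_calls). `predictor` stands in for an
    expensive interpolator taking the same four neighbours.
    """
    out = []
    costly_calls = 0
    n = len(row)
    for i in range(n - 1):
        out.append(row[i])
        p0 = row[max(i - 1, 0)]          # clamp at the left border
        p1, p2 = row[i], row[i + 1]
        p3 = row[min(i + 2, n - 1)]      # clamp at the right border
        # Prescreen: a nearly flat neighbourhood takes the cheap path.
        if max(p0, p1, p2, p3) - min(p0, p1, p2, p3) < threshold:
            out.append(cubic_midpoint(p0, p1, p2, p3))
        else:
            out.append(predictor(p0, p1, p2, p3))
            costly_calls += 1
    out.append(row[-1])
    return out, costly_calls
```

With a row that is flat except for one step, only the samples around the step reach the expensive predictor, which is where the performance saving comes from.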
Ok, if you're interested, I've marked a few problem areas in your image. For your comparison, here is NNEDI3 compared to your algorithm:
NNEDI3: http://madshi.net/biglighthouseNNEDI3.png
NeuralSpline2: http://madshi.net/lighthouse_x4.jpg
Let me comment on the "red circles" I've drawn into your image:
(1) NNEDI3 is clean here (although soft), while your algorithm shows artifacts.
(2) Same here.
(3) The thickness of the "dark lines" is not stable with your algorithm. It's more stable with NNEDI3, but of course NNEDI3 is once again much softer.
(4) There appear to be diagonal geometrical structures (in both directions) in your image.
(5) The window bar goes a bit zig-zag, it's not a straight line. It looks a bit better (but softer) in the NNEDI3 image.
(6) Your algorithm tries to find hard lines in the left dark part of the vertical wood slats, which are not correct/natural.
Not sure if that is of any help.
NeuralSpline
31st May 2012, 21:57
Thanks, madshi!
My algorithm solves the inverse problem and is tuned for minimum RMSE. Where can I learn about the NNEDI3 algorithm?
Point 1 will be solved soon; it is a shortcoming because the algorithm is not final.
Points 2, 5 and 6: my algorithm is very sensitive to texture, and here it tries to find something. Noise is probably the reason for that. NNEDI3 isn't as sensitive to texture or to noise. I don't know where the optimum sensitivity value is.
Point 4: yes, it is possible to see this in the grass. It means that in a small window, details at angles of 45 and 0 degrees are the most noticeable, and intermediate angles are masked. This problem isn't solved by adjusting the sensitivity, because that increases artifacts; it is a physical limit. But I am thinking about it.
Point 5 is a real problem and needs more thought... so far I have no idea how to identify such areas in order to look for the optimum solution for them.
I hope that in the final version the majority of the problems will be solved.
madshi
31st May 2012, 22:25
NNEDI3 is an AviSynth filter which is available with source code licenced under the GPL here:
http://web.missouri.edu/~kes25c/
tritical
8th June 2012, 09:02
I've actually been working on nnedi4 for a while... it makes some changes to the prescreener/predictor network architectures of nnedi3. I've also been working on a supervised deconvolution filter called nnsharpen which uses a neural network to undo different types of distortions such as Gaussian blur, square blur, circular blur, and the blurring associated with different types of downsizing and then enlargement by nnedi4. It can work in gamma corrected or linear light. I've put example images up at http://bengal.missouri.edu/~kes25c/nnedi4/. Here is a 4x enlargement of the original lighthouse image: http://bengal.missouri.edu/~kes25c/nnedi4/lh_4x_nnedi4_nnsharpen-3.png. Since the deconvolution filter is supervised it requires the user to pick the appropriate distortion type and strength.
madshi
8th June 2012, 10:03
Hi tritical,
hmmmmm... I'm not totally sure whether I prefer NNEDI3 or NNEDI4. I can see some improvements in NNEDI4, e.g. the bottom roof line of the left house in lh_4x is noticeably better with NNEDI4 compared to NNEDI3. But then there are also cases where NNEDI3 looks better to me. E.g. the fonts in the "parking" image look clearly better to me with NNEDI3. The castle looks better (fewer artifacts) with NNEDI4 again. I had those horizontal weave artifacts with NNEDI3 there, which NNEDI4 doesn't seem to have anymore. But then probably the parameters you're using are different from those I was using? So I'm not sure if the improvements in the castle image are due to algorithm improvements or due to different parameters?
The NNEDI4 big lighthouse with sharpening to me seems to have an overall comparable quality to NeuralSpline2's image. NeuralSpline is able to get some edges better and also is able to get many edges *very* sharp, but it also shows more artifacts, which is a significant disadvantage. It seems to me, though, that your nnsharpen algorithm tends to add a quite noticeable amount of ringing to the image (easily visible in "parking" and "castle"). I think you really need to work on that. Maybe a simple delimiting as used by LimitedSharpen as a post processing step would help?
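The limiting idea can be sketched roughly as follows: clamp every sharpened sample to the range spanned by its neighbours in the unsharpened image, optionally allowing a small overshoot. This is a 1-D sketch under that assumption only; it is not the LimitedSharpen script itself, and the name `limit_to_local_range` and its parameters are made up for this example.

```python
def limit_to_local_range(source, sharpened, radius=1, overshoot=0.0):
    """Clamp each sharpened sample to the local min/max of `source`.

    `source` is the image before sharpening, `sharpened` the image after.
    `overshoot` widens the allowed range on both sides.
    """
    n = len(source)
    out = []
    for i in range(n):
        window = source[max(i - radius, 0):min(i + radius + 1, n)]
        lo = min(window) - overshoot
        hi = max(window) + overshoot
        out.append(min(max(sharpened[i], lo), hi))
    return out
```

With `overshoot=0.0` no sample can leave the range of its unsharpened neighbourhood, so halos next to an edge are cut off completely, at the cost of some perceived sharpness; a small positive overshoot is a compromise.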
> Since the deconvolution filter is supervised it requires the
> user to pick the appropriate distortion type and strength.
Will that stay this way for the final version? That would mean that nnsharpen will mostly be useful for video reencoding where people can play with the settings to find the optimal parameters for every movie. It would not be as useful for real life playback situations (where users generally don't tend to adjust settings all the time)? Or what do you think?
Thanks!
tritical
8th June 2012, 18:24
Differences between nnedi3 and nnedi4 are due to algorithm changes, the parameters are exactly the same. The major changes were actually in the prescreener (mainly speed related), not in the predictor so there isn't a lot of difference between nnedi3/nnedi4 in terms of actual output. However, in all of the objective tests I've performed nnedi4 always beats nnedi3 in terms of ssim/mae/mse/gradient-similarity. Personally, I either prefer nnedi4 over nnedi3 or am indifferent in all of these images. On the parking image I actually prefer nnedi4 if I had to choose one.
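For reference, the simpler error measures named above (MAE, MSE, and the RMSE mentioned earlier in the thread) can be computed against a ground-truth image as in the sketch below. It assumes two grayscale images of equal size stored as lists of rows; `error_metrics` is a made-up name, and SSIM and gradient similarity are left out because they need considerably more code.

```python
import math

def error_metrics(reference, candidate):
    """Return MAE, MSE and RMSE between two equally sized images."""
    diffs = [
        c - r
        for ref_row, cand_row in zip(reference, candidate)
        for r, c in zip(ref_row, cand_row)
    ]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n   # mean absolute error
    mse = sum(d * d for d in diffs) / n    # mean squared error
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}
```

Lower is better for all three. They only apply when a ground truth exists, e.g. when the test image was produced by downsampling a larger original.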
On the ringing, it is mainly a matter of which downsampling method the deconvolution is based on. I can generate an nnsharpen'd version of the castle image with much less ringing (by assuming lanczos3, lanczos4, etc... downsample instead of box average downsample), but it is noticeably less sharp in other areas of the picture. Since the small castle image already has a good amount of ringing it is likely that a sharp kernel was used somewhere in the process of getting it to that size. However, for all of these enlargements I simply tried out all of the downsampling choices and picked the one I thought looked best. Others might prefer a different setting. It also might be possible to get better results using traditional sharpening methods, but I have not tried.
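One way to see why the assumed downsampling kernel matters: a box average has only non-negative weights, while a Lanczos kernel has negative lobes, and negative lobes are what produce overshoot next to sharp edges. The sketch below computes normalized tap weights for a 2x downsample with both kernels. It is only an illustration of the standard kernel definitions; `tap_weights` and `box` are names made up here, and none of this comes from nnsharpen.

```python
import math

def lanczos(x, a=3):
    # Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, else 0.
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def box(x):
    # Box average over one output pixel.
    return 1.0 if abs(x) < 0.5 else 0.0

def tap_weights(kernel, support, factor):
    """Normalized weights of the source pixels feeding one output pixel.

    Offsets are measured in output pixels from the output pixel centre.
    """
    offsets = [(k + 0.5) / factor
               for k in range(-support * factor, support * factor)]
    raw = [kernel(d) for d in offsets]
    total = sum(raw)
    return [w / total for w in raw]
```

`tap_weights(box, 1, 2)` gives a plain average of the two nearest source pixels, while `tap_weights(lanczos, 3, 2)` contains several negative entries.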
> Will that stay this way for the final version? That would mean that nnsharpen will mostly be useful for video reencoding where people can play with the settings to find the optimal parameters for every movie. It would not be as useful for real life playback situations (where users generally don't tend to adjust settings all the time)? Or what do you think?
This is an issue for any filter (denoise, sharpen, etc..) with parameters isn't it? Best you can do is choose defaults that work alright for most cases.
madshi
8th June 2012, 18:54
> This is an issue for any filter (denoise, sharpen, etc..) with parameters isn't it? Best you can do is choose defaults that work alright for most cases.
Originally you said "it requires the user to pick the appropriate distortion type and strength". That sounded as if nnsharpen wouldn't work well at all if the user doesn't pick the right parameters. If it is possible to set "defaults that work alright for most cases" for nnsharpen, then there should be no problem.