NNEDI - intra-field deinterlacing filter - Page 12

Dark Shikari · 29th January 2009, 23:28

Also, what about using a metric other than mean squared error? SSIM might be a good one to try for, or perhaps something like x264's psy-RD metric.

MfA · 30th January 2009, 15:18

There is no need to simultaneously optimize interpolation and texture synthesis; unlike encoding, there is no gain to be had from reusing artefacts as texture. You can always add texture in a separate pass.

Dark Shikari · 30th January 2009, 19:22

Quote:

Originally Posted by MfA

There is no need to simultaneously optimize interpolation and texture synthesis

Why not?

MfA · 31st January 2009, 20:19

Because "you can always add texture in a separate pass". With H.264, if the noise doesn't get encoded then all you have left is a smoothed result. If slightly misaligning edges etc. allows you to maintain some texture and get a better-looking picture, that's what you do ... because there is no better alternative. With interpolation you have the luxury of adding texture afterwards, so you can concentrate first on making the optimal interpolator for features which can be well predicted (mostly edges).

Weighted predictors are not great texture synthesizers anyway.

tritical · 4th February 2009, 07:03

@Dark Shikari
I have tried some other metrics, but with the current algorithm and number of data points it really has to be something completely independent of other pixels (or at least independent of pixels in other clusters) so that the weights for each cluster can be learned separately. SSIM can be computed for a single pixel change, but it doesn't work very well in my experience.

The latest idea I've been working with is to switch from a bunch of separate non-linear regression problems to one classification problem. In other words, switch from learning interpolation functions for a given set of groupings (the clusters learned through k-means combined with euclidean distance metric) to learning the grouping function for 'n' sets of linear interpolation weights.

I initialize k-means as usual (to get the initial groupings), but instead of regrouping the pixels based on euclidean distance to the mean of each cluster, I calculate the linear least squares interpolation coefficients for each cluster. I then reassign pixels to the cluster whose interpolation coefficients give the minimum squared error for that pixel, and keep repeating that until convergence (overall mse stops dropping significantly). I've found that it only needs ~16-32 sets of coefficients to get very nice results (I used about 5 million points from my 740 frame video to cluster, then tested it on the whole video by choosing the best set of weights for each pixel). Now, the problem becomes creating a classifier to choose which set of weights to use.
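
The loop described above (fit least-squares coefficients per cluster, reassign each pixel to the set with the smallest squared error, repeat until the MSE stops dropping) can be sketched roughly as follows. This is only an illustration, not tritical's code: the function name, the random-label initialisation (the post uses ordinary k-means for that step), and the stopping tolerance are all assumptions.

```python
import numpy as np

def fit_clusterwise_linear(X, y, n_sets=16, n_iter=50, tol=1e-6, seed=0):
    """Alternate per-cluster least squares with reassignment by squared error.

    X: (n_samples, n_features) neighbourhood vectors (hypothetical layout)
    y: (n_samples,) true values of the pixels being interpolated
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumption: random initial labels stand in for the k-means initialisation.
    labels = rng.integers(0, n_sets, size=n)
    W = np.zeros((n_sets, d))
    prev_mse = np.inf
    for _ in range(n_iter):
        # Step 1: linear least-squares coefficients for each current cluster.
        for k in range(n_sets):
            idx = labels == k
            if idx.sum() >= d:  # skip clusters too small to fit; keep old weights
                W[k], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        # Step 2: squared error of every coefficient set on every sample.
        err = (X @ W.T - y[:, None]) ** 2          # shape (n, n_sets)
        labels = err.argmin(axis=1)                # best set per sample
        mse = err[np.arange(n), labels].mean()
        # Step 3: stop once the overall MSE no longer drops significantly.
        if prev_mse - mse < tol:
            break
        prev_mse = mse
    return W, labels, mse
```

Note that the returned `labels` are oracle assignments: they use the true pixel value `y`, which is not available at filter time. That is why the remaining problem is a classifier that predicts the label from `X` alone.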

MfA · 4th February 2009, 19:15

A problem with MSE (and SSIM) is that it heavily punishes outliers, which is fine for a quality metric ... but not good in a classifier.

PS. I find it curious you chose to optimize interpolation without simultaneously optimizing the classifier (ie. optimizing the interpolation weights with an oracle classifier). I would have expected you to optimize both at the same time. What classifier did you end up using?
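
The point about outliers can be made concrete with a toy calculation. The error values below are made up purely for illustration; the sketch only shows how much of the total cost a single large error accounts for under a squared versus an absolute penalty.

```python
import numpy as np

def outlier_share(errors, power):
    """Fraction of the total |error|**power contributed by the largest error."""
    cost = np.abs(np.asarray(errors, dtype=np.float64)) ** power
    return cost.max() / cost.sum()

errors = [1.0, 1.0, 1.0, 1.0, 10.0]  # hypothetical per-pixel errors, one outlier
print(f"squared error:  {outlier_share(errors, 2):.3f}")  # 100/104
print(f"absolute error: {outlier_share(errors, 1):.3f}")  # 10/14
```

With a squared penalty the single outlier makes up about 96% of the cost, versus about 71% with an absolute penalty, so a classifier trained on squared error will spend most of its capacity on a few extreme pixels.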

tritical · 4th February 2009, 21:49

As you say, it's possible to iteratively optimize both... train classifier(s) a little, train interpolation coefficients a little (or solve if direct solution exists), keep repeating. Or did you have something different in mind?

I haven't gotten that far yet. I am still testing different classifiers. Main restriction on what can be used is computation time, since it is typically going to be run on ~25-35% of pixels in a frame (~80k-120k for a 720x480 frame). What has worked best is a basic nn trained to select classes by minimizing squared error of the resulting interpolation (if there are 16 classes, then the nn has 16 output neurons and the one with the largest value is the chosen class). Actually, it worked better to not have the nn choose a single class, but to use its outputs as linear combination weights (either after applying softmax activation or normalizing so they sum to 1). However, having it output combination weights makes iterative optimization with the interpolation coefficients more complicated.
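
The two ways of using the network outputs mentioned above (pick the class with the largest output, or use the outputs as combination weights after softmax) could look roughly like this. The array layout and function names are assumptions for the sake of the sketch; `nn_out` stands for whatever the network produces per pixel.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def class_predictions(X, W):
    """Prediction of every coefficient set for every pixel: shape (n, n_classes)."""
    return X @ W.T

def interpolate_hard(nn_out, preds):
    """Use only the prediction of the class with the largest network output."""
    best = nn_out.argmax(axis=1)
    return preds[np.arange(len(preds)), best]

def interpolate_soft(nn_out, preds):
    """Blend all class predictions, weighted by softmax of the network outputs."""
    return (softmax(nn_out) * preds).sum(axis=1)
```

The soft variant has to evaluate every coefficient set for every processed pixel, whereas the hard variant evaluates one. For scale, 25-35% of a 720x480 frame (345,600 pixels) is 86,400 to 120,960 pixels, which is in line with the rough figure quoted in the post.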

MfA · 4th February 2009, 23:15

Quote:

Originally Posted by tritical

Actually, it worked better to not have the nn choose a single class, but to use its outputs as linear combination weights (either after applying softmax activation or normalizing so they sum to 1). However, having it output combination weights makes iterative optimization with the interpolation coefficients more complicated.

Wouldn't that only make sense if during application of the filter you also use the weighted combination of all predictors? Doesn't seem a realistic option.

By the way, why did you decide to second guess the CMA-ES algorithm? (ie. why not just let it optimize the entire system of both classifiers and predictors in one go.)

madshi · 9th November 2009, 13:43

@tritical,

have you checked out iNEDI yet? It seems to be a noticeable improvement over the original NEDI algorithm. Maybe you could implement some of iNEDI's ideas into your NNEDI?

http://www.tecnick.com/pagefiles/app...cola_Asuni.pdf

tetsuo55 · 9th November 2009, 14:42

Looks like even iNEDI has been surpassed:

http://www.comp.leeds.ac.uk/bmvc2008.../papers/43.pdf

It is also up to 10 times faster than NEDI.

EDIT:
And even ICBI has been surpassed:

http://www.eurasip.org/Proceedings/E...1569192778.pdf

tritical · 9th November 2009, 18:56

Learning the interpolation weights based on the low res image works alright for image enlargement, but isn't any good for deinterlacing because too much information is missing. For now I'm more interested in deinterlacing interpolation than enlargement. The iterative energy minimization post-processing described in the ICBI paper is interesting though, and could be useful for deinterlacing. I will try running it on the result of nnedi2/eedi2 and see how it looks.

It also doesn't appear that MEDI > ICBI > iNEDI is always the case based on the results in that last paper. Looks like it depends on the image content, which isn't surprising. It would be interesting to see how nnedi2 compares psnr/ssim wise.

In the future I plan to revisit nedi... mainly because at the time I wrote ediupsizer I didn't have a full understanding of the mathematics/concepts involved. I will definitely keep these papers in mind as well.

tetsuo55 · 9th November 2009, 19:05

Quote:

Originally Posted by tritical

It also doesn't appear that MEDI > ICBI > iNEDI is always the case based on the results in that last paper. Looks like it depends on the image content, which isn't surprising. It would be interesting to see how nnedi2 compares psnr/ssim wise.

At first glance I was confused about this too.

But as I read more, it became more and more obvious that the newer ones, and especially MEDI, go for the psychovisually better result at the cost of some PSNR and SSIM.

Also, I find it very interesting that these algorithms can be used for deinterlacing.

It would be great to have a universal, MEDI-based scaler-deinterlacer that always works, regardless of I or P content.

MfA · 9th November 2009, 22:19

The extreme staircasing of the image in the MEDI paper for bilinear makes me think they used decimation for downsampling in their tests (which makes the results pretty much irrelevant for normal upsampling).

PS. the gaps from the spokes to the rim are pretty damning, I'm 99% sure they used decimation ... poor show.
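
The distinction being drawn here is between decimation (keep every n-th pixel, no low-pass filter) and filtered downsampling. A minimal sketch of the difference, assuming a 2x factor and a plain box filter as the stand-in for "normal" downsampling (the paper's actual test procedure is not known from this thread):

```python
import numpy as np

def decimate(img, factor=2):
    """Keep every factor-th pixel; no filtering, so fine detail aliases."""
    return img[::factor, ::factor]

def box_downsample(img, factor=2):
    """Average each factor x factor block (a simple low-pass + downsample)."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor      # crop to a multiple of factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

if __name__ == "__main__":
    # A one-pixel-wide horizontal line on an odd row (think of a thin spoke).
    line = np.zeros((8, 8))
    line[1, :] = 1.0
    print(decimate(line).sum())        # the line is dropped entirely
    print(box_downsample(line)[0])     # the line survives at half intensity
```

Thin features that fall between the kept samples disappear under decimation, and a diagonal edge stays strictly two-valued (hard stair steps) instead of picking up intermediate values, which would be consistent with the gaps and staircasing described above.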

tetsuo55 · 9th November 2009, 22:22

Yeah, I think the top 3 should be tested on real-world moving video.

madshi · 13th November 2009, 10:12

For those interested, here's the Clown resampled by ICBI:

http://madshi.net/clownICBI.png

ICBI seems to be a bit soft to me, but on the positive side it looks quite smooth and natural and doesn't seem to add any artifacts (other than those already in the source, like the pole halo in the Clown image).

tetsuo55 · 13th November 2009, 10:30

I agree that it's very soft.

It appears that most high-quality interpolators create a soft image.

But we could always add a sharpener at the end.

EDIT: actually, I would describe it as slightly out of focus.

Mystery Keeper · 14th November 2009, 01:48

I tried to implement ICBI for deinterlacing. It doesn't work well. Well, it does work, but NNEDI2 works better and faster than my pixel shader 3 implementation.

@tetsuo55
ICBI is an adjustable method. You can get a sharper image with it if you play with the parameters.

madshi · 14th November 2009, 09:45

Quote:

Originally Posted by Mystery Keeper

ICBI is an adjustable method. You can get a sharper image with it if you play with the parameters.

Which parameters did you change in which direction to get a sharper image? Probably choosing sharper parameters comes at the cost of curve smoothness, I guess?

MfA · 14th November 2009, 15:36

Softness in an interpolator isn't a bad thing.

There is a difference between interpolation and super-resolution. An interpolator generally conserves the original pixels ... this is fundamentally incorrect if you are trying to reconstruct a non-smoothed higher resolution image. For instance just because a pixel covers an edge in the low resolution image doesn't mean it covers it in the higher resolution one, so mixing colors from both sides of the edge could be fundamentally incorrect for the non smoothed higher resolution image.

To benchmark interpolators you should compare against the smoothed version of the higher resolution image, not the original higher resolution image. Sharpening and texture synthesis are not interpolation.

Which is not to say you couldn't do a single step super resolution algorithm ... it just wouldn't be a pure interpolator and shouldn't retain the original pixels from the low resolution image.
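
The benchmarking suggestion above (score an interpolator against a smoothed version of the high-resolution image rather than the original) might be set up along these lines. Which smoothing filter to use is not specified in the post; the 3x3 box blur here is purely an assumption, as are the function names.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def box_blur3(img):
    """3x3 box blur with edge replication (assumed stand-in smoothing filter)."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    h, w = img.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def score_interpolator(upscaled, hires):
    """PSNR against the raw high-res image and against its smoothed version."""
    return {
        "vs_original": psnr(upscaled, hires),
        "vs_smoothed": psnr(upscaled, box_blur3(hires)),
    }
```

Under this setup a soft but accurate interpolator is not penalised for failing to invent detail, and sharpening or texture synthesis can be evaluated as a separate stage.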

Mystery Keeper · 16th November 2009, 04:43

Quote:

Originally Posted by madshi

Which parameters did you change in which direction to get a sharper image? Probably choosing sharper parameters comes at the cost of curve smoothness, I guess?

Why, of course it does. The second parameter, beta, is there to limit the smoothing.

