Pixel fairness, the "Keys" cubics, and mathematical correctness
Katie Boundary
26th February 2015, 18:18
A few days ago, I was having a problem with Decimate() working correctly following separatefields but malfunctioning when it came after uncomb, and I found myself intensely interested in finding out which resizing filters would yield the best anti-aliasing. I noticed that, although none of them really restored straightness to the lines on Data's uniform...
http://img.photobucket.com/albums/v415/DWJohnson/lanczos3_zpskdfb754p.jpg
...some filters reduced the perception of jagginess by making those lines darker. I ended up finding a LOT of information that I wish I had found sooner, and in particular, these two pages caught my attention:
http://entropymine.com/imageworsener/filterfairness/
http://www.imagemagick.org/Usage/filter/
That first page expresses perfectly what I meant by "mathematical correctness" in my thread on Grid Overlay resizing. The second reminded me of something in the AVIsynth documentation, which stated that the most "numerically accurate" cubic filters were the ones that followed the "b+2c=1" formula. The IM documentation refers to this as the "Keys" family.
This leads me to two separate sets of questions. Regarding the first page...
1) is that how all bilinear filters work?
2) Is there a difference between linear interpolation and triangle filters?
3) what's the difference between virtualdub's "bilinear" and "precise bilinear" resizing methods?
4) does anyone know of a resizing method that exhibits "pixel fairness", other than grid overlay?
Regarding the second page...
1) what is so damn special about the "b+2c=1" formula?
2) what is meant by numerical accuracy in this case? is it the same thing as pixel fairness?
I also finally learned the correct pronunciation of "Lanczos" :D
wonkey_monkey
26th February 2015, 22:24
I noticed that, although none of them really restored straightness to the lines on Data's uniform...
No plain resizing filter is going to do that. You need an edge-interpolation filter like NNEDI (or you need to tweak what other processing you're doing to get to that point in the first place, because you should be able to get nice clean 24p out of the DVDs - at least in non-FX scenes)
The problem with fairness - as I think has been brought up several times before - is that when you're trying to manipulate a set of pixels which represent an image, then those pixels are not and should not be treated as blocks of colour with a boundary. They are point samples - pixels in the "block of colour" sense don't really have any meaning until you display them on a physical screen. That's why "first page guy" gets a result he doesn't expect when he applies a triangle filter. It's not because the filter is wrong, it's because the conception of pixels as blocks of colour is not how the filter works.
On the first page, the guy ends with this:
Another crazy idea for fixing the algorithm: instead of basing the target sample on a small number of source samples, base it on an infinite number of source samples; i.e., fit a curve to the source samples, and take its integral.
The first part of this "crazy idea" is basically how resize filters generally work. A bicubic filter, for example, effectively fits a bicubic curve to the source samples and resamples it at different points than the original. As for "taking the integral"... well, I'm not sure that makes much sense in this context, but it sounds like taking the average of the fitted curve over a small range, which will just blur the image.
1) is that how all bilinear filters work?
Most, I think, but simply taking the weighted average of the two nearest source pixels - as opposed to fitting the triangle over a varying range of source pixels - is also sometimes referred to as "bilinear" (see next answer)
3) what's the difference between virtualdub's "bilinear" and "precise bilinear" resizing methods?
To the best of my recollection, the difference is as above - bilinear simply takes the weighted average of the two nearest source pixels, so once you start shrinking by more than 50% you'll start to get aliasing. Precise bilinear filters over an appropriate number of source pixels (e.g. for a 25% resize, it'll sample over, umm, 8? or something...), like the triangle filter described on the first page.
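For anyone who wants to see the difference described above in numbers, here is a minimal Python sketch. It is not VirtualDub's code; the function names `two_tap_bilinear` and `widened_triangle`, the pixel-centre alignment, and the edge clamping are all assumptions made for illustration. It shrinks a 16-pixel row containing a single bright pixel down to 4 pixels.

```python
import math

def two_tap_bilinear(src, n_out):
    """Weighted average of only the two nearest source pixels (hypothetical helper)."""
    n_in = len(src)
    scale = n_in / n_out
    out = []
    for j in range(n_out):
        x = (j + 0.5) * scale - 0.5              # output centre in source coordinates
        i0 = math.floor(x)
        frac = x - i0
        a = src[min(max(i0, 0), n_in - 1)]       # clamp indices at the edges
        b = src[min(max(i0 + 1, 0), n_in - 1)]
        out.append((1.0 - frac) * a + frac * b)
    return out

def widened_triangle(src, n_out):
    """Triangle filter whose half-width grows with the shrink ratio (hypothetical helper)."""
    n_in = len(src)
    scale = n_in / n_out
    radius = max(scale, 1.0)                     # never narrower than one source pixel
    out = []
    for j in range(n_out):
        x = (j + 0.5) * scale - 0.5
        acc = 0.0
        wsum = 0.0
        for i in range(math.floor(x - radius), math.ceil(x + radius) + 1):
            w = max(0.0, 1.0 - abs(x - i) / radius)
            acc += w * src[min(max(i, 0), n_in - 1)]
            wsum += w
        out.append(acc / wsum)                   # normalise so each output row sums to 1
    return out

src = [0.0] * 16
src[3] = 1.0                                     # one bright pixel

naive = two_tap_bilinear(src, 4)
precise = widened_triangle(src, 4)
print("two-tap :", naive)
print("widened :", precise)
```

With these assumptions the two-tap version only ever reads source pixels 1, 2, 5, 6, 9, 10, 13 and 14, so the bright pixel at index 3 disappears entirely, while the widened triangle spreads it over the first two output pixels.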
Regarding the second page...
1) what is so damn special about the "b+2c=1" formula?
2) what is meant by numerical accuracy in this case? is it the same thing as pixel fairness?
I couldn't find "b+2c=1" or "numerical accuracy" anywhere on the second page, so I don't know what you're referring to.
As for your troubles with TNG, just buy the blu-rays, they're pure 24p throughout.
Katie Boundary
27th February 2015, 02:48
you should be able to get nice clean 24p out of the DVDs - at least in non-FX scenes)
Well, I was able to get nice clean 23.976p out of them. The problem, as I already stated, was that decimate was acting incorrectly and deleting the wrong frames :) Regardless, this thread is about the artificial darkening or brightening of parts of an image, with implications that go WAY beyond Star Trek.
The problem with fairness - as I think has been brought up several times before - is that when you're trying to manipulate a set of pixels which represent an image, then those pixels are not and should not be treated as blocks of colour with a boundary. They are point samples
You're correct in that it has been brought up before, but like I said elsewhere, treating bitmaps as 2D waveforms and making unfounded assumptions about where pixels come from is the true error in thought. As has already been demonstrated, this allows the overall brightness or color of an image, or region of an image, to change just by resizing it, and that's something that I simply abhor.
Most, I think, but simply taking the weighted average of the two nearest source pixels - as opposed to fitting the triangle over a varying range of source pixels - is also sometimes referred to as "bilinear" (see next answer)
Cool. Is that first method what simpleresize and fastbilinearresize do, and does it exhibit pixel fairness? It seems like it would leave the value of the white pixel at 250...
To the best of my recollection, the difference is as above - bilinear simply takes the weighted average of the two nearest source pixels, so once you start shrinking by more than 50% you'll start to get aliasing. Precise bilinear filters over an appropriate number of source pixels (e.g. for a 25% resize, it'll sample over, umm, 8? or something...), like the triangle filter described on the first page.
:cool:
I couldn't find "b+2c=1" or "numerical accuracy" anywhere on the second page, so I don't know what you're referring to.
AVIsynth documentation. And that page DID have a little diagram with the Keys cubics plotted along the b+2c=1 function...
http://www.imagemagick.org/Usage/img_diagrams/cubic_survey.gif
As for your troubles with TNG, just buy the blu-rays, they're pure 24p throughout.
My questions did not make any mention of Star Trek. Buying Star Trek on Blu-Ray won't help me with Carmen Sandiego or Birds of Prey. I was merely using that frame as an example of the darkening and brightening artifacts that interpolation-based resizing methods are prone to.
Katie Boundary
27th February 2015, 04:10
Also, the documentation for simpleresize and bicublin describes them as "unfiltered" resizers. What's the difference between resampling and filtering in a resizing context?
wonkey_monkey
27th February 2015, 11:03
treating bitmaps as 2D waveforms and making unfounded assumptions about where pixels come from is the true error in thought.
What's unfounded about it? It's how digitising works. A continuous signal is (usually after being low-pass filtered to avoid aliasing) point-sampled and from that point-sampled digital representation the continuous signal can be recovered and resampled as necessary.
It seems to work well enough for everyone else.
As has already been demonstrated, this allows the overall brightness or color of an image, or region of an image, to change just by resizing it, and that's something that I simply abhor.
You may abhor it, but that doesn't mean it's wrong.
I was merely using that frame as an example of the darkening and brightening artifacts that interpolation-based resizing methods are prone to.
What brightening/darkening artefacts? The only artefacts I can see are those caused by a missing field.
colours
27th February 2015, 13:42
Regardless, this thread is about the artificial darkening or brightening of parts of an image, with implications that go WAY beyond Star Trek.
My guess is it's something to do with gamma correction.
You're correct in that it has been brought up before, but like I said elsewhere, treating bitmaps as 2D waveforms and making unfounded assumptions about where pixels come from is the true error in thought.
Thinking that sampling is wrong is the true "error in thought", in my not-humble-at-all opinion.
As has already been demonstrated, this allows the overall brightness or color of an image, or region of an image, to change just by resizing it, and that's something that I simply abhor.
Fun fact: windowed sinc interpolation (e.g. Lanczos) tends to "pixel fairness" as you increase the number of taps. There's no integer number of taps where Lanczos resampling is exactly "pixel fair", but it certainly converges very quickly.
1) is that how all bilinear filters work?
2) Is there a difference between linear interpolation and triangle filters?
3) what's the difference between virtualdub's "bilinear" and "precise bilinear" resizing methods?
4) does anyone know of a resizing method that exhibits "pixel fairness", other than grid overlay?
1: No. Some do interpolation instead of widening the kernel when downscaling, which leads to more aliasing.
2: "Linear" interpolation can refer to either a triangular kernel (e.g. bilinear resize), or more generally to linear (https://en.wikipedia.org/wiki/Linear_map) interpolation.
pandy
27th February 2015, 14:00
What brightening/darkening artefacts?
Probably a problem with non-normalized (gamma-corrected) video.
wonkey_monkey
27th February 2015, 16:09
Probably a problem with non-normalized (gamma-corrected) video.
My guess is it's something to do with gamma correction.
Actually it's not - check the first page KB linked to. That shows how, with a triangular filter, you can end up with an "unfair" pixel value (i.e., shrinking an image to 5/7 results in the total value of pixels being less than 5/7 the original total).
Instead it is all, as you put it, just to do with:
thinking that sampling is wrong is the true "error in thought", in my not-humble-at-all opinion.
colours
27th February 2015, 16:58
It's hard to tell what the exact cause of the brightness variation is if KB doesn't post the full avs script. Hey, we even have an emoticon for that!
:script:
The reason I believe it's more likely to be due to a lack of gamma correction (or some other gamma-related shenanigans) rather than unfairness in resampling is that I'm assuming the picture is getting upscaled by an integer factor, where resampling is necessarily fair.
Katie Boundary
27th February 2015, 19:38
A continuous signal is (usually after being low-pass filtered to avoid aliasing) point-sampled and from that point-sampled digital representation the continuous signal can be recovered and resampled as necessary.
Please explain why you assume that all bitmaps are necessarily continuous signals.
http://img.photobucket.com/albums/v415/DWJohnson/minecraft/mcsealevel.jpg
What brightening/darkening artefacts? The only artefacts I can see are those caused by a missing field.
I could post side-by-sides showing the effects of various resizing algorithms on the brightness of the stripe, but like I said, this issue has ramifications beyond Star Trek and beyond separatefields()
My guess is it's something to do with gamma correction.
YUV colorspace has gamma?
Fun fact: windowed sinc interpolation (e.g. Lanczos) tends to "pixel fairness" as you increase the number of taps. There's no integer number of taps where Lanczos resampling is exactly "pixel fair", but it certainly converges very quickly.
So the page said. However, it leads to other kinds of nastiness...
http://img.photobucket.com/albums/v415/DWJohnson/lanczos10_zpsbohlnueo.jpg
See those black lines? Those aren't scanlines. That's ringing. EVEN THE RINGS HAVE RINGS. Anything sharper than Catrom should be regarded as the work of the devil.
It's hard to tell what the exact cause of the brightness variation is if KB doesn't post the full avs script. Hey, we even have an emoticon for that!
:script:
Which script? I ran that frame through bilinear, Hermite, Lanczos, Spline64, Catrom, and a bunch of other resizers. And that's not counting the original page that doesn't use AVS scripts at all (http://entropymine.com/imageworsener/filterfairness/)
colours
27th February 2015, 20:15
Please explain why you assume that all bitmaps are necessarily continuous signals.
Continuous domain. In other words, a signal defined over the real numbers. Rasterisation is done by convolving this continuous-domain signal with some point spread function then sampling. This psf can be a rectangular kernel, or it can be a Gaussian kernel, or it can be anything at all, really; the psf functions as the low-pass filter davidhorman was referring to. (Also note, the word "sampling" shows up again!)
YUV colorspace has gamma?
Yes. It's dumb, but yes. Too bad it's still sorta necessary because otherwise the luma plane would have too little precision at 8 bpc, and high bit-depth video processing still isn't mainstream yet.
See those black lines? Those aren't scanlines. That's ringing. EVEN THE RINGS HAVE RINGS. Anything sharper than Catrom should be regarded as the work of the devil.
That's just the Gibbs phenomenon (https://en.wikipedia.org/wiki/Gibbs_phenomenon). Calling windowed sinc with many taps "sharp" is misleading; it's more correct to call it "non-blurring".
Which script? I ran that frame through bilinear, Hermite, Lanczos, Spline64, Catrom, and a bunch of other resizers. And that's not counting the original page that doesn't use AVS scripts at all (http://entropymine.com/imageworsener/filterfairness/)
Any of them. I don't care about what some other site says when you're making claims about something you notice on your source. As it is, there's not enough information for anyone else to make a reasonable comment without resorting to guesswork. At the very least, describe what you're using in your experiments to enough detail that other people can reproduce your results.
*.mp4 guy
27th February 2015, 22:02
All competent interpolation/decimation filters approximate pixel fairness on a linear scale, some fancier ones try to work on a gamma-aware scale as well, with varying results. In the example from that website (which is very odd) the majority of the issue is caused by an incorrect down-sampling kernel, which has not been expanded concomitant with the down-sampling ratio.
In English, the more pixels you intend to throw away, the more blurring you need to do in order to maintain consistent output. If you only keep 5/7 of your pixels, you need to expand the decimation/interpolation kernel by 7/5 relative to its purely interpolating variant. This will ensure that every output pixel is created from a sufficient number of input pixels that it will tend to be consistent, this includes tending towards 'pixel fairness'.
Blurring is always needed when reducing resolution, because you have less space to store information (fewer pixels -> less information space). Since you have less space, you need to remove information from your image so that it will fit into the smaller space without problems. This is done with blurring operators. In order to find the most efficient blurring operators, you have to make assumptions about the image; the more closely these assumptions match reality, the better the results will be.
The above information is all unconditionally true, if someone says something that directly contradicts it, they don't have a strong grasp of the fundamental issues underlying interpolation.
Things only get messy when you need to actually do things and are therefore required to make assumptions about the image you are working with. Sampling theory is one such set of assumptions; it has the impressive quality of allowing perfect reconstruction of a down-sampled signal if you satisfy the version of reality it requires. Unfortunately, doing this is impossible, even for artificially generated signals.
People tend to get very defensive of their models of reality, so it is actually surprisingly difficult to have a good conversation about interpolation. Smart people especially don't like to admit that they have an imperfect understanding of something, but all models are imperfect and rely on assumptions about reality that are not always true.
A pixel may be a square, or a point. Or it might be something else. You just have to find the least-bad assumption and use it. If you need to believe that you're using something perfect, you will have many problems.
wonkey_monkey
27th February 2015, 22:12
In the example from that website (which is very odd) the majority of the issue is caused by an incorrect down-sampling kernel, which has not been expanded concomitant with the down-sampling ratio.
Wouldn't expanding the kernel in that case make that one pixel even darker? Or would the compensation come from neighbouring pixels no longer being pure black?
*.mp4 guy
27th February 2015, 22:21
Both things would happen. The center pixel would be darker, and the surrounding pixels would be lighter. In this case, I suspect it will give 'perfectly fair' results, but I'm not invested enough to dust off some c code and make 100% certain.
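The check mentioned above does not need much code. The following is a rough Python sketch, not the linked article's program: it builds the weights of a triangle filter widened by 7/5 for a 7-to-5 shrink and adds up how much each source pixel contributes to the whole output. The helper name `triangle_weights`, the pixel-centre alignment and the edge clamping are assumptions.

```python
import math

def triangle_weights(n_in, n_out):
    """Rows of normalised triangle-filter weights, one row per output pixel (hypothetical helper)."""
    scale = n_in / n_out
    radius = max(scale, 1.0)                     # kernel widened by the shrink ratio
    rows = []
    for j in range(n_out):
        x = (j + 0.5) * scale - 0.5              # output centre in source coordinates
        row = [0.0] * n_in
        for i in range(math.floor(x - radius), math.ceil(x + radius) + 1):
            w = max(0.0, 1.0 - abs(x - i) / radius)
            row[min(max(i, 0), n_in - 1)] += w   # fold out-of-range taps onto the edge pixel
        total = sum(row)
        rows.append([w / total for w in row])
    return rows

rows = triangle_weights(7, 5)

# Total weight each source pixel receives, summed over all five output pixels.
contribution = [sum(row[i] for row in rows) for i in range(7)]

fair_share = 5 / 7
for i, c in enumerate(contribution):
    print(f"source pixel {i}: {c:.4f} (fair share would be {fair_share:.4f})")
```

Under these assumptions the centre source pixel ends up with a total weight of 7/11 (about 0.636) and its neighbours with 73/99 (about 0.737), against a fair share of 5/7 (about 0.714). So even with the widened kernel, this particular ratio is not exactly fair, although the weights still add up to 5 overall.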
Katie Boundary
28th February 2015, 02:06
Continuous domain. In other words, a signal defined over the real numbers.
Oh, okay.
That's just the Gibbs phenomenon (https://en.wikipedia.org/wiki/Gibbs_phenomenon). Calling windowed sinc with many taps "sharp" is misleading; it's more correct to call it "non-blurring".
I think of it internally as "fake sharpening" or as "the ugly thing that you don't want if you plan to compress with MPEG" :)
Any of them. I don't care about what some other site says when you're making claims about something you notice on your source. As it is, there's not enough information for anyone else to make a reasonable comment without resorting to guesswork. At the very least, describe what you're using in your experiments to enough detail that other people can reproduce your results.
Okay then
mpeg2source("721.d2v")
separatefields()
bilinearresize(640,480)
I'm a script minimalist :)
All competent interpolation/decimation filters approximate pixel fairness on a linear scale, some fancier ones try to work on a gamma-aware scale as well, with varying results. In the example from that website (which is very odd) the majority of the issue is caused by an incorrect down-sampling kernel, which has not been expanded concomitant with the down-sampling ratio.
Ah. So if we were to use vdub or avisynth instead, they would have produced different results from what that fellow got?
Blurring is always needed when reducing resolution
Unless you're doing nearest neighbor. But we don't talk about nearest neighbor...
if you satisfy the version of reality it requires...
Haha I love that phrase
People tend to get very defensive of their models of reality
No shit http://img.photobucket.com/albums/v415/DWJohnson/Emoticons/emote-roflmao.gif
A pixel may be a square, or a point. Or it might be something else.
Yay! Proof that I'm not crazy!
Katie Boundary
28th February 2015, 02:12
Oh wait I just realized something. Nearest neighbor is perfectly fair if you're upscaling by an integer factor... like when I expand a 240-scanline field to a vertical resolution of 480.
I must have taken stupid pills that day >.<
colours
28th February 2015, 03:52
*.mp4 guy, I entirely agree with your post, except…
All competent interpolation/decimation filters approximate pixel fairness on a linear scale, some fancier ones try to work on a gamma-aware scale as well, with varying results. In the example from that website (which is very odd) the majority of the issue is caused by an incorrect down-sampling kernel, which has not been expanded concomitant with the down-sampling ratio.
This is not true; the downsampling kernel has been expanded. Check the diagrams in the article.
Also, I guess I'm obliged to point out that thinking of pixels as square boxes still fits the sampling model, since I've been trying to hammer in the concept of sampling in both this thread and the other resizing thread… Sampling is an extremely useful framework to use to study image resizing and filtering in general, even if you don't have the usual bandlimitation assumption.
Katie Boundary:
I think of it internally as "fake sharpening" or as "the ugly thing that you don't want if you plan to compress with MPEG" :)
I'll just quote this Xiph video (http://xiph.org/video/vid2.shtml). There's a bit on the Gibbs phenomenon about 19 minutes in. I know you said Monty was "wrong" before, but he really isn't.
mpeg2source("721.d2v")
separatefields()
bilinearresize(640,480)
I'm a script minimalist :)
Assuming your source is 480i or 480p, this is an upscale by an integer factor, which I've already mentioned is necessarily fair regardless of kernel. (Ignoring pesky boundary effects, of course.)
Why? Consider whatever contribution a source pixel has in the output; shift the output by two pixels and you've shifted the source pixel by one pixel, so this shifted source pixel must have exactly the same contribution. QED.
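The shift argument above can also be checked numerically. Here is a small Python sketch under stated assumptions (pixel-centre alignment, per-output normalisation, edge clamping, and a hypothetical helper called `contributions`); it is not how AviSynth implements its resizers. It upscales 12 pixels to 24 with two different kernels and totals the weight each source pixel receives.

```python
import math

def lanczos3(d):
    """Lanczos kernel with three lobes: sinc(d) * sinc(d / 3) for |d| < 3."""
    d = abs(d)
    if d == 0.0:
        return 1.0
    if d >= 3.0:
        return 0.0
    a = math.pi * d
    return 3.0 * math.sin(a) * math.sin(a / 3.0) / (a * a)

def triangle(d):
    """Plain linear-interpolation kernel with half-width 1."""
    return max(0.0, 1.0 - abs(d))

def contributions(kernel, support, n_in, n_out):
    """Total weight each source pixel receives over all outputs (upscaling, kernel not widened)."""
    scale = n_in / n_out
    totals = [0.0] * n_in
    for j in range(n_out):
        x = (j + 0.5) * scale - 0.5              # output centre in source coordinates
        f = math.floor(x)
        taps = [(i, kernel(x - i)) for i in range(f - support + 1, f + support + 1)]
        norm = sum(w for _, w in taps)
        for i, w in taps:
            totals[min(max(i, 0), n_in - 1)] += w / norm
    return totals

lanczos_totals = contributions(lanczos3, 3, 12, 24)
triangle_totals = contributions(triangle, 1, 12, 24)

# Columns 3..8 are far enough from the borders that no taps are missing or clamped onto them.
print("lanczos3 interior:", [round(t, 6) for t in lanczos_totals[3:9]])
print("triangle interior:", [round(t, 6) for t in triangle_totals[3:9]])
```

Away from the borders every source pixel receives a total weight of exactly 2, whichever kernel is used, which is the "necessarily fair" property for integer-factor upscales. The border columns differ, which is the boundary effect mentioned above.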
Katie Boundary
28th February 2015, 05:12
I'll just quote this Xiph video (http://xiph.org/video/vid2.shtml). There's a bit on the Gibbs phenomenon about 19 minutes in. I know you said Monty was "wrong" before, but he really isn't.
Well, there he's talking about audio waveforms and doesn't mention bitmaps at all, so of course he's correct there :)
Assuming your source is 480i or 480p, this is an upscale by an integer factor, which I've already mentioned is necessarily fair regardless of kernel. (Ignoring pesky boundary effects, of course.)
Oh, I think I see what's happening. The sharpening filters are making the stripe brighter, but they're incapable of making the black background much darker to offset that brightness increase. That'll screw up the overall brightness and color levels even if the filter is "pixel-fair".
However, any filters that do not use negative lobes should leave brightness and color levels intact if what you say is true. If I consider that the crappy results I got from Gaussresize and a b=1, c=0 cubic were entirely due to blurring and not a loss of overall brightness, then this explains why bilinear and Hermite looked so much alike.
colours
28th February 2015, 05:56
Well, there he's talking about audio waveforms and doesn't mention bitmaps at all, so of course he's correct there :)
The key takeaway should really be that if you think of sinc as a sharpening filter, you run into the paradox that applying a sinc filter twice doesn't sharpen the source any more than applying it only once. This applies equally well to images. I should've made this clearer; my bad.
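To illustrate the point about applying the filter twice, here is a minimal Python sketch. It uses an ideal brick-wall low-pass filter (the frequency-domain counterpart of an infinite sinc) on a periodic signal via a naive DFT. That is an assumption made to keep the example short; a windowed sinc such as Lanczos is only approximately idempotent. The function names are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def brick_wall_lowpass(x, cutoff):
    """Zero every frequency bin above `cutoff` (bins counted symmetrically around DC)."""
    n = len(x)
    spectrum = dft(x)
    kept = [spectrum[k] if min(k, n - k) <= cutoff else 0.0 for k in range(n)]
    return idft(kept)

step = [0.0] * 16 + [1.0] * 16                   # a hard edge, treated as periodic

once = brick_wall_lowpass(step, 6)
twice = brick_wall_lowpass(once, 6)

overshoot = max(once) - 1.0                      # Gibbs ringing after one pass
difference = max(abs(a - b) for a, b in zip(once, twice))

print(f"overshoot after one pass : {overshoot:.4f}")
print(f"max change on second pass: {difference:.2e}")
```

The first pass produces the familiar overshoot next to the edge; the second pass changes nothing beyond floating-point noise, because everything it would remove is already gone. A genuine sharpening filter would keep exaggerating the edge on every pass.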
Katie Boundary
28th February 2015, 06:13
The key takeaway should really be that if you think of sinc as a sharpening filter, you run into the paradox that applying a sinc filter twice doesn't sharpen the source any more than applying it only once. This applies equally well to images. I should've made this clearer; my bad.
if you're not resizing, then it won't even sharpen the first time, according to Anthony Thyssen (the Imagemagick guy)...
But yes, that is a very cool thing to notice!
*.mp4 guy
28th February 2015, 07:24
Well, I had been assuming that he made a mistake. But I have checked, and he may not have. Such cases are uncommon; on average, the mean value of the image will be maintained.
Katie Boundary
28th February 2015, 09:11
After having a few hours to digest some of this information, I've realized that the Xiph video actually makes my case for me regarding the treatment of images as waveforms. Sinc-like filters, and cubics with ringing values above zero, create ringing because they expect ringing to have been in the original image of which the bitmap is an approximation, and are attempting to reconstruct the "lost" ringing. And they expect this ringing to have been present in the original image because they are treating the image like a waveform. However, we know that in the vast majority of cases, this ringing was never present... because images are not waveforms. In other news, I'm becoming increasingly fond of Hermite resizing.
Oh, new questions. Would it be correct to say that linear interpolation always takes the form of a triangle filter, but not all triangular filters are linear interpolation? And when downsizing, what's the advantage of using a really big triangular filter instead of chaining a bunch of smaller ones together (i.e., doing multiple consecutive linear interpolations, each by a factor of 2 or less)?
colours
28th February 2015, 11:59
However, we know that in the vast majority of cases, this ringing was never present... because images are not waveforms. In other news, I'm becoming increasingly fond of Hermite resizing.
What do you think an image becomes when you resize it to arbitrarily large dimensions? Hint: it's an eight-letter compound word starting with W and ending with M, in your terminology.
If you want to talk about resampling, you can't avoid talking about sampling. Like *.mp4 guy said, different filters are ideal under different assumptions; a sinc filter is not ideal for most real (and even artificial) images, because it comes with the typically untrue assumption that the source is perfectly bandlimited prior to sampling.
This is a trade-off that has to be made because a perfectly bandlimited image would either have lots of ringing or be very blurry, when compared to an aliased image. Would you rather have an image that takes four times as many pixels for the same effective resolution plus some theoretical niceness properties, or an image that looks sharp and ringing-free, antialiasing be damned?
Also, Hermite resampling is bad. It doesn't even correctly interpolate linear gradients, and surely you'd want a resampler to at least be able to handle gradients!
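The gradient claim is easy to test, and it also touches on the b+2c=1 question from the first post. Below is a small Python sketch using the standard Mitchell-Netravali two-parameter cubic; the helper names `bc_cubic` and `interpolate_ramp` are made up for this example. It resamples the ramp f(i) = i at position 10.25 with four (b, c) settings.

```python
def bc_cubic(x, b, c):
    """Mitchell-Netravali cubic kernel with parameters b and c."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x ** 3
                + (-18 + 12 * b + 6 * c) * x ** 2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x ** 3
                + (6 * b + 30 * c) * x ** 2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0

def interpolate_ramp(pos, b, c):
    """Resample the linear ramp f(i) = i at a fractional position (hypothetical helper)."""
    base = int(pos // 1)
    return sum(i * bc_cubic(pos - i, b, c) for i in range(base - 1, base + 3))

pos = 10.25
hermite = interpolate_ramp(pos, 0.0, 0.0)              # b + 2c = 0
catmull_rom = interpolate_ramp(pos, 0.0, 0.5)          # b + 2c = 1
mitchell = interpolate_ramp(pos, 1.0 / 3.0, 1.0 / 3.0) # b + 2c = 1
b_spline = interpolate_ramp(pos, 1.0, 0.0)             # b + 2c = 1

for name, value in [("Hermite", hermite), ("Catmull-Rom", catmull_rom),
                    ("Mitchell", mitchell), ("B-spline", b_spline)]:
    print(f"{name:12s} -> {value:.5f} (exact value is {pos})")
```

The three settings on the b+2c=1 line return 10.25, while Hermite (b=0, c=0) returns 10.15625, i.e. it pulls values towards the nearest sample. Since the kernel is affine in b and c, reproducing a ramp at two points on that line implies the same for every point on it, which is one concrete sense in which those cubics are "numerically accurate".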
On the whole though, it seems that you've finally managed to understand the basics of sampling. Only took about a few weeks of hurfdurfing on everyone's part…
Oh, new questions. Would it be correct to say that linear interpolation always takes the form of a triangle filter, but not all triangular filters are linear interpolation? And when downsizing, what's the advantage of using a really big triangular filter instead of chaining a bunch of smaller ones together (i.e., doing multiple consecutive linear interpolations, each by a factor of 2 or less)?
Like I mentioned, there are two related but distinct senses of the word "linear", and you should be clear on which one you're referring to.
Chaining multiple triangular filters is kinda pointless, because it pretty much leads to a Gaussian filter, in which case why not just use a Gaussian directly? For some bizarre unworldly reason some Photoshop users like to resize in multiple steps (https://en.wikipedia.org/wiki/Stairstep_interpolation), which is naught more than a waste of CPU cycles. It doesn't hurt (if you don't mind the blur), but it doesn't help either.
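As a rough numerical illustration of the "pretty much leads to a Gaussian" remark, the Python sketch below convolves the small discrete triangle [1, 2, 1] / 4 with itself eight times and compares the result with a sampled Gaussian of the same variance. The choice of eight passes and of this particular triangle is arbitrary, and no resizing is involved; it only looks at the shape of the combined kernel.

```python
import math

def convolve(a, b):
    """Full discrete convolution of two lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

triangle = [0.25, 0.5, 0.25]                     # discrete triangle, variance 0.5
passes = 8

chained = [1.0]
for _ in range(passes):
    chained = convolve(chained, triangle)

variance = 0.5 * passes                          # variances add under convolution
radius = len(chained) // 2
gauss = [math.exp(-(k * k) / (2.0 * variance)) for k in range(-radius, radius + 1)]
total = sum(gauss)
gauss = [g / total for g in gauss]               # normalise to unit sum, like the kernel

worst = max(abs(a - b) for a, b in zip(chained, gauss))
print(f"largest difference from the Gaussian: {worst:.4f}")
```

After eight passes the chained kernel and the Gaussian differ by well under 0.01 at every tap, against a peak value of roughly 0.2.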
vivan
28th February 2015, 17:44
Ringing IS an artifact because it does look bad.
Luckily you can greatly reduce ringing by limiting the result. You lose some sharpness this way, but this could be fixed by using more taps. That's what madVR's "antiringing" is (http://forum.doom9.org/showthread.php?t=145358) (and close to "repair" solution posted in that thread).
Then use a polar resampler (jinc) instead of planar and you'll get something that can rival nnedi.
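One simple reading of "limiting the result" is to clamp each interpolated value to the range of the two source pixels it sits between. The Python sketch below does that for a 1D Lanczos-3 2x upscale of a hard edge. It is only an illustration of the idea under those assumptions; it is not madVR's algorithm, and the function `upscale2x` and its `clamp` flag are hypothetical.

```python
import math

def lanczos3(d):
    """Lanczos kernel with three lobes: sinc(d) * sinc(d / 3) for |d| < 3."""
    d = abs(d)
    if d == 0.0:
        return 1.0
    if d >= 3.0:
        return 0.0
    a = math.pi * d
    return 3.0 * math.sin(a) * math.sin(a / 3.0) / (a * a)

def upscale2x(src, clamp):
    """2x Lanczos-3 upscale of a 1D row; optionally limit each result to its two neighbours."""
    n = len(src)

    def px(i):
        return src[min(max(i, 0), n - 1)]        # repeat edge pixels

    out = []
    for j in range(2 * n):
        x = (j + 0.5) / 2.0 - 0.5                # output centre in source coordinates
        f = math.floor(x)
        taps = [(i, lanczos3(x - i)) for i in range(f - 2, f + 4)]
        norm = sum(w for _, w in taps)
        value = sum(w * px(i) for i, w in taps) / norm
        if clamp:
            lo = min(px(f), px(f + 1))
            hi = max(px(f), px(f + 1))
            value = min(max(value, lo), hi)      # cut off overshoot and undershoot
        out.append(value)
    return out

edge = [0.0] * 8 + [1.0] * 8

plain = upscale2x(edge, clamp=False)
limited = upscale2x(edge, clamp=True)

print(f"plain   : min {min(plain):.4f}, max {max(plain):.4f}")
print(f"limited : min {min(limited):.4f}, max {max(limited):.4f}")
```

Without the clamp the edge overshoots and undershoots by roughly a tenth of the step height; with it, every output stays between 0 and 1, at the cost of flattening the values right next to the edge.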
Katie Boundary
28th February 2015, 18:57
What do you think an image becomes when you resize it to arbitrarily large dimensions?
"ugly" :p
If you want to talk about resampling
...which I don't... oh wait, actually I do. Let's go back to those original questions. What's so special about the b+2c=1 cubics?
Would you rather have an image that takes four times as many pixels for the same effective resolution plus some theoretical niceness properties, or an image that looks sharp and ringing-free, antialiasing be damned?
Well, I don't actually do huge upscales by more than a factor of 2. I mostly use resizers to correct the aspect ratios of anamorphic footage; to mix sources with different aspect ratios at a single intermediate aspect ratio (like warping 1.85:1 and 2.35:1 footage to 2.1:1 so they can be mixed); and to do crappy pseudo-IVTC with separatefields and selecteven. So I'd never be dealing with a situation more severe than a 78% increase in pixel count (720x240 to 640x480). I'll take the theoretical niceness properties, please.
Also, Hermite resampling is bad. It doesn't even correctly interpolate linear gradients, and surely you'd want a resampler to at least be able to handle gradients!
I'm aware of that, but the tradeoff is that it handles sharp edges better than bilinear. Sharp edges are far more common than perfectly linear gradients, so I'm willing to make that tradeoff. It's a no-op if the source and destination resolutions are the same (a very important quality to me) and it preserves wide-area color and brightness levels better than any of the ringing filters do. If for whatever reason I'm hell-bent on preserving linear gradients, I'll use bilinear for slight resizing and a cubic with values like b=0.5,c=0.25 for more extreme resizing.
Edit: on second thought, it occurs to me that the more extreme the upscale, the more valuable ringing becomes. I'm therefore considering the following system: bilinear for downscaling, Hermite or 1-lobe sinc for 1x-2x upscaling, Catrom or 2-lobe sinc for 2x-3x upscaling, 3-lobe sinc for 3x-4x upscaling, 4-lobe sinc for 4x-5x upscaling, and so on.
On the whole though, it seems that you've finally managed to understand the basics of sampling. Only took about a few weeks of hurfdurfing on everyone's part…
I understood sampling from the beginning, I just disagreed with the pretense of treating images like waveforms :)
Like I mentioned, there are two related but distinct senses of the word "linear", and you should be clear on which one you're referring to.
When I say "linear interpolation", assume I mean "drawing straight lines connecting the dots and figuring out where the interpolated points would fall on that line"
When I say "triangle filter", assume I have no idea what I'm talking about because I'm still fuzzy on that part.
foxyshadis
1st March 2015, 00:48
...which I don't... oh wait, actually I do. Let's go back to those original questions. What's so special about the b+2c=1 cubics?
The B/C formula overall is important because it can generate any continuous cubic filter. They're simple, they're fast, they're extremely easy to modify, and as in that illustration you posted earlier, the b+2c=1 variants around 1/3,1/3 just happen to fall right into a middle ground that's subjectively pleasant and easy to tweak. Therefore they've been studied and used extensively. Mathematically, it's nothing but the merging of two prior branches of research into one, and Mitchell-Netravali was just the simplification of it into an easily modifiable piecewise equation.
So it's historically special, but aside from tradition there's nothing keeping anyone from jumping off the line. (See Adobe, below.)
Well, I don't actually do huge upscales by more than a factor of 2. I mostly use resizers to correct the aspect ratios of anamorphic footage; to mix sources with different aspect ratios at a single intermediate aspect ratio (like warping 1.85:1 and 2.35:1 footage to 2.1:1 so they can be mixed); and to do crappy pseudo-IVTC with separatefields and selecteven. So I'd never be dealing with a situation more severe than a 78% increase in pixel count (720x240 to 640x480). I'll take the theoretical niceness properties, please.
He means that for properly band-limited images with that theoretical niceness, you'd need HD pixel resolution just to have apparent SD resolution. That's why shortcuts get taken and practical utility always wins.
I'm aware of that, but the tradeoff is that it handles sharp edges better than bilinear. Sharp edges are far more common than perfectly linear gradients, so I'm willing to make that tradeoff.
Of course you like Hermite, it's as close to nearest-neighbor as cubics get. :p What looks good probably varies enormously based on the material you're dealing with (old TV shows, HD movies, cartoons, Minecraft) and what you want the output to look like.
When I say "triangle filter", assume I have no idea what I'm talking about because I'm still fuzzy on that part.
A triangle filter is another name for what you call a linear filter, you've got it.
For some bizarre unworldly reason some Photoshop users like to resize in multiple steps (https://en.wikipedia.org/wiki/Stairstep_interpolation), which is naught more than a waste of CPU cycles. It doesn't hurt (if you don't mind the blur), but it doesn't help either.
I don't think anyone does stepped bilinear, they use stepped bicubic because the fake sharpness from the negative lobes increases with each iteration. Obviously, ringing stacks up too. Now that Photoshop has a more complicated bicubic sharper function, I don't think anyone does that anymore. Adobe cubics are all weird too, with B=0 and a high C (http://entropymine.com/resamplescope/notes/photoshop/), probably because their users demanded MOAR SHARPNESS over the years and didn't care about ringing. (Note that in IM's nomenclature, a filter's "blur" is how many times wider it is than standard, pulling in more or fewer nearby pixels; Photoshop must use it to compensate for the high aliasing of B=0.)
I don't know why they didn't just add a proper Lanczos alternative all along.
Katie Boundary
1st March 2015, 01:29
The B/C formula overall is important because it can generate any continuous cubic filter.
But aren't all cubics continuous?
band-limited images
:confused:
Of course you like Hermite, it's as close to nearest-neighbor as cubics get. :p
Actually I like it because it's as close to trapezoidal as any of these filters get, and grid resizing takes the form of a trapezoidal filter at moderate scaling factors :)
A triangle filter is another name for what you call a linear filter, you've got it.
Are you sure one isn't just a special case of the other like point resizing is to a box filter? Because this whole business with the triangular filter using THREE source pixels to calculate the value of the center pixel is very unlike how linear interpolation was first explained to me, and what you're saying contradicts what others are saying.
EDIT: I busted out my best crayons and let my inner five-year-old go to work on the problem, and this is what I came up with...
http://img.photobucket.com/albums/v415/DWJohnson/resample_zpszzx6dk5k.jpg
The little white lines going up from the x-axis mark the borders between pixels in the original 7x1 image. The little white lines going down from the x-axis mark the borders between pixels in the new 5x1 image. Blue dots represent the "sample points" from which the "waveform" is interpolated. The blue line is the interpolated "waveform" itself. Green dots are the new, interpolated samples. Using this method, the green dot in the center should stay at exactly 250 brightness and not be affected by either of the neighboring blue dots. It has a triangle shape. I am therefore inclined to believe that Imageworsener is in fact using a wider triangle than it should be using, unless the triangle in this drawing has absolutely nothing to do with where the term "triangular filter" comes from.
Edit 2: I busted out my red crayons and drew what I THINK is the un-normalized trapezoidal filter that corresponds to "pixel mixing" or grid overlay resizing by this scale factor. Tell me if I got it right:
http://img.photobucket.com/albums/v415/DWJohnson/resample2_zpswfxs5k4d.jpg~original
Katie Boundary
1st March 2015, 07:26
wait wait wait wait. Now I see why I was getting confused. Using strict two-tap linear interpolation, the pixels in the 7x1 bitmap that have 0% brightness also have 0% influence on the middle pixel of the 5x1 bitmap, and the pixel in the 7x1 bitmap that has nearly 100% brightness also has 100% influence on the middle pixel of the 5x1 bitmap, so the shape of the filter and the shape of the waveform end up being nearly congruent basically by coincidence. More generally speaking, when downsizing, some line segments in an interpolated waveform won't have any interpolated samples fall on them at all, and this can cause weird aliasing effects, so people would rather use a triangular filter with a wider "support" instead of actual linear interpolation. I still have no idea why that latter method is called "linear" resizing, but holy crap everything makes so much more sense now.
EDIT: Blue is what the waveform looks like with two-tap linear interpolation, green is what it looks like after being run through a triangular filter that has been expanded by the scaling factor, and red is what it looks like to a grid overlay:
http://img.photobucket.com/albums/v415/DWJohnson/resample3_zpskfocmf5j.png
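The three curves described above can be compared numerically. The following is only a minimal Python sketch, not anything taken from a real resizer: it assumes the 7x1 test row is a single 250-brightness pixel surrounded by zeros, that pixel centres are aligned the usual way, and that filter taps falling outside the row are simply dropped. The function names are made up for the illustration.

```python
from fractions import Fraction as F

SRC = [0, 0, 0, 250, 0, 0, 0]   # assumed 7x1 test row: one bright pixel in the middle
DST_W = 5
SCALE = F(len(SRC), DST_W)      # 7/5 source pixels per destination pixel


def centre(j):
    """Centre of destination pixel j, in source-pixel coordinates."""
    return (j + F(1, 2)) * SCALE - F(1, 2)


def triangle_resize(src, width):
    """Triangle filter of half-width `width`, normalised per output pixel.

    width=1 is plain two-tap linear interpolation. Taps outside the row
    are dropped, which is an assumed edge rule.
    """
    out = []
    for j in range(DST_W):
        c = centre(j)
        w = [max(F(0), 1 - abs(i - c) / width) for i in range(len(src))]
        out.append(sum(wi * s for wi, s in zip(w, src)) / sum(w))
    return out


def pixel_mix(src):
    """Grid overlay: weight each source pixel by its overlap with the output pixel."""
    out = []
    for j in range(DST_W):
        lo, hi = j * SCALE, (j + 1) * SCALE      # output pixel edges
        acc = F(0)
        for i, s in enumerate(src):
            overlap = min(hi, F(i + 1)) - max(lo, F(i))
            if overlap > 0:
                acc += overlap * s
        out.append(acc / SCALE)
    return out


linear = triangle_resize(SRC, F(1))      # blue: two-tap linear interpolation
widened = triangle_resize(SRC, SCALE)    # green: triangle expanded by 1.4
mixed = pixel_mix(SRC)                   # red: grid overlay / pixel mixing
for name, row in (("two-tap linear", linear), ("triangle x1.4", widened),
                  ("pixel mixing", mixed)):
    print(f"{name:15s}", [round(float(v), 2) for v in row])
```

Under these assumptions the centre pixel stays at 250 for two-tap interpolation, drops to 250*7/11 for the widened triangle, and lands on 250*5/7 for pixel mixing.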
colours
1st March 2015, 10:57
But aren't all cubics continuous?
All cubic functions also have infinite support (https://en.wikipedia.org/wiki/Support_%28mathematics%29), and tend to infinity at the endpoints. So why doesn't cubic interpolation diverge?
That's because cubic interpolation uses a piecewise cubic function for the kernel, not just one cubic function; we just call it "cubic" instead of "piecewise cubic" because that's a mouthful. "Continuity" here refers to the endpoints of adjacent pieces having the same value.
The space of piecewise cubic functions has infinitely many degrees of freedom, so we need to set some reasonable restrictions. First, we require that the function is even; i.e. k(x) = k(−x). Second, we require that it is identically zero outside of the interval [−2,2]. Third, we require that k(x) is a cubic function within the intervals (0,1) and (1,2).
A cubic function (https://en.wikipedia.org/wiki/Cubic_function) has four coefficients, and since k(x) is defined by two cubic functions, this leaves us with eight degrees of freedom. Keys provided a one-parameter family of such (piecewise) cubic kernels in 1981 (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1163711); note that ImageMagick's documentation (http://imagemagick.org/Usage/filter/#cubics) is actually wrong and attributes the wrong family of cubic kernels to Keys. Keys's one-parameter family further adds the restrictions of continuity (at x=1, x=2), continuous differentiability (at x=0, x=1, x=2), and interpolation (k(0)=1, k(1)=0). Eight variables, seven independent equations, so that leaves one degree of freedom.
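For reference, solving those seven equations leaves the following form, written here with the free parameter called a (the symbol is a notational assumption for this note; the constraints themselves are the ones listed above):

```latex
k(x) =
\begin{cases}
(a+2)\,|x|^3 - (a+3)\,|x|^2 + 1, & |x| \le 1, \\
a\,|x|^3 - 5a\,|x|^2 + 8a\,|x| - 4a, & 1 < |x| < 2, \\
0, & \text{otherwise.}
\end{cases}
```

A quick check against the constraints: k(0) = 1; both pieces give k(1) = 0 and k'(1) = a; the outer piece gives k(2) = 0 and k'(2) = 0; and k'(0) = 0 because the inner piece has no linear term. In Avisynth's b/c notation this should correspond to b = 0, c = -a.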
The Catmull-Rom kernel is part of Keys's family and just to make things more confusing, it also corresponds to this thing known as cubic Hermite spline interpolation (https://en.wikipedia.org/wiki/Cubic_Hermite_spline) where the derivatives are given by centred finite differences (https://en.wikipedia.org/wiki/Numerical_differentiation#Finite_difference_formula). The Hermite kernel corresponds to assuming all the derivatives are zero instead of assigning some sane value. (Actually, the same section of the IM documentation also wrongly calls the Hermite kernel part of the b-spline family, when it's actually a tensioned cardinal spline. Ho hum.)
Mitchell and Netravali (http://www.cs.utexas.edu/users/fussell/courses/cs384g/lectures/mitchell/Mitchell.pdf) provided a two-parameter family that kept the aforementioned restrictions of continuity and continuous differentiability, but loosened the interpolation condition. Instead of requiring the interpolatory property, they only required that k convolved with any constant discrete signal results in a constant continuous signal with the same value. Let's call this constant reconstruction.
(Aside: for kernels which do not possess the constant reconstruction property, such as Lanczos, we normally apply normalisation so that the weights add up to 1 anyway. As such, constant reconstruction is not a strictly required property of the kernel, but having this property does allow simpler theoretical analysis. Note also that every kernel which possesses the constant reconstruction property even when expanded is bandlimited, and thus cannot have finite support.)
In the Mitchell-Netravali family of cubic kernels, setting b=0 recovers Keys's family. Any positive value of b will lead to blurring, even if you sample at exactly the original sample locations, and likewise any negative value of b will lead to sharpening. Avisynth's resizing filters skip resampling as an optimisation if the output width (or height) is exactly the same as the input and there's no shift, which leads to the funny and possibly unexpected result of BicubicResize and GaussResize not being continuous in the src_width/src_height values. Debicubic (http://avisynth.nl/index.php/Debicubic) does not use this optimisation, which might be a further source of confusion.
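To see why b alone controls blurring at the original sample locations, it helps to evaluate the kernel at the integers. Assuming the standard Mitchell-Netravali polynomial (not reproduced in this thread), the only non-zero taps there are:

```latex
k(0) = 1 - \frac{b}{3}, \qquad k(\pm 1) = \frac{b}{6}, \qquad k(\pm 2) = 0,
\qquad\text{so}\qquad
y[n] = \frac{b}{6}\,x[n-1] + \Bigl(1 - \frac{b}{3}\Bigr)\,x[n] + \frac{b}{6}\,x[n+1].
```

With b = 0 this is the identity; with b > 0 it is a small three-tap low-pass (blur); with b < 0 the side taps go negative, which sharpens. The value of c drops out entirely at integer positions.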
Other than constant reconstruction, we can also consider exact reconstruction of sampled linear gradients (i.e. affine functions (https://en.wikipedia.org/wiki/Affine_transformation)). The cubic b-spline (b=1, c=0) and Catmull-Rom (b=0, c=0.5) are both easily proven to have this gradient reconstruction property, so we can interpolate between the two to get the relation b+2c=1, which turns out to be exactly the constraint needed for gradient reconstruction.
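The two reconstruction properties can be checked with exact arithmetic. This is a minimal Python sketch assuming the standard Mitchell-Netravali polynomial; `mn_kernel`, `reconstruct` and `gradient` are hypothetical names for this illustration, and the test signal 3n+2 is arbitrary.

```python
import math
from fractions import Fraction as F


def mn_kernel(x, b, c):
    """Mitchell-Netravali piecewise cubic k(x); zero outside (-2, 2)."""
    x = abs(x)
    if x < 1:
        return ((12 - 9*b - 6*c) * x**3 + (-18 + 12*b + 6*c) * x**2 + (6 - 2*b)) / 6
    if x < 2:
        return ((-b - 6*c) * x**3 + (6*b + 30*c) * x**2
                + (-12*b - 48*c) * x + (8*b + 24*c)) / 6
    return 0


def reconstruct(f, pos, b, c):
    """Filter the unit-spaced samples f(n) with the kernel and evaluate at pos."""
    n0 = math.floor(pos)
    # Only the four nearest samples can fall inside the kernel's support.
    return sum(f(n) * mn_kernel(pos - n, b, c) for n in range(n0 - 1, n0 + 3))


def gradient(n):
    return 3 * n + 2            # an arbitrary affine test signal


pos = F(1, 4)
for name, b, c in (("B-spline", F(1), F(0)), ("Catmull-Rom", F(0), F(1, 2)),
                   ("Mitchell", F(1, 3), F(1, 3)), ("Hermite", F(0), F(0))):
    error = reconstruct(gradient, pos, b, c) - gradient(pos)
    print(f"{name:12s} b+2c = {b + 2*c}  error at x=1/4: {error}")
```

The first three kernels sit on the b+2c=1 line and should reproduce the gradient exactly; Hermite (b=0, c=0) is off the line and should not, even though it still reproduces constants.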
</blog>
Katie Boundary
1st March 2015, 13:58
I hate it when the documentation is wrong.
Regardless, where does the b+2c=1 formula come from, and what does the avisynth documentation mean by "numerical accuracy"?
colours
1st March 2015, 14:27
I hate it when the documentation is wrong.
It's not wrong in the sense that it's wrongly documenting what IM is doing, it's wrong in the sense that some of the exposition in it unrelated to IM is incorrect. A far lesser evil than outright deceptive documentation, I'd say.
Regardless, where does the b+2c=1 formula come from
[b+2c=1] turns out to be exactly the constraint needed for gradient reconstruction.
Reading comprehension, grumblegrumble.
and what does the avisynth documentation mean by "numerical accuracy"?
For a twice-differentiable function sampled at an interval h, using a cubic kernel from the Mitchell-Netravali family with b+2c=1 converges most quickly (O(h²)) to the original function as the sampling interval h tends to zero. Cubic kernels not satisfying b+2c=1 only converge at O(h). This is not very relevant for real use cases because the image signal is usually sampled only once with a fixed sampling interval, but what is relevant is that the fast convergence criterion is exactly the same as the gradient reconstruction constraint.
Katie Boundary
1st March 2015, 18:16
the linear gradient thing again
Well yeah but I mean... that's it? It has no other properties?
For a twice-differentiable function sampled at an interval h, using a cubic kernel from the Mitchell-Netravali family with b+2c=1 converges most quickly (O(h²)) to the original function as the sampling interval h tends to zero. Cubic kernels not satisfying b+2c=1 only converge at O(h).
I have absolutely no idea what that means. The only OH that I'm familiar with is peroxide.
colours
1st March 2015, 19:18
Said "other properties" would be… exactly the next two sentences you quoted. This is the most ELI5-style explanation I can come up with:
Say you have some sufficiently smooth signal. You can do this thing: sample with a sampling interval h to get a discrete signal, then filter it with a specified cubic kernel to recover a continuous signal.
What b+2c=1 guarantees is that, as you decrease h, the error between the reconstruction and the original function goes to zero as fast as possible among all possible cubic kernels in the Mitchell-Netravali family. If b+2c is not 1, the error only decreases at most a constant times faster than using a box filter (i.e. nearest neighbour interpolation).
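The sample-then-reconstruct procedure is easy to try numerically. The sketch below is only an illustration under stated assumptions: it uses the standard Mitchell-Netravali polynomial, sin(x) as the "sufficiently smooth signal", and an arbitrary test interval [0.5, 1.5]; `mn_kernel` and `max_error` are made-up helper names.

```python
import math


def mn_kernel(x, b, c):
    """Mitchell-Netravali piecewise cubic k(x); zero outside (-2, 2)."""
    x = abs(x)
    if x < 1:
        return ((12 - 9*b - 6*c) * x**3 + (-18 + 12*b + 6*c) * x**2 + (6 - 2*b)) / 6
    if x < 2:
        return ((-b - 6*c) * x**3 + (6*b + 30*c) * x**2
                + (-12*b - 48*c) * x + (8*b + 24*c)) / 6
    return 0.0


def max_error(f, h, b, c, lo=0.5, hi=1.5, points=4001):
    """Largest |reconstruction - f| over a fine grid of test positions."""
    worst = 0.0
    for i in range(points):
        x = lo + (hi - lo) * i / (points - 1)
        u = x / h                          # position measured in sample intervals
        n0 = math.floor(u)
        # Sample f at spacing h, then filter with the cubic kernel.
        approx = sum(f(n * h) * mn_kernel(u - n, b, c) for n in range(n0 - 1, n0 + 3))
        worst = max(worst, abs(approx - f(x)))
    return worst


for name, b, c in (("on the line  (1/3, 1/3)", 1 / 3, 1 / 3),
                   ("off the line (0, 1)", 0.0, 1.0)):
    e1 = max_error(math.sin, 0.02, b, c)
    e2 = max_error(math.sin, 0.01, b, c)
    print(f"{name}: error {e1:.2e} -> {e2:.2e}, ratio {e1 / e2:.2f}")
```

If the claim holds, halving h should cut the error by roughly a factor of 4 for the kernel on the b+2c=1 line and only by roughly 2 for the one off it.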
It'll be kind of difficult to discuss this without at least a basic knowledge of some calculus, because that's what all the theory builds off of.
And you're familiar with peroxide, but not, say, water or alcohol? :rolleyes:
Katie Boundary
1st March 2015, 20:12
So the b+2c=1 family is most accurate when dealing with waveforms. Got it. Thanks. :)
In other news, it looks like the original 5/7 experiment would have worked perfectly if the size of the filter had been expanded, not by 40%, but by 24.5%. I don't know if this would have yielded perfect fairness for other images, though.
And you're familiar with peroxide, but not, say, water or alcohol? :rolleyes:
Water puts the letters in a different order and alcohols usually have a C in there somewhere :)
colours
2nd March 2015, 01:20
So the b+2c=1 family is most accurate when dealing with waveforms. Got it.
No, you don't get it. You have this false dichotomy of "waveform" versus "discontinuous signal" in your mind. It is perfectly valid to have a discontinuous signal defined over the real numbers, or in the case of images, defined over the Euclidean plane. When you sample the signal, you don't care about whether it's continuous or continuously differentiable or whatever, you just sample the darned thing.
Unless you're taking "waveforms" to mean specifically the smooth signals instead of just being another synonym for "signals", in which case, yes, you're right, but your terminology is confusing.
Also note that sampling signals which are not sufficiently smooth will lead to aliasing, because step discontinuities require arbitrarily high frequencies to reproduce. In other words, either you smooth the step discontinuities first (in which case you end up with a smooth signal, or "waveform"), or you just leave it alone and get lots of aliasing.
In other news, it looks like the original 5/7 experiment would have worked perfectly if the size of the filter had been expanded, not by 40%, but by 24.5%. I don't know if this would have yielded perfect fairness for other images, though.
Worked "perfectly"? How so?
(I don't have the time to work out the equations and such at the moment; I might or might not get back to this later.)
Katie Boundary
2nd March 2015, 05:25
Unless you're taking "waveforms" to mean specifically the smooth signals instead of just being another synonym for "signals", in which case, yes, you're right, but your terminology is confusing.
In this context, I mean "a curved line showing the peaks and troughs of a sound, radio, electrical, ocean, or other wave, or a set of samples representing such a curved line".
Obviously not using the geometric definition of "line", but whatever.
Worked "perfectly"? How so?
It would have yielded a center pixel with a brightness of 178.57143
colours
3rd March 2015, 04:20
Your "24.5%" figure is wrong, and the exact value is 1/4 = 25%. It's fine to play with numbers to get a feel for what the rough values should be, but you should try working out the exact rational values whenever possible.
This is, of course, preconditioned on the centre source pixel being "fairly" weighted in the destination. As before, call the seven source pixels x[0], x[1], … , x[6], and since we'll also be sampling outside this range, let x[-1] and x[7] be some values. Consider filtering our seven-sample signal with a triangular kernel expanded by a factor of 1.25 (i.e. k(z) = max(0, 1-|z|/1.25)), normalising, then sampling.
x[-1] and x[7] each contribute 1/31 weight.
x[0] and x[6] each contribute 21/31 weight.
x[1] and x[5] each contribute 9/31 + 13/30 = 673/930 weight.
x[2] and x[4] each contribute 17/30 + 1/7 = 149/210 weight.
x[3] contributes 5/7 weight.
If we take periodic extension (x[-1] = x[6] and x[7] = x[0]) as our boundary conditions, this changes x[0] and x[6]'s contributions to be 22/31 each.
Note how 22/31 = 5/7-1/217, 673/930 = 5/7+61/6510 and 149/210 = 5/7-1/210 are all not 5/7.
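The weights listed above can be reproduced with exact rational arithmetic. A minimal Python sketch, assuming the usual pixel-centre alignment (destination sample j sits at (j + 1/2) * 7/5 - 1/2 in source coordinates) and no periodic extension; the names are made up for the illustration:

```python
from collections import defaultdict
from fractions import Fraction as F

SRC_W, DST_W = 7, 5
STEP = F(SRC_W, DST_W)          # destination sample spacing, in source pixels
WIDTH = F(5, 4)                 # triangle expanded by a factor of 1.25


def tri(z):
    """Expanded triangular kernel, zero beyond |z| = WIDTH."""
    return max(F(0), 1 - abs(z) / WIDTH)


totals = defaultdict(F)         # source index -> summed normalised weight
for j in range(DST_W):
    c = (j + F(1, 2)) * STEP - F(1, 2)            # centre of destination sample j
    raw = {i: tri(i - c) for i in range(-1, SRC_W + 1)}
    norm = sum(raw.values())                       # make each destination sum to 1
    for i, w in raw.items():
        if w:
            totals[i] += w / norm

for i in sorted(totals):
    print(f"x[{i}]: {totals[i]}")
```

Because every destination sample is normalised to 1, the nine totals have to add up to 5, which is a handy sanity check on the fractions.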
There're actually two sides to this concept of "fairness". One is as mentioned in the ImageWorsener article, that the sum of weights contributed by each source sample is the destination width divided by the source width (5/7 in this example). (Assuming that we're resampling only horizontally.) The other is that the sum of weights in each destination sample is 1.
This is exactly the kind of problem Fourier analysis handles almost trivially. Unfortunately, you're probably not going to understand this since apparently you don't (yet?) have high school calculus under your belt. Long story short, if your resampling kernel is independent of the resampling factor and is also identically zero outside of some finite interval, then it cannot simultaneously be fair in both senses. (In fact, I've already mentioned this.)
The latter meaning of fairness is actually also something I've already mentioned; it's exactly the concept of constant reconstruction. This is also mentioned in Avery Lee's (VirtualDub's developer) blog post on how to make a resampling filter (http://virtualdub.org/blog/pivot/entry.php?id=86). If we decide to stick to a fixed kernel, we obviously would prefer to have destination fairness instead of source fairness.
While a triangular filter expanded by a factor of 1.4 in the 7-to-5 example would be "source-fair", it's not "destination-fair", so when we normalise the blending weights, the source fairness property gets destroyed in the process. Conversely, when dealing with a "destination-fair" filter that's not "source-fair", we can also normalise the source weights, but this leads to ugly position-dependent brightness variations in the output so we don't do this.
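The trade-off is visible in the raw numbers for the 7-to-5 case. This is a minimal Python sketch under the same pixel-centre alignment assumption, with taps outside the seven source pixels dropped (so the two edge pixels are not expected to come out fair); all names are hypothetical.

```python
from fractions import Fraction as F

SRC_W, DST_W = 7, 5
STEP = F(SRC_W, DST_W)
WIDTH = STEP                    # triangle expanded by the scale factor, 1.4


def tri(z):
    """Expanded triangular kernel, zero beyond |z| = WIDTH."""
    return max(F(0), 1 - abs(z) / WIDTH)


centres = [(j + F(1, 2)) * STEP - F(1, 2) for j in range(DST_W)]
raw = [[tri(i - c) for i in range(SRC_W)] for c in centres]    # raw[j][i]

# Un-normalised sums: per destination sample, and per source sample.
dest_sums = [sum(row) for row in raw]
src_sums = [sum(raw[j][i] for j in range(DST_W)) for i in range(SRC_W)]

# Per-source totals after each destination sample is normalised to sum to 1.
norm_src_sums = [sum(raw[j][i] / dest_sums[j] for j in range(DST_W))
                 for i in range(SRC_W)]

print("destination sums (raw):  ", [str(v) for v in dest_sums])
print("source sums (raw):       ", [str(v) for v in src_sums])
print("source sums (normalised):", [str(v) for v in norm_src_sums])
```

Before normalisation the interior source pixels all carry the same total weight, but the destination sums differ (9/7 versus 11/7). Normalising the destinations fixes that and leaves the source totals unequal.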
Katie Boundary
4th March 2015, 00:37
Your "24.5%" figure is wrong, and the exact value is 1/4 = 25%. It's fine to play with numbers to get a feel for what the rough values should be, but you should try working out the exact rational values whenever possible.
That's exactly what I did. I must have made an error somewhere :(
This is, of course, preconditioned on the centre source pixel being "fairly" weighted in the destination. As before, call the seven source pixels x[0], x[1], … , x[6], and since we'll also be sampling outside this range, let x[-1] and x[7] be some values. Consider filtering our seven-sample signal with a triangular kernel expanded by a factor of 1.25 (i.e. k(z) = max(0, 1-|z|/1.25)), normalising, then sampling.
x[-1] and x[7] each contribute 1/31 weight.
x[0] and x[6] each contribute 21/31 weight.
x[1] and x[5] each contribute 9/31 + 13/30 = 673/930 weight.
x[2] and x[4] each contribute 17/30 + 1/7 = 149/210 weight.
x[3] contributes 5/7 weight.
lolwut? no. Only three pixels would have any weight. The middle one would be 5/7, its immediate neighbors would be 1/7 each, and the rest would be zero.
you don't (yet?) have high school calculus under your belt
Calculus is a college thing. I did very well in math right up through and including trig, but then brickwalled at calculus.
colours
4th March 2015, 01:31
lolwut? no. Only three pixels would have any weight. The middle one would be 5/7, its immediate neighbors would be 1/7 each, and the rest would be zero.
I'm not looking at only the middle destination sample; I'm looking at all five of them.
Calculus is a college thing. I did very well in math right up through and including trig, but then brickwalled at calculus.
I don't know about you, but over here we study things like the epsilon-delta definition of limits in high school. You can't call what you're doing "math" unless it's abstract nonsense; it's no different from plain arithmetic otherwise.