Thrashing out RGB <-> YUY2 issues


SansGrip
1st November 2002, 02:27
As part of my quest to understand just what the heck is going on with pixel value ranges I've been reading a variety of conflicting reports about RGB <-> YUY2 conversion.

Issue one:

Almost every document I've read says that the legal CCIR-601 range for YCbCr is 16-235 for luma and 16-240 for chroma, except for a couple of "explanations" of TMPGEnc's settings, which claim that the range begins at 8. The tooltip for the TMPGEnc "Output YUV data as Basic YCbCr not CCIR601" setting says "Outputs YUV data as Basic YCbCr, not CCIR601. Then, Y range of MPEG YCbCr is not 8 - 235 but 0 - 255. Since DV format movie has already recorded as CCIR601, better result can be expected if this option is enabled."

This seems to imply that, if this option is checked, TMPGEnc converts the RGB input to YCbCr without altering the 0-255 range. Presumably if this option isn't checked it scales to 8-235 during conversion. But why 8 and not 16?
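A minimal sketch of the scaling in question, assuming a plain linear mapping between the full 0-255 range and the CCIR-601 luma range. The function names are made up for illustration; TMPGEnc's actual arithmetic is not known and may differ.

```python
def full_to_studio_luma(y_full: int) -> int:
    """Map a full-range luma code (0-255) onto the CCIR-601 range (16-235)."""
    return round(16 + y_full * 219 / 255)


def studio_to_full_luma(y_studio: int) -> int:
    """Inverse mapping; clamped so out-of-range input cannot leave 0-255."""
    y = round((y_studio - 16) * 255 / 219)
    return max(0, min(255, y))


print(full_to_studio_luma(0), full_to_studio_luma(255))   # 16 235
print(studio_to_full_luma(16), studio_to_full_luma(235))  # 0 255
```

A lower bound of 8 instead of 16 would simply change the offset and the span in the first function, which is why the tooltip's "8 - 235" looks odd next to the published 16-235.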

Edit: At least when using the ReadAVS VFAPI plugin, TMPGEnc scales to 16-235, not 8-235. It looks like the tooltip is wrong. Unchecked is always the correct setting with ReadAVS. See later in the thread for my post (http://forum.doom9.org/showthread.php?s=&threadid=37082&perpage=20&pagenumber=2#post204385) about this.

Issue two:

While the majority of Avisynth filters operate in YUY2 space, it's not clear which of them are guaranteed to produce 16-235/240 output, or even if they should only produce that range. It certainly seems that mpeg2dec passes out-of-range pixels through without clipping, even though these pixels seem always to be compression artifacts.

It could be argued that each filter implementing its own 16-235 clamp is not only inefficient but also leads to "rounding" errors where colours get saturated at each end. Might it not be preferable for all filters to operate on the full 0-255 range and to clamp at the end of the filter run, perhaps by the core itself?

Should it be impossible to produce 0-255 YUY2 with Avisynth? It certainly looks as though Levels will not produce values outside 16-235, even given arguments such as:

Levels(16, 1, 235, 0, 255)

Unless I'm mistaken, common sense would dictate that this should take a 16-235 range and scale it to 0-255, but in fact the output remains within 16-235.
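To make the expectation concrete, here is a rough sketch of a levels mapping with an optional CCIR clamp on the output. It follows the usual input-range / gamma / output-range formula; the clamp_ccir flag and the exact order of operations are assumptions for illustration, not Avisynth's actual source.

```python
def levels(y, in_low, gamma, in_high, out_low, out_high, clamp_ccir=False):
    """Remap one luma value; optionally clamp the result to 16-235."""
    t = (y - in_low) / (in_high - in_low)
    t = max(0.0, min(1.0, t))                      # keep the power well defined
    out = round(t ** (1.0 / gamma) * (out_high - out_low) + out_low)
    out = max(0, min(255, out))                    # 8-bit storage limit
    if clamp_ccir:
        out = max(16, min(235, out))               # the behaviour observed in YUY2
    return out


# Expected: 16 -> 0 and 235 -> 255
print(levels(16, 16, 1.0, 235, 0, 255), levels(235, 16, 1.0, 235, 0, 255))
# Observed: the result is pulled back to 16 and 235
print(levels(16, 16, 1.0, 235, 0, 255, True), levels(235, 16, 1.0, 235, 0, 255, True))
```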

Issue three:

ConvertToRGB scales from 16-235 to 0-255 during conversion. I would argue there should be a means to disable this in order to reduce the number of transformations applied to the material. For example, if frameserving to an application that expects RGB in the CCIR-601 range, it is necessary to perform a Levels after the conversion to recompress to 16-235. This also applies to TMPGEnc, which I assume (at the moment anyway) has an incorrect "8-235" scaling algorithm; it seems this must be worked around by checking the aforementioned option and feeding it RGB already in the 16-235 range.
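A sketch of what "conversion with and without scaling" could look like, built from the standard 601 luma weights (0.299, 0.587, 0.114). The expand switch is hypothetical; it is the kind of option being asked for, not an existing ConvertToRGB parameter.

```python
KR, KB = 0.299, 0.114          # 601 luma weights for red and blue
KG = 1.0 - KR - KB             # green weight (0.587)


def ycbcr_to_rgb(y, cb, cr, expand=True):
    """Convert one studio-range YCbCr pixel to 8-bit R'G'B'.

    expand=True  -> RGB in 0-255 (the scaling ConvertToRGB is described as doing)
    expand=False -> RGB kept in 16-235 (no range change)
    """
    yn = (y - 16) / 219        # luma normalised to 0..1
    pb = (cb - 128) / 224      # chroma normalised to -0.5..0.5
    pr = (cr - 128) / 224
    r = yn + 2 * (1 - KR) * pr
    b = yn + 2 * (1 - KB) * pb
    g = (yn - KR * r - KB * b) / KG
    scale, offset = (255, 0) if expand else (219, 16)
    return tuple(max(0, min(255, round(offset + scale * c))) for c in (r, g, b))


print(ycbcr_to_rgb(16, 128, 128), ycbcr_to_rgb(235, 128, 128))
print(ycbcr_to_rgb(16, 128, 128, expand=False), ycbcr_to_rgb(235, 128, 128, expand=False))
```

With expand=False no Levels call is needed afterwards to get back to 16-235, which saves one rounding step.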

Issue four:

This document (http://www.intersil.com/data/an/an9/an9717/an9717.pdf) states "YCbCr is also defined to have been derived from gamma-corrected RGB (R'G'B') data." It also says "NTSC video is pre-corrected using a gamma of 2.2."

I could write everything I know about gamma with a thick marker on a cigarette paper, but if I'm reading this correctly one might need to "uncorrect" the gamma of NTSC video to display on a PC monitor. While IIRC the default Windows gamma is 2.2, the default Macintosh gamma (according to Photoshop) is 1.8. Er...?

Do the ConvertTo* filters do gamma correction also? Is the target of the data (e.g. TMPGEnc) doing gamma correction when it converts to YV12?

Well, those are some of the issues I have. I've looked and looked but have found no concise explanation for all this, nor do I seem to be able to contact the makers of TMPGEnc without buying the Plus version. Forgive me if this has been addressed elsewhere.

Thanks for listening ;).

WarpEnterprises
1st November 2002, 23:15
May I add something I noticed:
Given a video with a range more than 16-240 (e.g. TMPG checked), mpeg2dec is not able to read this back completely, whereas DVD2AVI.VFP with setting "TV" reads it back without clipping.

(this of course is no answer but more a further strangeness)

High Speed Dubb
1st November 2002, 23:37
Those are a lot of difficult questions...

On issue 1 (TMPGEnc quirks) - I really don't know. With a noisy source there's some justification for going outside the [16, 235] range, since that prevents bias around the ends of the range. Maybe that's what they're aiming for. Or maybe the extra 8 help with inaccuracies in the encoding of the picture.

Issue 2 (color range for intermediate processing) - Yes, it would make some sense to process with the full 8 bits, and to correct at the end. There are a few cases (like gamma) for which this would need to be handled carefully, but it would generally improve image quality somewhat.

Technically there is also some reason to allow a [0,255] range with Y(black) = 0 - in other words, ignoring the spec range. If you have your overlay set to expect the full range, you can get perfectly good output that way. Of course, that assumes all the filters are able to properly handle data in which Y(black) = 0.

We had a long, drawn out debate on this on the DScaler list at some point. I think we decided to use the limited range so that we wouldn't need to change the range in software (remember that DScaler runs in real time), so that people wouldn't need to use separate overlay settings for DScaler alone, and so that we could deal with a single expected color range in all the internal processing steps.

Issue 3 (RGB range) - If you're outputting to RGB that makes sense. But if you'll be transforming back to YUY2 then you're better off using as many available bits as possible.

Issue 4 (Gamma correction) - The important gamma to worry about is the gamma of the display device. If you don't know what display is going to be used, then you should stick with the NTSC/PAL standards. If you know the monitor/viewing environment you're going to be using, then you can compensate properly.

I don't think the ConvertTo* filters need to account for gamma - like you quoted, YCbCr is defined to have been derived from gamma-corrected RGB (R'G'B') data. So you can use a linear transform to get back to R'G'B'.
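As an illustration of that last point, a sketch of the forward transform using the standard 601 weights. It is a weighted sum plus offsets, applied directly to gamma-corrected R'G'B', with no power function anywhere. The function name is illustrative and this is not the ConvertTo* source.

```python
KR, KB = 0.299, 0.114          # 601 luma weights for red and blue
KG = 1.0 - KR - KB             # green weight (0.587)


def rgb_to_ycbcr(r, g, b):
    """Convert gamma-corrected 8-bit R'G'B' (0-255) to studio-range YCbCr."""
    rn, gn, bn = r / 255, g / 255, b / 255
    yn = KR * rn + KG * gn + KB * bn          # luma: a plain weighted sum
    pb = (bn - yn) / (2 * (1 - KB))           # blue-difference, -0.5..0.5
    pr = (rn - yn) / (2 * (1 - KR))           # red-difference, -0.5..0.5
    return round(16 + 219 * yn), round(128 + 224 * pb), round(128 + 224 * pr)


print(rgb_to_ycbcr(0, 0, 0))        # (16, 128, 128)
print(rgb_to_ycbcr(255, 0, 0))      # (81, 90, 240)
```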

SansGrip
2nd November 2002, 00:17
Given a video with a range more than 16-240 (e.g. TMPG checked), mpeg2dec is not able to read this back completely, whereas DVD2AVI.VFP with setting "TV" reads it back without clipping.

What do you mean by "read this back completely"? I've found that mpeg2dec passes through out-of-range values without alteration, at least from .vob sources.

I'm guessing the reason TV scale -> VFP shows a difference is because it scales to 16-235 during the transformation to RGB instead of clamping it. The probable outcome of this is that, for sources already in CCIR range (aside from encoding artifacts), the levels will then be incorrect. For example, what are meant to be solid black frames will have an average luma of about 30.
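The "about 30" figure can be checked with a little arithmetic, assuming the scaling is the plain linear 0-255 to 16-235 compression (the helper name is made up):

```python
def compress_to_studio(y: int) -> int:
    """Linearly squeeze a 0-255 value into 16-235."""
    return round(16 + y * 219 / 255)


# A source that is already CCIR: black = 16, white = 235
print(compress_to_studio(16))    # 30  (black is lifted)
print(compress_to_studio(235))   # 218 (white is pulled down)
```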

You can check what range you're getting from a filter with RangeInfo (see my sig). I find it quite useful when something looks wrong with the levels.

SansGrip
2nd November 2002, 00:36
Those are a lot of difficult questions...

I'm glad it's not just me being dumb ;).

With a noisy source there's some justification for going outside the [16, 235] range, since that prevents bias around the ends of the range.

Perhaps. An offset of 8 seems a lot, though, if your source is already properly scaled. What's more, I've read that most DVD players and TV-out cards clamp the range anyway, so you end up with bias regardless if that's your output device. It seems strange not to have standard CCIR-601 as an option.

I'm considering buying the Plus version (even though I don't need it) just to email tech support :).

Yes, it would make some sense to process with the full 8 bits, and to correct at the end. There are a few cases (like gamma) for which this would need to be handled carefully, but it would generally improve image quality somewhat.

I, rightly or wrongly, equate this with audio processing. The best editors store 24 bits internally regardless of the word size of the input wave. Professional digital audio workstations often use higher precisions. They then dither the output down to the desired word size (usually 20- or 16-bit). As Elizabeth Taylor once nearly remarked, you can't be too rich, too thin or have too many bits ;).

I know this is a pretty radical suggestion, but for very high quality processing where signal integrity is paramount, might there not be some attraction in using more than 8 bits internally?
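A toy example of why extra internal precision helps, assuming two made-up "filters" (halve, then double) and comparing 8-bit storage between the steps with a float intermediate that is rounded only once at the end:

```python
def chain_8bit(values):
    """Each filter writes its result back to an 8-bit integer."""
    out = []
    for v in values:
        v = int(v * 0.5)           # filter 1: halve, truncated to an integer
        v = int(v * 2.0)           # filter 2: double
        out.append(max(0, min(255, v)))
    return out


def chain_float(values):
    """Filters pass floats along; rounding happens once, at the end."""
    out = []
    for v in values:
        f = v * 0.5
        f = f * 2.0
        out.append(max(0, min(255, round(f))))
    return out


src = list(range(256))
changed = sum(1 for a, b in zip(src, chain_8bit(src)) if a != b)
print(changed)                     # 128: every odd value lost its lowest bit
print(chain_float(src) == src)     # True
```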

I am now descending into my underground flame-proof chamber... ;)

Of course, that assumes all the filters are able to properly handle data in which Y(black) = 0.

My (brief) investigations indicate that almost all filters obey the CCIR range, at least as far as clamping their output if not actually taking it into account in their algorithms. I think there'd need to be significant rewriting if it was thought best to use all 8 (or more :D) bits. But hey, that's what open source is all about, right? ;)

I think we decided to use the limited range so that we wouldn't need to change the range in software (remember that DScaler runs in real time)

Not an issue for Avisynth though.

and so that we could deal with a single expected color range in all the internal processing steps.

This also doesn't rule out using all 8 bits. The most important thing is consistency, which at the moment (I would argue) Avisynth lacks in this respect.

But if you'll be transforming back to YUY2 then you're better off using as many available bits as possible.

Does the loss in conversion come from the transformation itself or the scaling that usually goes along with it?

I'm still not convinced that the ConvertTo* filters should be scaling. I think it would be a lot more flexible if they didn't, and Levels actually worked over the whole range instead of clamping to CCIR. That's a very unexpected behaviour for that filter, IMHO.

Issue 4 (Gamma correction)

I'm going to have to do some research on gamma because I have no real grasp on what it means. But I'll get back to you on this one :).

Didée
2nd November 2002, 01:39
I'm getting really :confused: here ...

Somehow, this sounds like we all have encoded some sort of cr*p since we started using AviSynth.

Would someone please hit me, so that I wake up from this nightmare?

Si
2nd November 2002, 01:39
I'm considering buying the Plus version (even though I don't need it) just to email tech support

I tried a discussion over respecting the TFF flag when encoding progressive frames and due to me not being able to speak Japanese and them not speaking English we got nowhere - so I think you'll have the same problem unfortunately :(

regards
Simon

SansGrip
2nd November 2002, 01:50
Somehow, this sounds like we all have encoded some sort of cr*p since we started using AviSynth.

That's exactly the sinking feeling I had when I started (semi-) understanding what was going on. It's also exactly what I hope isn't true ;).

Seriously, I don't think we've been encoding cr*p, though it's possible that many people have been messing with the levels in ways they didn't intend. The problem as I see it is a matter of consistency, and figuring out if CCIR should apply to all YUY2 data or only that destined for TV. I'm tending towards the latter, but then my knowledge of the issue is inadequate. What we need is a video engineer to step in and smack us all in the head ;).

Would someone please hit me, so that I wake up from this nightmare?

I think I shouldn't hit you a second time... :D

SansGrip
2nd November 2002, 01:51
I tried a discussion over respecting the TFF flag when encoding progressive frames and due to me not being able to speak Japanese and them not speaking English we got nowhere - so I think you'll have the same problem unfortunately :(

If I do take that route I shall try to find a translator. Any offers? :)

ErMaC
2nd November 2002, 02:45
俺がんばるぜ。でも、保証しない。俺の日本語はちょっと悪いから。

I'll try my best, but I can make no guarantees. My Japanese isn't that good.

Especially when it comes to technical writing, that's a whole other ballgame. You try explaining Dynamic Quantizers in a foreign language. :eek:

SansGrip
2nd November 2002, 02:53
I'll try my best, but I can make no guarantees. My Japanese isn't that good.

Wow, I was semi-joking, but thanks for the offer! I'll let you know if I need help :).

You try explaining Dynamic Quantizers in a foreign language.

I have a hard time explaining dynamic quantizers in my native tongue... :D

High Speed Dubb
2nd November 2002, 03:29
That would be a lot of trouble just to increase the range to [0,255]. In the long run, it might be better to introduce a 16 bit per component format, instead.

I don't know how the ConvertTo* filters relate to scaling.

SansGrip
2nd November 2002, 04:02
That would be a lot of trouble just to increase the range to [0,255].

It's not so much that I'm desperate to standardize to [0,255] instead of [16,235] and [16,240]. I'm more concerned with figuring out what we should currently expect from a filter. In RGB mode it should, of course, output [0,255]. But for YUY2 it seems unclear as to whether it should produce [0,255] or [16,235].

My thinking is still that it depends on the eventual target. In my case (MPEG-1 for TV output) I want my levels to stick within the CCIR range all the way through. But for someone encoding for display on a monitor that might not be the case. IIRC in the XviD lumi masking thread MarcFD mentioned that he wants to give XviD the full 8-bit range. Does XviD then clamp it, scale it, or pass it through? Of course, this then raises the question of what scaling/clamping the decoder does...

In the long run, it might be better to introduce a 16 bit per component format, instead.

Ah, so it wasn't such a stupid idea ;). I'd be very interested to hear from the core developers about this.

It might even be worth coming up with One True Format for use internally by the filters, one that any other format (RGBxx, YUV, etc.) can be transformed into quickly and losslessly. This would avoid having to handle both YUY2 and RGB (and, with 2.5, YV12) formats within each filter. Perhaps 3.0 might have all filters working in a 16-bit 4:4:4 format internally, with a final ConvertToWhatever at the end of the script... :)

I don't know how the ConvertTo* filters relate to scaling.

At the moment ConvertToRGB scales [16,235] to [0,255]. I assume (though I haven't looked) that ConvertToYUY2 scales [0,255] to [16,235]. From what I've read it appears possible to do the transformation without any scaling, but now we're coming down on the side of YUY2 == CCIR, I'm beginning to think the current behaviour is correct provided all filters obey [16,235] for YUY2.

As far as the Levels filter goes, and this:

Levels(16, 1, 235, 0, 255)

producing [16,235] output in YUY2 space, I'm assuming what happens is it expands the range then clamps. While of course this does maintain CCIR it's still somewhat unexpected and should I think be documented more clearly. For example, when the docs say that:

Levels(0, 1, 255, 0, 255)

does nothing, that's really not true in YUY2 space. Presumably it still clamps even though it doesn't scale, so any non-CCIR pixels will be removed.

I would also suggest, based on this thread so far, that the various mpeg2decs be altered so that they do not emit any pixels outside of CCIR, unless there's any indication these pixels are anything other than compression artifacts.

High Speed Dubb
2nd November 2002, 04:48
Yes, that sounds like the ConvertTos are scaling correctly. (I got confused there for a moment and thought you were using scaling in terms of changing the size of the image.)

In terms of TV output, it does make sense to limit to the range [16, 235]. That's the range the overlay expects, regardless of whether the material is NTSC or PAL. On the other hand, you can tweak your overlay settings to expect a different range, such as [0, 255].

trbarry
2nd November 2002, 05:51
I tend to believe in converting things as little as possible. But this means leaving Avisynth being a little more complex and processing in multiple formats, both for performance and to avoid the loss of precision that happens with multiple conversions.

So I don't really think it is worthwhile converting everything into some common 4:4:4 format just for simplicity. Anyone is free to convert to RGB now though many of the filters wouldn't work on it. (mine won't)

By the same token, I think that filters that work assuming values of 0-255 are just fine, even if all the valid data for the clip happens to be in the range of 16-235 or so.

The one exception is maybe an optional clamping step before returning the data if we find that accomplishes anything useful. And I don't think that has been shown yet.

- Tom

High Speed Dubb
2nd November 2002, 06:09
[0, 255] versus [16, 235] won't matter much for most filters. But there are exceptions. Gamma correction would be very messed up if it assumed the wrong 0. Any nonlinear transformations would have similar problems.
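A small sketch of the "wrong 0" problem, assuming a simple power-law gamma applied to luma. The function and its black/white parameters are illustrative only:

```python
def gamma_luma(y, gamma, black=16, white=235):
    """Apply gamma to luma, treating `black` and `white` as the 0 and 1 points."""
    t = (y - black) / (white - black)
    t = max(0.0, min(1.0, t))
    return round(black + (white - black) * t ** (1.0 / gamma))


# Studio-range black (16) under a gamma of 2.0:
print(gamma_luma(16, 2.0))                      # 16: black stays black
print(gamma_luma(16, 2.0, black=0, white=255))  # 64: black is lifted badly
```

A linear filter such as a blur or a resize gives the same relative result under either assumption, which is why the choice of zero point only bites for nonlinear operations.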

If the intermediate format has enough bits, then transferring to and from it won't lose much information. Rewriting filters to use the larger range would be a pain, though. I'd suggest that this sort of thing should wait until everybody has SSE2 instructions available.

Richard Berg
2nd November 2002, 10:28
Providing an RGB64 or perhaps YUV64 format has been discussed and should still be considered "on the table" IMO when it comes to making Avisynth competitive with Premiere et al. The stalling point is that this sort of precision is only helpful when respected by all the intermediate filters you use, which even if we added it to the core tomorrow wouldn't happen for a year. And like Lindsey said, without SSE2 things could get S-L-O-W...

ErMaC
2nd November 2002, 10:53
Actually Premiere only processes in RGB32, however the production bundle of After Effects 5.0 and up can do 16 bits per channel (RGB64), but only certain effects support that colorspace.
I really think the returns for working in something like RGB64 are moot when your source is YV12, as most of what we're doing is (DVD rips).
Perhaps when people are ripping HD-DVDs with 4:2:2 colorspace MP@HL MPEG2 it MIGHT be worthwhile. But until then I don't think it's important enough to devote time except for the people working with D16 (2880x2048) or Full Cineon (3656x2664) which has been digitized from 35mm film or digitally rendered in that resolution and is already in RGB64.
As for the ORIGINAL discussion (let's not get too off topic) I think that it'd be nice to know exactly what kind of degradation we are getting from the current setup as compared to an ideal filter chain (with no messing around of the 16-235 range). Is there some sort of way we could get a screenshot so we can see if it's even really worth worrying about?

SansGrip
2nd November 2002, 16:15
In terms of TV output, it does make sense to limit to the range [16, 235]. That's the range the overlay expects, regardless of whether the material is NTSC or PAL. On the other hand, you can tweak your overlay settings to expect a different range, such as [0, 255].

If that's what the overlay expects then it sounds more and more like we should limit YUY2 to [16,235] all the way through. This is certainly the simpler option, since it requires fixing only a few filters that don't currently clamp (e.g. mpeg2dec).

Edit: I've changed my mind ;). My inclination would be to remove clamping from all filters and do it at the end.

I hope, if the overlay wants [16,235], that the people who write the decoders know this... ;)

SansGrip
2nd November 2002, 16:28
But this means leaving Avisynth being a little more complex and processing in multiple formats, both for performance and to avoid the loss of precision that happens with multiple conversions.

Ideally any intermediate internal format would be able to receive any common format without loss. I've not actually come up with such a thing, I'm just throwing out ideals ;).

So I don't really think it is worthwhile converting everything into some common 4:4:4 format just for simplicity.

Personally I'm not looking forward to having to support both YUY2 and YV12 in every filter I write from 2.5 on...

Anyone is free to convert to RGB now though many of the filters wouldn't work on it. (mine won't)

Nor mine ;).

The one exception is maybe an optional clamping step before returning the data if we find that accomplishes anything useful. And I don't think that has been shown yet.

The problem as I see it is that nothing is clearly defined. Right now some filters obey [16,235] and some don't. Levels refuses to produce anything outside that range in YUY2 mode, but mpeg2dec happily passes through out-of-range pixels.

If we're suggesting that YUY2 could usefully be either [0,255] or [16,235] then I would say Levels needs to be fixed to remove the clamping. If we're saying YUY2 should be CCIR-only then either mpeg2dec needs to be fixed (along with a handful of others, including one of mine) or we need to simply clamp it at the end of the chain (using either my filter or MarcFD's optimized version).

If all we're gonna do is clamp at the end then it would make sense to remove the clamping from each individual filter. This would reduce cumulative errors at the ends of the range and improve performance.
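A toy illustration of the cumulative error, assuming two made-up filters that raise and then lower luma by 30. Clamping after every filter destroys highlight detail that a single clamp at the end would have kept:

```python
def clamp(v, lo, hi):
    return max(lo, min(hi, v))


def run_chain(y, offsets, clamp_each):
    """Apply brightness offsets in turn; always clamp to CCIR at the very end."""
    for off in offsets:
        y = clamp(y + off, 0, 255)     # 8-bit storage limit
        if clamp_each:
            y = clamp(y, 16, 235)      # per-filter CCIR clamp
    return clamp(y, 16, 235)           # final clamp at the end of the chain


print(run_chain(220, [30, -30], clamp_each=True))    # 205: detail lost
print(run_chain(220, [30, -30], clamp_each=False))   # 220: value preserved
```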

SansGrip
2nd November 2002, 16:40
The stalling point is that this sort of precision is only helpful when respected by all the intermediate filters you use, which even if we added it to the core tomorrow wouldn't happen for a year.

I would argue, because I like arguing ;), that it would be helpful if the majority of the filters you used supported it. Or some of them. Heck, it would reduce rounding errors even if only two filters supported it, provided they were next to each other.

There's obviously a reason that very high precision has become de facto in professional digital audio, with 24-bit rendering a standard and sometimes 50-plus bits used internally. Perhaps I'm making a flawed analogy here, but I don't see why, theoretically speaking, video manipulation processes should be any less careful with signal fidelity than their audio counterparts.

And like Lindsey said, without SSE2 things could get S-L-O-W...

I've never believed that speed is a good reason to sacrifice quality. It's only a good reason to make quality optional :D.

SansGrip
2nd November 2002, 16:51
I really think the returns for working in something like RGB64 are moot when your source is YV12, as most of what we're doing is (DVD rips).

I don't believe this is true, for the same reason (and I apologise for bringing audio into this yet again) that, for example, WaveLab operates in 24 bits internally even when the source is 8-bit.

If we were simply taking that YV12 input and throwing it straight back out again then I would agree with you. But the more filtering you do the more rounding errors you introduce, especially if we're limiting YUY2 to [16,235] within each filter.

See this (http://www.digido.com/ditheressay.html) for example.

I think that it'd be nice to know exactly what kind of degradation we are getting from the current setup as compared to an ideal filter chain (with no messing around of the 16-235 range). Is there some sort of way we could get a screenshot so we can see if it's even really worth worrying about?

The problem is deciding which pixels have been altered by rounding errors and which by the filters. I would say at the moment any heavy processing in YUY2 mode will result in output somewhat saturated at the edges of the range. This might be detectable and measurable.

A possible way to know how much saturation is introduced by a particular chain would be to modify each filter so that they don't clamp then at the end highlight any pixels outside of CCIR range. This wouldn't be totally accurate, of course, because it would somewhat alleviate the rounding errors we're trying to detect.

trbarry
2nd November 2002, 16:59
Perhaps when people are ripping HD-DVDs with 4:2:2 colorspace MP@HL MPEG2 it MIGHT be worthwhile.

I don't know what HD-DVD will use but my HDTV ATSC caps are still in YV12 like regular DVDs. That's one of the reasons I've been a fan of Avisynth YV12 support.

- Tom

SansGrip
2nd November 2002, 17:41
Actually Premiere only processes in RGB32

But that's the full 8 bits for each component. At the moment, if each filter is clamping, we're effectively working with 220 (225 for chroma) values. That's 14% (12%) fewer than Premiere.
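The arithmetic behind those percentages, counting code values inclusively (the 16-235 and 16-240 ranges span 219 and 224 steps, which is 220 and 225 distinct codes):

```python
def usable_codes(lo: int, hi: int) -> int:
    """Number of distinct integer codes in the inclusive range lo..hi."""
    return hi - lo + 1


luma = usable_codes(16, 235)                   # 220
chroma = usable_codes(16, 240)                 # 225
luma_loss = 100 * (256 - luma) / 256           # about 14%
chroma_loss = 100 * (256 - chroma) / 256       # about 12%
print(luma, chroma, round(luma_loss), round(chroma_loss))   # 220 225 14 12
```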

SansGrip
2nd November 2002, 18:56
Just an update on the TMPGEnc issues I mentioned in my original post.

I decided to sit down and figure this out once and for all. This is the process I used:

Output YUY2 [16,235] from Avisynth. Read using TMPGEnc and the ReadAVS VFAPI plugin. Encode to MPEG-1, then fast recompress to Huffy with VirtualDub. Compare levels in original with levels in resulting AVI using RangeInfo.

Here's what I found regarding the Setting -> "Quantize Matrix" -> "Output YUV data in Basic YCbCr not CCIR601" option. Either:

1) The plugin transforms to RGB [0,255]. If the option is checked TMPGEnc passes this through without scaling and produces YV12 [0,255]. If not, it rescales to [16,235] (not [8,235] as claimed in the tooltip); or

2) The plugin transforms to RGB [16,235]. If the option is checked TMPGEnc rescales to YV12 [0,255]. If not, it passes the values through untouched.

I consider the former more likely.

So to preserve Avisynth levels the correct setting for that option is unchecked in all cases, regardless of the range of the original source material (e.g. YV12). This maintains an almost identical mean luminance to the original.

I apologise if this is somewhat off-topic, but I thought since I raised the issue I should probably add my findings :).

sh0dan
3rd November 2002, 15:32
I'm following this discussion on the sideline - I have a few points:

- My thoughts on increased precision were 15 bits per channel (to be able to use MMX very efficiently). The extra bit would just be padding, for alignment. It should be a planar format. Planar is much easier to work with, IMO. It should be 4:4:4 (one chroma sample for each pixel) - doing 4:2:0 would be faster, but since it's subsampled - what's the point?

- I think increased precision is a good thing, and have had it in my thoughts for the next revision, but I have rejected it because:

- Speed reasons: My impression is that 75-80% of all users choose AviSynth for its speed. Speed will suffer very much, since all data would have to be converted on both input and output. And filters will also be much slower. Probably at least a factor 4 slowdown. SSE2 will of course help, but with 16 bits per pixel (per plane), it can still be very well optimized for existing processors (properly paired code can execute 2 64-bit MMX-operations in "one cycle").

- Implementation reasons: Support will be limited - YV12 makes sense to many people to implement, since it's faster. But how many will also implement YUV64? (Just think of how many external filters actually support RGB32?). Without at least a 75% filter implementation, the increased precision is useless.


I choose to make the sample processing handle floats internally, with built in autoconversion for filters. This is a small speed penalty to pay for the increased precision it offers.

I'm not against an YUV64 implementation. I would be very happy to support anyone interested in it, but rewriting all routines to YV12 is enough for me now. Perhaps in v3? Or would you be interested in helping implementing it into v2.5?

SansGrip
3rd November 2002, 16:23
I'm not against an YUV64 implementation. I would be very happy to support anyone interested in it, but rewriting all routines to YV12 is enough for me now. Perhaps in v3? Or would you be interested in helping implementing it into v2.5?

For something this major, v3 sounds good ;). Of course, I'd be interested in helping now I've written a few filters and dug around in the core source a little.

But I'm not really pushing for YUV64, I think YV12 will improve things significantly. What I'm mainly interested in is using all of the 8 bits that we have right now.

I might have misunderstood everything I've read on precision and cumulative errors, but it seems to me anyway that having each filter clamp to [16,235] individually is not only duplicated effort but also sure to introduce end-of-range saturation in each step of the chain, which produces more saturation in the next, and so on.

As the subject of the thread indicates I was merely seeking clarification rather than any major change in how things are done internally. If 8 bits is good enough for Premiere I would be inclined to think that it's good enough for Avisynth. But right now we're not even using the full 8 bits...

trbarry
3rd November 2002, 17:19
I might have misunderstood everything I've read on precision and cumulative errors, but it seems to me anyway that having each filter clamp to [16,235] individually is not only duplicated effort but also sure to introduce end-of-range saturation in each step of the chain, which produces more saturation in the next, and so on.


Do most filters really clamp to NTSC range? I know mine don't. And there is still a possibility (PAL TV caps?) that you are processing a Full Luma Range clip. That's why I thought it should only be done at the end of the script, and then only optionally.

- Tom

SansGrip
3rd November 2002, 17:52
Do most filters really clamp to NTSC range? I know mine don't.

All of the filter source code I've seen does. But that might not be a representative sample ;).

And there is still a possibility (PAL TV caps?) that you are processing a Full Luma Range clip.

AFAIK CCIR-601 applies to PAL and its derivatives too.

That's why I thought it should only be done at the end of the script, and then only optionally.

I think that should be the case, yes.

Xenoproctologist
3rd November 2002, 19:28
Originally posted by trbarry
And there is still a possibility (PAL TV caps?) that you are processing a Full Luma Range clip.

There are instances where one would want out-of-range values--most notably in calibration discs, where "blacker than black" is very useful for calibrating brightness.

Granted, I don't think there are many people pining to rip their AVIA calibration DVDs, but we wouldn't want to impose artificial limits on their ability to do so. ;)

High Speed Dubb
4th November 2002, 00:11
I do think allowing 16 bit color components will eventually be a worthwhile move. With smoothers, handling of rounding is very important to avoid blurring and maintain color accuracy. At low smoothing values, Grape Smoother is pretty much just a rounding trick. And Peach Smoother uses something similar to temporal dithering to avoid messing up colors. So the lack of precision can be a real hindrance.

With lots of filters run in a row, color depth problems would probably get somewhat worse.

SansGrip
4th November 2002, 00:33
With lots of filters run in a row, color depth problems would probably get somewhat worse.

Sometime, maybe tonight or tomorrow, I'm going to download the source to as many filters as possible and make a list of those that clamp in YUY2 mode. Then I'm going to try to think of a way to quantify the rounding errors we're getting right now. This at least would give an idea of whether it's even worth worrying about it.

My initial thoughts are along the lines of running two scripts, one with clamping filters and one with modified-not-to-clamp filters. In the latter I'd finish off with a LegalClip to clamp it, and then check out the difference.

My suspicion is there'd be noticeable saturation at each end of the range, but we'll see what kind of results I get.
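A rough way to picture the proposed comparison, sketched in Python rather than as real Avisynth filters. Everything here is an assumption for illustration: `brighten` stands in for any filter that shifts luma, and the chain is two shifts that cancel out. The point is only to show how clamping after every step can lose values that a single clamp at the end would keep.

```python
def clamp(value, lo=16, hi=235):
    """Clamp one luma sample to the legal range."""
    return min(max(value, lo), hi)


def brighten(luma, amount):
    """Hypothetical stand-in for a filter: shift luma, saturating at 8 bits."""
    return [min(max(y + amount, 0), 255) for y in luma]


def run_chain(luma, clamp_each_step):
    """Apply +30 then -30, optionally clamping after every filter."""
    for amount in (+30, -30):
        luma = brighten(luma, amount)
        if clamp_each_step:
            luma = [clamp(y) for y in luma]
    # Final LegalClip-style clamp, applied in both variants.
    return [clamp(y) for y in luma]


source = [16, 100, 210, 220, 225]
per_step = run_chain(source, clamp_each_step=True)
at_end = run_chain(source, clamp_each_step=False)

print(per_step)  # [16, 100, 205, 205, 205]
print(at_end)    # [16, 100, 210, 220, 225]
print(sum(abs(a - b) for a, b in zip(per_step, at_end)))  # 40
```

With per-step clamping the three brightest samples collapse to the same value, while the clamp-at-end variant returns the source unchanged.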

Is this a worthwhile experiment? Can anyone think of anything better? Should I have paid more attention to the "designing experiments" section of my psychology textbook? ;)

sh0dan
4th November 2002, 20:27
If anyone is testing the alpha, you could also try the new version, with the new "Limiter" filter. It works like SansGrip's LegalClip, except that you can specify parameters, and it's much faster (ISSE-optimized).

Usage:

Limiter(minimum luma, maximum luma, minimum chroma, maximum chroma)

Default parameters are the same as LegalClip (the default YUV range: 16-235 for luma, 16-240 for chroma). Funny thing is that it can also be used for color correction.
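For anyone who wants to see what such a limiter does to the data, here is a minimal Python sketch. It is not the ISSE code from the alpha; the function name and parameter names are made up for illustration, and it assumes packed YUY2 (bytes ordered Y0 U Y1 V), so even bytes are luma and odd bytes are chroma.

```python
def limiter(yuy2, min_luma=16, max_luma=235, min_chroma=16, max_chroma=240):
    """Clamp a packed YUY2 byte sequence (Y0 U Y1 V ...) to the given ranges."""
    out = bytearray(len(yuy2))
    for i, value in enumerate(yuy2):
        if i % 2 == 0:
            lo, hi = min_luma, max_luma      # even bytes are luma samples
        else:
            lo, hi = min_chroma, max_chroma  # odd bytes alternate U and V
        out[i] = min(max(value, lo), hi)
    return bytes(out)


# Two pixels: Y0=0, U=250, Y1=255, V=5
print(list(limiter(bytes([0, 250, 255, 5]))))  # [16, 240, 235, 16]
```

Passing a narrower or shifted range instead of the defaults is what makes it usable as a crude colour correction.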

SansGrip
4th November 2002, 22:41
Alright, after testing every filter in my possession I've found some interesting results.

To test, I wrote a quick source filter that outputs a frame covering the entire YUY2 [0,255] range. I then used this script:

MakeRange()
TheFilterUnderTest()
RangeInfo()

This allowed me to see exactly how the filter was affecting the pixel values without having to find the source code for everything.
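The same idea can be sketched in Python. These functions are hypothetical stand-ins, not the source of MakeRange or RangeInfo: one builds a row of packed YUY2 in which every component takes every value from 0 to 255, the other reports the minimum and maximum per component, and the filter in between is a placeholder that clamps luma only.

```python
def make_range():
    """Build one packed YUY2 row where Y, U and V each sweep 0..255."""
    row = bytearray()
    for v in range(256):
        row += bytes([v, v, v, v])  # Y0, U, Y1, V all set to v
    return bytes(row)


def range_info(row):
    """Report (min, max) for each component of a packed YUY2 row."""
    luma, u, v = row[0::2], row[1::4], row[3::4]
    return {"Y": (min(luma), max(luma)),
            "U": (min(u), max(u)),
            "V": (min(v), max(v))}


def filter_under_test(row):
    """Placeholder filter: clamps luma to [16,235], leaves chroma alone."""
    out = bytearray(row)
    for i in range(0, len(out), 2):
        out[i] = min(max(out[i], 16), 235)
    return bytes(out)


print(range_info(make_range()))
# {'Y': (0, 255), 'U': (0, 255), 'V': (0, 255)}
print(range_info(filter_under_test(make_range())))
# {'Y': (16, 235), 'U': (0, 255), 'V': (0, 255)}
```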

Some filters (notably bicubic/bilinear/lanczos when shrinking) produced extreme clamping, but I believe that was a result of the artificial nature of the input and nothing else, since SimpleResize preserved the range almost perfectly.

It seems Tom was right: none of the filters I tested clamp to [16,235], with the following exceptions:

Levels
AddBorders
Letterbox
FadeOut

AddBorders, Letterbox and FadeOut all assume black=16 in YUY2 mode, which is probably a reasonable assumption. But if we're going to let people work with the whole range then IMHO they should have an option to set the black level.

Levels, as I've mentioned before, does clamp. This means something like Levels(16, 1, 235, 0, 255) does not produce the expected [0,255] range, but [16,235] instead. Again IMHO this is not correct behaviour.
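To make the Levels point concrete, here is a small Python sketch of a generic levels mapping with an optional clamp. This is an assumption about the general shape of such a filter, not Avisynth's actual Levels code; the `clamp_to` parameter is invented here just to switch the clamp on and off.

```python
def levels(value, in_low, gamma, in_high, out_low, out_high, clamp_to=None):
    """Generic levels mapping: normalise, apply gamma, scale to output range."""
    t = (value - in_low) / (in_high - in_low)
    t = min(max(t, 0.0), 1.0)
    result = round(t ** (1.0 / gamma) * (out_high - out_low) + out_low)
    if clamp_to is not None:
        lo, hi = clamp_to
        result = min(max(result, lo), hi)  # the clamp under discussion
    return result


inputs = range(16, 236)
unclamped = [levels(y, 16, 1, 235, 0, 255) for y in inputs]
clamped = [levels(y, 16, 1, 235, 0, 255, clamp_to=(16, 235)) for y in inputs]

print(min(unclamped), max(unclamped))  # 0 255
print(min(clamped), max(clamped))      # 16 235
```

Without the clamp the [16,235] input stretches to [0,255] as requested; with it, both ends are pulled back to [16,235].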

To this end the next versions of my filters will not clamp :D.

In related news ;), I was intrigued by the clamping I saw from all the resizers except SimpleResize, so I decided to test them on some real-life frames (you guessed it: American Pie). To that end I chose one fairly representative frame (94590) and one exhibiting a large range (71574) and ran each resizer on them to see how the range was affected.

Since I also wanted to test ReduceBy2 I resized from the original dimensions -- 700x462 -- to 350x231.

Here are the results. SAD is the sum of absolute differences between the original and resized range endpoints (minimum and maximum) of each component, so lower means the range was better preserved.


#71574

orig:             Y= 6-252  U= 90-150  V=114-168
bicubic 0.3-0.3:  Y=15-236  U= 97-147  V=116-166  SAD=39
bicubic 0-0.5:    Y=12-249  U= 94-148  V=114-167  SAD=16
bicubic 0-0.75:   Y= 9-255  U= 94-148  V=114-168  SAD=12
bilinear:         Y=15-229  U= 98-147  V=116-166  SAD=46
lanczos:          Y= 6-255  U= 94-148  V=114-168  SAD= 9
simple:           Y=12-240  U= 93-148  V=114-168  SAD=23
reduceby2:        Y=15-234  U= 96-148  V=117-166  SAD=40

#94590

orig:             Y=10-213  U= 97-159  V=117-175
bicubic 0.3-0.3:  Y=14-203  U=102-157  V=119-162  SAD=36
bicubic 0-0.5:    Y=13-208  U= 99-157  V=118-165  SAD=23
bicubic 0-0.75:   Y=10-211  U= 99-158  V=118-167  SAD=14
bilinear:         Y=14-200  U=102-157  V=119-161  SAD=40
lanczos:          Y=11-212  U= 99-158  V=118-167  SAD=14
simple:           Y=13-206  U= 99-157  V=119-162  SAD=29
reduceby2:        Y=14-201  U=102-157  V=119-165  SAD=35


As you can see, lanczos seems to preserve the range best of all filters, with bicubic 0-0.75 close behind. Bilinear produces the worst range compression. This is presumably related to the relative smoothing properties of each algorithm. My conclusion from it is that I'm going to use lanczos from now on, and leave the smoothing to the, er, smoothing filters.
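For reference, the SAD figure in the tables above can be computed like this. A minimal Python sketch; the function name and the dictionary layout are just illustration, and the numbers are copied from the #71574 table.

```python
def range_sad(original, resized):
    """Sum of absolute differences between range endpoints of Y, U and V."""
    total = 0
    for component in ("Y", "U", "V"):
        (o_min, o_max) = original[component]
        (r_min, r_max) = resized[component]
        total += abs(o_min - r_min) + abs(o_max - r_max)
    return total


orig_71574 = {"Y": (6, 252), "U": (90, 150), "V": (114, 168)}
lanczos_71574 = {"Y": (6, 255), "U": (94, 148), "V": (114, 168)}
bicubic_03_71574 = {"Y": (15, 236), "U": (97, 147), "V": (116, 166)}

print(range_sad(orig_71574, lanczos_71574))     # 9
print(range_sad(orig_71574, bicubic_03_71574))  # 39
```

Note that the measure counts any movement of an endpoint, so overshoot (lanczos going from 252 to 255 in luma) is penalised the same way as compression.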

I thought it was interesting. YMMV ;).

iago
4th November 2002, 23:28
@SansGrip

Your latest test is very meaningful imho, and I think it supports the visual observations of many users (including me) for some time that LanczosResize has been giving pretty good results in their encodes. Thanks for the information you provide and for the efforts you put in the encoding scene.

iago

MaTTeR
5th November 2002, 00:08
SansGrip,

Thanks for the extensive testing. I'm starting to think that I should use Lanczos on all my rips too and not just on the 2CD versions. It's too bad we have no control over the detail it preserves though. IMO using Lanczos on 1CD rips might be overkill unless we lower the resolution to compensate. I might do some testing on that this coming weekend.

sh0dan,
Well, since I always view my rips on TV, I'll be trying out the nifty "Limiter" too. This will be a nice addition. Man you crank this code out quick!

SansGrip
5th November 2002, 01:00
Originally posted by MaTTeR
I'm starting to think that I should use Lanczos on all my rips too and not just on the 2CD versions. It's too bad we have no control over the detail it preserves though.

Well, in a way you can: use Lanczos then smooth until you've achieved the desired loss of detail. I would argue this is preferable, since this way you retain full range after the resize and can then control the smoothing much more finely than simply using a bicubic or bilinear.

It would be interesting to compare the following two scripts: 1) A bilinear or soft/neutral resize followed, perhaps, by very gentle smoothing to reach your desired compressibility; and 2) A Lanczos resize followed by stronger smoothing to reach the same level of compressibility.

My hunch is that the second method would preserve the existing range better than the first, and thus give better colour fidelity. If anyone wants to try it I'd be interested to hear the results :).
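A toy 1-D illustration of why a soft resize might narrow the range while a sharper one keeps it, sketched in Python. The kernels are assumptions chosen for the demo, not the real bilinear or Lanczos coefficients: a 3-tap averaging kernel stands in for a soft resizer, and a 3-tap kernel with negative side lobes stands in for a sharp one.

```python
def downsample_by_2(row, kernel):
    """Apply a 3-tap kernel (edges repeated), keep every second sample."""
    left, centre, right = kernel
    out = []
    for i in range(0, len(row), 2):
        a = row[max(i - 1, 0)]
        b = row[i]
        c = row[min(i + 1, len(row) - 1)]
        value = round(left * a + centre * b + right * c)
        out.append(min(max(value, 0), 255))  # saturate to 8 bits
    return out


row = [16, 16, 235, 16, 16, 16, 235, 235, 16, 16]
soft = downsample_by_2(row, (0.25, 0.5, 0.25))   # averaging, no negative lobes
sharp = downsample_by_2(row, (-0.1, 1.2, -0.1))  # negative lobes, can overshoot

print(min(row), max(row))
print(min(soft), max(soft))
print(min(sharp), max(sharp))
```

On this input the averaging kernel pulls the peaks down toward their neighbours, while the kernel with negative lobes keeps them and overshoots past the original range, which is at least consistent with the lanczos and bicubic 0-0.75 rows reaching 255 in the #71574 table.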

SansGrip
23rd November 2002, 20:53
Apologies for resurrecting an old thread, but I thought this the best place.

I've been practicing on LegalClip (my simplest filter :D) during my initial efforts to learn assembly language and the various multimedia extensions.

Here's 0.2 (http://www.jungleweb.net/~sansgrip/avisynth/LegalClip-0.2.zip) (and the source (http://www.jungleweb.net/~sansgrip/avisynth/LegalClip-0.2_src.zip)) rewritten in assembler and ISSE-optimized. Note that the ISSE code will only be used if the width is a multiple of four.

Also note that this filter isn't needed for Avisynth 2.5, since its Limiter() filter does the same thing.

Thanks to Tom Barry for his patience while I quizzed him... :)