x264: No subsampling vs. subsampling [Archive]

mzso

17th July 2012, 15:31

Hi!
After an argument I was wondering if chroma subsampling has any advantage besides speed.
I'm not aware of the internals of x264, but I thought maybe it could reduce the chroma information in the video in a more advanced manner, using complicated algorithms, instead of just cutting the resolution in half (as far as I know x264 has a lot of psychovisual optimizations). It does reduce chroma information anyway (at least in lossy mode). Also, the renderers wouldn't have to bother with upsampling.

If encoded to the same size, which would look better? (Let's assume that the subsampled one has a good renderer.)

Would encoding without subsampling have any encode time expense?

If a video can be encoded to a just as good (or better) quality without subsampling, why do they use 4:2:0 on Blu-rays?

If it can't be encoded to at least as good quality as with subsampling, what is the reason for it?

SassBot

17th July 2012, 16:16

turab

17th July 2012, 17:40

Like SassBot said, the idea is that chroma subsampling has no significant effect on quality as perceived by humans. This holds true in most cases anyway. This article explains when it's not the case: http://www.glennchan.info/articles/technical/chroma/chroma1.htm. If you notice artefacts after subsampling, that is when you should consider a higher-resolution chroma format. If you compressed the video without subsampling and were aiming for the same size (compared to subsampling), the video would look less sharp and you'd possibly get macroblock artefacts and so on. At least with subsampling there are no macroblock artefacts and the effect on sharpness is much smaller, if present at all.

Would encoding without subsampling have any encode time expense?

Note that x264 doesn't necessarily do any subsampling. If you were converting to a different chroma format, then yes, it would affect time.

mzso

17th July 2012, 18:14

Chroma subsampling is still used on Blu-Rays to save space. It's the same reason chroma subsampling has been used on every home video format. The average person won't see the difference, anyway, unless they have a player with a chroma upsampling bug.
The point of my question was whether x264 can achieve a similar or better result without subsampling.

If you compressed the video without subsampling and were aiming for the same size (compared to subsampling), the video would look less sharp and you'd possibly get macroblock artefacts and so on. At least with subsampling there are no macroblock artefacts and the effect on sharpness is much smaller, if present at all.
Not quite convinced. Why do you think that x264 would be so bad at reducing chroma information? Have you tried it? Do you know of any videos that can be compared?

SassBot

17th July 2012, 19:04

The point of my question was whether x264 can achieve a similar or better result without subsampling.

Since subsampling reduces the number of bits needed to compress the video, no. If you could get similar quality at the same bitrates without subsampled chroma, there would be no reason to subsample the chroma to begin with.

Dark Shikari

17th July 2012, 19:22

vivan

17th July 2012, 19:23

Why do you think that x264 would be so bad at reducing chroma information?
Why do you think that x264 is reducing chroma information? It can output both 4:4:4 and 4:2:0. And why do you think it is bad at doing it?

Do you know of any videos that can be compared?
You can easily do it yourself:
1) find any video without subsampling (e.g. proper screen capture) or downscale any video by 50%.
2) encode it using x264. http://mewiki.project357.com/wiki/X264_Settings#output-csp
3) compare it (using not broken player & renderer).

Asmodian

17th July 2012, 19:39

I did a few limited tests with 4:2:2 source material (SVHS captures so very soft and very bad chroma already) using the lowest crf I thought gave a transparent 4:2:0 H.264.

Using 4:2:2 with the same settings gave a noticeably larger file that wasn't noticeably higher quality. Trying to match the size (I never really got the same size) resulted in a lower quality file. I never messed with chroma-qp-offset.

This is after deinterlacing, I could notice the difference (at least I thought I could) converting to 4:2:0 before deinterlacing.

Edit: I don't think any of this is relevant to modern digital source videos, sorry. Ah good ol' analogue formats. :o

benwaggoner

17th July 2012, 20:13

This is after deinterlacing, I could notice the difference (at least I thought I could) converting to 4:2:0 before deinterlacing.
The nice thing about 4:2:2 for interlaced sources is that each line has its own chroma sample. With interlaced 4:2:0 an interlaced macroblock would have, for example, lines 0 and 2 share one chroma sample and lines 1 and 3 a different chroma sample. Stretching a sample out across that bigger spatial area makes them less accurate in general, and makes it harder for deinterlacing algorithms to determine the best Cb and Cr values for the progressive output.
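
The line sharing described above can be sketched in a few lines of Python. This is only an illustration of the indexing, assuming the usual layout where each field is subsampled on its own and the two fields' chroma rows are interleaved in the frame; the function names are made up for this example.

```python
def chroma_row_progressive(luma_line: int) -> int:
    """Progressive 4:2:0: two adjacent frame lines share one chroma row."""
    return luma_line // 2


def chroma_row_interlaced(luma_line: int) -> int:
    """Interlaced 4:2:0: two adjacent lines of the SAME field share one chroma row.

    Assumption: each field is subsampled separately and the chroma rows of
    the two fields are interleaved in the stored frame.
    """
    field = luma_line % 2        # 0 = top field, 1 = bottom field
    field_line = luma_line // 2  # line index inside that field
    return 2 * (field_line // 2) + field


for y in range(8):
    print(y, chroma_row_progressive(y), chroma_row_interlaced(y))
```

In the interlaced case frame lines 0 and 2 map to the same chroma row, and lines 1 and 3 to the next one, so each chroma sample spans a taller area of the frame than in the progressive case.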

This was a big deal with Main Profile MPEG-2 sources (and a big reason High Profile 4:2:2 MPEG-2 gets used for mezzanine files in the industry). I've done some experimentation with H.264 interlaced 4:2:0 using MBAFF, which seems to be a lot less problematic. This would make sense, as progressive macroblocks don't get the interlaced chroma, and interlaced blocks by definition will have less chroma correlation between physically adjoining lines versus temporally adjoining lines.

Still, I'm not quite ready to say MBAFF 4:2:0 is good enough for mezzanine use.

Edit: I don't think any of this is relevant to modern digital source videos, sorry. Ah good ol' analogue formats. :o
Subsampling is used in most digital source video. Consumer or prosumer gear generally uses 4:2:0. 4:2:2 is standard in most professional video workflows. 4:4:4 is used for high-end film production and such, but I almost never see a 4:4:4 source file in practice.

Didée

17th July 2012, 20:44

I've done some experimentation with H.264 interlaced 4:2:0 using MBAFF, which seems to be a lot less problematic. This would make sense, as progressive macroblocks don't get the interlaced chroma,
Ben, I'm fully with you concerning the drawbacks regarding chroma in interlaced 4:2:0. However, I don't see how MBAFF in H.264 could help to improve something? In case of (interlaced) 4:2:0 sampling, the (interlaced) chroma sampling is already in the source, before any encoder even gets its hands on the data. I understand MBAFF helps improving coding efficiency, but it doesn't change anything about the sampling of the source data.

(At some point time back, I cracked my head about how MBAFF might handle the point of different chroma sample positions for [progressive] vs. [interlaced]. Days and headaches later, I realized that it just doesn't, because there's no need. Chroma sample positions play a role when 4:2:0 is created from a higher (chroma) resolution parent. But when the encoder gets into play, all is already fixed and done.)

mzso

17th July 2012, 22:21

Subsampling was originally used to reduce analog video bandwidth. It provides no real benefit for compressed video beyond speed. It's largely around for legacy and analog reasons.

Note that in real-world video formats, 4:2:0 may be better or worse than 4:4:4 depending on bitrate because of how the codecs are designed and the psychovisual effects of trading off resolution for quality.

If the source is 4:2:0, there is little point in compressing as 4:4:4, naturally.

Thanks for the affirmation. So it all depends on whether the codec is good enough, since I assume the format doesn't have any requirements.

Why do you think that x264 is reducing chroma information? It can output both 4:4:4 and 4:2:0. And why do you think it is bad at doing it?

You can easily do it yourself:
1) find any video without subsampling (e.g. proper screen capture) or downscale any video by 50%.
2) encode it using x264. http://mewiki.project357.com/wiki/X264_Settings#output-csp
3) compare it (using not broken player & renderer).
Since lossy compression is about removing information that's not (very) noticeable, it naturally reduces the chroma information (too).
I didn't say it was bad at it, that was "turab". I implied/assumed the opposite.

turab

17th July 2012, 22:35

Thanks for the affirmation. So it all depends on whether the codec is good enough, since I assume the format doesn't have any requirements.

Since lossy compression is about removing information that's not (very) noticeable, it naturally reduces the chroma information (too).
I didn't say it was bad at it, that was "turab". I implied/assumed the opposite.
I didn't mean to imply that x264 was a bad encoder... rather that chroma subsampling is a better solution to reduce data than lossy compression. Dark Shikari is saying it depends on the bitrate, and I suppose he meant that at high bitrates, it's actually better to avoid subsampling. That sounds about right.

Asmodian

18th July 2012, 00:40

Subsampling is used in most digital source video. Consumer or prosumer gear generally uses 4:2:0. 4:2:2 is standard in most professional video workflows. 4:4:4 is used for high-end film production and such, but I almost never see a 4:4:4 source file in practice.

I actually just meant that the results of my tests using SVHS sources probably do not mean much in this context. I understand the real "resolution" of the chroma on the tape is quite low; certainly there are almost no details in the chroma to be lost during sub sampling. If one used a "new" 4:2:2 or 4:4:4 source I imagine the results of a test like I had done could be different.

mzso

4th August 2012, 21:28

Come to think of it (after watching some 1080i50 videos), it is probably the same thing with 50i -> deinterlacing (to 50p) versus 50p (with a good encoder) at the same bitrate.

gyth

5th August 2012, 03:50

Do you know of any videos that can be compared?
Video game footage pulled from an emulator is one source of unsubsampled video.
Most of the recent videos at TASVideos.org are encoded in 8bit420 and 10bit444.
Subsampling artifacts are most noticeable in static, highly detailed areas with contrasting colors.
This probably happens more often in video games than real life sources.

http://i.imgur.com/pi0Yt.png

#http://tasvideos.org/2078M.html
a=ffvideosource("supermetroid-tas-reversebossorder-saturn.mp4").converttorgb24.subtitle("420")
b=ffvideosource("supermetroid-tas-reversebossorder-saturn_10bit444.mp4").converttorgb24.subtitle("444")
stackhorizontal(a,b).trim(43950,43950)

mzso

5th September 2012, 16:04

I didn't mean to imply that x264 was a bad encoder... rather that chroma subsampling is a better solution to reduce data than lossy compression. Dark Shikari is saying it depends on the bitrate, and I suppose he meant that at high bitrates, it's actually better to avoid subsampling. That sounds about right.

Actually he said it depends on the codec. Because there are a lot of sucky/weak encoders. I'd imagine the ones that don't focus on encoding subsampled sources and neglect any optimizations for unsubsampled sources would/could result in a worse outcome on lower bitrates.

benwaggoner

5th September 2012, 18:43

Ben, I'm fully with you concerning the drawbacks regarding chroma in interlaced 4:2:0. However, I don't see how MBAFF in H.264 could help to improve something? In case of (interlaced) 4:2:0 sampling, the (interlaced) chroma sampling is already in the source, before any encoder even gets its hands on the data. I understand MBAFF helps improving coding efficiency, but it doesn't change anything about the sampling of the source data.
I was assuming an encoder that took 4:2:2 input and thus was able to provide the "right" 4:2:0 samples for both progressive and interlaced macroblocks. Thus an interlaced block encoded from source frame lines 0, 2, 4, 6 and a progressive block encoded from source frame lines 0, 1, 2, 3 would both have perfect chroma samples.

Some MPEG-2 encoders can do this, but I don't know that x264 can.

One place where x264 could improve is by supporting higher-precision input color spaces (both >4:2:0 and >8-bit) even when the output format is 8-bit 4:2:0, and using that higher precision in its internal optimizations. In the case of 10-bit sources, an encoder should be able to do a much more compression-friendly dither than a preprocessing dithering algorithm.

(At some point time back, I cracked my head about how MBAFF might handle the point of different chroma sample positions for [progressive] vs. [interlaced]. Days and headaches later, I realized that it just doesn't, because there's no need. Chroma sample positions play a role when 4:2:0 is created from a higher (chroma) resolution parent. But when the encoder gets into play, all is already fixed and done.)

zinga

9th January 2013, 07:27

Bumping old thread.

Out of interest, has anyone done any comparisons of this for x264? I'm interested in what quality you could get from a fixed size with various options.
For example, out of these three, which would be optimal for quality?
- encoding as 4:2:0
- encoding as 4:4:4 but the chroma QP offset raised enough to give similar resulting filesize as the above
- encoding as 4:4:4 but raising crf to give similar resulting filesize
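
A minimal sketch of how the three variants could be set up, assuming an x264 build with 4:4:4 support and the standard `--output-csp`, `--crf` and `--chroma-qp-offset` options. The file names, CRF values and the offset are placeholders, not recommendations; they would have to be tuned by trial until the three file sizes match.

```python
import shlex


def x264_command(src, dst, csp, crf, chroma_qp_offset=0):
    """Build an x264 command line as an argument list (nothing is executed)."""
    cmd = ["x264", "--output-csp", csp, "--crf", str(crf)]
    if chroma_qp_offset:
        cmd += ["--chroma-qp-offset", str(chroma_qp_offset)]
    cmd += ["-o", dst, src]
    return cmd


# Hypothetical source file and settings, for illustration only.
variants = {
    "420": x264_command("source.avs", "out_420.mkv", "i420", crf=20),
    "444_chroma_offset": x264_command("source.avs", "out_444_cqo.mkv", "i444",
                                      crf=20, chroma_qp_offset=6),
    "444_higher_crf": x264_command("source.avs", "out_444_crf.mkv", "i444", crf=22),
}

for name, cmd in variants.items():
    print(name, "->", shlex.join(cmd))
```

The commands are only printed here; passing each list to `subprocess.run` would perform the actual encodes.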

Poutnik

16th March 2013, 08:06

Note that the HVS is very sensitive to human faces.

Subsampling seems to me to be safer
than possibly noticeable color quantization, except at high bitrates...

I may try some simulation tests with JPG subsampling modes, saving to the same size.

Dark Shikari

16th March 2013, 08:52

I don't think you know what quantization (in terms of frequency-domain coding) actually means...

If you're worried about the average color changing, that's quite unlikely to happen given the way DC coefficients are coded in H.264 and the way chroma QPs are calculated.

Poutnik

16th March 2013, 09:55

Subsampling JPG test, saving both sampling variants at (almost) the same target size
( Thumbnails as links to full size pictures )

It is a "quick and dirty" test using Irfanview ( + all Plugins ) on a 24bit master PNG file.
The test switched JPG subsampling ON/OFF while saving to the same target size: 1/25, 1/50 and 1/100 of the uncompressed size.

JPGs saved at compression ratio 1:25, 1st subsampled, 2nd without subsampling

http://imageshack.us/a/img571/817/subsamplingface1025subs.th.jpg (http://imageshack.us/photo/my-images/571/subsamplingface1025subs.jpg/)

http://imageshack.us/a/img585/6864/subsamplingface1025nosu.th.jpg (http://imageshack.us/photo/my-images/585/subsamplingface1025nosu.jpg/)

Both seem good.

JPGs saved at compression ratio 1:50, 1st subsampled, 2nd without subsampling

http://imageshack.us/a/img202/9130/subsamplingface1050subs.th.jpg (http://imageshack.us/photo/my-images/202/subsamplingface1050subs.jpg/)

http://imageshack.us/a/img842/7892/subsamplingface1050nosu.th.jpg (http://imageshack.us/photo/my-images/842/subsamplingface1050nosu.jpg/)

JPG without subsampling has light blocking.

JPGs saved at compression ratio 1:100, 1st subsampled, 2nd without subsampling

http://imageshack.us/a/img833/5674/subsamplingface1100subs.th.jpg (http://imageshack.us/photo/my-images/833/subsamplingface1100subs.jpg/)

http://imageshack.us/a/img607/7239/subsamplingface1100nosu.th.jpg (http://imageshack.us/photo/my-images/607/subsamplingface1100nosu.jpg/)

JPG with subsampling has medium blocking.
JPG without subsampling has strong blocking.

Poutnik

16th March 2013, 10:04

If you're worried about the average color changing, that's quite unlikely to happen given the way DC coefficients are coded in H.264 and the way chroma QPs are calculated.

I am no expert in AVC encoding, while you are. I know the DCT is just one part of the whole H.264 scheme.

Are you saying not to worry about colour encoding artefacts when encoding full instead of quarter-size colour planes at the same bitrate?

I don't think you know what quantization (in terms of frequency-domain coding) actually means...

Hm, I am sad to hear it.

I do know what quantization coefficients and the frequency domain mean in an FT or DCT context. Some years ago I was interested in the forward and inverse Fourier transform in signal processing. The DCT is closely related, using orthogonal discrete cosine basis functions. I know how the DCT is defined and calculated. At least I knew it; some details may need a refresh, as it has been a long time.

I even made myself an Excel table that calculates quantized DCT coefficients for a given 8x8 value array. I used it to examine how XVID custom matrices work.
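
The kind of calculation such a table performs can be sketched as follows. This is a plain floating-point orthonormal 8x8 DCT-II with a simple divide-and-round quantizer, not the exact transform or rounding of any real codec (H.264 in particular uses integer transforms). The flat matrix of 16 is an arbitrary example value.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (list of lists)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantize(coeffs, matrix):
    """Divide each coefficient by its matrix entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, matrix)]


flat_block = [[128] * N for _ in range(N)]   # a uniform block has only a DC term
flat_matrix = [[16] * N for _ in range(N)]   # example quantization matrix
levels = quantize(dct_2d(flat_block), flat_matrix)
print(levels[0])
```

Larger entries in the quantization matrix zero out more of the high-frequency coefficients, which is where a custom matrix trades detail for size.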

xooyoozoo

16th March 2013, 10:31

Poutnik

16th March 2013, 11:04

The luma plane doesn't need to be dragged down along with everything else, as a theoretical 'smart' encoder would be able to throw away exactly the same amounts of chroma bits in a full-color scenario versus a subsampled one. The advantage of this smart encoder is that in situations where it knows that we want good color, it acts accordingly, which would be impossible if the colors were already subsampled.

Let's suppose bitrate/final size is the limitation.
Isn't the scenario analogous to encoding 720x576 PAL versus 360x288 PAL/2 at the same bitrate, applied to the colour planes only?
What is good for PAL/2 may not be good for full PAL, both for the whole video and for colour only. But I may be wrong.

I think a better test would be to first make jpegs using the normal 420 subsampling. Afterwards, extract those files's luma+chroma q-tables and apply that on the 444 versions. The filesizes here will obviously be larger, so increase the chroma qvalues (hopefully in a non-haphazard manner) until sizes between 420 and 444 files match.

Well, it was a quick and dirty test using Irfanview on a 24bit master PNG file. The test switched JPG subsampling ON/OFF with the same target size: 1/25, 1/50 and 1/100 of the uncompressed size.

Edit: I am aware it is not a direct analogy to video encoding with the same target size and different colour sampling/encoding.
In the video case there would not be luma blocking differences, but possibly chroma blocking differences.

As a non-expert in x264, I have always thought x264 encodes in 4:2:0.
But I am confused now by some posts. It seems that if I provide 4:4:4 content to x264 (e.g. as AVS RGB32), it will keep it... Yes or no?

gyth

16th March 2013, 15:12

It is a "quick and dirty" test using Irfanview ( + all Plugins ) on a 24bit master PNG file.
Did it have options for independent quality control of luma/chroma?
If so, what settings did you use?

Of course throwing away 75% of your data leaves you with less data.
The question is whether dct can do a better job with the same number of bits. (or a similar job with fewer bits)
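
For reference, the raw sample counts behind that "75%" can be worked out as below. The 75% applies to the chroma planes only; counted over the whole frame, 4:2:0 has half the samples of 4:4:4. The 1920x1080 frame size is just an example.

```python
# Horizontal and vertical chroma divisors for each sampling format.
CHROMA_DIVISORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def samples_per_frame(width, height, chroma_format):
    """Return (luma_samples, chroma_samples) for one frame."""
    h_div, v_div = CHROMA_DIVISORS[chroma_format]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)  # Cb plane + Cr plane
    return luma, chroma


full_luma, full_chroma = samples_per_frame(1920, 1080, "4:4:4")
for fmt in CHROMA_DIVISORS:
    luma, chroma = samples_per_frame(1920, 1080, fmt)
    print(fmt,
          "chroma kept:", chroma / full_chroma,
          "total kept:", (luma + chroma) / (full_luma + full_chroma))
```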

Poutnik

16th March 2013, 15:19

Did it have options for independent quality control of luma/chroma?
If so, what settings did you use?

Of course throwing away 75% of your data leaves you with less data.
The question is whether dct can do a better job with the same number of bits. (or a similar job with fewer bits)

As already noted, it is only an indirect analogy.

A comparative video encode, full chroma with independent Q control versus standard subsampling,
both at the same bitrate, has to be done.

gyth

16th March 2013, 17:39

Poutnik

16th March 2013, 18:52

http://www.ijg.org/files/jpegsr9.zip/usage.txt
"The -quality option has been extended in IJG version 7 for support of separate quality settings for luminance and chrominance"

Maybe -quality 90,90 vs. -quality 90,60 -sample 1x1 ???

It is a possible way to go:
using a different quality for each chroma sampling
to reach the same target size.

MasterNobody

16th March 2013, 20:32

As a non-expert in x264, I have always thought x264 encodes in 4:2:0.
But I am confused now by some posts. It seems that if I provide 4:4:4 content to x264 (e.g. as AVS RGB32), it will keep it... Yes or no?
With the "--output-csp i444" option, x264 will encode the source in the YUV 4:4:4 colorspace. If you use the "--output-csp rgb" option, it will keep the RGB colorspace (which is also 4:4:4).

Poutnik

17th March 2013, 08:24

Yeah, the H.264 specification has a High 4:4:4 Predictive Profile which supports 4:4:4, see:...

I posted a quote from that page too.
But unfortunately most (if not all) ASICs are not able to decode it :(.
Could that be one of the good reasons why 4:2:0 is still used? ;)

With the "--output-csp i444" option, x264 will encode the source in the YUV 4:4:4 colorspace. If you use the "--output-csp rgb" option, it will keep the RGB colorspace (which is also 4:4:4).

For 4:4:4 chroma at the same bitrate, it could be good to combine it with chroma-qp-offset: mewiki_X264_Settings_chroma-qp-offset (http://mewiki.project357.com/wiki/X264_Settings#chroma-qp-offset). The question is a rule of thumb: what would be a typical offset that gives a similar bitrate at twice the chroma resolution? Somewhere it was said that for normal encoding, a Q higher by 6 gives about half the size. As double-resolution chroma may (guessing) compress better, perhaps chroma-qp-offset=10 would be about right for the same-bitrate scenario.
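
Behind the "+6" rule of thumb is the fact that the H.264 quantizer step size doubles for every 6 QP. A rough sketch, using the common closed-form approximation Qstep ≈ 0.625 * 2^(QP/6) rather than the exact table from the standard. How the step size translates into bitrate depends on the content, so the ratios below describe quantizer coarseness only, not file size.

```python
def qstep(qp):
    """Approximate H.264 quantizer step size for a given QP.

    Closed-form approximation (0.625 at QP 0, doubling every 6 QP);
    the standard defines the exact values by table.
    """
    return 0.625 * 2 ** (qp / 6)


base_qp = 24  # arbitrary example value
for offset in (0, 6, 10, 12):
    ratio = qstep(base_qp + offset) / qstep(base_qp)
    print(f"offset +{offset}: quantizer step is {ratio:.2f}x coarser")
```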

chroma-qp-offset , Default: 0, Add an offset to the quantizer of chroma planes when encoding. The offset can be negative.

When psy options are enabled (psy-rd, psy-trellis), x264 automatically subtracts 2 from this value to compensate for these optimisations overly favouring luma detail by default.

Note: x264 only encodes the luma and chroma planes at the same quantizer up to quantizer 29. After this, chroma is progressively quantized by a lower amount than luma until you end with luma at q51 and chroma at q39. This behavior is required by the H.264 standard.
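
The mapping described in that note can be written out as a lookup. The table below is a transcription of the 8-bit chroma QP table as I recall it from the H.264 specification, so it should be checked against the standard before relying on it. The function name and the way the offset is applied (added to the luma QP, then clipped to 0..51) are a simplified sketch.

```python
# Chroma QP for index values 30..51; below 30 the chroma QP equals the index.
_CHROMA_QP_TAIL = [29, 30, 31, 32, 32, 33, 34, 34, 35, 35, 36,
                   36, 37, 37, 37, 38, 38, 38, 39, 39, 39, 39]
CHROMA_QP_TABLE = list(range(30)) + _CHROMA_QP_TAIL


def chroma_qp(luma_qp, chroma_qp_offset=0):
    """Chroma QP for a given luma QP and chroma QP offset (8-bit, simplified)."""
    index = max(0, min(51, luma_qp + chroma_qp_offset))
    return CHROMA_QP_TABLE[index]


for qp in (20, 29, 30, 40, 51):
    print("luma", qp, "-> chroma", chroma_qp(qp))
```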

BTW, are there SW decoders known to be able to play it? (In my case MPC-HC/BE, FFDShow and VideoLAN are used, in this order.)

Dark Shikari

17th March 2013, 09:30

x264 already gives +6 to chroma QP offset in 4:4:4 mode. This is completely arbitrary and probably not an ideal choice.

All libavcodec-based decoders should play 4:4:4 fine.

Poutnik

17th March 2013, 10:17

x264 already gives +6 to chroma QP offset in 4:4:4 mode. This is completely arbitrary and probably not an ideal choice.
Thanks for the info, I expected it might do something like that.
But... avisynth users with slow scripts will not like the idea of processing 4:4:4 data. :o

Poutnik

17th March 2013, 16:12

Question:

When going for 4:4:4, why exactly would you choose YCbCr? Wouldn't it be better to directly use RGB?
Are there any advantages using YCbCr over RGB when going for 4:4:4?

Even without chroma subsampling, YCbCr has the big advantage of allowing luma and chroma to be encoded with different effort, as the HVS responds to them differently. It is the same principle as in audio encoding: what is less noticeable is omitted more. In the end, the video is aimed at human eyes, not at monitor VGA/DVI/HDMI interfaces. And there is a lot of B/W material, possibly ported to H.265 later. For that, RGB is nonsense; Y8 or YV12 are good enough.

With the incoming UHDTV (3840 × 2160, 7680 × 4320), every bit will count. Ultra_high_definition_television (http://en.wikipedia.org/wiki/Ultra_high_definition_television)

gyth

17th March 2013, 17:53

Are there any advantages using YCbCr over RGB when going for 4:4:4?
Very often an object will be a single color, with lighting variations.
In RGB, all components would change; while in YCbCr, only the luma changes.

akupenguin

17th March 2013, 21:08

In RGB, all components would change; while in YCbCr, only the luma changes.
False. YCbCr is better in that respect than RGB, but not perfect.
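
A small numeric check of this point, assuming BT.709 luma coefficients and a simplified model in which a lighting change scales all three R'G'B' values by the same factor (ignoring gamma subtleties and integer offsets). Under that model Cb and Cr scale along with Y for any non-gray color, so the chroma components do change, just by less in absolute terms than the RGB components.

```python
KR, KB = 0.2126, 0.0722   # BT.709 luma coefficients
KG = 1.0 - KR - KB


def rgb_to_ycbcr(r, g, b):
    """Analog-style YCbCr from normalized R'G'B' (no offsets, no quantization)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return y, cb, cr


lit = (0.8, 0.4, 0.2)                  # an orange-ish surface, fully lit
shaded = tuple(0.5 * c for c in lit)   # same surface at half the brightness

print("lit:   ", rgb_to_ycbcr(*lit))
print("shaded:", rgb_to_ycbcr(*shaded))
```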

Now, why don't codecs use colorspaces such as HSV which do have the property of brightness changes affecting only one component? I'd guess either because old TVs couldn't afford any transform more complicated than affine, or because HSV behaves badly under inter prediction and uniform quantization. And you'd have to replace DCT with I don't even know what to handle the fact that Hue is cyclic.
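
The cyclic-hue problem is easy to see with Python's standard `colorsys` module: two reds that look almost identical end up at opposite ends of the hue scale. The `circular_distance` helper is a made-up name for this illustration.

```python
import colorsys

# Two nearly identical reds, one leaning toward magenta, one toward orange.
hue_a = colorsys.rgb_to_hsv(1.0, 0.0, 0.05)[0]
hue_b = colorsys.rgb_to_hsv(1.0, 0.05, 0.0)[0]


def circular_distance(h1, h2):
    """Distance between two hues on a circle of circumference 1."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


print("hues:", hue_a, hue_b)
print("plain difference:   ", abs(hue_a - hue_b))
print("circular difference:", circular_distance(hue_a, hue_b))
```

A residual or transform that treats hue as an ordinary number sees a huge jump here, which is one way to read the remark about prediction and the DCT.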

drmpeg

17th March 2013, 23:16

This paper has a table of coding gains for various RGB transforms. RGB to ITU-R BT.709 YCbCr has a gain of 3.44 dB.

http://wftp3.itu.int/av-arch/jvt-site/2003_09_SanDiego/JVT-I014.doc

Ron

mzso

18th March 2013, 10:50

False. YCbCr is better in that respect than RGB, but not perfect.

Now, why don't codecs use colorspaces such as HSV which do have the property of brightness changes affecting only one component? I'd guess either because old TVs couldn't afford any transform more complicated than affine, or because HSV behaves badly under inter prediction and uniform quantization. And you'd have to replace DCT with I don't even know what to handle the fact that Hue is cyclic.

Did anyone try to use HSV for a codec?