Compression of black bars in AVC, does it take a lot of space?

tijgert · 18th October 2009, 23:07

I've been experimenting with different encode settings. The original goal was to see for myself the quality difference (if any) at medium bitrates. By mistake I made one version of 'The Patriot' (2:55 hours) with the black bars cropped (17 gigs) and one version without cropping (19 gigs).

I'm having a hell of a lot of trouble actually seeing any difference between the two encodes, but that *may* be because the moving parts of the movie have roughly the same bitrate while the black bars eat up those 2 extra gigs. I don't really know.

What I'm actually asking you folks here is about the compressibility of the black bars: can anyone give me a reasonable conclusion on how cropping black bars affects movie quality when encoding at the same bitrate? (Do they compress down to near nothing or are they hungry for bits?)

Boolsheet · 18th October 2009, 23:24

I know someone asked a similar question not too long ago. *pokes search* Ah here:
http://forum.doom9.org/showthread.php?t=149403

Mh, if I remember correctly there must be another thread with the same question...

tijgert · 18th October 2009, 23:36

The other topic actually implies a better quality movie WITH the black bars during a one pass encode.
But I was doing a 2 pass encode with Ripbot (I know, hardly professional but I'm just trying to learn from experience) and so I'm not sure how that translates.

I have no interest in any compliance with resolution restrictions by the way.

Sagekilla · 18th October 2009, 23:48

If you're using crf mode, the fact that the encode with black bars came out larger is normal and expected. Black bars = very flat, easy to encode. Because of how much space they take up in the frame, they decrease the overall complexity of the scene and x264 lowers the quantizer used for the frames. That's why you get a larger file size.

Otherwise, if you encoded using identical bitrates (using 2-pass, for example), there should be little to no difference since black bars consume VERY few bits.

LoRd_MuldeR · 18th October 2009, 23:53

What actually consumes bits would be the border between the black area and the actual content. If the border is perfectly sharp and aligned to mod16, the overhead for the black area will be negligible.

Dark Shikari · 18th October 2009, 23:53

You should always crop black bars if possible. They don't take up much space, but there's no reason to keep them (it wastes encoding and decoding time) and it slightly lowers the efficiency of motion compensation. It's worse if the black bars don't end on mod16 boundaries.

tijgert · 19th October 2009, 00:03

Yeah well I intended to crop the bars, I just forgot.
Since I encoded both attempts at different bitrates I also can't say that those 2 extra Gigs went to the black bars.
It'd be kinda much anyway.

This little foul-up just made me curious as to the effect of leaving the black bars in (with very crisp borders by the way)
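To put the two file sizes in perspective, they can be converted to average bitrates. This is only a rough sketch: it assumes "gigs" means decimal gigabytes (10^9 bytes) and treats the whole file as video, ignoring audio and container overhead. The function name is made up for illustration.

```python
def average_bitrate_mbps(size_gb, hours, minutes):
    """Average bitrate in Mbit/s for a file of size_gb decimal gigabytes."""
    seconds = hours * 3600 + minutes * 60
    bits = size_gb * 1e9 * 8
    return bits / seconds / 1e6


# File sizes and running time as stated in the first post (2:55 hours).
cropped = average_bitrate_mbps(17, 2, 55)
uncropped = average_bitrate_mbps(19, 2, 55)

print(f"cropped:    {cropped:.2f} Mbit/s")
print(f"uncropped:  {uncropped:.2f} Mbit/s")
print(f"difference: {uncropped - cropped:.2f} Mbit/s")
```

Under those assumptions the two encodes average roughly 13 and 14.5 Mbit/s, a gap of about 1.5 Mbit/s, which fits the explanation that they were simply encoded at different bitrates.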

benwaggoner · 19th October 2009, 09:32

Black bars from noisy sources are also a lot worse from both a compression efficiency and quality perspective. Even a little bit of noise in there can require a lot of bits to avoid blocking on LCD displays. Cropping them out thus saves some noise reduction tweaking and processing time as well.

Getting letterboxing to be mod16 is really important for quality with MPEG-2, since it doesn't have any good way of dealing with a really sharp edge mid-block.

shon3i · 19th October 2009, 09:57

Is it better to crop and add new black bars if needed (Blu-ray) than to keep the originals?

Dark Shikari · 19th October 2009, 09:59

Quote:

Originally Posted by shon3i

Is it better to crop and add new black bars if needed (Blu-ray) than to keep the originals?

Yes, and to add the new ones such that the borders of the black bars are mod16.

Sharc · 19th October 2009, 10:11

When Blu-ray compliance is required, it is good practice to crop the black borders so as to leave a mod16-compliant active picture, and to add new, noise-free black borders using AddBorders(...).

Edit:
DS was faster ....
But a question: if the added black borders are not mod16, wouldn't x264 just pad it anyway to be mod16 compliant?

Dark Shikari · 19th October 2009, 10:18

Quote:

Originally Posted by Sharc

But a question: if the added black borders are not mod16, wouldn't x264 just pad it anyway to be mod16 compliant?

I meant that the edges of the borders were mod16, not the image itself.
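The distinction above, aligning the edges of the bars rather than the frame size, can be sketched as a small calculation. This is a minimal illustration, assuming 16x16 macroblocks and top/bottom bars only. The function name and the 1080-line example with 132-line bars are hypothetical, not values from this thread.

```python
MB = 16  # macroblock size in luma pixels


def align_bars(frame_height, top_bar, bottom_bar, mb=MB):
    """Return (new_top_bar, new_bottom_bar) with both bar edges on the MB grid.

    Each edge is moved inward, into the picture, so a few picture lines are
    covered by black instead of leaving leftover bar lines inside a macroblock.
    """
    top_edge = -(-top_bar // mb) * mb                      # round up
    bottom_edge = ((frame_height - bottom_bar) // mb) * mb  # round down
    if bottom_edge <= top_edge:
        raise ValueError("bars leave no active picture")
    return top_edge, frame_height - bottom_edge


# Hypothetical letterboxed source: 1080 lines with 132-line bars.
new_top, new_bottom = align_bars(1080, 132, 132)
print(new_top, new_bottom, 1080 - new_top - new_bottom)
```

For these example numbers the bars grow to 144 and 136 lines, so 12 and 4 lines of picture are sacrificed and the two edges fall on lines 144 and 944, both multiples of 16. The results could then be fed into whatever crop and border step is used.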

Shevach · 19th October 2009, 12:03

Quote:

Originally Posted by LoRd_MuldeR

What actually consumes bits would be the border between the black area and the actual content. If the border is perfectly sharp and aligned to mod16, the overhead for the black area will be negligible.

I agree with the above quote. Indeed, if a black border is not aligned to a 16-pixel boundary, then the sharp vertical edge might cause high-frequency components to appear after the DCT.

I would add a point related to CABAC encoding. Note that it is relevant only if the stream is encoded in CABAC mode.
The entropy models of an active video area and a black area are usually very different. Consequently, when an encoder reaches the black area with the entropy models tuned for the active video, "penalty" (extra) bits are generated.
In information theory there is a specific term for this, the Kullback–Leibler divergence, which measures the expected number of extra bits required to code samples from a source with actual distribution P when using a code based on the distribution Q.

Moreover, within the black area CABAC starts to adapt itself to the actual entropy of the black area, and when it returns to the active video area it again wastes extra bits until it has retuned to the entropy of the active video.
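The penalty described above can be put in numbers for the simplest case, a single binary symbol such as a skip flag. This is a minimal sketch that assumes an idealized static code rather than real CABAC. The function name and the example probabilities (99% skips in the bar, a model expecting 30% skips) are illustrative assumptions, not figures from any encoder.

```python
import math


def kl_divergence_bits(p, q):
    """Expected extra bits per symbol when a Bernoulli(p) source is coded
    with a code tuned for Bernoulli(q). Requires 0 < q < 1."""
    total = 0.0
    for actual, assumed in ((p, q), (1.0 - p, 1.0 - q)):
        if actual > 0.0:  # a term with zero probability contributes nothing
            total += actual * math.log2(actual / assumed)
    return total


# Assumed example: the black bar is 99% skip flags, but the model was tuned
# on active video where only 30% of macroblocks are skipped.
penalty = kl_divergence_bits(0.99, 0.30)
print(f"extra bits per flag: {penalty:.2f}")
print(f"matched model:       {kl_divergence_bits(0.99, 0.99):.2f}")
```

With a matched model the divergence is zero. An adaptive coder only pays the mismatch penalty until its model has caught up, which is why the size of the penalty depends on how quickly it adapts.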

G_M_C · 19th October 2009, 12:25

Just theoretical:

If you need the black bars to get to a BD-spec resolution, is it not possible to encode the cropped image and add the black bars back by adding a slice above and a slice below the real image? (So in fact making slices useful in those cases?)

How it can be done is another matter. But theoretically speaking, should this not work?

Shevach · 19th October 2009, 13:05

G_M_C,
I assume your comment relates to my explanation.

Indeed, we can encode the top and bottom margins as separate slices. After at most 64 MBs CABAC should tune itself to the actual entropy of the black area (since the context models are initialized according to cabac_init_idc and SliceQP and start far from the actual entropy). In that case extra bits are generated only for the first 64 MBs.
As for the right and left margins, 'slicing' is not a good idea.

Generally speaking, anyone can conduct the following experiment:
1) Take two "different" video excerpts, say with the same resolution of 480x720 and the same number of pictures. Here "different" means with different entropy (e.g. one stream can be from a cartoon, the other from live video).

2) Encode these two video streams in CABAC mode with constant QP (to exclude the impact of rate control).
Suppose that the first stream gives X bits and the other Y bits.

3) Then compose a new video by concatenating raw data of the first excerpt and the second one. The resulting video has a resolution of 480x1440 (where the left part is from the first stream and the right part is from the second).

4) Encode the concatenated video excerpt in CABAC mode with the same QP.
The new stream is expected to give many more bits than (X + Y), because the entropy changes across each row and CABAC can't stabilize.
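The entropy-coding part of this experiment can be imitated with a toy model. The sketch below is not CABAC and involves no video: it assumes a single adaptive probability estimate with exponential decay, two artificial "excerpts" with opposite statistics, and a clamp borrowed from the smallest probability CABAC can represent. All names and sizes are illustrative assumptions.

```python
import math

P_MIN = 0.01875  # clamp, borrowed from CABAC's smallest LPS probability


def adaptive_cost_bits(symbols, rate=1 / 16):
    """Ideal code length in bits of a 0/1 sequence under a toy adaptive model."""
    p_one = 0.5  # start with no knowledge of the source
    bits = 0.0
    for s in symbols:
        p = p_one if s == 1 else 1.0 - p_one
        bits += -math.log2(p)
        # Move the estimate towards the symbol just seen, then clamp it.
        p_one += rate * (s - p_one)
        p_one = min(max(p_one, P_MIN), 1.0 - P_MIN)
    return bits


ROWS, WIDTH = 8, 64
row_a = [1] * WIDTH  # stand-in for one row of the first excerpt
row_b = [0] * WIDTH  # stand-in for one row of the second excerpt

x = adaptive_cost_bits(row_a * ROWS)                       # first stream alone
y = adaptive_cost_bits(row_b * ROWS)                       # second stream alone
side_by_side = adaptive_cost_bits((row_a + row_b) * ROWS)  # rows concatenated

print(f"X + Y:        {x + y:.1f} bits")
print(f"side by side: {side_by_side:.1f} bits")
```

In this toy setting the side-by-side sequence costs more than X + Y, because the single model has to readapt twice per row. It isolates only the adaptation effect and says nothing about motion prediction or about how large the effect is in a real bitstream.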

akupenguin · 19th October 2009, 19:03

Slices can only end at macroblock boundaries, which is precisely the condition on which borders aren't too horribly inefficient anyway.
The number of bits wasted waiting for cabac to readapt from coding "all skips" to coding real content is less than the number of bits spent on a slice header, because the border will be all skips: it doesn't need to code any other type of token, so it doesn't contaminate any other part of the cabac model.
A new slice might possibly improve mv prediction along the top edge (which would otherwise use a constant 0 prediction due to the two neighbors with mv=0), and it would allow use of DC prediction for intra blocks there (which would otherwise average half of its neighbors from the black part). This doesn't do jack for the bottom border, since motionless skip blocks don't rely on neighbors even that much.

Quote:

Originally Posted by Shevach

Then compose a new video by concatenating raw data of the first excerpt and the second one.

That mixes a change in entropy coding with a change in motion interpolation. mvs that used to point off the left or right edge of the frame (and thus get extrapolated samples), now point into the other video (and thus get much worse values).

Shevach · 20th October 2009, 09:09

Quote:

Originally Posted by akupenguin

That mixes a change in entropy coding with a change in motion interpolation. mvs that used to point off the left or right edge of the frame (and thus get extrapolated samples), now point into the other video (and thus get much worse values).

The question is how to extract the impact of the entropy change alone from that mixture.

Let's consider a P-frame with width 64x16 pixels where all MBs in each even row are non-skip while all MBs in each odd row are skip.
In that case only one context model, number 12, is ever selected for the mb_skip_flag syntax element, since the left MB is always skip/non-skip and the top MB is always non-skip/skip.

When the encoder finishes an even row, context model #12 holds LPS=1 (least probable symbol) with probability 0.01875 (the minimal probability in CABAC).
Now the encoder starts to process the next MB row (an odd row), where the only syntax elements to code are mb_skip_flag and end_of_slice. The number of bits generated for mb_skip_flag of the first MB is 6-7 bits. After 64 MBs mb_skip_flag is coded with ~1 bit. Thus ~2-3 extra bits are produced on average for coding the mb_skip_flags.

The above example shows how a change of entropy affects the total output size.

akupenguin · 20th October 2009, 15:38

Quote:

Originally Posted by Shevach

Let's consider a P-frame with width 64x16 pixels

64x16 MBs?

Quote:

Originally Posted by Shevach

The number of bits generated for mb_skip_flag of the first MB is 6-7 bits. After 64 MBs mb_skip_flag is coded with ~1 bit. Thus ~2-3 extra bits are produced on average for coding the mb_skip_flags.

The number of bits generated for mb_skip_flag of the first MB is 5.7 bits. After 64 MBs, mb_skip_flag is coded with 0.027 bits. The total cost of all 64 flags is 46.2 bits, which is an overhead of 44.4 compared to the 1.8 bits it would cost to code 64 flags optimally predicted.
If you code each row as its own slice, then the slice header + NAL encapsulation costs about 80 bits (or more if you use mmco or wpred). And a new slice doesn't init the cabac state to "perfect prediction", it inits to some QP-dependent but otherwise constant value, which for the sake of simplicity I'll assume to be the 50% context. Coding 64 flags starting from there costs 17.1 bits. Total overhead 95 bits.
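The closed-form parts of these figures are easy to check. The sketch below computes only the two end points (a maximally wrong model and a fully adapted one) from the minimal probability 0.01875, and then redoes the overhead arithmetic with the quoted totals taken as given. It does not model CABAC's state transitions, so the 46.2 and 17.1 bit totals are inputs here, not results.

```python
import math

P_LPS_MIN = 0.01875  # minimal CABAC probability, as stated in this thread

# Cost of one flag when the model is maximally wrong vs. fully adapted.
first_flag_bits = -math.log2(P_LPS_MIN)
steady_flag_bits = -math.log2(1.0 - P_LPS_MIN)
optimal_64_flags = 64 * steady_flag_bits

# Totals quoted in the post above, taken as given (not recomputed here).
adapt_total = 46.2     # 64 flags while the model readapts
slice_header = 80.0    # slice header + NAL encapsulation
from_half = 17.1       # 64 flags starting from the 50% context
optimal_quoted = 1.8   # 64 optimally predicted flags

overhead_no_slice = adapt_total - optimal_quoted
overhead_new_slice = slice_header + from_half - optimal_quoted

print(f"first flag:   {first_flag_bits:.2f} bits")
print(f"adapted flag: {steady_flag_bits:.3f} bits")
print(f"64 adapted:   {optimal_64_flags:.2f} bits")
print(f"overhead without a new slice: {overhead_no_slice:.1f} bits")
print(f"overhead with a new slice:    {overhead_new_slice:.1f} bits")
```

The end points come out at about 5.74 and 0.027 bits, matching the post, and 64 adapted flags cost about 1.75 bits, close to the quoted 1.8. The comparison then reduces to roughly 44 bits of readaptation against roughly 95 bits for a new slice.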

Shevach · 21st October 2009, 09:21

Quote:

Originally Posted by akupenguin

64x16 MBs?

The number of bits generated for mb_skip_flag of the first MB is 5.7 bits.

I see you are familiar with information theory.
Indeed -log2(0.01875) = ~5.7 bits. This is an approximate value since CABAC is a Q-coder, i.e. the multiplications used to determine the intervals are approximated. Therefore the exact number of bits that CABAC produces for the very first mb_skip_flag is believed to be between 5 and 6 bits. So I was wrong in my estimate of 6-7 bits.
Anyway, your reasoning sounds right, i.e. slice headers can only make the situation worse. Frankly speaking, I was not a proponent of slice headers. I wanted to show that a sharp change of entropy within a picture might produce extra bits.

There is the question of how to initialize the context models at the start of a slice so that they approach the actual entropy of the data within the slice as closely as possible. Two parameters, cabac_init_idc and sliceQP, affect the initial setting of the context models.
One can select the best sliceQP and cabac_init_idc to get the closest approximation to the actual entropy. In that case the number of extra bits generated by CABAC during adaptation is minimal.
For example, let's assume that sliceQP=1 and cabac_init_idc=1 give the best approximation; then the first MB should contain a qp_delta in order to signal the correct QP value to the decoder (because sliceQP has been chosen only for CABAC initialization).
On the other hand, the initial settings of the context models are not uniformly dispersed. Therefore for some sources the initial CABAC settings are far from the actual entropy. Unfortunately H.264 does not support custom initialization of context models, so on many video sources CABAC generates a lot of extra bits at the start of slices until it adapts itself.

Dark Shikari · 21st October 2009, 09:49

Quote:

Originally Posted by Shevach

I see you are familiar with information theory.

Perhaps you should look again at who the person you're responding to is

Quote:

Originally Posted by Shevach

There is the question of how to initialize the context models at the start of a slice so that they approach the actual entropy of the data within the slice as closely as possible. Two parameters, cabac_init_idc and sliceQP, affect the initial setting of the context models.

Both of which, from my testing, are totally useless.

