Old 8th May 2013, 17:33   #1  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Updated Rule of ^3/4 for H.264 high profile?

So, a rule of thumb in compression has long been that changing frame size while maintaining similar subjective quality doesn't entail a linear increase/decrease in bitrate.

The classic rule of thumb has been that bitrate should scale with the ratio of pixel areas raised to the power of 3/4. Thus going from 640x480 at 1 Mbps to 1280x960 would entail 4^0.75 = 2.83 Mbps. Conversely, going down to 320x240 would allow a reduction to 0.25^0.75 = 0.35 Mbps.
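
For anyone who wants to plug in their own numbers, here is a minimal sketch of that power-law scaling. The function name `scale_bitrate` and its parameters are made up for illustration; the only thing taken from the rule itself is the area ratio raised to an exponent.

```python
def scale_bitrate(base_kbps, base_size, new_size, exponent=0.75):
    """Scale a bitrate by the pixel-area ratio raised to `exponent`.

    base_size and new_size are (width, height) tuples. The default
    exponent of 0.75 is the classic rule of thumb, not a measured value.
    """
    base_w, base_h = base_size
    new_w, new_h = new_size
    area_ratio = (new_w * new_h) / (base_w * base_h)
    return base_kbps * area_ratio ** exponent


# The two examples from the post, starting from 640x480 at 1000 kbps.
up = scale_bitrate(1000, (640, 480), (1280, 960))
down = scale_bitrate(1000, (640, 480), (320, 240))
print(f"1280x960: {up:.0f} kbps, 320x240: {down:.0f} kbps")
```

Passing `exponent=0.71` gives the VC-1 style variant; for a 4x change in area the two exponents differ by only about 5% in the resulting bitrate.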

This originally seems to have been calibrated around MPEG-4 pt 2, however. Zambelli did some extensive testing a few years ago and discovered 0.71 was a better value for VC-1. I imagine that in-loop deblocking would tend to reduce the value.

Thus, I expect that the exponent would be even lower for H.264. I was wondering if anyone has come up with their own rule of thumb, experimentally or otherwise. I expect that High Profile is probably lower than Main, which is lower than Baseline.
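
If someone does have encodes at several frame sizes that they judge to be of similar quality, one possible way to turn them into an exponent is a least-squares fit of log(bitrate) against log(pixel area). The sketch below is only an assumption about how such a fit might be done; `fit_exponent` is a hypothetical helper, and the data points are synthetic (generated from the 0.75 rule itself), not measurements.

```python
import math


def fit_exponent(points):
    """Return the slope of log(bitrate) vs log(pixel area).

    `points` is a list of (width, height, kbps) tuples that are assumed
    to have been judged equal in subjective quality.
    """
    xs = [math.log(w * h) for (w, h, _) in points]
    ys = [math.log(kbps) for (_, _, kbps) in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic points built from the 0.75 rule, only to show that the fit
# recovers the exponent it was given. Replace with real measurements.
synthetic = [
    (w, h, 1000 * ((w * h) / (640 * 480)) ** 0.75)
    for (w, h) in [(320, 240), (640, 480), (1280, 960)]
]
fitted = fit_exponent(synthetic)
print(f"fitted exponent: {fitted:.3f}")
```

With real data the points will not sit exactly on a line, so the scatter around the fit is a rough indication of how precise the rule of thumb can be.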

I expect the exponent would be even lower for HEVC with its huge block sizes and other features to very efficiently encode areas of lower detail.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 8th May 2013, 23:49   #2  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
Do you think the difference between .75 and .71 reflects a difference between MPEG4part2 and VC-1? Or just that the rule of thumb never was that precise?
Old 9th May 2013, 01:21   #3  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by akupenguin View Post
Do you think the difference between .75 and .71 reflects a difference between MPEG4part2 and VC-1? Or just that the rule of thumb never was that precise?
Yeah, that small difference isn't going to matter.

But more broadly, I do think more advanced codecs will need fewer bits per pixel as frame size goes up.

Hmmm. There's probably a similar heuristic for frame rate. The exponent would be lower, since increasing frame rate reduces the motion between frames.
Old 9th May 2013, 10:00   #4  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,997
Quote:
Originally Posted by benwaggoner View Post
So, a rule of thumb in compression has long been that changing frame size while maintaining similar subjective quality doesn't entail a linear increase/decrease in bitrate.

The classic rule of thumb has been that bitrate should scale with the ratio of pixel areas raised to the power of 3/4. Thus going from 640x480 at 1 Mbps to 1280x960 would entail 4^0.75 = 2.83 Mbps. Conversely, going down to 320x240 would allow a reduction to 0.25^0.75 = 0.35 Mbps.
If I understand this correctly, this rule for "similar subjective quality" applies only when watching the encoded video at its native resolution.
A different question would be about the crossover resolution for similar subjective quality, given a fixed bitrate and a fixed display size (e.g. TV screen).
Old 9th May 2013, 11:26   #5  |  Link
Manao
Registered User
 
Join Date: Jan 2002
Location: France
Posts: 2,856
Quote:
But more broadly, I do think more advanced codecs will need fewer bits per pixel as frame size goes up.
That's particularly true of HEVC, which really shines (against H264) at large resolutions thanks to 64x64 "macroblocks" and better spatial prediction - both for intra and motion vectors.

Since mpeg4p2 doesn't really have any intra prediction, and only very basic motion prediction, I think most of the explanation for the rule of thumb (instead of the expected linear relationship) lies in the interaction between the 8x8 DCT, details & quantization. H264 has good intra prediction, better motion prediction (especially skip), and an adaptive entropy coder, so it can handle large resolutions a lot better. "Sadly", I think low resolution coding was improved even more (4x4 DCT, 4x4 partitions), so the rule may not have changed that much. HEVC doesn't care as much about small resolutions (if I'm not mistaken, no more 4x4 inter partitions), and really improved the coding of large ones. There's no doubt in my mind that the exponent will be lower.

Quote:
There's probably a similar heuristic for frame rate.
Yes, and it depends on the codec too. Without hierarchical bframes (h264, hevc), you'll be taking a large bitrate penalty by increasing framerate. Hierarchical bframes are a must because doubling the framerate can be seen as adding a hierarchical layer of non-reference, highly quantized bframes that will be smaller than all the other frames. As for the content itself, inter frame motion gets smaller, but motion blur disappears, and sharper frames are harder to code.
Old 10th May 2013, 20:03   #6  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by Manao View Post
Without hierarchical bframes (h264, hevc), you'll be taking a large bitrate penalty by increasing framerate. Hierarchical bframes are a must because doubling the framerate can be seen as adding a hierarchical layer of non-reference, highly quantized bframes that will be smaller than all the other frames. As for the content itself, inter frame motion gets smaller, but motion blur disappears, and sharper frames are harder to code.
The comparison I was thinking of was the same source reducing the frame size or frame rate, so the motion blur would be constant in the comparison.

For example, comparing the extra bits to encode a 60p source at a full 60p instead of just 30p. If the frame rate exponent was 0.5 (arbitrary choice), that would mean that doubling frame rate would require an increase of 41% in bitrate.
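
The same power-law form can be written for frame rate. This is just a sketch of the arithmetic above: `framerate_bitrate_factor` is a made-up name, and the default exponent of 0.5 is the arbitrary value from the example, not a measured one.

```python
def framerate_bitrate_factor(base_fps, new_fps, exponent=0.5):
    """Bitrate multiplier for a frame rate change under a power-law model."""
    return (new_fps / base_fps) ** exponent


# Same 60p source encoded at 60p instead of 30p.
factor = framerate_bitrate_factor(30, 60)
increase_pct = (factor - 1) * 100
print(f"bitrate increase: {increase_pct:.1f}%")
```

An exponent of 1.0 would mean bitrate scales linearly with frame rate, and 0.0 would mean the extra frames are free, so any real value should land somewhere in between.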

Comparing content shot at 24p with 1/48th of a second shutter versus 48p shot with 1/72nd shutter would require a higher exponent.

The hierarchical B-frame example is interesting, but the differences of higher frame rate wouldn't be just that. For one, we could have maxed out the hierarchy possible at 30p already (I think x264 still only does one reference and one non-reference layer of B-frames). Second, the higher frame rate means that there are twice as many frames to pick between for reference frames, so we'd be able to pick better reference frames some of the time.

We'd probably get longer B-frame chains on average as well, since there will be less new visual information per frame.
Old 10th May 2013, 23:08   #7  |  Link
Manao
Registered User
 
Join Date: Jan 2002
Location: France
Posts: 2,856
Quote:
The comparison I was thinking of was the same source reducing the frame size or frame rate, so the motion blur would be constant in the comparison.
Ah, but motion blur must be taken into account. When you reduce frame size, you low-pass filter to avoid aliasing. It's only fair that if you reduce frame rate, you add motion blur, to avoid jerkiness - i.e. temporal aliasing. That said, I agree it's easier to ignore motion blur to compare things that are comparable.

Quote:
For one, we could have maxed out the hierarchy possible at 30p already (I think x264 still only does one reference and one non-reference layer of B-frames)
Yeah, but that's x264. H264 levels for 720p (3.1 & 3.2) set the maximum DPB size to 5 frames, which allows a fully fledged pyramidal structure with 15 Bframes, which is quite a lot already. Iirc, HEVC gives a 6-frame DPB, i.e. 31 Bframes with a full pyramid. And even if you've filled out the pyramidal structure, you can still add non reference bframes. But if that happens, those added frames won't be smaller than all the other frames. However, they'll be as small as the smallest B frames already present in the low frame rate video - so quite small already.
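
The 15 and 31 figures fit the usual count for a full dyadic pyramid: each extra layer of B-frames doubles the number of frames between two anchors, so k layers give 2^k - 1 B-frames. The sketch below reproduces the numbers under the assumption, read off the figures above rather than taken from the spec, that the number of layers is the DPB size minus one. Both function names are made up.

```python
def pyramid_b_frames(layers):
    """B-frames between two anchor frames in a full dyadic pyramid."""
    return 2 ** layers - 1


def b_frames_for_dpb(dpb_frames):
    # Assumption inferred from the post's figures: layers = DPB size - 1.
    return pyramid_b_frames(dpb_frames - 1)


for dpb in (5, 6):
    print(f"DPB of {dpb} frames -> {b_frames_for_dpb(dpb)} B-frames")
```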

I've made a test encoding with x264 + 3B hierarchical, and average bitrate for all bframes ended at half the overall bitrate. With non reference bframes roughly twice as small as reference ones, non reference bframes have an average bitrate of 3/8th of the overall. That roughly means doubling the framerate by adding non reference bframes would increase the bitrate by 37.5%. Adding a layer to the pyramid would probably reduce the increase to 30%.
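
The 3/8 figure can be reproduced with a little arithmetic. The sketch below assumes that "3B hierarchical" means one reference B-frame and two non-reference B-frames per group of three, which is my reading of the post rather than something stated in it; the function name and parameters are made up. All sizes are fractions of the overall average frame size.

```python
import math


def nonref_b_size(avg_b, ref_to_nonref, n_ref=1, n_nonref=2):
    """Size of a non-reference B-frame relative to the average frame.

    avg_b: average B-frame size relative to the overall average frame.
    ref_to_nonref: how many times larger a reference B-frame is.
    n_ref, n_nonref: B-frames of each kind per group (assumed 1 and 2).
    """
    return avg_b * (n_ref + n_nonref) / (ref_to_nonref * n_ref + n_nonref)


nonref = nonref_b_size(avg_b=0.5, ref_to_nonref=2.0)

# Doubling the frame rate by adding one non-reference B-frame per
# existing frame raises the bitrate by that same fraction.
increase = nonref
implied_exponent = math.log2(1 + increase)
print(f"non-ref B size: {nonref}, increase: {increase:.1%}, "
      f"implied exponent: {implied_exponent:.2f}")
```

Read as a frame rate exponent, a 37.5% increase for a doubling corresponds to roughly 0.46, and the 30% estimate to roughly 0.38 (log2 of 1.3).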

Quote:
Second, the higher frame rate means that there are twice as many frames to pick between for reference frames, so we'd be able to pick better reference frames some of the time.
That would assume you're adding reference frames. If you do, then what you say is correct. But you'll reduce bitrate further if you add non reference frames instead.
Old 13th May 2013, 18:50   #8  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by Manao View Post
Ah, but motion blur must be taken into account. When you reduce frame size, you low-pass filter to avoid aliasing. It's only fair that if you reduce frame rate, you add motion blur, to avoid jerkiness - i.e. temporal aliasing. That said, I agree it's easier to ignore motion blur to compare things that are comparable.
Synthesizing motion blur is one model, but is very rarely done in practice. It's quite computationally expensive, among other things. It might be "fair" but it isn't how things are done today.

For example, The Hobbit was shot at 48p with 1/72nd exposure time. That's the average of the 1/48th you'd normally have with 24p and the 1/96th you'd expect with 48p.

I think the 48p version would have looked a lot better with a 1/96th shutter, and that was probably one of the reasons it looked so weird to so many customers.

Quote:
Yeah, but that's x264. H264 levels for 720p (3.1 & 3.2) set the maximum DPB size to 5 frames, which allows a fully fledged pyramidal structure with 15 Bframes, which is quite a lot already. Iirc, HEVC gives a 6-frame DPB, i.e. 31 Bframes with a full pyramid. And even if you've filled out the pyramidal structure, you can still add non reference bframes. But if that happens, those added frames won't be smaller than all the other frames. However, they'll be as small as the smallest B frames already present in the low frame rate video - so quite small already.
Yes, I agree with all that.

Quote:
I've made a test encoding with x264 + 3B hierarchical, and average bitrate for all bframes ended at half the overall bitrate. With non reference bframes roughly twice as small as reference ones, non reference bframes have an average bitrate of 3/8th of the overall. That roughly means doubling the framerate by adding non reference bframes would increase the bitrate by 37.5%. Adding a layer to the pyramid would probably reduce the increase to 30%.
Fair enough. Of course, those gains are only possible if the lower frame rate encode maxed out at 7-8 B-frames. Which is certainly plausible. But low-motion content might have already used 15 B-frames at the original frame rate. I'd still expect significant per-frame bitrate savings in that case.

Quote:
That would assume you're adding reference frames. If you do, then what you say is correct. But you'll reduce bitrate further if you add non reference frames instead.
You can't add to the max number of reference frames, but having more frames to choose between would allow for somewhat better matches. In LFR, choosing between frames 1, 3, and 5 could exclude a better reference frame at 2, 4 or 6.

Last edited by benwaggoner; 14th May 2013 at 20:28. Reason: Fix Quotes
Old 14th May 2013, 06:39   #9  |  Link
ChiDragon
Registered User
 
Join Date: Sep 2005
Location: Vancouver
Posts: 600
Before release people here were claiming that The Hobbit would be shot at 1/48 despite the 48 fps. Didn't know it turned out otherwise.
Old 14th May 2013, 20:32   #10  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by ChiDragon View Post
Before release people here were claiming that The Hobbit would be shot at 1/48 despite the 48 fps. Didn't know it turned out otherwise.
I was wrong; it's actually 1/64th.

filmmakermagazine.com/60811-the-hobbit-arrives
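
For comparing the shutter figures mentioned in this thread, it can help to convert exposure time to shutter angle, which is 360 degrees times frame rate times exposure time. A small sketch of that conversion follows; `shutter_angle` is a made-up helper, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def shutter_angle(fps, exposure_s):
    """Shutter angle in degrees for a given frame rate and exposure time."""
    return 360 * fps * exposure_s


cases = [
    (24, Fraction(1, 48)),  # conventional 24p
    (48, Fraction(1, 96)),  # 48p with the same angle as conventional 24p
    (48, Fraction(1, 64)),  # the corrected figure above
]
for fps, exposure in cases:
    print(f"{fps} fps at {exposure} s -> {shutter_angle(fps, exposure)} degrees")

# Arithmetic mean of the 1/48 and 1/96 exposure times.
mean_exposure = (Fraction(1, 48) + Fraction(1, 96)) / 2
print(f"mean of 1/48 and 1/96: {mean_exposure} s")
```

The arithmetic mean of 1/48 and 1/96 is itself 1/64, so the "average of the two" description fits the corrected figure.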
Old 18th March 2014, 19:02   #11  |  Link
Hyral
Digital video researcher
 
Join Date: Mar 2012
Location: Brazil
Posts: 11
Hello, I am developing my Master's dissertation on video quality for adaptive systems. I am very interested in exactly how the ^3/4 principle was developed in the first place and might be able to update it to H.264/x264 at 1080p. I understand this refers to subjective quality. Was this calibration done by a single individual's subjective perception, or were MOS tests conducted, or even a combination of these with objective metrics such as SSIM?
Tags
x**3/4, x264 bitrate