lithiumdeuteride
31st August 2005, 02:54
I conducted a rather rigorous test using the Six of Nine matrix to find the saturation points (both high and low) for various resolutions with a high-motion scene and a low-motion scene.
The scenes I used were from the movie Sin City, an R1, 16:9 widescreen DVD. The high-motion scene begins (roughly 14.5 minutes in) with two police cars launching over a hill on their way to arrest Marv, and ends 113.113 seconds later, just before it cuts to a woman sleeping in bed. The low-motion scene begins exactly where the high-motion one ends, starting with the woman sleeping, and ends 116.742 seconds later, just before it cuts to an aerial shot of the city.
The audio, while not used in the testing, was extracted with the version of DGIndex found in the GordianKnot 0.35 package. GordianKnot 0.35 was used to crop the movie so that no black bars or fuzziness remained around the edges, then to create four AviSynth frameserver scripts, each at a different resolution. I'll discuss the resolutions later.
VirtualDubMod 1.5.10.1 was used to set up the encoding.
I used XviD 1.0.3 and the following settings:
Profile @ Level = AS @ L5
In the first "more" dialog box:
Profile tab:
Profile @ Level = AS @ L5
Quantization type = MPEG-Custom, with the Six of Nine matrix
Adaptive Quantization = unchecked
Interlaced Encoding = unchecked
Quarter Pixel = checked
Global Motion Compensation = checked
B-VOPs = checked
Max consecutive BVOPs = 2
Quantizer ratio = 1.50
Quantizer offset = 1.00
Packed bitstream = checked
Closed GOV = checked
Level tab: no changes
Aspect Ratio tab:
Pixel Aspect Ratio = Square
In the second "more" dialog box:
the fields, from top to bottom, read 10, 1, 20, 5, 5, 5, 0, 0
In the "Zone Options" dialog box:
Start frame # = 999999
Weight = 1
all boxes unchecked
BVOP sensitivity = 0
In the "Advanced Options" dialog box:
Motion tab:
Motion search precision = 6 - Ultra High
VHQ mode = 1 - Mode Decision
Use chroma motion = checked
Turbo ;-) = checked
Frame drop ratio = 0
Maximum I-frame interval = 240
Cartoon Mode = unchecked
Quantization tab:
the fields, from top to bottom, read 2, 3, 2, 5, 2, 6
Trellis quantization = checked
Debug tab:
Automatically detect optimizations = selected
FourCC used = XVID
Whew. With that out of the way, here's the point of my experiment. Everyone hates it when a file comes out oversized, and if you're like me, you hate it when one comes out undersized, too.
Each of the four AviSynth files I used was at a different resolution, chosen to show the behavior of XviD's high and low saturation bitrates as a function of resolution and scene motion. The four resolutions I used, and their respective total pixel counts, were:
432 x 240 = 103,680
528 x 288 = 152,064
608 x 336 = 204,288
688 x 368 = 253,184.
This gives me four data points over a range of commonly-used resolutions. I believe that, except in extreme cases, the XviD codec cares only about the motion of the scene and the resolution, and doesn't care about the aspect ratio. For example, 640 x 480 is the same number of pixels as 1280 x 240, and should yield similar results. I haven't tested this hypothesis, though.
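As a quick check of the arithmetic above, here is a small Python sketch that recomputes the pixel totals and the 640 x 480 versus 1280 x 240 comparison. The variable names are just for illustration.

```python
# Resolutions used in the test, as (width, height) pairs.
resolutions = [(432, 240), (528, 288), (608, 336), (688, 368)]

# Total pixels per frame for each resolution.
pixel_counts = [w * h for w, h in resolutions]

for (w, h), pixels in zip(resolutions, pixel_counts):
    print(f"{w} x {h} = {pixels:,} pixels (aspect ratio {w / h:.2f})")

# Same pixel count, very different aspect ratios.
same_pixels = (640 * 480) == (1280 * 240)
print("640 x 480 and 1280 x 240 have equal pixel counts:", same_pixels)
```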
The data I collected were the saturation points, that is, the bitrates the codec refused to go beyond with the quantization settings and matrix I used. I think my quantizers are reasonable for most movies, because if you give the codec too much freedom with quantizer range, the video quality will decline when it decides to use large quantizer values. Anyway, I had the codec target a bitrate of 200 kbps, which was too low for it to reach, and then 5000 kbps, which was too high for it to reach. I recorded the resulting average saturation bitrates, based on the size of the output file, and plotted them against the resolution that produced each one.
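For anyone who wants to repeat the measurement, the average bitrate can be worked out from the output file size and the scene length. This is a minimal sketch, assuming 1 kbps means 1000 bits per second and ignoring any AVI container overhead; the function name and the example file size are made up for illustration.

```python
def average_bitrate_kbps(file_size_bytes, duration_seconds):
    """Average bitrate in kbps, assuming 1 kbps = 1000 bits per second."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return file_size_bytes * 8 / duration_seconds / 1000

# Scene lengths from the test, in seconds.
HIGH_MOTION_SECONDS = 113.113
LOW_MOTION_SECONDS = 116.742

# Hypothetical output file size, chosen only to illustrate the calculation.
example_size_bytes = 10_000_000
print(round(average_bitrate_kbps(example_size_bytes, HIGH_MOTION_SECONDS)), "kbps")
```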
For the high-motion scene, the data was as follows:
Resolution (pixels)   Saturation Low (kbps)   Saturation High (kbps)
103680                722                     1846
152064                980                     2547
204288                1222                    3220
253184                1430                    3973
For the low-motion scene, the data was as follows:
Resolution (pixels)   Saturation Low (kbps)   Saturation High (kbps)
103680                486                     1289
152064                641                     1722
204288                784                     2131
253184                903                     2482
I used MS Excel to plot these points and discovered a rather linear correlation between resolution and saturation bitrates. You can get the Excel file here:
http://www.techwarereview.com/non-website/saturation.xls
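If you don't have Excel handy, the linearity can be checked with an ordinary least-squares fit on the numbers from the tables. This is only a sketch, assuming a plain straight-line fit is a fair stand-in for an Excel trendline; it is not a copy of what the spreadsheet does.

```python
# Saturation bitrates (kbps) from the tables above, one value per resolution.
pixels = [103680, 152064, 204288, 253184]
series = {
    "high-motion low": [722, 980, 1222, 1430],
    "high-motion high": [1846, 2547, 3220, 3973],
    "low-motion low": [486, 641, 784, 903],
    "low-motion high": [1289, 1722, 2131, 2482],
}

def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

fits = {name: linear_fit(pixels, ys) for name, ys in series.items()}
for name, (slope, intercept, r2) in fits.items():
    print(f"{name}: kbps = {slope:.5f} * pixels + {intercept:.1f}  (R^2 = {r2:.4f})")
```

An R^2 close to 1 for each series is what "rather linear" looks like in numbers.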
As you can see in that file, I found the bitrate/resolution pairs that cause high-motion scenes to saturate low (oversized result) and those that cause low-motion scenes to saturate high (undersized result). I averaged the two to find a nice, middle-of-the-road bitrate/resolution pair that has the greatest chance of coming out the proper size.
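The averaging step can be sketched like this, assuming "averaged" means the plain arithmetic mean of the high-motion low saturation and the low-motion high saturation at each resolution. The list names are only for illustration.

```python
pixels = [103680, 152064, 204288, 253184]

# kbps; at or below these bitrates the high-motion scene came out oversized.
high_motion_low = [722, 980, 1222, 1430]
# kbps; at or above these bitrates the low-motion scene came out undersized.
low_motion_high = [1289, 1722, 2131, 2482]

# Middle-of-the-road bitrate for each resolution.
midpoints = [(low + high) / 2 for low, high in zip(high_motion_low, low_motion_high)]

for p, low, high, mid in zip(pixels, high_motion_low, low_motion_high, midpoints):
    print(f"{p} pixels: safe range {low}-{high} kbps, midpoint {mid} kbps")
```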
The equation that describes these low-risk bitrate/resolution pairs is
R = 9.7 * B^1.34
where R is the total resolution in pixels, B is the bitrate in kbps, * means multiplication, and ^ means an exponent. Remember, though, that this equation is only valid for the Six of Nine matrix, the quantization numbers 2, 3, 2, 5, 2, 6, and Quantizer ratio and offset of 1.50 and 1.00, respectively. If you decide to switch to a matrix with higher values (more compression), you should increase your resolution at any given bitrate, or you may run the risk of undersized files due to the more heavily-compressed frames.
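Here is the equation as a short Python sketch, together with its inverse for going from a resolution back to a bitrate. The function names are made up for illustration, and the same caveat applies: the constants only hold for the matrix and quantizer settings listed above.

```python
def resolution_for_bitrate(bitrate_kbps):
    """Low-risk total pixel count R for bitrate B, from R = 9.7 * B^1.34."""
    return 9.7 * bitrate_kbps ** 1.34

def bitrate_for_resolution(pixels):
    """Inverse of the equation: B = (R / 9.7)^(1 / 1.34)."""
    return (pixels / 9.7) ** (1 / 1.34)

# Bitrate -> suggested total pixels.
for bitrate in (700, 1000, 1500, 2000):
    print(f"{bitrate} kbps -> about {resolution_for_bitrate(bitrate):,.0f} pixels")

# The four tested resolutions -> suggested bitrate.
for w, h in ((432, 240), (528, 288), (608, 336), (688, 368)):
    print(f"{w} x {h} -> about {bitrate_for_resolution(w * h):.0f} kbps")
```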
This makes XviD a bit more involved than encoding with DivX 5. You need to first find your bitrate, then go back and adjust resolution accordingly to minimize the risk of a poorly-sized file.
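The two-step workflow might look something like the sketch below. It assumes a target size in MB of 1024 x 1024 bytes, no container overhead, and frame dimensions rounded to multiples of 16; none of those details come from my test, and the function names and the example job are hypothetical.

```python
def video_bitrate_kbps(target_size_mb, duration_seconds, audio_kbps):
    """Video bitrate that fits the target size (assumes no container overhead)."""
    total_kbps = target_size_mb * 1024 * 1024 * 8 / duration_seconds / 1000
    return total_kbps - audio_kbps

def round_to_multiple(value, multiple):
    """Round to the nearest multiple, never going below one multiple."""
    return max(multiple, int(round(value / multiple)) * multiple)

def suggested_dimensions(bitrate_kbps, aspect_ratio, multiple=16):
    """Frame size near R = 9.7 * B^1.34 pixels at the given aspect ratio."""
    pixels = 9.7 * bitrate_kbps ** 1.34
    height = (pixels / aspect_ratio) ** 0.5
    width = height * aspect_ratio
    return round_to_multiple(width, multiple), round_to_multiple(height, multiple)

# Hypothetical job: 1400 MB target, 2-hour film, 128 kbps audio, 1.85:1 after cropping.
bitrate = video_bitrate_kbps(1400, 2 * 60 * 60, 128)
width, height = suggested_dimensions(bitrate, 1.85)
print(f"video bitrate about {bitrate:.0f} kbps, try {width} x {height}")
```

Treat the result as a starting point and nudge the resolution to whatever your cropping actually allows.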