View Full Version : XviD Saturation Points with Six of Nine


lithiumdeuteride
31st August 2005, 02:54
I conducted a rather rigorous test using the 6 of 9 matrix to find the saturation points (both high and low) for various resolutions with a high-motion scene and a low-motion scene.

The scenes I used were from the movie Sin City, an R1, 16:9 widescreen DVD. The high-motion scene begins (roughly 14.5 minutes in) with two police cars launching over a hill, going to arrest Marv, and ends 113.113 seconds later just before it cuts to a woman sleeping in bed. The low-motion scene begins exactly where the high-motion one ends, starting with the woman sleeping, and ends 116.742 seconds later just before it cuts to an aerial shot of the city.

The audio, while not used in the testing, was extracted with the version of DGIndex found in the GordianKnot 0.35 package. GordianKnot 0.35 was used to crop the movie so no black bars or fuzziness around the edges remained, then to create four AviSynth frameservers of different resolutions. I'll discuss the resolutions later.

VirtualDubMod 1.5.10.1 was used to set up the encoding.

I used XviD 1.0.3 and the following settings:

Profile @ Level = AS @ L5

In the first "more" dialog box:

Profile tab:

Profile @ Level = AS @ L5
Quantization type = MPEG-Custom, with the Six of Nine matrix
Adaptive Quantization = unchecked
Interlaced Encoding = unchecked
Quarter Pixel = checked
Global Motion Compensation = checked
B-VOPs = checked
Max consecutive BVOPs = 2
Quantizer ratio = 1.50
Quantizer offset = 1.00
Packed bitstream = checked
Closed GOV = checked

Level tab: no changes

Aspect Ratio tab:

Pixel Aspect Ratio = Square

In the second "more" dialog box:

the fields, from top to bottom, read 10, 1, 20, 5, 5, 5, 0, 0

In the "Zone Options" dialog box:

Start frame # = 999999
Weight = 1
all boxes unchecked
BVOP sensitivity = 0

In the "Advanced Options" dialog box:

Motion tab:

Motion search precision = 6 - Ultra High
VHQ mode = 1 - Mode Decision
Use chroma motion = checked
Turbo ;-) = checked
Frame drop ratio = 0
Maximum I-frame interval = 240
Cartoon Mode = unchecked

Quantization tab:

the fields, from top to bottom, read 2, 3, 2, 5, 2, 6
Trellis quantization = checked

Debug tab:

Automatically detect optimizations = selected
FourCC used = XVID

Whew. With that out of the way, here's the point of my experiment. Everyone hates it when a file comes out oversized, and if you're like me, you hate it when it comes out undersized, too.

Each of the four AviSynth files I used was at a different resolution, chosen to show how XviD's high and low saturation bitrates behave as a function of resolution and scene motion. The four resolutions I used, and their total pixel counts, were:

432 x 240 = 103,680
528 x 288 = 152,064
608 x 336 = 204,288
688 x 368 = 253,184.

This gives me four data points over a range of commonly-used resolutions. I believe that, except in extreme cases, the XviD codec cares only about the motion of the scene and the resolution, and doesn't care about the aspect ratio. For example, 640 x 480 is the same number of pixels as 1280 x 240, and should yield similar results. I haven't tested this hypothesis, though.

The data I collected were the saturation points, that is, the bitrates the codec refused to go beyond with the quantization settings and matrix I used. I think my quantizers are reasonable for most movies, because if you give it too much freedom with quantizer range, the video quality will decline when it decides to use large quantizer values. Anyway, I had the codec target a bitrate of 200 kbps, which it could not possibly attain, and 5000 kbps, which it could also not possibly attain. I recorded the resulting average saturation bitrates, based on the size of the output file, and plotted them against the resolution that produced that saturation bitrate.
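The average bitrate of a finished encode follows directly from the output file size and the clip length. A minimal sketch, assuming 1 kbps = 1000 bits per second and ignoring AVI container overhead; the function name and the sample file size are made up for illustration and are not figures from the test:

```python
def average_bitrate_kbps(file_size_bytes: int, duration_s: float) -> float:
    """Average bitrate in kbps, assuming 1 kbps = 1000 bit/s.

    Container overhead is ignored, so this slightly overstates
    the bitrate of the video stream itself.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return file_size_bytes * 8 / duration_s / 1000.0


# Clip length of the high-motion scene, as given in the post.
HIGH_MOTION_SECONDS = 113.113

# Hypothetical output size, for illustration only.
example_kbps = average_bitrate_kbps(26_000_000, HIGH_MOTION_SECONDS)
print(f"{example_kbps:.0f} kbps")
```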

For the high-motion scene, the data was as follows:

Resolution (pixels)   Saturation Low (kbps)   Saturation High (kbps)
103680                 722                    1846
152064                 980                    2547
204288                1222                    3220
253184                1430                    3973

For the low-motion scene, the data was as follows:

Resolution (pixels)   Saturation Low (kbps)   Saturation High (kbps)
103680                 486                    1289
152064                 641                    1722
204288                 784                    2131
253184                 903                    2482

I used MS Excel to plot these points and discovered a rather linear correlation between resolution and saturation bitrates. You can get the Excel file here:

http://www.techwarereview.com/non-website/saturation.xls

As you can see in that file, I found the bitrate/resolution pairs that cause high-motion scenes to become saturated low (oversized result) and those that caused low-motion scenes to become saturated high (undersized result). I averaged these together to find a nice, middle-of-the-road bitrate/resolution pair that has the greatest chance of coming out the proper size.
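The averaging step can be sketched as follows. This assumes the two bitrates were averaged at each fixed resolution (the post doesn't spell out whether it was done that way or the other way around); the variable and function names are illustrative:

```python
# Data copied from the two tables above.
RESOLUTIONS = [103_680, 152_064, 204_288, 253_184]   # pixels
HIGH_MOTION_SAT_LOW = [722, 980, 1222, 1430]         # kbps
LOW_MOTION_SAT_HIGH = [1289, 1722, 2131, 2482]       # kbps


def midpoint_bitrates(sat_low, sat_high):
    """Average the two saturation bitrates at each resolution."""
    return [(lo + hi) / 2 for lo, hi in zip(sat_low, sat_high)]


MIDPOINTS = midpoint_bitrates(HIGH_MOTION_SAT_LOW, LOW_MOTION_SAT_HIGH)

for pixels, kbps in zip(RESOLUTIONS, MIDPOINTS):
    print(f"{pixels:>7} pixels -> {kbps:.1f} kbps")
```

Below the midpoint, a high-motion scene is closer to hitting its floor (oversized file); above it, a low-motion scene is closer to hitting its ceiling (undersized file).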

The equation that describes these low-risk bitrate/resolution pairs is

R = 9.7 * B^1.34

where R is the total resolution in pixels, B is the bitrate in kbps, * means multiplication, and ^ means an exponent. Remember, though, that this equation is only valid for the Six of Nine matrix, the quantization numbers 2, 3, 2, 5, 2, 6, and Quantizer ratio and offset of 1.50 and 1.00, respectively. If you decide to switch to a matrix with higher values (more compression), you should increase your resolution at any given bitrate, or you may run the risk of undersized files due to the more heavily-compressed frames.
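To show how the equation might be used in practice, here is a small sketch that turns a target bitrate into a pixel count and then into a frame size. The helper names are invented for this example, and snapping width and height to multiples of 16 is an assumption (all four test resolutions happen to be multiples of 16). It only applies to the settings listed above:

```python
import math

COEFF = 9.7      # from R = 9.7 * B^1.34
EXPONENT = 1.34


def safe_resolution_pixels(bitrate_kbps: float) -> float:
    """Total pixels R suggested by the formula for bitrate B (kbps)."""
    return COEFF * bitrate_kbps ** EXPONENT


def safe_bitrate_kbps(pixels: float) -> float:
    """Inverse of the formula: bitrate B (kbps) for a pixel count R."""
    return (pixels / COEFF) ** (1 / EXPONENT)


def suggest_frame_size(bitrate_kbps: float, aspect_ratio: float,
                       multiple: int = 16) -> tuple[int, int]:
    """Pick width x height near the suggested pixel count.

    aspect_ratio is width / height of the cropped picture.
    Both dimensions are rounded to the nearest `multiple` (assumed 16).
    """
    pixels = safe_resolution_pixels(bitrate_kbps)
    height = math.sqrt(pixels / aspect_ratio)
    width = height * aspect_ratio

    def snap(value: float) -> int:
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


width, height = suggest_frame_size(1000, 16 / 9)
print(width, height, width * height)
```

Because of the rounding, the resulting pixel count will only approximate the formula's value.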

This makes XviD a bit more involved than encoding with DivX 5. You need to first find your bitrate, then go back and adjust resolution accordingly to minimize the risk of a poorly-sized file.

stephanV
31st August 2005, 10:52
I don't understand... what did you do to determine the low and high saturation values? Did you do a constant-quantizer encode? or...?

heh??? /me is confused...

also you say this:

I used MS Excel to plot these points and discovered a rather linear correlation between resolution and saturation bitrates.
Yet you come up with R = 9.7 * B^1.34
That's not linear. The average of straight lines is still a straight line...

lithiumdeuteride
31st August 2005, 17:03
I must add that these are all 2-pass encodes. To determine the low saturation bitrate, I encoded the scene at lower and lower TARGET bitrates until the final product did not decrease in size. I realized I could simply target an impossibly low bitrate, and the codec would select the maximum quantizer values trying to reach that target. Likewise, I could target an impossibly high bitrate, and the codec would select the lowest quantizer values and still not reach the goal.
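The saturation behaviour described here can be pictured as a simple clamp. This is a deliberately simplified model, not how XviD's rate control is actually implemented: it just assumes the achieved average bitrate follows the target until it hits the floor or ceiling set by the quantizer limits. The function name is illustrative:

```python
def achieved_bitrate_kbps(target_kbps: float,
                          sat_low_kbps: float,
                          sat_high_kbps: float) -> float:
    """Simplified model: the target is clamped to the saturation range.

    sat_low_kbps  = bitrate when every frame uses the maximum quantizer
    sat_high_kbps = bitrate when every frame uses the minimum quantizer
    """
    if sat_low_kbps > sat_high_kbps:
        raise ValueError("sat_low_kbps must not exceed sat_high_kbps")
    return min(max(target_kbps, sat_low_kbps), sat_high_kbps)


# High-motion scene at 432 x 240, values from the first post.
LOW, HIGH = 722, 1846

for target in (200, 1000, 5000):
    print(target, "->", achieved_bitrate_kbps(target, LOW, HIGH))
```

Targets of 200 and 5000 kbps fall outside the range, which is why they expose the two saturation points.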

You can, of course, give the quantizers (max and min) greater freedom. This will make it less likely that you'll end up with an oversized or undersized product, but will degrade the video quality. If you make sure all I-frames have quantizer of 2 or 3, the quality is very high, without being wasteful. Likewise, in-between 2 and 5 for P-frames, and in-between 2 and 6 (effectively 4 and 9) for B-frames.

The goal of my experiment was to assume a set of quantizers and a certain matrix will be used, and to find out how to minimize the chance of a poorly-sized result. The way to do this is to select the proper resolution once you know the target bitrate.

As for the linear thing, you're quite right. All the lines were pretty linear, but when I finally plotted the averages of "high-motion saturation low" and "low-motion saturation high", the R^2 value (a measure of the accuracy of the fit) was closer to 1 (a good thing) when I used an exponential fit. Also, the linear fit did not have a y-intercept of zero, which seemed slightly wrong to me.
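The comparison above can be sketched in plain Python. This is a rough reconstruction, not the Excel workbook itself: it assumes the averaged bitrates are the midpoints of the two columns at each resolution, that the curved fit has the form R = a * B^k, and that its R^2 is computed on the log-transformed values. The function names are illustrative:

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination for observed vs predicted values."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Midpoints of "high-motion saturation low" and "low-motion saturation high".
B = [1005.5, 1351.0, 1676.5, 1956.0]          # kbps
R = [103_680, 152_064, 204_288, 253_184]      # pixels

# Straight line: R = slope * B + intercept
slope, intercept = linear_fit(B, R)
lin_r2 = r_squared(R, [slope * b + intercept for b in B])

# R = a * B^k, fitted as a straight line in log-log space.
log_b = [math.log(b) for b in B]
log_r = [math.log(r) for r in R]
k, log_a = linear_fit(log_b, log_r)
a = math.exp(log_a)
pow_r2 = r_squared(log_r, [log_a + k * x for x in log_b])

print(f"linear: R = {slope:.2f} * B + {intercept:.0f}, R^2 = {lin_r2:.4f}")
print(f"curved: R = {a:.2f} * B^{k:.3f}, R^2 = {pow_r2:.4f}")
```

With these inputs the straight line has a negative y-intercept, and the fitted a and k should land close to the 9.7 and 1.34 quoted earlier. Four points is very little data, so both R^2 values will look good almost regardless of the model.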

stephanV
31st August 2005, 17:25
I must add that these are all 2-pass encodes. To determine the low saturation bitrate, I encoded the scene at lower and lower TARGET bitrates until the final product did not decrease in size. I realized I could simply target an impossibly low bitrate, and the codec would select the maximum quantizer values trying to reach that target. Likewise, I could target an impossibly high bitrate, and the codec would select the lowest quantizer values and still not reach the goal.
Then I must say your maximum bit rate at high motion is quite low... I've seen h263 do more than 4000 kbps on q4 at your kind of resolutions. 6of9 is a high bit rate matrix, right? Or am I mistaken?

As for the linear thing, you're quite right. All the lines were pretty linear, but when I finally plotted the averages of "high-motion saturation low" and "low-motion saturation high", the R^2 value (a measure of the accuracy of the fit) was closer to 1 (a good thing) when I used an exponential fit. Also, the linear fit did not have a y-intercept of zero, which seemed slightly wrong to me.
Your fit is not exponential, it's a power fit, and yes, you can make Excel always draw a nice line between 4 points with different x coordinates, but that doesn't always make it meaningful. You can force it to go through 0 too BTW. :)

When stuff fits too nicely, I always become suspicious.

lithiumdeuteride
31st August 2005, 17:42
The maximum bitrate is what it is. It is not low for the settings I used, because the settings I used are what generated this maximum bitrate. I don't know what settings h263 uses, but I'm willing to bet it's significantly different than the Six of Nine matrix, and quantizer restrictions of 2, 3, 2, 5, 2, 6.

My fit is a power, as you said. My mistake.

The power fit I used so closely matched the data that the equation it generated must be at least somewhat meaningful for these specific XviD settings.

Forcing a y-intercept of 0 for a linear fit destroys the fit if it would not naturally pass through the origin.

I assure you I did not fudge any of the data. I was looking for bitrate/resolution pairs that are as "safe" as possible. The power fit I used is not exact, of course, but offers a pretty good target for most movies. There is no reason to be suspicious.

stephanV
31st August 2005, 18:09
The power fit I used so closely matched the data that the equation it generated must be at least somewhat meaningful for these specific XviD settings.

For those settings, or for the 2 sources you used?


Forcing a y-intercept of 0 for a linear fit destroys the fit if it would not naturally pass through the origin.
But you said that it feels unnatural that any fit wouldn't go through 0. So either you don't have enough data, the fit doesn't go through 0, or the relation is not linear. But I think right now it's impossible to say which one is true. Again, you can let Excel make a nice fit between almost any 4 random points.

I assure you I did not fudge any of the data.
I was never questioning your data/measurements. Sorry if I made it seem that way. I'm not trying to be offensive here. What I was questioning is the interpretation of the data, the amount of data on which it is based and moreover this comment:
The power fit I used is not exact, of course, but offers a pretty good target for most movies.
Is 1 low-motion source and 1 high-motion source really enough for that?

Teegedeck
1st September 2005, 08:32
lithiumdeuteride, that's quite an effort for a first post! :)

Though I must admit that I am not mathematically gifted and fail a little to see the relevance of the formula. I think I understand that it comes down to making it easier to guess a fitting resolution given a fixed quantizer range, right?

Maybe that could be used in an automated encoding app for more inspired guessing at a first resolution than simple bits/pixel/fps ratio.

manono
1st September 2005, 10:56
Hi-

Maybe that could be used in an automated encoding app for more inspired guessing at a first resolution than simple bits/pixel/fps ratio.

There's already something for that in GKnot. It's called the Compression Test. He could have saved himself a lot of time and trouble by taking 10 minutes to run the test. All that data he gathered works only for Sin City anyway, and can in no way be extrapolated for use with other movies. And since Sin City compresses like a sonuvabitch, it's a very atypical example he chose.

Teegedeck
1st September 2005, 14:39
I know that. But you have to choose a resolution for the compressibility check (pre-1st-pass) somehow, too, and I guess this is currently done by bits/pixel/fps ratio.

manono
1st September 2005, 16:05
Sure, I know you know, Teege. I was writing that mostly for lithiumdeuteride. He uses GKnot for at least some of his work, but either is unaware of the Compress Test, or has never tried it out. If you happen to choose the wrong res for the test, you either run another one to check a different res, or kind of extrapolate from the results of the first test to choose the proper res. Me, I don't run them anymore, since if you've seen as many DVDs as I have, you can look at one and judge its compressibility, and the proper resolution. I get fooled sometimes, but not too often these days.

lithiumdeuteride
1st September 2005, 22:54
manono - Perhaps you could enlighten me as to the best settings for XviD, and provide your own formula for resolution/bitrate pairs.

About this compress test - Does it just scan the video and produce a general "compressibility factor", or does it use the XviD codec and make a reasonably accurate guess as to how far it can be compressed?

stephanV - No, it isn't really enough data to be all that accurate. But it's better than no data at all, and it has helped my files to come out closer to the target size (though sometimes they're still wrong). They're accurate much more often than they were before I did the testing, though.

manono
2nd September 2005, 02:59
Hi-

About this compress test - Does it just scan the video and produce a general "compressibility factor", or does it use the XviD codec and make a reasonably accurate guess as to how far it can be compressed?

If I have to choose one of those two, I guess it's the second. You set it up as if you were going to run the first pass in VDubMod, with codec, res, file size, audio bitrate or size, all XviD settings including matrix, B-Frame settings, etc. It scans the video, at the default 5% encoding 14 of every 280 frames throughout the movie, tosses out the first and last frames of the 14, and returns a percentage.

It's usually said that you're looking for results in the 60-80% range. However, in general, the more film grain and noise there is, the more you can let that percentage drift towards the lower end, as grain/noise can hide artifacts. For old classic films, with lots of film grain, I find I can often get away with results in the 50% range.

The compressibility of a movie varies depending on a number of factors, such as grain/noise vs clean video, light/bright vs dark, or large percentage of complex scenes (handheld cameras, moving cameras, action scenes, crowd scenes, or just leaves moving on trees; in general, if the whole screen is in movement it's a complex scene) vs largely static scenes. Sin City compresses so well because it's very dark, and it's extraordinarily clean, with every frame having been run through a computer.

If the percentage returned isn't to your liking, you adjust the file size, lower/raise the res, lower/raise the audio bitrate, add/remove filters, change the matrix, etc. Then to be sure, you run another Compress Test hoping to get better results.
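The frame sampling described above can be sketched like this. It is only an illustration of the "14 of every 280, drop the first and last" pattern, not GKnot's actual code; in particular, taking the 14 frames from the start of each 280-frame block is an assumption, and the function name is made up:

```python
def compress_test_sample(total_frames: int, period: int = 280, run: int = 14):
    """Return (encoded, scored) lists of frame numbers.

    encoded: `run` consecutive frames out of every `period` frames
             (14 / 280 = 5% of the movie).
    scored:  the same frames minus the first and last of each run.
    """
    encoded, scored = [], []
    for start in range(0, total_frames, period):
        chunk = list(range(start, min(start + run, total_frames)))
        encoded.extend(chunk)
        if len(chunk) > 2:
            scored.extend(chunk[1:-1])
    return encoded, scored


encoded, scored = compress_test_sample(2800)
print(len(encoded), "frames encoded,", len(scored), "frames scored")
```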

I don't usually use the high bitrate matrices, but the higher the bitrate matrix you're using, I think the lower you want to let those percentages go. So, when using the sixofnine, I think you might be aiming for 40% or so. Someone correct me if I'm wrong about that.

Jonny's Enc (http://forum.doom9.org/showthread.php?s=&threadid=44414) is another and perhaps even better way to test the compressibility of a movie. For example, if you aim for a quant 3 average when using the MPEG matrix, but a quant 5 average when using sixofnine, you set it up to give results for quant 5 when using the sixofnine, and it'll let you know how far above or below an average quant 5 you'll be for the settings you've chosen.

lithiumdeuteride
7th September 2005, 23:54
manono - Thanks for the information. I'll use the compress test the next time I encode something.