Old 7th October 2006, 05:16   #1  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
CRF translated to DABR graph and discussion

I doubt this is anything new, but I'll flesh it out a little and see if anybody is interested in the topic. If this has been covered before please provide a link or two. I should warn that this stuff is all very tentative at the moment and needs much more testing. It is primarily just my old CCE stuff translated to x264 crf mode.

I spent some time today translating a span of CRF 2% sample encodes into derived bitrate numbers (DABR). Much to my pleasure, the results form a near-perfect curve, and there also seems to be a strong possibility of using the Newton-Raphson convergence methods that are used to project CCE Q numbers into a predicted filesize.

Perhaps more importantly, it may provide a way to build branching logic into a program so that the quality of a size-constrained encode can be predicted before the actual encode is committed to.

It also provides a replicable way to measure, quantify and predict the effect of various filters, matrices, etc. on final filesize.

Further, in my early tests a 1% sample seems to be as accurate as a 5% sample; the DABR figures from the 1%, 2% and 5% samples were very close to each other.

There is a bitrate calculator spreadsheet attached to the post in my sig if anybody wants to replicate the numbers and method used to create the graph. There may be some more information there that is translatable to x264; frankly, I am just too new to x264 to know yet. Also, I have attached the spreadsheet data to the data post below if somebody wants to regraph it differently.



The cmdline used was:

Code:
C:\apps\x264.exe --crf 19 --ref 3 --mixed-refs --bframes 3 --b-pyramid --b-rdo
 --bime --weightb --filter -2,-1 --subme 6 --trellis 1 --analyse all --8x8dct
 --vbv-maxrate 25000 --me umh --threads 2 --thread-input --progress --no-psnr
 --output "D:\temp\auto3\movienohrQ(xx).mp4" "d:\temp\auto3\movie.avs"
The 2% range line was: SelectRangeEvery(600,12)

DABR conversion was done with - (((Sample_Size_In_Bytes*(100/Sample_Size_Percentage))*8)/1000)/(Total_Frames/Frame_Rate)
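For anyone who wants to plug numbers in, here is the same conversion as a quick Perl sketch (Perl just to match the log-parsing script posted later in the thread). The figures are the CRF19 row from the data post below; the 141,241-frame count comes up later in the thread, and the 23.976 fps is an assumption that makes the numbers line up:

Code:
#!/usr/bin/perl
# DABR sketch using the CRF19 row from the data post below.
$sample_bytes = 17_846_928;   # size of the 2% sample encode, in bytes
$sample_pct   = 2;            # SelectRangeEvery(600,12) keeps 12/600 = 2%
$total_frames = 141_241;      # frames in the full source
$fps          = 23.976;       # assumed frame rate

$projected_bytes = $sample_bytes * (100 / $sample_pct);               # ~892,346,400
$dabr_kbps = ($projected_bytes * 8 / 1000) / ($total_frames / $fps);  # ~1212

printf "projected size: %.0f bytes, DABR: %.0f kbps\n", $projected_bytes, $dabr_kbps;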



My plan is to set several encodes going tonight and see if the actual finished filesize ends up as predicted.



Last edited by DDogg; 8th October 2006 at 21:38.
Old 7th October 2006, 08:46   #2  |  Link
foxyshadis
Angel of Night
 
 
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,559
Interestingly, although AVC is theoretically supposed to double in q steps of 6, this roughly doubles in q steps of 5, nearing 6 as it rises. I wonder if that has to do with crf, the options used, or just x264's implementation.
Old 7th October 2006, 08:55   #3  |  Link
akupenguin
x264 developer
 
 
Join Date: Sep 2004
Posts: 2,392
x264's bitrate predictor for 2pass assumes the step is 5.5, based on a similar measurement a long time ago.
My hypothesis is: the threshold for the magnitude of information to keep does double exactly every 6 qps. But the information in the source is not quite evenly (well, Laplacianly) distributed over the range of magnitudes. If there's a little more information at the noise end of the spectrum, which gets lost completely at high qp, then that steepens the change of bitrate as a function of qp.
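As a rough cross-check of that step size against the numbers in this thread, here is a small Perl sketch that takes a few values from the DABR column in the data post below and reports the implied "steps per doubling" between CRF values five apart (CRF steps stand in for QP steps here, as in the graph being discussed). The DABR figures are copied from that table, nothing is re-measured:

Code:
#!/usr/bin/perl
# Implied doubling step from the DABR column (kbps) at CRF 15, 20, 25, 30.
my @crf  = (15, 20, 25, 30);
my @dabr = (2237, 1050, 544, 298);
for my $i (0 .. $#crf - 1) {
    my $ratio = $dabr[$i] / $dabr[$i+1];
    my $step  = ($crf[$i+1] - $crf[$i]) / (log($ratio) / log(2));  # steps per 2x bitrate
    printf "CRF %d -> %d: ratio %.2f, doubling step %.2f\n", $crf[$i], $crf[$i+1], $ratio, $step;
}

It comes out at roughly 4.6, 5.3 and 5.8 steps going up the CRF range, which matches the "roughly 5, nearing 6 as it rises" observation above.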

Last edited by akupenguin; 7th October 2006 at 08:59.
Old 7th October 2006, 16:49   #4  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
All data below and attached in excel spreadsheet.

Code:
CRF	  2% sample size (bytes)	DABR (kbps)	Projected filesize (bytes)	Actual filesize (bytes)	Actual/Projected	Diff %

CRF15.mp4	32,951,805		2237		1,647,590,250		1,574,893,880	0.9559	 4.41%
CRF16.mp4	28,046,445		1904		1,402,322,250		1,327,162,625	0.9464	 5.36%
CRF17.mp4	24,013,257		1631		1,200,662,850		1,126,649,226	0.9384	 6.16%
CRF18.mp4	20,662,192		1403		1,033,109,600		  958,217,194	0.9275	 7.25%
CRF19.mp4	17,846,928		1212		  892,346,400		  819,491,843	0.9184	 8.16%
CRF20.mp4	15,469,190		1050		  773,459,500		  702,993,061	0.9089	 9.11%
CRF21.mp4	13,472,796		 915		  673,639,800		  605,788,784	0.8993	10.07%
CRF22.mp4	11,803,302		 801		  590,165,100		  524,643,249	0.8890	11.10%
CRF23.mp4	10,338,978		 702		  516,948,900		  456,028,249	0.8822	11.78%
CRF24.mp4	 9,109,543		 619		  455,477,150		  396,922,397	0.8714	12.86%
CRF25.mp4	 8,012,111		 544		  400,605,550		  346,629,200	0.8653	13.47%
CRF26.mp4	 7,091,370		 482		  354,568,500		  303,524,399	0.8560	14.40%
CRF27.mp4	 6,263,265		 425		  313,163,250		  266,540,478	0.8511	14.89%
CRF28.mp4	 5,557,481		 377		  277,874,050		  234,993,050	0.8457	15.43%
CRF29.mp4	 4,923,517		 334		  246,175,850		  207,760,736	0.8440	15.60%
CRF30.mp4	 4,391,764		 298		  219,588,200		  183,939,507	0.8377	16.23%




Attached Files
File Type: zip dabr1bxls.zip (3.0 KB, 184 views)

Last edited by DDogg; 8th October 2006 at 21:43.
Old 7th October 2006, 22:49   #5  |  Link
akupenguin
x264 developer
 
 
Join Date: Sep 2004
Posts: 2,392
Did all your tests use the same 2% of the movie? What's the distribution of projected sizes if you run one Q lots of times with different random 2%s?
Old 7th October 2006, 22:57   #6  |  Link
virus
Senior n00b
 
Join Date: Jan 2004
Location: Italy
Posts: 446
Quote:
Originally Posted by foxyshadis View Post
Interestingly, although AVC is theoretically supposed to double in q steps of 6, this roughly doubles in q steps of 5, nearing 6 as it rises. I wonder if that has to do with crf, the options used, or just x264's implementation.
Also the type of source plays an important role. I've seen several cases where the "-6 QP -> 2x rate" rule of thumb holds decently (say, within 10-15%), but also a case where going from QP 16 to QP 10 brought a 3.8x increase in bitrate. I have no idea whether CRF's rates would differ significantly from constant QP in that specific case, though.
Old 7th October 2006, 23:27   #7  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
Quote:
Originally Posted by akupenguin View Post
Did all your tests use the same 2% of the movie? What's the distribution of projected sizes if you run one Q lots of times with different random 2%s?
Sorry, I don't completely follow your question. Presently I am just using the method of sampling we always used for CCE that turned out to be reliable in all regards. There was never a need for anything else than the select range statement above.

Whether that holds true for x264 crf mode I really don't know yet. I do remember jonny and I having long discussions several years ago about XviD CQ, and the same principles did not hold up well at all; the graphs were all over the place. He and I both dropped out around that time and I don't know if he ever pursued that aspect of XviD compression again. I did send him a PM asking him to check in on this thread, as this stuff should be second nature to him.

Primarily I am just attempting to do some grunt-level labor in the hope it might provide some of you brainiacs with usable data and stimulate conversation and ideas.

If you would like a set of tests run, please provide the command line you would like used, your preferred base-level sampling script/method, and your specifications for the test. I'll get it done and collate the results back to you.

Doing some of the grunt labor for some of you is little enough to repay the work you have done with x264.

Btw, none of this has any validity until I do multiple sources and see if the curves track in parallel. Even with a water-cooled D930 running at 4.5GHz, the time required is scary.
Quote:
Originally Posted by virus View Post
Also the type of source plays an important role. I've seen several cases where the "-6 QP -> 2x rate" rule of thumb holds decently (say, within 10-15%), but also a case where going from QP 16 to QP 10 brought a 3.8x increase in bitrate. I have no idea whether CRF's rates would differ significantly from constant QP in that specific case, though.
That was always the beauty of using 1-pass Q in CCE, or I assume any constant-quality mode in any encoder: the source complexity completely dictated the bitrate required to "hold" it (within certain caps). That was the key for us back then, as the sample size, translated to DABR, allowed a fingerprinting of the source complexity and thus an extrapolation of the size/bitrate needed in advance of actually doing the encode. It also let one estimate the effect of various matrices and filters on bitrate demand, and would clearly show even something as small as the minute effects of DC precision. So incredibly useful. Guess that is why I am hoping some of this might pan out for x264.

Last edited by DDogg; 9th October 2006 at 17:49.
Old 8th October 2006, 00:56   #8  |  Link
IgorC
Registered User
 
Join Date: Apr 2004
Posts: 1,315
Quote:
Originally Posted by DDogg View Post
Sorry, I don't completely follow your question. Presently I am just using the method of sampling we always used for CCE that turned out to be reliable in all regards. There was never a need for anything else than the select range statement above.
The filesize can vary significantly for different 2% pieces of the movie, i.e. 2% of chapter 10 may have a different filesize from 2% of chapter 20 due to different motion and other factors. It's a probabilistic task. Maybe it would be better to apply statistical tools to get a better estimate of the results, i.e. confidence intervals of 0.90-0.95. That should be reliable.

Last edited by IgorC; 8th October 2006 at 01:01.
Old 8th October 2006, 04:23   #9  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
Quote:
Originally Posted by IgorC View Post
The filesize can vary significantly for different 2% pieces of the movie, i.e. 2% of chapter 10 may have a different filesize from 2% of chapter 20 due to different motion and other factors. It's a probabilistic task. Maybe it would be better to apply statistical tools to get a better estimate of the results, i.e. confidence intervals of 0.90-0.95. That should be reliable.
Yes, certainly various 2% slices would demand different bitrates, as the complexity of each slice dictates its bitrate, but we are speaking of SelectRangeEvery(600,12), which samples evenly across the whole source and so averages over all of it.

I remember huge amounts of discussion on this topic, because any intelligent person might doubt the ability of a 1 or 2% sampling of all the frames to accurately average the bitrate. However, after literally tens of thousands of tests on various sources we found it consistently accurate to +-2%, and popular programs like DVDRebuilder use these techniques every day now. That's just one of those things that ended up working even though people swore it would not. It did, and nobody argues about it anymore as far as the CCE process is concerned.

Whether it works at all with x264 crf mode is a whole other question. All we can do is try and see. If you have a method of sampling you think would work well, please share it. I can do a controlled set of tests. BTW, I'm not sure which statistical tools you have in mind.

Last edited by DDogg; 8th October 2006 at 04:31.
Old 8th October 2006, 06:07   #10  |  Link
akupenguin
x264 developer
 
 
Join Date: Sep 2004
Posts: 2,392
If that's how you sampled it, then I have no trouble believing that it could make a good predictor once you compensate for the systematic errors.
A trivial explanation for why it overestimated the filesize: the test pass is limited to GOP size 12, whereas the final pass uses much larger GOPs. Solutions: (1) Use larger ranges (but then your test wouldn't sample the input as evenly). (2) Estimate the final average GOP size and downweight the I-frames when counting the bitrate of the test pass. In my experience the average GOP is around 100 when unrestricted, or you could try to measure it based on the number of I-frames that are not on the border between ranges (though x264's biased scenecut detection might interfere with that).
Several other options are affected similarly, though the magnitude of error introduced by them is smaller: B-adapt's decisions are also constrained by the GOP size, and multiref is less effective for the first few frames of a GOP.

Edit: I'm making this more complicated than it needs to be. To compensate for the extra I-frames, just delete the first frame in each range. The actual number of scenecuts in the movie will be correctly represented by the remainder of the frames. (This doesn't fix B-frames and refs, though.)

I assume the 5% sample was SelectRangeEvery(240,12)? Then the next test to run is SelectRangeEvery(600,30) and so on.

Last edited by akupenguin; 9th October 2006 at 17:16.
Old 8th October 2006, 13:29   #11  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
Quote:
Originally Posted by akupenguin View Post
If that's how you sampled it, then I have no trouble believing that it could make a good predictor once you compensate for the systematic errors.
A trivial explanation for why it overestimated the filesize: the test pass is limited to GOP size 12, whereas the final pass uses much larger GOPs. Solutions: (1) Use larger ranges (but then your test wouldn't sample the input as evenly). (2) Estimate the final average GOP size and downweight the I-frames when counting the bitrate of the test pass. In my experience the average GOP is around 100 when unrestricted, or you could try to measure it based on the number of I-frames that are not on the border between ranges (though x264's biased scenecut detection might interfere with that).
Several other options are affected similarly, though the magnitude of error introduced by them is smaller: B-adapt's decisions are also constrained by the GOP size, and multiref is less effective for the first few frames of a GOP.

I assume the 5% sample was SelectRangeEvery(240,12)? Then the next test to run is SelectRangeEvery(600,30) and so on.
Here is the derived bitrate spreadsheet I use; the select lines are listed at the bottom. That brings up a math question (my math skills are 1970s US high school = poor). Just for conversation's sake, let's assume the above curve holds true (a big, stupid assumption right now). Would you, or others with good math skills, have a suggestion on how one would take the formula (((Sample_Size_In_Bytes*(100/Sample_Size_Percentage))*8)/1000)/(Total_Frames/Frame_Rate) and add a curve-correction factor to it based upon the above chart or data points?
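One simple-minded way to fold a correction into that formula (a sketch only, assuming the actual/projected ratio keeps falling roughly in a straight line with CRF the way it does in the table above): fit a least-squares line through the ratio column and multiply the raw projection by the fitted ratio for the CRF you plan to use. In Perl, to match the log script later in the thread:

Code:
#!/usr/bin/perl
# Sketch: fit ratio = actual/projected as a straight line in CRF,
# then apply it as a correction to a raw 2% projection.
my @crf   = (15 .. 30);
my @ratio = (0.9559, 0.9464, 0.9384, 0.9275, 0.9184, 0.9089, 0.8993, 0.8890,
             0.8822, 0.8714, 0.8653, 0.8560, 0.8511, 0.8457, 0.8440, 0.8377);

# least-squares line: ratio ~ $a + $b * CRF
my ($n, $sx, $sy, $sxx, $sxy) = (scalar @crf, 0, 0, 0, 0);
for my $i (0 .. $#crf) {
    $sx  += $crf[$i];
    $sy  += $ratio[$i];
    $sxx += $crf[$i] * $crf[$i];
    $sxy += $crf[$i] * $ratio[$i];
}
my $b = ($n * $sxy - $sx * $sy) / ($n * $sxx - $sx * $sx);
my $a = ($sy - $b * $sx) / $n;

my $test_crf  = 19;
my $projected = 892_346_400;                    # raw CRF19 projection from the table
my $corrected = $projected * ($a + $b * $test_crf);
printf "ratio = %.4f %+.5f*CRF, corrected CRF%d size = %.0f bytes\n",
       $a, $b, $test_crf, $corrected;

For CRF19 (which is inside the fitted data, so this is only a consistency check) it gives roughly 819-820 million bytes, very close to the 819,491,843 actually obtained; the real test is applying a fit from one source to another, as is attempted later in the thread.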

As for your point #2 above, it seems to me that would take a small program to accomplish?

Graph and data points updated above

Last edited by DDogg; 8th October 2006 at 14:18.
Old 8th October 2006, 21:35   #12  |  Link
foxyshadis
Angel of Night
 
 
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,559
Are you keeping logs of the x264 output? Try comparing the I-frame size as a % of the total size in the test vs the final, and subtract the difference from the sampled size.

So if in the sample the I-frames are 478632 bytes and 40%, but in the final they're 9042673 and 25%, you subtract 179478 and predict from what's left. (([average full %]/[obtained %])*size will net you the amount, or close enough to it for this.) You'll probably have to run tests to figure out what the average I-frame size in % is in movies. Or you might just be able to find an average obtained % as well and not need to look at logs at all.
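Reading that literally (just a sketch using the example figures above; the two percentages would come from the x264 logs, and the "final" one would have to be an assumed average for movie material):

Code:
#!/usr/bin/perl
# Sketch of the I-frame-share correction described above,
# using the example figures from this post.
my $sample_i_bytes = 478_632;   # I-frame bytes in the sampled encode
my $sample_i_pct   = 40;        # I-frames as a % of the sampled encode
my $final_i_pct    = 25;        # expected I-frame % in the full encode (assumed)

my $sample_total = $sample_i_bytes * 100 / $sample_i_pct;        # whole sample size
my $excess       = $sample_i_bytes - $sample_total * $final_i_pct / 100;
my $adjusted     = $sample_total - $excess;                       # project from this

printf "sample total %.0f, excess I-frame bytes %.0f, adjusted sample %.0f\n",
       $sample_total, $excess, $adjusted;

With these figures the excess comes out just under 180,000 bytes, in line with the number quoted above.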

Since you're not writing software that can analyse the stream and chop the first two and last two frames of each gop off, which is how the most accurate mpeg-2 and mpeg-4 results seem to come about, this should be the next-best way.
Old 9th October 2006, 00:48   #13  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
foxyshadis, that sounds like a very intelligent thing to try. Problem is I am not smart enough, and I did not keep logs. Hopefully you might feel like dipping a toe in this water as I could certainly use some help from somebody like you.

I'm going to take the simple and dumb route with another experiment. I have another source of 150,804 frames with much lower complexity (to my eye) than the previous 141,241-frame source. I've run the 2% tests and, using the same method as above, the size predicts out at 1,246,861,350 bytes for CRF 15 and 160,868,500 for CRF 30. Btw, I was pleased to see this: it has more frames, but clearly the 2% sample is getting a good fingerprint of the complexity and indicates it will come in smaller, which seems right to me.

If the curve holds and I apply the corrections of -4.41% and -16.23% respectively, they should come in at 1,191,846,279 and 134,752,562 (using an identical script and commandline). I would be surprised if they hit the size exactly, but it is an experiment that has to be done to see.

Will post back whatever the results are.

/add:

Results are encouraging -
CRF 30 - expected 134,752,562, got 128,583,054, which is 4.6% shy of the target
CRF 15 - expected 1,191,846,279, got 1,194,073,460. WOW! I'll take that any day.

Will run several more toward the belly of the curve:

/add:
These did not work out as well [+-2% is desired] -
CRF 18 - expected 703,308,561, got 684,558,029 - 97.3% of predicted
CRF 20 - expected 509,203,536, got 489,826,680 - 96.2% of predicted
CRF 22 - expected 377,038,516, got 360,203,986 - 95.5% of predicted

Last edited by DDogg; 13th October 2006 at 03:22.
Old 9th October 2006, 16:25   #14  |  Link
xyloy
x264 & XviD rules! ;-)
 
 
Join Date: Jul 2005
Location: France, near Bordeaux
Posts: 178
If you use x264 via MeGUI, logs are usually kept in the program files\megui\logs subdirectory.

BTW, this is a very interesting topic.
__________________
x264 with mb-tree is kicking my ass!! :o
Recommended Codec :
Latest x264 revision build for everything.
Unrecommended Codecs: everything else.

Last edited by xyloy; 9th October 2006 at 17:53. Reason: oops, I should relearn my english sometimes :P
Old 9th October 2006, 17:20   #15  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
Quote:
Originally Posted by xyloy View Post
If you use x264 via MeGUI, logs are usually keeped in the program files\megui\logs subdirectory.

BTW, this is a very interesting topic.
Thanks, xyloy. I think there is a fair amount of potential in the subject, but I also think the advice of akupenguin and foxyshadis is going to have to be incorporated to take it anywhere. I don't think I can get there on my own.

Still, although super-accurate prediction has not been demonstrated (so far) with the very simple methods used, there certainly seems to be potential to use them as a reality check before doing a bitrate-based encode, as well as for generally checking the compressibility of a source. Given the slower speed of x264, this might be helpful to some. My hope is this thread will stimulate more advanced members of this forum to take the subject farther.

@akupenguin - just noticed your edit - "...To compensate for the extra I-frames, just delete the first frame in each range. The actual number of scenecuts in the movie will be correctly represented by the remainder of the frames. (This doesn't fix B-frames and refs, though.).." My brain is fried, please suggest the modified SelectRangeEvery statement

Hmmm, wondering if a PredictCRFsizefunction.avsi might accomplish this task better? Function writers out there?

Last edited by DDogg; 9th October 2006 at 17:40.
Old 9th October 2006, 18:51   #16  |  Link
jonny
Registered User
 
 
Join Date: Feb 2002
Location: Italy
Posts: 876
Quote:
please suggest the modified SelectRangeEvery statement
The problem is the (statistically) bad frames introduced by SelectRangeEvery.
Think of it as feeding the encoder little snips of the original source (full of jumps between different parts of the movie); this introduces abnormally sized frames at the start and at the end of each snip.

The only way to handle this is to use the log with all the frame sizes, summing the sizes but discarding the bad frames (this will give you a better value than the raw output size).

With MPEG-4 ASP, a good way to handle B-frames is discarding 3 frames at the start and 3 frames at the end of each snip.
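For what it's worth, a log-parsing sketch of that idea (assuming x264-style verbose per-frame "frame=... size=..." log lines and the 12-frame ranges used in this thread): drop the first three and last three frames of every snip before summing.

Code:
#!/usr/bin/perl
# Sketch: sum frame sizes from a verbose log, skipping the first 3
# and last 3 frames of each 12-frame snip, then print the average kbps.
my $range_length = 12;
my $trim         = 3;
my $fps          = 24000/1001;
my ($size, $frames) = (0, 0);
while (<>) {
    /frame= *(\d+).*size=(\d+)/ or next;
    my $pos = $1 % $range_length;                       # position within the snip
    next if $pos < $trim || $pos >= $range_length - $trim;
    $size += $2;
    $frames++;
}
printf "%.3f kbps\n", $size * $fps * .008 / $frames;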

I haven't done any testing with AVC; I've found a discussion here:

http://forum.doom9.org/showthread.ph...ion#post598896
(i'm trying to see where it ends : )
Old 9th October 2006, 19:12   #17  |  Link
foxyshadis
Angel of Night
 
 
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,559
I don't think he means modifying the AviSynth script; I'm pretty sure it means chopping the first frame of each segment off in the output. To do that you can modify x264's output module or use a stream parser to ignore every 12th frame, but it's definitely not something you can gather from the basic info you're using now.

Oh, I just noticed that min-gop is still 25, so it's not just every 12 frames... hmm, interesting. Might still be I frames though. I guess I'll have to run a similar test and parse it to check.
Old 9th October 2006, 19:20   #18  |  Link
akupenguin
x264 developer
 
 
Join Date: Sep 2004
Posts: 2,392
No need to actually modify the compressed stream. Just ignore the appropriate lines from `x264 -v` output.
And it doesn't matter what min-gop is set to. An I-frame takes almost the same number of bits as a P-frame full of I-blocks. Except for the keyframe qp bonus, but that won't really be correct either way.

Code:
#!/usr/bin/perl
# Sum the frame sizes from x264 -v output, skipping the first frame of each
# SelectRangeEvery() range (the artificial keyframe), then print the average
# bitrate in kbps.
$range_length = 12;                        # frames per range: SelectRangeEvery(600,12)
$fps = 24000/1001;
while(<>){
    /frame= *(\d+).*size=(\d+)/ or next;   # per-frame stats line: frame number, size in bytes
    $1 % $range_length or next;            # frame 0 of each range -> skip it
    $size += $2;
    $frames++;
}
printf "%.3f\n", $size*$fps*.008/$frames;  # bytes*8/1000*fps/frames = kbps
...but this tends to overcompensate a bit. Maybe because the keyframes still got their qp bonus and thus made better references but their bit cost was ignored. Or maybe the converse: the real scenecuts didn't get detected as such and didn't get the qp bonus, so didn't increase the bitrate as much as in the final encode.
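For anyone wondering how to feed the script: the per-frame lines should come from x264's verbose (-v) output, which goes to the console's error stream, so something along these lines ought to work from a Windows command prompt. The paths, the trimmed-down option list and the script name dabr.pl are just placeholders:

Code:
C:\apps\x264.exe -v --crf 19 --output "D:\temp\auto3\sample.mp4" "d:\temp\auto3\movie.avs" 2> verbose.txt
perl dabr.pl verbose.txt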

Last edited by akupenguin; 9th October 2006 at 19:35.
Old 9th October 2006, 20:50   #19  |  Link
Hellworm
Registered User
 
Join Date: Aug 2005
Posts: 132
Quote:
Originally Posted by DDogg View Post
@akupenguin - just noticed your edit - "...To compensate for the extra I-frames, just delete the first frame in each range. The actual number of scenecuts in the movie will be correctly represented by the remainder of the frames. (This doesn't fix B-frames and refs, though.).." My brain is fried, please suggest the modified SelectRangeEvery statement
I think he meant that you should ignore the size of the I-frame of every cut (by analysing the log) and completely ignore those frames, as the scenecut was introduced by the SelectRangeEvery anyway.

damn beaten to it
Old 9th October 2006, 21:19   #20  |  Link
DDogg
Retired, but still around
 
 
Join Date: Oct 2001
Location: Lone Star
Posts: 3,058
jonny, thanks for posting that link. A general comment, somewhat on topic with the recent VFW thing: in that thread they are trying to work around all the problems of VFW, VDubMod, etc., whereas the modern CLI version, if the author chose, could probably generate a specialized log, or for that matter even have a predictive method built in, without all the nonsense of the workarounds. At least, I think that may be true. akupenguin, is that theoretically possible, or am I misunderstanding some of your comments?