x264: problems with VBV MB row ratecontrol causing occasional messy i-frames
F J Walter
27th August 2008, 10:34
Hello,
Unfortunately I have been having trouble with x264 encodes that show nasty blocking in occasional I-frames when using VBV.
The VBV values I am using are quite reasonable, and even increasing buffer size does not fix the problem. However, fiddling with just about any parameter shifts the problem around.
Resolution: 320x176 (it's for mobile devices)
VBV maximum rate: 768k
VBV buffer size: 2000k
I have tried it both with --crf 18 and with an average bitrate, both ways of having a 'constrained' VBR. I need 1-pass only. I also tried it with CBR (minimum and maximum VBV rate the same).
I tried it multithreaded and with a single thread, and like with many other parameters this changes where the effect occurs but it still turns up on some i-frames. I have of course tried no-fast-pskip but this is distinct macroblock rows so it isn't that.
Looking in ratecontrol.c, it seems that this is due to code in x264_ratecontrol_mb(), specifically the part where the QP is raised on macroblock rows (other than the first row) where the frame size sofar seems to be significantly higher than the predicted size.
This is causing very nasty blocking on some i-frames, particularly on macroblock rows near the top of the frame (except for the first macroblock row, which is crystal clear).
I am thinking that it could be one of:
- This is not optimised well for such small frame sizes - at 176 pixels high it's only 11 macroblock rows
- The predicted frame size (based on predicted row sizes) is buggy
- The predicted frame size is incompatible with AQ (hmmm, I will try to see if dropping AQ fixes it - it did seem to be happening on some low-complexity frames).
- Perhaps "rc->frame_size_planned * rc_tol" is not appropriate to work out the maximum predicted size before raising a MB row QP. rc_tol will almost always be fairly tight - 1.5 at the most - regardless of the size of the buffer. This means that even if there is room in the buffer for 20 big i-frames, any single frame which looks like it's going to exceed 1.5 times its "planned" frame size has its macroblock QPs mangled to fit.
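The last point can be illustrated with a minimal sketch in C. This is not the x264 source: `exceeds_tolerance` and `fits_in_buffer` are hypothetical helpers, and the only thing taken from the post is the shape of the test (planned size multiplied by rc_tol, with no reference to the buffer).

```c
#include <stdbool.h>

/* Hypothetical helper: true when the frame's predicted size exceeds
 * its planned size by more than the tolerance factor rc_tol.
 * The buffer size plays no part in this test. */
static bool exceeds_tolerance(double predicted_bits, double planned_bits,
                              double rc_tol)
{
    return predicted_bits > planned_bits * rc_tol;
}

/* Hypothetical helper: true when the predicted frame would still fit
 * in the bits currently available in the VBV buffer. */
static bool fits_in_buffer(double predicted_bits, double buffer_fill)
{
    return predicted_bits <= buffer_fill;
}
```

With made-up numbers, a frame planned at 20000 bits and predicted at 35000 bits fails the tolerance test at rc_tol = 1.5, even though it would fit many times over in a buffer holding 2000000 bits.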
I will do more experimenting - this kinda takes time.
Build used: daily source tarball from 26 August on x264 official site
Typical command line:
x264 --bframes=0 --no-cabac --level=13 --ref=1 --partitions=p8x8,b8x8,i4x4 --direct=spatial --me=dia --subme=3 --threads=0 --progress --scenecut 96 --no-psnr --qpmin 6 --qpmax 38 --qpstep 4 --verbose --keyint 250 --min-keyint 25 --crf=18 --vbv-maxrate=768 --vbv-bufsize=2000 -o out.mp4 in.yuv 320x176
scenecut 96 exacerbates the problem by creating more i-frames on scene changes. The problem still occurs with scenecut 40. Problems still occur when removing the settings: partitions, direct, me, subme, threads, scenecut, qpmin, qpmax, and changing crf to effective CBR.
F J Walter
27th August 2008, 11:05
Further investigation has found that, at least in the case of using CRF with vbv-maxrate, the problem can be traced back to this code in ratecontrol.c.
    /* avoid VBV underflow */
    while( (rc->qpm < h->param.rc.i_qp_max)
           && (rc->buffer_fill - b1 < rc->buffer_size * 0.005))
    {
        rc->qpm ++;
        b1 = predict_row_size_sum( h, y, rc->qpm );
    }
I took the liberty of printing out some of the values of these variables where the problem was occurring using a quick and dirty printf statement in that while loop, and here are some (from a single frame).
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 1339257791, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 1193145031, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 1062973361, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 947003586, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 843686262, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 751640991, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 669637977, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 596581597, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 531495762, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 473510875, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 421852213, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 375829577, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 334828070, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 298299880, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 265756963, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 236764519, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 210935188, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 187923870, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 167423117, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 149159022, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 132887563, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 118391341, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 105476676, buffer_size 2000000.0
NOTICE: avoiding VBV underflow, buffer_fill 1999890.0, b1 93971017, buffer_size 2000000.0
Typical values of b1 for other complex frames are in the tens of thousands, but on these problematic i-frames they are in the hundreds of millions! That is 4 orders of magnitude above the typical frame.
b1 derives from predict_row_size_sum(). I'm guessing that macroblocks are being boosted to their maximum allowed QPs in an attempt to reduce that superfluous b1 value, and predict_row_size_sum() still persists in guessing values that are way too high.
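The effect of an inflated prediction on that loop can be reproduced in isolation. The sketch below is a stand-alone model in C, not x264 code: `toy_predict` is a hypothetical stand-in for predict_row_size_sum() that shrinks the prediction by a factor of 0.891 per QP step (roughly the ratio between consecutive b1 values in the log above), and the starting QP of 14 in the usage note is an assumption chosen only for illustration.

```c
/* Hypothetical stand-in for predict_row_size_sum(): scale an initial
 * prediction down by a factor of 0.891 for every QP step above qp_start. */
static double toy_predict(double b1_at_start, int qp_start, int qp)
{
    double b1 = b1_at_start;
    for (int i = qp_start; i < qp; i++)
        b1 *= 0.891;
    return b1;
}

/* Same shape as the "avoid VBV underflow" loop: raise QP until the
 * predicted frame leaves at least 0.5% of the buffer free, or until
 * qp_max is reached. Returns the final QP. */
static int raise_qp_for_vbv(double buffer_fill, double buffer_size,
                            double b1_at_start, int qp_start, int qp_max)
{
    int qp = qp_start;
    double b1 = b1_at_start;
    while (qp < qp_max && buffer_fill - b1 < buffer_size * 0.005) {
        qp++;
        b1 = toy_predict(b1_at_start, qp_start, qp);
    }
    return qp;
}
```

Calling raise_qp_for_vbv(1999890.0, 2000000.0, 1339257791.0, 14, 38) runs straight to the QP ceiling of 38, because a prediction that starts in the billions is still far larger than the buffer after 24 steps. With a prediction of 50000 bits and the same buffer state, the loop never runs and the QP stays at 14.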
I have also confirmed that the problem still exists with AQ turned off, but to a lesser degree. Some i-frames are still butchered, but to a lesser extent and the values of b1 are only 2 orders of magnitude out from what you would expect. Still, it is happening on frames that it really shouldn't be happening on due to low complexity, so the brief flicker of macroblocks across the screen is still particularly noticeable.
Further investigation has also found that raising qpmin from 6 to 10 also reduces the effect significantly in terms of the odd values of predict_row_size_sum I am seeing, but it still has about the same visual impact. (note: I now believe this effect was only coincidental)
I will look into possibly coming up with a patch, but unfortunately it might have to involve simply ignoring the predicted size. Either that or disabling MB row based ratecontrol (but that may lead to more chance of actual underflows).
Dark Shikari: I think my problem may be related to this post of yours, if it is any help
http://forum.doom9.org/showthread.php?p=1098690#post1098690
Gabriel_Bouvigne
27th August 2008, 11:36
It's likely that the size predictor is wrong.
My guess is that it's happening in your case because within x264_ratecontrol_mb there is a safeguard to only adjust Qp after 5% of the planned bits have been used, in order to be safe from potentially bad predictions at the top of the frame. Usually 5% gives a few lines before Qp adjustment is potentially triggered. In your case, there is a possibility that your frame already used those 5% on the first line, so Qp adjustment will occur after only 1 single line of accumulated stats.
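A rough way to see why a small frame trips the safeguard so early is to count how many macroblock rows it takes to spend a given share of the planned bits. The sketch below is a simplification in C, not x264 code: `rows_until_threshold` is a hypothetical helper, and it assumes every row costs the same number of bits, which real frames do not.

```c
/* Number of complete macroblock rows that must be encoded before at
 * least `percent` of the planned bits have been spent, assuming every
 * row costs the same number of bits (a simplification). */
static int rows_until_threshold(int total_rows, int percent)
{
    int rows = 1;
    /* rows / total_rows >= percent / 100, kept in integer arithmetic */
    while (rows < total_rows && rows * 100 < percent * total_rows)
        rows++;
    return rows;
}
```

A 176-pixel-high frame has 11 rows of 16 pixels, so rows_until_threshold(11, 5) gives 1: the first row alone is about 9% of the frame. For comparison, a frame with 68 rows (a coded height of 1088 pixels, used here only as an example) gives 4 rows before the 5% mark is passed.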
Could you please provide any (short) sample which triggers this behavior? (preferably it would trigger this behavior even when disabling AQ)
F J Walter
27th August 2008, 14:17
Getting it to happen with AQ disabled is difficult - happened a couple of times on a 3 min clip of mine but when I chop out a smaller section of that clip it won't happen (perhaps it gets worse over time, or just luck).
Here is a 29 sec sample of input (YUV) and output (MP4) using the exact command line above except changing --threads=0 to --threads=1. It has AQ. Check out frames 195 (7.8 sec), 490 (19.6 sec) , 614 (24.56 sec) and 712 (28.48 sec) of the output clip.
http://www.mediafire.com/file/wwovyj0rals/Samples.tar.gz
I'd agree with you that that 5% could be a factor. With only 11 macroblock rows, that is virtually certain of only giving the first row. But even with only one row to go on, it shouldn't be predicting sizes that are 1000s of times higher than they should be, should it? I guess AQ can account for some of that. Possibly, VBV should be modified when AQ is active (which may include turning off MB-row level ratecontrol).
Problem with testing on a short clip is that it's not always reproducible - slight changes in settings or environments (such as number of processor cores) can make vast differences to where it occurs. I have found that on a 3 minute clip, however, it usually happens a few times over the clip's length.
Gabriel_Bouvigne
27th August 2008, 15:12
You are using crf and vbv at the same time? Then of course the encoder will have to drastically increase Qp in some frames, as it is its only way to fulfill vbv constraints (by using crf, you prevented it from gently adjusting frames Qp in order to fulfill vbv constraints).
I'll check if I can reproduce this behavior with a target bitrate instead of crf.
F J Walter
27th August 2008, 15:23
I also noticed that for most of the clip, the VBV buffer is almost always completely unused (2000000) at the start of a frame because 18 CRF usually doesn't reach that bitrate, and yet the QP is still raised.
The majority of cases where this problem is occurring is where the buffer is almost completely unused, but the predicted size of the frame is much more than the "planned" size of the frame, or more than the entire buffer, which is unnecessary in the first case and reflects a huge prediction error in the second. On one clip I had a few frames predicted at being around 600 megabits for a single frame! Not only larger than the planned frame size, but 300 times larger than the VBV buffer.
Sharktooth
27th August 2008, 15:26
stop using CRF and VBV. it cant work. use 2pass modes to be sure the VBV will work.
Gabriel_Bouvigne
27th August 2008, 15:34
(VBV should also work in single pass if you are using a target bitrate instead of crf)
I can reproduce the problem with --bitrate=768 instead of crf (and indeed, the problem seems to go away when disabling AQ)
kemuri-_9
27th August 2008, 15:59
aq-mode 2 (the default) will shift bits around between frames,
which has already been talked about in other threads about how it practically messes with vbv to no end.
if you're gonna use vbv, it's best to not use aq-mode 2.
aq-mode 1 (prevent shifting) or 0 (disabled) should have no problems.
Sharktooth
27th August 2008, 16:08
it would be good then to automatically switch Aq-mode 1 as default if VBV parameters are specified... and keep aq-mode 2 if VBV params are not specified.
Dark Shikari
27th August 2008, 16:08
(VBV should also work in single pass if you are using a target bitrate instead of crf)
I can reproduce the problem with --bitrate=768 instead of crf (and indeed, the problem seems to go away when disabling AQ)
Perhaps it's finally time to move AQ to a frame-level function, store an array of the frame quantizers, and use those quantizers when calculating the SATD cost for VBV?
F J Walter
27th August 2008, 16:29
You are using crf and vbv at the same time? Then of course the encoder will have to drastically increase Qp in some frames, as it is its only way to fulfill vbv constraints (by using crf, you prevented it from gently adjusting frames Qp in order to fulfill vbv constraints).
I'll check if I can reproduce this behavior with a target bitrate instead of crf.
I understand how VBV works - my problem here is that even where the CRF is creating a bitrate far below the VBV constraint and the VBV buffer is completely unused (2000000 bits remaining), it is still messing up some i-frames due to wildly incorrect prediction values part way through the frame which see i-frames being predicted to be 600 million bits, etc.
Also, a single-pass VBV buffer is just as unintelligent with CRF as with 1 pass ABR. There should be no distinction - it will simply raise QP when it needs to in order to satisfy the buffer. I therefore disagree that CRF and VBV can't work. It ought to work in the same sense as it does with 1-pass ABR: it can't work miracles, but given an appropriate buffer size the pitfalls of 1-pass VBV should be smoothed out enough as to be acceptably smooth and unnoticeable.
Gabriel_Bouvigne's suggestion that 5% of the predicted frame size is not enough to go on at such a low bitrate was a good one. I have raised that to 25%, and by combining that with turning off multithreading and fiddling with the way the 'headroom' is calculated, the problem is practically solved. The trouble is that if I enable multithreading, the wildly inaccurate estimates come back and start messing up i-frames again, even when I increase that 25% even further.
I will investigate the aq-mode=1 avenue now.
Sharktooth
27th August 2008, 16:55
no you dont. dont use CRF.
Gabriel_Bouvigne
27th August 2008, 16:56
Also, a single-pass VBV buffer is just as unintelligent with CRF as with 1 pass ABR. There should be no distinction - it will simply raise QP when it needs to in order to satisfy the buffer. I therefore disagree that CRF and VBV can't work. It ought to work in the same sense as it does with 1-pass ABR: it can't work miracles, but given an appropriate buffer size the pitfalls of 1-pass VBV should be smoothed out enough as to be acceptably smooth and unnoticeable.
CRF is designed so that frame Qp is selected according to the CRF value you chose, while in ABR/CBR modes the frame Qp is selected according to target rate AND vbv constraints. When using both CRF and vbv, Qp will be adjusted per row when encoding the frame (if needed), but not before encoding the frame. CRF and VBV can only react, while ABR/CBR and VBV can both anticipate and react.
That does not mean that it would not be possible to have a CRF mode that would behave like ABR/CBR regarding frame allocation, but the current CRF mode was not designed with VBV in mind.
(that does not change the fact that your sample demonstrates that there is indeed an issue)
akupenguin
27th August 2008, 17:02
CRF and ABR differ in how they choose the multiplier for rceq, nothing else. In particular, they don't differ in how they deal with VBV. CRF wasn't designed with VBV in mind, and neither was ABR.
F J Walter
27th August 2008, 17:05
aq-mode=1 does not seem to help with this particular problem unfortunately. Nor does aq-mode=0. I feel I can rule out AQ as a cause, although for some reason I notice that the problem is slightly worse when aq-mode=1, compared to 0 or 2.
- I've been tweaking more and I think that even in the sanest of circumstances, waiting for more MB rows to start predicting, etc, whether ABR or CRF, there are still seemingly random situations where the predicted row size is outrageous. I think I'll fiddle with predict_row_size_sum to take predict_row_size out of the equation - I can still predict frame size per-row based on bits sofar I guess.
Dark Shikari
27th August 2008, 17:08
I've found a bug in which row_satd is used uninitialized in some cases in I-frames, but I haven't found the cause yet.
The bug, not surprisingly, causes the QP to skyrocket to 51.
F J Walter
27th August 2008, 17:14
in ABR/CBR modes the frame Qp is selected according to target rate AND vbv constraints. ... ABR/CBR and VBV can both anticipate and react.
Wow, I didn't know x264 actually did that - consider me informed now. Presumably that would make x264's VBV implementation, at least with ABR/CBR, drastically better than ffmpeg (lavc). And yes, it doesn't change that this is still indeed an independent issue, as CBR does it sometimes too. Unfortunately I won't have time to fiddle with this further for the next day or so.
Dark_Shikari that sounds promising.
F J Walter
28th August 2008, 07:32
Wow, that was fast - I take this to mean it has been patched already?
Latest version:
r951 (Aug 27, 2008)
Version history:
r951
Fix some uses of uninitialized row_satd values in VBV
Resolves some issues with QP51 in I-frames with scenecut
I'll see if I can compile and test soon.
Dark Shikari
28th August 2008, 07:35
Wow, that was fast - I take this to mean it has been patched already?
Yes, though I'm still working on a rework of AQ to act before VBV in order to improve VBV accuracy. It works, but currently breaks threads and scenecut for unknown reasons, so it hasn't been committed.
F J Walter
28th August 2008, 07:41
Sounds awesome - you are certainly fast to act! I look forward to it.
F J Walter
28th August 2008, 12:35
The bug appears to be fixed now, thanks! Still a bit wonky with AQ but only damages it by about 4 QP instead of raising to qpmax. I've hacked mine to increase the tolerance a bit in VBV so AQ doesn't interfere, though this may increase the chance of an underflow.
bob0r
28th August 2008, 17:41
Sounds awesome - you are certainly fast to act! I look forward to it.
Thats usually the best way.
1: motivation is high, you have testers and its fresh
2: making a todo list usually ends in not happening at all, unless you have much more planned work
3: acting right away may solve a problem quicker because in the future code may change so much, it may be harder to fix.
(that would be for beginner programmers, current x264 crew doesn't seem to have any problems)
4: it motivates the users to test and keep using x264
I have loved x264 since the day I saw the #x264 announcement in the #xvid channel :D
x264 and the people around it to me seem so professional, we just love being a part of that community!