28th June 2009, 14:04 | #1 | Link |
Stray Developer
Join Date: Mar 2003
Location: Italy
Posts: 82
|
Adaptive max bframes patch
I just put together a patch (it's more of an ugly hack but whatever) that makes --b-adapt 2 somewhat faster.
Basically, it adapts the maximum number of b-frames to search for based on the length of the previous span of consecutive b-frames. It is far easier to just read the patch to see how it works; this is the relevant part:
Code:
+    /* adaptive max b-frames */
+    if( h->sh.i_type == SLICE_TYPE_P )
+    {
+        int i_gop_bframes = h->fdec->i_frame - h->fref0[0]->i_frame - 1;
+        const int i_bframes_overhead = 4;
+        if ( i_gop_bframes + i_bframes_overhead > h->frames.i_adapt_bframes )
+            h->frames.i_adapt_bframes = i_gop_bframes + i_bframes_overhead;
+        else
+            h->frames.i_adapt_bframes--;
+        h->frames.i_adapt_bframes = x264_clip3( h->frames.i_adapt_bframes, 0, h->param.i_bframe );
+    }
h->frames.i_adapt_bframes is initialized (and reset at every scenecut) to h->param.i_bframe (i.e. the number of b-frames specified on the command line).
The same adaptive method could easily be added to --b-adapt 0 and 1 as well, but I guess that it wouldn't give the same speedup.
Right now I'm running a few tests and I will post further results shortly, but the first tests showed a speedup ranging from 5 to 30% (heavily dependent on the content of the video) for -b 16 (I know that this is the best-case scenario, just as I know the tendency of doom9ers to max out their settings).
Just a few more warnings: I tested it only on an Ubuntu32 VM and I have not paid much attention to concurrency issues (there should be none, and I experienced no crashes or other misbehaviour, but I'll wait for a review from the x264 devs about this).
I tried to be as clear as possible in the patch, but I guess that at least a few variable names and comments will have to be edited... I couldn't come up with better ones, though.
That's all. Feel free to comment and give (possibly) constructive advice. |
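The update rule in the patch can be pulled out into a standalone function for experimenting outside x264. What follows is only a minimal sketch, not part of the patch: `next_adapt_bframes` and `clip3` are hypothetical names (`clip3` is a local stand-in that is assumed to behave like `x264_clip3`), and the encoder state is passed in as plain integers.

```c
#include <assert.h>

/* Local stand-in for x264_clip3 (assumption: clamps v to [lo, hi]). */
static int clip3(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * One update step, meant to run once per P-frame.
 *   adapt_bframes: current search limit (h->frames.i_adapt_bframes in the patch)
 *   gop_bframes:   b-frames between this P-frame and its reference
 *   overhead:      headroom added on top of the previous span (4 in the patch)
 *   max_bframes:   user limit (h->param.i_bframe in the patch)
 */
int next_adapt_bframes(int adapt_bframes, int gop_bframes,
                       int overhead, int max_bframes)
{
    assert(gop_bframes >= 0 && overhead >= 0 && max_bframes >= 0);

    if (gop_bframes + overhead > adapt_bframes)
        adapt_bframes = gop_bframes + overhead;  /* grow at once */
    else
        adapt_bframes--;                         /* shrink one step at a time */

    return clip3(adapt_bframes, 0, max_bframes);
}
```

With an overhead of 4 and -b 16, a limit of 4 followed by a span of 4 b-frames becomes 8, while a limit of 16 followed by a span of zero b-frames drops to 15. Because the comparison is a strict `>`, a limit sitting exactly at the overhead also drops by one (to 3) after a span of zero b-frames.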
28th June 2009, 14:08 | #2 | Link | |
Stray Developer
Join Date: Mar 2003
Location: Italy
Posts: 82
|
First tests:
Quote:
|
|
28th June 2009, 14:16 | #3 | Link |
Registered User
Join Date: Jan 2002
Location: France
Posts: 2,856
|
If you don't already do it, reset i_adapt_bframes to i_bframes when a scenecut is detected.
|
28th June 2009, 14:49 | #5 | Link |
Registered User
Join Date: Jan 2002
Location: France
Posts: 2,856
|
Then you ought to make adaptation instantaneous at the beginning of a scene, because the average scene length is quite small (a few seconds), so with -b 16, decreasing i_adapt_bframes by 1 at each P may cost a lot.
|
28th June 2009, 19:02 | #7 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
There is a specific type of danger with this type of patch. Let me describe a similar situation to explain.
Let's say you have a patch that speeds up --ref 16 by 50%. Everyone is very happy; clearly this is a great improvement. Then someone tests it and finds that you can get the same speedup for the same quality cost simply by setting --ref 6 instead of 16, and that the patch is a complete waste of time.
The big possible problem I imagine with this is that the places where it will do worst are the few places where tons of B-frames are actually useful, and the places where it will do best are the places where they aren't useful; in other words, it will do nothing at all.
You need to test on some contrived input cases, e.g. linear fades created in Avisynth and placed after an ordinary sequence or similar (with no scenecut in between), in order to make sure your algorithm isn't going to react badly.
Finally, I don't like the idea of making b-adapt 2 too suboptimal, because the idea of b-adapt 2 to begin with was to serve as a reference representing the best possible B-frame decision given a certain metric (slicetype_frame_cost). |
28th June 2009, 23:56 | #8 | Link | ||||
Stray Developer
Join Date: Mar 2003
Location: Italy
Posts: 82
|
Quote:
Quote:
Apart from this, the patch should behave exactly as a vanilla build. Only faster (YMMV). Quote:
Quote:
|
||||
29th June 2009, 01:54 | #9 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
What you need to do is test speed-vs-quality with B-frame numbers that actually get chosen. In other words, graph normal B-adapt 2 with bframes 1/2/3/4, and yours with 1/2/3/4, on an RD chart.
Quote:
This applies to basically the majority of videos uploaded to the website of the company I currently work for (Facebook). I don't think there's anything wrong with an attempt to speed up B-adapt 2 with heuristics, but I don't like patches which assume that scenes are uniform and adapt heavily to a scene (in particular, adapt in such a way that they cannot adapt back until the scenecut resets them). |
|
29th June 2009, 06:01 | #10 | Link |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
The above example just suggests a motion adaptive b-frame search.
Well, regardless of whether it is motion adaptive or not, you could just set a min and max b-frame count, such that:
-b 4 (sets the minimum b-frame search)
-bmax 12 (sets the maximum b-frame search)
Therefore, in this case, the minimum b-frame search will always be 4, so in the above situation you aren't suboptimal.
If it were motion adaptive, since you already have the motion information, you could use it to detect where b-frames are useless; a b-frame search of 1 could then save search time. If a b-frame search of 1 is successful (i.e. 1 b-frame actually gets used for that motion), then dynamically increase the search based on that.
Even if the above is complete nonsense because I don't really know what I'm talking about, surely the min/max idea isn't?! |
29th June 2009, 07:40 | #11 | Link | |
Stray Developer
Join Date: Mar 2003
Location: Italy
Posts: 82
|
I don't really get why you say that
Quote:
Code:
I P P P P P P P P P P P P P P B B B B B B B B B B B B B B B B P B B B B B B B B B B B B B B B B P B B B B B B B B B B B B B B B B P
Code:
frame_t w/o patch  I P P P P P P P P P P P P P P P B B B B B B B B B B B B B B B B P B B B B B B B B B B B B B B B B P B B B B B B B B B B B B B B B B ...
------------------------------------------------------------
prev_gop           0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 8 12 16
frame_t w/patch    I P P P P P P P P P P P P P P P B B B B P B B B B B B B B P B B B B B B B B B B B B P B B B B B B B B B B B B B B B B P B B B B B B ...
i_adapt_bframes    16 16 15 14 13 12 11 10 9 8 7 6 5 4 4 4 8 12 16 16
Moreover, as far as the min/max approach is concerned, it actually already works that way: -b N controls the maximum and i_bframes_overhead controls the minimum (as you can see in the table above, i_bframes_overhead <= i_adapt_bframes <= param.i_bframe).
p.s. Rereading my previous posts I noticed that I may have misled you by using the term GOP. What I was referring to was the spans of consecutive b-frames, not all the frames between two I-frames. Sorry about that. |
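The ramp-up at the end of the table (8, 12, 16, 16) can be replayed with a small loop. This is a sketch under assumptions: `limit_after_spans` is a hypothetical helper, the overhead of 4 and the limit of 16 are taken from the posts, and each array entry is the number of b-frames in the span that precedes a P-frame.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Applies the patch's update rule once per span and returns the search
 * limit that would be in effect after the last span.
 */
int limit_after_spans(int start_limit, int max_bframes,
                      const int *span_bframes, size_t n_spans)
{
    const int overhead = 4;  /* i_bframes_overhead in the patch */
    int limit = start_limit;

    assert(span_bframes != NULL && max_bframes >= 0);

    for (size_t i = 0; i < n_spans; i++) {
        if (span_bframes[i] + overhead > limit)
            limit = span_bframes[i] + overhead;  /* jump up to span + overhead */
        else
            limit--;                             /* decay by one */

        /* clamp to [0, max_bframes] */
        if (limit < 0)
            limit = 0;
        if (limit > max_bframes)
            limit = max_bframes;
    }
    return limit;
}
```

Starting from a limit of 4 with -b 16, spans of 4, 8, 12 and 16 b-frames give limits of 8, 12, 16 and 16, so under this rule the limit climbs back to the user maximum in steps of the overhead.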
|
29th June 2009, 07:56 | #12 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
The term you're looking for is "minigop," one set of frames starting at a P-frame, containing X B-frames, and ending before the next P-frame.
So what you're really trying to say is that your patch allows the number of B-frames to increase by up to 4 between minigops, specifically that if the previous minigop had 4 B-frames, the search for the next one will look for up to 8. If the next one has 2 B-frames, we will decrease the threshold by one to 7 for the next minigop after that. In other words, you predict that the current minigop will probably never need more than 4 B-frames more than the previous minigop.
This isn't a bad statement at all; however, I see two problems with it.
1. B-adapt 2 assumes that the max B-frames throughout its path-searching is uniform. "Properly" implementing this heuristic would probably involve modifying the B-adapt 2 search function to take your heuristic into account when selecting paths.
2. The most common case of a sudden jump from few to an enormous number of B-frames is that of a fade; this patch might worsen coding in fades. This probably won't be too big a deal with real content, which rarely has perfectly linear fades, and will be partly mitigated by the upcoming weighted P-frame prediction patch, but is still potentially an issue. |
29th June 2009, 08:30 | #13 | Link | ||||
Stray Developer
Join Date: Mar 2003
Location: Italy
Posts: 82
|
Quote:
Quote:
Quote:
Quote:
Moreover, IIRC, the 16 b-frames limit is arbitrary, right? Using this patch you could just as well raise it and still be able to finish a two-hour encode within a human lifespan. As a side note, the same method (or slight variations thereof) could also be applied to tons of other parameters (e.g. me_range). |
||||
29th June 2009, 10:06 | #14 | Link |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
The whole fade issue can be negated by using a simple, say, 3-frame look-ahead buffer. Fades involve an increase or decrease in luminance, so if such a change in luminance is detected within that small number of frames, it can be dealt with! This would also be OK for flickering caused by campfire scenes etc.
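One way to read the look-ahead suggestion is as a check that the mean luminance moves steadily in one direction across the buffered frames. The sketch below is only an illustration of that reading: `looks_like_fade`, the per-frame mean luma input and the `min_step` threshold are all assumptions, and none of it is x264 code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if the mean luma rises on every frame or falls on every frame
 * by at least min_step, and 0 otherwise. Flicker (up, then down) and
 * near-constant brightness are both rejected.
 */
int looks_like_fade(const double *mean_luma, size_t n_frames, double min_step)
{
    int direction = 0;  /* +1 brightening, -1 darkening, 0 not yet known */

    assert(mean_luma != NULL && min_step > 0.0);
    if (n_frames < 2)
        return 0;

    for (size_t i = 1; i < n_frames; i++) {
        double delta = mean_luma[i] - mean_luma[i - 1];
        int step_dir = delta >= min_step ? 1 : (delta <= -min_step ? -1 : 0);

        if (step_dir == 0)
            return 0;  /* change too small to count as a fade step */
        if (direction != 0 && step_dir != direction)
            return 0;  /* direction flipped: flicker, not a fade */
        direction = step_dir;
    }
    return 1;
}
```

With a 3-frame buffer and a step threshold of 2, mean luma values of 100, 90, 80 would be flagged, while 100, 120, 100 (flicker) would not.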
|
29th June 2009, 13:38 | #15 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
|
|
29th June 2009, 20:38 | #16 | Link |
Stray Developer
Join Date: Mar 2003
Location: Italy
Posts: 82
|
DS, I just had an idea on how to improve the algorithm so that it doesn't, in any case, perform worse than the optimal solution.
The basic idea remains the same, but after calling x264_slicetype_path_search I check whether num_bframes is equal to max_bframes and less than param.i_bframe. In this case I set max_bframes back to param.i_bframe (as if it were a scenecut) and immediately call x264_slicetype_path_search again (I know that this way you'd end up doing the search twice for the same minigop). This way the decision should be absolutely identical to the non-patched trellis decision.
Could it work or am I missing something?
Ideally x264_slicetype_path_search itself could be patched so that if it hits the limit it does the reset described above without having to do the same search twice. IIRC trellis allows this. |
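The retry described above can be sketched as a thin wrapper around the search. This is a hedged illustration only: the real signature of x264_slicetype_path_search is not used here, `search_fn` is a hypothetical callback that takes a b-frame limit and returns the chosen b-frame count, and the two toy searches exist purely to exercise the wrapper.

```c
#include <assert.h>

/* Hypothetical stand-in for the path search: limit in, chosen b-frames out. */
typedef int (*search_fn)(int max_bframes);

/*
 * Search with the adapted limit first; if the result hits that limit and
 * the limit is below the user maximum, search again with the full maximum.
 */
int search_with_retry(search_fn search, int adapt_limit, int param_bframes)
{
    int num_bframes;

    assert(search != 0);
    assert(adapt_limit >= 0 && adapt_limit <= param_bframes);

    num_bframes = search(adapt_limit);
    if (num_bframes == adapt_limit && adapt_limit < param_bframes)
        num_bframes = search(param_bframes);  /* second, full-range search */

    return num_bframes;
}

/* Toy search that would like 5 b-frames if the limit allows it. */
int toy_search_wants_five(int max_bframes)
{
    return max_bframes < 5 ? max_bframes : 5;
}

/* Toy search that never wants more than 1 b-frame. */
int toy_search_wants_one(int max_bframes)
{
    return max_bframes < 1 ? max_bframes : 1;
}
```

With a limit of 2 and -b 16, the first toy search hits the limit and is rerun, returning 5; the second returns 1 and is not rerun. The wrapper only catches the case where the reduced limit is actually hit.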
29th June 2009, 21:05 | #17 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Here's the reason that won't work.
Let's say our max B-frames is 16 but our current threshold is 2. We run B-adapt trellis and it tells us we should use 2 B-frames. So then we run it again with a threshold of 3. Now we run it and it tells us we should use 1 B-frame. What the heck? This is because it's a trellis algorithm; it picks the optimal series of frametypes taking into account the effect the current decision has on future frametype decisions. So if you tell b-adapt 2 that "we can only use 2 B-frames max in any minigop", it will give you a different decision than if you say "we can only use 3 B-frames max in any minigop"... and it could even give you fewer B-frames with the latter than the former. Now, in general, I think there are potentially good heuristics that are still "wrong" from a trellis sense, but IMO we should be careful to not lie to the trellis. |
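The effect described above, a larger limit yielding fewer B-frames in the current minigop, can be reproduced with a toy path search. Everything in this sketch is made up for illustration: the 6-frame window, the cost table and the penalty are invented numbers, and the dynamic program is a generic one, not x264's trellis or slicetype_frame_cost.

```c
#include <assert.h>
#include <limits.h>

#define N_FRAMES 6  /* toy window: frames still to be assigned to minigops */

/* Invented cost of a minigop covering `len` frames (len - 1 B-frames plus
 * the closing P-frame) that starts at frame `start`. */
static int seg_cost(int start, int len)
{
    static const int base[] = { 0, 10, 6, 8, 1 };
    int cost = base[len];
    if (start == 0 && len == 4)
        cost += 100;  /* pretend 3 B-frames predict badly right at the start */
    return cost;
}

/* Returns the number of B-frames in the first minigop of the cheapest way
 * to split the window, given a uniform per-minigop limit. */
int first_minigop_bframes(int max_bframes)
{
    int best[N_FRAMES + 1];    /* best[s]: cheapest cost for frames s..end */
    int choice[N_FRAMES + 1];  /* choice[s]: minigop length chosen at s */

    assert(max_bframes >= 0 && max_bframes <= 3);  /* cost table covers len <= 4 */

    best[N_FRAMES] = 0;
    choice[N_FRAMES] = 0;
    for (int start = N_FRAMES - 1; start >= 0; start--) {
        best[start] = INT_MAX;
        choice[start] = 0;
        for (int len = 1; len <= max_bframes + 1 && start + len <= N_FRAMES; len++) {
            int total = seg_cost(start, len) + best[start + len];
            if (total < best[start]) {
                best[start] = total;
                choice[start] = len;
            }
        }
    }
    return choice[0] - 1;
}
```

With these invented costs, a limit of 2 makes the cheapest split two minigops of 2 B-frames each, so the first minigop gets 2 B-frames. Raising the limit to 3 makes a short first minigop (1 B-frame) followed by a 3-B-frame minigop cheaper overall, so the first decision drops to 1 B-frame.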
30th June 2009, 15:55 | #20 | Link |
Registered User
Join Date: May 2006
Posts: 957
|
Avisynth, ffmpeg, mencoder.