Old 23rd January 2012, 15:11   #1  |  Link
Peterxy
Registered User
 
Join Date: Jan 2011
Posts: 18
sync-lookahead

hi,
I have a question about --sync-lookahead and what it does.
As I understand it, rc-lookahead is the number of frames that macroblock rate control looks ahead, and going higher than rc-lookahead 60 will not gain much (though it still seems to increase system RAM usage).
Setting sync-lookahead seems to affect threading somehow. Does it set up a frame buffer between lookahead and encoding, or does it do something else? Should sync-lookahead be left untouched?
cu peter

Last edited by Peterxy; 6th February 2012 at 11:24.
Old 23rd February 2013, 04:59   #2  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 2,442
Bump. I'm curious too.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Instant Video

My Compression Book

Amazon Instant Video is hiring! PM me if you're interested.
Old 2nd April 2013, 19:24   #3  |  Link
mogobime
Registered User
 
Join Date: Oct 2012
Posts: 44
Sync-lookahead seems a bit confusing to some people, including me. Is it related to --lookahead-threads, too? What I can find about sync-lookahead here isn't really up to date anymore.

What I understood is that it seems to have nothing to do with rc-lookahead (mbtree/vbv-lookahead).

Maybe someone can explain what exactly it does and how it differs from rc-lookahead.
__________________

X-QuaSaT - x264 Quality and Speed analyse Tool (multiple file analysation/average values/csv-output)
Benchmark-thread for x264 encoding with GPU-based OpenCL lookahead and DGDecNV GPU-based decoding
Old 2nd April 2013, 22:22   #4  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,393
Sync-lookahead is the number of frames in limbo in the queues connecting lookahead to the rest of the encoder. It has no effect on compression quality. Default value is as high as is useful to go. Reducing it saves some memory and latency at the cost of throughput, since lower values allow less leeway in when threads are scheduled.
Old 3rd April 2013, 00:11   #5  |  Link
mogobime
Registered User
 
Join Date: Oct 2012
Posts: 44
So is it correct if I say this is like a frame-buffer for vbv-lookahead/mbtree ratecontrol?
An example: I like to raise --lookahead-threads from 2 to 3 when working with --threads=12 and rc-lookahead=60.
Sometimes I also use opencl lookahead already.
So in this case could it be useful to raise sync-lookahead to prevent this buffer getting emptied too early?
Or did I misunderstand your explanation?

Thx,
mogobime
Last edited by mogobime; 3rd April 2013 at 00:14.
Old 3rd April 2013, 03:11   #6  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,393
Quote:
Originally Posted by mogobime View Post
So is it correct if I say this is like a frame-buffer for vbv-lookahead/mbtree ratecontrol?
Yes.

Quote:
An example: I like to raise --lookahead-threads from 2 to 3 when working with --threads=12 and rc-lookahead=60.
Is that still necessary after r2260 (2013-02-25) updated the default lookahead-threads?

Quote:
Sometimes I also use opencl lookahead already.
So in this case could it be useful to raise sync-lookahead to prevent this buffer getting emptied too early?
The only factors that should affect the maximum useful value of sync-lookahead are things that change the amount of variance in how much time it takes to encode and/or lookahead one frame vs another frame. You don't need to prevent the buffer from emptying early if it does so consistently: bad-performance-preventable-by-sync-lookahead only happens if it oscillates between empty and full.

The relevant options include bframes (which are already accounted for in the default sync-lookahead); but don't include lookahead-threads, because lookahead-threads only affects the absolute speed of lookahead and not the variance. I don't know whether it includes opencl, and even if it does, I don't know whether opencl increases or decreases the required sync-lookahead.

Otoh, if you have memory to spare and aren't in a latency-sensitive usecase, then there isn't really any cost to overestimating sync-lookahead. Otth, I think the default value is already a conservatively high estimate, so you probably don't need to further increase it even if you twiddle some related options.

Last edited by akupenguin; 3rd April 2013 at 03:17.
Old 3rd April 2013, 15:25   #7  |  Link
mogobime
Registered User
 
Join Date: Oct 2012
Posts: 44
OK, thx for those useful explanations
Last edited by mogobime; 3rd April 2013 at 18:39.
Old 3rd April 2013, 18:39   #8  |  Link
mogobime
Registered User
 
Join Date: Oct 2012
Posts: 44
To your question:
Quote:
Is that still necessary after r2260 (2013-02-25) updated the default lookahead-threads?
I'm currently using r2273; doesn't it still use the default divisor, lookahead-threads = threads/6?

For me, in most cases bumping lookahead-threads from 2 to 3 with threads=9-18 brought a small speed-up without a considerable reduction of SSIM, as you can see here.
I also did a quick test with crf20 and it seemed to be the same. But raising lookahead-threads to 4 almost always immediately reduced SSIM by about 0.025.
Last edited by mogobime; 3rd April 2013 at 20:17.
Old 3rd April 2013, 22:17   #9  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,393
No. r2260 changed to anywhere between threads/1 and threads/12 depending on an estimate of the relative speed of lookahead vs main encoder. (Still only an estimate, based on a few of the most important parameters)

Last edited by akupenguin; 3rd April 2013 at 22:20.
Old 3rd April 2013, 23:47   #10  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 5,562
I looked up what changed, and from the looks of it, this:
Code:
    if( h->param.i_lookahead_threads == X264_THREADS_AUTO )
    {
        if( h->param.b_sliced_threads )
            h->param.i_lookahead_threads = h->param.i_threads;
        else
        {
            /* If we're using much slower lookahead settings than encoding settings, it helps a lot to use
             * more lookahead threads.  This typically happens in the first pass of a two-pass encode, so
             * try to guess at this sort of case.
             *
             * Tuned by a little bit of real encoding with the various presets. */
            int badapt = h->param.i_bframe_adaptive == X264_B_ADAPT_TRELLIS;
            int subme = X264_MIN( h->param.analyse.i_subpel_refine / 3, 3 ) + (h->param.analyse.i_subpel_refine > 1);
            int bframes = X264_MIN( (h->param.i_bframe - 1) / 3, 3 );

            /* [b-adapt 0/1 vs 2][quantized subme][quantized bframes] */
            static const uint8_t lookahead_thread_div[2][5][4] =
            {{{6,6,6,6}, {3,3,3,3}, {4,4,4,4}, {6,6,6,6}, {12,12,12,12}},
             {{3,2,1,1}, {2,1,1,1}, {4,3,2,1}, {6,4,3,2}, {12, 9, 6, 4}}};

            h->param.i_lookahead_threads = h->param.i_threads / lookahead_thread_div[badapt][subme][bframes];
            /* Since too many lookahead threads significantly degrades lookahead accuracy, limit auto
             * lookahead threads to about 8 macroblock rows high each at worst.  This number is chosen
             * pretty much arbitrarily. */
            h->param.i_lookahead_threads = X264_MIN( h->param.i_lookahead_threads, h->param.i_height / 128 );
        }
    }
source: http://git.videolan.org/?p=x264.git;...b808f9ae3189e1
should be the interesting part, for those who want to know how x264 calculates the lookahead threads now
__________________
Hybrid here in the forum, homepage
Old 6th April 2013, 20:07   #11  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 5,562
Is it intended that slower uses fewer lookahead threads than slow? (Did I make a mistake?)

slow:
uses b-adapt 2 -> badapt = 1
uses subme 8 -> subme = 3
uses bframes 3 -> bframes = 0
I use threads 12 -> i_threads = 15
lookahead_thread_div[1][3][0] = 6
-> i_lookahead_threads = 2
height: 352
-> i_lookahead_threads = 2


slower:
uses b-adapt 2 -> badapt = 1
uses subme 9 -> subme = 4
uses bframes 3 -> bframes = 0
I use threads 12 -> i_threads = 15
lookahead_thread_div[1][4][0] = 12
-> i_lookahead_threads = 1
height: 352
-> i_lookahead_threads = 1

Note that with preset slower the height only influences the lookahead threads if it is below 128, because the divisor already brings the count down to 1.

Cu Selur
Last edited by Selur; 6th April 2013 at 20:15.
Old 6th April 2013, 21:02   #12  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,269
In the code you posted above, "subme" is a parameter for the thread divider, and it changed in slower, so... that's it?
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 6th April 2013, 22:05   #13  |  Link
Rumbah
Registered User
 
Join Date: Mar 2003
Posts: 453
Quote:
Originally Posted by Selur View Post
Is it intended that slower uses fewer lookahead threads than slow? (Did I make a mistake?)
I guess the idea was that with slower presets lookahead isn't the bottleneck anymore like it can be in faster presets.
__________________
x264 full help - x264 --fullhelp r2345
Cuttermaran HCEnc provider - Support for HCEnc in Cuttermaran
DualDVDRB - Dual core support for DVD-RB free
Old 7th April 2013, 02:07   #14  |  Link
mogobime
Registered User
 
Join Date: Oct 2012
Posts: 44
Quote:
I guess the idea was that with slower presets lookahead isn't the bottleneck anymore like it can be in faster presets.
No, it's not like that. As you can see here in the section "720p testing/Benchmarks of all x264 presets", x264's automatic selection sets lookahead-threads to 5 for preset placebo, for example.
Last edited by mogobime; 7th April 2013 at 02:10.
Old 7th April 2013, 04:55   #15  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,690
Lookahead can become a bottleneck again at veryslow/placebo because --bframes gets much higher, so the trend reverses again there.
Old 7th April 2013, 07:33   #16  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 5,562
Okay, thanks for clearing that up.

Just wanted to make sure it was intended, since it seemed kind of counterintuitive that the number of lookahead threads went down for slower.
medium: 2
slow: 2
slower: 1
veryslow: 2
placebo: 2

Cu Selur
Old 7th April 2013, 15:30   #17  |  Link
mogobime
Registered User
 
Join Date: Oct 2012
Posts: 44
@Dark Shikari
For me OpenCL-lookahead doesn't seem to be slowed down the same way as the regular lookahead when using more bframes as you can see here (added a test with lookahead-threads chosen by x264) and here and here.

Also, subme 7 with ref=6 and bframes=5 seems to be really fast in my tests.
I guess it's the same with raising ref...
I now always modify ref/bframes in the x264 presets when I use OpenCL lookahead.
Last edited by mogobime; 7th April 2013 at 15:38.
Old 7th April 2013, 18:57   #18  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,690
That's probably because your GPU is powerful enough that it has resources free to do more analysis, so higher --bframes just uses more of the GPU instead of going slower. But lookahead-threads applies to the CPU logic anyways, not the GPU implementation, as far as I know.
Old 7th April 2013, 21:30   #19  |  Link
mogobime
Registered User
 
Join Date: Oct 2012
Posts: 44
Maybe even more important is the speed of the VRAM, because GPU load is never above 50% (and even 50% load is really rare).

Do you think it is generally useful to raise ref + bframes at lower subme modes like 7 or 9, or are there negative side effects?
On my system, subme 7 with ref=6 and bframes=5, for example, is almost 15% faster with OpenCL lookahead activated.
Last edited by mogobime; 7th April 2013 at 22:14.