Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
15th October 2008, 23:40 | #1 | Link |
Registered User
Join Date: Aug 2008
Posts: 16
|
FFMPEG, x264, multi-core
Hi,
I have a fresh build of FFMPEG and x264 that I built today. I have noticed 2 things (I don't think anything changed in my most recent build - I just wanted to mention that I am on the most recent code). I am doing 2 pass encoding on a 2 core machine.
1) It seems that the first pass of my encode only uses 1 core regardless of what I set -threads to. Is this an FFMPEG issue? An x264 issue? An issue with a setting I am using?
2) On the 2nd pass, in order to fully load the machine I need to use 4 threads on a 2 core machine. What is the "optimal" number of threads to use for encoding based on the number of cores in the machine?
Thanks |
16th October 2008, 02:57 | #2 | Link |
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
number of cores * 1.5
However, it's highly probable you can't fill the cores in the first pass if you specified a high number of b-frames along with the b-adapt 2 option.
__________________
MPEG-4 ASP Custom Matrices: EQM V1(old), EQM AutoGK Sharpmatrix (aka EQM V2), EQM V3HR (updated 01/10/2004), EQM V3LR, EQM V3ULR (updated 04/02/2005), EQM V3UHR (updated 17/12/2004) and EQM V3EHR (updated 05/10/2004) Info about my ASP matrices. MPEG-4 AVC Custom Matrices: EQM AVC-HR Info about my AVC matrices My x264 builds. Mooo!!! |
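The "number of cores * 1.5" rule of thumb given above can be turned into a small shell helper. This is only a sketch: the function name `threads_for_cores` is made up for illustration, the result is rounded down, and core detection through `getconf _NPROCESSORS_ONLN` is an assumption about the host system.

```shell
#!/bin/sh
# threads_for_cores CORES
# Hypothetical helper (not part of ffmpeg or x264): applies the
# "cores * 1.5" rule of thumb using integer arithmetic, rounded down,
# and never returns less than 1.
threads_for_cores() {
    cores=$1
    threads=$(( cores * 3 / 2 ))
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    echo "$threads"
}

# Assumption: getconf reports the number of online cores on this system.
# Fall back to 1 if it does not.
CORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
NUM_THREADS=$(threads_for_cores "$CORES")
echo "cores=$CORES threads=$NUM_THREADS"
```

The resulting `NUM_THREADS` value could then be passed to the encoder through `-threads $NUM_THREADS`.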
16th October 2008, 05:54 | #3 | Link |
Registered User
Join Date: Aug 2008
Posts: 16
|
I am using -bf 16 and -badapt 2. However I tried -badapt 1 and it still only used 1 core (though the FPS went way up). Is it the high b-frame # that causes the lack of parallelism? I don't understand what the encoder is doing and whether it could become more parallel. Now with -badapt 2, which I would like to make use of for better quality, it would be cool to be able to load the system during the first pass. Is this technically possible (just curious)?
|
16th October 2008, 06:46 | #4 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
|
|
16th October 2008, 17:01 | #5 | Link |
Registered User
Join Date: Aug 2008
Posts: 16
|
What is the quality tradeoff for bf16/b-adapt1 vs. bf3/b-adapt2?
Also, as I said, even when using bf16/b-adapt1, I only saw 1 core being used. Is this because my quality settings are low enough for the 1st pass that 16 bframes becomes a parallelism blocker even though the decision mechanism is faster? |
16th October 2008, 17:11 | #6 | Link |
Compiling Encoder
Join Date: Jan 2007
Posts: 1,348
|
Is x264 giving a warning similar to the following?
x264 [warning]: not compiled with pthread support!
If it is, it means you didn't give it the ability to multithread when you compiled it, which would explain why it is only working with 1 thread no matter what you put for --threads x. Of course, this error only happens if you used the x264 cli, not ffmpeg with libx264... I don't recall offhand what errors ffmpeg has for attempting threading w/o thread support... Last edited by kemuri-_9; 16th October 2008 at 17:16. |
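A quick way to check a saved encoder log for the warning quoted above is to search for its text. This is a minimal sketch: the helper name `has_pthread_warning` is hypothetical, and it assumes the log (including stderr) is piped in on standard input.

```shell
#!/bin/sh
# has_pthread_warning
# Hypothetical helper: reads a log on stdin and succeeds (exit status 0)
# if it contains the x264 "not compiled with pthread support" warning.
has_pthread_warning() {
    grep -q 'not compiled with pthread support'
}

# Example with the warning text from the post above.
if echo 'x264 [warning]: not compiled with pthread support!' | has_pthread_warning; then
    echo "x264 was built without thread support"
fi
```

In practice the encoder output would be piped in instead of the `echo` line, with stderr redirected into the pipe using `2>&1`.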
16th October 2008, 17:15 | #7 | Link | |
Registered User
Join Date: Aug 2008
Posts: 16
|
Quote:
|
|
16th October 2008, 17:24 | #8 | Link |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
With "--b-adapt 1" and "--bframes 16" the encoder will not use more than 2 consecutive b-frames very often.
So "--b-adapt 2" with "--bframes 3" should produce a better result in fact. More than "--bframes 5" with "--b-adapt 2" is usually overkill. That's because "--bframes" only limits the maximum number of consecutive b-frames. It still depends on the b-frame decision method how many consecutive b-frames will actually be used. The "old" (fast) method is fast, but tends to use very few b-frames. The "new" (slow) method chooses the optimal number of b-frames. Hence it's better to use the "new" method and use a sane "--bframes" value...
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 16th October 2008 at 17:31. |
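The advice above (use the new b-frame decision method with a sane maximum, since more than 5 is usually overkill) could be expressed as a small helper that builds the matching ffmpeg options. This is a sketch only: the function name `bframe_opts` is hypothetical, the cap of 5 is simply the figure mentioned in the post, and `-bf` and `-b_strategy` are the ffmpeg option names used elsewhere in this thread.

```shell
#!/bin/sh
# bframe_opts REQUESTED
# Hypothetical helper: clamps the requested maximum number of consecutive
# b-frames to 5 and selects the "new" decision method (-b_strategy 2).
bframe_opts() {
    requested=$1
    max=5
    if [ "$requested" -gt "$max" ]; then
        requested=$max
    fi
    echo "-bf $requested -b_strategy 2"
}

bframe_opts 16
bframe_opts 3
```

The printed string is meant to be spliced into an ffmpeg command line in place of hand-written `-bf` and `-b_strategy` values.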
16th October 2008, 17:44 | #9 | Link |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
As DS has mentioned previously, the old b-frame decision method was far from perfect, and setting it to 16 b-frames didn't represent any benefit above a setting of 4. I think there was a bug at one stage which gave the illusion that there was! With the new method, having 5 b-frames optioned is more efficient than having 16 b-frames in the old method, and in fact even though setting a higher number offers diminishing returns, even setting at 3 represents better b-frame utilisation in most circumstances when you discount the aforementioned bug.
|
16th October 2008, 19:23 | #10 | Link |
Registered User
Join Date: Aug 2008
Posts: 16
|
I changed my settings to use -bf 5 and -b_strategy 2, and noticed a few things:
1) I am still not really seeing much parallelism on the 1st pass. I'm going to go under the assumption that the 1st pass is mostly working on b-frame decision and as such there is not much parallelism possible.
2) Now I am seeing the following warning at the very end of the 2nd pass:
[libx264 @ 0x17593170]2nd pass has more frames than 1st pass (726).2kbits/s
[libx264 @ 0x17593170]continuing anyway, at constant QP=18
[libx264 @ 0x17593170]disabling adaptive B-frames
[libx264 @ 0x17593170]2nd pass has more frames than 1st pass (726)
[libx264 @ 0x17593170]continuing anyway, at constant QP=18
[libx264 @ 0x17593170]disabling adaptive B-frames
[libx264 @ 0x17593170]2nd pass has more frames than 1st pass (726)
[libx264 @ 0x17593170]continuing anyway, at constant QP=18
[libx264 @ 0x17593170]disabling adaptive B-frames
Note that this message is repeated based on the number of threads in use. In this case I am using 3 threads. If I use 1 thread I see the message once. Any ideas? |
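Because the same libx264 messages are printed once per thread, it can help to collapse the log into one line per distinct message with a count. The following is a rough sketch: `summarize_x264_log` is a hypothetical name, and it assumes every line of interest starts with a `[libx264 @ 0x...]` prefix like the ones quoted above.

```shell
#!/bin/sh
# summarize_x264_log
# Hypothetical helper: reads a log on stdin, keeps only lines that start
# with a "[libx264 @ 0x...]" prefix, strips that prefix, and prints each
# distinct message once, preceded by how many times it occurred.
summarize_x264_log() {
    sed -n 's/^\[libx264 @ 0x[0-9a-fA-F]*]//p' \
        | awk '{ count[$0]++ }
               END { for (msg in count) printf "%d\t%s\n", count[msg], msg }' \
        | sort
}

# Example using one of the messages quoted above, repeated for 3 threads.
printf '%s\n' \
    '[libx264 @ 0x17593170]disabling adaptive B-frames' \
    '[libx264 @ 0x17593170]disabling adaptive B-frames' \
    '[libx264 @ 0x17593170]disabling adaptive B-frames' \
    | summarize_x264_log
```

A count equal to the `-threads` value suggests a single underlying warning reported by each thread, not several separate problems.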
16th October 2008, 19:29 | #11 | Link | ||
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
Quote:
|
||
16th October 2008, 19:44 | #12 | Link | |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
Quote:
http://avidemux.org/admForum/viewtop...d=29706#p29706
|
|
16th October 2008, 19:57 | #13 | Link |
Registered User
Join Date: Aug 2008
Posts: 16
|
Apart from the FFMPEG issue, even if I set -bf 0 I still see barely any parallelism on the 1st pass (I can tell it uses the 2nd core, but very little). Here is my encode script if anyone has any further input on why I can't get parallelism on the 1st pass.
nice -n 10 ffmpeg -t 30 -i $1 -croptop $5 -cropbottom $6 -cropleft $7 -cropright $8 \
    -s $3 -y -an -pass 1 -vcodec libx264 -threads $NUM_THREADS \
    -b ${BIT_RATE}k -maxrate ${BIT_RATE}k -bufsize ${BUF_SIZE}k -rc_init_occupancy ${BUF_INIT}k -flags +loop \
    -cmp +chroma -partitions +parti4x4+partp8x8+partb8x8 -me_method epzs -subq 1 -trellis 0 \
    -refs 1 -bf 0 -b_strategy 0 -coder 1 -me_range 16 -g 250 -keyint_min 25 -sc_threshold 40 \
    -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 $2

nice -n 10 ffmpeg -t 30 -i $1 -croptop $5 -cropbottom $6 -cropleft $7 -cropright $8 -s $3 -y \
    -acodec libfaac -ab 96k -ar 48000 -pass 2 -vcodec libx264 -threads $NUM_THREADS \
    -b ${BIT_RATE}k -maxrate ${BIT_RATE}k -bufsize ${BUF_SIZE}k -rc_init_occupancy ${BUF_INIT}k \
    -flags +loop -cmp +chroma -partitions +parti8x8+parti4x4+partp8x8+partp4x4+partb8x8 \
    -flags2 +dct8x8+wpred+bpyramid+mixed_refs -me_method umh -subq 8 -trellis 1 -refs 6 -bf 0 \
    -directpred 3 -b_strategy 0 -bidir_refine 1 -coder 1 -me_range 16 -g 250 \
    -keyint_min 25 -sc_threshold 40 -i_qfactor 0.71 -rc_eq 'blurCplx^(1-qComp)' -qcomp 0.6 \
    -qmin 10 -qmax 51 -qdiff 4 $2
|
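To put a number on "barely any parallelism", one option is to compare CPU time against wall-clock time for each pass: (user + sys) / real is roughly the average number of cores kept busy. The sketch below assumes bash (it relies on the `time` keyword and `TIMEFORMAT`) and a locale that prints decimals with a dot. Both helper names, `cores_used` and `measure`, are hypothetical.

```shell
#!/bin/bash
# cores_used REAL USER SYS
# Hypothetical helper: prints (USER + SYS) / REAL, the average number of
# cores kept busy while a command ran. All values are in seconds.
cores_used() {
    awk -v r="$1" -v u="$2" -v s="$3" \
        'BEGIN { if (r > 0) printf "%.2f\n", (u + s) / r; else print "0.00" }'
}

# measure CMD [ARGS...]
# Hypothetical wrapper: runs CMD under bash's `time` keyword, discards the
# command's own output, and prints the busy-core average for that run.
measure() {
    local TIMEFORMAT='%R %U %S'
    local timing
    timing=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
    # $timing is left unquoted on purpose so it splits into three arguments.
    cores_used $timing
}

# Example with made-up timings: 10 s wall clock, 15 s user, 1 s system.
cores_used 10 15 1
```

On a 2 core machine, a result near 1.00 for a pass would indicate that only one core was doing work, while a result near 2.00 would indicate both were loaded. A full pass could be measured by putting `measure` in front of the command, for example `measure ffmpeg ...` with the options from the script above.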
16th October 2008, 20:16 | #14 | Link | |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
Quote:
Also the "-rc_eq" option is obsolete, I think. The RC equation is hardcoded now.
Last edited by LoRd_MuldeR; 16th October 2008 at 20:19. |
|
16th October 2008, 20:24 | #15 | Link |
Registered User
Join Date: Aug 2008
Posts: 16
|
The source could definitely be the issue. In this case it is 1080x720 H264. Honestly the threading model of FFMPEG confuses me. I think you can use the -threads option for both the decoder and the encoder. I am pretty sure that the H264 decoder included in FFMPEG is now multithreaded, however when I added -threads 3 to the input I saw little difference. I'm not sure if this is because the decoder is not efficient.
Should I also be using cores * 1.5 for the # of input threads? Also, in the 2nd pass, does audio happen on its own thread? You would think that FFMPEG would be able to determine the appropriate # of threads for the operation at hand automatically, but that is probably a question for a different forum. |
16th October 2008, 20:34 | #16 | Link |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
AFAIK ffmpeg's H.264 decoder had at least two different methods of multi-threading in the past:
Slice-based multi-threading (works with sliced sources only) and CABAC-based multi-threading (works with all sources that use CABAC, but not very efficient). I think Frame-based multi-threading (similar to x264's implementation) for ffmpeg's H.264 decoder is under development, but not ready yet... (And I think the "threads = cores * 3/2" formula applies to x264 only)
Last edited by LoRd_MuldeR; 16th October 2008 at 20:38. |
16th October 2008, 20:37 | #17 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
|
|
16th October 2008, 20:40 | #18 | Link |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
I see. So unless MattKB's source is sliced or he is using an experimental ffmpeg build from MT-branch, there's no multi-threading at all for the source/decoder.
|
16th October 2008, 20:54 | #19 | Link | |
Registered User
Join Date: Aug 2008
Posts: 16
|
Quote:
In either case you would think 1 core could be used for decoding and the other core for encoding, with a shared memory buffer between them. But I have no idea how FFMPEG is implemented, and this is more of an intellectual exercise, so I give up! |
|
|
|