Multithreading, frameserving and x264
Daodan
15th March 2006, 15:42
On a dual-core CPU with a CPU-intensive AVS script, very little of the second core gets used, even with multithreading enabled in x264. Is there a way to run the frameserving on one core and the encoding on the other? I think that would be the best way to get maximum speed out of a dual core.
So can this be done? MT for plugins helps, but only a little, and encoding to a lossless intermediate first is also difficult (I calculated that I would need a 300 GB partition for my movie with HuffYUV).
Thank you.
Doom9
15th March 2006, 19:25
It's a bad thing to say, but use VirtualDub and you shall have what you want. Keep in mind, though, that for optimal use decoding would need to take exactly the same amount of CPU time as encoding, and it's impossible to really control that.
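A rough way to see why that balance matters (a simplified model, not a measurement): if frameserving and encoding each run on their own core, the pipeline moves at the pace of the slower stage. The names `t_decode` and `t_encode` below are hypothetical per-frame times in seconds, and the model ignores synchronization overhead.

```python
def pipeline_speedup(t_decode, t_encode):
    """Ideal speedup from running decode and encode on separate cores.

    Assumption: one thread per stage, no synchronization cost.
    Serial time per frame is the sum of both stages; pipelined time
    per frame is whichever stage is slower.
    """
    serial = t_decode + t_encode
    pipelined = max(t_decode, t_encode)
    return serial / pipelined


# Perfectly balanced stages give the full 2x; anything else gives less.
print(pipeline_speedup(0.5, 0.5))
print(pipeline_speedup(0.75, 0.25))
```

In this model the second core only pays off fully when both stages cost the same, which is the point made above.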
akupenguin
15th March 2006, 19:34
We plan to support multithreaded avs reading in x264cli. Of course, I can't tell when it'll be ready.
Daodan
16th March 2006, 13:09
It's a bad thing to say, but use VirtualDub and you shall have what you want. Keep in mind, though, that for optimal use decoding would need to take exactly the same amount of CPU time as encoding, and it's impossible to really control that.
I'm sorry, but I can't make it work with VirtualDub either (I tried doing my encode with Xvid, since I'm using a high bitrate anyway). Only 50% of the CPU is used. Are there any hidden settings for this? (An uncompressed sample gets 10 fps in the first pass; with AVS I get only 0.96 :( )
hwti
16th March 2006, 19:08
We plan to support multithreaded avs reading in x264cli. Of course, I can't tell when it'll be ready.
So currently AVS reading is done in the same thread as video encoding.
That would explain my problem:
http://forum.doom9.org/showthread.php?p=800368#post800368
bond
17th March 2006, 15:49
http://students.washington.edu/lorenm/src/x264/x264_avs_thread.1.diff
It seems pengvado is working on this goodie.
foxyshadis
18th March 2006, 16:35
Now that the patch has been committed, I'm curious about one detail: is --threads 2 now one source thread and one encoder thread? Or one source thread in addition to the two encoder threads?
woah!
18th March 2006, 20:43
r473 works great: going from 23.40 fps to 39.50 fps is a great speedup.
--ref 3 --bframes 3 --filter -4,-4 --subme 4 --analyse all --8x8dct --direct auto --me umh --no-fast-pskip --progress --no-psnr
encoded 3601 frames, 23.40 fps, 714.16 kb/s
--threads 2 --ref 3 --bframes 3 --filter -4,-4 --subme 4 --analyse all --8x8dct --direct auto --me umh --no-fast-pskip --progress --no-psnr
encoded 3601 frames, 38.65 fps, 718.34 kb/s
--threads 4 --ref 3 --bframes 3 --filter -4,-4 --subme 4 --analyse all --8x8dct --direct auto --me umh --no-fast-pskip --progress --no-psnr
encoded 3601 frames, 39.50 fps, 720.37 kb/s
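For reference, the relative speedups implied by the three runs above can be worked out with a few lines of arithmetic. The fps figures are copied from the results; the helper name `speedups` is just for illustration.

```python
def speedups(baseline_fps, results):
    """Return each result's fps as a multiple of the baseline fps."""
    return {option: fps / baseline_fps for option, fps in results.items()}


baseline_fps = 23.40  # run without --threads
threaded = {"--threads 2": 38.65, "--threads 4": 39.50}

for option, ratio in speedups(baseline_fps, threaded).items():
    print(f"{option}: {ratio:.2f}x")
# prints:
# --threads 2: 1.65x
# --threads 4: 1.69x
```

Going from 2 to 4 threads adds very little here, which fits a machine with two cores.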
Daodan
20th March 2006, 12:16
Great! I didn't expect to see this so soon. I finally get over 1 fps in the second pass, definitely better than with multithreaded x264 alone. The strange thing is that when I use 2 threads for x264 in addition to the one for AVS (since I still had only around 70% CPU usage), I get almost 100% usage but the speed decreases a bit.
As I said, great addition if you have a complicated avs script and don't have the space for lossless.
hwti
21st March 2006, 10:55
With an AVS script using FFT3dGPU (takes time, but no CPU):
no thread options -> 15.55 fps
--thread-input -> 16.25 fps
--threads 2 -> 23.59 fps
--threads 2 --thread-input -> 23.73 fps
I only have a single-core CPU, so I guess the input thread doesn't do prefetching (asking AviSynth for a new frame while the encoder thread processes the previous one).
It would be useful to have --thread-input do prefetching like virtualdub.
Manao
21st March 2006, 14:05
It would be useful to have --thread-input do prefetching like virtualdub.
Do a little search. You'll find out that it has been there for a long time now. It was actually the only advantage of vdub over x264 cli.
hwti
21st March 2006, 16:54
Do a little search. You'll find out that it has been there for a long time now. It was actually the only advantage of vdub over x264 cli.
But why doesn't x264 CLI use 100% CPU with only --thread-input and FFT3dGPU?
The CPU time not used by FFT3dGPU while reading frame n+1 from AVS should be used by x264 to encode frame n.
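The overlap described here (fetch frame n+1 while frame n is being encoded) is the classic producer/consumer prefetch pattern. Below is a minimal sketch of that pattern in Python. It is an illustration only, not x264's actual implementation; `read_frame`, `encode_frame`, and the `depth` parameter are hypothetical stand-ins.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the stream


def prefetching_reader(read_frame, num_frames, depth=1):
    """Yield frames in order while a background thread reads ahead.

    `depth` is how many frames may be buffered ahead of the consumer.
    """
    buffer = queue.Queue(maxsize=depth)

    def worker():
        try:
            for n in range(num_frames):
                buffer.put(read_frame(n))  # blocks while the buffer is full
        finally:
            # Always signal the end so the consumer cannot hang,
            # even if read_frame raised (the stream then ends early).
            buffer.put(_END)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        frame = buffer.get()  # blocks until the next frame is ready
        if frame is _END:
            return
        yield frame


def encode_all(read_frame, encode_frame, num_frames):
    """Encode every frame; reading frame n+1 overlaps encoding frame n."""
    return [encode_frame(f) for f in prefetching_reader(read_frame, num_frames)]
```

With a source that mostly waits (on a disk or a GPU) rather than computes, the encoder can keep the CPU busy while the reader thread is blocked, so this kind of overlap can help even on a single core.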
Romario
21st March 2006, 17:02
Can someone post a little guide to the --thread-input option?
The latest MeGUI build doesn't have an option for this.
Pomyk
21st March 2006, 17:08
You just use it or you don't. No need for a guide.
Romario
21st March 2006, 17:16
Yes, I know that. But can I use --thread-input on a single-core machine, an Athlon XP 3200+?
What about speed-ups on a single core?
Doom9
21st March 2006, 18:23
As far as MeGUI goes, take a look at the changelog:
0.2.3.2114 19 March 2006
Commit by Sharkx1976:
- Added support for --thread-input x264 option in CommandLineGenerator.cs
Note: it gets automatically added for --threads > 1.
On top of that, MeGUI automatically sets the proper number of threads, so basically you don't need to know or worry about anything (except when you have Win9x; I added an internal exception just for those operating systems :)
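The rule from the changelog note can be sketched as follows. This is a hypothetical illustration of the rule only, not MeGUI's code (MeGUI's generator is the C# file named above); the function name `x264_thread_args` is made up, and how MeGUI picks the thread count is not shown.

```python
def x264_thread_args(threads):
    """Build the thread-related x264 arguments for a given thread count.

    Assumption: --thread-input is appended whenever threads > 1,
    mirroring the changelog note; everything else is illustrative.
    """
    args = ["--threads", str(threads)]
    if threads > 1:
        args.append("--thread-input")
    return args


print(" ".join(x264_thread_args(2)))  # prints: --threads 2 --thread-input
```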
can I use --thread-input on single core machine, Athlon XP 3200+.
Why would you? Imagine, if you split up a job into two threads, the OS will have more work with synchronization.. so you'll only slow things down.
hwti
22nd March 2006, 09:53
Why would you? Imagine, if you split up a job into two threads, the OS will have more work with synchronization.. so you'll only slow things down.
With FFT3dGPU in the avs script, it can be useful to use --thread-input and --threads 2 even with a single core CPU.
I don't understand what --thread-input does, since if it were prefetching, I wouldn't get any speedup from --threads 2 on a single-core CPU.
Edit: It seems --thread-input does some prefetching, but not enough: I get from 16 to 21 fps (yes, it is sometimes faster, sometimes slower), versus 15 fps without it and 23 with --threads 2.
akupenguin
22nd March 2006, 09:54
--threads 2 is never useful on a single core.
--thread-input does prefetch.
I have not tested fft3dgpu, but I have tested plain yuv input (where, for fast x264 settings, encoding is limited by hard drive speed). And that does benefit from --thread-input on a single core, and then becomes (slightly) slower with "--thread-input --threads 2".
hwti
22nd March 2006, 10:04
Why don't I get 100% CPU usage with --thread-input? If it does prefetching, all the CPU time not used by FFT3dGPU while prefetching a frame should be used by the encoder for the previous frame.
Why is --threads 2 faster on my single-core CPU in this case (A64, so no HT)?
Maybe I did something wrong, but with VirtualDub I always get 100% CPU with the same AVS script (not the same x264 settings, since the VfW GUI doesn't have all the options).
no thread options -> 15.55 fps
--thread-input -> 16 fps to 21 fps (it isn't always the same)
--threads 2 -> 23.59 fps
--threads 2 --thread-input -> 23.73 fps
Sharktooth
22nd March 2006, 13:39
more threads on a single core/non HT CPU can only SLOW THINGS DOWN unless something's screwed with your system.
berrinam
23rd March 2006, 11:31
more threads on a single core/non HT CPU can only SLOW THINGS DOWN unless something's screwed with your system.
Only true if the CPU is the bottleneck. Input prefetch can be faster even on one-core systems, because the encoding process doesn't need to wait for the hard drive access required to serve the video.
On that topic, is there any real situation in which someone wouldn't want to use --thread-input?
akupenguin
23rd March 2006, 12:48
On that topic, is there any real situation in which someone wouldn't want to use --thread-input?
If input is piped from another program. That's inherently multithreaded, and the data to read is already in memory, so --thread-input will only increase the syncing overhead.
Or if the avisynth script is really cpu-bound.
hwti
23rd March 2006, 13:51
akupenguin> Have you tested with FFT3dGPU ?
I don't think I have a problem with my system.
Is there any difference between the prefetching done by x264 CLI's --thread-input and VirtualDub's that would explain why I need --threads 2 with the former to use 100% CPU?