Maximum number of threads to use with x264
Lele-brz
14th November 2008, 17:16
Hi,
Is there a maximum number of threads to use when encoding with x264?
I noticed (though I'm still not sure it's the cause) that on a 24-core machine with threads set to auto, 36 threads are used.
When encoding like this, I see very blocky frames at the beginning of the encode.
After a while the video is ok.
Thanks for any tips on that
x264 rev number 987
cogman
14th November 2008, 17:50
I don't believe that x264 has any set limit on the number of threads it can use. I don't know why you would see blocking at the beginning. Does it change when you use fewer threads?
bob0r
14th November 2008, 17:54
x264\common\common.h
#define X264_THREAD_MAX 128
There is a limit, not sure if 36 threads would hurt quality that much, if any at all.
cogman
14th November 2008, 17:57
x264\common\common.h
#define X264_THREAD_MAX 128
There is a limit, not sure if 36 threads would hurt quality that much, if any at all.
Is that just an arbitrary number? I mean, right now it is big enough that the average user won't hit it for at least a couple of years, but is there a reason for using 128 and not, say, 256 or 512?
bob0r
14th November 2008, 17:59
I think it is indeed.
pengvado?
video_magic
14th November 2008, 18:00
From:
http://mewiki.project357.com/wiki/X264_Settings
"....
The quality loss from multiple threads is mostly negligible unless using very high numbers of threads (say, above 16).
The default setting provides more or less optimum speed. If you want to reduce the quality loss, use one thread (Not Recommended, unless all other settings are already maxed)...
"
I've also wondered how much, and in what way, quality loss comes from using more than one thread (to speed up encoding). From what I have read it's something to do with it encoding in rows of blocks.
Lele-brz
14th November 2008, 18:32
Yes, using just 6 threads instead of auto (which was using 36) seemed to fix the problem, but of course it increased the encoding time.
buzzqw
14th November 2008, 18:37
The "auto" thread count is cpu*1.5, so 24+12=36.
BHH
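As a rough illustration of the rule above (a sketch, not x264's actual source), the auto thread count can be modelled as cores*1.5, capped at the X264_THREAD_MAX value of 128 quoted earlier in the thread. The function name `auto_threads` and the round-down behaviour are assumptions made for this example.

```python
X264_THREAD_MAX = 128  # limit quoted from common.h earlier in the thread


def auto_threads(cores: int) -> int:
    """Hypothetical helper: cores * 1.5, rounded down, capped at the limit."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return min(cores * 3 // 2, X264_THREAD_MAX)


print(auto_threads(24))  # 36, matching the 24-core machine in the first post
```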
Chengbin
14th November 2008, 21:42
Why is there quality loss when you use more threads? In a few years we are going to have 16-32 cores and we are going to get horrible looking encodes because we're using a lot of threads?
Dark Shikari
14th November 2008, 21:45
Why is there quality loss when you use more threads? In a few years we are going to have 16-32 cores and we are going to get horrible looking encodes because we're using a lot of threads?
You generally don't want (vertical resolution of video) / (threads) to be lower than 40-50, and definitely not lower than ~30. You can afford to let this number be a bit lower (maybe 25-50% lower) if you use B-frames.
It's more of a problem in VBV mode than not.
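The guideline above is simple arithmetic, so here is a small sketch of it. The helper names and the default of 40 rows are assumptions chosen for illustration; 576 and 1080 are just example heights, and the B-frame allowance is not modelled.

```python
def rows_per_thread(height: int, threads: int) -> float:
    """Vertical resolution divided by thread count (the number to keep high)."""
    return height / threads


def max_threads(height: int, min_rows: int = 40) -> int:
    """Largest thread count that keeps height / threads at or above min_rows."""
    return max(1, height // min_rows)


print(rows_per_thread(576, 36))  # 16.0, far below the ~30 floor
print(max_threads(576))          # 14
print(max_threads(1080, 30))     # 36
```

By this estimate, 36 threads on a 576-line video leaves only 16 lines per thread, which would be consistent with the blockiness reported in the first post.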
video_magic
14th November 2008, 22:16
FYI I use threads 2
I have a Hyper-threading P4
I do not notice any quality loss (to my eyes of course, not using any measuring programs) compared with using threads=1
And I do not fully understand what Dark Shikari is actually recommending in his post above. Would he explain it a bit more simply (what he recommends) to us normal people?
Thanks
:)
Dark Shikari
14th November 2008, 22:27
And I do not fully understand what Dark Shikari is actually recommending in his post above. Would he explain it a bit more simply (what he recommends) to us normal people?
Normal people don't have to care, because normal people don't have 16 cores. :p
poisondeathray
14th November 2008, 22:41
Just wondering, Dark Shikari - does the (1.5* number of cores) guideline apply to Bloomfield chips in your testing? Does 4physical + 4virtual cores make a difference for the optimum for setting thread count as opposed to say, 8 physical cores on a current Harpertown setup?
video_magic
14th November 2008, 22:43
Okay man. Like I said, threads=2 looks as good as threads=1 to me, so once again, I appreciate all the developers' work (and perhaps it doesn't matter if I remain in ignorance on this matter! :) I probably won't upgrade for a few years anyway!)
Dark Shikari
14th November 2008, 22:45
Just wondering, Dark Shikari - does the (1.5* number of cores) guideline apply to Bloomfield chips in your testing? Does 4physical + 4virtual cores make a difference for the optimum for setting thread count as opposed to say, 8 physical cores on a current Harpertown setup?
From a quick test on Nehalem, HT really doesn't help at all anyways. If more detailed testing continues to show this, we might modify x264 to detect physical instead of virtual cores.
video_magic
14th November 2008, 23:03
Don't know if you meant this for me (a P4 HT user), but using threads=2 does speed things up to about 3.43 frames per second, compared to about 2.47 fps using threads=1, and in my life that means allowing about 3 months rather than about 5 months to do all my future encodes, which means a lot to me. I am only able to give rough technical information; I have read a lot but am only a normal user. Appreciate all you guys do more than you might think!
Dark Shikari
14th November 2008, 23:07
Don't know if you meant this for me (a p4 HT user) but using threads=2 does speed up to about 3.43 frames per second, compared to about 2.47 fps using threads=1
Interesting. So either my Nehalem HT tests were wrong, or the Nehalem's HT is even less useful than the P4's HT.
video_magic
14th November 2008, 23:22
OK, and for your information, my CPU usage goes up to about 97 percent or more with HT and threads=2 with the latest revisions (in Task Manager), compared to around 65-70 percent before revision 1-plus (sorry I can't be more precise). My command line looks like this:
x264 --keyint 300 --min-keyint 20 --scenecut 42 --bframes 4 --b-adapt 2 --b-pyramid --ref 5 --deblock -2:-1 --bitrate 1550 --qpmax 44 --ipratio 1.45 --aq-mode 2 --aq-strength 1.2 --pass 1 --stats "histy1.log" --direct auto --weightb --me umh --merange 20 --subme 9 --mixed-refs --8x8dct --no-fast-pskip --progress --threads 2 --thread-input --output D:\utorrdown\mjh1.264 C:\Temp\avs\mjhst01.avs
PING 1.1.1.1 -n 1 -w 60000 >NUL
x264 --keyint 300 --min-keyint 20 --scenecut 42 --bframes 4 --b-pyramid --ref 5 --deblock -2:-1 --bitrate 1550 --qpmax 44 --ipratio 1.45 --aq-mode 2 --aq-strength 1.2 --pass 2 --stats "histy1.log" --direct auto --weightb --me umh --merange 20 --subme 9 --mixed-refs --8x8dct --no-fast-pskip --progress --threads 2 --thread-input --output D:\utorrdown\mjh1.264 C:\Temp\avs\mjhst01.avs
Hope that is okay and useful. I read that I don't need --b-adapt 2 on the pass 2 command. I would like to know more about what is, and what isn't, needed on pass 2 compared to pass 1. Thank you again very much.
kemuri-_9
15th November 2008, 00:19
Hope that is okay and useful. I read that I don't need --b-adapt 2 on the pass 2 command. I would like to know more about what is, and what isn't, needed on pass 2 compared to pass 1. Thank you again very much.
A. there is no aq-mode 2 (unless you have some kind of VAQ patch build); it'll clamp down to 1 for standard unpatched versions.
B. as to what's not needed in pass 2:
things related solely to frame decisions, which (from what i understand) would be the following
b-adapt, scenecut, min-keyint, keyint, b-bias,
and ratetol by the nature of 2pass
i probably missed some, as i'm not a dev, so Dark Shikari or akupenguin can fill in what i missed.
on this point, it seems that x264 doesn't check the 1st pass .stats for the min/keyint values,
so it'll just write the values given on the 2nd pass right into the string signature,
despite using the ones from the 1st pass (as they determine frame decisions)
Ranguvar
15th November 2008, 02:33
Interesting. So either my Nehalem HT tests were wrong, or the Nehalem's HT is even less useful than the P4's HT.
Especially interesting, since Nehalem's HT is supposed to be better than the old implementations. Perhaps other threading efficiency improvements in Nehalem make HT redundant when dealing with a ~fully scalable app like x264?
squid_80
15th November 2008, 03:17
Is that just an arbitrary number? I mean, right now it is big enough that the average user won't hit it for at least a couple of years, but is there a reason for using 128 and not, say, 256 or 512?
Not sure about x264, but programs commonly use GetProcessAffinityMask on Windows to detect the number of available processors - the catch here is that it returns a 32-bit mask, so only 32 processors max can be reported.
kemuri-_9
15th November 2008, 03:41
x264 calls pthread's int pthread_num_processors_np(void), which down the line ends up calling GetProcessAffinityMask on Windows.
squid_80
15th November 2008, 05:38
I think it matches the OS's limit on processors anyway (technically it's a DWORD_PTR, so 32 for 32-bit and 64 for 64-bit) so no biggie (unless anyone can confirm win32 can handle more than 32 cores).
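To make the mask-width point concrete, here is a minimal sketch that counts processors from an affinity mask. It does not call the Windows API; the mask is modelled as a plain integer, and `count_processors` and its `pointer_bits` parameter are hypothetical names used only for this illustration.

```python
def count_processors(affinity_mask: int, pointer_bits: int = 32) -> int:
    """Count set bits in an affinity mask truncated to the pointer width."""
    if pointer_bits not in (32, 64):
        raise ValueError("pointer_bits must be 32 or 64")
    # Bits beyond the mask width cannot be represented, so they are dropped.
    visible = affinity_mask & ((1 << pointer_bits) - 1)
    return bin(visible).count("1")


print(count_processors(0xFFFFFF))           # 24
print(count_processors((1 << 36) - 1, 32))  # 32: a 32-bit mask cannot show more
print(count_processors((1 << 36) - 1, 64))  # 36
```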
fields_g
15th November 2008, 11:51
Don't know if you meant this for me (a p4 HT user) but using threads=2 does speed up to about 3.43 frames per second, compared to about 2.47 fps using threads=1
Wouldn't have expected this large of a difference either. Could you do some more benchmarks?
HT  | threads | fps
====================
off | 1       |
off | 2       |
off | 3       |
on  | 1       | 2.47
on  | 2       | 3.43
on  | 3       |
pcordes
15th November 2008, 22:37
Wouldn't have expected this large of a difference either.
P4 has a reasonable number of execution units, but it has big problems keeping them busy. It has much smaller caches than Core2, and it has that tiny trace cache. When not executing uops from the trace cache, its decoder limits it to single-issue. See
http://www.realworldtech.com/page.cfm?ArticleID=RWT040208182719&p=5
(search on that page for "P4".) OTOH, x264 has pretty small code size, so maybe the trace cache does ok. HT will still hide the latency of cache misses, and anything else that stalls the pipeline. Plus, SSE and non-SSE ops use different execution units IIRC, so when one thread is doing some SSE stuff, and another is doing some integer stuff, neither one would be keeping all of the CPU busy by itself anyway.
So I'm not surprised HT helps. It just shows how inefficient P4 is on single threads. Although if the results were different, I might have ended up claiming not to be surprised if HT didn't help x264 on P4, and come up with a logical explanation for that instead. :p
Shinigami-Sama
15th November 2008, 22:52
I think it matches the OS's limit on processors anyway (technically it's a DWORD_PTR, so 32 for 32-bit and 64 for 64-bit) so no biggie (unless anyone can confirm win32 can handle more than 32 cores).
only 32 in 32bit
enterprise server 2k3 can only handle 64 in 64bit
datacenter can handle 128
video_magic
16th November 2008, 20:09
OK, I hope this feedback is alright:
I should now be using a normal, latest build AFAIK. It is from 'Audionut' and reports x264 core:65 r0+1028 83baa7f
My CPU is an SL9KF 'Cedar Mill' core, which has 2 MB of cache by the way: http://processorfinder.intel.com/details.aspx?sSpec=SL9KF
I have done a little test just now 2passes with 2 threads, then 2passes with 1 thread. My commandline is a bit different because I wanted more FPS for my future months of encodes and I'm tweaking for better quality to speed ratio, so now --bframes 3 (rather than 4), and --ref 4 (rather than 5). Also removed --aq-mode, it will then use the standard default 1 (according to --longhelp there is no 'option 2' now! ). --keyint and --min-keyint are raised a bit as a little personal preference. Anyway, now the difference in fps is not so great.
P.S. if I look at the SSIM and PSNR numbers which are different for the 1 threads summary vs 2 threads, is that a normal 'variance' in quality results for using just one more thread than only one?
C:\Temp\264>x264 --keyint 350 --min-keyint 30 --bframes 3 --b-adapt 2 --b-pyramid --ref 4 --deblock -2:-1 --bitrate 1450 --qpmax 44 --ipratio 1.45 --aq-strength 1.2 --pass 1 --stats "tcrsbld2th.log" --direct auto --weightb --me umh --merange 20 --subme 9 --mixed-refs --8x8dct --no-fast-pskip --progress --threads 2 --thread-input --output D:\utorrdown\cropcrs2th.264 C:\Temp\avs\cropthecross.avs
avis [info]: 720x576 @ 25.00 fps (1741 frames)
x264 [info]: using cpu capabilities: MMX2 SSE2 SSE3 Cache64
x264 [info]: profile High, level 3.0
x264 [info]: slice I:14 Avg QP:26.90 size: 31653 PSNR Mean Y:37.18 U:43.06 V:45.03 Avg:38.50 Global:38.33
x264 [info]: slice P:484 Avg QP:28.53 size: 12708 PSNR Mean Y:34.86 U:40.74 V:43.40 Avg:36.20 Global:35.92
x264 [info]: slice B:1243 Avg QP:29.90 size: 5041 PSNR Mean Y:33.87 U:39.53 V:42.73 Avg:35.22 Global:34.97
x264 [info]: consecutive B-frames: 0.8% 1.0% 26.4% 71.8%
x264 [info]: mb I I16..4: 7.1% 73.6% 19.3%
x264 [info]: mb P I16..4: 0.3% 5.2% 0.8% P16..4: 56.4% 23.0% 12.9% 0.0% 0.0% skip: 1.5%
x264 [info]: mb B I16..4: 0.1% 0.3% 0.1% B16..8: 65.6% 0.8% 1.6% direct: 7.0% skip:24.5% L0:45.5% L1:52.9% BI: 1.7%
x264 [info]: final ratefactor: 24.15
x264 [info]: 8x8 transform intra:79.2% inter:57.4%
x264 [info]: direct mvs spatial:99.8% temporal:0.2%
x264 [info]: ref P L0 44.9% 28.2% 15.2% 11.7%
x264 [info]: ref B L0 56.6% 31.2% 12.2%
x264 [info]: ref B L1 88.4% 11.6%
x264 [info]: SSIM Mean Y:0.9028894
x264 [info]: PSNR Mean Y:34.175 U:39.897 V:42.935 Avg:35.516 Global:35.234 kb/s:1477.25
encoded 1741 frames, 3.26 fps, 1477.38 kb/s
C:\Temp\264>PING 1.1.1.1 -n 1 -w 60000 1>NUL
C:\Temp\264>x264 --keyint 350 --min-keyint 30 --bframes 3 --b-pyramid --ref 4 --deblock -2:-1 --bitrate 1450 --qpmax 44 --ipratio 1.45 --aq-strength 1.2 --pass 2 --stats "tcrsbld2th.log" --direct auto --weightb --me umh --merange 20 --subme 9 --mixed-refs --8x8dct --no-fast-pskip --progress --threads 2 --thread-input --output D:\utorrdown\cropcrs2th.264 C:\Temp\avs\cropthecross.avs
avis [info]: 720x576 @ 25.00 fps (1741 frames)
x264 [info]: using cpu capabilities: MMX2 SSE2 SSE3 Cache64
x264 [info]: profile High, level 3.0
x264 [info]: slice I:14 Avg QP:25.65 size: 37579 PSNR Mean Y:38.00 U:43.57 V:45.48 Avg:39.29 Global:39.17
x264 [info]: slice P:485 Avg QP:28.57 size: 12366 PSNR Mean Y:34.83 U:40.72 V:43.40 Avg:36.18 Global:35.97
x264 [info]: slice B:1242 Avg QP:29.86 size: 4926 PSNR Mean Y:33.88 U:39.53 V:42.75 Avg:35.22 Global:35.03
x264 [info]: consecutive B-frames: 0.8% 1.2% 26.2% 71.8%
x264 [info]: mb I I16..4: 5.2% 75.6% 19.2%
x264 [info]: mb P I16..4: 0.3% 5.1% 0.8% P16..4: 57.0% 22.6% 12.8% 0.0% 0.0% skip: 1.5%
x264 [info]: mb B I16..4: 0.1% 0.3% 0.1% B16..8: 65.7% 0.8% 1.6% direct: 6.9% skip:24.5% L0:45.9% L1:52.4% BI: 1.7%
x264 [info]: 8x8 transform intra:79.5% inter:57.5%
x264 [info]: direct mvs spatial:99.5% temporal:0.5%
x264 [info]: ref P L0 44.8% 28.3% 15.2% 11.7%
x264 [info]: ref B L0 56.6% 31.2% 12.2%
x264 [info]: ref B L1 88.2% 11.8%
x264 [info]: SSIM Mean Y:0.9026834
x264 [info]: PSNR Mean Y:34.182 U:39.897 V:42.951 Avg:35.524 Global:35.293 kb/s:1452.32
encoded 1741 frames, 3.60 fps, 1452.45 kb/s
C:\Temp\264>PING 1.1.1.1 -n 1 -w 60000 1>NUL
C:\Temp\264>x264 --keyint 350 --min-keyint 30 --bframes 3 --b-adapt 2 --b-pyramid --ref 4 --deblock -2:-1 --bitrate 1450 --qpmax 44 --ipratio 1.45 --aq-strength 1.2 --pass 1 --stats "tcrsbld1th.log" --direct auto --weightb --me umh --merange 20 --subme 9 --mixed-refs --8x8dct --no-fast-pskip --progress --threads 1 --thread-input --output D:\utorrdown\cropcrs1th.264 C:\Temp\avs\cropthecross.avs
avis [info]: 720x576 @ 25.00 fps (1741 frames)
x264 [info]: using cpu capabilities: MMX2 SSE2 SSE3 Cache64
x264 [info]: profile High, level 3.0
x264 [info]: slice I:15 Avg QP:26.97 size: 30876 PSNR Mean Y:37.22 U:43.06 V:45.03 Avg:38.54 Global:38.37
x264 [info]: slice P:485 Avg QP:28.56 size: 12647 PSNR Mean Y:34.84 U:40.73 V:43.39 Avg:36.18 Global:35.89
x264 [info]: slice B:1241 Avg QP:29.91 size: 5057 PSNR Mean Y:33.87 U:39.53 V:42.73 Avg:35.21 Global:34.96
x264 [info]: consecutive B-frames: 0.9% 1.3% 25.6% 72.3%
x264 [info]: mb I I16..4: 6.9% 73.9% 19.2%
x264 [info]: mb P I16..4: 0.3% 5.1% 0.8% P16..4: 56.6% 22.9% 12.9% 0.0% 0.0% skip: 1.6%
x264 [info]: mb B I16..4: 0.1% 0.3% 0.1% B16..8: 65.6% 0.8% 1.6% direct: 7.0% skip:24.6% L0:45.4% L1:53.0% BI: 1.6%
x264 [info]: final ratefactor: 24.14
x264 [info]: 8x8 transform intra:79.3% inter:57.6%
x264 [info]: direct mvs spatial:99.9% temporal:0.1%
x264 [info]: ref P L0 45.1% 28.1% 15.1% 11.7%
x264 [info]: ref B L0 56.7% 31.0% 12.3%
x264 [info]: ref B L1 88.5% 11.5%
x264 [info]: SSIM Mean Y:0.9027838
x264 [info]: PSNR Mean Y:34.170 U:39.893 V:42.935 Avg:35.512 Global:35.224 kb/s:1478.75
encoded 1741 frames, 2.78 fps, 1478.87 kb/s
C:\Temp\264>PING 1.1.1.1 -n 1 -w 60000 1>NUL
C:\Temp\264>x264 --keyint 350 --min-keyint 30 --bframes 3 --b-pyramid --ref 4 --deblock -2:-1 --bitrate 1450 --qpmax 44 --ipratio 1.45 --aq-strength 1.2 --pass 2 --stats "tcrsbld1th.log" --direct auto --weightb --me umh --merange 20 --subme 9 --mixed-refs --8x8dct --no-fast-pskip --progress --threads 1 --thread-input --output D:\utorrdown\cropcrs1th.264 C:\Temp\avs\cropthecross.avs
avis [info]: 720x576 @ 25.00 fps (1741 frames)
x264 [info]: using cpu capabilities: MMX2 SSE2 SSE3 Cache64
x264 [info]: profile High, level 3.0
x264 [info]: slice I:15 Avg QP:25.75 size: 36580 PSNR Mean Y:38.01 U:43.56 V:45.46 Avg:39.30 Global:39.17
x264 [info]: slice P:486 Avg QP:28.56 size: 12429 PSNR Mean Y:34.84 U:40.72 V:43.41 Avg:36.18 Global:35.97
x264 [info]: slice B:1240 Avg QP:29.88 size: 4883 PSNR Mean Y:33.87 U:39.52 V:42.74 Avg:35.21 Global:35.02
x264 [info]: consecutive B-frames: 0.9% 1.4% 25.4% 72.3%
x264 [info]: mb I I16..4: 5.2% 76.2% 18.6%
x264 [info]: mb P I16..4: 0.3% 5.0% 0.8% P16..4: 56.9% 22.8% 12.8% 0.0% 0.0% skip: 1.5%
x264 [info]: mb B I16..4: 0.1% 0.3% 0.1% B16..8: 65.8% 0.8% 1.6% direct: 6.8% skip:24.7% L0:45.8% L1:52.5% BI: 1.7%
x264 [info]: 8x8 transform intra:79.7% inter:57.4%
x264 [info]: direct mvs spatial:99.4% temporal:0.6%
x264 [info]: ref P L0 44.8% 28.2% 15.3% 11.6%
x264 [info]: ref B L0 56.8% 31.0% 12.2%
x264 [info]: ref B L1 88.4% 11.6%
x264 [info]: SSIM Mean Y:0.9025816
x264 [info]: PSNR Mean Y:34.178 U:39.889 V:42.953 Avg:35.520 Global:35.288 kb/s:1452.51
encoded 1741 frames, 3.17 fps, 1452.64 kb/s
C:\Temp\264>
kemuri-_9
16th November 2008, 20:44
P.S. if I look at the SSIM and PSNR numbers which are different for the 1 threads summary vs 2 threads, is that a normal 'variance' in quality results for using just one more thread than only one?
only what's done in 1 thread is guaranteed to be deterministic.
multithreaded passes will generally have slight fluctuations due to this.
looking at your results, you have an extremely large % of max b-frame sequences,
you should see some increase in compression and/or quality if you enlarge the max bframe count
(at the cost of slower encoding times w/ b-adapt 2)
as a small note, if you're looking for standard unpatched builds, it's best to use the ones on x264.nl.
video_magic
16th November 2008, 21:30
Thankyou kemuri-_9 :thanks:
I have done another quick test - exactly the same as above, except putting back up the --bframes 4 and --ref 5
It was slower, but the difference from 2 threads vs 1 thread was still not too great.
pass 1 threads 2
encoded 1741 frames, 3.03 fps, 1475.20 kb/s
pass 2 threads 2
encoded 1741 frames, 3.43 fps, 1454.98 kb/s
pass 1 threads 1
encoded 1741 frames, 2.57 fps, 1476.28 kb/s
pass 2 threads 1
encoded 1741 frames, 3.01 fps, 1454.45 kb/s
My conclusion is that the 'phenomenon' before was some recent build of x264 that I had temporarily been using.
akupenguin
16th November 2008, 21:37
multithreaded passes will generally have slight fluctuations due to this.
True, but only due to a bug I never bothered to track down.
The main cause of the difference is the limitation of vertical mv length and the delayed ratecontrol feedback, which depend on the number of threads and would remain even if I fixed determinism.
video_magic
17th November 2008, 16:11
Just discovered something else - I was using CTRL+S to try the pause command (which I read about on here). I think I was probably trying it out at the time I thought I had noticed the big FPS difference in encoding when trying 2 threads.... :( I think pausing changes the reported FPS compared with a run where no pause was used :stupid:
Sagekilla
17th November 2008, 19:25
That's because when x264 calculates fps, it uses the total time elapsed from start to finish. Meaning, if you start a 3600 frame task at 1:00 and it ends at 2:00, then that encoded at 1 fps. If you had the same task start at 1:00, but you paused it from 1:01 to 2:01 and resumed from 2:01 to 3:00, the fps suddenly becomes 0.50 fps.
The total time spent encoding (not idling) those 3600 frames (1 hr) didn't change at all, but the total time the task was being run (whether processing was done or not) did.
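The arithmetic in the example above can be written out as a short sketch. The function name `reported_fps` is hypothetical; it only mirrors the "frames divided by wall-clock time from start to finish" behaviour described in this post.

```python
def reported_fps(frames: int, start_s: float, end_s: float) -> float:
    """FPS as frames divided by elapsed wall-clock seconds, pauses included."""
    elapsed = end_s - start_s
    if elapsed <= 0:
        raise ValueError("end time must be after start time")
    return frames / elapsed


HOUR = 3600
print(reported_fps(3600, 1 * HOUR, 2 * HOUR))  # 1.0: ran 1:00 to 2:00
print(reported_fps(3600, 1 * HOUR, 3 * HOUR))  # 0.5: same work, paused for an hour
```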
kemuri-_9
17th November 2008, 20:28
yes, it uses simple timestamps of the encoding start and completion times to calculate and display the fps.
So any pauses (times where it's not crunching) are not reflected as it really can't keep track of detecting pausing seamlessly without having to rewrite it some.
and there's no real point in rewriting it to account for pausing.
Dark Shikari
17th November 2008, 20:52
yes, it uses simple timestamps of the encoding start and completion times to calculate and display the fps.
So any pauses (times where it's not crunching) are not reflected as it really can't keep track of detecting pausing seamlessly
It's impossible to keep track of pauses because SIGSTOP cannot be caught. You could grab the time once per row of frame encoded and discard the time if it's over a certain value I guess... but that would be hacky.
pcordes
18th November 2008, 23:42
It's impossible to keep track of pauses because SIGSTOP cannot be caught.
CTRL+S just makes write() to the terminal block. It's XON/XOFF flow control. I sometimes disable that behaviour of ^S, so I can use bash's forward history i-search, which complements the ^R reverse history i-search nicely.
CTRL+Z sends SIGTSTP, which can be caught, precisely to allow "clean" behaviour when suspending.
You could grab the time once per row of frame encoded and discard the time if it's over a certain value I guess... but that would be hacky.
Or you could measure FPS = frames per second of CPU time, not real time. Again, hacky, and not even what people expect.
Dark Shikari
18th November 2008, 23:46
CTRL+Z sends SIGTSTP, which can be caught, precisely to allow "clean" behaviour when suspending...
...When SIGSTOP is sent to a process, the usual behaviour is to pause that process in its current state. The process will only resume execution if it is sent the SIGCONT signal. SIGSTOP and SIGCONT are used for job control in the Unix shell, among other purposes. SIGSTOP cannot be caught or ignored.
akupenguin
19th November 2008, 00:52
@DS
Note the difference between SIGSTOP and SIGTSTP (and the other 2 stop signals too).
Or you could measure FPS = frames per second of CPU time, not real time. Again, hacky, and not even what people expect.
And then x264 claims to not be going any faster when multithreaded.
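A small sketch of why CPU-time FPS would hide a multithreading speedup. It assumes idealised scaling (the same total CPU work split evenly across threads), and both helper names are made up for this example.

```python
def fps_wall(frames: int, wall_seconds: float) -> float:
    """Frames per second of real (wall-clock) time."""
    return frames / wall_seconds


def fps_cpu(frames: int, per_thread_cpu_seconds: list) -> float:
    """Frames per second of CPU time, summed over all threads."""
    return frames / sum(per_thread_cpu_seconds)


# 100 frames that cost 100 CPU-seconds of work in total.
print(fps_wall(100, 100.0), fps_cpu(100, [100.0]))      # 1.0 1.0 (one thread)
print(fps_wall(100, 50.0), fps_cpu(100, [50.0, 50.0]))  # 2.0 1.0 (two threads)
```

The wall-clock figure doubles with two threads, while the CPU-time figure stays the same because the total CPU time consumed has not changed.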
LoRd_MuldeR
19th November 2008, 01:04
Why not reset the number of frames and the time after a certain interval? Like this:
Instead of "FPS = total_number_of_frames_processed / total_time_passed" use "FPS = number_of_frames_processed_in_last_interval / interval_length"
Then process suspension would only give a "wrong" FPS value for one single interval.
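A minimal sketch of the interval idea, assuming the caller supplies timestamps in seconds. The class `IntervalFps` and its methods are hypothetical and are not taken from x264; the overall figure is returned alongside the interval figure for comparison.

```python
class IntervalFps:
    """Track overall FPS and the FPS of the most recent interval."""

    def __init__(self, start_s: float) -> None:
        self.start_s = start_s
        self.last_s = start_s
        self.total_frames = 0
        self.frames_at_last = 0

    def add_frames(self, count: int) -> None:
        self.total_frames += count

    def sample(self, now_s: float) -> tuple:
        """Return (overall_fps, interval_fps) and start a new interval."""
        if now_s <= self.last_s:
            raise ValueError("timestamps must increase between samples")
        overall = self.total_frames / (now_s - self.start_s)
        interval = (self.total_frames - self.frames_at_last) / (now_s - self.last_s)
        self.last_s = now_s
        self.frames_at_last = self.total_frames
        return overall, interval


meter = IntervalFps(start_s=0.0)
meter.add_frames(10)
print(meter.sample(10.0))   # (1.0, 1.0)
meter.add_frames(10)        # 100 s of this interval were spent suspended
print(meter.sample(120.0))  # the interval figure drops for this one sample
meter.add_frames(10)
print(meter.sample(130.0))  # the interval figure is back to 1.0, overall stays low
```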
Dark Shikari
19th November 2008, 01:14
Why not reset the number of frames and the time after a certain interval? Like this:
Instead of "FPS = total_number_of_frames_processed / total_time_passed" use "FPS = number_of_frames_processed_in_last_interval / interval_length"
Then process suspension would only give a "wrong" FPS value for one single interval.
In that case I'd want to display both values, the overall FPS and the current FPS, as overall FPS is useful as well...
akupenguin
19th November 2008, 01:33
pause.diff (http://akuvian.org/src/x264/pause.diff)