16th December 2006, 02:08 | #1 | Link |
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
|
x264 revision 610 - sliceless threading - what's new
------------------------------------------------------------------------
r610 | pengvado | 2006-12-16 01:32:38 +0100 (Sat, 16 Dec 2006) | 2 lines

more win32threads -> pthreads
------------------------------------------------------------------------
r609 | pengvado | 2006-12-16 00:08:57 +0100 (Sat, 16 Dec 2006) | 2 lines

cosmetics: rename list operators to be consistent with Perl, and move them to common/
------------------------------------------------------------------------
r608 | pengvado | 2006-12-16 00:06:21 +0100 (Sat, 16 Dec 2006) | 2 lines

win32: use pthreads instead of win32threads.
for some reason, pthreads is much faster.
------------------------------------------------------------------------
r607 | pengvado | 2006-12-16 00:03:36 +0100 (Sat, 16 Dec 2006) | 13 lines

New threading method:
Encode multiple frames in parallel instead of dividing each frame into slices.
Improves speed, and reduces the bitrate penalty of threading.

Side effects:
It is no longer possible to re-encode a frame, so threaded scenecut detection
must run in the pre-me pass, which is faster but less precise.
It is now useful to use more threads than you have cpus.
--threads=auto has been updated to use cpus*1.5.
Minor changes to ratecontrol.

New options: --pre-scenecut, --mvrange-thread, --non-deterministic
------------------------------------------------------------------------

When multithreading, "--threads auto" is recommended.

Here are some results showing cpu*1.5:

x264_610.exe --threads # -B5000 -m6 -r5 --direct=temporal --me=hex -b2 -w --qcomp=0.10 -A"p8x8,i8x8,i4x4" -8 --fps=25 --output NUL 720p50_mobcal_ter.yuv 1280x720
x264_606.exe --threads # -B5000 -m6 -r5 --direct=temporal --me=hex -b2 -w --qcomp=0.10 -A"p8x8,i8x8,i4x4" -8 --fps=25 --output NUL 720p50_mobcal_ter.yuv 1280x720

Intel D930 3.0GHz (2 cores, no HT):
610 2 threads: 3.89
610 3 threads: 5.41
606 2 threads: 4.63
606 3 threads: 4.46

P4 xeon 3.06GHz (2 cores, HT):
610 4 threads: 4.24
610 5 threads: 4.63
606 4 threads: 3.98
606 5 threads: 4.01

AMD X2 3800+ 2.55GHz (2 cores):
610 2 threads: 4.07
610 3 threads: 5.24
606 2 threads: 4.85
606 3 threads: 4.78

C2D E6400 3.4GHz (2 cores):
610 2 threads: 8.78
610 3 threads: 12.35
606 2 threads: 9.52
606 3 threads: 8.34

I have more results, from earlier patches: http://x264.nl/results.txt (cv = pthreads)

Not sure why, but Intel Core 2 Duo users can be happy with the big speed up.

Note that, as the changelog says, there are new options: --pre-scenecut, --mvrange-thread, --non-deterministic. Go MeGUI .... and other guis!
Last edited by bob0r; 16th December 2006 at 02:58. Reason: i like to edit stuff |
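For GUI authors who want to show the thread count that "--threads auto" would pick, the cpus*1.5 rule from the r607 changelog can be sketched roughly as below. This is only an illustration: the function name is made up, and truncating the fractional part is an assumption, since the changelog does not say how odd CPU counts are rounded.

```python
def auto_threads(cpus: int) -> int:
    """Thread count for "--threads auto" per the r607 changelog (cpus * 1.5).

    Assumption: a fractional result is truncated; the changelog only
    states the 1.5 factor, not the rounding.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    # Integer form of cpus * 1.5, never dropping below one thread.
    return max(1, cpus * 3 // 2)


for cpus in (1, 2, 4):
    print(cpus, "cpus ->", auto_threads(cpus), "threads")
```

For a dual core this gives 3 threads, which is the thread count where the r610 numbers above pull ahead of r606.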
16th December 2006, 02:40 | #3 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Yes, but not as much as before. Hence the "reduces the bitrate penalty".
Code:
cpu: 4 core Xeon 5160

threads       speed             psnr loss
          r606     r611       r606     r611
 1:     1.000x   1.000x      0.000    0.000
 2:     1.540x   1.739x     -0.036   -0.004
 3:     1.838x   2.384x     -0.065   -0.002
 4:     2.043x   3.224x     -0.077   -0.005
 5:     2.028x   3.512x     -0.110   -0.009
 6:     2.034x   3.629x     -0.132   -0.009
 7:     1.988x   3.680x     -0.151   -0.015
 8:     1.953x   3.702x     -0.188   -0.017
 9:     2.016x   3.729x     -0.210   -0.020
10:     1.995x   3.742x     -0.233   -0.031
11:     1.954x   3.749x     -0.255   -0.030
12:     1.909x   3.765x     -0.268   -0.040
13:     1.895x   3.770x     -0.286   -0.045
14:     1.936x   3.759x     -0.313   -0.046
15:     1.897x   3.781x     -0.335   -0.045
16:     1.845x   3.765x     -0.349   -0.046

scaling efficiency (speed / #cores):
r606: 51%
r611: 94%

Last edited by akupenguin; 16th December 2006 at 07:44. |
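The "scaling efficiency (speed / #cores)" figures can be reproduced from the table with a few lines of Python. This is a sketch under an assumption: the post does not say which row was used, so the code takes the best speedup in each column and truncates the percentage, which happens to give the quoted 51% and 94%.

```python
CORES = 4  # "4 core Xeon 5160" from the post

# Speedup relative to one thread, for 1..16 threads, copied from the table.
SPEEDUP = {
    "r606": [1.000, 1.540, 1.838, 2.043, 2.028, 2.034, 1.988, 1.953,
             2.016, 1.995, 1.954, 1.909, 1.895, 1.936, 1.897, 1.845],
    "r611": [1.000, 1.739, 2.384, 3.224, 3.512, 3.629, 3.680, 3.702,
             3.729, 3.742, 3.749, 3.765, 3.770, 3.759, 3.781, 3.765],
}


def scaling_efficiency_percent(speeds, cores):
    """Best speedup divided by core count, as a truncated percentage.

    Assumption: using the peak value and truncating is a guess at how the
    quoted numbers were derived; the post only gives "speed / #cores".
    """
    return int(max(speeds) / cores * 100)


for revision, speeds in SPEEDUP.items():
    print(revision, scaling_efficiency_percent(speeds, CORES), "%")
```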
16th December 2006, 03:31 | #4 | Link |
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
So... the megui automatic no. of threads detection should be updated.
Also, what's the exact usage for the new options? What do they actually do?
Is haali's AQ patch working correctly after those changes (it applies almost correctly btw...)?
For builders: pthreads for win32 is here -> ftp://sources.redhat.com/pub/pthreads-win32
__________________
MPEG-4 ASP Custom Matrices: EQM V1(old), EQM AutoGK Sharpmatrix (aka EQM V2), EQM V3HR (updated 01/10/2004), EQM V3LR, EQM V3ULR (updated 04/02/2005), EQM V3UHR (updated 17/12/2004) and EQM V3EHR (updated 05/10/2004) Info about my ASP matrices. MPEG-4 AVC Custom Matrices: EQM AVC-HR Info about my AVC matrices My x264 builds. Mooo!!! Last edited by Sharktooth; 16th December 2006 at 03:49. |
16th December 2006, 03:58 | #5 | Link |
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
|
@builders
or here:
cvs -d :pserver:anoncvs@sources.redhat.com:/cvs/pthreads-win32 checkout pthreads
make clean GC-static
copy pthreads/*.h to include dir
copy libpthreadGC2.a to lib dir

@sharktooth: You can just use --threads auto for megui, as x264.exe will detect the number of CPUs and set threads to cpus*1.5. (Using HT might give one or two extra threads, but the quality will not be affected much at all.)

Exact usage of the new options: no clue here. From x264.exe:
--pre-scenecut          Faster, less precise scenecut detection. Required and implied by multi-threading.
--mvrange-thread <int>  Minimum buffer between threads [-1 (auto)]
--non-deterministic     Slightly improve quality of SMP, at the cost of repeatability

Last edited by bob0r; 16th December 2006 at 04:02. |
16th December 2006, 04:13 | #6 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Short answer: you shouldn't need to use any of the new options.
--pre-scenecut might possibly be useful for fast 1pass single-threaded encodes, but is mostly just so I can compare the two scenecut algorithms without invoking the other threading stuff. |
16th December 2006, 04:30 | #7 | Link |
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
I'm just updating megui... I removed the ability to set the number of threads within the x264 profile (the up-down control is grayed out).
So --threads auto will be enforced when the "Automatically set the number of threads" global option is enabled in the megui settings. It's just a workaround but should work as expected.
|
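For other GUI authors, the workaround described above boils down to something like the following. This is a hypothetical Python sketch, not MeGUI's actual code: the function and parameter names are invented for illustration, and only the "--threads auto" and "--threads N" forms come from this thread.

```python
from typing import List, Optional


def threads_args(auto_enabled: bool, manual_threads: Optional[int] = None) -> List[str]:
    """Build the x264 thread arguments for a frontend (hypothetical helper).

    auto_enabled mirrors a global "Automatically set the number of threads"
    setting; manual_threads is the value from a per-profile control.
    """
    if auto_enabled:
        # Let x264.exe detect the CPU count and pick the thread count itself.
        return ["--threads", "auto"]
    if manual_threads is None:
        # Nothing configured: leave the option out and use x264's default.
        return []
    if manual_threads < 1:
        raise ValueError("manual_threads must be at least 1")
    return ["--threads", str(manual_threads)]


print(threads_args(True, 2))   # the global setting wins over the profile value
print(threads_args(False, 2))
```

Letting the global setting override the profile value matches the behaviour described in the post, where the profile control is grayed out.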
16th December 2006, 05:02 | #8 | Link |
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
I'm reorganizing the MeGUI version numbers. A new MeGUI build is coming. Just be patient...
|
16th December 2006, 05:02 | #9 | Link |
Fighting spam with a fish
Join Date: Sep 2005
Posts: 2,699
|
Uh, sharktooth? I just downloaded the new version, and those settings aren't greyed out. In addition MeGUI still sets the threads manually. Like I am using 2 threads right now according to the log for the first pass.
Edit: look at the post times! That should explain it. Last edited by Adub; 16th December 2006 at 05:05. |
16th December 2006, 05:13 | #12 | Link |
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
megui 0.2.3.2193 is up and running. Enjoy.
Tomorrow I'll update the Trunk version (0.2.4.0000) with a more elegant solution.
@bob0r and other builders: DO AN SVN CHECKOUT and use the sources in Tags/2193 for building.
Last edited by Sharktooth; 16th December 2006 at 05:25. |
16th December 2006, 05:21 | #15 | Link |
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
uhm... reboot. .NET cache is doing something wrong.
Now I'm going to sleep (5:21AM). If you still have problems, uninstall megui and install a fresh version, then let it autoupdate.
Last edited by Sharktooth; 16th December 2006 at 05:23. |
16th December 2006, 05:38 | #16 | Link |
Registered User
Join Date: Oct 2003
Posts: 435
|
i use ELDER now for my encodes but i see this "bug" in the release i thought might need a look at:
x264.nl REV611: http://mirror01.x264.nl/x264/revision611/x264.exe

H:\MCHD encodes\ELDER RATIO.1.77.1\905029>..\x264.exe -o test3.264 --ref 5 --8x8dct --mixed-refs --no-fast-pskip --bframes 3 --bime --weightb --direct auto --merange 12 --subme 6 --analyse all --me umh --colormatrix bt709 --filter -2:-1 --pass 2 --stats stageC-3.stats --progress -B 1377 stageA-3.avs
avis [info]: 720x400 @ 23.98 fps (357 frames)
x264 [info]: using cpu capabilities MMX MMXEXT SSE SSE2
x264 [info]: slice I:5 Avg QP:20.60 size: 16678 PSNR Mean Y:46.04 U:47.30 V:47.27 Avg:46.40 Global:45.94
x264 [info]: slice P:268 Avg QP:22.98 size: 8262 PSNR Mean Y:44.48 U:46.08 V:45.76 Avg:44.89 Global:44.22
x264 [info]: slice B:84 Avg QP:23.46 size: 3583 PSNR Mean Y:44.30 U:46.71 V:46.87 Avg:44.96 Global:44.21
x264 [info]: mb I I16..4: 22.0% 63.3% 14.7%
x264 [info]: mb P I16..4: 15.7% 25.7% 4.5% P16..4: 30.2% 7.1% 2.0% 0.2% 0.1% skip:14.5%
x264 [info]: mb B I16..4: 1.7% 4.0% 2.2% B16..8: 31.7% 1.6% 2.5% direct: 2.8% skip:53.5%
x264 [info]: 8x8 transform intra:56.0% inter:50.3%
x264 [info]: direct mvs spatial:0.0% temporal:100.0%
x264 [info]: ref P 73.7% 13.5% 8.0% 2.7% 2.2%
x264 [info]: ref B 86.8% 6.1% 3.7% 1.9% 1.5%
x264 [info]: SSIM Mean Y:0.9766733
x264 [info]: PSNR Mean Y:44.460 U:46.243 V:46.042 Avg:44.927 Global:44.239 kb/s: 1396.13
encoded 357 frames, 5.57 fps, 1396.55 kb/s
profiling:F:\msys\home\user\x264/x264.gcda:Cannot open
profiling:F:\msys\home\user\x264/matroska.gcda:Cannot open
profiling:F:\msys\home\user\x264/muxers.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/encoder.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/common.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/mdate.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/set.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/macroblock.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/set.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/ratecontrol.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/frame.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/pixel.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/macroblock.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/cpu.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/analyse.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/mc.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/cabac.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/cavlc.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/cabac.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/dct.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/quant.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/csp.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/predict.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/eval.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/i386/predict-c.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/me.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/i386/mc-c.gcda:Cannot open

H:\MCHD encodes\ELDER RATIO.1.77.1\905029>echo.1>stageC-result3.ready

===========================================

Sharktooths REV611: http://mirror05.x264.nl/Sharktooth/?dir=./x264

H:\MCHD encodes\ELDER RATIO.1.77.1\311401>..\x264.exe -o test3.264 --ref 5 --8x8dct --mixed-refs --no-fast-pskip --bframes 3 --bime --weightb --direct auto --merange 12 --subme 6 --analyse all --me umh --colormatrix bt709 --filter -2:-1 --pass 2 --stats stageC-3.stats --progress -B 1377 stageA-3.avs
avis [info]: 720x400 @ 23.98 fps (357 frames)
x264 [info]: using cpu capabilities MMX MMXEXT SSE SSE2
x264 [info]: slice I:5 Avg QP:20.60 size: 16678 PSNR Mean Y:46.04 U:47.30 V:47.27 Avg:46.40 Global:45.94
x264 [info]: slice P:268 Avg QP:22.98 size: 8262 PSNR Mean Y:44.48 U:46.08 V:45.76 Avg:44.89 Global:44.22
x264 [info]: slice B:84 Avg QP:23.46 size: 3583 PSNR Mean Y:44.30 U:46.71 V:46.87 Avg:44.96 Global:44.21
x264 [info]: mb I I16..4: 22.0% 63.3% 14.7%
x264 [info]: mb P I16..4: 15.7% 25.7% 4.5% P16..4: 30.2% 7.1% 2.0% 0.2% 0.1% skip:14.5%
x264 [info]: mb B I16..4: 1.7% 4.0% 2.2% B16..8: 31.7% 1.6% 2.5% direct: 2.8% skip:53.5%
x264 [info]: 8x8 transform intra:56.0% inter:50.3%
x264 [info]: direct mvs spatial:0.0% temporal:100.0%
x264 [info]: ref P 73.7% 13.5% 8.0% 2.7% 2.2%
x264 [info]: ref B 86.8% 6.1% 3.7% 1.9% 1.5%
x264 [info]: SSIM Mean Y:0.9766733
x264 [info]: PSNR Mean Y:44.460 U:46.243 V:46.042 Avg:44.927 Global:44.239 kb/s: 1396.13
encoded 357 frames, 6.19 fps, 1396.55 kb/s

H:\MCHD encodes\ELDER RATIO.1.77.1\311401>echo.1>stageC-result3.ready

Sharktooth's version works well but the other one as you can see isnt built correctly (i think) anyways... |
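When comparing builds like this, a small script can pull out the summary line and flag the stray "profiling:" messages automatically. The sketch below is an illustration only: the regular expression is based on the summary line format shown in the two logs above, the helper names are invented, and the two sample strings are trimmed excerpts of those logs.

```python
import re
from typing import NamedTuple


class EncodeSummary(NamedTuple):
    frames: int
    fps: float
    kbps: float


# Matches the final summary line as it appears in the logs above.
SUMMARY_RE = re.compile(r"encoded (\d+) frames, ([\d.]+) fps, ([\d.]+) kb/s")


def parse_summary(log_text: str) -> EncodeSummary:
    """Extract frame count, fps and bitrate from an x264 console log."""
    match = SUMMARY_RE.search(log_text)
    if match is None:
        raise ValueError("no 'encoded N frames' summary line found")
    return EncodeSummary(int(match.group(1)), float(match.group(2)), float(match.group(3)))


def has_profiling_residue(log_text: str) -> bool:
    """True if the log contains 'profiling:...gcda:Cannot open' lines."""
    return any(
        line.startswith("profiling:") and line.rstrip().endswith(".gcda:Cannot open")
        for line in log_text.splitlines()
    )


# Trimmed excerpts of the two logs in the post.
X264_NL_LOG = r"""encoded 357 frames, 5.57 fps, 1396.55 kb/s
profiling:F:\msys\home\user\x264/x264.gcda:Cannot open"""
SHARKTOOTH_LOG = "encoded 357 frames, 6.19 fps, 1396.55 kb/s"

for name, log in (("x264.nl", X264_NL_LOG), ("Sharktooth", SHARKTOOTH_LOG)):
    print(name, parse_summary(log), "profiling residue:", has_profiling_residue(log))
```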
16th December 2006, 05:44 | #17 | Link |
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
|
Updated revision 611 coming right up
wrong 611 x264.exe md5: 2d738a38420fb2d92154556a1aa29c70
correct 611 x264.exe md5: b64eb9012bd24e2cd4f616be482c5a31

In about 30 minutes a proper version should be online, without the half-as*-ed make fprofiled version
Last edited by bob0r; 16th December 2006 at 06:13. Reason: correct file update |
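To check which of the two r611 builds you downloaded, the MD5 of the file can be compared against the two hashes above. A minimal Python sketch follows; the function names and the "x264.exe" path in the usage comment are placeholders, and only the two hash values come from the post.

```python
import hashlib

# Hashes quoted in the post above.
CORRECT_MD5 = "b64eb9012bd24e2cd4f616be482c5a31"
WRONG_MD5 = "2d738a38420fb2d92154556a1aa29c70"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_build(md5_hex: str) -> str:
    """Map a digest to 'correct', 'wrong' or 'unknown' for the r611 builds."""
    md5_hex = md5_hex.strip().lower()
    if md5_hex == CORRECT_MD5:
        return "correct"
    if md5_hex == WRONG_MD5:
        return "wrong"
    return "unknown"


# Usage (the path is a placeholder for wherever the download was saved):
# print(classify_build(md5_of_file("x264.exe")))
```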
16th December 2006, 06:36 | #18 | Link |
Registered User
Join Date: Jul 2004
Posts: 169
|
Great job, thanks all the x264 developers for this. Below is a quick test from me:
Hardware: Conroe E6600@2.4GHz

              FPS     Kbps      PSNR
b606(1t)      9.76    1200.92   47.015
b606(2t)     14.28    1214.10   47.008
b611(1t)      9.37    1229.90   47.203
b611(2t)     15.30    1242.54   47.248
b611(auto)   17.45    1242.29   47.248
b611(4t)     17.97    1241.65   47.247

It's good to see no PSNR penalty now and the CPU is mostly >90% usage. |
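To put the table above in relative terms, the rows can be compared with a few lines of Python. This is just a sketch: the numbers are copied from the table, while the helper name and the choice of comparing each build's best threaded run against b606(2t) are my own.

```python
# (fps, kbps, psnr) per run, copied from the table above.
RESULTS = {
    "b606(1t)": (9.76, 1200.92, 47.015),
    "b606(2t)": (14.28, 1214.10, 47.008),
    "b611(1t)": (9.37, 1229.90, 47.203),
    "b611(2t)": (15.30, 1242.54, 47.248),
    "b611(auto)": (17.45, 1242.29, 47.248),
    "b611(4t)": (17.97, 1241.65, 47.247),
}


def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


baseline_fps, baseline_kbps, baseline_psnr = RESULTS["b606(2t)"]
for run in ("b611(2t)", "b611(auto)", "b611(4t)"):
    fps, kbps, psnr = RESULTS[run]
    print(
        f"{run} vs b606(2t): "
        f"fps {percent_change(fps, baseline_fps):+.1f}%, "
        f"bitrate {percent_change(kbps, baseline_kbps):+.1f}%, "
        f"PSNR {psnr - baseline_psnr:+.3f}"
    )
```

Note that the bitrates differ slightly between the builds, so the PSNR differences are not an exact like-for-like comparison.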
16th December 2006, 14:02 | #20 | Link | |
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
|
Quote:
@Merlin: .NET cache sometimes does weird things... and when it happens it's always a PITA
|
|
|
|