Doom9's Forum > Video Encoding > MPEG-4 AVC / H.264

16th December 2006, 02:08   #1
bob0r
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
x264 revision 610 - sliceless threading - what's new

------------------------------------------------------------------------
r610 | pengvado | 2006-12-16 01:32:38 +0100 (Sat, 16 Dec 2006) | 2 lines

more win32threads -> pthreads

------------------------------------------------------------------------
r609 | pengvado | 2006-12-16 00:08:57 +0100 (Sat, 16 Dec 2006) | 2 lines

cosmetics: rename list operators to be consistent with Perl, and move them to common/

------------------------------------------------------------------------
r608 | pengvado | 2006-12-16 00:06:21 +0100 (Sat, 16 Dec 2006) | 2 lines

win32: use pthreads instead of win32threads. for some reason, pthreads is much faster.

------------------------------------------------------------------------
r607 | pengvado | 2006-12-16 00:03:36 +0100 (Sat, 16 Dec 2006) | 13 lines

New threading method:
Encode multiple frames in parallel instead of dividing each frame into slices.
Improves speed, and reduces the bitrate penalty of threading.

Side effects:
It is no longer possible to re-encode a frame, so threaded scenecut detection
must run in the pre-me pass, which is faster but less precise.
It is now useful to use more threads than you have cpus. --threads=auto has
been updated to use cpus*1.5.
Minor changes to ratecontrol.

New options: --pre-scenecut, --mvrange-thread, --non-deterministic

------------------------------------------------------------------------


When multithreading, "--threads auto" is recommended.
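
For anyone scripting around this, here is a minimal Python sketch of the "cpus*1.5" rule from the r607 changelog. The function name `auto_threads` is made up for illustration, and rounding down for odd CPU counts is an assumption (the changelog does not say how the result is rounded).

```python
import os


def auto_threads(cpus=None):
    """Hypothetical helper mirroring the 'cpus*1.5' rule from the changelog."""
    if cpus is None:
        # os.cpu_count() may return None when the count cannot be determined
        cpus = os.cpu_count() or 1
    # Assumption: fractional results are rounded down, with a floor of 1 thread
    return max(1, cpus * 3 // 2)


if __name__ == "__main__":
    print("suggested thread count:", auto_threads())
```

With 2 CPUs this gives 3 threads, which is the thread count used for the dual-core runs below.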


Here are some results showing cpu*1.5:
x264_610.exe --threads # -B5000 -m6 -r5 --direct=temporal --me=hex -b2 -w --qcomp=0.10 -A"p8x8,i8x8,i4x4" -8 --fps=25 --output NUL 720p50_mobcal_ter.yuv 1280x720
x264_606.exe --threads # -B5000 -m6 -r5 --direct=temporal --me=hex -b2 -w --qcomp=0.10 -A"p8x8,i8x8,i4x4" -8 --fps=25 --output NUL 720p50_mobcal_ter.yuv 1280x720

Intel D930 3.0GHz (2 cores, no HT):
610 2 threads: 3.89
610 3 threads: 5.41
606 2 threads: 4.63
606 3 threads: 4.46

P4 xeon 3.06GHz (2 cores, HT):
610 4 threads: 4.24
610 5 threads: 4.63
606 4 threads: 3.98
606 5 threads: 4.01

AMD X2 3800+ 2.55GHz (2 cores):
610 2 threads: 4.07
610 3 threads: 5.24
606 2 threads: 4.85
606 3 threads: 4.78

C2D E6400 3.4GHz (2 cores):
610 2 threads: 8.78
610 3 threads: 12.35
606 2 threads: 9.52
606 3 threads: 8.34
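
To put a number on the gain per CPU, here is a small Python sketch that divides the best r610 figure by the best r606 figure for each machine. The numbers are copied from the results above; the dictionary layout and the helper names `best` and `gain` are just for illustration, and the figures are assumed to be higher-is-better (presumably fps, the post does not label them).

```python
# Figures copied from the results above: {cpu: {(revision, threads): figure}}
RESULTS = {
    "Intel D930": {(610, 2): 3.89, (610, 3): 5.41, (606, 2): 4.63, (606, 3): 4.46},
    "P4 Xeon": {(610, 4): 4.24, (610, 5): 4.63, (606, 4): 3.98, (606, 5): 4.01},
    "AMD X2 3800+": {(610, 2): 4.07, (610, 3): 5.24, (606, 2): 4.85, (606, 3): 4.78},
    "C2D E6400": {(610, 2): 8.78, (610, 3): 12.35, (606, 2): 9.52, (606, 3): 8.34},
}


def best(runs, revision):
    """Highest figure recorded for one revision, over all thread counts."""
    return max(value for (rev, _threads), value in runs.items() if rev == revision)


def gain(runs):
    """Best r610 figure relative to the best r606 figure, as a ratio."""
    return best(runs, 610) / best(runs, 606)


if __name__ == "__main__":
    for cpu, runs in RESULTS.items():
        print(f"{cpu}: {gain(runs):.2f}x")
```

Note that this compares each revision at its own best thread count, so r610 is taken at cpus*1.5 while r606 is usually taken at one thread per core.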

I have more results from earlier patches:
http://x264.nl/results.txt (cv = pthreads)

Not sure why, but Intel Core 2 Duo users can be happy with the big speed-up.

Note that, as the changelog says, there are new options: --pre-scenecut, --mvrange-thread, --non-deterministic.
Go MeGUI... and other GUIs!

Last edited by bob0r; 16th December 2006 at 02:58. Reason: i like to edit stuff
16th December 2006, 02:21   #2
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
What about the point akupenguin once mentioned, that splitting into more threads also decreases quality (slightly)? Does this still apply now?
16th December 2006, 02:40   #3
akupenguin
x264 developer
Join Date: Sep 2004
Posts: 2,392
Yes, but not as much as before. Hence the "reduces the bitrate penalty".

Code:
cpu: 4 core Xeon 5160

threads    speed        psnr loss 
        r606   r611    r606   r611
 1:    1.000x 1.000x   0.000  0.000
 2:    1.540x 1.739x  -0.036 -0.004
 3:    1.838x 2.384x  -0.065 -0.002
 4:    2.043x 3.224x  -0.077 -0.005
 5:    2.028x 3.512x  -0.110 -0.009
 6:    2.034x 3.629x  -0.132 -0.009
 7:    1.988x 3.680x  -0.151 -0.015
 8:    1.953x 3.702x  -0.188 -0.017
 9:    2.016x 3.729x  -0.210 -0.020
10:    1.995x 3.742x  -0.233 -0.031
11:    1.954x 3.749x  -0.255 -0.030
12:    1.909x 3.765x  -0.268 -0.040
13:    1.895x 3.770x  -0.286 -0.045
14:    1.936x 3.759x  -0.313 -0.046
15:    1.897x 3.781x  -0.335 -0.045
16:    1.845x 3.765x  -0.349 -0.046

scaling efficiency (speed / #cores):
r606: 51%
r611: 94%

The above was measured on Linux. We had a lot of trouble getting efficient thread synchronization on Windows, and the performance there might still not be quite as good.
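
A minimal Python sketch of the "speed / #cores" figure, using the speed columns from the table above. It assumes the peak speedup over all thread counts is the one divided by the 4 cores (that choice is a guess, the post does not say which row was used); under that assumption it lands within about a percentage point of the quoted 51% and 94%.

```python
CORES = 4  # the 4-core Xeon 5160 from the table

# Speed relative to 1 thread, copied from the table (index 0 = 1 thread)
R606 = [1.000, 1.540, 1.838, 2.043, 2.028, 2.034, 1.988, 1.953,
        2.016, 1.995, 1.954, 1.909, 1.895, 1.936, 1.897, 1.845]
R611 = [1.000, 1.739, 2.384, 3.224, 3.512, 3.629, 3.680, 3.702,
        3.729, 3.742, 3.749, 3.765, 3.770, 3.759, 3.781, 3.765]


def scaling_efficiency(speedups, cores=CORES):
    """Peak speedup divided by the number of cores (assumed definition)."""
    return max(speedups) / cores


if __name__ == "__main__":
    print(f"r606: {scaling_efficiency(R606):.1%}")
    print(f"r611: {scaling_efficiency(R611):.1%}")
```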

Last edited by akupenguin; 16th December 2006 at 07:44.
16th December 2006, 03:31   #4
Sharktooth
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
So... the MeGUI automatic thread-count detection should be updated.
Also, what's the exact usage of the new options? What do they actually do?
Is Haali's AQ patch working correctly after those changes (it applies almost correctly, btw...)?

for builders: pthread for win32 is here -> ftp://sources.redhat.com/pub/pthreads-win32

Last edited by Sharktooth; 16th December 2006 at 03:49.
16th December 2006, 03:58   #5
bob0r
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
@builders
or here:
cvs -d :pserver:anoncvs@sources.redhat.com:/cvs/pthreads-win32 checkout pthreads
make clean GC-static
copy pthreads/*.h to include dir
copy libpthreadGC2.a to lib dir

@sharktooth:
You can just use --threads auto for MeGUI, as x264.exe will detect the number of CPUs and set threads to that number *1.5. (Using HT might give one or two extra threads, but the quality will not be affected much at all.)

I have no clue about the exact usage of the new options; this is from x264.exe's help:
--pre-scenecut          Faster, less precise scenecut detection.
                        Required and implied by multi-threading.

--mvrange-thread <int>  Minimum buffer between threads [-1 (auto)]

--non-deterministic     Slightly improve quality of SMP, at the cost of repeatability

Last edited by bob0r; 16th December 2006 at 04:02.
16th December 2006, 04:13   #6
akupenguin
x264 developer
Join Date: Sep 2004
Posts: 2,392
Short answer: you shouldn't need to use any of the new options.
--pre-scenecut might possibly be useful for fast 1pass single-threaded encodes, but is mostly just so I can compare the two scenecut algorithms without invoking the other threading stuff.
16th December 2006, 04:30   #7
Sharktooth
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
I'm just updating MeGUI... I removed the ability to set the number of threads within the x264 profile (up-down control grayed out),
so --threads auto will be enforced when the "Automatically set the number of threads" global option is enabled in the MeGUI settings.
It's just a workaround but should work as expected.
16th December 2006, 05:02   #8
Sharktooth
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
I'm reorganizing the MeGUI version numbers. A new MeGUI build is coming. Just be patient...
16th December 2006, 05:02   #9
Adub
Fighting spam with a fish
Join Date: Sep 2005
Posts: 2,699
Uh, sharktooth? I just downloaded the new version, and those settings aren't greyed out. In addition MeGUI still sets the threads manually. Like I am using 2 threads right now according to the log for the first pass.

Edit: look at the post times! That should explain it.
__________________
FAQs:Bond's AVC/H.264 FAQ
Site:Adubvideo

Last edited by Adub; 16th December 2006 at 05:05.
16th December 2006, 05:03   #10
Audionut
Registered User
 
Join Date: Nov 2003
Posts: 1,281
Thanks. This build is a lot faster on a Core 2.
16th December 2006, 05:12   #11
Zerofool
VR, 3D & HDR UHD fan
 
Join Date: Mar 2006
Location: Sofia, Bulgaria
Posts: 53
Theoretically, will this have any (positive) effect on single-core CPUs (set at 2 threads)? Does it make any sense to use it that way?
(I'll try it tonight anyway.)
16th December 2006, 05:13   #12
Sharktooth
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
MeGUI 0.2.3.2193 is up and running. Enjoy.
Tomorrow I'll update the Trunk version (0.2.4.0000) with a more elegant solution.

@bob0r and other builders: DO AN SVN CHECKOUT and use the sources in Tags/2193 for building.

Last edited by Sharktooth; 16th December 2006 at 05:25.
16th December 2006, 05:17   #13
Audionut
Registered User
 
Join Date: Nov 2003
Posts: 1,281
And working like a charm. Thanks.
16th December 2006, 05:19   #14
Adub
Fighting spam with a fish
Join Date: Sep 2005
Posts: 2,699
I can't update to 2193. I mean I downloaded it, it asks me to restart the program. It does, but it still says 2192 at the top and the x264 options aren't greyed out.
16th December 2006, 05:21   #15
Sharktooth
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
Uhm... reboot. The .NET cache is doing something wrong.
Now I'm going to sleep (5:21 AM).
If you still have problems, uninstall MeGUI and install a fresh version, then let it autoupdate.

Last edited by Sharktooth; 16th December 2006 at 05:23.
16th December 2006, 05:38   #16
woah!
Registered User
 
Join Date: Oct 2003
Posts: 435
I use ELDER now for my encodes, but I see this "bug" in the release that I thought might need a look:

x264.nl REV611: http://mirror01.x264.nl/x264/revision611/x264.exe

H:\MCHD encodes\ELDER RATIO.1.77.1\905029>..\x264.exe -o test3.264 --ref 5 --8x8dct --mixed-refs --no-fast-pskip --bframes 3 --bime --weightb --direct auto --merange 12 --subme 6 --analyse all --me umh --colormatrix bt709 --filter -2:-1 --pass 2 --stats stageC-3.stats --progress -B 1377 stageA-3.avs
avis [info]: 720x400 @ 23.98 fps (357 frames)
x264 [info]: using cpu capabilities MMX MMXEXT SSE SSE2
x264 [info]: slice I:5 Avg QP:20.60 size: 16678 PSNR Mean Y:46.04 U:47.30 V:47.27 Avg:46.40 Global:45.94
x264 [info]: slice P:268 Avg QP:22.98 size: 8262 PSNR Mean Y:44.48 U:46.08 V:45.76 Avg:44.89 Global:44.22
x264 [info]: slice B:84 Avg QP:23.46 size: 3583 PSNR Mean Y:44.30 U:46.71 V:46.87 Avg:44.96 Global:44.21
x264 [info]: mb I I16..4: 22.0% 63.3% 14.7%
x264 [info]: mb P I16..4: 15.7% 25.7% 4.5% P16..4: 30.2% 7.1% 2.0% 0.2% 0.1% skip:14.5%
x264 [info]: mb B I16..4: 1.7% 4.0% 2.2% B16..8: 31.7% 1.6% 2.5% direct: 2.8% skip:53.5%
x264 [info]: 8x8 transform intra:56.0% inter:50.3%
x264 [info]: direct mvs spatial:0.0% temporal:100.0%
x264 [info]: ref P 73.7% 13.5% 8.0% 2.7% 2.2%
x264 [info]: ref B 86.8% 6.1% 3.7% 1.9% 1.5%
x264 [info]: SSIM Mean Y:0.9766733
x264 [info]: PSNR Mean Y:44.460 U:46.243 V:46.042 Avg:44.927 Global:44.239 kb/s:1396.13

encoded 357 frames, 5.57 fps, 1396.55 kb/s
profiling:F:\msys\home\user\x264/x264.gcda:Cannot open
profiling:F:\msys\home\user\x264/matroska.gcda:Cannot open
profiling:F:\msys\home\user\x264/muxers.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/encoder.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/common.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/mdate.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/set.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/macroblock.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/set.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/ratecontrol.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/frame.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/pixel.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/macroblock.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/cpu.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/analyse.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/mc.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/cabac.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/cavlc.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/cabac.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/dct.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/quant.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/csp.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/predict.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/eval.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/i386/predict-c.gcda:Cannot open
profiling:F:\msys\home\user\x264/encoder/me.gcda:Cannot open
profiling:F:\msys\home\user\x264/common/i386/mc-c.gcda:Cannot open

H:\MCHD encodes\ELDER RATIO.1.77.1\905029>echo.1>stageC-result3.ready




===========================================



Sharktooth's REV611: http://mirror05.x264.nl/Sharktooth/?dir=./x264


H:\MCHD encodes\ELDER RATIO.1.77.1\311401>..\x264.exe -o test3.264 --ref 5 --8x8dct --mixed-refs --no-fast-pskip --bframes 3 --bime --weightb --direct auto --merange 12 --subme 6 --analyse all --me umh --colormatrix bt709 --filter -2:-1 --pass 2 --stats stageC-3.stats --progress -B 1377 stageA-3.avs
avis [info]: 720x400 @ 23.98 fps (357 frames)
x264 [info]: using cpu capabilities MMX MMXEXT SSE SSE2
x264 [info]: slice I:5 Avg QP:20.60 size: 16678 PSNR Mean Y:46.04 U:47.30 V:47.27 Avg:46.40 Global:45.94
x264 [info]: slice P:268 Avg QP:22.98 size: 8262 PSNR Mean Y:44.48 U:46.08 V:45.76 Avg:44.89 Global:44.22
x264 [info]: slice B:84 Avg QP:23.46 size: 3583 PSNR Mean Y:44.30 U:46.71 V:46.87 Avg:44.96 Global:44.21
x264 [info]: mb I I16..4: 22.0% 63.3% 14.7%
x264 [info]: mb P I16..4: 15.7% 25.7% 4.5% P16..4: 30.2% 7.1% 2.0% 0.2% 0.1% skip:14.5%
x264 [info]: mb B I16..4: 1.7% 4.0% 2.2% B16..8: 31.7% 1.6% 2.5% direct: 2.8% skip:53.5%
x264 [info]: 8x8 transform intra:56.0% inter:50.3%
x264 [info]: direct mvs spatial:0.0% temporal:100.0%
x264 [info]: ref P 73.7% 13.5% 8.0% 2.7% 2.2%
x264 [info]: ref B 86.8% 6.1% 3.7% 1.9% 1.5%
x264 [info]: SSIM Mean Y:0.9766733
x264 [info]: PSNR Mean Y:44.460 U:46.243 V:46.042 Avg:44.927 Global:44.239 kb/s:1396.13

encoded 357 frames, 6.19 fps, 1396.55 kb/s

H:\MCHD encodes\ELDER RATIO.1.77.1\311401>echo.1>stageC-result3.ready


Sharktooth's version works well, but the other one, as you can see, isn't built correctly (I think).

anyways...
16th December 2006, 05:44   #17
bob0r
Pain and suffering
Join Date: Jul 2002
Posts: 1,337
Updated revision 611 coming right up

wrong 611 x264.exe md5: 2d738a38420fb2d92154556a1aa29c70
correct 611 x264.exe md5: b64eb9012bd24e2cd4f616be482c5a31

In about 30 minutes a proper version should be online, without the half-as*-ed make fprofiled version
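
To check which of the two r611 builds you downloaded, here is a minimal Python sketch that compares a file's md5 against the two sums above. The helper names `md5_of` and `check_build` are made up for illustration; only the two hash values come from this post.

```python
import hashlib

# md5 sums quoted above for the two r611 x264.exe uploads
WRONG_MD5 = "2d738a38420fb2d92154556a1aa29c70"
CORRECT_MD5 = "b64eb9012bd24e2cd4f616be482c5a31"


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_build(path):
    """Classify a file as the 'correct' or 'wrong' r611 build, else 'unknown'."""
    file_md5 = md5_of(path)
    if file_md5 == CORRECT_MD5:
        return "correct"
    if file_md5 == WRONG_MD5:
        return "wrong"
    return "unknown"
```

Usage would be something like `check_build("x264.exe")`; "unknown" just means the file is neither of the two uploads listed here (a different revision or someone else's build).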

Last edited by bob0r; 16th December 2006 at 06:13. Reason: correct file update
16th December 2006, 06:36   #18
huang_ch
Registered User
 
Join Date: Jul 2004
Posts: 169
Great job, thanks to all the x264 developers for this. Below is a quick test from me:
Hardware: Conroe E6600@2.4GHz

Build        FPS     Kbps      PSNR
b606(1t)     9.76    1200.92   47.015
b606(2t)     14.28   1214.10   47.008
b611(1t)     9.37    1229.90   47.203
b611(2t)     15.30   1242.54   47.248
b611(auto)   17.45   1242.29   47.248
b611(4t)     17.97   1241.65   47.247

It's good to see no PSNR penalty now, and CPU usage is mostly >90%.
16th December 2006, 06:48   #19
Adub
Fighting spam with a fish
Join Date: Sep 2005
Posts: 2,699
You're getting >90%. I am only reaching about 83%. But that is still the first pass. Plus I am Folding in the background.
16th December 2006, 14:02   #20
Sharktooth
Mr. Sandman
Join Date: Sep 2003
Location: Haddonfield, IL
Posts: 11,768
Quote:
Originally Posted by bob0r
Updated revision 611 coming right up

wrong 611 x264.exe md5: 2d738a38420fb2d92154556a1aa29c70
correct 611 x264.exe md5: b64eb9012bd24e2cd4f616be482c5a31

In about 30 minutes a proper version should be online, without the half-as*-ed make fprofiled version

my build is fprofiled ...

@Merlin: .NET cache sometimes does weird things... and when it happens it's always a PITA