Current Patches, Where to get them, How they affect speed/output
puffpio
11th March 2009, 21:25
is there much of a difference w/ -march=pentium2 vs core2?
and for the 64 bit version, shouldn't the minimum -march be k8 or nocona?
LoRd_MuldeR
11th March 2009, 21:28
is there much of a difference w/ -march=pentium2 vs core2?
Nope. Most (all?) performance-critical functions in x264 are written as hand-optimized assembler code.
x264 will detect your CPU capabilities at runtime and it will choose the optimized assembler functions accordingly.
Compiler optimizations matter for the pure C code only...
komisar
11th March 2009, 21:59
LoRd_MuldeR, sure, but if the gcc libs are compiled with -march, we get ~1-5 more fps...
puffpio, a little later I'll compile x264 for generic (pentium2), k8, and core2. The minimum for x86_64 is "-mtune=generic" (without -march).
Dark Shikari
11th March 2009, 22:04
LoRd_MuldeR, sure, but if gcc-libs compiled with -march -- we get ~1-5 more fps...
Yes, because of cmov.
akupenguin
12th March 2009, 16:40
1-5 more fps
1-5 out of what? There's a big difference between gaining 5fps if it was 1 before, vs gaining 5 from 1000.
No one should ever post any speed results in the form of a diff without a baseline. The only valid forms are ratios.
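To make the point about ratios concrete, here is a minimal sketch. The function name and the fps figures are illustrative assumptions taken from the wording above (a gain of 5 fps from 1 vs. from 1000), not measurements of any build.

```python
def speedup(baseline_fps: float, new_fps: float) -> float:
    """Return new/baseline as a ratio; 1.0 means no change."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return new_fps / baseline_fps


if __name__ == "__main__":
    # The same "+5 fps" means very different things depending on the baseline.
    for baseline in (1.0, 1000.0):
        ratio = speedup(baseline, baseline + 5.0)
        print(f"{baseline:g} fps -> {baseline + 5.0:g} fps: {ratio:.3f}x")
```

Reported this way, a result carries its own baseline and can be compared across machines and clips.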
vmrsss
14th March 2009, 22:23
hi everybody.
On the theme of speed differences, would there be much difference using gcc-4.2.1 rather than gcc-4.3.3 for x86_64? (For -march=core2 rather than nothing, I assume the answer is above, that is, a few fps due to cmov.)
In general, how would I go about timing one build of x264 against another?
Thx.
vmrsss
14th March 2009, 22:34
PS. What does make fprofiled do for you?
J_Darnley
14th March 2009, 23:18
1 - On *nix: use "time", or modify x264 to print the time
On Windows: modify x264 to print the time
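A third option that works on both platforms is to time the whole process from outside. The following is only a sketch: the helper name, the best-of-N choice, and the x264 command lines in the comments are assumptions, and the demo times a trivial Python command so that it runs anywhere.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Placeholder command. For a real comparison, time two builds on the same
    # clip with the same options, e.g. (hypothetical file names):
    #   ["./x264_build_a", "-o", "a.264", "clip.y4m"]
    #   ["./x264_build_b", "-o", "b.264", "clip.y4m"]
    demo = [sys.executable, "-c", "pass"]
    print(f"best of 3: {time_command(demo):.3f} s")
```

Dividing the two best times gives a ratio, which is the form of result asked for earlier in the thread.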
imk
15th March 2009, 00:59
r1127M ICC Builds:
SSE2 x32 (http://imk.cx/pc/x264/x264.r1127M.SSE2.x32.imk.exe) (w/MP4 Output (http://imk.cx/pc/x264/x264.r1127M.SSE2.x32.mp4.imk.exe))
SSSE3 x32 (http://imk.cx/pc/x264/x264.r1127M.SSSE3.x32.imk.exe) (w/MP4 Output (http://imk.cx/pc/x264/x264.r1127M.SSSE3.x32.mp4.imk.exe))
SSE2 x64 (http://imk.cx/pc/x264/x264.r1127M.SSE2.x64.imk.exe) (w/MP4 Output (http://imk.cx/pc/x264/x264.r1127M.SSE2.x64.mp4.imk.exe))
SSSE3 x64 (http://imk.cx/pc/x264/x264.r1127M.SSSE3.x64.imk.exe) (w/MP4 Output (http://imk.cx/pc/x264/x264.r1127M.SSSE3.x64.mp4.imk.exe))
Older builds, build scripts, information, etc. can all be found here (http://imk.cx/pc/x264/).
LoRd_MuldeR
15th March 2009, 01:32
PS. What does make fprofiled do for you?
It will analyze the code in execution and then apply additional optimizations.
Since "make fprofiled" will encode the specified video clip several times (to cover all x264 options/codepaths), it will take much longer to build...
http://en.wikipedia.org/wiki/Profiler_(computer_science)
vmrsss
16th March 2009, 16:07
Thx. What criteria are useful when picking the video clips to feed make fprofiled?
LoRd_MuldeR
16th March 2009, 19:15
Thx. What criteria are useful when picking the video clips to feed make fprofiled?
I think you should pick something that resembles the kind of video that you are going to encode. A few hundred frames should be enough.
Choosing a "synthetic" clip, like solid black or random noise, wouldn't give optimal profiling results, I guess...
skystrife
18th March 2009, 21:38
x264 r1128 x64 (unpatched) (http://www.mediafire.com/?wnjg5utuzin) - Alternate Download (http://skystrife.com/x264/revision1128/x264.exe)
gcc 4.3.4 fprofiled build.
-------------------------
x264.1128M.x86.exe (http://www.mediafire.com/?mytyyrkwchg) - Alternate Download (http://skystrife.com/x264/x264.1128M.x86.exe) / x264.1128M.x64.exe (http://www.mediafire.com/?mmzmmmmnmxt) - Alternate Download (http://skystrife.com/x264/x264.1128M.x64.exe)
gcc 3.4.5 fprofiled build with -march=pentium2. / gcc 4.3.4 fprofiled build.
Patches used:
x264_hrd_pulldown.10_interlace.diff
x264_win_zone_parse_fix_05.diff
techouse
19th March 2009, 23:37
x264_x86_r1129_techouse (http://techouse.digitalpulse.us/builds/x264_x86_r1129_techouse.7z) | INFO (http://techouse.digitalpulse.us/nfo/x264_x86_r1129_techouse.txt)
GCC 4.3.3, fprofiled, -march=core2
x264_x64_r1129_techouse (http://techouse.digitalpulse.us/builds/x264_x64_r1129_techouse.7z) | INFO (http://techouse.digitalpulse.us/nfo/x264_x64_r1129_techouse.txt)
GCC 4.3.4 20090220 (prerelease) (x64.generic.Komisar), fprofiled, -march=core2
Patches used:
x264_hrd_pulldown.10_interlace.diff
x264_win_zone_parse_fix_05.diff
Kurtnoise
20th March 2009, 02:46
Patches used:
x264_hrd_pulldown.10_interlace.diff
x264_win_zone_parse_fix_05.diff
can I have a link for those, please ?
LoRd_MuldeR
20th March 2009, 03:54
can I have a link for those, please ?
http://forum.doom9.org/showpost.php?p=1257673&postcount=1726
http://forum.doom9.org/showpost.php?p=1230476&postcount=1535
:search:
Sharktooth
25th March 2009, 18:16
Request: updated skystrife's builds.
Trahald
26th March 2009, 00:20
Here is the HRD patch release 11. I rewrote most of the calculations. Because most/all of the value calculations have changed, I'd consider 11 an alpha until it's been around a bit.
squid_80
26th March 2009, 02:00
Still looks like it's got a typo to me (TBT,BT,BTB,BT).
Trahald
26th March 2009, 03:51
Still looks like it's got a typo to me (TBT,BT,BTB,BT).
HAH. now fixed.
bob0r
26th March 2009, 19:37
Request: updated skystrife's builds.
Hmm, still not active.
Maybe we can use techouse's builds? Or someone else can make them?
Also i would need x264 64bit (gpac/pthreads also 64bit) + fprofiled on the same 64bit system.
If someone can do this, all you need to do is host them on http. Example: http://yourdomain.com/x264/revisionXXXX/
Files inside revisionXXXX should be x264.exe and x264.md5
( To create the md5 for x264.nl: md5sum x264.exe | awk '{print $1}' >x264.md5 )
My mirror script then checks for the x264.md5 file every hour and mirrors the build once it appears.
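For builders without md5sum and awk at hand, the same x264.md5 file can be produced with a few lines of Python. This is a sketch of an equivalent, not part of the mirror setup described above; the function names are made up for illustration, and the demo hashes a throwaway file rather than a real build.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_hex(data: bytes) -> str:
    """Return the lowercase hex MD5 digest, i.e. the first field md5sum prints."""
    return hashlib.md5(data).hexdigest()


def write_md5_file(exe_path, md5_path) -> str:
    """Write only the digest plus a newline to md5_path, like the awk one-liner."""
    digest = md5_hex(Path(exe_path).read_bytes())
    Path(md5_path).write_text(digest + "\n")
    return digest


if __name__ == "__main__":
    # Demo on a throwaway file; a real run would point at the built x264.exe.
    with tempfile.TemporaryDirectory() as tmp:
        exe = Path(tmp) / "x264.exe"
        exe.write_bytes(b"not a real binary")
        print(write_md5_file(exe, Path(tmp) / "x264.md5"))
```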
LoRd_MuldeR
26th March 2009, 19:40
Why do you worry so much? You have an r1128 build up, and the differences between r1128 and r1129 are very minor...
techouse
27th March 2009, 11:43
Hmm, still not active.
Maybe we can use techouse's builds? Or someone else can make them?
Also i would need x264 64bit (gpac/pthreads also 64bit) + fprofiled on the same 64bit system.
If someone can do this, all you need to do is host them on http. Example: http://yourdomain.com/x264/revisionXXXX/
Files inside revisionXXXX should be x264.exe and x264.md5
( To create the md5 for x264.nl: md5sum x264.exe | awk '{print $1}' >x264.md5 )
My mirror script then checks for the x264.md5 file every hour and mirrors the build once it appears.
Sure, you can use mine :) I build&profile them on Vista x64, but i'm not sure how you want me to host the builds on my servers.
Cheers ;)
techouse
27th March 2009, 12:35
x264_x86_r1130_techouse (http://techouse.project357.com/builds/x264_x86_r1130_techouse.7z) | INFO (http://techouse.project357.com/nfo/x264_x86_r1130_techouse.txt)
GCC 4.3.3, fprofiled, -march=core2
x264_x64_r1130_techouse (http://techouse.project357.com/builds/x264_x64_r1130_techouse.7z) | INFO (http://techouse.project357.com/nfo/x264_x64_r1130_techouse.txt)
GCC 4.3.4 20090220 (prerelease) (x64.generic.Komisar), fprofiled, -march=core2
Patches used:
x264_hrd_pulldown.11_interlace.diff
x264_win_zone_parse_fix_05.diff
P.S.: I put md5 checksums in the 7-Zip files this time ;)
Trahald
27th March 2009, 12:43
@techouse Your x64 link links to the x86 file and vice versa
techouse
27th March 2009, 12:51
@techouse Your x64 link links to the x86 file and vice versa
Thanx! I am in a hurry cause I've got a train to catch.... ttyl ;)
bob0r
27th March 2009, 18:49
Sure, you can use mine :) I build&profile them on Vista x64, but i'm not sure how you want me to host the builds on my servers.
Cheers ;)
Very simple: Build x264 64bit (gpac/pthreads 64bit also) from git only, no patches... fprofiled.
Then put the files on any http host, as long as you create the dir revision1130 containing x264.exe and x264.md5
so example:
http://techouse.project357.com/builds/revision1130/x264.exe
http://techouse.project357.com/builds/revision1130/x264.md5
Then all i have to do is edit:
fixedurl=http://techouse.project357.com/builds/revision in my script and the files get mirrored to x264.nl.
You can also send me a private http link if bandwidth is an issue (or I'll create a host for you, but I think we have no problems with bandwidth :))
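The directory layout described above can be sketched as follows. The actual mirror script is not shown in this thread, so the function names and the checksum comparison are assumptions; only the fixedurl prefix, the revisionXXXX directory, and the two file names come from the post.

```python
import hashlib


def mirror_urls(fixedurl: str, revision: int):
    """Build the two URLs for one revision from the fixedurl prefix."""
    base = f"{fixedurl}{revision}"
    return f"{base}/x264.exe", f"{base}/x264.md5"


def checksum_ok(exe_bytes: bytes, md5_text: str) -> bool:
    """Compare a downloaded binary against the contents of its x264.md5 file."""
    return hashlib.md5(exe_bytes).hexdigest() == md5_text.strip().lower()


if __name__ == "__main__":
    exe_url, md5_url = mirror_urls(
        "http://techouse.project357.com/builds/revision", 1130
    )
    print(exe_url)
    print(md5_url)
```

Keeping the revision number as the only varying part is what lets a single fixedurl setting cover every future build.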
techouse
28th March 2009, 17:52
Very simple: Build x264 64bit (gpac/pthreads 64bit also) from git only, no patches... fprofiled.
And then put the files on any http host, as long as you create the dir: revision1130 with inside x264.exe and x264.md5
so example:
http://techouse.project357.com/builds/revision1130/x264.exe
http://techouse.project357.com/builds/revision1130/x264.md5
Then all i have to do is edit:
fixedurl=http://techouse.project357.com/builds/revision in my script and the files get mirrored to x264.nl.
You can also send me a private http link if bandwidth is an issue (or I'll create a host for you, but I think we have no problems with bandwidth :))
OK, I'll do it as soon as I get to the campus ;) (that's tomorrow somewhere around 23:00 CET)
P.S.: Do you only need 64bit or both?
bob0r
29th March 2009, 17:07
...
P.S.: Do you only need 64bit or both?
Only 64bit, the x264.nl compile system is 32bit.
techouse
31st March 2009, 06:24
Only 64bit, the x264.nl compile system is 32bit.
Sorry it took so long, but I had some work with tax reports. Anyway, I'll be on the campus around 12:00 CEST, so expect the build somewhere around 13:00 CEST.
Here you go:
http://techouse.project357.com/builds/revision1134/x264.exe
http://techouse.project357.com/builds/revision1134/x264.md5
(unpatched generic & fprofiled 64bit build, GCC 4.3.4 20090220 (prerelease) (x64.generic.Komisar))
techouse
31st March 2009, 12:24
x264_x86_r1134_techouse (http://techouse.project357.com/builds/x264_x86_r1134_techouse.7z) | INFO (http://techouse.project357.com/nfo/x264_x86_r1134_techouse.txt)
GCC 4.3.3, fprofiled, -march=core2
x264_x64_r1134_techouse (http://techouse.project357.com/builds/x264_x64_r1134_techouse.7z) | INFO (http://techouse.project357.com/nfo/x264_x64_r1134_techouse.txt)
GCC 4.3.4 20090220 (prerelease) (x64.generic.Komisar), fprofiled, -march=core2
Patches used:
x264_hrd_pulldown.11_interlace.diff
x264_win_zone_parse_fix_05.diff
Betsy25
31st March 2009, 19:39
Didn't x264 use all cores by default then ? :confused:
LoRd_MuldeR
31st March 2009, 19:53
Didn't x264 use all cores by default then ? :confused:
No, it doesn't and it never did! The default setting is "--threads 1" (that is: multi-threading = OFF).
To use multiple cores, you must either set "--threads auto" or manually specify "--threads n", where n = cores * 3/2.
ajp_anton
31st March 2009, 20:02
To use multiple cores, you must either set "--threads auto" or manually specify "--threads n", where n = cores * 3/2.
"--threads auto" = "--threads 0" = "--threads [cores*3/2]"
However, you can specify whatever number you want (you make it sound like you *need* cores*3/2 =))
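As a quick illustration of the cores*3/2 rule, here is a sketch of the arithmetic. It assumes integer division (rounding down) and ignores any upper limit x264 itself may apply; the function name is made up and this is not x264's code.

```python
def auto_threads(cores: int) -> int:
    """Thread count under the cores*3/2 rule, rounded down, at least 1."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return max(1, cores * 3 // 2)


if __name__ == "__main__":
    for cores in (1, 2, 3, 4, 8):
        print(f"{cores} cores -> --threads {auto_threads(cores)}")
```

Any other value can still be passed to "--threads" by hand; the rule only describes what "auto" picks.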
Betsy25
31st March 2009, 20:59
"--threads auto" = "--threads 0" = "--threads [cores*3/2]"
However, you can specify whatever number you want (you make it sound like you *need* cores*3/2 =))
So, you actually replace an application that could please everyone with one that leaves a lot of people out in the cold?
Astrophizz
31st March 2009, 21:08
I would guess that anyone who wouldn't know about the command line settings would be using a frontend, which would most likely set threads to auto.
LoRd_MuldeR
31st March 2009, 21:21
I would guess that anyone who wouldn't know about the command line settings would be using a frontend, which would most likely set threads to auto.
Exactly. If people use x264 from the commandline, they should be intelligent enough to type "x264 --help" and read :sly:
For the average user, who doesn't want to learn about CLI parameters, we have dozens of GUI's available...
Betsy25
31st March 2009, 21:34
Exactly. If people use x264 from the commandline, they should be intelligent enough to type "x264 --help" and read :sly:
For the average user, who doesn't want to learn about CLI parameters, we have dozens of GUI's available...
Is there actually anything positive about this change ? By removing the --threads option, does the program run faster ? Anything we don't know ? Or is it just some april fools release ?
Dark Shikari
31st March 2009, 21:45
Is there actually anything positive about this change ? By removing the --threads option, does the program run faster ? Anything we don't know ? Or is it just some april fools release ?
I have no idea what you're talking about.
LoRd_MuldeR
31st March 2009, 21:46
Is there actually anything positive about this change ? By removing the --threads option, does the program run faster ? Anything we don't know ? Or is it just some april fools release ?
What change are you talking about? The "--threads n" parameter has worked as described above ever since x264 got the ability to do multi-threading.
At least it hasn't changed in the last ~2 years. And of course that parameter was not removed recently...
Didn't x264 use all cores by default then ? :confused:
Are you referring to "-march=core2" in techouse's post? That GCC switch enables compiler optimizations for Intel Core 2 processor architecture. It has nothing to do with multithreading.
Betsy25
31st March 2009, 23:01
Are you referring to "-march=core2" in techouse's post? That GCC switch enables compiler optimizations for Intel Core 2 processor architecture. It has nothing to do with multithreading.
Exactly, I was talking about that one. Sorry for the confusion, oh I'm just a newb.:(
LoRd_MuldeR
31st March 2009, 23:13
Exactly, I was talking about that one. Sorry for the confusion, oh I'm just a newb.:(
Then please don't mix up compiler switches and x264 switches:
Compiler switches are used when you are building x264, while x264 switches are used when you are running x264.
As a normal user you only need to worry about the latter ones ;)
Some more info can be found here:
http://sites.google.com/site/linuxencoding/x264-ffmpeg-mapping
skystrife
1st April 2009, 00:18
x264 r1134 x64 (unpatched) (http://www.mediafire.com/?ntkgqriintz) - Alternate Download (http://skystrife.com/x264/revision1134/x264.exe)
gcc 4.3.4 fprofiled build.
-------------------------
x264.1134M.x86.exe (http://www.mediafire.com/?ygwnybz3gy3) - Alternate Download (http://skystrife.com/x264/x264.1134M.x86.exe) / x264.1134M.x64.exe (http://www.mediafire.com/?wydznnmomma) - Alternate Download (http://skystrife.com/x264/x264.1134M.x64.exe)
gcc 3.4.5 fprofiled build with -march=pentium2. / gcc 4.3.4 fprofiled build.
Patches used:
x264_hrd_pulldown.11_interlace.diff
x264_win_zone_parse_fix_05.diff
ajp_anton
1st April 2009, 09:51
Then please don't mix up compiler switches and x264 switches:
Compiler switches are used when you are building x264, while x264 switches are used when you are running x264.
As a normal user you only need to worry about the latter ones ;)
As a normal user he doesn't need to worry about either of them =)
BTW... maybe it was just a dream, but wasn't --thread-input made automatically active some time ago? And if not, is there any reason not to?
techouse
1st April 2009, 10:14
I think --thread-input is automatically active if you use --threads auto. Correct me if I'm wrong.
P.S.: LoL @ that Core2 confusion :D
techouse
4th April 2009, 14:10
Can anyone please run a stability test of this build on AMD64 and Core2? Thanx!
http://techouse.project357.com/builds/revision1136/x264.exe
http://techouse.project357.com/builds/revision1136/x264.md5
(unpatched generic & fprofiled 64bit build, GCC 4.3.4 20090328 (prerelease) (x64.generic.Komisar))
Sharktooth
4th April 2009, 14:55
I think --thread-input is automatically active if you use --threads auto. Correct me if I'm wrong.
true.
No, it doesn't and it never did! The default setting is "--threads 1" (that is: multi-threading = OFF).
To use multiple cores, you must either set "--threads auto" or manually specify "--threads n", where n = cores * 3/2.
any explanation regarding why the best parameter is 1.5 threads per core? Seems totally random to me, though I discovered that myself (still seems pathetic to me when 2 threads x264 takes only about 70% of two-core cpu and that's with high priority ;)
LoRd_MuldeR
4th April 2009, 17:36
any explanation regarding why the best parameter is 1.5 threads per core? Seems totally random to me, though I discovered that myself (still seems pathetic to me when 2 threads x264 takes only about 70% of two-core cpu and that's with high priority ;)
Because testing showed that "threads = 3/2 * cores" works best. And that finding doesn't seem to be random at all ;)
The intuitive solution "threads = cores" would only give optimal performance if no thread ever becomes idle (needs to wait for another thread).
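A toy model can illustrate the idle-thread argument. It is a deliberately crude sketch, not how x264 schedules work: it assumes every thread is runnable a fixed fraction of the time and ignores all threading overhead. The 0.7 figure is borrowed from the "about 70%" observation quoted above; the function name is made up.

```python
def utilization(threads: int, cores: int, busy_fraction: float) -> float:
    """Fraction of total CPU capacity kept busy under the toy model."""
    demand = threads * busy_fraction  # average number of runnable threads
    return min(cores, demand) / cores


if __name__ == "__main__":
    # Two cores, each thread waiting on others about 30% of the time.
    for n in (2, 3, 4):
        print(f"{n} threads -> {utilization(n, 2, 0.7):.0%} of 2 cores")
```

Under these assumptions two threads leave roughly 30% of a dual-core CPU idle, while three threads (cores * 3/2) are enough to keep both cores busy.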