
Old 11th July 2009, 09:15   #1961  |  Link
JEEB
もこたんインしたお!
 
Join Date: Jan 2008
Location: Finland / Japan
Posts: 512
x264 r1181 32bit
download ; release notes
  • built on Jul 11 2009, gcc: 4.3.3
  • fprofiled, -march=i686

x264 r1181 64bit
download ; release notes
  • built on Jul 11 2009, gcc: 4.3.4 20090220 (prerelease) (x64.generic.Komisar)
  • fprofiled, -march=core2

patched with:
__________________
[I'm human, no debug]
Old 11th July 2009, 10:32   #1962  |  Link
G_M_C
Registered User
 
Join Date: Feb 2006
Posts: 1,076
/me is hoping for an update of IMK's ICC / SSE4.x build
Old 11th July 2009, 11:03   #1963  |  Link
juGGaKNot
Registered User
 
Join Date: Feb 2008
Posts: 733
Quote:
Originally Posted by G_M_C View Post
/me is hoping for an update of IMK's ICC / SSE4.x build
What does it bring to the table that is so special? Is it for Intel CPUs?
Old 11th July 2009, 11:10   #1964  |  Link
G_M_C
Registered User
 
Join Date: Feb 2006
Posts: 1,076
Quote:
Originally Posted by juGGaKNot View Post
What does it bring to the table that is so special? Is it for Intel CPUs?
I find that ICC builds are slightly faster on my QX9650.
Old 11th July 2009, 11:42   #1965  |  Link
Fr4nz
Registered User
 
Join Date: Feb 2003
Posts: 448
Quote:
Originally Posted by G_M_C View Post
I find that ICC builds are slightly faster on my QX9650.
Unfortunately we'll have to wait because his video card is broken...read here (I received this message from him via Youtube yesterday):

Quote:
Originally Posted by imk
However, the video card in my computer died and I'm $120 short of a new one. Until I can replace that video card I can't compile any new builds. I'm pretty broke at the moment, so it could take a month or more until I set aside $120 for a card. I'll let you know when I get it replaced.
We'll wait
Old 11th July 2009, 14:56   #1966  |  Link
G_M_C
Registered User
 
Join Date: Feb 2006
Posts: 1,076
Quote:
Originally Posted by Fr4nz View Post
Unfortunately we'll have to wait because his video card is broken...read here (I received this message from him via Youtube yesterday):



We'll wait
I'll wait too. Bummer that the video card broke. I've never had one break in my system, but I don't OC video cards (only CPUs, which is easy with an unlocked "Extreme Edition").
Old 11th July 2009, 19:23   #1967  |  Link
rack04
Registered User
 
Join Date: Mar 2006
Posts: 1,538
What is the difference between -march=i686 and -march=core2?
Old 11th July 2009, 19:30   #1968  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
Quote:
Originally Posted by rack04 View Post
What is the difference between -march=i686 and -march=core2?
The C compiler is instructed to optimize the build either for an i686-family CPU or for an Intel Core2.
While the former allows the compiler to use the "PentiumPro" instruction set, the latter also allows the compiler to use SSE instructions (everything up to SSSE3).
Furthermore, with "-march=core2" the compiler will optimize for different CPU timings...

In short: The "-march=i686" build should run on every CPU, except for some archaic ones. The "-march=core2" build should run a bit faster on a Core2 Duo.
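To see what the -march choice changes for the plain C code, one can ask the compiler which SSE level it was allowed to use. This is only a minimal sketch, assuming a GCC-compatible compiler that predefines the __SSE__ to __SSSE3__ macros; the function name compiled_simd_level is made up for this example and is not part of x264.

```c
/* Report the highest SSE level the compiler may use for plain C code.
 * GCC-compatible compilers predefine these macros according to the
 * -march / -msse* flags given on the command line. */
const char *compiled_simd_level(void)
{
#if defined(__SSSE3__)
    return "SSSE3";
#elif defined(__SSE3__)
    return "SSE3";
#elif defined(__SSE2__)
    return "SSE2";
#elif defined(__SSE__)
    return "SSE";
#else
    return "none";
#endif
}
```

Built with -march=core2 this should report SSSE3, while a 32-bit build with -march=i686 and no extra -msse flags should report none.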

See for details:
http://gcc.gnu.org/onlinedocs/gcc-4....002d64-Options

Note that this only affects the plain C code in x264. All the "hand-optimized" assembler code is not affected by compiler optimizations at all!
Also note that x264 uses its own runtime CPU detection to decide which assembler functions will be used (or not used).

Compiler optimizations can squeeze out a bit more performance (ICC more than GCC), but the important speed-up happens in the assembler part of x264.
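The runtime detection mentioned above can be pictured as picking a function pointer once at startup. What follows is a rough sketch of that general technique, not x264's actual code: the flag names, the sad_* functions and select_sad are all hypothetical, and the "assembly" variants simply call the C version so the example stays portable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits; x264's real flags and names differ. */
enum { CPU_FLAG_SSE2 = 1 << 0, CPU_FLAG_SSSE3 = 1 << 1 };

typedef uint32_t (*sad_fn)(const uint8_t *a, const uint8_t *b, size_t n);

/* Plain C sum of absolute differences. Only code like this is
 * influenced by -march / -mtune. */
static uint32_t sad_c(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (uint32_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    return sum;
}

/* Stand-ins for hand-written assembly routines. In a real encoder these
 * would be separate asm files that the C compiler never touches. */
static uint32_t sad_sse2(const uint8_t *a, const uint8_t *b, size_t n)
{
    return sad_c(a, b, n);
}

static uint32_t sad_ssse3(const uint8_t *a, const uint8_t *b, size_t n)
{
    return sad_c(a, b, n);
}

/* Choose the best implementation from the flags the CPU reports at
 * run time, regardless of how the C code was compiled. */
sad_fn select_sad(unsigned cpu_flags)
{
    if (cpu_flags & CPU_FLAG_SSSE3)
        return sad_ssse3;
    if (cpu_flags & CPU_FLAG_SSE2)
        return sad_sse2;
    return sad_c;
}
```

Because the choice is made from the CPU's reported features, an i686 build and a core2 build would end up calling the same assembly routines on the same machine.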
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊

Last edited by LoRd_MuldeR; 11th July 2009 at 19:41.
Old 11th July 2009, 20:32   #1969  |  Link
kemuri-_9
Compiling Encoder
 
Join Date: Jan 2007
Posts: 1,348
pengvado (either here or on irc) stated that it's not icc's C compilation that is causing the speed up...
the cause is the icc equivalent to gcc's -mtune
__________________
custom x264 builds & patches | F@H | My Specs
Old 11th July 2009, 20:50   #1970  |  Link
G_M_C
Registered User
 
Join Date: Feb 2006
Posts: 1,076
Quote:
Originally Posted by kemuri-_9 View Post
pengvado (either here or on irc) stated that it's not icc's C compilation that is causing the speed up...
the cause is the icc equivalent to gcc's -mtune
Quote:
Originally Posted by LoRd_MuldeR View Post
[...]
Note that this only affects the plain C code in x264. All the "hand-optimized" assembler code is not affected by compiler optimizations at all!
Also note that x264 uses its own runtime CPU detection to decide which assembler functions will be used (or not used).

Compiler optimizations can squeeze out a bit more performance (ICC more than GCC), but the important speed-up happens in the assembler part of x264.
That's why I said that they're only slightly faster. But on clips with > 200.000 frames even "slightly" adds up to a measurable time saving.

Quote:
Originally Posted by G_M_C View Post
I find that ICC builds are slightly faster on my QX9650.
Old 11th July 2009, 22:26   #1971  |  Link
IgorC
Registered User
 
Join Date: Apr 2004
Posts: 1,315
Quote:
Originally Posted by G_M_C View Post
That's why I said that they're only slightly faster. But on clips with > 200.000 frames even "slightly" adds up to a measurable time saving.
1% is still a tiny speed-up, whether for 100, 10000, or any other number of frames.
This percentage is hardly noticeable even for >200.000 frames. If an encode takes 10 hours, then 1% amounts to only 6 minutes. That is nothing compared to 10 hours.
It's still 1%.
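The arithmetic above can be written down as a tiny helper. This is just a sketch that, like the post, treats the speed-up as a straight percentage cut in encoding time; minutes_saved is a name made up for illustration.

```c
#include <assert.h>

/* Minutes saved when an encode lasting `hours` becomes `percent` % shorter. */
double minutes_saved(double hours, double percent)
{
    assert(hours >= 0.0 && percent >= 0.0 && percent <= 100.0);
    return hours * 60.0 * percent / 100.0;
}
```

For the case discussed here, minutes_saved(10.0, 1.0) gives 6 minutes out of 600.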
Old 12th July 2009, 02:38   #1972  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
Quote:
Originally Posted by kemuri-_9 View Post
pengvado stated that it's not icc's C compilation that is causing the speed up...
the cause is the icc equivalent to gcc's -mtune
-mtune affects only C compilation.
I said that any difference between icc-sse2 and icc-ssse3 must be due to the -mtune part rather than the -march part, because icc-ssse3 didn't actually use any ssse3 (but it did include plenty of asm differences in non-sse code). This was not meant to explain any comparison between icc and some other compiler.
Old 12th July 2009, 07:25   #1973  |  Link
Fr4nz
Registered User
 
Join Date: Feb 2003
Posts: 448
Quote:
Originally Posted by IgorC View Post
1% is still a tiny speed-up, whether for 100, 10000, or any other number of frames.
This percentage is hardly noticeable even for >200.000 frames. If an encode takes 10 hours, then 1% amounts to only 6 minutes. That is nothing compared to 10 hours.
It's still 1%.
Well, IIRC there is sometimes a difference of 4-5% in favor of the ICC build, which would save 20-30 minutes on such an encode...not much, but still better than nothing.

Changing the subject, I have one question: my father has an AMD Phenom X4 9550 CPU, and if I use the "DXVA-HQ preset" in MeGUI, the first pass is slightly slower than the second pass (this does not happen on my Intel E6750 @ 3.3 GHz, on which the first pass is ~2.3x faster than the second pass)...how is that possible?

Last edited by Fr4nz; 12th July 2009 at 07:42.
Old 12th July 2009, 11:12   #1974  |  Link
nm
Registered User
 
Join Date: Mar 2005
Location: Finland
Posts: 2,641
Quote:
Originally Posted by Fr4nz View Post
Changing the subject, I have one question: my father has an AMD Phenom X4 9550 CPU, and if I use the "DXVA-HQ preset" in MeGUI, the first pass is slightly slower than the second pass (this does not happen on my Intel E6750 @ 3.3 GHz, on which the first pass is ~2.3x faster than the second pass)...how is that possible?
B-adapt 2 slows it down, probably. Frametype decision (which is done only in the first pass) is single-threaded, so it acts as a bottleneck on multi-core CPUs.
Old 12th July 2009, 11:14   #1975  |  Link
Fr4nz
Registered User
 
Join Date: Feb 2003
Posts: 448
Quote:
Originally Posted by nm View Post
B-adapt 2 slows it down, probably. Frametype decision (which is done only in the first pass) is single-threaded, so it acts as a bottleneck on multi-core CPUs.
This makes sense, but how is it possible that the Phenom is slowed down so much??

To give you an idea: if I get 10-11 fps in the first pass with my E6750 @ 3.3 GHz, with the Phenom I get merely 5 fps...
Old 12th July 2009, 11:25   #1976  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by Fr4nz View Post
This makes sense, but how is it possible that the Phenom is slowed down so much??
Phenom has 4 cores, Core 2 Duo has only two?
Old 12th July 2009, 11:31   #1977  |  Link
Fr4nz
Registered User
 
Join Date: Feb 2003
Posts: 448
Quote:
Originally Posted by Dark Shikari View Post
Phenom has 4 cores, Core 2 Duo has only two?
Intel E6750 has only 2 cores.

Furthermore, in order to give you a better idea of the "situation", second pass is faster on Phenom 9550 than on E6750.

Last edited by Fr4nz; 12th July 2009 at 11:36.
Old 12th July 2009, 11:37   #1978  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by Fr4nz View Post
Intel E6750 has only 2 cores.
That's what I just said. What don't you understand? More cores means the penalty for using settings that cripple multithreading will hurt speed more.
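One way to put numbers on this is Amdahl's law. The sketch below is generic and not taken from x264; the function names are made up, and any serial fraction fed into it (such as the 0.5 used in the note after the code) is a hypothetical input, not a measured value.

```c
#include <assert.h>

/* Amdahl's law: speed-up over a single core when a fraction `serial`
 * (0..1) of the work cannot be spread across cores. */
double amdahl_speedup(double serial, int cores)
{
    assert(serial >= 0.0 && serial <= 1.0 && cores > 0);
    return 1.0 / (serial + (1.0 - serial) / cores);
}

/* Share of the machine's total capacity that is actually put to use. */
double core_utilization(double serial, int cores)
{
    return amdahl_speedup(serial, cores) / cores;
}
```

With a hypothetical serial fraction of 0.5, two cores come out at roughly 67% utilization and four cores at 40%, so the same single-threaded setting wastes more of a quad-core than of a dual-core.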
Old 12th July 2009, 11:46   #1979  |  Link
Fr4nz
Registered User
 
Join Date: Feb 2003
Posts: 448
Quote:
Originally Posted by Dark Shikari View Post
That's what I just said. What don't you understand? More cores means the penalty for using settings that cripple multithreading will hurt speed more.
OK, what I don't understand is this: is frametype decision (which is single-threaded and was named as the "culprit" by nm) so "heavy" compared to the other algorithms used by x264 that the Phenom 9550 is brutally outperformed (~2x) by an Intel E6750 in the first pass?
Old 12th July 2009, 12:26   #1980  |  Link
nm
Registered User
 
Join Date: Mar 2005
Location: Finland
Posts: 2,641
One core of your overclocked E6750 outperforms one core of the Phenom by almost 2x. If b-adapt 2 dominates the encoding time because of fast first-pass parameters, this means that overall encoding is also twice as fast on the E6750. Just check x264's CPU usage during the first pass. I guess it's less than 50% on the Phenom.
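The reasoning above can be sketched as a deliberately simplified two-stage model: a single-threaded frametype decision stage feeding a stage that scales with the number of cores. It assumes the stages overlap perfectly, which is not necessarily how x264 is structured, and pipeline_fps and its parameters are hypothetical names.

```c
#include <assert.h>

/* Overall frame rate of a two-stage pipeline in which stage 1 runs on
 * one thread and stage 2 scales linearly with the core count.
 * The slower stage sets the pace. */
double pipeline_fps(double serial_stage_fps, double parallel_fps_per_core,
                    int cores)
{
    assert(serial_stage_fps > 0.0 && parallel_fps_per_core > 0.0 && cores > 0);
    double parallel_stage_fps = parallel_fps_per_core * cores;
    return serial_stage_fps < parallel_stage_fps ? serial_stage_fps
                                                 : parallel_stage_fps;
}
```

As long as the serial stage is the slower one, adding cores changes nothing: with a made-up 20 fps per core for the parallel stage, pipeline_fps(10.0, 20.0, 2) stays at 10 and pipeline_fps(5.0, 20.0, 4) stays at 5, so the machine with the faster single core wins by the per-core ratio.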

Tags
h.264, x264, x264 builds, x264 patches, x264 unofficial builds
