Power machine for x264 encoding


fangorn
14th February 2009, 14:45
Hi,

For the last year I only built energy efficient low budget machines. So my knowledge of the performance sector is pretty outdated.

What I want to do is transcode HD video to H.264 using x264, ffmpeg and mencoder (under Linux, if this is of any interest). My current machine is more than 2 years old and (for this task) as fast as continental drift. As it is a perfectly good machine for all other tasks, I would be interested in upgrading.

For instance: Does someone here know of someone working on porting x264 to CUDA or the AMD version (the name fails me atm)? I would much prefer just putting in a GPU over building a whole new machine and would gladly help with testing and debugging.

If there is no such thing in the near future, I am looking for a replacement machine for my Athlon 64X2 2GHz with min. 3 times the power. So I think it will be an Intel machine atm. And with Intel my knowledge is even more outdated. :rolleyes:

What I need is a stable machine (as in half a year uptime minimum). In that field I am not so satisfied with the Intel boxes we have at work. So tips are much appreciated.

I hope to get some helpful comments,
fangorn

jeffy
14th February 2009, 15:02
Ideas: Let's start somewhere: Core i7 920, X58 chipset board, 3GB DDR3

Ma-Xell
14th February 2009, 15:36
A fast Core2Quad might be cheaper than an i7, but socket 775 has no future.

LoRd_MuldeR
14th February 2009, 15:46
x264 benefits from Core i7 (Nehalem) greatly, as described here:
http://x264dev.multimedia.cx/?p=51

Overall, the changes in Nehalem are extremely beneficial to x264 and have led to an enormous overall performance increase. Furthermore, since the primary speed increase is in SIMD, the more assembly code we write, the more of a boost Nehalem gets over previous processors.

That's why I'd go with a Core i7 nowadays. Anyway, a Core2 Quad isn't a bad choice either ;)

(BTW: I don't expect that any GPGPU encoder can beat x264 speed-wise while retaining a similar quality at the same bitrate. At least not in the near future...)

Reimar
14th February 2009, 16:51
For instance: Does someone here know of someone working on porting x264 to CUDA or the AMD version (the name fails me atm)? I would much prefer just putting in a GPU over building a whole new machine and would gladly help with testing and debugging.


I am not aware of such plans and generally I'd expect the results to not be good enough to give you better performance per dollar than a new CPU.


If there is no such thing in the near future, I am looking for a replacement machine for my Athlon 64X2 2GHz with min. 3 times the power. So I think it will be an Intel machine atm. And with Intel my knowledge is even more outdated. :rolleyes:


As others have said, i7 is vastly (EDIT: well, vastly may be an exaggeration, not sure) more powerful, but if you do not like rebuilding, you could also check which CPUs your mainboard supports.
I have about the same CPU as you and mine can handle e.g. the Phenoms (which from what I know are not really impressive in performance though, but 3 times faster than your current CPU is not that hard to do). Particularly if the Phenom IIs work in yours they might be the cheapest way to get the performance you want.

fangorn
14th February 2009, 17:23
Unfortunately, my board only supports a Phenom X4 up to 2.2 GHz.

This would be an option, but it is hard to get nowadays and the benefit would not be that big (x2).

If this has to make sense I will go for Core i7. Any suggestions for Mainboards? Normally I prefer Asus (as already said, I prefer stable systems, so overclocking is not an option) and Hardware that is min. 6 months old.

fangorn
14th February 2009, 17:27
That's why I'd go with a Core i7 nowadays. Anyway, a Core2 Quad isn't a bad choice either ;)


What kind of difference in benefit are we talking about here? Is it percent or factors? Just to decide if it is worth the price.

LoRd_MuldeR
14th February 2009, 17:28
Unfortunately, my board only supports a Phenom X4 up to 2.2 GHz.

This would be an option, but it is hard to get nowadays and the benefit would not be that big (x2).

Moving from 2 cores to 4 cores (at roughly the same clock speed) means doubled performance for x264, as it scales very well!
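As a rough back-of-the-envelope check of that scaling argument, here is a minimal sketch. It assumes throughput grows linearly with core count and clock speed and that per-clock performance is identical, which is an idealisation; the function name is made up for illustration.

```python
def naive_speedup(old_cores, old_ghz, new_cores, new_ghz):
    """Idealised speedup estimate.

    Assumes linear scaling with cores and clock and the same
    per-clock performance on both CPUs. Real encodes will differ.
    """
    return (new_cores / old_cores) * (new_ghz / old_ghz)


# Athlon 64 X2 at 2.0 GHz -> Phenom X4 at 2.2 GHz (figures from the thread)
estimate = naive_speedup(2, 2.0, 4, 2.2)
print(f"{estimate:.2f}x")  # prints 2.20x
```

Under these assumptions the Phenom X4 upgrade lands a little above 2x, which is in line with the "x2" estimate and short of the 3x target.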

If this has to make sense I will go for Core i7. Any suggestions for Mainboards? Normally I prefer Asus (as already said, I prefer stable systems, so overclocking is not an option) and Hardware that is min. 6 months old.

I have good experience with Gigabyte boards, I'm currently running a GA-P35-DS3R. Also it seems X58 is the chipset of choice for Core i7 currently.

So maybe that one:
http://www.alternate.de/html/product/Mainboards_Sockel_1366/GigaByte/GA-EX58-UD3R/317660/?tn=HARDWARE&l1=Mainboards&l2=Intel&l3=Sockel+1366

Or if you prefer Asus:
http://www.alternate.de/html/product/Mainboards_Sockel_1366/Asus/P6T/315232/?tn=HARDWARE&l1=Mainboards&l2=Intel&l3=Sockel+1366

LoRd_MuldeR
14th February 2009, 17:45
What kind of difference in benefit are we talking about here? Is it percent or factors? Just to decide if it is worth the price.

Some benchmarks:
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3448&p=18
http://www.techarp.com/showarticle.aspx?artno=603&pgno=8

And we don't know if they used a recent version of x264 that already contains the Core i7 optimizations.
One benchmark says they used x264 r819, so a recent x264 version would run even faster on a Core i7 machine!

fangorn
14th February 2009, 17:52
Thanks for the tips. I have to adjust to the Intel prices at the moment :eek:

For the price of Processor and Board alone I built two AMD boxes last year. But I'm in the performance league now. We wanna build a machine to replace four or more of the boxes I am used to. :)

DJ Bobo
14th February 2009, 18:35
Uuh...
I am looking for a replacement machine for my Athlon 64X2 2GHz with min. 3 times the power. So I think it will be an Intel machine atm

Funny conclusion. You didn't hear about the AMD Phenom II CPUs, did you? The 3GHz version will be at least 3 times faster than your current X2.
The Phenom II is much cheaper than the Core i7 CPUs (~210€ for the Phenom II @3GHz but ~270€ for the Core i7 @2,67GHz). It also runs much cooler (15°C less than the Core i7 using the same cooler) and consumes less energy (20W less than the Core i7 when fully loaded)

jeffy
14th February 2009, 18:50
Phenom II: another graph: http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=3492&p=14

fangorn
14th February 2009, 18:53
Yes, I have heard of the Phenom II. But I have also heard that the fastest of them compares - in some benchmarks - to the slowest i7.

I am an AMD guy at heart, but I am also a pragmatic engineer. When I can get, say, 50% more performance for 20% more money, I take the chance.

It all depends on benchmarks comparing i7 and Phenom II, preferably at the same clock rate, while performing x264 encoding. Everything else I could do with the machine I have now. Actually I will build a new machine anyway and use the old one when I don't have to do heavy encoding. I have never thrown away working hardware (besides hard drives :) )

@Jeffy
Thanks, that is what I was looking for.

gigah72
14th February 2009, 19:43
I'd expect the i7-920 to beat the Phenom II X4 940 (in x264) by 30-40%, at 50-60% more than the price of the AMD (CPU + board + RAM). You need to choose what's more important to you.
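To put that trade-off into numbers, here is a small sketch that turns "X% faster for Y% more money" into relative performance per unit of money. The percentages are the estimates from the post above, not measurements, and the function name is invented for illustration.

```python
def perf_per_price(speed_gain, price_increase):
    """Performance per unit of money relative to the baseline (1.0 = same value)."""
    return (1 + speed_gain) / (1 + price_increase)


# i7-920 platform vs. Phenom II X4 940 platform, using the estimated ranges
worst = perf_per_price(0.30, 0.60)  # 30% faster at 60% more money
best = perf_per_price(0.40, 0.50)   # 40% faster at 50% more money
print(f"{worst:.2f} to {best:.2f}")  # prints 0.81 to 0.93
```

By this measure the i7 would deliver roughly 81-93% of the AMD's performance per euro, so it only makes sense if absolute speed matters more than value.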

fangorn
14th February 2009, 20:49
Thanks, I will have something to think about till they start delivering on Monday. :)

fangorn
15th February 2009, 10:41
One last question. Not exactly hardware related though.:rolleyes:

At work we compile our binaries with ifc/icc. Does anybody here have experience with compiling x264/mencoder/ffmpeg/vlc, ... with ICC? If it works, is there a noticeable speed increase over GCC-compiled binaries? I know you can patch ICC compiles so they run optimized on AMD too, but I never used it.

Reimar
15th February 2009, 13:00
At work we compile our binaries with ifc/icc. Does anybody here have experience with compiling x264/mencoder/ffmpeg/vlc, ... with ICC?

It works fine for ffmpeg, but you might have to search a bit for a version that does not miscompile. Note that it's relevant that you use Linux, only Linux ICC supports gcc-syntax, you would not be able to use Windows-ICC.

If it works, is there a noticeable speed increase over GCC-compiled binaries?

For some things, mostly H.264 decoding. I think that only applies to 32 bit binaries though, compiling 64 bit binaries reduces register pressure and gcc does a much better job, and the code is faster anyway (e.g. since also the system libraries can use MMX and SSE unconditionally).
x264 probably has more than enough hand-optimized assembler that the compiler will almost never make a difference.
I admit I am only guessing, it has been ages since I actually tested that stuff.

I know you can patch ICC compiles so they run optimized on AMD too, but I never used it.

ICC binaries usually run about as well on AMD hardware as on Intel. Yes, I heard about that one issue there was, but I suspect it may have been only the Windows compilers that had it...

Dark Shikari
15th February 2009, 13:06
For some things, mostly H.264 decoding. I think that only applies to 32 bit binaries though, compiling 64 bit binaries reduces register pressure and gcc does a much better job

ICC seems to be equally better on 32-bit and 64-bit in my experience.

x264 probably has more than enough hand-optimized assembler that the compiler will almost never make a difference.

x264 is up to 3-4% faster with ICC, depending on settings.

fangorn
15th February 2009, 15:33
@Reimar
I was talking about Linux, both at work and at home. I think I have a virtual machine with Windows somewhere, but I don't know where atm.:)

And our tests with IFC 9 on Linux showed exactly that issue: an Intel Core2 processor was 5 to 6 times faster than an AMD Athlon 64 X2 at the same clock rate! Just because the binaries checked the processor ID and used completely unoptimized code if it was not Intel. For testing I patched one binary and the difference was nearly gone. I don't know if they corrected this in more recent versions of the compilers; I haven't followed the news concerning this. As this is only relevant for one of our programs that does heavy interpolation calculations, we just use the Intel machines for that.
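To illustrate the difference between choosing a code path by vendor string and choosing it by advertised features, here is a toy sketch. It is not how the Intel runtime actually works; the function names, the path labels and the sample text are all invented for illustration, and the input merely mimics the layout of /proc/cpuinfo on Linux.

```python
def parse_cpuinfo(text):
    """Extract the vendor string and the feature flags from /proc/cpuinfo-style text."""
    info = {"vendor": None, "flags": set()}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "vendor_id" and info["vendor"] is None:
            info["vendor"] = value.strip()
        elif key == "flags" and not info["flags"]:
            info["flags"] = set(value.split())
    return info


def pick_path_by_vendor(info):
    # Behaviour like the one described in the post: non-Intel gets the generic path.
    if info["vendor"] == "GenuineIntel" and "sse2" in info["flags"]:
        return "sse2"
    return "generic"


def pick_path_by_features(info):
    # Vendor-neutral alternative: trust the advertised feature flags.
    return "sse2" if "sse2" in info["flags"] else "generic"


# Hypothetical sample for an AMD CPU that does support SSE2
sample = "vendor_id : AuthenticAMD\nflags : fpu mmx sse sse2"
info = parse_cpuinfo(sample)
print(pick_path_by_vendor(info), pick_path_by_features(info))  # prints generic sse2
```

On a real Linux box the text could come from `open("/proc/cpuinfo").read()` instead of the sample string.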

@All
Thanks again, you all were very helpful.

blubberbirne
15th February 2009, 20:30
i prefer i7 :)

because i have i7 :p

but look at these results.....

First a Q9550


x264 HD BENCHMARK RESULTS
Please copy/paste everything below the line into the forum post to report your data
^^^^^^^^^^^^^^^^^^

Results for x264.exe v0.58.819
encoded 1442 frames, 61.85 fps, 3886.74 kb/s
encoded 1442 frames, 60.84 fps, 3887.22 kb/s
encoded 1442 frames, 58.78 fps, 3886.56 kb/s
encoded 1442 frames, 57.75 fps, 3888.99 kb/s
encoded 1442 frames, 20.55 fps, 3966.41 kb/s
encoded 1442 frames, 18.98 fps, 3966.38 kb/s
encoded 1442 frames, 18.75 fps, 3969.14 kb/s
encoded 1442 frames, 18.62 fps, 3966.03 kb/s

Results for x264.exe v0.59.1101
encoded 1442 frames, 76.14 fps, 3976.48 kb/s
encoded 1442 frames, 72.10 fps, 3976.48 kb/s
encoded 1442 frames, 71.37 fps, 3976.48 kb/s
encoded 1442 frames, 71.49 fps, 3976.48 kb/s
encoded 1442 frames, 20.91 fps, 3935.63 kb/s
encoded 1442 frames, 20.41 fps, 3935.83 kb/s
encoded 1442 frames, 20.30 fps, 3935.23 kb/s
encoded 1442 frames, 20.49 fps, 3935.87 kb/s


System Details
--------------
Name Intel Core 2 Quad Q9550
Codename Yorkfield
Specification Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
Core Stepping C1
Technology 45 nm
Stock frequency 2833 MHz
Core Speed 2833.6 MHz (8.5 x 333.4 MHz)
FID range 6.0x - 8.5x

Northbridge Intel P45 rev. A3
Southbridge Intel 82801JR (ICH10R) rev. 00

CAS# 5.0
RAS# to CAS# 5
RAS# Precharge 5
Cycle Time (tRAS) 18
Command Rate 2T
Memory Frequency 400.0 MHz (5:6)
Memory Type DDR3
Memory Size 3328 MBytes
Channels Dual (Symmetric)

Windows Version Microsoft Windows XP Professional Service Pack 3 (Build 2600)

max VID 1.250 V
Voltage sensor 0 1.13 Volts [0x8D] (CPU VCORE)
Number of processors 1
Number of threads 4
Number of threads 4 (max 4)
L2 cache 2 x 6144 KBytes, 24-way set associative, 64-byte line size
Instructions sets MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, EM64T
Package Socket 775 LGA (platform ID = 4h)

Temperature sensor 0 43°C (109°F) [0x2B] (SYSTIN)
Temperature sensor 1 72°C (161°F) [0x90] (CPUTIN)
Temperature sensor 2 12°C (52°F) [0x1E9] (AUXTIN)
Temperature sensor 0 90°C (193°F) [0xF] (core #0)
Temperature sensor 1 90°C (193°F) [0xF] (core #1)
Temperature sensor 2 100°C (211°F) [0x5] (core #2)
Temperature sensor 3 102°C (215°F) [0x3] (core #3)
Temperature sensor 0 53°C (127°F) [0x35] (GPU Core)


And here my i7 920


x264 HD BENCHMARK RESULTS
Please copy/paste everything below the line into the forum post to report your data
^^^^^^^^^^^^^^^^^^

Results for x264.exe v0.58.819
encoded 1442 frames, 73.10 fps, 3895.56 kb/s
encoded 1442 frames, 71.76 fps, 3896.01 kb/s
encoded 1442 frames, 72.46 fps, 3896.01 kb/s
encoded 1442 frames, 71.63 fps, 3894.31 kb/s
encoded 1442 frames, 29.61 fps, 3986.59 kb/s
encoded 1442 frames, 29.72 fps, 3986.64 kb/s
encoded 1442 frames, 29.75 fps, 3986.64 kb/s
encoded 1442 frames, 29.53 fps, 3983.05 kb/s

Results for x264.exe v0.59.1101
encoded 1442 frames, 85.37 fps, 3976.27 kb/s
encoded 1442 frames, 86.49 fps, 3976.27 kb/s
encoded 1442 frames, 85.15 fps, 3976.27 kb/s
encoded 1442 frames, 86.40 fps, 3976.27 kb/s
encoded 1442 frames, 33.94 fps, 3938.61 kb/s
encoded 1442 frames, 33.89 fps, 3937.93 kb/s
encoded 1442 frames, 33.95 fps, 3938.65 kb/s
encoded 1442 frames, 33.89 fps, 3938.68 kb/s


System Details
--------------
Name Intel Processor
Codename Bloomfield
Specification Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz (Engineering Sample)
Core Stepping
Technology 45 nm
Core Speed 2797.9 MHz (21.0 x 133.2 MHz)

Northbridge Intel X58 rev. 12
Southbridge Intel 82801JR (ICH10R) rev. 00

Memory Frequency 1332.3 MHz ()
Memory Type DDR3
Memory Size 6142 MBytes

Windows Version Microsoft Windows Vista (6.0) Ultimate Edition Service Pack 1 (Build 6001)

max VID 0.825 V
Voltage sensor 0 1.20 Volts [0x4B] (CPU VCORE)
Number of processors 1
Number of threads 8
Number of threads 8 (max 16)
L2 cache 4 x 256 KBytes, 8-way set associative, 64-byte line size
Instructions sets MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, EM64T
Package Socket 1366 LGA (platform ID = 1h)

Temperature sensor 0 35°C (94°F) [0x23] (TMPIN0)
Temperature sensor 1 46°C (114°F) [0x2E] (TMPIN1)
Temperature sensor 2 51°C (123°F) [0x33] (TMPIN2)
Temperature sensor 0 62°C (143°F) [0x2B] (core #0)
Temperature sensor 1 63°C (145°F) [0x2A] (core #1)
Temperature sensor 2 60°C (139°F) [0x2D] (core #2)
Temperature sensor 3 57°C (134°F) [0x30] (core #3)
Temperature sensor 0 43°C (109°F) [0x2B] (GPU Core)


The second pass is more than 50% faster
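The claim can be checked against the posted numbers with a few lines of arithmetic. This sketch uses the r1101 fps figures from both result blocks and assumes that the first four lines of each block are first-pass runs and the last four are second-pass runs; the variable names are made up for illustration.

```python
from statistics import mean

# fps figures copied from the two posts (x264 r1101 runs only)
q9550 = {
    "pass1": [76.14, 72.10, 71.37, 71.49],
    "pass2": [20.91, 20.41, 20.30, 20.49],
}
i7_920 = {
    "pass1": [85.37, 86.49, 85.15, 86.40],
    "pass2": [33.94, 33.89, 33.95, 33.89],
}


def speedup(new_runs, old_runs):
    """Ratio of the mean fps of two sets of runs."""
    return mean(new_runs) / mean(old_runs)


for name in ("pass1", "pass2"):
    print(name, f"{speedup(i7_920[name], q9550[name]):.2f}x")
# prints:
# pass1 1.18x
# pass2 1.65x
```

So with r1101 the i7 920 comes out about 18% ahead in the first pass and about 65% ahead in the second pass on these two particular systems.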

Audionut
16th February 2009, 17:20
System Details
--------------
Name Intel Core 2 Quad Q9550

Temperature sensor 0 90°C (193°F) [0xF] (core #0)
Temperature sensor 1 90°C (193°F) [0xF] (core #1)
Temperature sensor 2 100°C (211°F) [0x5] (core #2)
Temperature sensor 3 102°C (215°F) [0x3] (core #3)


Your results aren't accurate because you're putting this CPU into throttling mode and are about to kill it.

blubberbirne
16th February 2009, 20:58
Your results aren't accurate because you're putting this CPU into throttling mode and are about to kill it.

The temps are wrong. It's a bug in CPU-Z. The temperature was ~50°C, not 100°C! The CPU doesn't throttle.

fangorn
22nd February 2009, 08:52
Now I have got myself a new machine (Core i7 920).

I am running Gentoo on this baby and when I do compile stuff it really rocks:cool:

But as soon as I start converting something using mencoder/x264 I need to run several jobs in parallel to get over 30 % average load. :confused:

I am using these command lines:

/usr/local/bin/mencoder movie.mkv -oac copy -ovc x264 \
-x264encopts subq=4:bframes=4:b_pyramid:weight_b:pass=1:psnr:bitrate=6000:vbv_maxrate=8500:vbv_bufsize=2000:keyint=100:turbo=2:threads=0 \
-passlogfile movie.mkv_tmp.mkv_2pass.log -vf expand=1920:1080 -fps 23.976 -ofps 23.976 -o /dev/null

/usr/local/bin/mencoder movie.mkv -oac copy -ovc x264 -x264encopts \
subq=5:partitions=4x4:8x8dct:frameref=3:bframes=4:b_pyramid:vbv_maxrate=8500:vbv_bufsize=2000:keyint=100:pass=2:psnr:bitrate=$BITRATE:threads=0 \
-passlogfile movie.mkv_tmp.mkv_2pass.log -vf expand=1920:1080 -fps 23.976 -ofps 23.976 -o movie.mkv.avi
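For anyone scripting this, the two passes can be assembled from one shared option list so that they cannot drift apart. The following is only a sketch: it reuses the option strings from the command lines above, hard-codes bitrate=6000 for both passes as an assumption (the second pass above uses $BITRATE), and the helper name and log file name are invented. Nothing is executed; the commands are only printed.

```python
import shlex


def mencoder_pass(src, out, x264opts, passlog):
    """Build the argument list for one mencoder pass (nothing is run here)."""
    return [
        "mencoder", src, "-oac", "copy", "-ovc", "x264",
        "-x264encopts", ":".join(x264opts),
        "-passlogfile", passlog,
        "-vf", "expand=1920:1080", "-fps", "23.976", "-ofps", "23.976",
        "-o", out,
    ]


# Options shared by both passes (bitrate=6000 is an assumption, see above)
common = ["bframes=4", "b_pyramid", "vbv_maxrate=8500", "vbv_bufsize=2000",
          "keyint=100", "psnr", "bitrate=6000", "threads=0"]

first = mencoder_pass(
    "movie.mkv", "/dev/null",
    ["subq=4", "weight_b", "pass=1", "turbo=2"] + common,
    "movie_2pass.log",
)
second = mencoder_pass(
    "movie.mkv", "movie.mkv.avi",
    ["subq=5", "partitions=4x4", "8x8dct", "frameref=3", "pass=2"] + common,
    "movie_2pass.log",
)

print(shlex.join(first))
print(shlex.join(second))
```

Each list could be handed to `subprocess.run(...)` to actually start the encode; printing them first makes it easy to compare against the hand-written commands.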


I am using the versions from SVN/GIT 2009-02-20.

Is there anything strange that catches the eye? I expected x264 to utilize at least the 4 real cores to the max.

Also strange: when an encoding is running, compilations get slowed down and the load does not go over 50%, without an encoding, all 8 (virtual) cores are utilized to the max. by gcc.

I hope someone here can give me a hint what to look for.

Thanks in advance,
fangorn

fangorn
22nd February 2009, 10:05
Update:
In the second pass, I get around 80% average usage when running 2 encodings in parallel.

LoRd_MuldeR
22nd February 2009, 16:52
Are you sure your x264 (MEncoder) was compiled with pthreads enabled?

burfadel
22nd February 2009, 17:06
Regarding the GPU encoding question asked earlier: both CUDA and CAL are limited in some respects. CUDA seems to be the more popular one, but I see it as having only a limited future, since it is only available on one brand of GPU. There are standards that work for both Nvidia and ATI (and other companies that want to support them), which are included in DirectX 11 and also available as OpenCL (I believe that's what they call it). While DirectX 11 and OpenCL are yet to be fully supported, at least they are a fair platform to build upon, for the programmer, the GPU manufacturers and the end users alike. While that statement may antagonise Nvidia fanboys, it's very valid: if there is only one GPU manufacturer they can charge what they want for the cards, and the last thing end users should have is graphics cards that cost 40 percent more due to a monopoly.

fangorn
22nd February 2009, 18:44
Platform: X86_64
System: LINUX
asm: yes
avis input: no
mp4 output: yes
pthread: yes
debug: no
gprof: no
PIC: yes
shared: yes
visualize: yes

This is my x264 ./configure output

I do not find pthreads in mplayer ./configure output

Detected operating system: Linux
Detected host architecture: x86_64
Checking for cc version ... 4.3.3
Checking for host cc ... cc
Checking for cross compilation ... no
Checking for CPU vendor ... GenuineIntel (6:26:4)
Checking for CPU type ... Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
Checking for kernel support of mmx ... yes
Checking for kernel support of mmxext ... yes
Checking for kernel support of sse ... yes
Checking for kernel support of sse2 ... yes
Checking for kernel support of ssse3 ... yes
Checking for kernel support of cmov ... yes
Checking for mtrr support ... yes
Checking for GCC & CPU optimization abilities ... native
Checking for byte order ... little-endian
Checking for extern symbol prefix ...
Checking for assembler support of -pipe option ... yes
Checking for compiler support of named assembler arguments ... yes
Checking for .align is a power of two ... no
Checking for 10 assembler operands ... yes
Checking for yasm ... yasm
Checking for bswap ... yes
Checking for -lposix ... no
Checking for -lm ... yes
Checking for langinfo ... yes
Checking for language ... messages: de - man pages: de - documentation: de
Checking for enable sighandler ... yes
Checking for runtime cpudetection ... no
Checking for restrict keyword ... __restrict
Checking for __builtin_expect ... yes
Checking for kstat ... no
Checking for posix4 ... no
Checking for llrint ... yes
Checking for lrint ... yes
Checking for lrintf ... yes
Checking for round ... yes
Checking for roundf ... yes
Checking for truncf ... yes
Checking for mkstemp ... yes
Checking for nanosleep ... yes
Checking for socklib ... yes
Checking for arpa/inet.h ... yes
Checking for inet_pton() ... yes
Checking for inet_aton() ... yes
Checking for socklen_t ... yes
Checking for closesocket() ... no
Checking for network ... yes
Checking for inet6 ... yes
Checking for gethostbyname2 ... yes
Checking for inttypes.h (required) ... yes
Checking for int_fastXY_t in inttypes.h ... yes
Checking for malloc.h ... yes
Checking for memalign() ... yes
Checking for posix_memalign() ... yes
Checking for alloca.h ... yes
Checking for fastmemcpy ... yes
Checking for mman.h ... yes
Checking for dynamic loader ... yes
Checking for dynamic a/v plugins support ... no
Checking for pthread ... yes (using -lpthread)
Checking for w32threads ... no (using pthread instead)
Checking for rpath ... no
Checking for iconv ... yes
Checking for soundcard.h ... yes (sys/soundcard.h)
Checking for sys/dvdio.h ... no
Checking for sys/cdio.h ... no
Checking for linux/cdrom.h ... yes
Checking for dvd.h ... no
Checking for termcap ... yes (using -lncurses)
Checking for termios ... yes (using sys/termios.h)
Checking for shm ... yes
Checking for strsep() ... yes
Checking for vsscanf() ... yes
Checking for swab() ... yes
Checking for POSIX select() ... yes
Checking for audio select() ... yes
Checking for gettimeofday() ... yes
Checking for glob() ... yes
Checking for setenv() ... yes
Checking for sys/sysinfo.h ... yes
Checking for Apple IR ... yes
Checking for pkg-config ... yes
Checking for Samba support (libsmbclient) ... yes
Checking for tdfxfb ... no
Checking for s3fb ... no
Checking for wii ... no
Checking for tdfxvid ... no
Checking for xvr100 ... no
Checking for tga ... yes
Checking for md5sum support ... yes
Checking for yuv4mpeg support ... yes
Checking for bl ... no
Checking for DirectFB ... yes (1.2.6)
Checking for X11 headers presence ... yes (using /usr/X11R6/include)
Checking for X11 ... yes
Checking for Xss screensaver extensions ... yes
Checking for DPMS ... yes (using Xdpms 4)
Checking for Xv ... yes
Checking for XvMC ... no
Checking for VDPAU ... yes
Checking for Xinerama ... yes
Checking for Xxf86vm ... yes
Checking for XF86keysym ... yes
Checking for DGA ... yes (using DGA 2.0)
Checking for 3dfx ... no
Checking for OpenGL ... yes
Checking for VIDIX ... yes
Checking for VIDIX PCI device name database ... yes
Checking for VIDIX dhahelper support ... no
Checking for VIDIX svgalib_helper support ... no
Checking for /dev/mga_vid ... no
Checking for xmga ... no
Checking for GGI ... no
Checking for GGI extension: libggiwmh ... no
Checking for AA ... yes
Checking for CACA ... yes
Checking for SVGAlib ... no
Checking for FBDev ... yes
Checking for DVB ... no
Checking for DVB HEAD ... yes
Checking for PNG support ... yes
Checking for MNG support ... yes
Checking for JPEG support ... yes
Checking for PNM support ... yes
Checking for GIF support ... yes
Checking for broken giflib workaround ... disabled
Checking for VESA support ... no
Checking for SDL ... yes (using sdl-config)
Checking for DXR2 ... no
Checking for DXR3/H+ ... no
Checking for IVTV TV-Out (pre linux-2.6.24) ... no
Checking for V4L2 MPEG Decoder ... yes
Checking for OSS Audio ... yes
Checking for aRts ... no
Checking for EsounD ... yes
Checking for esd_get_latency() ... yes
Checking for NAS ... no
Checking for pulse ... yes
Checking for JACK ... no
Checking for OpenAL ... yes
Checking for ALSA audio ... yes (using alsa 1.0.x and alsa/asoundlib.h)
Checking for Sun audio ... no
Checking for VCD support ... yes
Checking for dvdread ... yes (internal)
Checking for internal libdvdcss ... yes
Checking for cdparanoia ... yes
Checking for libcdio ... auto (using cdparanoia)
Checking for bitmap font support ... yes
Checking for freetype >= 2.0.9 ... yes
Checking for fontconfig ... yes
Checking for SSA/ASS support ... yes
Checking for fribidi with charsets ... yes
Checking for ENCA ... no
Checking for zlib ... yes
Checking for bzlib ... yes
Checking for RTC ... yes
Checking for liblzo2 support ... yes
Checking for mad support ... yes
Checking for Twolame ... yes
Checking for Toolame ... no (disabled by twolame)
Checking for OggVorbis support ... yes (internal Tremor)
Checking for libspeex (version >= 1.1 required) ... no
Checking for OggTheora support ... yes
Checking for internal mp3lib support ... yes
Checking for liba52 support ... yes (internal)
Checking for internal libmpeg2 support ... yes
Checking for libdca support ... yes
Checking for libmpcdec (musepack, version >= 1.2.1 required) ... yes
Checking for FAAC support ... yes (in libavcodec: yes)
Checking for FAAD2 support ... yes (internal floating-point)
Checking for LADSPA plugin support ... no
Checking for Win32 codecs ... no
Checking for XAnim codecs ... yes (using /usr/local/lib/codecs)
Checking for RealPlayer codecs ... yes (using /usr/local/lib/codecs)
Checking for QuickTime codecs ... auto
Checking for Nemesi Streaming Media libraries ... no
Checking for LIVE555 Streaming Media libraries ... no
Checking for FFmpeg libavutil ... yes (static)
Checking for FFmpeg libavcodec ... yes (static)
Checking for FFmpeg libavformat ... yes (static)
Checking for FFmpeg libpostproc ... yes (static)
Checking for FFmpeg libswscale ... yes (static)
Checking for libamr narrowband ... no
Checking for libamr wideband ... no
Checking for libdv-0.9.5+ ... yes
Checking for Xvid ... yes
Checking for Xvid two pass plugin ... yes
Checking for x264 ... yes (in libavcodec: yes)
Checking for libdirac ... no
Checking for libschroedinger ... no
Checking for libnut ... no
Checking for zr ... no
Checking for libmp3lame ... yes (in libavcodec: yes)
Checking for mencoder ... yes
Checking for UnRAR executable ... yes
Checking for TV interface ... yes
Checking for DirectShow TV interface ... no
Checking for Video 4 Linux TV interface ... yes
Checking for Video 4 Linux 2 TV interface ... yes
Checking for TV teletext interface ... yes
Checking for Radio interface ... no
Checking for Capture for Radio interface ... no
Checking for Video 4 Linux 2 Radio interface ... no
Checking for Video 4 Linux Radio interface ... no
Checking for Video 4 Linux 2 MPEG PVR interface ... yes
Checking for ftp ... yes
Checking for vstream client ... no
Checking for OSD menu ... no
Checking for Subtitles sorting ... yes
Checking for XMMS inputplugin support ... no
Checking for GUI ... no
Checking for automatic gdb attach ... no
Checking for compiler support for noexecstack ... yes
Checking for joystick ... no
Checking for lirc ... no
Checking for lircc ... no
Checking for DVD support (libdvdnav) ... yes (internal)
Creating config.mak
Creating config.h

Config files successfully generated by ./configure --enable-vdpau !

Install prefix: /usr/local
Data directory: /usr/local/share/mplayer
Config direct.: /usr/local/etc/mplayer

Byte order: little-endian
Optimizing for:

Languages:
Messages/GUI: de
Manual pages: de

Enabled optional drivers:
Input: dvdnav(internal) ftp pvr tv-teletext tv-v4l2 tv-v4l tv cddb cdda libdvdcss(internal) dvdread(internal) vcd dvb smb network
Codecs: x264 xvid libdv libavcodec(internal) real xanim faad2(internal) faac musepack libdca libmpeg2(internal) liba52(internal) mp3lib(internal) libtheora tremor(internal) twolame libmad liblzo gif
Audio output: alsa openal pulse esd oss v4l2 sdl mpegpes(dvb)
Video output: v4l2 sdl gif89a pnm jpeg png mpegpes(dvb) fbdev caca aa xvidix cvidix opengl dga vdpau xv x11 xover dfbmga directfb yuv4mpeg md5sum tga

Disabled optional drivers:
Input: vstream radio tv-dshow live555 nemesi
Codecs: libschroedinger libdirac libamr_wb libamr_nb qtx win32 speex toolame
Audio output: sun jack nas arts ivtv dxr2
Video output: zr zr2 ivtv dxr3 dxr2 vesa svga ggi xmga mga winvidix 3dfx xvmc bl xvr100 tdfx_vid wii s3fb tdfxfb

'config.h' and 'config.mak' contain your configuration options.
Note: If you alter theses files (for instance CFLAGS) MPlayer may no longer
compile *** DO NOT REPORT BUGS if you tweak these files ***

'make' will now compile MPlayer and 'make install' will install it.
Note: On non-Linux systems you might need to use 'gmake' instead of 'make'.

Please check mtrr settings at /proc/mtrr (see DOCS/HTML/de/video.html#mtrr)

NOTE: Win32 codec DLLs are not supported on your CPU (x86_64) or your
operating system (Linux). You may encounter a few files that cannot
be played due to missing open source video/audio codec support.

Check configure.log if you wonder why an autodetection failed (make sure
development headers/packages are installed).

NOTE: The --enable-* parameters unconditionally force options on, completely
skipping autodetection. This behavior is unlike what you may be used to from
autoconf-based configure scripts that can decide to override you. This greater
level of control comes at a price. You may have to provide the correct compiler
and linker flags yourself.
If you used one of these options (except --enable-gui and similar ones that
turn on internal features) and experience a compilation or linking failure,
make sure you have passed the necessary compiler/linker flags to configure.

If you suspect a bug, please read DOCS/HTML/de/bugreports.html.

LoRd_MuldeR
22nd February 2009, 21:32
So if you use the x264 CLI encoder, which obviously was built with pthreads, instead of MEncoder, what are the results?

Make sure you pass the "--threads auto" parameter to x264.

fangorn
25th February 2009, 09:56
Sorry for the late reply, I was away from my machine.

When running x264 standalone I get approx. 80% load (on each of the 8 virtual cores) with 8 threads, 85% with 12 threads, 90% with 16 threads and approx. 85% with --threads auto.
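Per-core load figures like these can be collected on Linux by comparing two snapshots of /proc/stat. The sketch below shows one way to do it; the function names are invented, it counts idle plus iowait ticks as "not busy" (a common convention, but a choice), and the two snapshots at the end are synthetic so the example runs anywhere.

```python
def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} for the per-core lines of /proc/stat."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 user nice system idle iowait irq softirq steal ..."
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            ticks = [int(x) for x in fields[1:9]]  # user .. steal, skip guest fields
            idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)  # idle + iowait
            total = sum(ticks)
            stats[fields[0]] = (total - idle, total)
    return stats


def utilisation(before, after):
    """Percent busy per core between two parsed snapshots."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before[cpu]
        delta = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / delta if delta else 0.0
    return result


# Synthetic snapshots: 1000 ticks elapsed on cpu0, 800 of them busy
before = parse_proc_stat("cpu0 100 0 50 850 0 0 0 0\n")
after = parse_proc_stat("cpu0 900 0 50 1050 0 0 0 0\n")
print(utilisation(before, after))  # prints {'cpu0': 80.0}
```

On a live system the two texts would come from reading /proc/stat twice with a pause in between (for example a few seconds while the encode is running).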

It seems I am having a problem with mencoder. It does not matter whether mencoder accesses x264 directly or uses -ovc lavc and accesses libx264 through the ffmpeg routines. In turbo mode for the first pass the load is quite small, but the speed is a little better than without turbo.

I will investigate myself, but if someone has got a hint where to look, it will be much appreciated.