x264 1745 gcc 4.4.4 versus gcc 4.5.1


bob0r
17th October 2010, 00:57
gcc 4.4.4: http://x264.nl/dump/x264_1745_gcc_4.4.4/x264.exe
gcc 4.5.1: http://x264.nl/dump/x264_1745_gcc_4.5.1/x264.exe



x264 1745, gcc 4.4.4 versus gcc 4.5.1: which is faster at encoding?
Please post the command line you used, and run each test 3x.

Compiled with ./configure and make.

All libs were also recompiled with the gcc version used:
pthreads cvs: 2010-06-22, gpac svn: 2122, ffmpeg git: 25289,
ffmpeg-libswscale git: 1252, ffmpegsource svn: 347


gcc 4.4.5 crashes while compiling: http://x264.nl/dump/x264_1745_gcc_4.4.5_crach.txt
A very weird crash: it only crashes within the script http://x264.nl/x264_updater_git.sh
on the second compile. When I run it manually after it fails, it works....
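
For anyone who wants to automate the "run each test 3x" request, here is a minimal sketch, not part of the original post. It assumes the summary line has the form shown later in this thread ("encoded 504 frames, 14.73 fps, 11827.43 kb/s"). The binary names and arguments in the usage comment are hypothetical placeholders.

```python
import re
import subprocess

# Matches the summary line quoted in this thread, e.g.
# "encoded 504 frames, 14.73 fps, 11827.43 kb/s"
SUMMARY = re.compile(r"encoded (\d+) frames, ([\d.]+) fps, ([\d.]+) kb/s")


def parse_fps(output):
    """Return the fps figure from the summary line, or None if it is absent."""
    match = SUMMARY.search(output)
    return float(match.group(2)) if match else None


def benchmark(binary, args, runs=3):
    """Run one build `runs` times and return the fps reported by each run."""
    results = []
    for _ in range(runs):
        # Capture stdout and stderr together so the summary is found
        # regardless of which stream the encoder writes it to.
        proc = subprocess.run(
            [binary, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            errors="replace",
        )
        fps = parse_fps(proc.stdout)
        if fps is None:
            raise RuntimeError(f"no summary line in the output of {binary}")
        results.append(fps)
    return results


# Hypothetical usage (file names are placeholders, adjust to your setup):
# args = ["--crf", "23", "--preset", "medium", "-o", "NUL", "input.yuv"]
# for name in ("x264_gcc444.exe", "x264_gcc451.exe"):
#     print(name, benchmark(name, args))
```

Running both builds back to back with the same arguments keeps the comparison fair, but other system load can still skew individual runs.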

Dark Shikari
17th October 2010, 01:08
I still use gcc 3.4 :devil:

MatLz
17th October 2010, 02:05
Core2duo @1.66 GHz, 701 frames @1280x720

--crf 20 --me umh -m 9 --b-adapt 0 --b-bias 100 -b 16 -r 8 -o mkv avs

gcc4.4.4 : 4.79, 4.79 and 4.80 fps

gcc4.5.1 : 4.87, 4.88 and 4.87 fps

MatLz
17th October 2010, 03:39
Hmmm....problem ?
My previous test shows a 1.67% speed gain for 4.5.1...but with low settings :
--crf 22 --me dia -m 2 --b-bias 100 -b 16 -r 2
It's only 0.65%.


So I decided to try with a little higher settings than my first test :
--crf 20 --me tesa -m 9 --b-adapt 2 --b-bias 100 -b 16 -r 16 (on same clip but with a pointresize(640,360))
And...4.5.1 is slower than 4.4.4 ! (avg 4.51 against 4.59 fps)

burfadel
17th October 2010, 07:23
Can I ask why --b-bias 100? That effectively forces x264 to use 16 b-frames all the time and disables the internal logic that inserts them where beneficial, so this setting hurts quality. Unless it's animation, anything over around 6 b-frames is pointless; if you still want a bias, you could set it to a more practical number like 5...

The main reason I mention it is that the 'low settings' command line has --me dia and -r 2, but also 16 b-frames and --b-bias 100. In the higher-settings command line, --me umh and -m 10 should be more beneficial...

MatLz
17th October 2010, 07:26
These are only tests...
Don't pay attention to that...

burfadel
17th October 2010, 09:00
I was thinking that was the case, but just making sure :) Was gcc 4.5.1 consistently slower than 4.4.4? Other processes/system processes can skew the results. Also, the encode statistics should be identical (apart from encode speed) for 4.4.4 and 4.5.1; are there any anomalies?

bob0r
17th October 2010, 09:17
-b 2000 ../1280x720p50_parkrun_ter.yuv -o NUL

4.4.4
encoded 504 frames, 9.12 fps, 11827.43 kb/s
encoded 504 frames, 14.73 fps, 11827.43 kb/s
encoded 504 frames, 15.00 fps, 11827.43 kb/s
encoded 504 frames, 14.85 fps, 11827.43 kb/s

4.5.1
encoded 504 frames, 14.89 fps, 11827.43 kb/s
encoded 504 frames, 14.65 fps, 11827.43 kb/s
encoded 504 frames, 14.72 fps, 11827.43 kb/s
encoded 504 frames, 14.83 fps, 11827.43 kb/s

Except for the first run, gcc 4.4.4 seems a bit faster here.
That's why you have to run it 3x or more, so you first load it in your system's memory. And then you can get an average because your CPU is always busy.

Ran on Intel Q6600 stock.
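
The "ignore the first run" reasoning can be sketched as below. This is only an illustration using the figures posted above; the helper name `warm_average` is made up here, and dropping the first run only for the build that ran cold is an assumption about how the runs were ordered.

```python
from statistics import mean


def warm_average(fps_runs, warmup=0):
    """Average the fps figures after dropping the first `warmup` runs."""
    if len(fps_runs) <= warmup:
        raise ValueError("need more runs than warm-up runs")
    return mean(fps_runs[warmup:])


# Figures from the post above.
gcc_444 = [9.12, 14.73, 15.00, 14.85]
gcc_451 = [14.89, 14.65, 14.72, 14.83]

# Assumption: only the very first run (gcc 4.4.4) was cold, because the
# source clip was not yet in the file cache at that point.
avg_444 = warm_average(gcc_444, warmup=1)
avg_451 = warm_average(gcc_451)

print(f"gcc 4.4.4: {avg_444:.2f} fps, gcc 4.5.1: {avg_451:.2f} fps")
```

With the cold run excluded, gcc 4.4.4 comes out slightly ahead in this particular test, which matches the remark above.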

bob0r
17th October 2010, 09:20
Hmmm....problem ?
My previous test shows a 1.67% speed gain for 4.5.1...but with low settings :
--crf 22 --me dia -m 2 --b-bias 100 -b 16 -r 2
It's only 0.65%.


So I decided to try with a little higher settings than my first test :
--crf 20 --me tesa -m 9 --b-adapt 2 --b-bias 100 -b 16 -r 16 (on same clip but with a pointresize(640,360))
And...4.5.1 is slower than 4.4.4 ! (avg 4.51 against 4.59 fps)

That's why I only ask about the command line used.
If 100 people run this test, and 95% say gcc 4.5.1 is faster, we can check if they use slow or fast settings.
But i think most people use the default or slower settings, we shall see how the results end up!

Underground78
17th October 2010, 09:28
AMD Athlon X2 6000+ 3 GHz - 1610 frames 704x384

--crf 23 --preset medium:
GCC 4.4.4: 27.70 fps, 27.81 fps, 27.69 fps
GCC 4.5.1: 28.45 fps, 28.42 fps, 28.50 fps
--> GCC 4.5.1 2.61% faster

--crf 23 --preset fast:
GCC 4.4.4: 30.79 fps, 30.76 fps, 30.80 fps
GCC 4.5.1: 31.45 fps, 31.55 fps, 31.53 fps
--> GCC 4.5.1 2.36% faster

--crf 23 --preset slow:
GCC 4.4.4: 16.96 fps, 16.97 fps, 17.00 fps
GCC 4.5.1: 17.40 fps, 17.40 fps, 17.40 fps
--> GCC 4.5.1 2.49% faster

(all runs with --output NUL)
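
For reference, the percentages above can be reproduced from the posted fps figures by comparing the averages of the three runs. A minimal sketch (the function name `percent_gain` is just illustrative):

```python
def percent_gain(baseline_runs, candidate_runs):
    """Speed gain of the candidate over the baseline, in percent of average fps."""
    baseline = sum(baseline_runs) / len(baseline_runs)
    candidate = sum(candidate_runs) / len(candidate_runs)
    return (candidate / baseline - 1.0) * 100.0


# fps figures from the post above: (GCC 4.4.4 runs, GCC 4.5.1 runs)
results = {
    "medium": ([27.70, 27.81, 27.69], [28.45, 28.42, 28.50]),
    "fast": ([30.79, 30.76, 30.80], [31.45, 31.55, 31.53]),
    "slow": ([16.96, 16.97, 17.00], [17.40, 17.40, 17.40]),
}

for preset, (gcc_444, gcc_451) in results.items():
    print(f"--preset {preset}: GCC 4.5.1 {percent_gain(gcc_444, gcc_451):.2f}% faster")
```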

Groucho2004
18th October 2010, 10:24
I still use gcc 3.4 :devil:

I assume that you have a good reason why you're still using 3.4.x. Could you elaborate?

Dark Shikari
18th October 2010, 11:18
I assume that you have a good reason why you're still using 3.4.x. Could you elaborate?
It comes with Cygwin and I'm lazy.

Groucho2004
18th October 2010, 11:30
It comes with Cygwin and I'm lazy.

I see. :rolleyes:

So, there isn't anything that 4.4.x (or 4.5.x) could screw up worse than 3.4.x, right? I tried both (3.4.5 and 4.5.1) and there is very little performance difference. The only thing I noticed is that the 3.4.5 build is much smaller than 4.5.1 (780KB compared to 1MB, without gpac).

bob0r
18th October 2010, 16:12
As you can see, the difference in size between 4.4.4 and 4.5.1 is quite large as well. I guess there are too many factors why a gcc version does what a gcc version does.

kypec
18th October 2010, 19:53
I made 9 rounds with each GCC version, 3 runs with 3 different presets on i7-920 @ stock frequency 2.66GHz.

Command lines used:
x264.exe --preset slower --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
x264.exe --preset medium --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
x264.exe --preset faster --tune film --crf 18.0 --sar 1:1 --level 4.1 --keyint 240 --output null test.avs
Sample = 1920 x 800 24p sourced via DGDecodeNV, 720 frames

GCC 4.4.4 results:
slower: 1.53 / 1.39 / 1.39 ~ 1.44
medium: 7.96 / 7.95 / 8.03 ~ 7.98
faster: 13.99 / 14.05 / 14.09 ~ 14.04


GCC 4.5.1 results:
slower: 1.72 / 1.51 / 1.44 ~ 1.56 +8.33%
medium: 8.19 / 8.18 / 8.08 ~ 8.15 +2.13%
faster: 14.43 / 14.55 / 14.36 ~ 14.45 +2.92%


However, the very last run (--preset faster) with GCC 4.4.4 produced a non-deterministic result:
avs [info]: 1920x800p 1:1 @ 24000/1001 fps (cfr)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2
x264 [info]: profile High, level 4.1
x264 [info]: frame I:16 Avg QP:16.58 size:217109
x264 [info]: frame P:375 Avg QP:20.04 size:150895
x264 [info]: frame B:329 Avg QP:21.87 size: 98865
x264 [info]: consecutive B-frames: 9.2% 85.2% 0.4% 5.1%
x264 [info]: mb I I16..4: 39.3% 18.8% 42.0%
x264 [info]: mb P I16..4: 13.2% 17.5% 14.4% P16..4: 15.8% 11.6% 10.5% 0.0% 0.0% skip:16.9%
x264 [info]: mb B I16..4: 6.3% 10.4% 4.6% B16..8: 24.2% 14.8% 3.6% direct:14.9% skip:21.3% L0:26.6% L1:22.0% BI:51.4%
x264 [info]: 8x8 transform intra:40.3% inter:19.6%
x264 [info]: coded y,uvDC,uvAC intra: 89.1% 51.0% 17.3% inter: 53.0% 19.9% 2.4%
x264 [info]: i16 v,h,dc,p: 30% 6% 44% 20%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 11% 42% 4% 9% 5% 6% 6% 8%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 15% 24% 5% 12% 7% 8% 7% 11%
x264 [info]: i8c dc,h,v,p: 58% 18% 20% 3%
x264 [info]: ref P L0: 69.3% 9.8% 20.9%
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: ref B L1: 99.6% 0.4%
x264 [info]: kb/s:24664.96

encoded 720 frames, 14.09 fps, 24664.96 kb/s
as opposed to all other runs (2 times GCC 4.4.4 and 3 times GCC 4.5.1):
avs [info]: 1920x800p 1:1 @ 24000/1001 fps (cfr)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2
x264 [info]: profile High, level 4.1
x264 [info]: frame I:16 Avg QP:16.58 size:217110
x264 [info]: frame P:375 Avg QP:20.04 size:150898
x264 [info]: frame B:329 Avg QP:21.87 size: 98881
x264 [info]: consecutive B-frames: 9.2% 85.2% 0.4% 5.1%
x264 [info]: mb I I16..4: 39.3% 18.8% 41.9%
x264 [info]: mb P I16..4: 13.2% 17.5% 14.4% P16..4: 15.8% 11.6% 10.6% 0.0% 0.0% skip:16.9%
x264 [info]: mb B I16..4: 6.3% 10.4% 4.6% B16..8: 24.2% 14.8% 3.6% direct:14.9% skip:21.2% L0:26.6% L1:22.0% BI:51.4%
x264 [info]: 8x8 transform intra:40.3% inter:19.7%
x264 [info]: coded y,uvDC,uvAC intra: 89.1% 51.0% 17.2% inter: 53.1% 19.9% 2.4%
x264 [info]: i16 v,h,dc,p: 30% 6% 44% 20%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 10% 11% 42% 4% 9% 5% 6% 6% 8%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 15% 24% 5% 12% 7% 8% 7% 11%
x264 [info]: i8c dc,h,v,p: 58% 18% 20% 3%
x264 [info]: ref P L0: 69.4% 9.7% 20.9%
x264 [info]: ref B L0: 77.4% 22.6%
x264 [info]: ref B L1: 99.6% 0.4%
x264 [info]: kb/s:24666.65

encoded 720 frames, 14.05 fps, 24666.65 kb/s
Is this normal? I thought that non-deterministic results are possible only when VBV model is specified with multithread processing...:confused:
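
Spotting the differences between two such logs by eye is tedious. A small sketch that lists only the lines that differ is shown below; it is an illustration, not a tool used in this thread, and the two sample logs are shortened excerpts of the output posted above.

```python
def differing_lines(log_a, log_b):
    """Return (line_a, line_b) pairs for lines that differ between two logs.

    Assumes both logs contain the same kind of lines in the same order,
    which holds for two runs of the same command line.
    """
    lines_a = [line.strip() for line in log_a.splitlines()]
    lines_b = [line.strip() for line in log_b.splitlines()]
    return [(a, b) for a, b in zip(lines_a, lines_b) if a != b]


# Shortened excerpts of the two logs posted above.
odd_run = """x264 [info]: profile High, level 4.1
x264 [info]: frame I:16 Avg QP:16.58 size:217109
x264 [info]: frame P:375 Avg QP:20.04 size:150895
x264 [info]: kb/s:24664.96"""

other_runs = """x264 [info]: profile High, level 4.1
x264 [info]: frame I:16 Avg QP:16.58 size:217110
x264 [info]: frame P:375 Avg QP:20.04 size:150898
x264 [info]: kb/s:24666.65"""

for a, b in differing_lines(odd_run, other_runs):
    print("odd run   :", a)
    print("other runs:", b)
```

In the excerpts the average QP is the same and only the frame sizes and the bitrate differ slightly.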

MasterNobody
18th October 2010, 22:00
Is this normal? I thought that non-deterministic results are possible only when VBV model is specified with multithread processing...:confused:
It must be deterministic. Maybe the decoding (DGDecodeNV) is non-deterministic?

kypec
19th October 2010, 08:07
It must be deterministic. Maybe the decoding (DGDecodeNV) is non-deterministic?
I strongly doubt that. If anybody can suggest how to make 100% deterministic input for x264 (preferably using my sample script, not some other material), I could try to narrow the problem down. As it stands, I lean towards an x264 encoding issue (with GCC 4.4.4) rather than a DGDecodeNV decoding issue.

MatLz
19th October 2010, 08:21
In my three series of tests, the output files were identical (for the same command line, of course).

LoRd_MuldeR
19th October 2010, 08:40
I strongly doubt that. If anybody can suggest how to make 100% deterministic input for x264 (preferably using my sample script, not some other material), I could try to narrow the problem down. As it stands, I lean towards an x264 encoding issue (with GCC 4.4.4) rather than a DGDecodeNV decoding issue.

Use a lossless sample file (raw YUV data) from here, for example:
http://media.xiph.org/video/derf/

Using an H.264 source only adds another layer of uncertainty to the "deterministic" test. After updating my compiler, I usually do an MD5 comparison of the output files with one of those Xiph samples. It's not a guarantee, because it only tests one particular combination of settings with one particular source, but it has helped me identify compiler issues in the past...

Here are the results from the latest test:
http://pastie.org/1232074

(BTW: Is there any reason why we are comparing GCC 4.4.4 here, after 4.4.5 has already been officially released from the 4.4.x tree? Any regressions I should know about?)
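
The MD5 comparison described above could be scripted roughly as follows. This is a sketch using Python's standard hashlib, not the actual procedure behind the pastie link; the file names in the usage comment are hypothetical.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_identical(paths):
    """True if every file in `paths` has the same MD5 digest."""
    return len({file_md5(path) for path in paths}) == 1


# Hypothetical usage: encode the same raw YUV sample with each build using
# the same command line, then compare the resulting files.
# print(outputs_identical(["out_gcc444.264", "out_gcc451.264"]))
```

Matching digests only show that this one source and this one set of options produce the same bitstream, as noted above.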

video_magic
19th October 2010, 12:53
I made 9 rounds with each GCC version, 3 runs with 3 different presets on i7-920 @ stock frequency 2.66GHz......
....Is this normal? I thought that non-deterministic results are possible only when VBV model is specified with multithread processing...:confused:

Maybe you could check for a RAM error, or a HDD read/write error. You might have an unreliable component.

kemuri-_9
19th October 2010, 13:27
(BTW: Is there any reason why we are comparing GCC 4.4.4 here, after 4.4.5 has already been officially released from the 4.4.x tree? Any regressions I should know about?)

easy answer: 4.4.4 is the version bob0r compiled from the 4.4.x series, he hasn't compiled 4.4.5

bob0r
19th October 2010, 18:50
easy answer: 4.4.4 is the version bob0r compiled from the 4.4.x series, he hasn't compiled 4.4.5

An even easier answer, it's in the first post, it crashes.

kemuri-_9
19th October 2010, 23:47
An even easier answer, it's in the first post, it crashes.

4.4.5 works here, so it looks like you miscompiled gcc.

LoRd_MuldeR
20th October 2010, 12:15
4.4.5 works here, so it looks like you miscompiled gcc.

Same here. x264 seems to compile just fine with GCC 4.4.5 for me. I use Komisar's build:
http://komisar.gin.by/mingw/index.html

bob0r
21st October 2010, 16:58
I know. When I compile it myself, it works; it only fails in combination with the script. So _something_ must be called or whatever... I'll recompile it later and test again.

bob0r
22nd October 2010, 17:33
4.4.5 works here, so it looks like you miscompiled gcc.

Seems you were right!

gcc 4.4.5: http://x264.nl/dump/x264_1745_gcc_4.4.5/x264.exe

I assume 4.4.5 is not that much faster than 4.4.4, but those who already ran a test can retest it.

But it looks like we're updating to 4.5.1.