View Full Version : x265 performance of different builds! (GCC,ICC,VS/VC)
kotuwa
16th April 2016, 09:29
Regarding windows binary builds of x265,
GCC, ICC, VS / V C++
0. Are there any other builds?
1. What are the speed/performance based differences?
2. Which build suits better for which system?
3. Are there any quality/size based differences too?
!?
Jamaika
16th April 2016, 11:38
The question is a little imprecise. You say nothing about chroma subsampling (i420 or i444). With i444, HEVC looks miserable on I-frames at high compression. What kind of CPU do you have?
I have an old i5 2500:
Ad0 I don't know.
Ad1 There are differences, at the expense of quality. VC++ comes out slowest. Even slower is a build with all encoders in one (8+10+12 bit), i.e. as in Hybrid.
Ad2 I use the GCC build on Windows 10. The ICC build is also tolerable.
Ad3 Yes, there are. You should check for yourself. I checked with version 1.7.
nevcairiel
16th April 2016, 11:47
3. Are there any quality/size based differences too?
If the compiler impacts the output of an encoder, that sounds like a bug, and you should report that to the developers.
So in general, no, there should be no differences in output no matter how you build it, assuming all builds use the same configuration.
0. In theory it is possible to build x265 for Windows with clang.
1. From my speed tests, if your CPU doesn't have SSE4, the fastest are GCC 6 builds, otherwise VS 2015 builds.
2. For current CPUs (AVX/AVX2) the best are VS 2015 builds. The Windows version isn't important, as long as it is 64-bit Windows 7 or a newer 64-bit version.
3. Yes, there are (but very small). In the file source/encoder/sao.cpp some decisions are made from floating-point computations whose results depend on the compiler and its optimization options.
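To illustrate point 3, here is a minimal sketch of how the order of a floating-point sum can flip a threshold decision. It is not the code from source/encoder/sao.cpp; the cost terms, the threshold and the function names are all made up for the example. An optimizing compiler that reorders additions can have the same effect as the reversed loop below.

```python
# Illustrative only: NOT the logic in source/encoder/sao.cpp.
# The terms and threshold are hypothetical values chosen to show
# that floating-point addition is not associative.

def sum_forward(terms):
    total = 0.0
    for t in terms:
        total += t
    return total

def sum_reversed(terms):
    # Same terms, added in the opposite order, as a reordering
    # optimizer might effectively do.
    total = 0.0
    for t in reversed(terms):
        total += t
    return total

def filter_enabled(cost, threshold=0.6):
    # Hypothetical decision: enable a tool when the cost is within budget.
    return cost <= threshold

terms = [0.1, 0.2, 0.3]
cost_a = sum_forward(terms)
cost_b = sum_reversed(terms)

print(cost_a, filter_enabled(cost_a))
print(cost_b, filter_enabled(cost_b))
```

The two sums differ only in the last bit, yet the decision comes out differently, which is enough to change a few bytes of output without any visible quality change.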
kotuwa
17th April 2016, 16:50
Ad3 Yes they are. You should check yourself. I checked for version 1.7.
I checked small samples. Couldn't check x264 info/statistics, though!
File sizes were almost the same; the difference was only several bytes, which I thought was due to the build info string.
If the compiler impacts the output of an encoder, that sounds like a bug, and you should report that to the developers.
So in general, no, there should be no differences in output no matter how you build it, assuming all builds use the same configuration.
Are you sure? The other two replies say otherwise!
Do the instruction sets used (SSE4, AVX2, etc.) have an impact on size/quality?
And does the build have an effect on those?
Also, another question: what kind of systems benefit from using ICC builds?
nevcairiel
17th April 2016, 22:48
Are you sure? The other two replies say otherwise!
Minor floating point differences really shouldn't result in a noticeable quality difference - if they do that should still be investigated by the developers.
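A simple way to check whether two builds really produce the same output is to compare the encoded files byte for byte, for instance by hashing them. The sketch below assumes you have already encoded the same source with the same settings using each build; the file names in the usage note are hypothetical. Keep in mind that the build info string written into the stream differs between compilers, so a mismatch of a few bytes does not by itself prove that the encoding decisions differ.

```python
import hashlib

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_bitstream(path_a, path_b):
    """True if both files are byte-identical (same SHA-256 digest)."""
    return file_digest(path_a) == file_digest(path_b)
```

Usage would look like same_bitstream("gcc.hevc", "vs2015.hevc"). If the only difference is the embedded info string, encoding without it (x265 has a --no-info option for this, as far as I know; check x265 --help on your build) should make the comparison meaningful.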
Motenai Yoda
22nd April 2016, 20:54
Well, I tested VS 2015 vs GCC 5.2.0 builds, and on my Bloomfield the GCC ones were faster.
Well, I tested VS 2015 vs GCC 5.2.0 builds, and on my Bloomfield the GCC ones were faster.
For VS 2015 it helps to set this before the build:
set CXXFLAGS=/GS- /GL
Did you use /GS- /GL options?
Motenai Yoda
24th April 2016, 18:44
With set CXXFLAGS=/GS- /GL it throws an error saying the front end and back end are not compatible (something about the target CPU not being the same)???
Using the included batch file, on my CPU:
fps         8bit   10bit  12bit
MSVC 1800   9.02   7.05   5.19
GCC 5.3.0   9.02   7.06   4.67
So the problem is only at 10bit encoding.
Here is my emulation of 10-bit encoding at your CPU's level (SSE4.2) on an i5 3450S. x265- is compiled without the /GS- /GL options, x265 with them:
i:\speed\1.9+141>x265- --asm=SSE4.2 ../ducks_take_off_1080p50.y4m w.hevc
y4m [info]: 1920x1080 fps 50/1 i420p8 sar 1:1 frames 0 - 499 of 500
raw [info]: output file: w.hevc
x265 [info]: HEVC encoder version 1.9+141-02d79be487d7
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
x265 [info]: Main 10 profile, Level-4.1 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 2
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / on / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 signhide tmvp strong-intra-smoothing
x265 [info]: tools: lslices=6 deblock sao
x265 [info]: frame I: 2, Avg QP:35.89 kb/s: 36347.20
x265 [info]: frame P: 123, Avg QP:36.65 kb/s: 27657.64
x265 [info]: frame B: 375, Avg QP:39.35 kb/s: 4933.84
x265 [info]: Weighted P-Frames: Y:18.7% UV:12.2%
x265 [info]: consecutive B-frames: 0.8% 0.0% 0.0% 96.8% 2.4%
encoded 500 frames in 85.48s (5.85 fps), 10649.55 kb/s, Avg QP:38.67
i:\speed\1.9+141>x265 --asm=SSE4.2 ../ducks_take_off_1080p50.y4m w.hevc
y4m [info]: 1920x1080 fps 50/1 i420p8 sar 1:1 frames 0 - 499 of 500
raw [info]: output file: w.hevc
x265 [info]: HEVC encoder version 1.9+141-02d79be487d7
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
x265 [info]: Main 10 profile, Level-4.1 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 2
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / on / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 signhide tmvp strong-intra-smoothing
x265 [info]: tools: lslices=6 deblock sao
x265 [info]: frame I: 2, Avg QP:35.89 kb/s: 36347.20
x265 [info]: frame P: 123, Avg QP:36.65 kb/s: 27657.64
x265 [info]: frame B: 375, Avg QP:39.35 kb/s: 4933.84
x265 [info]: Weighted P-Frames: Y:18.7% UV:12.2%
x265 [info]: consecutive B-frames: 0.8% 0.0% 0.0% 96.8% 2.4%
encoded 500 frames in 84.64s (5.91 fps), 10649.55 kb/s, Avg QP:38.67
You can download VS 2015 builds compiled with the /GS- /GL options from www.msystem.waw.pl/x265
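To put a number on the difference between the two runs above, the final summary line of each log can be parsed and the speeds compared. This is a minimal sketch; the regular expression is written against the summary lines exactly as they appear in the logs above and may need adjusting for other x265 versions.

```python
import re

# Matches e.g. "encoded 500 frames in 85.48s (5.85 fps), ..."
SUMMARY = re.compile(r"encoded (\d+) frames in ([\d.]+)s \(([\d.]+) fps\)")

def parse_summary(line):
    """Return (frames, seconds) from an x265 summary line."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not an x265 summary line: %r" % line)
    return int(m.group(1)), float(m.group(2))

def speedup_percent(base, test):
    """Relative fps gain of `test` over `base`, in percent."""
    base_fps = base[0] / base[1]
    test_fps = test[0] / test[1]
    return (test_fps / base_fps - 1.0) * 100.0

without_flags = parse_summary(
    "encoded 500 frames in 85.48s (5.85 fps), 10649.55 kb/s, Avg QP:38.67")
with_flags = parse_summary(
    "encoded 500 frames in 84.64s (5.91 fps), 10649.55 kb/s, Avg QP:38.67")

gain = speedup_percent(without_flags, with_flags)
print("gain with /GS- /GL: %.2f%%" % gain)
```

For these two logs the gain works out to roughly 1%. That is a single run of each build, so it could easily be within run-to-run noise; several runs per build would give a more trustworthy figure.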
Motenai Yoda
24th April 2016, 21:02
Wait... the included batch file compiles with VS 2013 (but I don't have VS 2013 :confused: )
I tried with VS 2015 and set CXXFLAGS=/GS- /GL:
Build succeeded.
0 Warning(s)
0 Error(s)
Time Elapsed 00:02:44.06
1 file(s) moved.
Microsoft (R) Library Manager Version 14.00.23918.0
Copyright (C) Microsoft Corporation. All rights reserved.
x265-static-main.lib(analysis.obj) : MSIL .netmodule or module compiled with /GL found; restarting link with /LTCG; add /LTCG to the link command line to improve linker performance
Microsoft (R) Library Manager Version 14.00.23918.0
Copyright (C) Microsoft Corporation. All rights reserved.
fatal error C1905: Front end and back end not compatible (must target same processor).
LINK : fatal error LNK1257: code generation failed
With VS 2015 and CXXFLAGS set (fps):
8bit 9.02 / 10bit 6.99 / 12bit 3.70
I see you are building the multilib version. The compilation itself is OK, and the resulting x265.exe should work correctly for 8- and 10-bit encoding (and incorrectly for 12-bit).
The error comes from this part of the batch file:
:: combine static libraries (ignore warnings caused by winxp.cpp hacks)
move Release\x265-static.lib x265-static-main.lib
LIB.EXE /ignore:4006 /ignore:4221 /OUT:Release\x265-static.lib x265-static-main.lib x265-static-main10.lib x265-static-main12.lib
That step is not important. You can use the compiled x265.exe without problems (just avoid 12-bit encoding with a multilib version compiled with LTO, as there are bugs in the x265 source code).
Sagittaire
16th April 2017, 16:24
Can someone make builds for:
-x264 GCC 7.0 "None"
-x264 GCC 7.0 "SSE4"
-x264 ICC 17
-x264 VS 2017 "AVX2"
-x265 GCC 7.0 "None"
-x265 GCC 7.0 "SSE4"
-x265 ICC 17
-x265 VS 2017 "AVX2"
I will write an automatic benchmark script that compares all the x264 and x265 builds, picks the best one for your CPU and then runs a complete benchmark.
I can find some builds here:
http://msystem.waw.pl/x265/
but not:
-x264 GCC 7.0 "None"
-x264 GCC 7.0 "SSE4"
-x264 ICC 17
-x264 VS 2017 "AVX2"
-x265 ICC 17
THX
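A benchmark script along the lines described above could be sketched as follows. This is only an outline under stated assumptions: the build names and executable paths are hypothetical, and the command line simply mirrors the "x265 input.y4m output.hevc" form used in the logs earlier in this thread. Each build is timed once on the same clip and the builds are ranked by frames per second.

```python
import subprocess
import time

def time_command(cmd):
    """Run cmd once and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start

def rank_builds(builds, source, frames, timer=time_command):
    """Time every build on `source` and return [(name, fps)], fastest first.

    builds: dict mapping a label to an encoder executable path
    frames: number of frames in `source`, used to convert time to fps
    timer:  callable that runs a command and returns elapsed seconds
    """
    results = []
    for name, exe in builds.items():
        seconds = timer([exe, source, name + ".hevc"])
        results.append((name, frames / seconds))
    results.sort(key=lambda item: item[1], reverse=True)
    return results

if __name__ == "__main__":
    # Hypothetical paths; replace with the builds you actually have.
    builds = {
        "gcc": "builds/gcc/x265.exe",
        "vs": "builds/vs/x265.exe",
    }
    try:
        for name, fps in rank_builds(builds, "ducks_take_off_1080p50.y4m", 500):
            print("%-6s %.2f fps" % (name, fps))
    except (OSError, subprocess.CalledProcessError) as err:
        print("benchmark not run:", err)
```

The timer is passed in as a parameter so the ranking logic can be checked without running an encoder. A real benchmark would repeat each run several times and take the median.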
easyfab
16th April 2017, 20:28
And it won't be the "best" for each CPU.
For example, for my i7-2600K with GCC, I use -march=native, -Ofast, PGO and other aggressive settings, plus extra flags like -frename-registers, -funroll-loops ... That gives me another 5 to 10% speed boost. It is even better for my CPU than the VS2017 profiled version from Ma. But you must do this for each CPU.
easyfab
16th April 2017, 20:34
I'm really curious to see what a profiled and optimized build can give on Ryzen. Perhaps a bigger speed boost than on Intel?
And even better if some code can be rewritten for the Ryzen architecture.