x265 HEVC Encoder
Midzuki
11th April 2018, 21:17
307 patches with AVX-512 (and other improved assembly) code uploaded to the developer mailing list. That will take a little while to review.
They are up and running 0_o
https://bitbucket.org/multicoreware/x265/commits/all
LigH
11th April 2018, 21:25
Damn. I waited for the "Re: 0/307 — approved" mail.
Time to build.
_
P.S.: Compiling x265 with AVX-512 support works only for x86-64 architecture targets. A "bailout" for x86 (Win32) architecture targets seems to be missing, so it throws "invalid opcode" errors for the 8-bit depth core where assembler is still enabled.
_
x265 2.7+332-593e63cda903 (Win64) (http://www.mediafire.com/file/vm9jyrb6k40vc2h/x265_2.7%2B332-593e63cda903_Win64.7z)
Support for AVX-512 assembly optimized kernels; remember: enable it manually by adding --asm avx512 to the CLI — and don't fry your CPU...
Only x86-64 (Win64) version available, skipping it in x86 (Win32) mode for NASM is necessary not to break compilation completely.
hajj_3
12th April 2018, 01:15
anyone with an avx512 capable processor fancy doing benchmarks comparing it to the previous build?
LigH
12th April 2018, 02:10
I already have that feeling that one day, x265 will be used rather as a benchmark for the efficiency of the AVX implementations in a specific CPU, rather than as a benchmark for efficient video encoding ... :o
Midzuki
12th April 2018, 02:54
More AVX-512 code = bigger filesize :scared:
https://forum.videohelp.com/attachments/45153-1523497745/x265-with-tons-of-AVX512.png
FranceBB
12th April 2018, 05:04
Yep, but in 2018 20.7 MB is still very small and the increase is negligible. Unfortunately I can't test the latest AVX512 instruction set 'cause I have an Intel Xeon E5-2660 v4 that supports AVX2 only, sadly.
I look forward to benchmarks.
foxyshadis
12th April 2018, 05:09
I thought having Kaby Lake meant I had them, but nope, servers only. I have one customer who has a brand spanking new Skylake-X server that I can remote into, I should be able to get benchmarks tomorrow.
Asmodian
12th April 2018, 06:37
AVX-512 is faster! :D;)
I did some benchmarks using LigH's build x265 2.7+332-593e63cda903 (Win64) (https://forum.doom9.org/showthread.php?p=1839225#post1839225) above. I used the same build for the AVX2 tests, simply without the "--asm avx512" command.
i9-7900X @ 4.5 GHz all cores, 3.0 GHz mesh/cache, DDR4 4000-17-18-18-41-1T. No AVX2 or AVX-512 multiplier offsets. Max 92 degC package CPU temperature during both veryslow encodes. The faster modes did not saturate all 20 threads.
The source is 1920x1080 8-bit gradient MagicYUV 4:2:0 on an NVMe SSD, encoding to another NVMe SSD. I used the first 1000 frames from Firefly episode 9, which I had already denoised (SMDegrain) and had on my drive.
avs2pipemod.exe -y4mp=1:1 "fireflyshort.avs" | x265_AVX512.exe --input - --y4m -o "D:\temp\fireflyshort.mkv" --asm avx512 --preset veryslow --crf 18.5 --output-depth 10
x265 [info]: HEVC encoder version 2.7+332-593e63cda903
x265 [info]: build info [Windows][GCC 7.3.0][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
x265 [info]: Main 10 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 20 threads
veryslow:
AVX512: encoded 1000 frames in 303.44s (3.30 fps), 4037.41 kb/s, Avg QP:20.64
AVX2: encoded 1000 frames in 335.83s (2.98 fps), 4037.41 kb/s, Avg QP:20.64
medium:
AVX512: encoded 1000 frames in 28.41s (35.20 fps), 3183.67 kb/s, Avg QP:20.46
AVX2: encoded 1000 frames in 30.71s (32.57 fps), 3183.67 kb/s, Avg QP:20.46
veryfast:
AVX512: encoded 1000 frames in 15.47s (64.64 fps), 2769.26 kb/s, Avg QP:20.89
AVX2: encoded 1000 frames in 16.89s (59.20 fps), 2769.26 kb/s, Avg QP:20.89
ultrafast:
AVX512: encoded 1000 frames in 6.86s (145.77 fps), 1398.46 kb/s, Avg QP:25.00
AVX2: encoded 1000 frames in 7.22s (138.41 fps), 1398.46 kb/s, Avg QP:25.00
Thanks to everyone who works on x265 and thanks for the regular builds LigH. :)
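For anyone who wants the relative gains rather than raw times, here is a minimal Python sketch that derives them from the encode times listed above. The helper name `speedup_percent` is made up for illustration; only the timings come from the post.

```python
# Encode times in seconds, copied from the results above: (AVX-512, AVX2).
times = {
    "veryslow": (303.44, 335.83),
    "medium": (28.41, 30.71),
    "veryfast": (15.47, 16.89),
    "ultrafast": (6.86, 7.22),
}


def speedup_percent(avx512_s, avx2_s):
    """Percent throughput gain of the AVX-512 run over the AVX2 run."""
    return (avx2_s / avx512_s - 1.0) * 100.0


speedups = {preset: speedup_percent(*t) for preset, t in times.items()}
for preset, gain in speedups.items():
    print(f"{preset:>9}: {gain:+.1f}%")
```

On these numbers the gain works out to roughly 5 to 11 percent, largest for veryslow and smallest for ultrafast. This is a single run per setting on one overclocked system, so it should be read as indicative only.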
excellentswordfight
12th April 2018, 10:41
Slower here.
Using LigH's build on a Dell 2U rack server with a Xeon Gold 6126 (12c/24t). CPU utilization dropped by about 10% (both for 1080p and 2160p) and clock speed dropped from 2.9 GHz to 2.4 GHz. I'm guessing that the gains from AVX512 didn't outweigh the drop in clock speed and utilization.
Tears of Steel source (10-bit UHD Blu-ray compatible x265 source for the 2160p test, 8-bit Blu-ray compatible x264 source for 1080p)
2160p with avx512: 80-90% CPU usage, 2.28 fps
--asm avx512 --preset slow --profile main10 --level-idc 51 --crf 22
2160p: 100% CPU usage, 2.36 fps
--preset slow --profile main10 --level-idc 51 --crf 22
1080p with avx512: 45-55% CPU usage, 6.54 fps
--asm avx512 --preset slow --profile main10 --level-idc 41 --crf 18
1080p: 55-65% CPU usage, 7.14 fps
--preset slow --profile main10 --level-idc 41 --crf 18
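A rough back-of-the-envelope check of that guess, as a Python sketch. It assumes fps scales linearly with core clock, that the reported 2.9 and 2.4 GHz held for the whole encode, and it ignores the utilization drop; none of that is measured here, so treat the per-clock figures as an estimate only.

```python
base_ghz, avx512_ghz = 2.9, 2.4  # clocks reported above

# fps pairs from the post: (with --asm avx512, without)
fps = {"2160p": (2.28, 2.36), "1080p": (6.54, 7.14)}

# Per-clock gain AVX-512 would need just to offset the lower clock.
break_even = base_ghz / avx512_ghz - 1.0


def implied_per_clock_gain(fps_avx512, fps_base):
    """Per-clock throughput change, ASSUMING fps scales linearly with clock."""
    return (fps_avx512 / fps_base) * (base_ghz / avx512_ghz) - 1.0


print(f"break-even per-clock gain: {break_even:.1%}")
for name, (with_avx512, without) in fps.items():
    net = with_avx512 / without - 1.0
    per_clock = implied_per_clock_gain(with_avx512, without)
    print(f"{name}: net {net:+.1%}, implied per-clock {per_clock:+.1%}")
```

Under those assumptions the AVX-512 kernels would have to deliver about 21% more work per clock to break even. The measured fps suggests they deliver less than that on this machine, which is consistent with the net slowdown.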
WhatZit
12th April 2018, 11:10
I'm guessing that the gains from AVX512 didn't outweigh the drop in clock speed and utilization.
Yep, a Catch-22 also discovered by Cloudflare after some cryptography assessments: https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/
nevcairiel
12th April 2018, 12:47
Asmodian runs without an AVX512 offset, which would instantly crash his system under a strong AVX512 workload, so clearly it's faster with some "light" AVX512 usage. Usually you need at least a -10 offset or so to keep it stable under strong AVX512 load (or boost voltages substantially, for more heat). Non-OCed Xeon CPUs probably downclock quite substantially.
Barough
12th April 2018, 14:18
x265 v2.7+337-54ff74d2b635 (http://www.mediafire.com/file/1wohb1plbu02qvy/) (GCC 7.3.0, 32 & 64-bit 8/10/12bit Multilib Windows Binaries)
https://bitbucket.org/multicoreware/x265/commits/branch/default
burfadel
12th April 2018, 14:26
Probably best to utilise AVX-512 where it gives the best gains without triggering thermal throttling. The good thing, at least, is that with 307 separate patches this can be whittled down. If a function is frequently used and gives only a small gain, the encode may actually be faster with it omitted, due to the throttling the patch causes. Even if throttling isn't triggered on a particular rig, the temperature difference should be taken into account to cover typical situations.
LigH
12th April 2018, 17:52
x265 2.7+337-54ff74d2b635 (http://www.mediafire.com/file/m1m441ep3eeyy5g/x265_2.7%2B337-54ff74d2b635.7z)
Merge with default; prep for v3.0
Support for HLG-graded content and pic_struct
Fix conditions for single-sei NAL
Fix 32 bit build error (means: AVX-512 support is only included in x86-64 architecture target)
(VMAF support to report per frame and aggregate VMAF score — unfortunately not yet? available for Windows builds)
New CLI parameters:
--atc-sei <integer> Emit the alternative transfer characteristics SEI message where the integer is the preferred transfer characteristics. Default disabled
--pic-struct <integer> Set the picture structure and emits it in the picture timing SEI message. Values in the range 0..12. See D.3.3 of the HEVC spec. for a detailed explanation.
Asmodian
12th April 2018, 17:57
Asmodian runs without an AVX512 offset, which would instantly crash his system under a strong AVX512 workload, so clearly it's faster with some "light" AVX512 usage. Usually you need at least a -10 offset or so to keep it stable under strong AVX512 load (or boost voltages substantially, for more heat). Non-OCed Xeon CPUs probably downclock quite substantially.
I had downclocked from my normal max clocks when running without an AVX offset.
I also ran some tests at my normal OC settings with -2, -4 multiplier offsets. 4.8 GHz max core, 4.6 GHz AVX2, 4.4 GHz AVX-512.
AVX512: encoded 1000 frames in 310.46s (3.22 fps), 4037.41 kb/s, Avg QP:20.64
AVX2: encoded 1000 frames in 335.85s (2.98 fps), 4037.41 kb/s, Avg QP:20.64
It would probably still melt with a heavy AVX-512 load but it also wasn't completely maxed. AVX-512 ran cooler than AVX2 at these settings. I am not sure why my AVX2 run only had the same speed as the previous 4.5 GHz encode, maybe a latency penalty due to the core changing states.
This is a binned, delidded, and water cooled CPU... other systems may have different results. :)
Edit: If I run Prime95 (p95v294b8) with AVX-512 at 4.5 GHz I do get thermal throttling.
nevcairiel
12th April 2018, 20:57
Edit: If I run Prime95 (p95v294b8) with AVX-512 at 4.5 GHz I do get thermal throttling.
Try with LinX/Linpack and see your system die. Prime95 does not fully use AVX512 yet (only trial factoring, not full FFTs)
Stephen R. Savage
12th April 2018, 21:06
Try with LinX/Linpack and see your system die. Prime95 does not fully use AVX512 yet (only trial factoring, not full FFTs)
It's actually not so bad at higher frequencies, because each 100 MHz increment saves a lot more power, compared to 2.5 GHz server SKUs. i9-7900X can reach 4.1-4.2 GHz AVX-512 frequency with an aftermarket cooling solution.
jlpsvk
12th April 2018, 21:37
x265 has static levels of refinement (--refine-inter <level> / --refine-intra <level>) which can be used with --analysis-reuse-level 10.
Efficiency in terms of quality increases as the level of refinement increases. This quality increase results from additional computation, which increases the overall encoding time.
For a better quality-speed trade-off, dynamic refinement was introduced, where the encoder dynamically switches between different inter refine levels.
This exploits the fact that not all CUs need to be encoded with the same level for better performance/quality.
Considering the complexity of video content and the analysis information from first pass, the encoder can intelligently decide the optimal level of refinement for each CU.
Intra frames are usually encoded with best quality as they are used as references by the consecutive frames. Hence error introduced in intra frames due to reusing analysis data can propagate to frames that use these intra frames as reference.
To minimize the chances of error propagation, refine-intra 4 (level with best quality) restricts reusing analysis data for intra frames and forces the encoder to perform full intra analysis in the second pass.
This is why the x265 documentation suggests using dynamic refinement along with refine-intra 4; this setting is expected to give better quality than other refine-intra levels for some videos.
any suggested quality wise settings recommendation for 4K HDR encoding? with CRF ie 17? :)
nevcairiel
12th April 2018, 21:37
It's actually not so bad at higher frequencies, because each 100 MHz increment saves a lot more power, compared to 2.5 GHz server SKUs. i9-7900X can reach 4.1-4.2 GHz AVX-512 frequency with an aftermarket cooling solution.
You can reach that if you boost the power you give the CPU, but unfortunately that also boosts the power outside of AVX512 mode, making your CPU overall less efficient. The integrated voltage controller has no option to increase the core voltage only in AVX512 mode, unfortunately.
But this is probably going a bit off-topic for X265. :)
I would've thought the X265 people already learned the down-clocking lesson with AVX2 though, where they experienced the same effect - fancy instructions that made the overall encode slower, especially on server systems, due to clock changes.
mandarinka
12th April 2018, 21:44
https://forums.anandtech.com/threads/intel-skylake-kaby-lake.2428363/page-662#post-39149633
RZN vs. CFL vs. SKL-X in X265 2.5+31:
RZN: w/ AVX2 = 100.00%, w/o AVX2 = 105.21%
CFL: w/ AVX2 = 130.61%, w/o AVX2 = 101.13%
SKL-X: w/ AVX2 = 135.47%, w/o AVX2 = 105.21%
Ryzen's performance without AVX2 is impressive, but it is sad to see that there is still a penalty (like on Excavator) when running 256-bit code.
I wish somebody would adjust the CPU detection code to disable AVX2 on Zen. Easy performance gain just from that simple change: Zen gets 5.2% faster by disabling AVX2.
jlpsvk
12th April 2018, 21:51
is it just me? using cpu capabilities with the new x265 not listing AVX-512. :( i7-7820X
Asmodian
12th April 2018, 22:56
Don't forget the "--asm avx512", it isn't enabled by default. This seems good if it is slower on most systems due to the multiplier offsets for AVX-512.
jlpsvk
12th April 2018, 23:01
@Asmodian
aaaaah... forgot it.. :D
nevcairiel
12th April 2018, 23:28
To be fair, the AVX2 speedup is still larger than the frequency penalty, so it made sense.
It is now, because they reined in AVX2 use in some irrelevant functions with minimal speedups to reduce the effect of downclocks. They even gave a presentation about that "adventure" and their findings at some conference once.
RieGo
13th April 2018, 15:43
no AVX512:
encoded 1780 frames in 169.38s (10.51 fps), 2541.97 kb/s, Avg QP:20.40
AVX512:
encoded 1780 frames in 161.94s (10.99 fps), 2541.97 kb/s, Avg QP:20.40
makes a ~5% speed increase
considering avx512 encode was almost 10°C cooler, so maybe i can get away with +100MHz. i like it :D
Selur
13th April 2018, 17:15
okay, so nice, but not worth buying a new cpu because of it.
Xizer
14th April 2018, 03:01
Is anyone else having problems with getting it to work on Skylake Xeons?
x265 crashes on my Xeon Platinum 8176 server when I start it with the --asm avx512 flag.
Error: fwrite() call failed when writing frame: 3, plane: 2, errno: 32
Output 80 frames in 13.46 fps (5.90 fps)
It'll work fine on the Xeon Platinum machine as soon as I remove the avx512 flag.
And it works on my i9 7940X with the avx512 flag.
WhatZit
14th April 2018, 03:30
okay, so nice, but not worth buying a new cpu because of it.
Not until the i7-9700K (https://en.wikichip.org/wiki/intel/core_i7/i7-9700k) shows up (November 2018?).
x265 crashes on my Xeon Platinum 8176 server when I start it with the --asm avx512 flag.
One bug that can lead to a crash is fixed in version 2.7+338, so please do not use older versions. Which version do you use?
RieGo
14th April 2018, 10:59
i'm seeing very inconsistent results
CRF-20 preset-"medium"
--------------------
Run1:
encoded 1128 frames in 27.42s (41.14 fps), 3355.47 kb/s, Avg QP:20.21
Run2 (AVX512):
encoded 1128 frames in 30.23s (37.32 fps), 3355.47 kb/s, Avg QP:20.21
Run3:
encoded 1128 frames in 27.77s (40.62 fps), 3355.47 kb/s, Avg QP:20.21
Run4 (AVX512):
encoded 1128 frames in 27.04s (41.71 fps), 3355.47 kb/s, Avg QP:20.21
but temps are still very low on avx512. i didn't expect this. probably x265 just uses very little avx512 and that's why we don't see much improvement - if any - running on lowered frequency.
maybe we can run at default speed without getting too much heat/power, if we only use it on x265... i'll give it a try
btw: i agree that avx512 is not worth getting a new cpu :D
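To put a number on "inconsistent", here is a small Python sketch comparing the difference between the two groups with the run-to-run spread. It uses only the four fps values above; two runs per group is far too few for real statistics, so this is just a sanity check.

```python
from statistics import mean

# fps from the four medium-preset runs above
baseline = [41.14, 40.62]  # Run1, Run3 (no --asm avx512)
avx512 = [37.32, 41.71]    # Run2, Run4 (--asm avx512)

diff_of_means = mean(avx512) - mean(baseline)
spread_baseline = max(baseline) - min(baseline)
spread_avx512 = max(avx512) - min(avx512)

print(f"mean difference (AVX-512 - baseline): {diff_of_means:+.2f} fps")
print(f"spread, baseline runs: {spread_baseline:.2f} fps")
print(f"spread, AVX-512 runs:  {spread_avx512:.2f} fps")

# If the spread within a group exceeds the gap between groups,
# these runs cannot tell the two settings apart.
inconclusive = abs(diff_of_means) < max(spread_baseline, spread_avx512)
print("inconclusive" if inconclusive else "difference exceeds run-to-run spread")
```

With these four runs the spread between the two AVX-512 runs is larger than the gap between the group means, so more repetitions (or a longer clip) would be needed before drawing a conclusion either way.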
LigH
14th April 2018, 22:54
x265 2.7+340-aa9102400f24 (https://www.mediafire.com/file/k0qeet33wv0053d/x265_2.7%2B340-aa9102400f24.7z)
remove unused asmname from x265_param; added a newline in the help
(fixed VMAF warning not applicable under Windows)
Xizer
15th April 2018, 18:31
x265 2.7+340-aa9102400f24 (https://www.mediafire.com/file/k0qeet33wv0053d/x265_2.7%2B340-aa9102400f24.7z)
remove unused asmname from x265_param (may fix some crashes on Xeons); added a newline in the help
(fixed VMAF warning not applicable under Windows)
2.7+340 is the version I'm using when it crashes.
LigH
15th April 2018, 18:36
Sorry to hear ... so the fix is not yet committed, only proposed?
2.7+340 is the version I'm using when it crashes.
Thanks for the additional info. It looks like a more serious bug (x265 works for a while). The error message is probably from the decoding app that writes the picture data to x265 via a pipe.
There are many possible reasons for the crash; one of them is the OS (and the msvcrt.dll file in your OS). Could you test the VS 2015 and VS 2017 x265 binaries? They are not based on msvcrt.dll.
LigH
15th April 2018, 20:22
I don't have such compilers installed; someone else may have to build them.
foxyshadis
15th April 2018, 21:24
x265-2.7+336-07defe235cde.7z VS 17 x64 (https://www.dropbox.com/s/hfyp95irk7j145i/x265-2.7%2B336-07defe235cde.7z?dl=0), crt is statically linked so no install needed. Debug pdbs included.
Bisection is one of the most common methods of finding bugs.
In the source file common/x86/asm-primitives.cpp, the function pointer assignments for the avx512 code run from line 4696 to 5385.
We can turn off (comment out) the first half of these assignments (x265-1.exe) and then the second half (x265-2.exe).
You can download avx512-bisect1.7z (http://www.msystem.waw.pl/x265/avx512-bisect1.7z), VS 2015 binaries with diff files -- if x265.exe (from clean sources) hangs, please try x265-1.exe and then x265-2.exe -- one of them should hang and the other should not.
Please report back the results (which of the 3 binaries hang and which do not).
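The bisection idea can be sketched in Python as below. This is only an illustration: `bisect_culprit`, the kernel names, and the `crashes` callback are hypothetical stand-ins for "build a binary with only these avx512 function pointers assigned and see whether it crashes". It assumes exactly one faulty kernel and a reproducible crash.

```python
def bisect_culprit(kernels, crashes):
    """Return the one kernel whose presence makes crashes(subset) true.

    Assumes exactly one faulty kernel and a deterministic crashes() check.
    """
    candidates = list(kernels)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Test with only the first half enabled: if it still crashes, the
        # culprit is in that half, otherwise it must be in the second half.
        candidates = first if crashes(first) else second
    return candidates[0]


# Hypothetical data: one placeholder name per source line in the range
# 4696..5385 mentioned above, with one of them arbitrarily marked as bad.
kernels = [f"kernel_{i}" for i in range(690)]
bad = "kernel_417"
found = bisect_culprit(kernels, lambda subset: bad in subset)
print(found)
```

Each round halves the candidate set, so a range of this size needs about ten builds. If every build crashes, including one with nothing enabled, the fault lies outside the set being bisected and this approach cannot locate it.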
Xizer
15th April 2018, 22:38
Bisection is one of the most common methods of finding bugs.
In the source file common/x86/asm-primitives.cpp, the function pointer assignments for the avx512 code run from line 4696 to 5385.
We can turn off (comment out) the first half of these assignments (x265-1.exe) and then the second half (x265-2.exe).
You can download avx512-bisect1.7z (http://www.msystem.waw.pl/x265/avx512-bisect1.7z), VS 2015 binaries with diff files -- if x265.exe (from clean sources) hangs, please try x265-1.exe and then x265-2.exe -- one of them should hang and the other should not.
Please report back the results (which of the 3 binaries hang and which do not).
All three binaries crash with the avx512 flag I'm afraid.
Is there additional information that can be provided for debug?
All three binaries crash with the avx512 flag I'm afraid.
Is there additional information that can be provided for debug?
Thanks for the info.
Could you post results of the command
x265 -V && x265 --asm avx2 -V && x265 --asm avx512 -V
For example, on my system it looks like this:
F:\x265p\ma\avx512>x265 -V && x265 --asm avx2 -V && x265 --asm avx512 -V
x265 [info]: HEVC encoder version 2.7+340-aa9102400f24
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: HEVC encoder version 2.7+340-aa9102400f24
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: HEVC encoder version 2.7+340-aa9102400f24
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
This bug may not be directly related to the avx512 code -- could you check whether it hangs if you use '--asm avx2' instead of '--asm avx512' (it is important to use the --asm avx2 option)?
------------------------------------------------------
Your OS is Win 8.1, which does not support avx512. This bug in x265 is not technical but conceptual -- avx512 is not auto-detected by default, so the option '--asm avx512' should not turn on avx512 without any check.
You can test the file avx512-patch.7z (http://www.msystem.waw.pl/x265/avx512-patch.7z), an x265 build that checks what the CPU and OS support, up to avx512, when the option '--asm avx512' is used.
On Win 8.1 it should work with '--asm avx512' exactly as it does without this option; on Win 10 it should turn on avx512 if you have a CPU with avx512 and you use the option '--asm avx512'.
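For reference, the usual way to combine the CPU check with the OS check is to look at the XCR0 register (read with XGETBV) in addition to the CPUID feature bits. The Python sketch below shows only the decision logic; it is not x265's actual code, the function name is made up, and the three inputs are assumed to come from CPUID/XGETBV, which Python cannot read directly. The bit positions are the ones documented in the Intel SDM.

```python
# XCR0 state-component bits (Intel SDM): the OS sets these when it
# saves and restores the corresponding registers on context switches.
XCR0_SSE = 1 << 1        # XMM state
XCR0_AVX = 1 << 2        # upper halves of YMM
XCR0_OPMASK = 1 << 5     # k0-k7 mask registers
XCR0_ZMM_HI256 = 1 << 6  # upper halves of ZMM0-ZMM15
XCR0_HI16_ZMM = 1 << 7   # ZMM16-ZMM31
AVX512_STATE = (XCR0_SSE | XCR0_AVX | XCR0_OPMASK
                | XCR0_ZMM_HI256 | XCR0_HI16_ZMM)


def avx512_usable(cpu_has_avx512f, os_has_osxsave, xcr0):
    """True only if the CPU reports AVX-512F and the OS manages ZMM state."""
    if not (cpu_has_avx512f and os_has_osxsave):
        return False
    return (xcr0 & AVX512_STATE) == AVX512_STATE


# Illustrative XCR0 values: 0xE7 = x87+SSE+AVX+AVX-512 state enabled,
# 0x07 = x87+SSE+AVX only (an OS that does not enable AVX-512 state).
print(avx512_usable(True, True, 0xE7))  # CPU and OS both ready
print(avx512_usable(True, True, 0x07))  # CPU capable, OS not saving ZMM state
```

A request such as '--asm avx512' would then only take effect when this check passes, which matches the behaviour described for the patched build.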
Bhavnahari
17th April 2018, 07:31
any suggested quality wise settings recommendation for 4K HDR encoding? with CRF ie 17? :)
If the data that is being reused comes from encoding a downscaled video (scale-factor=2), it does not make a lot of sense to use intra/inter refinement levels 0/1 as levels 0 and 1 reuse most of the information from the previous pass with no or minimal re-evaluation of analysis information. This can have a huge impact on the quality of the encode.
We have observed that --refine-inter=3 and --refine-intra=4 gives the best quality, even better than standalone x265 encodes in some cases, with a performance gain of up to 1.8X.
For 4K HDR content, you will have to modify the display settings based on the monitor. Please refer to the docs for more information - http://x265.readthedocs.io/en/default/cli.html#vui-video-usability-information-options
jlpsvk
18th April 2018, 19:48
If the data that is being reused comes from encoding a downscaled video (scale-factor=2),
it does not make a lot of sense to use intra/inter refinement levels 0/1 as levels 0 and 1 reuse most of the information from
the previous pass with no or minimal re-evaluation of analysis information. This can have a huge impact on the quality of the encode.
We have observed that --refine-inter=3 and --refine-intra=4 gives the best quality, even better than standalone x265 encodes in some cases,
with a performance gain of up to 1.8X.
My settings then...
--crf 17 --profile main10 --level-idc 5.1 --output-depth 10 --ctu 32 --amp --vbv-bufsize 160000 --vbv-maxrate 160000 --me star
--max-merge 5 --rc-lookahead 40 --lookahead-slices 4 --gop-lookahead 34 --ref 5 --hdr --hdr-opt --repeat-headers --no-info --no-deblock
--no-sao --no-strong-intra-smoothing --high-tier --refine-inter 3 --refine-intra 4
Of course, display settings are entered too... :)
Warning:
x265 [warning]: Intra refinement requires analysis load, analysis-reuse-level 10, scale factor. Disabling intra refine.
x265 [warning]: Inter refinement requires analysis load, analysis-reuse-level 10, scale factor. Disabling inter refine.
Can it be used with CRF? Or? Some parameters missing?
Loomes
18th April 2018, 21:47
So far no success for me running the avx512 option on my 7820X on Windows 10 64-bit. I tried
ffmpeg -i source.mkv -f yuv4mpegpipe - | x265.exe --preset slow --asm avx512 --crf 21 --y4m - -o dest.h265
which crashes immediately. Using --asm avx2 works fine.
I ran
x265.exe -V && x265.exe --asm avx2 -V && x265.exe --asm avx512 -V
and it showed
x265 [info]: HEVC encoder version 2.7+340-aa9102400f24
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: HEVC encoder version 2.7+340-aa9102400f24
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: HEVC encoder version 2.7+340-aa9102400f24
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
You can test the file avx512-patch.7z (http://www.msystem.waw.pl/x265/avx512-patch.7z), an x265 build that checks what the CPU and OS support, up to avx512, when the option '--asm avx512' is used.
I tried from that package x265.exe, x265-1.exe and x265-2.exe. All of them crashed immediately. I am using the latest nightly version of ffmpeg.
Am I missing something?
Asmodian
18th April 2018, 22:45
I tried from that package x265.exe, x265-1.exe and x265-2.exe. All of them crashed immediately. I am using the latest nightly version of ffmpeg.
Am I missing something?
I had crashes using the builds foxyshadis posted but I was successful using LigH's build from this post (https://forum.doom9.org/showthread.php?p=1839464#post1839464). Win10 and an i9-7900X. Have you tried them?
Loomes
18th April 2018, 22:57
I had crashes using the builds foxyshadis posted but I was successful using LigH's build from this post (https://forum.doom9.org/showthread.php?p=1839464#post1839464). Win10 and an i9-7900X. Have you tried them?
I did now but it also results in an instant crash. Are you using a ffmpeg pipe like me or what is your command line?
I tried from that package x265.exe, x265-1.exe and x265-2.exe. All of them crashed immediately. I am using the latest nightly version of ffmpeg.
Am I missing something?
x265-1.exe == x265-2.exe == x265.exe in package avx512-bisect1.7z (I commented out the code for 8-bit encoding but compiled the 10-bit version, sorry for that). So that package can be forgotten.
To be precise: does x265.exe from package avx512-patch.7z (http://www.msystem.waw.pl/x265/avx512-patch.7z) give this output on your OS:
x265 --asm avx512 -V
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
?
Loomes
19th April 2018, 01:18
To be precise: does x265.exe from package avx512-patch.7z (http://www.msystem.waw.pl/x265/avx512-patch.7z) give this output on your OS:
x265 --asm avx512 -V
x265 [info]: build info [Windows][MSVC 1900][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
?
Confirmed. I tried again on Windows 10 64bit with an 7820x using the command line shown above.
Asmodian
19th April 2018, 06:31
I did now but it also results in an instant crash. Are you using a ffmpeg pipe like me or what is your command line?
No, actually I was using avs2pipemod.exe (ver 1.1.1 Aug 15, 2016) piping to x265.exe with y4m:
C:\Tools\avs2pipemod.exe -y4mp=1:1 %1 | C:\Tools\x265.exe --input - --y4m -o "D:\Temp\%~n1.mkv" --asm avx512 --preset veryslow --crf 18.5 --output-depth 10
edit:
"x265 --asm avx512 -V" gives:
x265 [info]: HEVC encoder version 2.7+340-aa9102400f24
x265 [info]: build info [Windows][GCC 7.3.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2 AVX512
Confirmed. I tried again on Windows 10 64bit with an 7820x using the command line shown above.
Thanks for the confirmation -- it looks like your CPU and OS support avx512.
I've prepared some binaries to narrow down this bug -- avx512-bisect2.7z (http://www.msystem.waw.pl/x265/avx512-bisect2.7z).
x265-0.exe -- 10-bit encoder with all avx512 function pointers commented out (it works on my i7 8700 with the --asm avx512 option). It checks whether the bug occurs before any avx512 code runs, while the execution path is otherwise exactly as in the avx512 case.
x265-01.exe -- only the first quarter of the function pointers assigned to avx512 code.
x265-02.exe -- only the second quarter of the function pointers assigned to avx512 code.
x265-03.exe -- only the third quarter of the function pointers assigned to avx512 code.
x265-04.exe -- only the last quarter of the function pointers assigned to avx512 code.
Could you test these 5 exes and report back which work and which do not?
Loomes
19th April 2018, 13:39
Could you test these 5 exes and report back which work and which do not?
I'm home in about 6 hours and will do. By the way: Is it correct that --asm avx512 will not work on Windows 7/8 64bit, no matter what?
nevcairiel
19th April 2018, 13:56
Is it correct that --asm avx512 will not work on Windows 7/8 64bit, no matter what?
That would appear to be the case. If you force avx512 usage through --asm avx512 on another OS, it'll just crash.
That's really the problem with using the --asm option: it overrides the CPU/OS feature detection. A different option that enables it only when present would be better.