x265 HEVC Encoder
Jamaika
21st April 2017, 05:43
I'm not interested in generating the data, just consuming it to various degrees, possibly to transmit it to a TV through HDMI down the line, have to at least try to keep PCs able to use those new formats.
For transmission, that's apparently in CTA-861-G (i.e. the HDMI 2.0/2.1 standard); perhaps some features (like dynamic metadata) can be supported on older HDMI interfaces through firmware updates.
Interesting thought. I don't know how it is, but I think @sneaker once wrote that HDR is also possible for 8-bit video in the VP9 codec. It seems to me that Google's policy has changed as well: the site only talks about 10-bit movies.
https://support.google.com/youtube/answer/7126552?hl=en
http://www.androidcentral.com/new-nvidia-shield-streaming-box-includes-google-assistant-4k-hdr-streaming
https://www.heise.de/newsticker/meldung/AMD-Radeon-RX-400-HDR-Gaming-ueber-HDMI-nur-mit-8-statt-10-Bit-3488970.html
http://static.frazpc.pl/cms/2016/11/file-14bc21be7eaeed046f-600x333.jpeg
Dolby Vision also works with the older HDMI 1.4a standard, while HDR10 requires HDMI 2.0. Dolby Vision is backward compatible with HDR10, but it's not clear whether it will work with the new HDR10+ standard.
But what exactly does that mean? HDR10/HDR10+ is the current industry standard for HDR in consumer televisions. This first-generation “open” format technology is the starting point for High Dynamic Range, which needs a compatible interface—either HDMI 2.0a/HDMI 2.1 or Internet connection...
Make sure your TV and AV receiver firmware are up to date. Many newer TVs and devices can take updates—if you find that it doesn't support some 4K or HDR features, a TV or receiver firmware update may resolve the problem. Check your TV or device manual to see how to update the firmware.
Edit: Does anyone know what this Dynamic Range Pro for HDR10 is? Is it also set in HEVC codecs?
benwaggoner
21st April 2017, 17:24
Interesting thought. I don't know how it is, but I think @sneaker once wrote that HDR is also possible for 8-bit video in the VP9 codec. It seems to me that Google's policy has changed as well: the site only talks about 10-bit movies.
Doing some degree of HDR in 8-bit is possible. I had some working prototypes a couple of years ago. But using the full PQ range isn't feasible without a lot of banding. 10-bit is definitely required to do "standard" HDR over HDMI.
Edit: Does anyone know what this Dynamic Range Pro for HDR10 is? Is it also set in HEVC codecs?
The HDR10+/SMPTE 2094-40 metadata is inside the HEVC bitstream as SEI messages. I don't know if there is a final spec for transmission over HDMI.
Jamaika
21st April 2017, 18:05
The HDR10+/SMPTE 2094-40 metadata is inside the HEVC bitstream as SEI messages. I don't know if there is a final spec for transmission over HDMI.
Thanks for the answer. I ask out of curiosity, because Sony's advertising puzzles me: they claim products with 10x more contrast range than HDR10, called Dynamic Range Pro, which aren't HDR10+. Where is this extra information extracted from, the metadata?
This is getting even more puzzling for me.
nevcairiel
21st April 2017, 18:19
Sony's "Dynamic Range Pro" is not "dynamic HDR"; it's just static HDR with some proprietary contrast booster.
So far there have been 4 competing dynamic HDR concepts grouped under SMPTE 2094
2094-10: Dolby Vision
2094-20: Philips
2094-30: Technicolor (used by LG, IIRC)
2094-40: Samsung
It's a rather unfortunate situation that we have 4 competing concepts already, and possibly more in the future.
We'll have to see how they differ and if TVs will be able to just handle content in all of the formats.
From what I could turn up, the next release of the HEVC spec (October 2017) should also incorporate the SEI messages for these officially.
I also found information that claims that ST 2094-10 and ST 2094-20 are optional parts of the UHD Blu-ray Specification, but the other two didn't make it in at all.
Transmission over HDMI will likely require HDMI 2.1 for all variants of SMPTE ST 2094, with the exception of Dolby's proprietary transmission format, which requires full hardware support on both ends.
Sagittaire
22nd April 2017, 12:57
I want to make a direct comparison with HT on and HT off.
Is it possible to change the number of WPP rows in x265?
dipje
22nd April 2017, 15:59
Has the 10 bit lambda table been incorporated into the latest build? If so, what build number?
In the bitbucket of multicoreware (official x265 repo, I'm guessing) I see the commit of the new lambda tables (10bit / 12bit) being committed on the 13th of April, both in the 'default' branch and the 'stable' branch.
I'm guessing any build from that point on has it?
Or is the commit I'm seeing still something that needs to be approved?
Selur
22nd April 2017, 16:12
I'm guessing any build from that point on has it?
not exactly, it was merged to default branch two days ago, and since then any new build includes the new lambda tables.
I want to make a direct comparison with HT on and HT off.
Is it possible to change the number of WPP rows in x265?
To simulate encoding on a CPU with a different number of logical cores, two options are needed: '--pools <N>' and '-F <NF>', where N is the number of logical cores that you want to simulate and NF is the number of frame threads that x265 uses for N logical cores. The formula is at:
https://bitbucket.org/multicoreware/x265/src/2c6e6c9c3da72aaddb33565d7031918fb5a37097/source/encoder/encoder.cpp?at=default&fileviewer=file-view-default#encoder.cpp-137
Example: if you want to simulate a system with 4 logical cores, use
--pools 4 -F 2
for 8 cores:
--pools 8 -F 3
for 20 cores:
--pools 20 -F 5
for 32 cores and 1080p source movie:
--pools 32 -F 6
WPP can only be turned on or off (with the '--no-wpp' option, the number of frame threads is min(logical cores, 16) for a high-resolution source).
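The mapping from logical cores to frame threads behind these examples can be sketched in Python. The 4, 8, 20 and 32-core results match the examples in this post; the other thresholds (16 cores, 8 frame threads for sources taller than 2000 lines, a single frame thread below 4 cores) are my reading of the linked encoder.cpp and should be treated as assumptions. The function names are made up for illustration.

```python
def auto_frame_threads(logical_cores, source_height=1080, wpp=True):
    """Approximate x265's automatic frame-thread choice (a sketch, not the real code)."""
    if not wpp:
        # From the post: with --no-wpp it is min(logical cores, 16)
        # for a high-resolution source.
        return min(logical_cores, 16)
    if logical_cores >= 32:
        # Assumption: sources taller than 2000 lines get 8 frame threads.
        return 8 if source_height > 2000 else 6
    if logical_cores >= 16:
        return 5
    if logical_cores >= 8:
        return 3
    if logical_cores >= 4:
        return 2
    return 1  # assumption for fewer than 4 logical cores


def simulate_cores_args(logical_cores, source_height=1080):
    """Build the '--pools N -F NF' arguments for simulating a given core count."""
    nf = auto_frame_threads(logical_cores, source_height)
    return ["--pools", str(logical_cores), "-F", str(nf)]


if __name__ == "__main__":
    for cores in (4, 8, 20, 32):
        print(" ".join(simulate_cores_args(cores)))
```

For a real HT on/off comparison the printed arguments would simply be appended to the otherwise identical x265 command line.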
Natty
22nd April 2017, 19:53
how to use 10 bit lambda table
nevcairiel
22nd April 2017, 19:58
how to use 10 bit lambda table
It's always used when you encode 10-bit content, and any builds of the last few days should have the new table.
pradeeprama
23rd April 2017, 01:36
x265 version 2.4 has been released. This release incorporates support for the new HDR10+ standard, and revised lambda tables for main, main10, and main12 profiles that significantly improve visual quality!
Version 2.4 can now be downloaded from here (md5: ab0986aa5c4465b874de94095b0d0cae). Full documentation is available at http://x265.readthedocs.io/en/stable/.
Release Notes for Version 2.4
======================
Release date - 22nd April, 2017.
Encoder enhancements
----------------------------------
1. HDR10+ supported. Dynamic metadata may be either supplied as a bitstream via the userSEI field of x265_picture, or as a json file that can be parsed by x265 and inserted into the bitstream; use --dhdr10-info to specify json file name, and --dhdr10-opt to enable optimization of inserting tone-map information only at IDR frames, or when the tone map information changes.
2. Lambda tables for 8, 10, and 12-bit encoding revised, resulting in significant enhancement to subjective visual quality.
3. Enhanced HDR10 encoding with HDR-specific QP optimizations for chroma, and luma planes of WCG content enabled; use --hdr-opt to activate.
4. Ability to accept analysis information from other previous encodes (that may or may not be x265), and selectively reuse and refine analysis for encoding subsequent passes enabled with the --refine-level option.
5. Slow and veryslow presets receive a 20% speed boost at iso-quality by enabling the --limit-tu option.
6. The bitrate target for x265 can now be dynamically reconfigured via the reconfigure API.
7. Performance optimized SAO algorithm introduced via the --limit-sao option; seeing 10% speed benefits at faster presets.
API changes
-------------------
1. x265_reconfigure API now also accepts rate-control parameters for dynamic reconfiguration.
2. Several additions to data fields in x265_analysis to support --refine-level: see x265.h for more details.
Bug fixes
--------------
1. Avoid negative offsets in x265 lambda2 table with SAO enabled.
2. Fix mingw32 build error.
3. Seek now enabled for pipe input, in addition to file-based input
4. Fix issue of statically linking core-utils not working in linux.
5. Fix visual artifacts with --multi-pass-opt-distortion with VBV.
6. Fix bufferFill stats reported in csv.
Happy Compressing!
x265 team.
Midzuki
23rd April 2017, 02:26
x265.exe 2.4+2-5bc5e73760cd
https://forum.videohelp.com/threads/357754-%5BHEVC%5D-x265-EXE-mingw-builds?p=2483884#post2483884
dipje
23rd April 2017, 08:37
not exactly, it was merged to default branch two days ago, and since then any new build includes the new lambda tables.
Ah so that was the catch I was missing :).
It looked reversed on my end (I apparently read it wrong in Bitbucket?),
like it was added to the default branch, but 2 days ago that default branch was merged into the 'stable' branch.
So it depends which branch is used in the builds, but I believe most people who are nice enough around here use the default branch.
Anyway, talking about nice people who build, can we expect a 2.4 build from LigH, or are you still having compilation problems?
Midzuki
23rd April 2017, 08:50
Anyway, talking about nice people who build, can we expect a 2.4 build from LigH, or are you still having compilation problems?
Over here, no compiler warnings happened.
But I didn't enable the HDR10+ thing, I was too lazy to edit CMakeLists.txt.
LigH's next build of x265 will appear sooner or later anyway :)
Selur
23rd April 2017, 10:14
quick question about the new options:
--dhdr10-info <filename> JSON file containing the Creative Intent Metadata to be encoded as Dynamic Tone Mapping
--[no-]dhdr10-opt Insert tone mapping SEI only for IDR frames and when the tone mapping information changes.Default disabled
Do I get it right that: dhdr10-info is always required for dynamic hdr10 and dhdr10-opt is an option which allows to limit how often the tone mapping SEI is inserted?
Also does it also require '--hdr' or are those separate?
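To make the two options concrete, here is a small Python sketch that assembles an x265 command line from the help text quoted above. Only --dhdr10-info and --dhdr10-opt come from that help text; the file names and the helper name are hypothetical, and whether further HDR options are required is exactly the open question here, so none are added.

```python
import shlex


def dhdr10_command(input_path, output_path, json_path, optimize=False):
    """Assemble an x265 argument list for dynamic HDR10 metadata (illustrative only)."""
    args = ["x265", "--input", input_path, "--dhdr10-info", json_path]
    if optimize:
        # Per the help text: insert the tone mapping SEI only for IDR frames
        # and when the tone mapping information changes.
        args.append("--dhdr10-opt")
    args += ["--output", output_path]
    return args


if __name__ == "__main__":
    # File names below are placeholders, not files from this thread.
    print(shlex.join(dhdr10_command("in.y4m", "out.hevc", "meta.json", optimize=True)))
```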
LigH
23rd April 2017, 11:34
Mateusz Brzostek proposed a patch:
[x265] [PATCH] cmake: set '-std=gnu++11' for GCC if ENABLE_DYNAMIC_HDR10 is on (https://mailman.videolan.org/pipermail/x265-devel/2017-April/010984.html)
but it has not yet been committed. And I have only little experience in manual working set management with TortoiseHg; I prefer using only committed updates.
_
P.S.: I tried to import this patch, but it failed with code 255. Might be because the base changeset is already outdated?
Selur
23rd April 2017, 11:41
multilib.sh should also be adjusted,..
for clang it's c++11 instead of gnu++11 and in encoder.h:
- #include "dynamicHDR10\hdr10plus.h"
+ #include "dynamicHDR10/hdr10plus.h"
LigH
23rd April 2017, 11:55
Of course, I made my own build script by adding "-DENABLE_DYNAMIC_HDR10=ON" to the 10 and 12 bit libraries.
Barough
23rd April 2017, 12:42
x265 v2.4+2-5bc5e73760cd (http://www55.zippyshare.com/v/HMZWezgh/file.html) (MSYS/MinGW, GCC 6.3.0, 32 & 64bit 8/10/12bit multilib EXEs)
x265 : HEVC encoder version 2.4+2-5bc5e73760cd
x265 [info]: build info [Windows][GCC 6.3.0][32 bit/64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2
https://bitbucket.org/multicoreware/x265/commits/branch/default
HDR10 Enabled
I tried to import this patch, but it failed with code 255. Might be because the base changeset is already outdated?
If you save 3 attachments "warnings.patch", "cmake-hdr10.patch" and "dhdr10.patch" to your x265 folder, you can execute (in MSYS)
patch -p1 <warnings.patch
patch -p1 <cmake-hdr10.patch
patch -p1 <dhdr10.patch
If you want clean source (remove these patches), please execute
hg update -C
stax76
23rd April 2017, 15:01
--limit-tu 4 is in the slower and veryslow presets
a new issue I see is that mediainfo doesn't show any info like depth 10, main 10, resolution 10 when using mkv
is this issue caused by mkvmerge, mediainfo or x265?
for mp4 output it's there:
https://s16.postimg.org/6fwvm77fl/Unbenannt.png (https://postimg.org/image/6fwvm77fl/)
Selur
23rd April 2017, 15:07
a new issue I see is that mediainfo doesn't show any info like depth 10, main 10, resolution 10 when using mkv
That isn't new; it has been that way for a few months as far as I remember, but I never really looked into it :)
stax76
23rd April 2017, 15:12
This is really new, because every time I update the x265 build I make an 8-bit and a 10-bit encode to verify it's actually a multilib build.
Selur
23rd April 2017, 15:13
Strange, I noticed this months ago...
stax76
23rd April 2017, 15:14
I don't update mediainfo often, could be a mediainfo issue then, not sure when I updated it the last time.
sneaker_ger
23rd April 2017, 15:24
Mkvmerge somewhere between 10.0.0 and 11.0.0. But that doesn't necessarily mean it's a bug in mkvmerge.
stax76
23rd April 2017, 15:35
I've posted it also to Mosu's and Zenitram's thread.
LazyNcoder
24th April 2017, 13:51
Hi guys,
I want to know what's the deal with this --limit-tu option in the new v2.4. Is it on by default in the slow and veryslow presets, or should we enable it manually?
If it's not the default, what's the best value for it? Which level of the option would affect the quality the least?
I usually encode with the slow preset. Do I need other options we've talked about before, like --rskip --limit-refs 3 --limit-modes and so on? I care about quality the most, but it won't hurt to have faster encoding.
Thanks
stax76
24th April 2017, 14:02
It's 4 with slower and veryslow, otherwise 0.
https://x265.readthedocs.io/en/default/presets.html
https://bitbucket.org/multicoreware/x265/src/5bc5e73760cdb61d2674e74cc52149fa0603af8a/source/common/param.cpp?at=default&fileviewer=file-view-default#param.cpp-401
LigH
24th April 2017, 14:04
Verbose documentation (http://x265.readthedocs.io/en/default/cli.html?highlight=--limit-tu#cmdoption-limit-tu)
Preset options (http://x265.readthedocs.io/en/default/presets.html?highlight=--limit-tu)
As you can see, the developers enabled TU limiting mode 4 for presets "slower" and "veryslow" only ("placebo" always has everything enabled, no matter how much time it wastes for how little improvement; and any faster preset is probably still fast enough without limiting).
You may agree with me that an average user may not easily understand the meaning of the four different modes; it may require in-depth understanding of the encoder. I would recommend trusting the developers to have made a good choice in their presets, and anyone who is certain to know better may explain to us why... ;)
Motenai Yoda
24th April 2017, 14:24
Notice that it defaults to 4 on presets where tu-inter is also more than 1, because limit-tu only speeds things up when tu-inter > 1
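The preset defaults described in these replies boil down to a very small lookup. A minimal Python sketch, assuming the usual ten x265 preset names; the function name is hypothetical and the values are only those stated in this thread (4 for slower and veryslow, 0 otherwise):

```python
X265_PRESETS = (
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo",
)


def default_limit_tu(preset):
    """Return the --limit-tu default for a preset, as described in this thread."""
    if preset not in X265_PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    # Only slower and veryslow enable TU limiting (mode 4); placebo leaves it off.
    return 4 if preset in ("slower", "veryslow") else 0


if __name__ == "__main__":
    for name in X265_PRESETS:
        print(f"{name}: --limit-tu {default_limit_tu(name)}")
```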
littlepox
24th April 2017, 15:45
The reason for them to enable limit-tu 4 for slower & veryslow is to fight --amp.
--amp enables TOO MANY possible partitions; with close-to-zero gain.
LazyNcoder
24th April 2017, 17:10
Thank you guys.
So, there must be a mistake in the "release notes" of v2.4, because they state that the "Slow and veryslow" presets have changed.
I thought it was odd that the slower preset was left out. But now I know it's the slower and veryslow presets. No benefit for me. lol.
http://x265.readthedocs.io/en/default/releasenotes.html#version-2-4
troica
25th April 2017, 16:00
Hello guys is there any forum for the HM test model for HEVC? Just off topic-ing for a bit, since my question is for the HM test model.
Is the current HM model (16.3) now supporting error concealment capabilities per CTU (not just per frame) or still under development? Thank you!
LigH
25th April 2017, 19:06
It doesn't seem like this topic has been discussed often before; so just create a new thread in this forum (High Efficiency Video Coding (HEVC) (https://forum.doom9.org/forumdisplay.php?f=81)).
LigH
26th April 2017, 07:42
According to recent test reports in the x265 developer mailing list, GCC 6.x and Clang both seem to work well with "-std=c++11" (no need for gnu++11 for GCC), so a unified patch can be expected soon, to enable the required compiler mode when DHDR10 support is enabled. :cool:
I did a comparison of an old 10-bit lambda with a new 10-bit lambda.
Movie: Tears of Steel 4K downsized to 2K, encoded @ 1000 kb/s
old lambda - www.msystem.waw.pl/x265/tears-old-lambda.mkv
new lambda - www.msystem.waw.pl/x265/tears-new-lambda.mkv
command line (first & second pass):
ffmpeg -i ../tearsofsteel-4k.y4m -pix_fmt yuv420p16 -vf "scale=1920:-4:flags=bicubic+accurate_rnd+full_chroma_int+full_chroma_inp:param0=-0.5:param1=0.25,setsar=1" -v warning -strict -1 -f yuv4mpegpipe - | x265 --y4m - --bitrate 1000 -p9 --deblock -1 --keyint 480 --multi-pass-opt-distortion -o p1n-.hevc --pass 1
ffmpeg -i ../tearsofsteel-4k.y4m -pix_fmt yuv420p16 -vf "scale=1920:-4:flags=bicubic+accurate_rnd+full_chroma_int+full_chroma_inp:param0=-0.5:param1=0.25,setsar=1" -v warning -strict -1 -f yuv4mpegpipe - | x265 --y4m - --bitrate 1000 -p9 --deblock -1 --keyint 480 --multi-pass-opt-distortion -o p2n-.hevc --pass 2
Encoder log from 2 pass old lambda:
y4m [info]: 1920x804 fps 24/1 i420p16 sar 1:1 unknown frame count
raw [info]: output file: p2-.hevc
x265 [info]: HEVC encoder version 2.3+28-08a05ca9fd16
x265 [info]: build info [Windows][MSVC 1910][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 10 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(13 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 4 inter / 4 intra
x265 [info]: ME / range / subpel / merge : star / 92 / 5 / 5
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 480 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 60 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 5 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-1000 kbps / 0.60
x265 [info]: tools: rect amp rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00 tskip
x265 [info]: tools: signhide tmvp b-intra strong-intra-smoothing
x265 [info]: tools: deblock(tC=-1:B=-1) sao stats-read
x265 [info]: frame I: 129, Avg QP:24.88 kb/s: 9257.96
x265 [info]: frame P: 3666, Avg QP:29.50 kb/s: 2845.08
x265 [info]: frame B: 13825, Avg QP:35.23 kb/s: 431.98
x265 [info]: Weighted P-Frames: Y:2.2% UV:1.8%
x265 [info]: Weighted B-Frames: Y:0.9% UV:0.6%
x265 [info]: consecutive B-frames: 10.7% 5.3% 7.8% 27.8% 16.8% 15.1% 4.2% 6.4% 6.0%
encoded 17620 frames in 68761.43s (0.26 fps), 998.67 kb/s, Avg QP:33.96
Encoder log from 2 pass new lambda:
y4m [info]: 1920x804 fps 24/1 i420p16 sar 1:1 unknown frame count
raw [info]: output file: p2n-.hevc
x265 [info]: HEVC encoder version 2.3+40-2c6e6c9c3da7
x265 [info]: build info [Windows][MSVC 1910][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 10 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(13 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 4 inter / 4 intra
x265 [info]: ME / range / subpel / merge : star / 92 / 5 / 5
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 480 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 60 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 5 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-1000 kbps / 0.60
x265 [info]: tools: rect amp rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00 tskip
x265 [info]: tools: signhide tmvp b-intra strong-intra-smoothing
x265 [info]: tools: deblock(tC=-1:B=-1) sao stats-read
x265 [info]: frame I: 129, Avg QP:25.82 kb/s: 9022.46
x265 [info]: frame P: 3666, Avg QP:30.36 kb/s: 2828.23
x265 [info]: frame B: 13825, Avg QP:36.14 kb/s: 438.65
x265 [info]: Weighted P-Frames: Y:2.2% UV:1.8%
x265 [info]: Weighted B-Frames: Y:0.9% UV:0.6%
x265 [info]: consecutive B-frames: 10.7% 5.3% 7.8% 27.8% 16.8% 15.1% 4.2% 6.4% 6.0%
encoded 17620 frames in 76852.48s (0.23 fps), 998.67 kb/s, Avg QP:34.86
I don't see that the new lambda is better. 1000 kb/s is definitely too low for this movie.
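The two logs can be cross-checked with a short Python sketch that parses the per-frame-type summary lines and recomputes the frame-weighted averages. The regular expression and function name are my own; the numbers are copied from the logs above.

```python
import re

# Matches summary lines such as
# "x265 [info]: frame I: 129, Avg QP:24.88 kb/s: 9257.96"
FRAME_LINE = re.compile(
    r"frame\s+([IPB]):\s*(\d+),\s*Avg QP:\s*([\d.]+)\s+kb/s:\s*([\d.]+)"
)


def summarize(log_text):
    """Return (frames, frame-weighted Avg QP, frame-weighted kb/s)."""
    rows = [(int(n), float(qp), float(kbps))
            for _type, n, qp, kbps in FRAME_LINE.findall(log_text)]
    frames = sum(n for n, _, _ in rows)
    avg_qp = sum(n * qp for n, qp, _ in rows) / frames
    avg_kbps = sum(n * kbps for n, _, kbps in rows) / frames
    return frames, avg_qp, avg_kbps


OLD_LAMBDA = """
x265 [info]: frame I: 129, Avg QP:24.88 kb/s: 9257.96
x265 [info]: frame P: 3666, Avg QP:29.50 kb/s: 2845.08
x265 [info]: frame B: 13825, Avg QP:35.23 kb/s: 431.98
"""

NEW_LAMBDA = """
x265 [info]: frame I: 129, Avg QP:25.82 kb/s: 9022.46
x265 [info]: frame P: 3666, Avg QP:30.36 kb/s: 2828.23
x265 [info]: frame B: 13825, Avg QP:36.14 kb/s: 438.65
"""

if __name__ == "__main__":
    for name, log in (("old", OLD_LAMBDA), ("new", NEW_LAMBDA)):
        frames, qp, kbps = summarize(log)
        print(f"{name}: {frames} frames, Avg QP {qp:.2f}, {kbps:.2f} kb/s")
```

Both encodes land on the same overall bitrate, as expected for 2-pass ABR, while the average QP is about 0.9 higher with the new tables. So the comparison is at equal size, and any difference has to be judged visually.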
Natty
27th April 2017, 21:02
I did a comparison of an old 10-bit lambda with a new 10-bit lambda.
I don't see that the new lambda is better. 1000 kb/s is definitely too low for this movie.
Why aren't you using x265 2.4.2 to compare? The x265 versions of your two encodes are different.
LigH
27th April 2017, 21:13
That's the trick. The older x265 version (2.3+28) uses the older tables per default, the newer version (2.3+40) uses the newer tables. It doesn't require the very latest version (2.4+2), just one before the patch and one after.
If you wanted to use the same version with different lambda tables, you would need separate files with these tables to specify. I don't remember seeing such a file published anywhere, yet...
Why aren't you using x265 2.4.2 to compare? The x265 versions of your two encodes are different.
From x265 2.3+40 to 2.4+2 there are no changes in normal encoding (without dhdr10). I started encoding before 2.4+2 came out.
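The build strings in this discussion (2.3+28-08a05ca9fd16, 2.3+40-2c6e6c9c3da7, 2.4+2-5bc5e73760cd) can be ordered with a small Python sketch. It assumes the format is "tag+commits since tag-changeset hash", which is how I read these strings; the helper names are hypothetical, and the ordering is only meaningful for builds from the same branch.

```python
import re

# Assumed layout: <major>.<minor>[+<commits since tag>][-<changeset hash>]
BUILD = re.compile(r"^(\d+)\.(\d+)(?:\+(\d+))?(?:-([0-9a-f]+))?$")


def parse_build(version):
    """Split an x265 build string into (major, minor, distance, changeset)."""
    match = BUILD.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised build string: {version!r}")
    major, minor, distance, changeset = match.groups()
    return int(major), int(minor), int(distance or 0), changeset


def build_order(version):
    """Sort key: tagged release first, then commits since that tag."""
    return parse_build(version)[:3]


if __name__ == "__main__":
    builds = ["2.4+2-5bc5e73760cd", "2.3+28-08a05ca9fd16", "2.3+40-2c6e6c9c3da7"]
    for build in sorted(builds, key=build_order):
        print(build)
```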
Romario
27th April 2017, 23:21
It's about time to begin AVX-512 optimisation throughout the whole x265 code.
What are the plans about it?
Jamaika
28th April 2017, 07:29
According to recent test reports in the x265 developer mailing list, GCC 6.x and Clang both seem to work well with "-std=c++11" (no need for gnu++11 for GCC), so a unified patch can be expected soon, to enable the required compiler mode when DHDR10 support is enabled. :cool:
Hmm, but some programs need e.g. C++14. Short question: for which files can gnu++11 and c++11 be used interchangeably? Only .cpp, or all?
LigH
28th April 2017, 07:39
This patch is related to a source language level (the compiler understanding the source structure at all). It is not related to compiler brands and versions, especially not to Microsoft Visual C++ (as it is for GNU C++ and CLang).
It doesn't matter if different applications are written in different source language levels (like casual Chinese vs. Mandarin dialect), as soon as they are all available in executable binary form. Runtime DLL's are a completely different topic.
NikosD
28th April 2017, 08:06
The speed-up from SSE4.x to AVX2 is about 20%
The speed-up from AVX2 to AVX-512 could be less than 10%
benwaggoner
28th April 2017, 18:24
That's the trick. The older x265 version (2.3+28) uses the older tables per default, the newer version (2.3+40) uses the newer tables. It doesn't require the very latest version (2.4+2), just one before the patch and one after.
If you wanted to use the same version with different lambda tables, you would need separate files with these tables to specify. I don't remember seeing such a file published anywhere, yet...
You can pull the old lambda table out of the diff, and then load the csv file with a command:
For 10/12-bit you can get the old csv from the red lines here:
https://bitbucket.org/multicoreware/x265/commits/94d59c325e975888e4f7b152cc90b4199d9d24c4
Put that into a file, and then call it using:
--lambda-file (http://x265.readthedocs.io/en/default/cli.html#cmdoption-lambda-file)
At least, that should work in theory. I haven't tried it myself.
Note that for apples-to-apples comparison, you'll want to use 2-pass ABR encoding. The new lambda table uses somewhat higher bitrate for a given CRF value, so if you compare with CRF you'll get better quality AND a bigger file, which isn't that informative.
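Pulling the old table "out of the red lines" could be scripted roughly as below. This is an untested-against-x265 sketch: it collects the removed lines of a unified diff and keeps only the decimal literals. The sample diff is invented for illustration (it is not the real commit), and the output layout assumes that the lambda file is plain text with comma-separated values, so check the --lambda-file documentation linked above for the exact format and number of values it expects.

```python
import re

FLOAT = re.compile(r"\d+\.\d+")


def removed_values(diff_text):
    """Collect decimal literals from the removed ('-') lines of a unified diff."""
    values = []
    for line in diff_text.splitlines():
        # Skip the '--- a/file' header; keep real removals only.
        if line.startswith("-") and not line.startswith("---"):
            values.extend(FLOAT.findall(line))
    return values


def to_lambda_file(values, per_line=8):
    """Format the values as comma-separated rows (assumed lambda-file layout)."""
    rows = [", ".join(values[i:i + per_line])
            for i in range(0, len(values), per_line)]
    return "\n".join(rows) + "\n"


# Hypothetical diff with made-up numbers, NOT the real lambda tables.
SAMPLE_DIFF = """\
--- a/example.cpp
+++ b/example.cpp
@@ -1,4 +1,4 @@
 double example_lambda_tab[] = {
-    0.1000, 0.2000, 0.3000,
+    0.1500, 0.2500, 0.3500,
 };
"""

if __name__ == "__main__":
    print(to_lambda_file(removed_values(SAMPLE_DIFF)), end="")
```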
benwaggoner
28th April 2017, 18:28
The speed-up from SSE4.x to AVX2 is about 20%
The speed-up from AVX2 to AVX-512 could be less than 10%
And will probably vary some between different frame sizes, other parameters, and CPU implementations.
AVX and AVX2 got relatively greater performance boosts on more recent Intel processors, since not as much thermal limiting was applied. And IIRC we saw a greater perf delta the slower the preset used.
Sagittaire
28th April 2017, 20:01
I did a comparison of an old 10-bit lambda with a new 10-bit lambda.
I don't see that the new lambda is better. 1000 kb/s is definitely too low for this movie.
x265 seems really powerful at quantizer ~25 with default settings. That's certainly the best quantizer range for making a comparison. 1000 kbps for a 2K source at q35 definitely has too low visual quality.
WhatZit
29th April 2017, 08:27
x265 has hundreds of kernels that are SIMD optimized for various instruction sets, and it's a fairly substantial effort (many, many developer man-months).
I understand that your recent collaboration with HaiGear Labs exploited expensive, but otherwise completely generic, Intel Visual Compute Accelerator 2's accessed via your UHDKit to produce those 4Kp60 real-time streaming results.
Given your hint at "various instruction sets", does this mean that one of these:
http://www.intel.com/content/www/us/en/servers/accelerators/visual-compute-accelerator-2.html
is currently the closest thing we can get to a proper Multicoreware x265 (not anyone else's "HEVC") hardware encoder card?
Of course, I can see at least these two flies swimming in the ointment:
1) The VCA2 hardware support was custom coded by Haivision, not Multicoreware
2) The VCA2 hardware support was coded by Multicoreware, but is only available through the UHDKit API, not the x265 CLI
Additionally, if x265 does include optimised instruction support for the VCA2, does that mean that the VCA (1st one, which is now half the cost) is also supported?
Is there any clarification that you can make on this?
I ask because I need to plan for a new system whose primary purpose is x265 encoding, and one of these in the appropriate system, although very expensive, could be a dream come true.
Especially if you consider that a Ryzen/Kaby Lake might top out at about 40-60% speed increase over my current rig, but a Xeon+VCA2 would get me a 250-500% (?) speed increase! That would be worth it over the lifespan just in time savings/productivity increases alone.
x265_Project
29th April 2017, 23:39
Haivision demonstrated an experimental version of UHDkit that didn't rely on Intel's VCA board. UHDkit was running multiple instances of x265 in a new configuration. This new version is showing promising results for live encoding scenarios, but it is not fully optimized, so it's too soon to share design details or test results.
That doesn't mean we won't look to find ways to accelerate x265 using VCA or similar hardware.
x265_Project
29th April 2017, 23:41
It's about time to begin AVX-512 optimisation throughout the whole x265 code.
What are the plans about it?
We plan to optimize x265 with AVX-512 instructions, as soon as possible.