
View Full Version : x265 HEVC Encoder



LoRd_MuldeR
9th May 2015, 23:07
Not sure how to avoid emoticons shortcuts (: out, with no space)

https://forum.doom9.org/misc.php?do=bbcode#noparse

plonk420
10th May 2015, 12:38
Something bad happening here. Can you please report this issue in our issue tracker (https://bitbucket.org/multicoreware/x265/issues) with exact commandlines and source (or mention a freely available clip) so that we can reproduce this at our end?

Thanks,

it was just the video from Atlantis (the first, 2 Disc R1 release). checks out okay with VirtualDub, and even said settings (--preset slower --tu-intra-depth 1 --tu-inter-depth 1 ... as well as --preset slow), just not --preset slower

LoadPlugin("C:\mm\MeGUI_2028_x86\tools\dgindex\DGDecode.dll")
DGDecode_mpeg2source("E:\Atlantis_US_Col_feat\VIDEO_TS\VTS_01_1 - 0xE0 - Video - MPEG-2 - 720x480 (NTSC) - 16~9 - Letterboxed.d2v", info=3)
LoadPlugin("C:\mm\MeGUI_2028_x86\tools\avisynth_plugin\ColorMatrix.dll")
ColorMatrix(hints=true, threads=0)
crop(8, 56, -8, -64)
trim(7500,7600)

i cut it down to 500 frames since i was tired of waiting for the encode.

i tried MOD16, didn't help.

eventually, i tried 1.6+295 (i don't like straying too far from versions included with software), and it worked.

edit: wonky clip https://mega.co.nz/#!LB4xmBza!sekieIiwjFeUQzBZFRF2KfvEdsfwnFJVDGHCshXBJeE
edit2: not sure if this is SIMPLY --preset slow, or if i tweaked it: https://mega.co.nz/#!DY4E0SwZ!nCqDxPVwowvHc__NncVQzLKNsf4pROX5aPm7Yak__tw

foxyshadis
10th May 2015, 20:12
MeGUI is showing signs of abandonment lately, so updating x265 manually might be the only choice going forward. Hopefully that changes and it gets a 1.7 update, because QG-size is a very significant improvement, but otherwise more since-fixed bug reports can probably be expected.

LigH
11th May 2015, 09:07
A new "merge with stable" to prepare a coming v1.7 era: x265 1.6+412-b642b3d8cc1e (https://www.mediafire.com/download/xpmr0uv091u0mq6/x265_1.6+412-b642b3d8cc1e.7z)

Also support for "content light level" (HDR SEI info).

Kurtnoise
11th May 2015, 13:48
Compilation failed w/ the "DETAILED_CU_STATS" settings enabled...

c:/multicoreware-x265-3700169eb622/source/encoder/slicetype.cpp: In member function 'virtual void x265::PreLookaheadGroup::processTasks(int)':
c:/multicoreware-x265-3700169eb622/source/encoder/slicetype.cpp:790:30: error: 'm_preLookaheadElapsedTime' was not declared in this scope
ProfileLookaheadTime(m_preLookaheadElapsedTime, m_countPreLookahead);
^
c:/multicoreware-x265-3700169eb622/source/encoder/slicetype.cpp:38:71: note: in definition of macro 'ProfileLookaheadTime'
#define ProfileLookaheadTime(elapsed, count) ScopedElapsedTime _scope(elapsed); count++
^
c:/multicoreware-x265-3700169eb622/source/encoder/slicetype.cpp:790:57: error: 'm_countPreLookahead' was not declared in this scope
ProfileLookaheadTime(m_preLookaheadElapsedTime, m_countPreLookahead);
^
c:/multicoreware-x265-3700169eb622/source/encoder/slicetype.cpp:38:81: note: in definition of macro 'ProfileLookaheadTime'
#define ProfileLookaheadTime(elapsed, count) ScopedElapsedTime _scope(elapsed); count++
^
make[2]: *** [encoder/CMakeFiles/encoder.dir/slicetype.cpp.obj] Error 1
make[1]: *** [encoder/CMakeFiles/encoder.dir/all] Error 2
make: *** [all] Error 2

LigH
11th May 2015, 13:56
Best posted to the x265 Developers Mailing List, I'd suggest...

This is a boolean cmake option (-D)?

Ah, yes, I can confirm.

LigH
12th May 2015, 09:04
x265 1.6+417-f2081ef64fd2 (https://www.mediafire.com/download/i6msp4p0svl74j3/x265_1.6+417-f2081ef64fd2.7z) publishes a switch to select the output (and internal) bit depth at run time:

-D/--output-depth 8|10 Output bit depth (also internal bit depth). Default {8|10}

Default depth depends on the compile options.

Now I wonder: If a 64 bit executable should support both 8 and 10 bit depths, shouldn't the building process create "more or less identical" binaries containing both code groups, so that switching between 8 and 10 bit depths at run time is at all possible? But normal and HBD builds still differ (DLLs as well as EXEs). I doubt this works as desired (but not tested yet...).

This package has also DETAILED_CU_STATS enabled.
__

P.S.: The documentation reads like it doesn't have to. If a specific depth is unsupported (because not linked in), it seems to fall back to a supported one. I guess that a dynamic EXE linking both 8 and 10 bit DLLs would be an easier way to make a CLI encoder supporting both at run time?

MeteorRain
12th May 2015, 10:59
Now I wonder: If a 64 bit executable should support both 8 and 10 bit depths, shouldn't the building process create "more or less identical" binaries containing both code groups, so that switching between 8 and 10 bit depths at run time is at all possible? But normal and HBD builds still differ (DLLs as well as EXEs). I doubt this works as desired (but not tested yet...).

P.S.: The documentation reads like it doesn't have to. If a specific depth is unsupported (because not linked in), it seems to fall back to a supported one. I guess that a dynamic EXE linking both 8 and 10 bit DLLs would be an easier way to make a CLI encoder supporting both at run time?

I'll look into it and probably can work out a custom patch to implement this.

EDIT:

Alright, this is what I can do for now.
Since the CLI relies heavily on the core (e.g. x265_malloc() / free() / log() / param.cpp etc.), for now I can only keep the library inside the CLI while supporting dynamic loading of either DLL at run time. Maybe I can take some time and rip the dependent code out of the core.

That is to say, (cli+lib) acts as the CLI, with the 8-bit lib and the 10-bit lib in separate DLL files. Depending on the param, it will load either the 8-bit DLL or the 10-bit DLL, completely ignoring the internal one.

Ma
12th May 2015, 11:20
[...] If a specific depth is unsupported (because not linked in), it seems to fall back to a supported one.

For 8-bit x265.exe put in the same folder 10-bit *.dll with name "libx265_main10.dll".

For 10-bit x265.exe put in the same folder 8-bit *.dll with name "libx265_main.dll".

The names of *.dll are important.

GodRealm
12th May 2015, 12:48
For 8-bit x265.exe put in the same folder 10-bit *.dll with name "libx265_main10.dll".

For 10-bit x265.exe put in the same folder 8-bit *.dll with name "libx265_main.dll".

The names of *.dll are important.

Working perfect! Thank you!

benwaggoner
12th May 2015, 16:37
I'll look into it and probably can work out a custom patch to implement this.

EDIT:

Alright, this is what I can do for now.
Since the CLI relies heavily on the core (e.g. x265_malloc() / free() / log() / param.cpp etc.), for now I can only keep the library inside the CLI while supporting dynamic loading of either DLL at run time. Maybe I can take some time and rip the dependent code out of the core.

That is to say, (cli+lib) acts as the CLI, with the 8-bit lib and the 10-bit lib in separate DLL files. Depending on the param, it will load either the 8-bit DLL or the 10-bit DLL, completely ignoring the internal one.
Are you implementing api_get?

http://x265.readthedocs.org/en/default/api.html#multi-library-interface

That's the recommended way to handle multiple .dll bit depth versions from a single app.

LigH
12th May 2015, 19:34
And ffmpeg is going to implement it as well:

[PATCH] avcodec/libx265: use x265 Multi-library Interface to query the API (http://ffmpeg.org/pipermail/ffmpeg-devel/2015-May/172773.html)

stax76
12th May 2015, 19:52
For 8-bit x265.exe put in the same folder 10-bit *.dll with name "libx265_main10.dll".

For 10-bit x265.exe put in the same folder 8-bit *.dll with name "libx265_main.dll".

The names of *.dll are important.

That makes it a bit easier for me. :goodpost:

MeteorRain
13th May 2015, 00:46
Are you implementing api_get?

http://x265.readthedocs.org/en/default/api.html#multi-library-interface

That's the recommended way to handle multiple .dll bit depth versions from a single app.

Yes, but my goal is to completely separate these two components.

nandaku2
13th May 2015, 09:46
That is to say, (cli+lib) acts as the CLI, with the 8-bit lib and the 10-bit lib in separate DLL files. Depending on the param, it will load either the 8-bit DLL or the 10-bit DLL, completely ignoring the internal one.

So, depending on the param, it will check whether the internal lib (the one the CLI was compiled with) matches the param; if so, it will use it. If not, it will load the DLL with the requested depth.
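A minimal sketch of that selection rule, written in Python for illustration only. The function name pick_library and the "internal" marker are hypothetical and not part of x265; the two DLL file names are the ones mentioned earlier in this thread. Falling back to the built-in library when no matching DLL is present is an assumption, based on the earlier remark that an unsupported depth seems to fall back to a supported one.

```python
from typing import Iterable

# DLL file names as given earlier in the thread; everything else is hypothetical.
DLL_FOR_DEPTH = {8: "libx265_main.dll", 10: "libx265_main10.dll"}


def pick_library(internal_depth: int, requested_depth: int,
                 dlls_present: Iterable[str]) -> str:
    """Return "internal" or the DLL file name to load for requested_depth."""
    if requested_depth == internal_depth:
        # The library compiled into the CLI already matches the request.
        return "internal"
    dll = DLL_FOR_DEPTH.get(requested_depth)
    if dll is not None and dll in set(dlls_present):
        # A library of the requested depth sits next to the EXE under the expected name.
        return dll
    # Assumption: with no matching DLL, fall back to the built-in library.
    return "internal"


if __name__ == "__main__":
    folder = ["x265.exe", "libx265_main10.dll"]
    for depth in (8, 10):
        print(f"-D {depth} -> {pick_library(8, depth, folder)}")
```

With an 8-bit EXE and only libx265_main10.dll in the folder, a request for 8 bits stays on the internal library and a request for 10 bits selects the DLL.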

LigH
13th May 2015, 09:54
The "dynamic build" I imagined would have no internal library, only be a CLI stub, always loading external libraries depending on a default or parameter value. Would that be less preferable?

foxyshadis
13th May 2015, 10:09
I prefer a static build, so I don't have to deal with paths. As long as building a static library is possible, though, linking cli to it is easy enough.

nevcairiel
13th May 2015, 10:11
I agree, a single binary build is a feature that should not be made impossible with all these changes.
Personally, I don't really see the advantage of having .exe + 2x dll over .exe (with included library) + 1x dll

qyot27
13th May 2015, 20:15
From the way it read, it does seem to allow a main static build for one of the bittages, but I haven't tried it yet.

I figure on Linux or OSX this is handled differently, e.g. by LD_LIBRARY_PATH (DYLD_LIBRARY_PATH on OS X). Having an option in the CMake install recipe to handle the DLL naming would be nice, though (if it's not already there).

qyot27
15th May 2015, 04:57
While it works with x265.exe and the -D option, I haven't been able to get the multi-lib to work with FFmpeg. I tried symlinking libx265_main10.dll into the directory with FFmpeg, and even resorted to a real copy of the .dll. No dice: if you pass an >8-bit source to it (or use -pix_fmt yuv4**p10le to fake it), libx265 won't use the 16bpp dll. And FFmpeg was linked against the exact same 8-bit libx265 as the x265.exe that works with the .dll uses. So I'm stumped.

EDIT: with 1.7+16 FFmpeg works, so this resolved itself.

Motenai Yoda
15th May 2015, 23:26
The only option I use is f3kdb(dither_algo=2) despite algo_3 being supposedly better.
Indeed, I don't like f3kdb's defaults because they add too much grain and remove some subtle details/edges; also, I'm not sure how well it works on 16-bit without input_mode=1 and output_mode=1.
Actually I prefer gradfun3(0.1, lsb=true, lsb_in=true).Dither_add_grain16(0.3, 0.3)

btw, are there restrictions on how low --qg-size can be while keeping hw decoding capability?

x265_Project
16th May 2015, 17:20
btw, are there restrictions on how low --qg-size can be while keeping hw decoding capability?
No. Fine-grained adaptive quantization (at the CU level, and not just at the CTU level) is supported by the HEVC specification, and any compliant HEVC decoder must support this.

stax76
18th May 2015, 12:39
Is it already possible to output either 8 or 10 bit by merely choosing --profile? This and avs reader would make it much easier for users and GUI authors, sorry for repeating myself, I just think it's very important.

Kurtnoise
18th May 2015, 13:04
Is it already possible to output either 8 or 10 bit by merely choosing --profile?
-D switch

This and avs reader would make it much easier for users and GUI authors, sorry for repeating myself, I just think it's very important.
look at the previous pages...there is already a patch for that.

LigH
18th May 2015, 13:09
@ Kurtnoise: stax76 wonders if it would be preferable to derive the result of the -D (output depth) switch from a selected output profile (e.g. "--profile main10" implies "-D 10").

stax76
18th May 2015, 13:14
look at the previous pages...there is already a patch for that.

Is there a binary already?

@ Kurtnoise: stax76 wonders if it would be preferable to derive the result of the -D (output depth) switch from a selected output profile (e.g. "--profile main10" implies "-D 10").

Yes, I could live with -D but only using --profile would be a much better solution for everybody I believe.

Kurtnoise
18th May 2015, 16:07
Is there a binary already?
locally, yes..., publicly, I don't think so.

Yes, I could live with -D but only using --profile would be a much better solution for everybody I believe.
well...that might be problematic because some Profiles have a range of bit depths allowed, not 1 or 2 values available.

x265_Project
18th May 2015, 16:36
Yes, I could live with -D but only using --profile would be a much better solution for everybody I believe.
We had to consider all of the various combinations of input-depth, internal-depth and output-depth. What really matters when choosing the right x265 library is the internal depth, which is generally tied to the output depth.

stax76
18th May 2015, 17:23
Thanks, I'll just add --output-depth then, are there already builds supporting this?

Ma
18th May 2015, 18:40
Thanks, I'll just add --output-depth then, are there already builds supporting this?

Yes, you can try my builds (1.6+451 supports --output-depth):
www.msystem.waw.pl/x265/

You can also take the latest LigH build and rename the *.dll.

stax76
18th May 2015, 19:27
Thanks, great that it's ready. I don't know exactly how to support it in StaxRip unfortunately, problem is profiles are not really well documented, not in the x265 documentation and not in the wikipedia HEVC article, or maybe I just don't understand it.

Ma
18th May 2015, 19:49
Thanks, great that it's ready. I don't know exactly how to support it in StaxRip unfortunately, problem is profiles are not really well documented, not in the x265 documentation and not in the wikipedia HEVC article, or maybe I just don't understand it.

About profiles in x265 I found:
http://x265.readthedocs.org/en/default/cli.html#profile-level-tier

About profiles in HEVC I found (page 17):
http://iphome.hhi.de/wiegand/assets/pdfs/2012_12_IEEE-HEVC-Overview.pdf

Can you describe what you want to do (with an example)?

Edit: I'll try to guess. You are using the 10-bit LigH build, so you produce the HEVC 'main10' profile by default. If you want to produce the HEVC 'main' profile, copy 'libx265.dll' from the 'Win64_8bpp' folder in the LigH archive next to your x265.exe and rename it to 'libx265_main.dll'. Then you can add the '-D 8' option to your command line and the output will be 'main' profile. This works only with the latest LigH build.

stax76
18th May 2015, 20:38
Thanks, this pdf is very helpful, it's much clearer now. I need to expose a GUI option for both --profile and --output-depth, problem solved.

Ma
18th May 2015, 20:59
Scenario:
10-bit x265.exe + 8-bit libx265_main.dll
Command line: x265 -P main in.y4m out.hevc
x265 output:
x265 [error]: main profile not supported, compiled for Main10.

My proposal: if there is an 8-bit DLL and the user asks for the 'main' profile, x265 should switch to 8-bit and encode to the 'main' profile instead of aborting with an error message.
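A rough Python sketch of how such a profile-driven choice could look. This is not how x265 behaves; depth_for_profile and the ALLOWED_DEPTHS table are hypothetical names for illustration. The table assumes that 'main' permits only 8-bit and 'main10' permits 8-bit and 10-bit (the two depths x265 builds offer here), which also shows the caveat raised above: a profile that allows a range of depths does not pin down a single one.

```python
# Assumption for illustration: bit depths each profile name permits,
# restricted to the 8-bit and 10-bit libraries discussed in this thread.
ALLOWED_DEPTHS = {
    "main": (8,),
    "main10": (8, 10),
}


def depth_for_profile(profile: str, build_depth: int) -> int:
    """Pick an encode depth for `profile`, preferring the EXE's own depth."""
    allowed = ALLOWED_DEPTHS[profile]
    if build_depth in allowed:
        # The built-in library already satisfies the profile; no switch needed.
        return build_depth
    # Otherwise switch to the deepest depth the profile permits.
    return max(allowed)


if __name__ == "__main__":
    for profile in ("main", "main10"):
        for build_depth in (8, 10):
            chosen = depth_for_profile(profile, build_depth)
            print(f"{build_depth}-bit EXE, -P {profile}: encode at {chosen}-bit")
```

In the scenario above (10-bit EXE, -P main) the sketch picks 8-bit instead of aborting. For -P main10 on an 8-bit EXE it stays at 8-bit, so an explicit -D would still be needed to force 10-bit.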

x265_Project
19th May 2015, 03:04
x265 version 1.7 has been released. This release contains a large amount of assembly code optimizations, some preliminary support for high dynamic range content, improvements for multi-library support, and some new quality features.

Full documentation at: http://x265.readthedocs.org/en/1.7/

This release simplifies the multi-library support introduced in version 1.6. Any libx265 can now forward API requests to other installed libx265 libraries (by name) so applications like ffmpeg and the x265 CLI can select between 8bit and 10bit encodes at runtime without the need of a shim library or library load path hacks. See --output-depth, and http://x265.readthedocs.org/en/1.7/api.html#multi-library-interface

For quality, x265 now allows you to configure the quantization group size smaller than the CTU size (for finer grained AQ adjustments). See --qg-size.

x265 now supports limited mid-encode reconfigure via a new public method: x265_encoder_reconfig()

For HDR, x265 now supports signaling the SMPTE 2084 color transfer function, the SMPTE 2086 mastering display color primaries, and the content light levels. See --master-display, --max-cll

x265 will no longer emit any non-conformant bitstreams unless --allow-non-conformance is specified.

The x265 CLI now supports a simple encode preview feature. See --recon-y4m-exec.

The AnnexB NAL headers can now be configured off, via x265_param.bAnnexB. This is not configurable via the CLI because it is a function of the muxer being used, and the CLI only supports raw output files. See --annexb

Misc:
* --lossless encodes are now signaled as level 8.5
* --profile now has a -P short option
* The regression scripts used by x265 are now public, and can be found at: https://bitbucket.org/sborho/test-harness
* x265's cmake scripts now support PGO builds; the test-harness can be used to drive the profile-guided build process.

LigH
19th May 2015, 09:58
On the occasion of the new milestone:

x265 1.7+2-d7b100e51e82 (https://www.mediafire.com/download/fwdd59oxx2rq5m5/x265_1.7+2-d7b100e51e82.7z)

Please note: Contains different file names, starting with v1.7, to be more easily compliant with the new Multi-library Interface (http://x265.readthedocs.org/en/1.7/api.html#multi-library-interface).

Barough
19th May 2015, 10:24
Thnx for the new compile LigH :)

LigH
19th May 2015, 16:16
P.S.: Additional MSYS2-64 GCC 4.9.2 build (http://www.mediafire.com/download/77fynsk4i536v9v/x265_1.7+2-d7b100e51e82.GCC492.7z), each one pair only (8-bit EXE + 10-bit DLL), as built with media-autobuild_suite by jb_alvarado

Ma
20th May 2015, 11:06
There is something wrong with the new GCC compilers and x265. At first I thought the new GCC compilers were weak, but I ran a new speed test, and now I don't know what's going on.

Win7 64-bit, i5 3450S, test video https://media.xiph.org/video/derf/y4m/720p50_parkrun_ter.y4m

Options: "--preset slow -D 10 --crf 20 --rdoq-level 1 --psy-rd 0.4 --deblock -1 --keyint 288 --colormatrix bt709 -f 120"

Warriors:
All builds are for generic CPU with default -O3 optimize option.
x265-492-no-asm -- GCC 4.9.2, assembly OFF
x265-600-no-asm -- GCC 6.0.0, assembly OFF
x265-492 -- GCC 4.9.2, assembly ON
x265-600 -- GCC 6.0.0, assembly ON

Result (time relative to the GCC 4.9.2 build in each group):
build | time | speed | relative time
x265-492-no-asm | 112.91s | 1.06 fps | 100.0%
x265-600-no-asm | 106.26s | 1.13 fps | 94.1%
-------------------------------------------------------
x265-492 | 33.60s | 3.57 fps | 100.0%
x265-600 | 35.49s | 3.38 fps | 105.6%

Without asm, GCC 6.0.0 beats GCC 4.9.2 (about 6% less encoding time); with asm it is the opposite (about 6% more). The results are similar for GCC 5.1 (though it is not as fast without asm). This is quite strange to me...

Full result and builds -- http://msystem.waw.pl/x265/test6.7z
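For anyone who wants to redo the arithmetic, here is a small Python sketch that derives the fps and relative-time columns from the measured times. The times and the frame count (-f 120) are taken from this post; the function names and the choice of the first build in each group as the baseline are assumptions for illustration.

```python
FRAMES = 120  # every run used "-f 120"

# Encoding times in seconds, copied from the result table in this post.
TIMES = {
    "no-asm": {"x265-492-no-asm": 112.91, "x265-600-no-asm": 106.26},
    "asm": {"x265-492": 33.60, "x265-600": 35.49},
}


def fps(frames: int, seconds: float) -> float:
    """Average encoding speed in frames per second."""
    return frames / seconds


def relative_time(seconds: float, baseline_seconds: float) -> float:
    """Encoding time as a percentage of the baseline (lower is faster)."""
    return 100.0 * seconds / baseline_seconds


if __name__ == "__main__":
    for builds in TIMES.values():
        # Assumption: the first build listed in each group is the baseline.
        baseline = next(iter(builds.values()))
        for name, seconds in builds.items():
            print(f"{name:16} {seconds:7.2f}s {fps(FRAMES, seconds):5.2f} fps "
                  f"{relative_time(seconds, baseline):6.1f}%")
```

Because the percentage is a ratio of times, a value below 100% means the build finished sooner than the baseline.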

Kurtnoise
20th May 2015, 12:26
x265 1.7 (http://www.mediafire.com/download/9n1uyn4cm378ccn/x265_1.7.7z) w/ YUV, Y4M, AVS, AVI, MKV, MOV, MP4, FLV files support as input and MP4, MKV as output...for testing of course.

stax76
20th May 2015, 13:23
x265 1.7 (http://www.mediafire.com/download/9n1uyn4cm378ccn/x265_1.7.7z) w/ YUV, Y4M, AVS, AVI, MKV, MOV, MP4, FLV files support as input and MP4, MKV as output...for testing of course.

Thanks, I'm gonna test it.

shinchiro
20th May 2015, 13:58
I tested chromashift's MSVC build against a GCC build before and noticed that MSVC was slightly faster than GCC. So between the GCC, ICL and MSVC compilers, which one gives more speed on Windows for anyone?

Ma
20th May 2015, 15:03
So between the GCC, ICL and MSVC compilers, which one gives more speed on Windows for anyone?

My observations:
MSVC is good for 10-bit encoding, worse for 8-bit,
ICL is slow and gives strange results (output file differs from MSVC/GCC output),
GCC is OK for the slow/slower/veryslow presets when you make native builds with the -O2 optimize option (the L1 cache in my i5 3450S CPU is very small -- only 32 KB),
GCC is OK for the faster presets with the -O3 optimize option, native build + PGO.

shinchiro
20th May 2015, 15:42
Thanks. Looks like I'll stick with the MSVC build since I regularly encode in 10-bit :)
btw, just curious, how do you benchmark the speed of differently compiled builds?
Is there any huge gain when specifying optimize options like -march=core-avx2, -O3 when compiling?

Ma
20th May 2015, 16:36
btw, just curious, how do you benchmark the speed of differently compiled builds?

I take part of the movie which I want to encode, for example:
ffmpeg.exe -ss 7693 -i movie.mkv -filter:v "crop=1920:800:0:140" -an -sn -frames 750 -pix_fmt yuv420p -f yuv4mpegpipe 1920x800-hob.y4m

Then I rename the different builds to unique names and use my test.bat file (you can find one in test6.7z a couple of posts above). First I try with only 20 frames (-f 20 with the options) and estimate how many frames I need for about 100 s for a fast comparison, 300 s for a good comparison and 500 s for a very good comparison. In the evening I close all apps, unplug the network cable, start test.bat and go to sleep. In the morning I can read the result file -- I attached the last result file to this message as an example.
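The frame-count estimate can be sketched in Python as a simple linear extrapolation from the short trial run. The function name frames_for_target is hypothetical, the 5.0 s trial time is a made-up number, and the assumption that time scales linearly with frame count ignores start-up and lookahead overhead, so treat the result as a rough guide only.

```python
import math


def frames_for_target(trial_frames: int, trial_seconds: float,
                      target_seconds: float) -> int:
    """Estimate the -f value needed for a run of about target_seconds.

    Assumes encoding time grows linearly with the number of frames.
    """
    if trial_frames <= 0 or trial_seconds <= 0:
        raise ValueError("trial run must have positive frames and time")
    seconds_per_frame = trial_seconds / trial_frames
    return math.ceil(target_seconds / seconds_per_frame)


if __name__ == "__main__":
    trial_frames = 20      # the "-f 20" trial run described above
    trial_seconds = 5.0    # hypothetical measurement, not from this thread
    for label, target in (("fast", 100), ("good", 300), ("very good", 500)):
        frames = frames_for_target(trial_frames, trial_seconds, target)
        print(f"{label} comparison (~{target} s): -f {frames}")
```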

Is there any huge gain when specifying optimize options like -march=core-avx2, -O3 when compiling?

I only have an AVX CPU (AVX2 builds hang on my system). If we say that my native build's encoding time is 100.0%, then the generic build's encoding time is 101.5% (from the last test).

MeteorRain
21st May 2015, 06:23
x265 1.7 (http://www.mediafire.com/download/9n1uyn4cm378ccn/x265_1.7.7z) w/ YUV, Y4M, AVS, AVI, MKV, MOV, MP4, FLV files support as input and MP4, MKV as output...for testing of course.

Please take extra care with timestamps, e.g., vfr / timebase / fps. I'm not sure they are 100% working, so be careful.

x265_Project
21st May 2015, 07:40
The current default build has a bug that is causing it to be quite a bit slower than previous builds (like the 1.7 stable build). Ironically, this bug was caused by a new analysis method that is designed to make x265 faster. Our team will get this ironed out soon. In the meantime, you're better off sticking with earlier builds.

dipje
21st May 2015, 08:59
Just to be clear, the slowness bug is in the 1.7+2 LigH build, for example?

LigH
21st May 2015, 09:08
Probably yes, this is (related to the source versioning, not the compile time) the very first "v1.7 stable" build (the +2 patches are marginal, after declaring the v1.7 milestone).

Ma
21st May 2015, 12:02
According to last night's results (attached), there is a slowdown from version 1.7+2 to 1.7+37:
build | 1.7+2 | 1.7+37 | slowdown
10b-AVX | 330.52s | 333.72s | +1.0%
10b-GEN | 335.37s | 337.29s | +0.6%

The code is slower and bigger. In 1.7+2, the code generated for the AVX build (with the -O2 optimize option) was small enough for the 32 KB L1 cache, while the GEN (generic CPU) build was already too big; in 1.7+37 the code is too big for the L1 cache in both the AVX and GEN builds.