16 bit Calculation Precision for X265 [Archive]

Augur89

2nd May 2021, 18:07

Can we please get 16 bit precision back in the future releases?
By using 16 bit instead of 8 bit precision, the video bitrate can be reduced significantly → way less filesize for the same quality.

This was removed in 2015 for a reason, i never understood.
If i'm not wrong, main reason for usage of x265 is saving bandwidth (video bitrate) and by removing the 16 bit calculation precision the devs drastically reduced the amount of bandwidth, that can be saved by using x265. Why?

Are the X265 Devs reading this forum or is there a way to get in contact with them?

Thanks in advance

rwill

2nd May 2021, 18:45

But 16 bit is twice the bandwidth of 8 bit, I don't understand what you want.

PS: Don't feed the troll.

Augur89

2nd May 2021, 21:34

with 8 bit calculation precision you need a significantly higher video bitrate to make the compressed video look as proper as with 16 bit calculation precision (which can be seen best in dark scenes with smog or dust in the air).

why was the 16 bit calculation precision removed?
i'm still using the old 2015 version since it was way more efficient and would love to see 16 bit calculation precision reimplemented in the new versions

huhn

3rd May 2021, 09:40

it's still supported and even updated: just compile x265 with HIGH_BIT_DEPTH and follow the rules it "needs", like a 64 bit build.

FranceBB

3rd May 2021, 13:10

x265 is NOT x264.
Back in the days, when we used to encode files with x264 we noticed that 10bit files were actually smaller than 8bit ones 'cause the additional precision in vector calculations for motion compensation and other things improved compression.
This is no longer the case with x265, 'cause no matter in which bit depth you're gonna end up with, calculations will always be performed in high bit depth.
In other words, it doesn't matter which bit depth you're targeting, calculations will always be done with the highest possible precision, so if this is what kept you up at night, don't worry, you can have sweet dreams. ;)

rwill

3rd May 2021, 15:24

For instance, let's suppose you feed x265 with 12bit planar and you wanna go down to 8bit planar.
It doesn't matter if you're going down 8bit, all the calculations will be performed with the highest bit depth and only the final result will be 8bit (dithered down).
In other words, it doesn't matter which bit depth you're targeting, calculations will always be done with the highest possible precision, so if this is what kept you up at night, don't worry, you can have sweet dreams. ;)

That is not correct.

In your example the encoder input would be converted from 12 bit to 8 bit and then processed. Pel buffers will be 8 bit.

HEVC just has higher intermediate precision and does not take the shortcuts H.264 took when doing prediction and transform.

Augur89

3rd May 2021, 17:26

well that would be nice, but it's not what my experiences show.
i'm not talking about bit depth. i only talk about calculation precision and i think, the highest possible precision at the current releases is 8 bit (instead of the 16 bit, we had in the 2015er versions)

however, i had a lot of videos that looked way worse with the latest version and the exact same encoding options than with the old version (just with 16 bit calculation precision). in other words, i didn't find an encoding setting in the new versions that even got close to the results the older versions provided.

with very short test samples, it's the other way round.
so it would just be nice to get the option to choose back

huhn

3rd May 2021, 17:34

again, you can still compile it with 16bpp. it's not gone and it's still updated. i have no clue when and how it is compiled by default, but that doesn't really matter to the answer: it is still an option.

Augur89

3rd May 2021, 17:48

so you think, if i select high bit depth, it will automatically be processed with 16 bit calculation precision?

no matter where i changed between 8 bit depth or 10/12/16, the results always weren't the same like 16 bit calculation precision

→ where / in which software do you select high bit depth, so that it increases calculation precision to 16?

Selur

3rd May 2021, 17:52

@huhn: can you share a binary that still supports 16bit?
Looking at:
https://bitbucket.org/multicoreware/x265_git/src/master/source/encoder/CMakeLists.txt
https://bitbucket.org/multicoreware/x265_git/src/master/build/linux/multilib.sh
https://bitbucket.org/multicoreware/x265_git/src/master/build/vc15-x86_64/multilib.bat
none of them seems to support 16bit,...
Documentation (https://x265.readthedocs.io/en/master/cli.html?highlight=bitdepth#cmdoption-output-depth) also does only mention 8/10/12 bit,...

Cu Selur

rwill

3rd May 2021, 18:16

@Augur89

Do you mean the Range Extensions (RExt ?)

Like RExt__HIGH_BIT_DEPTH_SUPPORT, FULL_NBIT and RExt__HIGH_PRECISION_FORWARD_TRANSFORM ?

huhn

3rd May 2021, 18:23

i never talked about 16 bit output, just 16bpp. the output is still just 8-12 bit.

from the CMakeLists.txt:
if(X64)
# NOTE: We only officially support high-bit-depth compiles of x265
# on 64bit architectures. Main10 plus large resolution plus slow
# preset plus 32bit address space usually means malloc failure. You
# can disable this if(X64) check if you desparately need a 32bit
# build with 10bit/12bit support, but this violates the "shrink wrap
# license" so to speak. If it breaks you get to keep both halves.
# You will need to disable assembly manually.
option(HIGH_BIT_DEPTH "Store pixel samples as 16bit values (Main10/Main12)" OFF)
endif(X64)
source: https://github.com/videolan/x265/tree/master
clearly not dead: https://github.com/videolan/x265/commit/9462f1a9b488b1101a89ed58bec28884293fdabc

and no i don't have binary and no setup where i could even try to compile one.

this is also now leaving my expertise on the matter.

Augur89

3rd May 2021, 18:36

@Augur89

Do you mean the Range Extensions (RExt ?)

Like RExt__HIGH_BIT_DEPTH_SUPPORT, FULL_NBIT and RExt__HIGH_PRECISION_FORWARD_TRANSFORM ?

no, i'm not talking about bit depth.
i'm talking about calculation precision.

PS: i'm currently rendering two 45 min videos (one time with 8 bit calculation precision and another time with 16 bit calculation) to show you the difference

rwill

3rd May 2021, 19:09

PS: i'm currently rendering two 45 min videos (one time with 8 bit calculation precision and another time with 16 bit calculation) to show you the difference

A 3:32 min video would be enough.

Augur89

3rd May 2021, 19:25

A 3:32 min video would be enough.

no, that would be way too short. for a proper test with 2 pass encoding, you need at least 30-50 min to really see a difference.
if they only tested at 3 min video, i wouldn't wonder, why they skipped 16 bit calculation precision :)

as i wrote before, at very short samples, 8 bit calculation precision even delivers better results sometimes

huhn

3rd May 2021, 19:32

not how encoding works either you have a difference between a GOP or you don't have a difference.
16 bpp isn't skipped at all but i guess bug fixes and the fact it can be compiled isn't enough?

Augur89

3rd May 2021, 19:43

not how encoding works either you have a difference between a GOP or you don't have a difference.
16 bpp isn't skipped at all but i guess bug fixes and the fact it can be compiled isn't enough?

1) i just can say, the quality loss speaks for itself. i can't explain to you how exactly calculation precision works as i didn't find an explanation anywhere. i just can tell you, that it makes a big difference for testing if you only render 3 min or the whole movie

2) as long as i can't select 16 bit calculation precision and reach the same quality / filesize, i assume, it's not available.
if you have a current version of x265, that included 16 bit calculation precision, please link it. and please keep in mind, that i am not talking about bit depth and bpp.

huhn

3rd May 2021, 21:14

it's a compile option, not a setting. it never was a setting. there were just executables with 8bpp and 16bpp; if you use a tool that has this option, it is just switching between executables.

if i read this correctly, it is always used for main10 and main12 by default; only main8 is using 8 bit.

benwaggoner

3rd May 2021, 21:28

Where are you talking about 8-bit precision? In the frequency transform? Internally, all the operations run at >>8-bit precision in the encoder and the decoder. IIRC, 16-bit for Main and 32-bit for Main10 and Main12.

Any encoder actually doing 8-bit iDCT would presumably yield poor output in all kinds of typical usage. You need the internal operations to happen at higher precision than input/output to avoid cumulative rounding errors and spatial/frequency domain bugs.
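
A small sketch of the cumulative rounding point, assuming a plain two-stage bilinear half-pel average. This is not the actual HEVC interpolation filter, and the function names are made up for illustration. One version rounds back to integer pixel precision after the first stage; the other keeps the wider intermediate sums and rounds once at the end:

```python
def two_stage_rounded(a, b, c, d):
    """Average four pixels, rounding to pixel precision after each stage."""
    top = (a + b + 1) >> 1          # horizontal stage, rounded back to an integer pixel
    bottom = (c + d + 1) >> 1
    return (top + bottom + 1) >> 1  # vertical stage, rounded a second time


def wide_intermediate(a, b, c, d):
    """Average four pixels, keeping unrounded sums until the final step."""
    top = a + b                     # intermediate kept at higher precision
    bottom = c + d
    return (top + bottom + 2) >> 2  # single rounding at the very end


if __name__ == "__main__":
    pixels = (1, 2, 2, 3)           # the exact average is 2.0
    print(two_stage_rounded(*pixels))  # 3
    print(wide_intermediate(*pixels))  # 2
```

The double rounding drifts one step away from the exact average, while the wide intermediate lands on it. In a real codec such errors would repeat across prediction and transform stages.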

Maybe MPEG-2 had a mode to do internal 8-bit precision, but even then 10-bit was available even with 8-bit encoding. I'm not sure if that's apples-to-apples for what I'm talking about.

huhn

3rd May 2021, 22:23

you can clearly compile it with 8 bit internal precision.

in the past these builds were called x265 8 bit 8bpp.

what a time: https://forum.doom9.org/showthread.php?p=1684406#post1684406

only 2 ways to fix this "issue":
1. find a comparison between two modern 8bpp and 16bpp x265 builds
2. compile them and do the comparison
3. move on they are not idiots that create x265

the rest is talking around.

benwaggoner

4th May 2021, 00:39

you can clearly compile it with 8 bit internal precision.

in the past these builds were called x265 8 bit 8bpp.

what a time: https://forum.doom9.org/showthread.php?p=1684406#post1684406

only 2 ways to fix this "issue":
1. find a comparison between two modern 8bpp and 16bpp x265 builds
2. compile them and do the comparison
3. move on they are not idiots that create x265

the rest is talking around.
Can you specify what internal value you are referring to when you say "8-bit internal?" Frequency or spatial domain?

It sounds like you're talking about being able to encode 8-bit source as 10-bit Main10 output or something? The way encoders are structured is they start with input frames of the same depth and subsampling that they will be encoding. Although you can use a 10-bit input with Main x265 encoding if you want. That's what the --dither parameter is for.

huhn

4th May 2021, 02:29

i personally want nothing. it's just that i know there is an option to compile different versions of x265 with an internal bit depth of 8 or 16; it is fully supported and was a common thing back in the day, called 8bpp or 16bpp.

this has nothing to do with the encode setting (8 bit encoding, main10, main12). that doesn't matter to the compile option. it could be limited to 10/12, which it is by default, but it's also an option for 8 bit.

if it is this unknown, maybe it's a good idea to compile one so some users can play around with it.

and just to be absolutely clear: this is a compile option, not an encode option.

here is the cmake again:
if(X64)
# NOTE: We only officially support high-bit-depth compiles of x265
# on 64bit architectures. Main10 plus large resolution plus slow
# preset plus 32bit address space usually means malloc failure. You
# can disable this if(X64) check if you desparately need a 32bit
# build with 10bit/12bit support, but this violates the "shrink wrap
# license" so to speak. If it breaks you get to keep both halves.
# You will need to disable assembly manually.
option(HIGH_BIT_DEPTH "Store pixel samples as 16bit values (Main10/Main12)" OFF)
endif(X64)
if(HIGH_BIT_DEPTH)
option(MAIN12 "Support Main12 instead of Main10" OFF)
if(MAIN12)
add_definitions(-DHIGH_BIT_DEPTH=1 -DX265_DEPTH=12)
else()
add_definitions(-DHIGH_BIT_DEPTH=1 -DX265_DEPTH=10)
endif()
else(HIGH_BIT_DEPTH)
add_definitions(-DHIGH_BIT_DEPTH=0 -DX265_DEPTH=8)
endif(HIGH_BIT_DEPTH)
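
Read as plain logic, the CMake fragment above yields exactly three combinations of defines. A minimal Python mirror of it follows; the function name `select_defines` is hypothetical, and only the option and define names come from the snippet:

```python
def select_defines(high_bit_depth: bool, main12: bool = False) -> dict:
    """Mirror the if(HIGH_BIT_DEPTH)/if(MAIN12) branches of the quoted fragment."""
    if high_bit_depth:
        # MAIN12 is only consulted when HIGH_BIT_DEPTH is on.
        depth = 12 if main12 else 10
        return {"HIGH_BIT_DEPTH": 1, "X265_DEPTH": depth}
    # Without HIGH_BIT_DEPTH the fragment always selects an 8-bit build.
    return {"HIGH_BIT_DEPTH": 0, "X265_DEPTH": 8}


if __name__ == "__main__":
    for hbd, m12 in [(False, False), (True, False), (True, True)]:
        print(hbd, m12, select_defines(hbd, m12))
```

Note that no branch of this fragment combines HIGH_BIT_DEPTH=1 with X265_DEPTH=8, so an "8 bit 16bpp" build is not something these options select on their own.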

benwaggoner

4th May 2021, 05:56

i personally want nothing. it's just that i know there is an option to compile different versions of x265 with an internal bit depth of 8 or 16; it is fully supported and was a common thing back in the day, called 8bpp or 16bpp.

this has nothing to do with the encode setting (8 bit encoding, main10, main12). that doesn't matter to the compile option. it could be limited to 10/12, which it is by default, but it's also an option for 8 bit.

if it is this unknown, maybe it's a good idea to compile one so some users can play around with it.

and just to be absolutely clear: this is a compile option, not an encode option.

here is the cmake again:
if(X64)
# NOTE: We only officially support high-bit-depth compiles of x265
# on 64bit architectures. Main10 plus large resolution plus slow
# preset plus 32bit address space usually means malloc failure. You
# can disable this if(X64) check if you desparately need a 32bit
# build with 10bit/12bit support, but this violates the "shrink wrap
# license" so to speak. If it breaks you get to keep both halves.
# You will need to disable assembly manually.
option(HIGH_BIT_DEPTH "Store pixel samples as 16bit values (Main10/Main12)" OFF)
endif(X64)
if(HIGH_BIT_DEPTH)
option(MAIN12 "Support Main12 instead of Main10" OFF)
if(MAIN12)
add_definitions(-DHIGH_BIT_DEPTH=1 -DX265_DEPTH=12)
else()
add_definitions(-DHIGH_BIT_DEPTH=1 -DX265_DEPTH=10)
endif()
else(HIGH_BIT_DEPTH)
add_definitions(-DHIGH_BIT_DEPTH=0 -DX265_DEPTH=8)
endif(HIGH_BIT_DEPTH)
Ah. High Bit Depth, aka support for 10-bit and 12-bit, is only available for x86-64 because the internal calculation precision of Main10 and Main 12 REQUIRE double the floating point precision to operate, which can double the

A High Bit depth Build of x265 allows for Main10 and Main12 encoding, but encodes good old 8-bit Main the same way as a build without it.

Either I don't understand the problem you're having, or you don't understand the problem you aren't actually having.

huhn

4th May 2021, 09:19

there is no issue (and i'm pretty sure it's int16 doesn't matter).

there was a request for a 16bpp build like in the old days which can be compiled.
and if you go out of your way even for main8 so a x265 8 bit 16bpp.

benwaggoner

4th May 2021, 18:14

there is no issue (and i'm pretty sure it's int16 doesn't matter).

there was a request for a 16bpp build like in the old days which can be compiled.
and if you go out of your way even for main8 so a x265 8 bit 16bpp.

Do you mean input support for >8-bit sources to use x265's dithering instead of an external tool?

huhn

4th May 2021, 19:33

it just changes how the encoder works internally the user will not notice any difference on how it is used. the name 16bpp is not my idea.
x265 has code for both stored in the source you just take a different code not for the input part and not for the output part. that's as simple as i can make it.

as HolyWu just said the executable doesn't work when it is forced to compile with 8 bit 16bpp unlike in the past so time to use main10/main12 instead and move on.

benwaggoner

5th May 2021, 00:57

I guess I am lost on what a High Bit Depth exe would do for 8-bit that a non High Bit Depth exe can't.

huhn

5th May 2021, 04:43

it's just more efficient (and should be slower), but i have not seen tests in many years.
it uses intrapred16 (just like main10/12) instead of intrapred8, and there are many more examples like this. there is extra code in x265 for 8 bit x265 8bpp.
it's just a better 8 bit x265. the simple solution is just using 10 bit x265 and moving on. i can't even come up with a good reason to use 8 bit over 10 bit x265 in general, so i can't make a point for the 16bpp.

maybe this description helps; it's not 100 % correct, but whatever.
an 8 bit x265 16bpp version works internally just like a 10 bit x265 version while being 100 % spec conformant to 8 bit H.265.

the x265 exe in megui around 2015 is supposed to be 16bpp if you really want to have a go with it.

benwaggoner

6th May 2021, 01:49

I checked with one of the primary x265 developers who is now a colleague of mine. Her response:
Internal precision in most modules has always been 16-bit, in fact, it wouldn’t even work without it. The MC modules in intrapred and interpred probably use 8b/16b, just because you can’t pack 2 pixels into a 16-bit register. But there would be no increase in precision/anything by using intrapred16 for 8-bit inputs.

Every module from residual block onwards uses 16-bit.
Which also matches my understanding and experience.

Boulder

6th May 2021, 05:09

Good information, thank you. That has also been my reasoning for feeding the processed 16-bit data into x265 instead of dithering down to 10 bits beforehand.

Asmodian

6th May 2021, 05:45

This is an odd thread, very interesting insights into how x265 works in response to a bizarre premise. :)

i'm still using the old 2015 version since it was way more efficient and would love to see 16 bit calculation precision reimplemented in the new versions

I still cannot wrap my head around this statement.

Are you really claiming a build of x265 from 2015 offers better quality per bit than a current build? That does not match my experience at all. There have been very significant improvements in x265 since 2015. x265 wasn't that great yet in 2015, was it? :confused:

huhn

6th May 2021, 09:52

ok finally found it:
https://forum.doom9.org/showthread.php?p=1684205#post1684205

it clearly works without 16 bit precision in some encoding parts and you can clearly compile without it.
why x265 8 bit 16 bpp is not compilable anymore is nothing i can speak to, but just use x265 10 bit and get the same benefits.

why does 16 bpp for 8 bit matter? if i remember correctly, it's the same reason why x264 10 bit was so massively better than 8 bit.

whatever the case, 16 and 8 bpp were built for a reason.

rwill

6th May 2021, 13:01

ITT: confused people

Augur89

6th May 2021, 20:01

This is an odd thread, very interesting insights into how x265 works in response to a bizarre premise. :)

I still cannot wrap my head around this statement.

Are you really claiming a build of x265 from 2015 offers better quality per bit than a current build? That does not match my experience at all. There have been very significant improvements in x265 since 2015. x265 wasn't that great yet in 2015, was it? :confused:

unfortunately you understood me correctly.
regarding compression efficiency, they made a big step back and never reached that point since 2015.

i really tried every setting combination i could imagine, but didn't find a way to reach the same compression with the newer versions.
i would suggest, you test it yourself to get an impression of what i'm talking about.

as promised, here's a very short sample to show typical artefacts: https://drive.google.com/drive/folders/1SDJgS_tDIhttBrR19dKu-OimV9H7C9yN?usp=sharing

these are 2 clips of the same video, encoded once with 8 bit calculation and once with 16 bit calculation. i used 2-pass encoding at 700 kbit/s so that you can see the difference pretty well. as you can see in the media info, the encoder allocated less bitrate in the 8 bit calculation clip than in the 16 bit one. i got the impression that 16 bit calculation is simply better at recognizing which scenes need more bitrate and where bits can be saved. with 16 bit calculation all scenes look proper at 700 kbit/s; with 8 bit calculation there are 4 scenes that look similar to the one i uploaded

again: for proper testing, i would suggest, you do your own tests

MeteorRain

6th May 2021, 23:16

Am I reading this correctly that you are comparing a 2-pass 700k vs an ABR 700k?

Augur89

7th May 2021, 12:45

no, it was 2-pass abr for both.
as i said, that's just to show you the different look.
you have to do your own tests, if you really want proper comparisons.

i tested every possible setting i could find in every tool i could find to reach the same compression. if you can find a way to get the same or even better results than in the 2015 versions with 16 bit calculation precision, please let me know how exactly

Augur89

7th May 2021, 13:34

As MeteorRain said, that's a flawed test for comparing single-pass rate control vs. multi-pass rate control. Seeing that you are encoding in main10 profile, which means HIGH_BIT_DEPTH (Store pixel samples as 16bit values) must be enabled while building x265, I wonder how did you conclude that 8 bit calculation precision was used instead of 16 bit calculation precision.

i can't find a way to choose 16 bit calculation precision in the versions of x265 after 2015, so i assume that it's not included.

if you check the binaries, it's not listed as well:
https://bitbucket.org/multicoreware/x265_git/src/master/source/encoder/CMakeLists.txt
https://bitbucket.org/multicoreware/x265_git/src/master/build/linux/multilib.sh
https://bitbucket.org/multicoreware/x265_git/src/master/build/vc15-x86_64/multilib.bat

Same for the Documentation (https://x265.readthedocs.io/en/master/cli.html?highlight=bitdepth#cmdoption-output-depth)

again: i don't really mind what it's called or if there are other improvements. i would just like the same quality/filesize ratio as i had in the 2015 x265 version with 16 bit calculation precision. in the versions after 2015 i always needed to use more bitrate to get the same quality (and as i said before, i really tried every setting and every tool i could find).

▬▬▬▬My encoding setting from the 2015 x265 version▬▬▬▬
Encoding Mode: specific filesize/bitrate (2-pass) - fast 1st pass
target bitrate: 600 to 1500 kbit/s for 720p (depending on material)
Level/profile/tier: unrestricted/Main10/High
Calculation precision: 16 bit
In-/Output Bit-Depth: 10 Bit
Color Space: i420
Coding QT: max CU size, min CU size : 64 / 8
Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
ME / range / subpel / merge: dia / 57 / 2 / 1
Keyframe min / max / scenecut: 25 / 250 / 40
Lookahead / bframes / badapt: 20 / 4 / 2
b-pyramid / weightp / weightb / refs: 1 / 1 / 0 / 1
AQ: mode / str / qg-size / cu-tree: 1 / 1.0 / 64 / 1

so for further discussion, i would suggest you do your own tests and let me know if you experience the same or if you can find something, i wasn't able to find.
If you could tell me an encoding setting for the latest x265 releases, that provides a better or at least not worse quality/filesize quotient, it would be awesome.

benwaggoner

7th May 2021, 18:42

Good information, thank you. That has also been my reasoning for feeding the processed 16-bit data into x265 instead of dithering down to 10 bits beforehand.
That definitely won't make a difference unless the dithering algorithm you're using is superior to x265's default one. Which is very basic. The --dither version is somewhat better, but not cutting edge or anything.

The actual encoder will always convert to the final color space before starting quantization or anything else.
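
To illustrate what dithering buys over plain rounding when going from 10-bit to 8-bit, here is a minimal sketch using simple error diffusion along one row. This is not x265's implementation, and the function names are hypothetical:

```python
def round_row(samples_10bit):
    """Convert 10-bit samples to 8-bit by rounding each one independently."""
    return [min(255, (v + 2) >> 2) for v in samples_10bit]


def dither_row(samples_10bit):
    """Convert 10-bit samples to 8-bit, carrying the rounding error forward."""
    out = []
    err = 0                                   # error in 10-bit units
    for v in samples_10bit:
        target = v + err
        q = max(0, min(255, (target + 2) >> 2))
        err = target - (q << 2)               # what this sample got wrong
        out.append(q)
    return out


if __name__ == "__main__":
    flat = [514] * 8                          # 514 / 4 = 128.5 in 8-bit terms
    print(round_row(flat))                    # [129, 129, 129, 129, 129, 129, 129, 129]
    print(dither_row(flat))                   # [129, 128, 129, 128, 129, 128, 129, 128]
```

Plain rounding shifts the whole flat area to 129, while the dithered row averages out to 128.5 and so preserves the original level on average.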

benwaggoner

7th May 2021, 18:48

ok finally found it:
https://forum.doom9.org/showthread.php?p=1684205#post1684205

it clearly works without 16 bit precision in some encoding parts and you can clearly compile without it.
why x265 8 bit 16 bpp is not compilable anymore is nothing i can speak to, but just use x265 10 bit and get the same benefits.

why does 16 bpp for 8 bit matter? if i remember correctly, it's the same reason why x264 10 bit was so massively better than 8 bit.

whatever the case, 16 and 8 bpp were built for a reason.
You are talking spatial bits per pixel, not frequency bits. IIRC, 8-bit always uses 16-bit floating point for all iDCT processing, while 10/12 use 32. But that's different from what you are talking about here. Higher internal precision IS a big reason why 10-bit H.264 is better than 8-bit. The gap is much smaller in HEVC.

The bpp number you're talking about is just how many bits are used to store each pixel value. 8-bit is 8-bit. 10 and 12 use 16 because there aren't 10 or 12 bit register sizes. This is a primary reason 8-bit encoding is faster than 10/12.
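
As a back-of-the-envelope illustration of the storage cost, assuming one byte per sample for 8-bit builds and two bytes per sample for 10/12-bit builds (the function name `frame_bytes` is hypothetical):

```python
def frame_bytes(width: int, height: int, bytes_per_sample: int) -> int:
    """Size of one 4:2:0 frame buffer for a given sample storage size."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)  # two planes at quarter resolution
    return (luma + chroma) * bytes_per_sample


if __name__ == "__main__":
    print(frame_bytes(1920, 1080, 1))  # 3110400 bytes with 8-bit storage
    print(frame_bytes(1920, 1080, 2))  # 6220800 bytes with 16-bit storage
```

Twice the bytes per frame means twice the memory traffic and half as many samples per SIMD register, which fits the speed difference described above.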

This is all internal perf optimization stuff, and has no impact on the final encode. Forcing 16-bit for 8-bit content would make encoding slower without changing output. The actual encoder module itself will always start with 8-bit 4:2:0 input with Main Profile.

Kurosu

8th May 2021, 15:11

During the course of developing HEVC, Main10 was shown to be only 2-5% better (BD-rate-wise) than Main, because all intermediate computations during prediction are indeed raised to 16 bits internally. 16 bits is just a convenience thing, because a lot of CPU SIMD has this data format. It's no longer the 10+% mentioned e.g. by that old Ateme paper for H.264.

Main10 is still better, besides HDR, because of eg banding, and because input to prediction still has more precision.

RExt (4:2:2/4:4:4 and 4:2:0 > 10 bits) does use more precision (32 bits) on some intermediates, eg transforms, but that basically halves the throughput of the prediction. In any case, the goal was never to have a better 8 bits output, but to have ("contribution") workflows working on more than 10 bits.