10-bit encoding [Archive]

lansing

28th June 2018, 19:03

I have a basic question: if I want to encode an 8-bit video with 10-bit x264, do I need to convert the video to 10 bit in VapourSynth first, before feeding it to the encoder?

LigH

28th June 2018, 19:14

lansing

28th June 2018, 19:47

It doesn't hurt to process a video with 8 bits per color component (Y, U, V) with a 10 bit precision conversion from pixels to DCT frequency values. The "10 bit" is not related to pixel color depth.

Can you elaborate on what this means?
"10 bit precision conversion from pixels to DCT frequency values"

So is it wrong if I do this before passing the clip to the encoder?

clip = mvf.Depth(clip, 10)

DJATOM

28th June 2018, 20:39

In general, nope. Just make sure that your encoder can handle 10 bit input.

lansing

28th June 2018, 23:30

Um, I just did a test encode of 1500 frames of a 1080p video. One output went from 8 bit input straight to 10 bit x264; the other went through an 8->10 bit depth conversion in VapourSynth first and was then encoded with 10 bit x264 using the same settings. Both files came out exactly the same size, down to the last byte. Is there something wrong?
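Equal file sizes alone do not prove that the two encodes are identical, so a byte-level comparison is a useful extra check. The following is a minimal sketch using only the Python standard library; the helper names `file_digest` and `same_bytes` are made up for illustration and are not part of any encoder tooling.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_bytes(path_a, path_b):
    """True if both files have the same size and the same SHA-256 digest."""
    a, b = Path(path_a), Path(path_b)
    if a.stat().st_size != b.stat().st_size:
        return False
    return file_digest(a) == file_digest(b)
```

Calling `same_bytes()` on the two output files (whatever they are named on your system) returns `True` only when the contents match, not just the sizes.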

DJATOM

29th June 2018, 01:34

Nope. As I understand it, just raising the bit depth (multiplying, or rather shifting) will not alter your input. So it's fine to get identical results from the same content.
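The shifting mentioned here can be checked with plain integers. This is a small sketch, assuming the conversion is a simple left shift by two bits (limited-range style); the function names are made up for illustration, and a real converter may scale differently, for example for full-range material.

```python
def to_10bit(v8):
    """Promote an 8-bit sample to 10 bits by left-shifting two places."""
    return v8 << 2


def to_8bit(v10):
    """Drop the two low bits again (exact only if they are zero)."""
    return v10 >> 2


samples_8 = list(range(256))
samples_10 = [to_10bit(v) for v in samples_8]

# Every 8-bit value survives the round trip unchanged.
lossless = all(to_8bit(v) == orig for v, orig in zip(samples_10, samples_8))

# Limited-range black and white land on the usual 10-bit levels.
black_10, white_10 = to_10bit(16), to_10bit(235)
print(lossless, black_10, white_10)  # True 64 940
```

The promoted values only ever use every fourth 10-bit code, which is why no new information appears in the clip.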

poisondeathray

29th June 2018, 04:03

WolframRhodium

29th June 2018, 08:14

It implies that you used the same dithering method with mvf. I think most (all?) x264 10bit builds will apply dithering to 8bit input by default.

But obviously different dithering settings can be used, yielding different results.

As far as I know, low bitdepth data can be exactly represented in high bitdepth space without any quantization error (simply left-shifting). Why would dithering be needed?

LigH

29th June 2018, 11:57

The difference between 8 bit and 10 bit encoding in x264 is not related to the color depth of your input image, but to the precision of the stored MPEG-4 AVC / H.264 data. Please read about the "Discrete Cosine Transform (https://en.wikipedia.org/wiki/Discrete_cosine_transform#Example_of_IDCT)" to learn what happens to the input frames. They are not stored pixel by pixel. They are stored as a table of numbers (a matrix of coefficients, in linear algebra terms) describing 2-dimensional frequencies over 8x8 pixel squares. And in most cases, only the differences between two frames are transformed into DCT coefficients. These coefficients are then quantized and stored with either 8 or 10 bit precision.

Some hardware decoder chips in consumer players may not support AVC video encoded with 10 bit DCT precision; they may only be able to play 8 bit AVC.
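To make the "table of frequencies" idea concrete, here is a minimal sketch of a textbook floating-point 2-D DCT-II on one 8x8 block. It is an illustration only and not what x264 runs internally: H.264 uses integer approximations of the DCT on 4x4 and 8x8 blocks, and the function names below are made up.

```python
import math

N = 8  # block size used in the post above


def dct_1d(x):
    """Orthonormal DCT-II of a length-N sequence (textbook formula)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        total = sum(
            x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n in range(N)
        )
        out.append(scale * total)
    return out


def dct_2d(block):
    """Transform every row, then every column of the row results."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[i][j] for i in range(N)]) for j in range(N)]
    # cols[j][i] holds vertical frequency i, horizontal frequency j.
    return [[cols[j][i] for j in range(N)] for i in range(N)]


# A flat gray block: all the energy ends up in the single DC coefficient.
flat = [[128] * N for _ in range(N)]
flat_coeffs = dct_2d(flat)

# A horizontal ramp: only the first row (vertical frequency 0) is non-zero.
ramp = [[16 * j for j in range(N)] for _ in range(N)]
ramp_coeffs = dct_2d(ramp)

print(round(flat_coeffs[0][0], 6))  # 1024.0
```

Smooth image content concentrates into a few coefficients like this, which is what makes the later quantization step effective.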

For a thread about VapourSynth, this whole x264 issue is pretty off-topic.

Wolfberry

29th June 2018, 12:15

8bit → 10bit is not dithering, it's just higher precision.

In fmtconv, dithering is performed when at least one of these conditions is met:
Reducing the bit depth of integer data, or converting from float to integer.
Doing a full-range ↔ TV-range conversion between integer formats, because the resulting values haven’t an exact representation.
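Both conditions can be illustrated with a little arithmetic. This is a sketch with plain integers and fractions, not fmtconv itself; it assumes the usual 8-bit limited-range luma mapping (16 to 235), and the helper name `full_to_tv` is made up.

```python
from fractions import Fraction

# Reducing bit depth: four neighbouring 10-bit codes collapse into one
# 8-bit code, so the low bits must be rounded or dithered away.
ten_bit = [512, 513, 514, 515]
eight_bit = [v >> 2 for v in ten_bit]
print(eight_bit)  # [128, 128, 128, 128]


def full_to_tv(v):
    """Map a full-range 8-bit value (0..255) to limited range, exactly."""
    return 16 + Fraction(v * 219, 255)


# Full range -> TV range: count how many inputs have no exact integer result.
inexact = [v for v in range(256) if full_to_tv(v).denominator != 1]
print(len(inexact))  # 252
```

Only the inputs 0, 85, 170 and 255 map to whole numbers here; every other value has to be rounded, which is where dithering can help.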

Also, the x265 --dither option is only applicable when the input bit depth is larger than 8 bits.
Color depth (high color, true color, deep color) is what decides how many different colors can be represented, e.g. GIF only supports 8-bit color (256 colors).

WolframRhodium

29th June 2018, 14:33

For a thread about VapourSynth, this whole x264 issue is pretty off-topic.

In lansing's case, the 8bit input for the 10bit x264 is exactly the same as the 10bit input, so the encoded output should be the same if x264 is a deterministic algorithm.

Thus I insist this is not an x264 issue but an issue related to image/video representation.

Anyway, DCT by itself is not capable of compressing highly correlated video data to a large extent without the help of inter/intra prediction.

LigH

29th June 2018, 19:48

Well ... it is as little specific to VapourSynth (because it is just as relevant for AviSynth+) as it is specific to x264 (because it is just as relevant for other encoders with a selection of available internal precisions, e.g. already for MPEG-1/2). It is generally relevant "to image/video representation", as you say. So I believe this section deserves a separate thread.

And yes, DCT is just a first step towards efficiency, with many more steps following. But understanding DCT is important for understanding what the "bits" of precision refer to.

lansing

30th June 2018, 03:14

Okay, so let me rephrase what I've read so far: upconverting the bit depth from 8 bit to 10 bit in VapourSynth will do nothing to the image. And the "8/10 bit" of the encoder in x264/x265 has nothing to do with bit depth. So if I am to encode an 8 bit video to a 10 bit x264, there's no point in doing the bit depth conversion before that. Is this correct?

LigH

30th June 2018, 06:48

So if I am to encode an 8 bit video to a 10 bit x264, there's no point in doing the bit depth conversion before that. Is this correct?

That's correct, because it doesn't increase the precision of the image. It's like having a fractional number with a few significant digits (e.g. 0.125); adding another zero (0.1250) increases the number of digits in no relevant way, it's still identically the same number.

Using a video clip with 10 bit precision of the color components (Y, U, V) is only useful if your original material is already of a higher bit depth, or if you convert or filter the image before encoding it. Even when you just convert a whole video from original RGB (which may have been CG rendered) to YUV, the additional precision (R8G8B8 to Y10U10V10 instead of Y8U8V8) may be an advantage, because it avoids small rounding errors which may become visible as banding in faint color ramps.
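The rounding effect in an RGB to YUV conversion can be seen on a simple gray ramp. This sketch assumes limited-range luma (16 to 235 at 8 bit, scaled by 4 at 10 bit); for gray pixels with R = G = B the luma weights sum to 1, so the matrix choice does not matter. The function name `luma_code` is made up for illustration.

```python
import math


def luma_code(v, bits):
    """Limited-range luma code for a gray pixel R=G=B=v (8-bit input)."""
    scale = 1 << (bits - 8)
    return math.floor((16 + 219 * v / 255) * scale + 0.5)


ramp = range(256)
levels_8 = {luma_code(v, 8) for v in ramp}
levels_10 = {luma_code(v, 10) for v in ramp}

# 256 gray levels go in; count how many distinct luma codes come out.
print(len(levels_8), len(levels_10))  # 220 256
```

At 8 bit, 36 of the 256 input levels end up sharing a code with a neighbour, while at 10 bit every input level keeps its own code.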

lansing

30th June 2018, 15:47

Got it, thanks everyone for the clarifications.

Wilbert

5th July 2018, 13:17

Moved to a separate thread.