Encoding 4K HDR 4:2:0 10bit BT.2020 [Archive] - Page 3

James Freeman

2nd May 2016, 18:12

ffmpeg.exe -framerate 24 -i "E:\Sequence\img-%%05d.exr" -strict -1 -vf scale=out_color_matrix=bt709,format=yuv420p10 "E:\x265\RAW.Y4M"
What about chroma position?
It looks like the chroma is shifted 1 or 0.5 pixel to the left, top to bottom looks right.
This is from EXR to Y4M ffmpeg, without x265 yet.

sneaker_ger, nevcairiel
Please educate me about chroma position.

The right crosses are 1 pixel wide in 3840x2160.
http://www.mediafire.com/convkey/96d4/n0svoyvqxt3i8afzg.jpg

kolak

2nd May 2016, 18:29

If anyone interested here is what the colormatrix=bt601:bt709 command does:
yuv444p is indeed only 8bit.

scale=out_color_matrix=bt709:

For a long time there was a bug in RGB 10/16bit to any 10bit YUV based format through Rec.709 (some tint is introduced). There was some patch recently added, so maybe it's fixed now, but test it against e.g. After Effects.

James Freeman

2nd May 2016, 19:07

No tint, all the RGB colors (printscreen from 4:2:0 10bit YUV to GIMP) are exactly the same in 8bit (255) before and after all the conversion steps.
I would love to test 10bit colors but AE doesn't play HEVC files in the workflow.

I wonder if there is a way to take a 16bit PNG screen shot from a 10bit video with MPC-HC and madVR?

kolak

3rd May 2016, 09:31

Sounds like it's fixed, but if you want to test it you can use any other format, not h265.
I always test DPX and v210, and for the last few years there has always been a problem. I will try it.

James Freeman

3rd May 2016, 10:02

I found some ffmpeg "Location of chroma samples" in the code, and apparently I can change that (at the bottom):
https://ffmpeg.org/doxygen/3.0/pixfmt_8h_source.html
I'll try to find a way to implement this.

Also this post from sneaker_ger: http://forum.doom9.org/showthread.php?t=173047
So 2020 4:2:0 should be #3 (top left) like mpeg2 4:2:2.

EDIT:
According to ffmpeg debug log, EXR to DPX is "rgb48le to rgb48le" so no change to 4:2:0p10 at all.

kolak

3rd May 2016, 10:07

Tried latest build on Mac and still not very accurate.

There are differences in RGB values of up to ±3, whereas AE's v210 has a maximum difference of ±1. It looks like ffmpeg's rounding is not precise at all. I also tried "accurate_rnd", but it made no difference. This problem has been in ffmpeg for ages now.
Funnily enough, when you go from 10bit DPX to RGB48 and then to v210, the errors are smaller. There is something odd with the math inside ffmpeg for the gbrp10le to yuv422p10le pixel format conversion.
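
For reference, a minimal Python sketch of how much round-trip error quantization alone should cause. It assumes BT.709 constants, full-range 10-bit R'G'B' input, limited-range 10-bit Y'CbCr, no chroma subsampling and plain round-to-nearest; the function names are made up for illustration and this is not how ffmpeg or AE implement the conversion.

```python
# Assumptions: BT.709 KR/KB, full-range 10-bit RGB, limited-range 10-bit YCbCr, 4:4:4.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb10_to_ycbcr10(r, g, b):
    """Full-range 10-bit R'G'B' -> limited-range 10-bit Y'CbCr codes."""
    rf, gf, bf = r / 1023, g / 1023, b / 1023
    y = KR * rf + KG * gf + KB * bf
    pb = 0.5 * (bf - y) / (1 - KB)
    pr = 0.5 * (rf - y) / (1 - KR)
    return round(64 + 876 * y), round(512 + 896 * pb), round(512 + 896 * pr)


def ycbcr10_to_rgb10(yc, cb, cr):
    """Limited-range 10-bit Y'CbCr codes -> full-range 10-bit R'G'B'."""
    y = (yc - 64) / 876
    pb = (cb - 512) / 896
    pr = (cr - 512) / 896
    rf = y + 2 * (1 - KR) * pr
    bf = y + 2 * (1 - KB) * pb
    gf = (y - KR * rf - KB * bf) / KG
    return tuple(min(1023, max(0, round(1023 * v))) for v in (rf, gf, bf))


def max_round_trip_error(step=33):
    """Largest per-channel code difference over a coarse grid of RGB inputs."""
    worst = 0
    levels = range(0, 1024, step)
    for r in levels:
        for g in levels:
            for b in levels:
                back = ycbcr10_to_rgb10(*rgb10_to_ycbcr10(r, g, b))
                worst = max(worst, abs(back[0] - r), abs(back[1] - g), abs(back[2] - b))
    return worst


print(max_round_trip_error())
```

Under these assumptions the error stays within a couple of code values, and black comes back as 0,0,0, so larger offsets point at the conversion path rather than at the bit depth itself.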

James Freeman

3rd May 2016, 10:25

Can you share your ffmpeg command line?

EDIT:
Okay got it, I needed to use "-vcodec v210", (I'm new at this, mind you);
Going from EXR to V210 AVI, the error is +1 maximum.
When going from DPX to V210 the error is +3 as you say, even the black is 1,0,2 RGB instead of 0,0,0.

Importing the EXR/DPX to V210.AVI to AE reveals more because I can color grab in 10bit and higher and compare to the original RGB.
DPX->V210: What should be 538 can be 547, 269 can be 276, 806 can be 815, so maximum +8 steps error in 10bit (or +2 in 8bit).
EXR->V210: 538 to 541, 269 to 270, 806 can be 810, maximum +4 steps error in 10bit or +1 step in 8bit.
These are the maximum errors; more often they are smaller.

Now the question is why do you use DPX instead of EXR if you know the error is bigger?
If ffmpeg can't handle DPX (gbrp10le) right or it has less bits for calculation, why still use it?

kolak

3rd May 2016, 10:45

-vf 'scale=out_color_matrix=bt709, format=yuv422p10le' -sws_flags accurate_rnd -sws_dither none -c:v v210 out.mov

this is DPX to v210 MOV.

kolak

3rd May 2016, 10:47

can you share your ffmpeg command line from EXR to V210 DPX?

EDIT:
Okay got it, I needed to use "-vcodec v210", (I'm new at this, mind you); Going from EXR to V210 AVI, the error are +1 maximum.

EXR is 16bit, so this is why. The problem seems to be with 10bit pixel formats, like gbrp10le, which is used for 10bit DPX.

James Freeman

3rd May 2016, 11:13

Thanks, I figured things out quickly.
Yes, it seems ffmpeg likes higher bitdepth for more accurate calculation.
If you work with people and move content between stages (intermediate), ask for EXR specifically.
EXR is smaller than DPX anyway, so use that.

kolak

3rd May 2016, 11:49

It's an ffmpeg problem with some of the conversion paths. They use many routes; some are heavily optimised for speed, others are not. I don't need EXR in order to get proper v210, this is strictly an ffmpeg problem. Other than this, big post houses have fixed workflows, and something like a "simple" change from DPX to EXR is not as easy as you think :)
EXRs can be a bit smaller than DPX if you use compression (if not, they are way bigger), but the benefit is not so obvious or great. If you do use compression and deal with e.g. 4K EXRs, then you lose a lot of CPU just on decoding. There are many factors which may not be obvious straight away.
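
To put rough numbers on the uncompressed case, here is a small Python sketch. It assumes 10-bit RGB DPX packed as three samples per 32-bit word and EXR stored as three 16-bit half-float channels, and it ignores headers; the function names are only illustrative.

```python
def dpx10_rgb_bytes(width, height):
    # Assumption: three 10-bit samples packed into one 32-bit word per pixel.
    return width * height * 4


def exr_half_rgb_bytes(width, height):
    # Assumption: three uncompressed 16-bit half-float channels per pixel.
    return width * height * 3 * 2


w, h = 3840, 2160
print(dpx10_rgb_bytes(w, h) / 2**20, "MiB per DPX frame")
print(exr_half_rgb_bytes(w, h) / 2**20, "MiB per uncompressed EXR frame")
```

With these assumptions an uncompressed half-float EXR frame is 1.5 times the size of the DPX frame, so EXR only comes out ahead once compression is enabled.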

James Freeman

3rd May 2016, 12:11

Thanks for the clarification kolak, information is always much appreciated.

Now how about this chroma subsampling:
http://www.mediafire.com/convkey/1fb9/4chc7aiin8cpcf4zg.jpg

Looks like typical 4:2:0 MPEG-2 (Left between top and bottom).
How can I change it to Left-Top with ffmpeg?

kolak

3rd May 2016, 12:22

Not easy, or not possible at all, with ffmpeg.
VapourSynth (or Avisynth) is probably your best choice.

James Freeman

3rd May 2016, 12:31

Hmmm, my goal was to go from After Effects to HEVC in 4:2:0 10bit, the fastest path with the smallest HDD space.
So far EXR to ffmpeg piped to x265 is the best way.

As far as I know about this (not much) Avisynth is a pain with high bit depths.
ImageSource (which supports EXR) is only 8bit.

Any suggestion?

kolak

3rd May 2016, 12:54

Use vapoursynth and pipe to x265. VS should load EXR/DPX fine over ffmpeg import plugin.

James Freeman

3rd May 2016, 13:19

Could you be kind and save me a couple of hours of googlin'?
EDIT:
On second thought, this is way over my head and free time.
I'll stick to MPEG-2 chroma position for a while, till ffmpeg is capable.
Thanks.

kolak

3rd May 2016, 13:26

I'm not good enough.
Ask in vapoursynth section.

surami

3rd May 2016, 13:54

Any suggestion?
Let's find somebody, who can extend the Advanced FrameServer code (https://sourceforge.net/p/advancedfs/code/HEAD/tree/trunk/) with 16bpc capabilities. :)

James Freeman

3rd May 2016, 14:14

surami, we need to receive an EXR image sequence, convert it to YUV 4:2:0 10bit TV range with Top-Left chroma as specified for 2020 HDR, and pipe it to x265 without storing to HDD before the pipe, all in high bit depth.
If we accomplish that, we get close to UHD-BD encoding in the most efficient way.

surami

3rd May 2016, 14:26

we need to receive EXR image sequence
a fake 16bpc YUV 444 or RGB AVI could do this as frameserver, that is how AFS is working, but at this time it serves only in 8bpc (RGB24, RGB32, etc.)

Top-Left Chroma as specified to 2020 HDR
There is a --chromaloc (http://x265.readthedocs.io/en/default/cli.html#cmdoption--chromaloc) option on the x265 side. Isn't that what you are talking about?

Did you try scale=out_color_matrix=bt2020 instead of scale=out_color_matrix=bt709 on a bt2020 source?

James Freeman

3rd May 2016, 14:39

I have not yet tried whether --chromaloc changes the chroma location. I'll try.
--chromaloc: Specify chroma sample location for 4:2:0 inputs.
In case of ffmpeg the location is always "Left between top and bottom" or 0 in --chromaloc.
The x265 encoding will be wrong if --chromaloc 2 is set with MPEG-2 Y4M input.

Here is the official H265 Spec: https://www.itu.int/rec/T-REC-H.265-201504-I/en
Page 375 of pdf, 353 of book.
Maybe someone can clarify the whole chroma location issue.

EDIT:
A quick test shows that --chromaloc in x265 does not change the position of the chroma, from what I see it's just metadata to the decoder.
At least that is what I see.

kolak

3rd May 2016, 14:55

If you want to make sure no conversion is happening than maybe it's good to use both in/out parameters:

"scale=in_color_matrix=bt2020:out_color_matrix=bt2020"

nevcairiel

3rd May 2016, 15:06

Don't believe the nay-sayers, you can accurately specify the chroma position for ffmpeg; the syntax is just not very straightforward, since it actually supports any chroma position anyone could ever imagine, including odd subpixel positioning.

For the progressive 4:2:0 MPEG-2 position, extend your command like this (which is also the default, afaik):

-vf 'scale=out_color_matrix=bt709:out_h_chr_pos=0:out_v_chr_pos=128'

The position is specified in the luma grid / 256, so for a 4x4 grid in 4:2:0, 0:0 is center of the first pixel, 256:256 the last, and 0:128 is in the first column, between both rows - ie. mpeg2 position.
You can fiddle with the values to achieve any position you want, including the new bt2020 position. HEVC type 2 should be 0:0 from what I can tell (ffmpeg calls this AVCHROMA_LOC_TOPLEFT)
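
As a rough illustration of the units described above, here is a short Python sketch that maps the chroma sample location types to scale filter options. The type 0 and type 2 values are the ones given in this post; the centred position (128:128) for type 1 is an assumption added for comparison, and the helper names are hypothetical.

```python
# (horizontal, vertical) chroma position in units of 1/256 of the luma grid.
CHROMA_POS_BY_LOC_TYPE = {
    0: (0, 128),    # left column, between the two rows: MPEG-2 position
    1: (128, 128),  # centred between four luma samples (assumption, not from the thread)
    2: (0, 0),      # top-left, co-sited with the first luma sample
}


def scale_chroma_options(loc_type):
    """Build the out_*_chr_pos part of a scale filter string."""
    h, v = CHROMA_POS_BY_LOC_TYPE[loc_type]
    return f"out_h_chr_pos={h}:out_v_chr_pos={v}"


def offset_in_luma_samples(loc_type):
    """Convert the 1/256 units into an offset measured in luma samples."""
    h, v = CHROMA_POS_BY_LOC_TYPE[loc_type]
    return h / 256, v / 256


print(scale_chroma_options(2))
print(offset_in_luma_samples(0))
```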

Note that software will likely still have issues with chroma type 2.

PS:
If your RGB is mastered as bt2020, why do you want to convert that to bt709 yuv, and not bt2020 yuv? That seems odd. It's a conversion to yuv either way, so why introduce bt709 in there if it was not used anywhere before?

James Freeman

3rd May 2016, 15:30

God bless you Nev.
I will try the :out_h_chr_pos=0:out_v_chr_pos=128 line ASAP (like NOW).

EDIT:
IT WORKS!!!!!!!
:out_h_chr_pos=0:out_v_chr_pos=0 got me the chroma position that complies to BT.2020 spec, Hurrah Nevcairiel !
Now it is right to use --chromaloc 2 in x265, after this ffmpeg command.

nevcairiel

3rd May 2016, 15:37

You convert from RGB to YUV - you are not going to "convert" the colorspace, but you need to actually pick one for the RGB->YUV conversion. If your original RGB is BT.2020, then your YUV should also be BT.2020, or weird things might happen.
If you perform an RGB -> YUV conversion, and expect a future YUV -> RGB conversion to look the same as the original, both conversions need to use the same colorspace. And if you encode the YUV as BT.2020, then the original conversion should also be BT.2020, should it not?

Note that using bt2020 in -vf scale needs a rather recent ffmpeg, it was added there only mid of april.

In short, the important part is that in a RGB -> YUV -> RGB chain, both conversions use the same colorspace.
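
A small Python sketch of that point, working on normalized R'G'B' values without quantization. The KR/KB constants are the published BT.709 and BT.2020 values; the function names and the sample colour are made up for the example.

```python
# (KR, KB) luma coefficients for each matrix.
MATRICES = {"bt709": (0.2126, 0.0722), "bt2020": (0.2627, 0.0593)}


def rgb_to_ypbpr(rgb, matrix):
    kr, kb = MATRICES[matrix]
    r, g, b = rgb
    y = kr * r + (1 - kr - kb) * g + kb * b
    return y, 0.5 * (b - y) / (1 - kb), 0.5 * (r - y) / (1 - kr)


def ypbpr_to_rgb(ypbpr, matrix):
    kr, kb = MATRICES[matrix]
    y, pb, pr = ypbpr
    r = y + 2 * (1 - kr) * pr
    b = y + 2 * (1 - kb) * pb
    g = (y - kr * r - kb * b) / (1 - kr - kb)
    return r, g, b


original = (0.8, 0.2, 0.1)
# Same matrix in both directions: the colour comes back unchanged.
matched = ypbpr_to_rgb(rgb_to_ypbpr(original, "bt2020"), "bt2020")
# Encoded with BT.709 but decoded as BT.2020: the colour is skewed.
mismatched = ypbpr_to_rgb(rgb_to_ypbpr(original, "bt709"), "bt2020")
print(matched)
print(mismatched)
```

Note that a player guessing the same wrong matrix on decode hides the problem, since the matched pair always round-trips regardless of which matrix was used.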

nevcairiel

3rd May 2016, 15:57

As I said, I want no color space conversion at all by ffmpeg, but I want the colors to look right, the only option that does what I want is scale=out_color_matrix=bt709.
ffmpeg doesn't have to know if my image is 2020 or 709 or 601, RGB values are RGB values.
I want only x265 to know that the image is in 2020 ST.2084.

But you have ffmpeg do the RGB -> YUV conversion, for that you need to tell it the *correct* YUV matrix you expect it to be afterwards. I don't get how that is so hard to understand.

surami

3rd May 2016, 16:08

If your source is already in bt2020, why do you want to use bt709 in and out, just use what kolak wrote.

nevcairiel

3rd May 2016, 16:52

There is no "in" matrix, since the input is RGB. Such a matrix only applies to YUV, not RGB. If you do RGB -> YUV, and then later YUV -> RGB, you need to use the same colormatrix in both conversions. So if you want your output HEVC file to be BT.2020, you also need to use BT.2020 for the RGB->YUV conversion, otherwise your YUV is BT.709, and NOT BT.2020.
It's as simple as that. If something looks wrong doing that, you should re-evaluate the other steps of your process; maybe something in the process isn't actually capable of displaying BT.2020 properly, or your ffmpeg is too old to actually support BT.2020.

But I've said this 10 times in this thread already. Don't try to fudge things until they "appear" right even if logically they are just plain wrong - chances are, they actually are wrong, and something else is screwing up as well.
Claiming all your data is BT.2020 and then using BT.709 for one conversion is just plain wrong, and if you don't realize this, we cannot help you.

From your posts, you seem to not really understand well how this entire process works, so instead of discarding any suggestions as "but my thing looks right", maybe be a bit more open to actually understanding what's going on.

nevcairiel

3rd May 2016, 17:51

The answer is obvious: your Y4M has no matrix info, so during playback madVR assumes it's BT.709. If you were to tell it manually that it actually is BT.2020, since Y4M does not carry this information, it would look right. madVR has a shortcut to switch the matrix info.
Like I said, evaluate all the steps. You even have the madVR OSD on; read it, and it tells you that it's assuming BT.709 instead of BT.2020 ("best guess" means it guesses since the info is not available), which nicely explains why it looks wrong.

You just think everything is right because there are no obvious errors, but using BT.709 for the conversion is not correct.
Always make sure every single step is correct, otherwise two errors might cancel each other out like here: you think BT.709 looks correct because that's what madVR "guesses" for your Y4M file, but once it's actually handled as proper BT.2020, it would suddenly look wrong after all.

nevcairiel

3rd May 2016, 18:00

I do not want ffmpeg to convert the matrix to 2020, I want to keep it 709; therefore I use 709.

This just makes no sense. Nothing in your chain ever was BT.709, so there is nothing to "keep it" as that. I just don't understand how it makes sense to you to even consider using BT.709 for anything in your entire process if it's all supposed to be BT.2020. Oh well.

You clearly don't want to listen, so I leave you to your misconceptions, good luck.
Just don't pretend to create properly mastered HDR material and use that for reporting bugs.

James Freeman

3rd May 2016, 18:29

Can I eat my hat? :o

Let's just say I have wasted your time (again) Nev, and polluted a bunch of pages on this lovely and informative thread.
I do have to specify out_color_matrix=bt2020 in ffmpeg so that x265 encodes it correctly.
I tested with "--colorprim bt2020 --transfer bt709 --colormatrix bt2020nc" in x265 to clearly see what I did not see before.
I had to switch madVR display to 2020 also to get the correct picture back.
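
Putting the pieces from this thread together, here is a hedged Python sketch that only assembles the ffmpeg-to-x265 pipe as a string; it does not run anything. The file names and the helper name are hypothetical, the scale options are the ones discussed above, and the x265 transfer setting is left out on purpose because the right value depends on the x265 build and on the material.

```python
import shlex


def build_pipeline(exr_pattern, output_hevc, fps=24):
    """Return a shell pipeline string: EXR sequence -> 10-bit 4:2:0 Y4M -> x265."""
    ffmpeg_cmd = [
        "ffmpeg", "-framerate", str(fps), "-i", exr_pattern,
        # BT.2020 matrix for the RGB -> YUV step, chroma sited top-left (0:0).
        "-vf", "scale=out_color_matrix=bt2020:out_h_chr_pos=0:out_v_chr_pos=0,format=yuv420p10le",
        "-strict", "-1", "-f", "yuv4mpegpipe", "-",
    ]
    x265_cmd = [
        "x265", "--y4m", "--input", "-",
        # Metadata must describe what ffmpeg actually produced.
        "--chromaloc", "2", "--colorprim", "bt2020", "--colormatrix", "bt2020nc",
        "--output", output_hevc,
    ]
    return shlex.join(ffmpeg_cmd) + " | " + shlex.join(x265_cmd)


print(build_pipeline("frames/img-%05d.exr", "out.hevc"))
```

The point of keeping both halves in one place is that the matrix and chroma position used by ffmpeg and the values signalled by x265 cannot drift apart.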

I guess my learning curve is a little stubborn, no hard feelings all in good spirit.

Anyway, I cleaned the thread a bit, hope the good stuff is still in.

kolak

3rd May 2016, 23:05

The references I find to the first parameter suggest this is just metadata set in the bitstream. The latter parameters sound like they would actually impact the image processing, but I have no idea what the correct values would be.

ffmpeg documentation always seems to be lacking in direct proportion to how much documentation would be useful...

The answer is there, it just needs a person with knowledge :)

http://forum.doom9.org/showthread.php?p=1766645#post1766645

kolak

3rd May 2016, 23:11

In short, the important part is that in a RGB -> YUV -> RGB chain, both conversions use the same colorspace.

Quite simple math, but this is the place where so many software/players fail badly :)

kolak

3rd May 2016, 23:12

If your source is already in bt2020, why do you want to use bt709 in and out, just use what kolak wrote.

It actually needs just out_color_matrix, as source is RGB (as nevcairiel pointed).

kolak

3rd May 2016, 23:20

James Freeman

4th May 2016, 05:19

But you have ffmpeg do the RGB -> YUV conversion, for that you need to tell it the *correct* YUV matrix you expect it to be afterwards. I don't get how that is so hard to understand.

I was lacking education about what Color Matrix actually is and how it differs from Color Primaries, and in consequence gave it no importance.

Y′CbCr signals (prior to scaling and offsets to place the signals into digital form) are called YPbPr, and are created from the corresponding gamma-adjusted RGB (red, green and blue) source using two defined constants KB and KR as follows:

https://upload.wikimedia.org/math/1/4/a/14ab0cc4c6dc23a84889b5da73930534.png

Where KB and KR are ordinarily derived from the definition of the corresponding RGB space. (The equivalent matrix manipulation is often referred to as the "color matrix".)

If I'm not mistaken (now), these KB and KR constants are what change when selecting a different "out_color_matrix" value in ffmpeg.
They do not affect the primary chromaticity points (as I thought), but only part of the equation when converting from RGB to YCbCr and back.
The actual difference between the 601, 709 and 2020 KB and KR is small mathematically, but that's enough to skew the colors on the CbCr plane.
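
To make the KB/KR role concrete, here is a minimal Python sketch of the YPbPr definition quoted above, i.e. Y' = KR*R' + (1-KR-KB)*G' + KB*B', PB = 0.5*(B'-Y')/(1-KB), PR = 0.5*(R'-Y')/(1-KR). The constants are the published BT.601, BT.709 and BT.2020 values; the function name is just for illustration.

```python
# (KR, KB) per matrix; KG follows as 1 - KR - KB.
COEFFS = {
    "bt601": (0.299, 0.114),
    "bt709": (0.2126, 0.0722),
    "bt2020": (0.2627, 0.0593),
}


def rgb_to_ypbpr(r, g, b, matrix):
    """Normalized gamma-encoded R'G'B' (0..1) -> Y' (0..1), PB and PR (-0.5..0.5)."""
    kr, kb = COEFFS[matrix]
    y = kr * r + (1 - kr - kb) * g + kb * b
    pb = 0.5 * (b - y) / (1 - kb)
    pr = 0.5 * (r - y) / (1 - kr)
    return y, pb, pr


# The same pure red lands on a different Y' and PB for each matrix.
for name in COEFFS:
    print(name, rgb_to_ypbpr(1.0, 0.0, 0.0, name))
```

The primaries never appear in this calculation, which is why the matrix and the primaries are signalled separately.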

If I am mistaken this time please correct me (hope you didn't give up on me just yet :)).

Quite simple math, but this is the place where so many software/players fail badly :)
It's even worse when the person who encoded the video used the wrong matrix... :p:D

James Freeman

4th May 2016, 06:04

You may also want to use zscale instead of scale, as this filter may be higher quality? Not sure if it has all needed options.
Isn't scale working in floating point too?
Can I specify chroma location as suggested by nevcairiel?

surami

4th May 2016, 14:10

I wrote to Ildar (the developer of AFS); maybe it's possible to extend the plugin with a few simple lines of code, so that the frameserving could be 16bit capable instead of only 8bit. I don't know much about this programming language, but as I see in the source code of two files (PremiereFS.cpp (https://sourceforge.net/p/advancedfs/code/HEAD/tree/trunk/dfscPremiereOut/PremiereFS.cpp) and PremiereFS.h (https://sourceforge.net/p/advancedfs/code/HEAD/tree/trunk/dfscPremiereOut/PremiereFS.h)), the 8u could be rewritten to 16u and extra 16bit formats could be added in a very simple way, am I right or not? After that ffmpeg could be fed with the fake AVI file.

Is there somebody here with MSVS who could try extending the code and compiling it? I don't know, maybe Ildar will not see my message.

kolak

4th May 2016, 17:41

Isn't scale working in floating point too?
Can I specify chroma location as suggested by nevcairiel?

I don't think scale works with float.
I don't think you can specify chroma position with zscale :( There are no options for this.

James Freeman

4th May 2016, 17:52

Might as well post the link to the official place where all the white papers are:

http://www.blu-raydisc.com/en/Technical/TechnicalWhitePapers/General.aspx
* Blu-ray Disc Format - General 4th Edition (August 2015).

http://www.blu-raydisc.com/en/Technical/TechnicalWhitePapers/BDROM.aspx
* BD ROM - Physical Format Specifications - 9th Edition (August 2015)
* BD ROM - Audio Visual Application Format Specifications - Version 3 (July 2015)
* BD ROM - CMP Export Specifications - Version 1.1 (August 2015)

The red one is where all the good stuff's in.

Here is the full official HEVC spec:
https://www.itu.int/rec/T-REC-H.265

Select the latest one, there is a pdf.
Here you'll find the information that "BD ROM - Audio Visual Application Format Specifications" refers to.

benwaggoner

5th May 2016, 00:30

As for RGB->YUV conversion, both the input and output primaries and EOTF (electro-optical transfer function, like gamma or PQ) need to be specified, or match. If an RGB -> HDR-10 (Rec. 2020 primaries, D65 white point, and PQ curve EOTF) conversion assumes the source is sRGB, we'd expect that R'G'B'=255 would translate to a limited range value of 100 nits (peak white in Rec. 709) and thus a PQ curve Y'=509. And the Cb and Cr values would be constrained to the sRGB/709 subset of 2020.

Conversely, if a conversion makes the naïve assumption that the source and output have the same color primaries and EOTF (like sRGB->709), you'd get pretty crazy results. The image would come out way too bright in midtones, way oversaturated, and with a reddish color shift. R'G'B'=255 would result in 10,000 nits, which no display in the world can actually show. Practical HDR-10 typically limits chroma to the P3 subset of 2020, and peak nits to 1000 (sometimes up to 4000).
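
The code values above can be checked with a short Python sketch of the ST.2084 (PQ) inverse EOTF. The constants are the ones published for PQ; the limited-range mapping (64 to 940 for 10-bit luma) and the function names are assumptions made for this example.

```python
# ST.2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_inverse_eotf(nits):
    """Absolute luminance in cd/m2 (0..10000) -> normalized PQ signal (0..1)."""
    y = max(nits, 0.0) / 10000.0
    yp = y ** M1
    return ((C1 + C2 * yp) / (1 + C3 * yp)) ** M2


def limited_range_10bit_luma(nits):
    """Normalized PQ signal scaled to 10-bit limited-range luma codes."""
    return round(64 + 876 * pq_inverse_eotf(nits))


print(limited_range_10bit_luma(100))
print(limited_range_10bit_luma(10000))
```

So 100 nits sits at roughly the middle of the 10-bit code range, while the top code is reserved for 10,000 nits.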

For 2020 10-bit, you just get the enhanced color primaries but the same old gamma EOTF, so no extra dynamic range. And there's no defined white point or nit mapping. Also, no TV in the world can display all of the 2020 primaries, so you'll get clipping or some other tone mapping at the edges of the color volume.

This whole space requires new kinds of processing, and conversions that are quite difficult to do without scene-by-scene creative decisions. I think we're going to see the industry move away from integer-based processing altogether, and adopt floating-point linear light in order to have a common, flexible color space to combine and process stuff in. Going from SDR to HDR is qualitatively different, not just quantitatively.

surami

5th May 2016, 14:59

James Freeman

5th May 2016, 17:13

I get that too with madVR.

Use this small tool to see the metadata: https://mediaarea.net/en/MediaInfo.
Sony's clip is encoded with wrong metadata: 0.5 nit instead of 1000 nit peak.
The whole ST.2086 metadata of this clip is crap.

If I'm not mistaken, madVR uses the peak metadata, it clips anything above, and stretches the image to maximum white.
Most TVs do not use this ST.2086 metadata (for now), but simply map all bit words to ST.2084 EOTF in absolute way.

benwaggoner

5th May 2016, 18:05

Sometimes I see weird white frames if I play back Sony's HDR sample on my PC.

https://866450b80d380784660ddd7c64fd2076c544dd7e.googledrive.com/host/0B9p82xjTYmAxVTNVSFlvWHNjMmM/doom9/sony_hdr_camp_white_frame.jpg

Is this because there are two encoded layers, a base layer + an enhancement layer?
No. Only Dolby Vision does that, which was really a legacy mode for when 8-bit decoders were all there were. Dolby Vision now supports a couple single-layer 10-bit modes as well.

Sony TVs don't support Dolby Vision in any case, only HDR-10. HDR-10 is always single layer.

There is a command --temporal-layers (http://x265.readthedocs.io/en/default/cli.html#cmdoption--temporal-layers).

Is this the trick why Dolby Vision HDR videos are looking different (more detailed) than HDR10 videos?
That command is for temporal scalability. Like encoding 60p source with a backwards-compatible 30p base layer and an enhancement layer that can then add every other frame back in. The enhancement layer is just extra non-reference b-frames with timestamps in between the base layer's. It's cool, but entirely orthogonal to HDR or spatial quality.
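
A toy Python sketch of the 60p-over-30p idea, purely conceptual: frames are plain list items, the base layer is every other frame, and the enhancement layer holds the frames in between. This is not how x265 assigns frames to layers internally, and the function names are invented.

```python
def split_temporal_layers(frames):
    """Split a 60p frame list into a 30p base layer and the in-between frames."""
    base = frames[0::2]
    enhancement = frames[1::2]
    return base, enhancement


def merge_temporal_layers(base, enhancement):
    """Interleave the two layers back into the full-rate sequence."""
    merged = []
    for i, frame in enumerate(base):
        merged.append(frame)
        if i < len(enhancement):
            merged.append(enhancement[i])
    return merged


frames = list(range(8))
base, enhancement = split_temporal_layers(frames)
print(base, enhancement)
print(merge_temporal_layers(base, enhancement))
```

A decoder that drops the enhancement list still gets a complete, lower-rate sequence, and nothing about the per-frame picture quality or dynamic range changes.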

Maybe there is a HDR10 layer (for only HDR10 capable TVs) + an enhancment layer for TVs, that are capable to decode the two layers simultaneously + merge them together and capable for more/less nits values or not?
There are lots of technologies being worked on to make backwards-compatible streams that can play in either HDR or SDR. None are in the wild yet other than Dolby Vision. And they're all targeting broadcast bandwidth constrained systems like cable/sat/terrestrial. For IP delivery, just delivering a SDR or HDR stream as appropriate to the device is much simpler.

surami

5th May 2016, 21:25

@Ben, thanks for clarifying things.
@James, you can see the same info already in the latest nightly MPC-HC, Shift + F10.
@nevcairiel, could you check the LAVFilter with a 12bit RGBA Cineform footage, I can't open them with the latest one.

kolak

5th May 2016, 23:05

12bit Cineform requires very latest libraries.

James Freeman

6th May 2016, 09:16

@James, you can see the same info already in the latest nightly MPC-HC, Shift + F10.

Thanks for this.

Still, I don't understand why TVs don't use the ST.2086 metadata (mastering display primary chromaticity points, peak brightness, minimum black level, MaxCLL, MaxFALL), but simply map the bits to ST.2084 EOTF.
That's why the sony HDR test video plays without problems on all TVs ignoring the wrong 2086 metadata.

I think that the ST.2086 metadata is for extreme case corrections if the mastered video was done on a completely different mastering display that has no similar behavior to the consumer display.
We all know that the ST.2084 EOTF is static bit=luminance curve and has to behave exactly the same on all displays, so a logical question would be why even use ST.2086 metadata with a ST.2084 EOTF?

The mastering display color volume and peak luminance metadata is ignored because it's all in ST.2084 2020 container anyway, the TV just clips/compresses anything that's outside its own color gamut and luminance according to bit values not metadata.
maxCLL is there to avoid automatic brightness limiting (ABL) of the TV because of power limitation and the possibility of overheating and burning of the backlight LED; on secondary importance is predictable picture luminance.
maxFALL as maxCLL is nothing but power constraints to the mastering engineer because the consumer TV would explode above 200W (figuratively speaking), both do not actually have to be specified as a metadata but followed when grading/mastering.
MaxFALL indicates the maximum value of the frame average light level, in units of 1 cd/m2, in entire playback sequence of the BDMV HDR video streams in the PlayList.
To prevent overheating of the TV set obviously.
But will the TV actually lower the average light level to not overheat?, that is the question.
If it actually does something it'll break the precise ST.2084 curve.
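
For reference, a minimal Python sketch of how MaxCLL and MaxFALL could be derived from decoded frames. It assumes each pixel is already an (R, G, B) tuple in linear light in cd/m2 and that a pixel's light level is the largest of its three components; the function name and sample data are made up.

```python
def max_cll_and_fall(frames):
    """Return (MaxCLL, MaxFALL) for a sequence of frames.

    frames: iterable of frames, each a list of (R, G, B) tuples in cd/m2.
    """
    max_cll = 0.0
    max_fall = 0.0
    for frame in frames:
        # Light level of a pixel: its brightest colour component.
        levels = [max(pixel) for pixel in frame]
        max_cll = max(max_cll, max(levels))
        # Frame average light level, then the maximum of it over all frames.
        max_fall = max(max_fall, sum(levels) / len(levels))
    return max_cll, max_fall


frames = [
    [(100, 50, 20), (1000, 900, 800)],   # one bright highlight
    [(400, 400, 400), (400, 400, 400)],  # uniformly mid-bright frame
]
print(max_cll_and_fall(frames))
```

Both numbers describe the content as a whole, which is why a single small highlight can set MaxCLL while MaxFALL stays much lower.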

In this Sony HDR Camp clip all the 2086 metadata is perfectly useless and should destroy the picture if applied for anything (like in madVR), yet users say it looks excellent on their HDR TVs.
It means that the 2086 metadata is completely ignored by TVs for now.

benwaggoner

6th May 2016, 18:37

Thanks for this.

Still, I don't understand why TVs don't use the ST.2086 metadata (mastering display primary chromaticity points, peak brightness, minimum black level, MaxCLL, MaxFALL), but simply map the bits to ST.2084 EOTF.
Some TVs do use that information, but vary in how they do it. Generally dimmer TVs will use the metadata more, because they can't decode the full 1000 nits and P3 color volume of typical HDR-10.

That's why the sony HDR test video plays without problems on all TVs ignoring the wrong 2086 metadata.
TVs may also use sanity checks. Given that the code values in the frames are going to be >>MaxCLL, and that 1 nit would almost never be used, perhaps out-of-range values are ignored, or values that vary wildly from the actual range of the content.

I think that the ST.2086 metadata is for extreme case corrections if the mastered video was done on a completely different mastering display that has no similar behavior to the consumer display.
We all know that the ST.2084 EOTF is static bit=luminance curve and has to behave exactly the same on all displays, so a logical question would be why even use ST.2086 metadata with a ST.2084 EOTF?
Well, 2084 is nits assuming 10 nit ambient light. So I'd argue that it's not entirely normative, and that a TV could appropriately use ambient light detection and adjust tone mapping when ambient light isn't close to 10 nits. This is analogous to Rec. 709 (SDR) which is nominally 100 nits peak with 10 nit ambient light.

Also, lots of TVs aren't 1000 nits. Lots are less, some are more. The TVs that are less need to roll off at the top of the curve to avoid clipping. The TVs that can do more might do so if ambient light is higher. Also, 2084 can go up to 10,000 nits. While most HDR-10 mastered content uses a 1000 nit peak, there is stuff out there at 2000 or 4000 nits mastered on Dolby's high-end grading monitors.

The mastering display color volume and peak luminance metadata is ignored because it's all in ST.2084 2020 container anyway, the TV just clips/compresses anything that's outside its own color gamut and luminance according to bit values not metadata.
The 2086 metadata is also there for cases where an external device is decoding, and so the display doesn't have access to the bitstream level metadata. HDMI 2.0a will pass those parameters from player to display.

maxCLL is there to avoid automatic brightness limiting (ABL) of the TV because of power limitation and the possibility of overheating and burning of the backlight LED; on secondary importance is predictable picture luminance.
maxFALL as maxCLL is nothing but power constraints to the mastering engineer because the consumer TV would explode above 200W (figuratively speaking), both do not actually have to be specified as a metadata but followed when grading/mastering.

To prevent overheating of the TV set obviously.
But will the TV actually lower the average light level to not overheat?, that is the question.
If it actually does something it'll break the precise ST.2084 curve.
For displays that do 1000 nits peak, we shouldn't assume they can do 1000 nits over a whole white frame! The peak measurements are generally for a relatively small patch of the screen, like 10%. I think every existing TV would display a white frame coded at 1000 nits at less than 1000 nits. For which both the EPA and your eyes thank you.

In this Sony HDR Camp clip all the 2086 metadata is perfectly useless and should destroy the picture if applied for anything (like in madVR), yet users say it looks excellent on their HDR TVs.
It means that the 2086 metadata is completely ignored by TVs for now.
TVs will look at this data, and as we move to PC-based playback, where 300-400 nits peak is more reasonable today, I would expect that the metadata would be useful if properly coded. There is also SMPTE 2094-40 in progress, which is an open-standard dynamic metadata. Think of MaxFALL and MaxCLL that can vary by shot/scene instead of fixed per title, plus other useful information like Min and Mean light levels.

surami

7th May 2016, 22:28

No. Only Dolby Vision does that, which was really a legacy mode for when 8-bit decoders were all there were. Dolby Vision now supports a couple single-layer 10-bit modes as well.

That command is for temporal scalability. Like encoding 60p source with a backwards-compatible 30p base layer and an enhancement layer that can then add every other frame back in. The enhancement layer is just extra non-reference b-frames with timestamps in between the base layer's. It's cool, but entirely orthogonal to HDR or spatial quality.

I just ran through the Dolby Vision white paper (v2) (http://www.dolby.com/us/en/technologies/dolby-vision/dolby-vision-white-paper.pdf). They mention two layers, a base layer + an enhancement layer + metadata (page 11). On the x265 command line options page: "A decoder may chose to drop the enhancement layer and only decode and display the base layer slices."

It could be that the Sony HDR sample has been encoded this way, and that's why we sometimes see the "weird", let's say enhancement, frames, and maybe that's why more fps is needed (Joe Kane also mentioned that more fps is needed). If madVR or LAV Splitter (I don't know exactly how this software works) or a hardware decoder catches the base layer (frames), we see the base layer; if not, we see the enhancement layer (frames). But there are systems like Dolby Vision, and who knows what other systems, that can merge the two layers together. It is like making a picture in Photoshop with two layers: there is the base layer, and on top you put an enhancement layer or difference layer (overlay, multiply, etc.) with opacity (the difference between the SDR and HDR picture, or between HDR10 and Dolby Vision). 2 or 4 extra bits are enough for the enhancement.

Not to forget that Cineform 12bit RGBA became a standard intermediate; it can deliver the two layers, RGB as base layer + A as enhancement/difference layer.

I'm just thinking out loud, maybe I'm wrong and nothing is right.

surami

8th May 2016, 10:06

I got tired of copying the newly encoded stuff to a USB 3.0 stick and plugging it into my TV's USB 3.0 interface for testing every time, so I decided to solve this somehow. I found a device called "DIGITUS USB 3.0 Sharing Switch"; I will order it next week, hopefully it will work. :)