Go Back   Doom9's Forum > Video Encoding > High Efficiency Video Coding (HEVC)

Old 3rd January 2020, 21:29   #21  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 5,132
Big picture, these values can be left at 0 for "undefined" and most HDR displays will do something pretty close to the right thing. It is a lot better to not specify a value at all than to specify a materially incorrect one.

There is a straightforward translation from Y' code values to nits. So as long as you have some way to measure the brightest pixel in Y', you could then convert that to nits for MaxCLL.

MaxFALL is trickier, since the nits of the mean of the code values can be quite different from the mean of the nits of all the code values. So every pixel needs to be converted to nits, the mean of those computed per frame, and then MaxFALL is the average light level of the brightest frame.

I think that for content already in Y'CbCr, nits can be derived directly from Y'. However, all the tools I know of convert to RGB and then calculate from there. That shortcut should be empirically tested before it gets used, but obviously calculating Y' -> nits would be a lot faster than going Y'CbCr -> RGB -> luma -> nits.
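As a minimal sketch of that Y'-to-nits conversion (assuming full-range PQ code values normalized to [0,1]; the ST 2084 constants are standard, the function name is made up):

```python
def pq_eotf(e):
    """SMPTE ST 2084 (PQ) EOTF: normalized code value e in [0,1] -> luminance in cd/m2."""
    m1 = 2610 / 16384         # 0.1593017578125
    m2 = 2523 / 4096 * 128    # 78.84375
    c1 = 3424 / 4096          # 0.8359375
    c2 = 2413 / 4096 * 32    # 18.8515625
    c3 = 2392 / 4096 * 32    # 18.6875
    p = e ** (1 / m2)
    return 10000.0 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

# brightest Y' sample in the stream -> MaxCLL candidate in nits
print(pq_eotf(1.0))    # 10000.0
print(pq_eotf(0.508))  # ~100 nits
```

Whether Y'-derived nits actually match the RGB-path result is exactly the empirical question raised above; this only shows the transfer function itself.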
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 3rd January 2020, 22:28   #22  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
@benwaggoner

The SMPTE reference implementation works with RGB values, so I figure that's the way to go. You're absolutely right that the mean of the code values is not the same as the mean of the nits values; the mean I calculate in my code is a mean of the nits values according to the algorithm in the reference pseudo-code. Unless I have made a mistake in implementing it, it should give the correct results.

Nowhere did I see references to using Y' code values. I presume Y' is a form of weighted average of the RGB values, whereas the reference implementation uses max(R,G,B) [aka the highest value channel] for the mean. So it would likely lead to different results again.
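For illustration, the max(R,G,B) approach can be sketched roughly like this (a simplified reading of the reference algorithm, not the SMPTE code itself; assumes full-range PQ-encoded R'G'B' normalized to [0,1]):

```python
def pq_eotf(e):
    """ST 2084 EOTF: normalized PQ code value in [0,1] -> nits."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    p = e ** (1 / m2)
    return 10000.0 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

def maxcll_maxfall(frames):
    """frames: iterable of frames, each a list of (r, g, b) PQ-encoded pixel tuples."""
    maxcll = maxfall = 0.0
    for frame in frames:
        # per pixel: take the brightest channel, then linearize to nits
        nits = [pq_eotf(max(px)) for px in frame]
        maxcll = max(maxcll, max(nits))
        maxfall = max(maxfall, sum(nits) / len(nits))  # frame-average light level
    return maxcll, maxfall
```

Because the EOTF is monotonic, taking the max before or after linearization gives the same per-pixel result.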

I think specifying metadata is better than just passing 0, because otherwise the display has to adapt dynamically to the frames it sees, which has a good chance of causing flickering or sudden brightness changes, though I could be wrong.
Old 4th January 2020, 11:05   #23  |  Link
suarsg
Registered User
 
Join Date: Dec 2018
Posts: 21
In a YUV420 10bit encoded video (HDR10), doesn't the Y-component of the pixel already define the brightness of the pixel?

For example, there's a flashlight in a dark scene located at coordinates (x=1000, y=800) in a frame. Decoding that frame and reading the Y component of the pixel at those coordinates gives you e.g. the value 643. Running 643 through the PQ EOTF gives 432.5 cd/m2.

Am I missing something?
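That arithmetic checks out if the 10-bit code value is normalized with limited-range scaling, (V - 64)/876, before applying the ST 2084 EOTF; a sketch (the function names are made up):

```python
def pq_eotf(e):
    """ST 2084 EOTF: normalized PQ code value in [0,1] -> nits."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    p = e ** (1 / m2)
    return 10000.0 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

def limited10_to_nits(v):
    """10-bit limited-range luma code value -> nits."""
    e = (v - 64) / 876.0
    return pq_eotf(min(max(e, 0.0), 1.0))

print(limited10_to_nits(643))  # ~432.5 cd/m2, matching the example above
```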
Old 4th January 2020, 13:28   #24  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
Well, the YUV data gets converted to RGB anyway for display. I think using YUV these days is mainly about saving bandwidth via chroma subsampling, but I could be wrong. I also think Y is more of a pseudo-brightness that is roughly proportional to perceived brightness, not really a physical measure of brightness.

This seems to be example code for converting RGB to YUV (it looks like the common 8-bit limited-range BT.601 matrix):

Quote:
Y = (0.257 * R) + (0.504 * G) + (0.098 * B) + 16

Cr = V = (0.439 * R) - (0.368 * G) - (0.071 * B) + 128

Cb = U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
So it's basically what I suspected, a weighted average.
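Those coefficients as a runnable sketch (they correspond to the common 8-bit limited-range BT.601 approximation; the rounding is added here for illustration):

```python
def rgb_to_yuv601_limited(r, g, b):
    """8-bit full-range R'G'B' -> approximate 8-bit limited-range BT.601 Y'CbCr."""
    y  =  0.257 * r + 0.504 * g + 0.098 * b + 16
    cb = -0.148 * r - 0.291 * g + 0.439 * b + 128
    cr =  0.439 * r - 0.368 * g - 0.071 * b + 128
    return round(y), round(cb), round(cr)

print(rgb_to_yuv601_limited(255, 255, 255))  # (235, 128, 128): limited-range white
print(rgb_to_yuv601_limited(0, 0, 0))        # (16, 128, 128): limited-range black
```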

Let's take an RGB value of 0.5, 0.6, 0.8. The Y value would be 0.5093 (Edit: Yeah dunno what my brain was thinking here, I wasn't aware that the code was for an 8bit limited YUV range, just ignore this please). Whereas max(R,G,B) would be 0.8. The RGB mean would be ~0.63.

The nits of the Y value would be 101.224

The nits of max(R,G,B) would be 1555.178

The nits of the RGB mean would be 323.845

So depending on what algorithm is used the results can differ quite a lot. max(R,G,B) is the official recommendation afaik.

Edit: I made a little mistake there with the RGB mean, since I averaged the code values, not the nits values. The nits average would be (92+244+1555)/3 = ~630
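The three statistics side by side, as a sketch (normalized full-range PQ code values; the 323.8 above comes from rounding the mean to 0.63, the exact mean 0.6333 gives about 334):

```python
def pq_eotf(e):
    """ST 2084 EOTF: normalized PQ code value in [0,1] -> nits."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    p = e ** (1 / m2)
    return 10000.0 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

r, g, b = 0.5, 0.6, 0.8

nits_max        = pq_eotf(max(r, g, b))                       # brightest channel: ~1555
nits_mean_codes = pq_eotf((r + g + b) / 3)                    # EOTF of the code mean: ~334
nits_mean       = (pq_eotf(r) + pq_eotf(g) + pq_eotf(b)) / 3  # mean of the nits: ~630
```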

Last edited by TomArrow; 4th January 2020 at 18:28.
Old 4th January 2020, 14:30   #25  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,406
Quote:
Originally Posted by TomArrow View Post
I think using YUV these days is mainly about saving bandwidth via color subsampling, but I could be wrong.
Subsampling is of course one reason (technically one could come up with an RGB subsampling scheme, similar to how Bayer filters work in image sensors and LCD displays, if one really wanted to), but there are also significant compression efficiency advantages from decorrelating the color planes, which reduces redundantly coded data. Compressing RGB without such a scheme would use more bandwidth.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 4th January 2020 at 14:33.
Old 4th January 2020, 15:44   #26  |  Link
suarsg
Registered User
 
Join Date: Dec 2018
Posts: 21
Quote:
Originally Posted by TomArrow View Post
Let's take an RGB value of 0.5, 0.6, 0.8. The Y value would be 0.5093. Whereas max(R,G,B) would be 0.8. The RGB mean would be ~0.63.

The nits of the Y value would be 101.22
Not sure why you're converting full-range RGB to limited-range 8-bit YUV and then comparing it to 10-bit RGB.

As per your example numbers, for full color range:
Code:
R = 0.5*1023 = 512
G = 0.6*1023 = 614
B = 0.8*1023 = 818
PQEOTF(818) = 2716 cd/m2

Y = 0.2627*R + 0.6780*G + 0.0593*B = 599
PQEOTF(599) = 270 cd/m2

(PQEOTF here normalizes the 10-bit code value as (V-64)/876 before applying the ST 2084 curve.)
But that wasn't really my point. My point was: the display won't output 10,000 nits if all of the pixels are RGB(0,1023,0), but your way of calculating MaxCLL would say 10,000. As a matter of fact, it would take just one fully lit green pixel in the entire movie to result in a MaxCLL of 10,000 with your calculation, when in reality this is no more than 750 cd/m2.
I'm not saying you're wrong in the sense that you're not following the instructions. I just don't understand why they'd suggest doing it that way. Unless we're both misinterpreting what they mean by:
Quote:
convert the pixel’s non-linear (R’,G’,B’) values to linear values (R,G,B) calibrated to cd/m2
Can someone post his interpretation of the above? The way I understand the above line is to apply the PQEOTF to the pixel values, since the pixel values represent luminance in a non-linear way but cd/m2 is linear.

They also have this note which sort of supports my above point:
Quote:
For MaxCLL, the unit is equivalent to cd/m2 when the brightest pixel in the entire video stream has the chromaticity of the white point of the encoding system used to represent the video stream. Since the value of MaxCLL is computed with a max() mathematical operator, it is possible that the true CIE Y Luminance value is less than the MaxCLL value. This situation may occur when there are very bright blue saturated pixels in the stream, which may dominate the max(R,G,B) calculation, but since the blue channel is an approximately 10% contributor to the true CIE Y Luminance, the true CIE Y Luminance value of the example blue pixel would be only approximately 10% of the MaxCLL value.
I just don't understand why they'd recommend it this way. This seems incredibly dumb. What am I missing?
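The note's blue-pixel caveat in numbers, as a sketch (linear light, BT.2020 luminance coefficients; with these, blue actually contributes about 6%, the note's ~10% figure is an approximation):

```python
# a fully saturated blue pixel, linear light normalized to [0,1]
r, g, b = 0.0, 0.0, 1.0

maxcll_contrib = max(r, g, b) * 10000                            # max(R,G,B): 10000 nits
cie_y = (0.2627 * r + 0.6780 * g + 0.0593 * b) * 10000           # true luminance: 593 nits
```

So a single saturated blue pixel can dominate MaxCLL while its true luminance is an order of magnitude lower, which is the situation the note describes.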

Last edited by suarsg; 4th January 2020 at 18:21.
Old 4th January 2020, 18:26   #27  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
Sorry, I wasn't aware of it being limited range or not; I just took the code from some website I found at a quick glance. I suppose that example was nonsense then, my bad.

Well, MaxCLL is supposed to be the brightest pixel, and a single RGB pixel, strictly speaking, consists of R, G and B subpixels on a display, so it makes sense to simply take the brightest one, because that one subpixel (R, G or B) would indeed reach 10,000 nits if that was its value.

It is indeed strange that they would use max(R,G,B) for the MaxFALL measurement, however, since it's literally supposed to be an average value. Beats me why they do it that way. My original approach was to just average the nits values of every single channel, and since that made much more sense to me too, I left that in the plugin as an alternate algorithm, as mentioned earlier.
Old 4th January 2020, 19:58   #28  |  Link
suarsg
Registered User
 
Join Date: Dec 2018
Posts: 21
Quote:
Originally Posted by TomArrow View Post
because that one pixel (R,G or B) would indeed reach 10,000 nits if that was its value
Yes, if the pixel is (1023,1023,1023). My example was (0,1023,0), though. In BT709 or sRGB you can only achieve the maximum luminance of a pixel when it's (255,255,255), i.e. when all three channels are at their maximum possible value (255).

I'd assume this is the same for SMPTE2084/HDR10. Anything else doesn't make sense in my opinion, but I'm not an expert.

As a counter example:
Let's think of a medium gray pixel with the values (128,128,128). If each channel were to emit the same luminance, your medium gray would not be gray but would take on a strong blue tint instead, because the blue channel would have to be driven far harder: the eye is much less sensitive to blue than to green or red.

Another example:
If (0,1023,0) was 10,000 nits, that would mean (0,1023,1023) would now be 20,000 nits, since the blue subpixel was dark before but is now also emitting 10,000 nits of luminance on top of the 10,000 nits the green subpixel was emitting. The spec's max however is 10,000 nits. So I don't see how anything other than (1023,1023,1023) can be 10,000 nits.
Old 4th January 2020, 20:49   #29  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
Hmm, good point about the eye's sensitivity, but I presume the nits value is simply the physical measurement, disregarding the human eye's sensitivity? That would certainly explain it in my mind.

Ultimately I think the physical limit will be imposed by how much energy the individual colored dot can output, not by how bright it appears to the eye. Even if the eye isn't very sensitive to blue, it will still heat up the component just as much as a red pixel of similar intensity.

Also, I don't think the intensity would add up like you say, since nits is candela per square meter: two pixels will have twice the candela but also twice the area, so it would still be 10,000 nits?
Old 5th January 2020, 01:20   #30  |  Link
suarsg
Registered User
 
Join Date: Dec 2018
Posts: 21
Quote:
Originally Posted by TomArrow View Post
Hmm good point about the eye's sensitivity, but I presume the nits value is simply the physical measurement disregarding the human eye's sensitivity? That would certainly explain it in my mind.

Ultimately I think the physical limit will be imposed by how much energy the individual colored dot can output, not how bright it appears to the eye. Since even if the eye isn't very sensitive to blue color, it will still heat up the component just as much as a red pixel of similar intensity.
That... makes no sense. It seems like you lack a basic understanding of how any of this works, and it almost feels like you keep misreading everything I write on purpose, then repeating it as the exact opposite.

Quote:
Originally Posted by TomArrow View Post
Also the intensity wouldn't add up like you say I think, since nits is candela per square meter, so two pixels will have twice the candela but also twice the area, so it would still be 10,000 nits?
That was a bad example on my part with regard to cd/m2. My example was about one and the same pixel, though: adding a second light source (the blue subpixel) will increase the brightness of that pixel, just like your room gets brighter the more white pixels your display shows.
Old 5th January 2020, 07:21   #31  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
Why would I misread something you write on purpose? The sheer paranoia! Yes, I'm after you and I only came to this forum specifically to piss you off. Everything I ever did in my life was specifically anchored around the final goal of going to a forum about video processing to purposely misread what YOU wrote. Finally I can rest in peace, having achieved my goal!
Old 5th January 2020, 08:12   #32  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
Aight, seems I misunderstood the measurement of luminance, but I think so did you.

Apparently the luminance/luminous intensity (cd) is already weighted by the human eye's sensitivity: https://en.wikipedia.org/wiki/Luminous_intensity

Quote:
Let's think of a medium gray pixel with the values (128,128,128). If we were to apply your luminance to each pixel's channel, each channel would have the same luminance output. This however, would make your medium gray not gray but give it a very blue tint instead.
So yes, it does make sense that a value of (128,128,128) with the ST2084 conversion would be neutral grey because luminance is already compensated for the human eye's spectral sensitivity.

The other way I misunderstood the nits measurement is in terms of the square meter. I thought it meant the emitter area, but it means the area over which the light output is measured.

Quote:
If (0,1023,0) was 10,000nits, that would mean (0,1023,1023) would now be 20,000 nits, since the blue-subpixel was dark before but is now also emitting 10,000nits luminance on top of the 10,000nits the green sub-pixel was emitting. The spec's max however is 10,000nits. So I don't see how anything other than (1023,1023,1023) can be 10,000nits.
So yeah, I think this is exactly how it works. The 10,000 nit limit is then likely the limit of each individual *physical* pixel.

Which makes more sense, since one physical pixel operates independently from the neighboring ones, so why should they be (from a physical perspective) arbitrarily grouped?

Wanted to think about one more angle, but my brain is falling asleep and I'm tired, will think about it more tomorrow.
Old 5th January 2020, 11:42   #33  |  Link
suarsg
Registered User
 
Join Date: Dec 2018
Posts: 21
Why don't you run a few movies through your application and compare your calculated MaxCLL/MaxFALL results with what it says in their own metadata?
Old 5th January 2020, 15:07   #34  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
Sure. I looked at 3 HDR demos I downloaded a while back and they all seem to lack the MaxCLL/MaxFALL data, so there's nothing to compare to. Do you have any test videos in mind that are suitable?
Old 5th January 2020, 15:40   #35  |  Link
suarsg
Registered User
 
Join Date: Dec 2018
Posts: 21
8 of the 10 most recently released UHD-BDs have MaxCLL/MaxFALL data. Pick any of them or check the latest one you already have. The tools to extract the HEVC stream exist.
Old 5th January 2020, 23:02   #36  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,484
Quote:
Originally Posted by suarsg View Post
If (0,1023,0) was 10,000nits, that would mean (0,1023,1023) would now be 20,000 nits, since the blue-subpixel was dark before but is now also emitting 10,000nits luminance on top of the 10,000nits the green sub-pixel was emitting. The spec's max however is 10,000nits. So I don't see how anything other than (1023,1023,1023) can be 10,000nits.
The way you (want to) do it would require the display to understand how to drive the subpixels. If (1023,1023,1023) is 4000 nits, the display could still drive the green subpixel harder so that it also reaches 4000 nits by itself.

In practice this is impossible, but that does not mean HDR video wouldn't prefer 4000 nits for all pure colors as well as white, or that HDR displays must drive the green subpixel the same at (1023,1023,1023) and (0,1023,0).
__________________
madVR options explained

Last edited by Asmodian; 6th January 2020 at 08:46.
Old 6th January 2020, 04:54   #37  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 5,132
Quote:
Originally Posted by TomArrow View Post
So yes, it does make sense that a value of (128,128,128) with the ST2084 conversion would be neutral grey because luminance is already compensated for the human eye's spectral sensitivity.
It's distracting to use full-range 8-bit values. All HDR encoding is limited-range Y'CbCr 4:2:0 10-bit.

Anyway, yes, RGB 128,128,128 won't have any chroma. In 10-bit YUV it'd translate to Y' around 512 with neutral chroma (Cb = Cr = 512). It's neutral in the sense that there is no chroma, but it's going to look quite white in PQ. The same values would be much more gray in 709.

Quote:
So yeah, I think this is exactly how it works. The 10,000 nit limit is then likely the limit of each individual *physical* pixel.

Which makes more sense, since one physical pixel operates independently from the neighboring ones, so why should they be (from a physical perspective) arbitrarily grouped?
MaxFALL and MaxCLL aren't relative to any physical pixel. How could they be? Some displays have R, G, B elements, others R, G, B, and W. 4K HDR content can be played on 1080p or 8K panels.

Also, an RGB of 0, 512, 0 is going to be a lot brighter than one that's 0, 0, 512, as green is a much bigger portion of luma than blue.

I might want to ruminate on this a bit more, but I think the conversion to luma used to figure out MaxFALL and MaxCLL is the same as the one used in converting to Y'CbCr, which is why actual brightness should be derivable with reasonable accuracy from Y'.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 6th January 2020, 10:18   #38  |  Link
suarsg
Registered User
 
Join Date: Dec 2018
Posts: 21
Quote:
Originally Posted by Asmodian View Post
The way you (want to) do it is have the display understand how to drive the subpixels. If (1023,1023,1023) is 4000 nits the display could still drive the green subpixel harder to have it also reach 4000 nits by itself.

In practice this is impossible but that does not mean HDR video would not prefer 4000nits for all pure colors as well as white or that HDR displays must drive the green subpixel the same when at (1023,1023,1023) and (0,1023,0).
Like I already mentioned twice, this was a poorly chosen example on my part. I was trying to keep it simple, without details convoluting the actual issue. It was an attempt to make him see how flawed his logic was without going into particulars. He was thinking way too much in sRGB terms, without even considering how RGB relates to brightness or how to convert an RGB tuple to luminance.

Hence for BT2020 (the color space of HDR10!) we have:
Code:
Y = 0.2627*R + 0.6780*G + 0.0593*B
Solving for R, G, B when Y=1.0 (=10,000 nits) makes it clear there's only one possible tuple. And in SMPTE ST 2084, that is (940+,940+,940+).
Not sure why you're bringing 4000 nits or display tech into this, as it doesn't really matter for the theory. Obviously there's no consumer display out there that can reach these values without tone mapping, and each display's tech is completely different at the sub-pixel level; how it derives a pixel's brightness from that sub-pixel structure is its own job to figure out.
The important part is: if you have an RGB tuple for a pixel, the pixel's brightness/luminance will not be based on the value of MAX(R,G,B) or AVG(R,G,B). My examples were there to demonstrate why that makes absolutely no sense.
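That constraint in a quick sketch (linear-light BT.2020 luminance, coefficients from the formula above):

```python
def bt2020_luminance_nits(r, g, b):
    """Linear-light BT.2020 RGB in [0,1] -> luminance in nits, 10,000-nit PQ peak."""
    return (0.2627 * r + 0.6780 * g + 0.0593 * b) * 10000

print(bt2020_luminance_nits(1, 1, 1))  # 10000.0: only full white reaches peak
print(bt2020_luminance_nits(0, 1, 0))  # 6780.0: pure green falls well short
```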

Quote:
Originally Posted by benwaggoner View Post
MaxFALL and MaxCLL aren't relative to any physical pixel. How could it be? Some displays have R, G, B elements, other R, G, B, and W. 4K HDR content can be played on 1080p or 8K panels.

Also, an RGB of 0, 512, 0 is going to be a lot brighter than one that's 0, 0, 512, as green is a much bigger portion of luma than blue.
That's why I stopped responding to him; he made no attempt at understanding why that is, but instead went on a rant about his life goals.

Quote:
Originally Posted by benwaggoner View Post
I might want to ruminate on this a bit more, but I think that the conversion to luma used to figure out MaxFALL and MaxCLL is the same used in converting to Y'CbCr, which is why actual brightness should be able to derived with reasonable accuracy from Y'.
I agree and this was my first post regarding this:
Quote:
Originally Posted by suarsg View Post
In a YUV420 10bit encoded video (HDR10), doesn't the Y-component of the pixel already define the brightness of the pixel?
Old 6th January 2020, 23:18   #39  |  Link
TomArrow
Registered User
 
Join Date: Dec 2017
Posts: 90
Quote:
Originally Posted by benwaggoner View Post
MaxFALL and MaxCLL aren't relative to any physical pixel. How could it be? Some displays have R, G, B elements, other R, G, B, and W. 4K HDR content can be played on 1080p or 8K panels.

Also, an RGB of 0, 512, 0 is going to be a lot brighter than one that's 0, 0, 512, as green is a much bigger portion of luma than blue.

I might want to ruminate on this a bit more, but I think that the conversion to luma used to figure out MaxFALL and MaxCLL is the same used in converting to Y'CbCr, which is why actual brightness should be able to derived with reasonable accuracy from Y'.
You're right: the physical pixel probably doesn't correspond to the "perfect" pixel in terms of HDR primaries in most cases. Kind of an intriguing problem, to be honest, because ultimately I think that's what the TV needs to know: how hard it will have to drive any individual physical pixel. Still, I'd figure it's more useful for the TV to know the maximum brightness of any of the HDR primaries than the Y value, since Y could mean a wider range of different things. E.g. a value of 0.0593 for Y could mean a blue channel at 1.0 (normalized to 0.0-1.0), or it could mean a green channel at about 0.09. I figure the TV would have to assume the highest possible value and thus end up using more tone mapping than necessary, since the corresponding maximum channel value could have been as low as 0.09 but it had to assume 1.0.

Be that as it may: do you mean the encoded Y' value in an HDR PQ encoded video when you say Y', or a separately calculated brightness based on the RGB PQ data? If you mean the former, I looked up some PDFs earlier today, and I believe most deliverables these days actually do the RGB to YUV conversion for PQ *after* the transfer function is already applied, so reading that Y' value and converting it to nits is a pretty invalid thing to do. I believe this distinction is called "non-constant" vs "constant" luminance, where the latter is computed from the linear RGB values and the former (the typical one, apparently) from the already PQ-encoded R'G'B' values.

To demonstrate, I did a bit of math:
Quote:
Y = 0.2627*R + 0.6780*G + 0.0593*B

PQ equivalents (roughly)
100 nits = 0.508
400 nits = 0.652
1600 nits = 0.803

With neutral grey, Y value is same as PQ value, since 0.2627 + 0.6780 + 0.0593 = 1

Let's say we have a "pure HDR red" of 100 nits and 400 nits:
(0.508,0,0)
(0.652,0,0)

Converted to nits from RGB we get roughly:

(100,0,0)
(400,0,0)

Let's convert to Y:
0.508*0.2627 = 0.1334516
0.652*0.2627 = 0.1712804

Now convert that value to nits:
0.714005 nits
1.491089 nits

Divide the second value through the first and we get
1.491089 / 0.714005 = 2.08834

We completely lose linearity. The higher value should be almost exactly 4 times as high as the lower value, but instead it is only 2.088 times as high. And the resulting value of course is also nowhere near the proper value.
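The loss of linearity above is easy to reproduce (a sketch; the small differences from the quoted numbers come from the rounded PQ values 0.508 and 0.652):

```python
def pq_eotf(e):
    """ST 2084 EOTF: normalized PQ code value in [0,1] -> nits."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    p = e ** (1 / m2)
    return 10000.0 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

r100, r400 = 0.508, 0.652   # PQ-encoded "pure red" at roughly 100 and 400 nits

# non-constant luminance: weight the PQ-encoded value first, then linearize
lo = pq_eotf(0.2627 * r100)
hi = pq_eotf(0.2627 * r400)
print(hi / lo)              # roughly 2.1, nowhere near the ~4x ratio of the true luminances
```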
Quick question, by the way: does the white point imply the point where all color components have equal perceptual brightness? I always intuitively assumed that, but now I'm questioning it a bit, since for example an sRGB (or linear sRGB) value of (0,255,0) appears brighter than (0,0,255), yet (255,255,255) is perceived as white.

I suppose the answer is no. Gotta wrap my head around that. Hah!
Old 6th January 2020, 23:20   #40  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 5,132
Quote:
Originally Posted by Asmodian View Post
The way you (want to) do it is have the display understand how to drive the subpixels. If (1023,1023,1023) is 4000 nits the display could still drive the green subpixel harder to have it also reach 4000 nits by itself.

In practice this is impossible but that does not mean HDR video would not prefer 4000 nits for all pure colors as well as white or that HDR displays must drive the green subpixel the same when at (1023,1023,1023) and (0,1023,0).
Yeah. Current display technologies make blue the easiest color to make bright, which is why brighter TV settings are also bluer. Since proper "movie" settings require a warm D65 white point, that caps the maximum bright white possible. Adding a white subpixel mitigates that somewhat, at the cost of reducing the maximum brightness of saturated colors.

Tone mapping is a complex and interesting field, and there is definitely not any one way to do it. Tradeoffs around how to optimally preserve brightness, hue, and saturation without banding involve a lot of proprietary alchemy by TV companies.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book