View Full Version : Google VP9 "Next Generation Open Video" information posted



huhn
15th November 2015, 19:29
I don't understand why the colormatrix vpxenc sRGB is equal to the BT709? At least in HEX it looks like that.

The colormatrix is equal; the difference between sRGB and BT709 is the transfer (gamma).
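
To illustrate the transfer (gamma) difference mentioned above, here is a minimal Python sketch of the two encoding curves as published in IEC 61966-2-1 (sRGB) and ITU-R BT.709. The function names are made up for this example; this is not code from vpxenc or any decoder.

```python
def srgb_oetf(linear):
    """sRGB encoding curve: linear light in 0..1 -> encoded value in 0..1."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055


def bt709_oetf(linear):
    """BT.709 encoding curve: linear light in 0..1 -> encoded value in 0..1."""
    if linear < 0.018:
        return 4.5 * linear
    return 1.099 * linear ** 0.45 - 0.099


if __name__ == "__main__":
    # Same linear input, different encoded value for each curve.
    for linear in (0.0, 0.01, 0.18, 0.5, 1.0):
        print(f"{linear:5.2f}  sRGB={srgb_oetf(linear):.4f}  BT.709={bt709_oetf(linear):.4f}")
```

Both curves agree at 0 and 1 and differ in between, which is why the same pixel values look different depending on which transfer the decoder assumes.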

Jamaika
15th November 2015, 21:37
This is true.
Changing the color matrix (RGB <-> YUV_bt709) in the BPG (x265) codec completely changes the compressed output.
For VP9 the output is almost the same in HEX, unless that RGB is also bt709.
http://i64.tinypic.com/14juebq.png
https://www.sendspace.com/filegroup/H%2FCXE9aJ2azF0F6zVclLNEjYZbXOBNSB
vpxenc.exe --aq-mode=1 --verbose --threads=4 --i444 --profile=1 --best --codec=vp9 --fps=25000/1000 --cpu-used=1 -w 3840 -h 2160 --passes=1 --pass=1 --drop-frame=0 --end-usage=q
--color-space=bt709/sRGB - -o "vp90_1.5.0-110_bt709/sRGB.webm"
bpgenc.exe -v -q 0 -e x265 -f 444 -c ycbcr_bt709/rgb -limitedrange -b 8 -m 9 -keepmetadata "input_RGB.png" -o "x265_0.9.6_yuv444_bt709/sRGB.bpg"

foxyshadis
16th November 2015, 01:05
Well you're not inputting rgb into vpxenc are you? (Hard to tell, since it's piped in.) The color-space is just a tag in the file to facilitate proper decoding to rgb, assuming the decoder doesn't just ignore it. If you input yuv and tell the encoder it's rgb, or vice versa, it'll be decoded all wrong.

BPGenc is nice enough to convert to yuv if you specify bt709 when it knows you're inputting rgb, so of course the output would be completely different.

benwaggoner
16th November 2015, 01:35
@LigH
We wrote theory. Let's get to the facts.
What can I enter minimal good GOP for movies 25, 50,100 fps now?
For me the default GOP=250 for vpx is too large.

PS. No problem with GOP=25 for X265.

Well, the shorter the GOP the lower the compression efficiency, and the greater the risk of key frame strobing. H.264/H.265 have pretty rich reference list features that let a decoder skip lots of B-frames when seeking. Something like that is theoretically possible in VP9 using alt-ref frames, but I don't know if it has been implemented in the existing encoders or decoders.

If you look at a typical Long GOP encode in x264/x265, you get mainly B-frames or b-frames, which means most frames don't need to be decoded when seeking far into the GOP. Mainly just the P-frames.

Jamaika
16th November 2015, 07:31
The color-space is just a tag in the file to facilitate proper decoding to rgb, assuming the decoder doesn't just ignore it.
That's true, but I haven't seen a decoder or viewer that handles VP9 sRGB.
FFplay/MPC-BE/LAV Splitter read it as follows: :(
Stream #0:0(eng): Video: vp9 (Profile 1), gbrp(pc, gbr/unknown/unknown), 3840x2160, SAR 1:1 DAR 16:9, 1k tbr, 1k tbn, 1k tbc (default)
Something like that is theoretically possible in VP9 using alt-ref frames, but I don't know if it has been implemented in the existing encoders or decoders.
I gave up and won't look into what was missing.

LigH
16th November 2015, 09:06
Something like that is theoretically possible in VP9 using alt-ref frames, but I don't know if it has been implemented in the existing encoders or decoders.

At least the CLI option is available in my last released version, v1.4.0-1652-gd7bbe1a:

VP9 Specific Options:
...
--auto-alt-ref=<arg> Enable automatic alt reference frames

But I don't know if it changes the output.

Motenai Yoda
17th November 2015, 14:33
IIRC --auto-alt-ref works in 2-pass mode only

MoSal
17th November 2015, 14:46
@LigH
We wrote theory. Let's get to the facts.
What can I enter minimal good GOP for movies 25, 50,100 fps now?
For me the default GOP=250 for vpx is too large.

PS. No problem with GOP=25 for X265.


kf_max_dist is not set to a default 250 in ffmpeg. That's probably why you had problems with seeking. There should be no seeking issues if you actually pass '-g 250' and force an intra frame every 5 seconds.
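
For reference, the longest gap between keyframes in seconds is just the maximum GOP length divided by the frame rate. A minimal sketch of that arithmetic for the frame rates asked about earlier; the function name is hypothetical, and the rational form of the frame rate mirrors the `--fps=25000/1000` style used with vpxenc above.

```python
from fractions import Fraction


def keyframe_interval_seconds(max_gop_frames, fps):
    """Longest time between keyframes for a given maximum GOP length.

    `fps` may be an int or a rational string such as "25000/1000".
    """
    return Fraction(max_gop_frames) / Fraction(fps)


if __name__ == "__main__":
    for fps in (25, 50, 100, "25000/1000"):
        seconds = float(keyframe_interval_seconds(250, fps))
        print(f"GOP=250 at {fps} fps -> keyframe at least every {seconds} s")
```

With a maximum GOP of 250 this gives 10 s at 25 fps, 5 s at 50 fps and 2.5 s at 100 fps, so the same GOP value means quite different seek granularity depending on the source.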

Jamaika
17th November 2015, 17:32
That isn't the cause. The same effect occurs at the same frame with GOP=250.
Tested on decoders MPC-BE 1.4.6.935 and LAV Video 0.66.

MoSal
17th November 2015, 20:30
That isn't the cause. The same effect occurs at the same frame with GOP=250.
Tested on decoders MPC-BE 1.4.6.935 and LAV Video 0.66.

Works for me in mpv (GNU/Linux system). Seeking between keyframes is fast. Accurate seeking is a little bit slow, but nothing drastic.

xyzone
20th November 2015, 10:22
I wonder about yuv422, yuv444, and 10-bit+ vp9 encodes of profile 2 and 3. It doesn't seem like anything but profile 0 yuv420 is getting much hardware support. yuv422 would suit some of my purposes better, since heavy contrast reds tend to stay sharper, but strangely enough, even chrome only supports yuv444, but not even yuv422 in profile 0. I've done some experiments with 10-bit vp9 (yuv420), and in some cases there are some efficiency advantages when encoding very flat and still content. So is there any point in encoding in the more exotic vp9 profiles and pixel formats, or are they DOA as far as potential hardware support goes?

nevcairiel
20th November 2015, 11:33
but strangely enough, even chrome only supports yuv444, but not even yuv422 in profile 0.

Anything but yuv420 is invalid in profile 0, you would need profile 1 for 422/444, and profile 2/3 for 10-12 bits
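
The profile rule above can be written down as a small lookup. This is only a sketch of the mapping defined by the VP9 bitstream profiles (0: 8-bit 4:2:0, 1: 8-bit 4:2:2/4:4:0/4:4:4, 2: 10/12-bit 4:2:0, 3: 10/12-bit 4:2:2/4:4:0/4:4:4); the function name and the string labels for chroma formats are invented for the example.

```python
def vp9_profile(bit_depth, chroma):
    """Return the VP9 profile number required for a bit depth and chroma format.

    `chroma` is one of "420", "422", "440", "444" (labels chosen for this sketch).
    """
    if bit_depth not in (8, 10, 12):
        raise ValueError(f"VP9 has no profile for {bit_depth}-bit video")
    if chroma not in ("420", "422", "440", "444"):
        raise ValueError(f"unknown chroma format {chroma!r}")
    high_bit_depth = bit_depth > 8   # 10/12-bit needs profile 2 or 3
    non_420 = chroma != "420"        # anything but 4:2:0 needs profile 1 or 3
    return (2 if high_bit_depth else 0) + (1 if non_420 else 0)


if __name__ == "__main__":
    for depth, chroma in ((8, "420"), (8, "422"), (8, "444"), (10, "420"), (12, "444")):
        print(f"{depth}-bit {chroma} -> profile {vp9_profile(depth, chroma)}")
```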

Jamaika
20th November 2015, 14:09
I've done some experiments with 10-bit vp9 (yuv420), and in some cases there are some efficiency advantages when encoding very flat and still content.
In my opinion, it makes sense to have a 10-bit monitor.
I have also heard the opinion that only laymen make films in 10-16bit. The human eye is unable to perceive more colors than in the movies 8bit.

PS You can't get lossless quality by converting yuv420 to yuv444 or rgb!
You can't get lossless quality by converting bt709 to bt2020!

xyzone
21st November 2015, 00:12
Anything but yuv420 is invalid in profile 0, you would need profile 1 for 422/444, and profile 2/3 for 10-12 bits

You're right. I was talking about 422 in profile 1. Chrome can't play it, but it can 444.

xyzone
21st November 2015, 00:16
In my opinion, it makes sense to have a 10-bit monitor.
I have also heard the opinion that only laymen make films in 10-16bit. The human eye is unable to perceive more colors than in the movies 8bit.

PS You can't get lossless quality by converting yuv420 to yuv444 or rgb!
You can't get lossless quality by converting bt709 to bt2020!

It's supposed to be about more efficient compression (of flat content), not higher baseline in quality. I have no idea if it's true, but I've seen this argument a lot coming from 10-bit h264 anime advocates.

Nintendo Maniac 64
21st November 2015, 06:38
It doesn't seem like anything but profile 0 yuv420 is getting much hardware support

AFAIK most h.264 hw decoders are only 8bit 4:2:0 as well...

xyzone
21st November 2015, 12:53
AFAIK most h.264 hw decoders are only 8bit 4:2:0 as well...

Yes, I know. But does that mean newer and better codecs should do the same?

foxyshadis
22nd November 2015, 02:00
The human eye is unable to perceive more colors than in the movies 8bit.

That's not even remotely true anymore; that's based on crappy old TVs with a terrible gamut. Sure there are still lots of 6bit TVs and monitors out there, but the difference of a truly high-end TV is obvious to most people. Both banding reduction and wider gamut are important.

In motion it doesn't matter, of course, like many things.

Jamaika
22nd November 2015, 08:30
Maybe, but the opinion is that the gamut only matters on the recipient's side. A 24-bit (8x3) image has 16.8 million colors and a 30-bit (10x3) image has 1 billion colors, while a young person perceives only ~25 million. Isn't banding reduction just a matter of how the 8bit codec is implemented?

LigH
23rd November 2015, 17:16
VPX v1.5.0-132-g16eba81 (http://www.mediafire.com/download/vgw5x3eee7ax3xt/vpx_v1.5.0-132-g16eba81.7z) (GCC 5.2.0, VP8/VP9/VP10, Win32/Win64)

Motenai Yoda
24th November 2015, 00:28
Maybe, but the opinion is that the gamut only matters on the recipient's side. A 24-bit (8x3) image has 16.8 million colors and a 30-bit (10x3) image has 1 billion colors, while a young person perceives only ~25 million. Isn't banding reduction just a matter of how the 8bit codec is implemented?

Actually, for bt.709 you'll need at least 10 bit yuv to get all the colors of 8 bit RGB, as yuv is intended as a space to store RGB colors in a color opponent colorspace.
Maybe for a bt.709 RGB gamut 8bit will be enough, but for larger colorspaces such as DCI P3 (video) or AdobeRGB (photo), at least 11bit will be needed, and for even larger ones such as bt.2020 (UHD BD/TV) or ProPhoto (some imaginary colors) it will take even 12 bit or more.
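
The 8-bit vs 10-bit point can be checked with a small round-trip experiment. This is a rough sketch under stated assumptions: full-range 8-bit R'G'B' in, limited-range BT.709 Y'CbCr out (16-235 / 16-240 scaled to the target depth), plain rounding, and only a sampled grid of colours rather than all 16.7 million. All function names are made up for the example.

```python
KR, KB = 0.2126, 0.0722          # BT.709 luma coefficients
KG = 1.0 - KR - KB


def rgb_to_ycbcr(r, g, b, bits):
    """8-bit full-range R'G'B' -> limited-range Y'CbCr codes at `bits` depth."""
    scale = 1 << (bits - 8)
    rn, gn, bn = r / 255, g / 255, b / 255
    y = KR * rn + KG * gn + KB * bn
    cb = (bn - y) / (2 * (1 - KB))
    cr = (rn - y) / (2 * (1 - KR))
    return (round((16 + 219 * y) * scale),
            round((128 + 224 * cb) * scale),
            round((128 + 224 * cr) * scale))


def ycbcr_to_rgb(yc, cbc, crc, bits):
    """Inverse of rgb_to_ycbcr, rounded back to 8-bit R'G'B'."""
    scale = 1 << (bits - 8)
    y = (yc / scale - 16) / 219
    cb = (cbc / scale - 128) / 224
    cr = (crc / scale - 128) / 224
    rn = y + 2 * (1 - KR) * cr
    bn = y + 2 * (1 - KB) * cb
    gn = (y - KR * rn - KB * bn) / KG

    def to_code(v):
        return min(255, max(0, round(v * 255)))

    return to_code(rn), to_code(gn), to_code(bn)


def surviving_fraction(bits, step=15):
    """Fraction of sampled 8-bit RGB colours that survive the round trip exactly."""
    levels = range(0, 256, step)
    total = survived = 0
    for r in levels:
        for g in levels:
            for b in levels:
                total += 1
                survived += ycbcr_to_rgb(*rgb_to_ycbcr(r, g, b, bits), bits) == (r, g, b)
    return survived / total


if __name__ == "__main__":
    for bits in (8, 10):
        print(f"{bits}-bit Y'CbCr: {surviving_fraction(bits):.1%} of sampled RGB colours survive")
```

On this sampled grid the 10-bit round trip should recover every colour while the 8-bit one loses some of them; an exhaustive run over all 256^3 colours would be needed for exact percentages.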

The reduction of banding, IMHO, depends on 2 primary factors:
1 - some sources have banding that is hidden during mastering with noise/grain, debanding or dither; during compression, lossy codecs often remove/attenuate high-frequency details, and so "unhide" the banding;
2 - in case 1, or with a high bit depth source, the codec can't process/store/reproduce high-precision values for low frequencies, so those get rounded to low precision (8bit). As this precision isn't enough to maintain continuity between blocks, and isn't even linked between planes, a lot of gradients will show an interference that causes banding or color banding.

dapperdan
5th December 2015, 22:23
The latest Chrome beta release supports VP9 in WebRTC

http://blog.chromium.org/2015/12/chrome-48-beta-present-to-cast-devices_91.html

Selur
20th January 2016, 06:28
Does anyone have a working vpxenc binary with high bit depth support?
The one I built with https://github.com/jb-alvarado/media-autobuild_suite always crashes. (Plus, they removed "--input-bit-depth=X" and "--bit-depth=X" from the vpxenc help...)

Jamaika
20th January 2016, 14:50
The input bit depth options were removed from vpxenc. So in the new Hybrid 2016, ffmpeg/vpxenc will only have 8bit BT709.

pandy
20th January 2016, 15:29
Maybe, but the opinion is that the gamut only matters on the recipient's side. A 24-bit (8x3) image has 16.8 million colors and a 30-bit (10x3) image has 1 billion colors, while a young person perceives only ~25 million. Isn't banding reduction just a matter of how the 8bit codec is implemented?

Nope - a trained person is capable of perceiving somewhere between 12 and 14 bit (and usually such bit depth is used in medical image processing/displays); an untrained person is able to perceive 10 bit relatively easily. Please don't confuse this with the perceived number of colours (a typical human male is able to name only a few of them, whereas human females usually have different names for different colours, and for them green is definitely not dark mint).
In absolute terms we are limited in how many colours we can perceive, but in relative terms, when shades are involved, 8 bit is definitely not enough, and banding is the classic example.

zerowalker
20th January 2016, 15:34
Didn't know banding had something to do with how we perceive it; I thought it was purely computational rounding errors and nothing more in that sense.
I mean, can't we kind of hide the rounding errors through dither, so that we perceive it as "higher quality" while it's still the same amount of information?

mzso
20th January 2016, 15:48
Nope - a trained person is capable of perceiving somewhere between 12 and 14 bit (and usually such bit depth is used in medical image processing/displays); an untrained person is able to perceive 10 bit relatively easily. Please don't confuse this with the perceived number of colours (a typical human male is able to name only a few of them, whereas human females usually have different names for different colours, and for them green is definitely not dark mint).
In absolute terms we are limited in how many colours we can perceive, but in relative terms, when shades are involved, 8 bit is definitely not enough, and banding is the classic example.

I highly doubt there's any difference in color perception between genders. Women might be able to name more shades, because of their much higher interest in shallow fashion and such crap.

zerowalker
20th January 2016, 15:51
Pretty sure women actually see more colors.
I think it's believed to be some evolutionary trait, because they may have looked for berries and colors mattered a lot, something along those lines.
Sounds stupid, but there was a much better explanation than the one I gave ;P

LigH
20th January 2016, 16:03
Oh, I guess there is another level of difference: perceiving vs. realizing the difference of colors (not sure if that is an appropriate term at all). Women and men may possibly have a similar threshold for which difference they will recognize as "different colors", and that will also depend on the hue (bluegreen to violet is usually a range where differences are recognized best; on the other hand, it lacks names for individual colors). But the conclusions may be different. You know, women recognizing faint shades of red in their husbands' face had a better chance to survive their anger about a missed hunt... :o

zerowalker
20th January 2016, 16:09
You know, women recognizing faint shades of red in their husbands' face had a better chance to survive their anger about a missed hunt...
Haha, love this, made me laugh xd

Truly the thing that was the sole reason for the survival of the human species.

Jamaika
20th January 2016, 16:12
Nope - a trained person is capable of perceiving somewhere between 12 and 14 bit (and usually such bit depth is used in medical image processing/displays); an untrained person is able to perceive 10 bit relatively easily. Please don't confuse this with the perceived number of colours (a typical human male is able to name only a few of them, whereas human females usually have different names for different colours, and for them green is definitely not dark mint).
In absolute terms we are limited in how many colours we can perceive, but in relative terms, when shades are involved, 8 bit is definitely not enough, and banding is the classic example.
I posted this as a motto. On a Polish film forum there was a fuss about the point of cameras recording 10bit video. The problem is archiving. The main thought was that the ProRes codec is needed only for scaling 8bit movies, and that's all. The rest is advertising.
Maybe for a bt.709 RGB gamut 8bit will be enough, but for larger colorspaces such as DCI P3 (video) or AdobeRGB (photo), at least 11bit will be needed, and for even larger ones such as bt.2020 (UHD BD/TV) or ProPhoto (some imaginary colors) it will take even 12 bit or more.
Good idea. Unfortunately I only read the tags; does the potential client have a TV that supports BT2020 and converts colors correctly?

vivan
20th January 2016, 18:15
Didn't know banding had something to do with how we perceive it; I thought it was purely computational rounding errors and nothing more in that sense.
I mean, can't we kind of hide the rounding errors through dither, so that we perceive it as "higher quality" while it's still the same amount of information?

The reason you can see banding (even if you generate a clean gradient) is that we can see more colors (see the difference between 2 close shades) than the display can show.

Dithering hides display and/or format limitation (the reason why you have to round), it gives you more average precision while sacrificing local precision - pixels have different values from what they are supposed to have, while average value is closer to the desired one. With dithering higher output bitdepth gives you less noise. This (and fp yuv->rgb conversion) is why output and video bitdepth are completely unrelated.
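
The "more average precision, less local precision" trade-off is easy to see on a flat area whose ideal value falls between two 8-bit codes. A minimal sketch using 1-D error diffusion (one simple dithering method among many, picked here only because it is deterministic); the function names are invented for the example.

```python
def quantize_plain(samples):
    """Round every sample independently to the nearest integer code."""
    return [round(s) for s in samples]


def quantize_error_diffusion(samples):
    """Round each sample, carrying the rounding error over to the next one."""
    out = []
    error = 0.0
    for s in samples:
        wanted = s + error
        code = round(wanted)
        error = wanted - code      # what we failed to represent at this pixel
        out.append(code)
    return out


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    flat_area = [100.3] * 1000     # ideal value sits between codes 100 and 101
    plain = quantize_plain(flat_area)
    dithered = quantize_error_diffusion(flat_area)
    print("plain:   ", sorted(set(plain)), "mean", mean(plain))
    print("dithered:", sorted(set(dithered)), "mean", round(mean(dithered), 3))
```

Plain rounding stores 100 everywhere, so the 0.3 is simply lost; the dithered version mixes 100s and 101s so that individual pixels are "wrong" but the average stays close to 100.3.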

mzso
20th January 2016, 19:25
Pretty sure women actually see more colors.
I think it's believed to be some evolutionary trait, because they may have looked for berries and colors mattered a lot, something along those lines.
Sounds stupid, but there was a much better explanation than the one I gave ;P

I'm still 100% skeptical.
Also, the "women were gathering while the men hunted in prehistoric times" hypothesis is pretty unfounded. I remember a paper that concluded that men are actually better at spotting food to be gathered. And it doesn't make much sense. In my opinion it's most likely that men hunted, meanwhile gathering whatever they found. Women were taking care of children, meanwhile doing some soft labor, and didn't drag them around.

Plus all the famous painters ever were men.

zerowalker
20th January 2016, 19:32
Ah well i don't know much about it, just saw some documentary about it and therefore mentioned it.

However, famous painters being men doesn't tell us much, as men are usually of higher status than women, and that probably plays some role.
Then again, painting is more about abstract stuff than actual color.

But there are probably some scientific tests that have been done to see how men differ from women in eyesight.
There should be some difference, as with everything else, as we aren't built identically. Whether it's noticeable, however, is another story.

Not denying either one as i am not too well read in this topic;P

mzso
20th January 2016, 20:03
But there are probably some scientific tests that have been done to see how men differ from women in eyesight.
There should be some difference, as with everything else, as we aren't built identically.

We're not identical, on an individual level. It doesn't necessarily mean that the two genders have different capabilities in any/every trait.

zerowalker
20th January 2016, 20:05
We're not identical, on an individual level. It doesn't necessarily mean that the two genders have different capabilities in any/every trait.

Indeed, hence what i meant it being noticeable or not.
If it's not, then it's basically in the "individual" measure as you say.

benwaggoner
21st January 2016, 00:40
Nope - a trained person is capable of perceiving somewhere between 12 and 14 bit (and usually such bit depth is used in medical image processing/displays); an untrained person is able to perceive 10 bit relatively easily.
With specifically designed test patterns! For natural images and some appropriate dithering, 10, 12, and 14 should be pretty well identical for Rec. 709 images. Even 8-bit can be great with enough pixels. HDR needs at least 10.

zerowalker
21st January 2016, 01:51
I am confused with the concept of HDR.
Isn't that basically a system that tries to force a higher dynamic range, but it doesn't do so by pushing the limits;
it's not that the actual range is higher at all, it simply crushes the colours to force details in the dark at the cost of other things, etc.?

foxyshadis
21st January 2016, 04:38
I am confused with the concept of HDR.
Isn't that basically a system that tries to force a higher dynamic range, but it doesn't do so by pushing the limits;
it's not that the actual range is higher at all, it simply crushes the colours to force details in the dark at the cost of other things, etc.?

HDR is merely a wider gamut of color (hue and tone) in a range. You can display that wider gamut on a device with a lower gamut by compressing what were whites and blacks into greys, kind of like incorrectly converting TV-levels YUV, but you can also dynamically emphasize the brighter parts or the darker parts, and get higher quality gamma conversions for things like a "night mode" or "daytime mode" -- it's more flexible. Movies can easily be graded with the information allowing the player to turn an HDR stream into an intended standard DR stream on playback, by choosing the white and black points and gamma. And of course, you can just have a screen that shows the wider gamut natively! Screens just a few years ago struggled to reach sRGB reliably, but now regularly exceed it, thanks to advances in LED backlighting and OLED, so it's the perfect time for a new standard.

One of the biggest problems with standard dynamic range is its crushed, banded shadows, and an HDR TV offers the possibility of completely fixing that without dithering to the point of killing shadow detail. Similarly in bright areas, but the eye tends to shy away from them, especially in a dark room, so it's not as noticeable. Getting better fidelity is something everyone can cheer.

vivan
21st January 2016, 07:45
Even 8-bit can be great with enough pixels.

With enough pixels even 1-bit could be enough. Just like all printing is 1-bit.

mzso
21st January 2016, 14:34
With enough pixels even 1-bit could be enough. Just like all printing is 1-bit.

Colour printing is not 1 bit. You need 2-3 for the 3-4 colors they use.

vivan
21st January 2016, 19:30
Colour printing is not 1 bit. You need 2-3 for the 3-4 colors they use.

Even if you can control dot size (I believe that common inkjet printers don't do that, their resolution is high enough) you still either have ink in a specific point or don't.

Nintendo Maniac 64
22nd January 2016, 01:12
A better example would be DSD - it's an audio stream that's only 1 bit but has a sampling rate of 2822.4 kHz.

LigH
22nd January 2016, 09:16
I love dithering. There is a library providing some image quantizations with error distributing dithering algorithms, named "libcaca" (yes, with a turd as icon, how funny).

mzso
22nd January 2016, 10:47
Even if you can control dot size (I believe that common inkjet printers don't do that, their resolution is high enough) you still either have ink in a specific point or don't.

Yes, but you have one of 4-5 colors in that point. CMY(K) and white. So not 1 bit.

nevcairiel
22nd January 2016, 11:15
Yes, but you have one of 4-5 colors in that point. CMY(K) and white. So not 1 bit.

It's one bit per component. Just like when we talk about 8-bit video, it's not 8-bit overall, it's 8-bit per component for 24 in total.

mzso
22nd January 2016, 14:04
It's one bit per component. Just like when we talk about 8-bit video, it's not 8-bit overall, it's 8-bit per component for 24 in total.

Okay. We typically refer to that as "24 bit", don't we?

nevcairiel
22nd January 2016, 14:07
Okay. We typically refer to that as "24 bit", don't we?

Not really, everyone is talking about 8-bit or 10-bit video around here. 24 or 30-bit are rare.

LigH
22nd January 2016, 20:03
A resolution per component is less ambiguous, because there is usually some chroma subsampling, therefore Cb (U) and Cr (V) components won't add 100% of their value to each pixel. And furthermore, greyscale video (Y only) does exist as well.
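
To put numbers on why a per-pixel total is ambiguous with subsampling: the average storage per pixel depends on how many chroma samples each luma sample gets. A small sketch of that arithmetic; the table keys and function name are chosen for this example only.

```python
# Chroma samples (Cb + Cr together) stored per luma sample.
CHROMA_SAMPLES_PER_LUMA = {
    "400": 0.0,   # greyscale, Y only
    "420": 0.5,   # Cb and Cr each at quarter resolution
    "422": 1.0,   # Cb and Cr each at half resolution
    "444": 2.0,   # Cb and Cr at full resolution
}


def average_bits_per_pixel(bits_per_component, chroma):
    """Average uncompressed bits per pixel for a bit depth and chroma format."""
    return bits_per_component * (1 + CHROMA_SAMPLES_PER_LUMA[chroma])


if __name__ == "__main__":
    for depth, chroma in ((8, "400"), (8, "420"), (8, "444"), (10, "420"), (10, "422")):
        print(f"{depth}-bit {chroma}: {average_bits_per_pixel(depth, chroma)} bits per pixel on average")
```

So "8-bit" video is 24 bits per pixel only at 4:4:4; at 4:2:0 it averages 12, and Y-only video stays at 8.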

Nintendo Maniac 64
23rd January 2016, 01:16
Okay. We typically refer to that as "24 bit", don't we?

There's "bits per pixel" and "bits per channel".

Originally it was more common to refer to the total bits per pixel.

However, around the mid 2000s when pretty much everything supported 8 bits per channel and alpha transparency was becoming more common, the terminology started shifting towards "bits per channel".

I imagine that alpha transparency had a lot to do with it because 48 bits per pixel could be either 12 bits per channel + alpha transparency or 16 bits per channel without transparency. Simply stating the bits per channel eliminates the ambiguity in such a situation.
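
The 48-bit ambiguity can be spelled out in a couple of lines. A tiny sketch that lists which per-channel depths a given bits-per-pixel total could mean, assuming equal-width channels and either 3 (RGB) or 4 (RGBA) channels; the function name is hypothetical.

```python
def possible_channel_depths(bits_per_pixel, channel_counts=(3, 4)):
    """Map channel count -> bits per channel for every layout that divides evenly."""
    return {
        channels: bits_per_pixel // channels
        for channels in channel_counts
        if bits_per_pixel % channels == 0
    }


if __name__ == "__main__":
    for total in (24, 32, 48, 64):
        print(f"{total} bits per pixel could be:", possible_channel_depths(total))
```

For 48 bits per pixel this yields both 16 bits x 3 channels and 12 bits x 4 channels, whereas "16 bits per channel" or "12 bits per channel" leaves nothing to guess.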