Google VP9 "Next Generation Open Video" information posted [Archive] - Page 19


dapperdan

30th August 2016, 13:32

This is weird... So it's worse below and above 1080p, but better or equivalent at 1080p?

They only tested 480, 720 and 1080. So it sounds like x265 does better overall, but mostly on the lower end.

dapperdan

30th August 2016, 13:42

Edge already supports it.

But it is a huge mystery to me how on earth the company (Google) that develops both Chrome and VP9 still hasn't managed to make HW acceleration work in its own browser, which forces the VP9 codec on YouTube, while Microsoft, its competitor, has already done this in its own browser, which doesn't force the use of VP9.

Crazy.

It's boringly sane actually.

VP9 only helps YouTube save bandwidth if lots of people use it. Restricting it to those with hardware acceleration would basically kill it and make the whole thing pointless, so they need to do at least software decode, with hardware decode as a bonus for low-end and mobile devices. Each time they have to choose which to spend developer time on, software decode is going to win, since it helps so many more people that it basically pays for itself many times over.

They run the numbers based on user bandwidth vs CPU vs visual quality vs user engagement to see whether it's working or not. They initially restricted VP9 to not show on XP devices because these stats showed that people on those devices gave up in disgust. If they're still showing it to you and you have a bad experience with it, then presumably their data shows 3 or 4 people (maybe those with modern computers on bad connections) are having a better experience and watching more YouTube ads as a result.

Meanwhile, Microsoft has always been wary about exposing itself to patents or licence fees, so it is more interested in farming that out to 3rd parties and doesn't really care that much if most Edge users don't get VP9 support.

NikosD

30th August 2016, 13:52

Interesting point of view

Motenai Yoda

30th August 2016, 16:34

From Intel's 7th-generation Core slides:
http://images.anandtech.com/doci/10610/7th%20Gen%20Intel%20Core%20Performance%20Evaluation-13_575px.png

Intel Video Codec Support

                     | Kaby Lake    | Skylake      | Broadwell
H.264 Decode         | Hardware     | Hardware     | Hardware
HEVC Main Decode     | Hardware     | Hardware     | Hybrid
HEVC Main10 Decode   | Hardware     | Hybrid       | No
VP9 8-Bit Decode     | Hardware     | Hybrid       | Hybrid
VP9 10-Bit Decode    | Hardware     | No           | No

H.264 Encode         | FF & PG-Mode | FF & PG-Mode | PG-Mode
HEVC Main Encode     | FF & PG-Mode | PG-Mode      | No
HEVC Main10 Encode   | FF & PG-Mode | No           | No
VP9 8-Bit Encode     | FF & PG-Mode | No           | No
VP9 10-Bit Encode    | No           | No           | No

x265_Project

31st August 2016, 20:12

Netflix's presentation of their study, "A large scale video codec comparison of x264, x265 and libvpx for practical VOD applications", can be watched on YouTube here... https://youtu.be/wi1BefrfTos?t=1h25s

Naturally, if you don't use --tune ssim with x265, results are worse when measured using SSIM. When they avoided --tune psnr and --tune ssim their own visual quality metric showed that x265 delivered the highest efficiency.
http://x265.org.s3.amazonaws.com/img/Netflix_Codec_Comparison_Results_50.png

Motenai Yoda

1st September 2016, 01:17

Naturally, if you don't use --tune ssim with x265, results are worse when measured using SSIM. When they avoided --tune psnr and --tune ssim their own visual quality metric showed that x265 delivered the highest efficiency.

Well, but they also avoided tuning for SSIM or PSNR with VP9 and x264 in those tests, since it is the 1/3 PSNR slide that mentions "PSNR-tuned configuration", where VP9 gives a 43.5% bitrate reduction over x264 vs 43.4% for x265 (there is still a 2.5% gain for x265 vs VP9 on averages).
Slide 2/3, based on MS-SSIM, is indeed tuned for visual quality, and here VP9 performs a bit better.

So the point is: will tuning codecs for SSIM give more reliable results on the MS-SSIM or VMAF metrics?

dapperdan

1st September 2016, 18:07

They talk about a paper with more details of their results, is that available yet?

x265_Project

2nd September 2016, 16:28

They talk about a paper with more details of their results, is that available yet?
Not yet. Keep your eye on http://techblog.netflix.com/

x265_Project

2nd September 2016, 17:14

Well, but they also avoided tuning for SSIM or PSNR with VP9 and x264 in those tests, since it is the 1/3 PSNR slide that mentions "PSNR-tuned configuration", where VP9 gives a 43.5% bitrate reduction over x264 vs 43.4% for x265 (there is still a 2.5% gain for x265 vs VP9 on averages).
Slide 2/3, based on MS-SSIM, is indeed tuned for visual quality, and here VP9 performs a bit better.

So the point is: will tuning codecs for SSIM give more reliable results on the MS-SSIM or VMAF metrics?
It's well known that PSNR and SSIM are somewhat crude quality metrics that don't correlate perfectly with human quality evaluations. Over the years that x264 was optimized, the development team noticed that certain algorithms would deliver better subjective (human) visual quality, but worse objective quality scores (PSNR or SSIM). Knowing that the main goal of a video encoder is to deliver the best subjective visual quality, they optimized for the best experience as judged by humans. To enable PSNR or SSIM driven evaluations to be done more fairly, the x264 team came up with "tunings" that turned off all algorithms that affected these measurements negatively. Of course, no one would ever use --tune PSNR or --tune SSIM for production encoding, as these encodes will always deliver inferior visual quality when evaluated by real people.

As Netflix explained, they did 2 sets of test encodes with x264 and x265. One set was tuned for PSNR (using --tune psnr), and one set was done without this tuning. PSNR tuned encodes are the only valid encodes to use when you use PSNR as a test metric.
So, the first result they showed was a valid comparison...
http://x265.org.s3.amazonaws.com/img/Netflix_Codec_Comparison_Results1_50.png
The second results slide they showed was not a valid comparison (you need to use --tune ssim if you want to compare x264 or x265 using SSIM).
http://x265.org.s3.amazonaws.com/img/Netflix_Codec_Comparison_Results2_50.png
Similarly, if you're doing subjective visual quality tests, or using an advanced metric like Netflix's VMAF, which correlates more closely with subjective visual quality assessments, you should not use the --tune PSNR encodes. The third results they showed are valid, as they used the visual quality tuned encodes to compare using their VMAF metric.
http://x265.org.s3.amazonaws.com/img/Netflix_Codec_Comparison_Results_50.png

Of course, no objective (computer calculated) quality metric is as good as humans watching and evaluating video. There are many aspects of a video experience that are hard to measure with mathematical formulas - like motion accuracy, or the degree of noticeable compression artifacts.
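
For anyone who wants to see what the PSNR number in these comparisons actually measures, here is a minimal sketch in Python. It assumes 8-bit samples (peak value 255) passed as flat lists; the function name psnr and its parameters are made up for illustration and are not taken from x264, x265 or Netflix's tooling.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    Assumption: samples are plain numbers in the range 0..max_value
    (255 for 8-bit video). This is the textbook formula, not any
    encoder's internal implementation.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error between the original and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

    if mse == 0:
        # Identical inputs: the noise power is zero, so PSNR is unbounded.
        return math.inf

    # Ratio of peak signal power to noise power, on a logarithmic scale.
    return 10 * math.log10(max_value ** 2 / mse)
```

Because the score depends only on the per-sample squared error, anything an encoder does that deliberately moves samples away from the source (psy optimizations, for example) lowers the number even when viewers prefer the result.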

Netflix's study is one of the most valid and comprehensive studies ever conducted, as they used real production codecs on a very large and representative sample of high quality real production content (unlike some codec comparisons that used previously compressed content, or used only reference encoders). This study definitely furthers our understanding of the performance that is possible from VP9 and HEVC under real-world conditions. Netflix has an amazingly talented R&D team, and it's awesome that they are willing to publish their research for everyone's benefit.

x265_Project

2nd September 2016, 17:32

And now, straight from Netflix... http://www.streamingmedia.com/Articles/ReadArticle.aspx?ArticleID=113346
Jan Ozer writes...
I asked Netflix which set of results they felt was most significant. Their response was, “We believe that VMAF results will have the best correlation to user perception of quality. We use this metric, and sanity-check against other metrics (PSNR, SSIM, VIF, etc.) internally.”
http://dzceab466r34n.cloudfront.net/StreamingMedia/ArticleImages/InlineImages/108157-Netflix-HEVC-ORG.png

Jamaika

2nd September 2016, 19:58

Thanks for the explanation. It's nice that someone has developed the VMAF metric, but I can't make use of it.
According to this chart, with --tune psnr there is no predictable result for the ratio of the maximum signal power to the power of the noise distorting the signal.
Results for Daala.
http://dzceab466r34n.cloudfront.net/StreamingMedia/ArticleImages/InlineImages/107094-Netflix-VAMF-ORG.png
I wonder what my predicted score would be for a 2-pass veryslow encode?

mandarinka

3rd September 2016, 14:53

http://x265.org.s3.amazonaws.com/img/Netflix_Codec_Comparison_Results2_50.png

Hmm, could it be that VP9's win in that MS-SSIM test was caused completely just by psy optimizations distorting the result?

IIRC, libvpx doesn't have as extensive a set of psyRDO/adaptive quantization tools as x264 developed (and then x265 adapted/extended). Since psy measurably hurts metrics, in this test x265 might have been handicapped while libvpx was not, "thanks" to the latter's lack of meaningful psy (somebody correct me if it has Psy RDO now)?

x265_Project

3rd September 2016, 20:23

Hmm, could it be that VP9's win in that MS-SSIM test was caused completely just by psy optimizations distorting the result?

IIRC, libvpx doesn't have as extensive a set of psyRDO/adaptive quantization tools as x264 developed (and then x265 adapted/extended). Since psy measurably hurts metrics, in this test x265 might have been handicapped while libvpx was not, "thanks" to the latter's lack of meaningful psy (somebody correct me if it has Psy RDO now)?
Yes... it's well known that if you want to measure x264 or x265's SSIM scores, you need to use --tune SSIM for your encodes. If you don't (as with this example), x264 and x265 will be using algorithms that improve visual quality, but produce lower SSIM scores.

How many of you remember this blog post?
http://web.archive.org/web/20100723085702/http://x264dev.multimedia.cx/?p=458

Even when you use --tune PSNR or --tune SSIM, and then compare encoders using PSNR or SSIM, you won't have a fully reliable comparison.

x264 and x265 were not designed to achieve the highest efficiency when measured with PSNR or SSIM. They're designed to achieve the highest visual quality with any content at any chosen bit rate. We could have much higher PSNR or SSIM scores if we simply optimized everything for these objective metrics. But that wouldn't be the best encoder for producing video for actual human beings to watch.

Netflix understands the limitations of PSNR and SSIM, which is why they've invested a lot of time and energy, working with experts at the University of Southern California, into developing a better objective quality measurement tool - Video Multimethod Assessment Fusion (VMAF) (http://techblog.netflix.com/2016/06/toward-practical-perceptual-video.html). VMAF correlates to subjective (human) visual quality assessments much more closely than PSNR, SSIM, or other available objective metrics. VMAF results are a much more reliable predictor of actual human subjective visual quality evaluations.

LigH

3rd September 2016, 21:08

Then I hope the VMAF specs are publicly available to make an AviSynth plugin implementable... :o

But let's discuss that elsewhere (http://forum.doom9.org/showthread.php?t=173676).

mandarinka

3rd September 2016, 21:24

Yeah, I'm aware of the ssim/psnr and x264 businesses, I was encoding back then, when VAQ/PsyRDO landed and changed the encoding landscape like nothing in years :) ('08/'09)

I'm thinking that they probably didn't mean to test the same thing as with plain SSIM when they ran the MS-SSIM metric. Maybe they aimed to use it to gauge the visual quality like with VMAF... be it a good idea or not. I'm not sure how similar MS-SSIM is to SSIM proper. Xiph people use it IIRC, probably thinking it correlates better with visual quality than SSIM.

But x265 is still going to have a bigger delta between its metrics-tuned and psy-tuned results than libvpx, I think, since libvpx lacks psyrdo. So to the degree MS-SSIM is similar to SSIM, it is probably disadvantaging x265. Hard to know for sure though, I never tried it.

x265_Project

3rd September 2016, 23:21

Then I hope the VMAF specs are publicly available to make an AviSynth plugin implementable... :o

But let's discuss that elsewhere (http://forum.doom9.org/showthread.php?t=173676).
They open sourced it under the Apache 2.0 license...
https://github.com/Netflix/vmaf

Jamaika

4th September 2016, 07:23

Hmm, could it be that VP9's win in that MS-SSIM test was caused completely just by psy optimizations distorting the result?

IIRC, libvpx doesn't have as extensive a set of psyRDO/adaptive quantization tools as x264 developed (and then x265 adapted/extended). Since psy measurably hurts metrics, in this test x265 might have been handicapped while libvpx was not, "thanks" to the latter's lack of meaningful psy (somebody correct me if it has Psy RDO now)?
The libvpx and libaom codecs have PSNR and SSIM tunings.
--tune=<arg> Material to favor
psnr, ssim
--psnr Show PSNR in status line
The rest, I think, no one cares about.
More interesting to me is how Netflix applied the metric to the Daala codec. They probably wrote an additional function and included it, since Daala doesn't have such functions. From this I deduce that VMAF can be used with all codecs.
It is a pity that Netflix didn't give the predicted score charts of PSNR / SSIM / VMAF vs DMOS for the x265 or VPX codecs. :confused: Probably so as not to annoy developers.
Results for Daala.
http://cdn1.infoqstatic.com/statics_s1_20160831-0533/resource/articles/a-quality-assessment-tool-for-video-streaming-media/zh/resources/02.png
I'm thinking that they probably didn't mean to test the same thing as with plain SSIM when they ran the MS-SSIM metric. Maybe they aimed to use it to gauge the visual quality like with VMAF... be it a good idea or not. I'm not sure how similar MS-SSIM is to SSIM proper. Xiph people use it IIRC, probably thinking it correlates better with visual quality than SSIM.

The exceptions in the results for Daala are worth noting.

Chroma from Luma is a PSNR penalty in Daala (http://people.xiph.org/~xiphmont/demo/daala/demo4.shtml)
Despite lending visual improvements, Chroma from Luma is a PSNR penalty in Daala. This is not particularly surprising, as neither PSNR nor any of the other common objective quality measures used in video coding represent color perception well, but it is especially interesting as the similar lack of PSNR performance in HEVC testing would certainly have contributed to it being dropped from the final HEVC standard. Perhaps the joint working group might have overlooked that if the spatial version of Chroma from Luma had not also been an apparent performance penalty. I speculate here mainly because our frequency domain implementation is substantially faster than classic intra prediction, not slower.

herbert

5th September 2016, 23:27

VideoLAN Dev Days 2016: Update on VPX
Recap on recent VPX developments.

https://www.youtube.com/watch?v=peS2I14w8ow

VideoLAN Dev Days 2016: A VP9 Encoder
Information on ffvp9 until about 6:45, followed by a talk about 'Eve', an alternative VP9 encoder.

https://www.youtube.com/watch?v=t_z52-CBut0

Selur

10th September 2016, 06:30

Quikee

10th September 2016, 20:32

Just wondering: Since av1, vp8 and vp9 seem to be based on the same framework can they be compiled into a single binary? (vpxenc can include vp8, vp9, vp10, so I was wondering if av1 could be added there too)

No, they removed everything that won't be used in av1 and renamed everything vp8, vp9, vpx to aom or av1.

Selur

10th September 2016, 20:39

Thanks for the info. :)

mzso

11th September 2016, 12:51

No, they removed everything that won't be used in av1 and renamed everything vp8, vp9, vpx to aom or av1.

Does that mean they trashed everything from Daala? Or did-they/will-they merge some stuff?

Quikee

11th September 2016, 22:07

Does that mean they trashed everything from Daala? Or did-they/will-they merge some stuff?

I was referring to how they prepared the initial AOM repository. After that xiph and cisco guys started to merge their coding tools into the code base. From daala there is their entropy coder and deringing filter merged already. PVQ is waiting to be merged. From cisco there is CLPF merged. Everything is still experimental and in flux - no idea which of the coding tools will stay in.

Clare

30th September 2016, 12:25

Netflix understands the limitations of PSNR and SSIM, which is why they've invested a lot of time and energy, working with experts at the University of Southern California, into developing a better objective quality measurement tool - Video Multimethod Assessment Fusion (VMAF) (http://techblog.netflix.com/2016/06/toward-practical-perceptual-video.html). VMAF correlates to subjective (human) visual quality assessments much more closely than PSNR, SSIM, or other available objective metrics. VMAF results are a much more reliable predictor of actual human subjective visual quality evaluations.

I've tried to apply VMAF to my image comparison website and I have dubious results… The numbers show JPEG2000 being a clear winner but visually I can't confirm it, JPEG2000 pictures clearly have more artifacts than Daala or BPG at very low bpp. I don't know if I did something wrong or if VMAF is not fit for still images.

VMAF is first graph on this page: http://wyohknott.github.io/image-formats-comparison/lossy_stats.html

Jamaika

2nd October 2016, 11:42

I've tried to apply VMAF to my image comparison website and I have dubious results… The numbers show JPEG2000 being a clear winner but visually I can't confirm it, JPEG2000 pictures clearly have more artifacts than Daala or BPG at very low bpp. I don't know if I did something wrong or if VMAF is not fit for still images.
Can you describe in detail how you did it?

I'm curious about the summary of your lossless results.
http://wyohknott.github.io/image-formats-comparison/lossless_stats.html
Today, I would remove BPG and its misapplied x265. As I have said, as long as its creator doesn't update it to a new x265, there is no point in mentioning it.

CruNcher

2nd October 2016, 17:02

Daala's hardest-to-rate still results:

Ballet Exercise <- edge ringing is really a problem here
Steinway
Sking
Tennis <- The worst Psy Result
Production

Tennis and Ballet Exercise especially are very problematic; more bits should have been spent on the faces and edges and fewer on the plain-colored objects.

Neither Daala nor AOM AV1 seems to handle the input noise efficiently enough.

Clare, could you add a 3rd comparison mode, with a 3rd result overlaid on click?

AOM AV1 already seems improved here, with a much better psy result :)

AOM AV1's detail retention has really improved nicely, even if it makes issues like banding/ringing more visible now compared to the Daala state :)

The Adventure with the Windmills result is also really interesting.

BPG has a nice Lanczos decimation and good ringing avoidance (filter) when confronted with high-frequency noise :)

The detail retention improvement of AV1 is most visible here :)

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&ogv=t&aom=t

Though surely some time will still pass before it reaches these low-bitrate results :)

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&bpg=t&aom=t

http://wyohknott.github.io/image-formats-comparison/#tennis*3:1&ogv=t&aom=t

http://wyohknott.github.io/image-formats-comparison/#tennis*3:1&bpg=t&aom=t

http://wyohknott.github.io/image-formats-comparison/#sking*3:1&bpg=t&aom=t

Though BPG "eats" parts of the pixels at the tiny level and so destroys the uniformity of whole objects. Better to compare at the small level; below that there are too many really annoying reconstruction failures.

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&bpg=s&aom=s

But still better than AV1's current ringing mess :(

At lower than medium, AV1 has no real chance against BPG with your setup, and even at medium its psy issues remain problematic.

So damn close and partly above jpeg2000 though :)

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&jp2=m&aom=m

Improvements over Daala at medium are also clearly visible: fewer reconstruction failures.

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&ogv=m&aom=m

This is so freaking close only the ringing and some chroma issues :D

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&bpg=m&aom=m

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&bpg=ll&aom=ll

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&png=ll&aom=ll

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&bpg=ll&aom=ll

http://wyohknott.github.io/image-formats-comparison/#ballet-exercise*3:1&bpg=l&aom=l

@Clare

How do the high-frequency noise inputs look with AV1 if you leave AQ out? Maybe it hurts the lower bit distribution too much, ending in those awful ringing overflow results?

aomenc-+d8003fe.exe --passes=2 --end-usage=q --cq-level=$Q

And how hard does VMAF punish the ringing?

Clare

3rd October 2016, 09:29

Clare could you add a 3rd comparison mode on click 3rd result overlay ?

I didn't do the coding of the tool, so I wouldn't know how to modify it.

BPG has a nice lanczos decimation good ringing avoidance (filter) when confronted with High Frequency Noise :)

I found it good visually too, but it actually smooths out a lot of details compared to Daala and AV1. AV1 was marginally improved when it added its deringing filter.

How do the high-frequency noise inputs look with AV1 if you leave AQ out? Maybe it hurts the lower bit distribution too much, ending in those awful ringing overflow results?

I actually don't know. I've only made some tests with different settings on Mercado de Lavradores: 1 retained more details; 0, 2 and 3 were very similar, more smoothed like BPG.

And how hard does VMAF punish the ringing?
I don't really know what features it favors, just that its results surprise me. I know that JPEG artifacts are often considered more "visually pleasing" than others which smooth details like WebP, but I didn't expect to have both MozJPEG and JPEG2000 with such an advantage.
VMAF is computed according to a dataset it has "learned" from provided by Netflix, but this dataset was made from videos, so I'm not sure the VMAF is actually a good tool with this dataset for still pictures.

mzso

17th November 2016, 15:22

Already posted this in the AV1 thread (http://forum.doom9.org/showthread.php?p=1785666#post1785666), but since it's also related to VP9, posting it here as well:

The Netflix encoding department had a new meet-up where they discussed VP9 vs. H.265 vs. H.264. The Alliance for Open Media was also there and gave a presentation on AV1, where they said they are aiming for a 50% increase in efficiency over VP9/H.265 with AV1. They also said that AV1 in its current state already beats VP9 by 25-30% (with not-yet-released Google internal tools). They said that in order to achieve the 50% increase in efficiency over VP9, they are willing to accept a 40% increase in decoding complexity and a 5 to 10 times higher encoding complexity than VP9, see:

https://youtu.be/thvSyJN1vsA

:eek:

Yay! let's make everyone buy new CPUs/devices! We're in a consumer economy anyway... God forbid increasing bandwidth costs cut into mega rich corporations' profits. Win-win. (not for the users)

Motenai Yoda

17th November 2016, 17:31

If they increase AV1 compression time again, nobody outside the industry will use it.

BTW, the VP9 part isn't clear: they talk about up to 17% efficiency loss in 2-pass encoding, but is their codec comparison done entirely with 2-pass encodes, or with constant quantizer/CRF ones?

kathykit

18th November 2016, 14:40

Thank you so much!
Thanks for the information! Useful!

Motenai Yoda

19th November 2016, 14:09

Your question is probably being answered in the video at the 1 hour 21 minutes 28 seconds mark:

https://youtu.be/thvSyJN1vsA?t=1h21m28s

Thank you, so actually a VP9 quantizer-based encode will be almost on par with x265.

The Netflix encoding department also has another video up where they explain their internal encoding workflow and where they are stating that quality is more important for them than speed, since they encode simultaneously in chunks distributed over several machines anyway and stitch it together afterwards, see:

https://youtu.be/fsRLHHIoC6E
As I wrote "Industry"

sneaker_ger

5th December 2016, 15:42

Netflix is now also using VP9 but currently only for downloads to Android and iOS(*). Streaming to these devices will follow "in the near future".
http://techblog.netflix.com/2016/12/more-efficient-mobile-encodes-for.html

https://lh4.googleusercontent.com/Rd23yc29qLQtG0J0Vg7EdHUs0lFYcvB7knDWFAm1LU6zAOx-RWOCUpAhH1KgFgsxr4fyfR7LcQN6g-EofvSZerV2JvLbi-WOT2_MeuvgGBKkCR_8sFqLnuwHRxrWSKaxK-iDlN61

(*) Well, the blog mentions iOS. I'm not actually sure if Apple now supports VP9. IIRC they were one of the biggest supporters of H.264 for HTML5.

dapperdan

5th December 2016, 19:39

I read it as iOS getting the AVCHi-Mobile encodes and Android getting VP9-mobile, which judging by their graph means the Androids will get slightly better quality and/or smaller file size.

sneaker_ger

5th December 2016, 19:45

That makes a lot of sense.

CruNcher

6th December 2016, 01:52

YouTube is currently in the major phase of its VP9 4K introduction, with so much content already transcoded into so many different bitrate states :)

It will be really interesting to see how long it takes them, content-wise, to own 4K user content at this rate, especially as Asian content is flooding in daily now.

At its core it's also the biggest push for Matroska (as .webm) we have ever seen, and Opus's reach gets so much bigger now too :)

If this continues at this rate, Netflix's Hollywood catalog will be no comparison ;)

If users accept it this way too and prefer it over HEVC, that would be a fundamental step towards AV1 and its introduction :)

Especially interesting: a lot of the content will in the end be transcoded from 4K AVC/HEVC input :D

Yay! let's make everyone buy new CPUs/devices! We're in a consumer economy anyway... God forbid increasing bandwidth costs cut into mega rich corporations' profits. Win-win. (not for the users)

Isn't that equation way too simple ;) ?

dapperdan

8th December 2016, 10:48

mzso

8th December 2016, 11:10

I wonder if the investment in VP8 etc. is now positive for Google. It seemed like a big gamble at the time, but I guess at their scale bandwidth must add up quickly (an article the other day about Android app updates claimed they were saving Petabytes every day due to a relatively simple change in how they sent them) and they seem to have succeeded in changing the conversation on royalty free formats for the web with everyone but Apple publicly on board with AV1, which I guess must have concrete savings attached.

A measly hundred million for a codec by a company that's worth tens of billions? That's nothing. I wouldn't even consider it a small gamble. (In comparison, AMD buying ATI was a huge gamble.) They could have bought a dozen such companies without much negative effect.

What's surprising to me is that with all the effort put into codecs they couldn't make YouTube half-way decent. It's still the buggy steaming pile it ever was.
Some parts got definitely worse. They filter every upload automatically so it's filled with butchered (mirrored, cropped, sped-up, etc) videos. Yet they don't filter duplicates, or 100 times re-encoded videos. Even the more modern codecs are used to cut down on the already inadequate bitrate, rather than improving quality. It's as if their main purpose was to make the experience worse for youtube users.

CruNcher

8th December 2016, 12:20

1. Not sure what you're babbling about. First of all, what duplicates of user-generated content?
2. The bitrate claim is wrong; I have already seen as high as 20 Mbps for VP9 transcodes at 4K, which is a lot for the web and for most target devices.
3. All the post-processing filters are only for specific target devices in specific profiles. If the user's upload doesn't fully comply with them, it's bug fixing of wrongly encoded user content, nothing more, to give a 1a delivery experience within the profile boundaries of every target device.

Sure, there have to be compromises. That is normal if you want to reach as many users as possible on so many devices. These compromises can let millions of users watch it in the end, at the cost of 1 user being unhappy with it. The equation is simple: you lose against 1 million happy users who are able to enjoy it and bring ad revenue in.

Apple fights the same issues over and over when introducing a new codec to its ecosystem; they need to be careful with it, calculating loss vs win and happy vs unhappy users.

You can never make everyone happy, that won't work, but you can find some balance, and VP9 tries very hard to do that.

This is also why you still see skepticism about HEVC on many other platforms that want to reach even 1 percent more of the user base no matter the quality cost, and that can't do hundreds of different encodes targeting everything like the big platforms ;)

sneaker_ger

8th December 2016, 15:20

AMD "Crimson ReLive" drivers are now available via amd.com (http://support.amd.com/en-us/download). Add VP9 decoding.
VP9 Decode Acceleration:
4K 60Hz GPU-Accelerated Video Streaming enabled on supported Google™ Chrome web browsers.

Requires supported Chrome™ web browser versions with Hardware Acceleration enabled. Compatible with AMD Radeon™ GCN and Radeon RX 400 series enabled products on Windows® 7/8.1/10.

dapperdan

14th December 2016, 12:11

"Overview of the VP9 codec":

https://blogs.gnome.org/rbultje/2016/12/13/overview-of-the-vp9-video-codec/

Clare

15th December 2016, 11:39

"Overview of the VP9 codec":

https://blogs.gnome.org/rbultje/2016/12/13/overview-of-the-vp9-video-codec/

I'm still pissed he hasn't open sourced his VP9 encoder. :angry:

CruNcher

15th December 2016, 15:04

Ehh, he hasn't even released it to the public; he stated clearly "its a commercial product" ;)

And it's most likely that at least some YouTube transcodes are actually coming from his Eve encoder codebase ;)

Clare

19th December 2016, 13:01

I'm doing some encodes using FFMPEG and vpxenc and I have some curious results. Can anyone explain to me why these two similar encodes give wildly different results?

ffmpeg -y -i in.y4m -c:v libvpx-vp9 -b:v 0 -crf 20 -quality good -threads 8 -cpu-used 2 -tile-columns 4 -an out.webm
vpxenc --good --threads=8 --cpu-used=2 --tile-columns=4 --end-usage=q --cq-level=20 -o out.webm in.y4m

For the same parameters I get very different filesizes in the end, with FFMPEG the file is 3,93 MiB, whereas with vpxenc it is only 2,26 MiB. Both are using libvpx 1.6.0. I don't get it.
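
To rule out a typo as the cause, one option is to generate both command lines from a single set of values. The sketch below only reproduces the two commands exactly as posted above and puts a number on the size gap; the helper names build_commands and size_ratio are hypothetical, and it makes no claim about why the outputs differ (any defaults that differ between ffmpeg and vpxenc are outside what it shows).

```python
def build_commands(src, dst, quality=20, threads=8, cpu_used=2, tile_columns=4):
    """Return (ffmpeg_args, vpxenc_args) built from the same settings.

    Only the options that appear in the post are used; nothing is added.
    """
    ffmpeg_args = [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libvpx-vp9",
        "-b:v", "0", "-crf", str(quality),
        "-quality", "good",
        "-threads", str(threads),
        "-cpu-used", str(cpu_used),
        "-tile-columns", str(tile_columns),
        "-an", dst,
    ]
    vpxenc_args = [
        "vpxenc", "--good",
        f"--threads={threads}",
        f"--cpu-used={cpu_used}",
        f"--tile-columns={tile_columns}",
        "--end-usage=q", f"--cq-level={quality}",
        "-o", dst, src,
    ]
    return ffmpeg_args, vpxenc_args


def size_ratio(larger_mib, smaller_mib):
    """How many times bigger the larger file is than the smaller one."""
    return larger_mib / smaller_mib


if __name__ == "__main__":
    ffmpeg_args, vpxenc_args = build_commands("in.y4m", "out.webm")
    print(" ".join(ffmpeg_args))
    print(" ".join(vpxenc_args))
    # File sizes reported in the post: 3.93 MiB (ffmpeg) vs 2.26 MiB (vpxenc).
    print(f"ffmpeg output is {size_ratio(3.93, 2.26):.2f}x the vpxenc output")
```

Each list can be handed to subprocess.run as-is, so both encodes are guaranteed to use the same quality, thread, speed and tile values.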

smok3

19th December 2016, 14:15

@Clare: Are -crf 20 and --cq-level=20 the same thing?

Jamaika

19th December 2016, 14:43

@Clare: And if you used Adobe WebM, you would get a third result. That has been written about. ;)
libvpx-vp9 encoder AVOptions:
-auto-alt-ref <int> E..V.... Enable use of alternate reference frames (2-pass only) (from -1 to 2) (default -1)
-lag-in-frames <int> E..V.... Number of frames to look ahead for alternate reference frame selection (from -1 to INT_MAX) (default -1)
-arnr-maxframes <int> E..V.... altref noise reduction max frame count (from -1 to INT_MAX) (default -1)
-arnr-strength <int> E..V.... altref noise reduction filter strength (from -1 to INT_MAX) (default -1)
-arnr-type <int> E..V.... altref noise reduction filter type (from -1 to INT_MAX) (default -1)
    backward E..V....
    forward E..V....
    centered E..V....
-tune <int> E..V.... Tune the encoding to a specific scenario (from -1 to INT_MAX) (default -1)
psnr E..V....
ssim E..V....
-deadline <int> E..V.... Time to spend encoding, in microseconds. (from INT_MIN to INT_MAX) (default good)
best E..V....
good E..V....
realtime E..V....
-error-resilient <flags> E..V.... Error resilience configuration (default 0)
default E..V.... Improve resiliency against losses of whole frames
partitions E..V.... The frame partitions are independently decodable by the bool decoder, meaning that partitions can be decoded even though earlier partitions have been lost. Note that intra predicition is still done over the partition boundary.
-max-intra-rate <int> E..V.... Maximum I-frame bitrate (pct) 0=unlimited (from -1 to INT_MAX) (default -1)
-crf <int> E..V.... Select the quality for constant quality mode (from -1 to 63) (default -1)
-static-thresh <int> E..V.... A change threshold on blocks below which they will be skipped by the encoder (from 0 to INT_MAX) (default 0)
-drop-threshold <int> E..V.... Frame drop threshold (from INT_MIN to INT_MAX) (default 0)
-noise-sensitivity <int> E..V.... Noise sensitivity (from 0 to 4) (default 0)
-undershoot-pct <int> E..V.... Datarate undershoot (min) target (%) (from -1 to 100) (default -1)
-overshoot-pct <int> E..V.... Datarate overshoot (max) target (%) (from -1 to 1000) (default -1)
-cpu-used <int> E..V.... Quality/Speed ratio modifier (from -8 to 8) (default 1)
-lossless <int> E..V.... Lossless mode (from -1 to 1) (default -1)
-tile-columns <int> E..V.... Number of tile columns to use, log2 (from -1 to 6) (default -1)
-tile-rows <int> E..V.... Number of tile rows to use, log2 (from -1 to 2) (default -1)
-frame-parallel <boolean> E..V.... Enable frame parallel decodability features (default auto)
-aq-mode <int> E..V.... adaptive quantization mode (from -1 to 3) (default -1)
none E..V.... Aq not used
variance E..V.... Variance based Aq
complexity E..V.... Complexity based Aq
cyclic E..V.... Cyclic Refresh Aq
-level <float> E..V.... Specify level (from -1 to 6.2) (default -1)
-speed <int> E..V.... (from -16 to 16) (default 1)
-quality <int> E..V.... (from INT_MIN to INT_MAX) (default good)
best E..V....
good E..V....
realtime E..V....
-vp8flags <flags> E..V.... (default 0)
error_resilient E..V.... enable error resilience
altref E..V.... enable use of alternate reference frames (VP8/2-pass only)
-arnr_max_frames <int> E..V.... altref noise reduction max frame count (from 0 to 15) (default 0)
-arnr_strength <int> E..V.... altref noise reduction filter strength (from 0 to 6) (default 3)
-arnr_type <int> E..V.... altref noise reduction filter type (from 1 to 3) (default 3)
-rc_lookahead <int> E..V.... Number of frames to look ahead for alternate reference frame selection (from 0 to 25) (default 25)
Rate Control Options:
--drop-frame=<arg> Temporal resampling threshold (buf %)
--resize-allowed=<arg> Spatial resampling enabled (bool)
--resize-width=<arg> Width of encoded frame
--resize-height=<arg> Height of encoded frame
--resize-up=<arg> Upscale threshold (buf %)
--resize-down=<arg> Downscale threshold (buf %)
--end-usage=<arg> Rate control mode
    vbr, cbr, cq, q
--target-bitrate=<arg> Bitrate (kbps)
--min-q=<arg> Minimum (best) quantizer
--max-q=<arg> Maximum (worst) quantizer
--undershoot-pct=<arg> Datarate undershoot (min) target (%)
--overshoot-pct=<arg> Datarate overshoot (max) target (%)
--buf-sz=<arg> Client buffer size (ms)
--buf-initial-sz=<arg> Client initial buffer size (ms)
--buf-optimal-sz=<arg> Client optimal buffer size (ms)

Edit:
Oh, I don't know what you compiled, but ffmpeg contains only the first version of libvpx 1.6.
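One detail in the option listing above is easy to misread: `-tile-columns` takes a log2 value, so the `-tile-columns 4` in the commands under discussion asks for 16 tile columns, not 4. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the width clamp is an assumption based on my understanding that VP9 tiles must be at least 4 superblocks of 64 pixels (256 pixels) wide; it is not something stated in this thread.

```shell
#!/bin/sh
# tile_columns LOG2: number of tile columns requested by a log2 value.
tile_columns() {
  echo $((1 << $1))
}

# max_log2_tile_cols WIDTH: largest log2 value that still leaves every tile
# at least 4 superblocks wide.
# ASSUMPTION: 64-pixel superblocks and a 4-superblock minimum tile width.
max_log2_tile_cols() {
  sb_cols=$(( ($1 + 63) / 64 ))   # frame width in 64-pixel superblocks, rounded up
  log2=1
  while [ $((sb_cols >> log2)) -ge 4 ]; do
    log2=$((log2 + 1))
  done
  echo $((log2 - 1))
}

tile_columns 4            # prints 16
max_log2_tile_cols 1920   # prints 2
max_log2_tile_cols 3840   # prints 3
```

If that assumption holds, a 1080p encode cannot actually use more than 4 tile columns, so a larger request would presumably be reduced by the encoder rather than honoured.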

mzso

19th December 2016, 14:45

So VP9 seems pretty mature these days, right?
I generally use these settings to encode AVC:

ffmpeg -i -g 30 -vcodec libx264 -preset slower -crf 18 ki.mkv

What would be roughly identical quality-wise with VP9? (While saving a bit of space, otherwise it's pointless.) Are there any convenient presets for encoding speed, as there are for x264?

Can I encode full range 4:4:4 videos like I do with libx264?
ffmpeg -i -pix_fmt yuvj444p -vf scale=out_color_matrix=bt709 -g 30 -vcodec libx264 -preset slower -crf 18 -x264opts colorprim=bt709:transfer=bt709:colormatrix=bt709 out.mkv
(Which I normally do for screencasts)

Clare

19th December 2016, 15:26

@Clare: -crf 20 and --cq-level=20 are the same thing?
I think so. At least on the libvpx wiki, both methods are described as Constant Quality encoding (quote: "crf is the quality value (0-63 for VP9)", so the same thing as cq-level).

Oh, I don't know what you compiled, but ffmpeg contains only the first version of libvpx 1.6.
I didn't compile it myself, it's the latest version from my Linux distro.

Jamaika

19th December 2016, 15:42

What would be roughly identical quality-wise with VP9? (While saving a bit of space, otherwise it's pointless.) Are there any convenient presets for encoding speed, as there are for x264?
ffmpeg.exe -loglevel verbose -i rgb24_input.avi -an -f yuv4mpegpipe -vf scale=1920:1080:in_color_matrix=rgb:in_range=full:out_color_matrix=bt709:out_range=tv,format=yuv420p - |
x264.exe --demuxer y4m --input-csp i420 --input-depth 8 --input-range tv --output-csp i420 --threads 4 --preset veryslow --tune grain --crf 28 --fps ??? --keyint 2xfps --nal-hrd none (--vbv-bufsize 40000 --vbv-maxrate 40000)???
--colormatrix bt709 --colorprim bt709 --transfer bt709 --range tv --output "x264_420p_crf28.h264" -

ffmpeg.exe -loglevel verbose -i rgb24_input.avi -an -f yuv4mpegpipe -vf scale=1920:1080:in_color_matrix=rgb:in_range=full:out_color_matrix=bt709:out_range=tv,format=yuv420p - |
vpxenc.exe --bit-depth=8 --input-bit-depth=8 --i420 --codec=vp9 --good --threads=4 --cpu-used=4 --profile=0 --drop-frame=??? --end-usage=q --cq-level=48 --target-bitrate=0 --min-q=0 --kf-max-dist=2xfps --auto-alt-ref=1
--frame-boost=1 --aq-mode=0 --color-space=bt709 --verbose --debug??? --pass=1 --passes=1 --output=114.webm -
Can I encode full range 4:4:4 videos like I do with libx264?
No one wants such comparisons codecs. Everyone wants to prove that the codec is better to the competition.
What is the point of use i444? As you have a source i444. Reportedly TV, Bluray x264 miserably support i444.
(Which I normally do for screencasts)
ffmpeg.exe -loglevel verbose -i yuv420p_rangetv_bt709_input.mp4/mov/mkv -an -f image2 -vf scale=1920:1080:in_color_matrix=bt709:in_range=tv:out_color_matrix=rgb:out_range=pc,format=rgb24 -c:v png -an image001.png

Neavrie

19th December 2016, 15:46

@Clare
Try this:
ffmpeg -y -i in.y4m -c:v libvpx-vp9 -b:v 0 -crf 20 -quality good -threads 8 -cpu-used 2 -tile-columns 4 -arnr-maxframes 7 -arnr-strength 5 -an out.webm
vpxenc --good -p 1 --threads=8 --cpu-used=2 --tile-columns=4 --end-usage=q --cq-level=20 -o out.webm in.y4m

or this:
ffmpeg -y -i in.y4m -c:v libvpx-vp9 -b:v 0 -crf 20 -quality good -threads 8 -cpu-used 2 -tile-columns 4 -arnr-maxframes 7 -arnr-strength 5 -an -pass 1 -f null -
ffmpeg -y -i in.y4m -c:v libvpx-vp9 -b:v 0 -crf 20 -quality good -threads 8 -cpu-used 2 -tile-columns 4 -arnr-maxframes 7 -arnr-strength 5 -an -pass 2 out.webm

vpxenc --good -p 2 --threads=8 --cpu-used=2 --tile-columns=4 --end-usage=q --cq-level=20 -o out.webm in.y4m
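The point of pinning the pass count on both sides is that the two tools do not necessarily default to the same number of passes (as far as I know, vpxenc runs two passes unless told otherwise, while ffmpeg runs one unless `-pass` is given). A small sketch that generates the two-pass ffmpeg pair from one shared option string, so the two invocations cannot drift apart. The `VP9_OPTS` variable and `two_pass_cmds` function are made-up names, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Shared libvpx-vp9 options, copied from the commands above.
VP9_OPTS="-c:v libvpx-vp9 -b:v 0 -crf 20 -quality good -threads 8 -cpu-used 2 -tile-columns 4 -arnr-maxframes 7 -arnr-strength 5 -an"

# two_pass_cmds INPUT OUTPUT: print (do not run) the two ffmpeg invocations.
# Pass 1 only gathers statistics, so its video output is discarded.
two_pass_cmds() {
  echo "ffmpeg -y -i $1 $VP9_OPTS -pass 1 -f null -"
  echo "ffmpeg -y -i $1 $VP9_OPTS -pass 2 $2"
}

two_pass_cmds in.y4m out.webm
```

Piping the output to `sh` would run the encode; comparing the resulting file against the `vpxenc -p 2` output is then a like-for-like test of the two front ends.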

mzso

19th December 2016, 16:16

ffmpeg.exe -loglevel verbose -i rgb24_input.avi -an -f yuv4mpegpipe -vf scale=1920:1080:in_color_matrix=rgb:in_range=full:out_color_matrix=bt709:out_range=tv,format=yuv420p - |
x264.exe --demuxer y4m --input-csp i420 --input-depth 8 --input-range tv --output-csp i420 --threads 4 --preset veryslow --tune grain --crf 28 --fps ??? --keyint 2xfps --nal-hrd none (--vbv-bufsize 40000 --vbv-maxrate 40000)???
--colormatrix bt709 --colorprim bt709 --transfer bt709 --range tv --output "x264_420p_crf28.h264" -

ffmpeg.exe -loglevel verbose -i rgb24_input.avi -an -f yuv4mpegpipe -vf scale=1920:1080:in_color_matrix=rgb:in_range=full:out_color_matrix=bt709:out_range=tv,format=yuv420p - |
vpxenc.exe --bit-depth=8 --input-bit-depth=8 --i420 --codec=vp9 --good --threads=4 --cpu-used=4 --profile=0 --drop-frame=??? --end-usage=q --cq-level=48 --target-bitrate=0 --min-q=0 --kf-max-dist=2xfps --auto-alt-ref=1
--frame-boost=1 --aq-mode=0 --color-space=bt709 --verbose --debug??? --pass=1 --passes=1 --output=114.webm -

Thanks for the answer. For the first part I didn't want full range, just a generic replacement for "ffmpeg -i -g 30 -vcodec libx264 -preset slower -crf 18 ki.mkv" (not for full range or i444).
Also, I only want to use ffmpeg, and don't want to pipe to other tools.

No one wants such comparisons codecs. Everyone wants to prove that the codec is better to the competition.
What is the point of use i444? As you have a source i444. Reportedly TV, Bluray x264 miserably support i444.

ffmpeg.exe -loglevel verbose -i yuv420p_rangetv_bt709_input.mp4/mov/mkv -an -f image2 -vf scale=1920:1080:in_color_matrix=bt709:in_range=tv:out_color_matrix=rgb:out_range=pc,format=rgb24 -c:v png -an image001.png

I don't understand what you're saying. What do you mean by "comparisons codecs"? Of course everyone wants a better codec, but that doesn't exclude 4:4:4 and full-range support.

The point of 4:4:4 is to not destroy color information before you even encode the video.
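To make "destroying color information" concrete, here is a rough count of how many samples a single frame carries under each scheme. This is a sketch with a hypothetical `samples_per_frame` helper, and it assumes even frame dimensions and one luma plane plus two chroma planes.

```shell
#!/bin/sh
# samples_per_frame WIDTH HEIGHT SUBSAMPLING: luma + chroma samples in one frame.
# ASSUMPTION: even dimensions, one luma plane and two chroma planes.
samples_per_frame() {
  luma=$(( $1 * $2 ))
  case "$3" in
    444) chroma=$(( 2 * luma )) ;;         # two full-resolution chroma planes
    420) chroma=$(( 2 * (luma / 4) )) ;;   # each chroma plane halved in both directions
    *) echo "unsupported subsampling: $3" >&2; return 1 ;;
  esac
  echo $(( luma + chroma ))
}

samples_per_frame 1920 1080 444   # prints 6220800
samples_per_frame 1920 1080 420   # prints 3110400
```

At 1080p, 4:2:0 keeps only a quarter of the chroma samples, which halves the total sample count before the encoder has made a single decision. On screencasts with sharp coloured text that loss tends to show up as colour fringing.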