Alliance for Open Media codecs - Page 71

mandarinka · 30th January 2019, 04:39

Interesting that some of the champions supporting or helping AOM are in the problematic(?) group of unpooled HEVC licensors... you would say these companies support AV1 because they hated that.

kuchikirukia · 30th January 2019, 05:55

Quote:

Originally Posted by TD-Linux

Here's a link to AWCY as shown in Tim's presentation:

https://beta.arewecompressedyet.com/...y-525f981376bd

tl;dr it is better than x264 at every bitrate, but still worse than libvpx VP9. It is also currently about 10x slower than x264, which is blazing fast compared to libaom but still has a lot of room for improvement.

Taking x264 PSNR and SSIM values without setting --tune PSNR and SSIM is fail.

nevcairiel · 30th January 2019, 10:35

Quote:

Originally Posted by kuchikirukia

Taking x264 PSNR and SSIM values without setting --tune PSNR and SSIM is fail.

You get one encode to compare, do you really want that to be one tuned for PSNR? Because that would be the real fail.
One encode, several metrics. Not re-encoding targeted for metrics.

TD-Linux · 30th January 2019, 10:42

Quote:

Originally Posted by kuchikirukia

Taking x264 PSNR and SSIM values without setting --tune PSNR and SSIM is fail.

There are more metrics than those at the link, but fair. I compared against x264 --tune PSNR as well and it still beats x264 at PSNR:

https://beta.arewecompressedyet.com/...y-525f981376bd

MoSal · 30th January 2019, 12:29

Quote:

Originally Posted by benwaggoner

In particular I worry that VMAF is insufficiently sensitive to temporal shifts in video quality. A VMAF of 70, 65, 60, 60, 65, 60, 60 might come out as a nice "VMAF=65.3" but be annoying to watch. Frame strobing was a weakness of libvpx.

That's not really a problem. Per-frame data is available (wrote this mostly in a couple of hours). It's even available for multiple metrics, which is nice.
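To illustrate why per-frame data helps here, a minimal sketch: two made-up score series with the same average but very different temporal behaviour. The function name `temporal_summary` and both series are hypothetical, not taken from any VMAF tool.

```python
def temporal_summary(scores):
    """Return (mean, minimum, largest frame-to-frame change) for per-frame scores."""
    if not scores:
        raise ValueError("need at least one per-frame score")
    # Absolute change between each pair of consecutive frames.
    deltas = [abs(b - a) for a, b in zip(scores, scores[1:])]
    mean = sum(scores) / len(scores)
    return mean, min(scores), max(deltas, default=0.0)


# Hypothetical per-frame scores: same average, different stability.
steady = [65.0] * 6
pulsing = [80.0, 50.0, 80.0, 50.0, 80.0, 50.0]

for name, series in (("steady", steady), ("pulsing", pulsing)):
    mean, worst, jump = temporal_summary(series)
    print(f"{name}: mean={mean:.1f} min={worst:.1f} max_jump={jump:.1f}")
```

Both series pool to the same mean, so only the minimum and the largest jump tell them apart. Whether a jump of that size is actually visible is a separate question that the scores alone cannot answer.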

The real problem is that VMAF is not that good. It, for example, spectacularly fails with samples that greatly benefit from AQ (yes, I know you already hinted at this).

jonatans · 30th January 2019, 13:57

Quote:

Originally Posted by TomV

Thanks for posting an update Jonatan. Your efforts on this, and in the MC-IF are really appreciated.

Thank you Tom. And thanks for providing additional context to this interesting and complicated matter.

Quote:

Originally Posted by hajj_3

I think I read that Franhaufer sold their HEVC patents to General Electric; if true, you should remove Franhaufer from your diagram.

This is correct. But my understanding is that Fraunhofer did not sell all their HEVC patents. They are listed as a licensor in HEVC Advance. In the latest patent list from HEVC Advance there are two Fraunhofer patents listed: https://www.hevcadvance.com/pdfnew/H...nuary-2019.pdf

Beelzebubu · 30th January 2019, 17:56

Quote:

Originally Posted by benwaggoner

In particular I worry that VMAF is insufficiently sensitive to temporal shifts in video quality. A VMAF of 70, 65, 60, 60, 65, 60, 60 might come out as a nice "VMAF=65.3" but be annoying to watch. Frame strobing was a weakness of libvpx.

First and foremost: yes! It's great to see some technical & independent thinking of how good VMAF really is.

Netflix uses "hVMAF" as official notation in their charts. "h" means "harmonic", which means it uses harmonic means, which bias towards the least favourable. So in your example, the harmonic mean would be 62.66, whereas the average would be 62.86. Neither of these is 65.3. So I'm personally not as concerned about the averaging mechanism aspect of your concern. (In the CLI, use --pool harmonic_mean or something similar, depending on which exact tool you use.) On the other hand, I don't believe that VMAF uses temporal consistency in the reconstruction (the "motion" component is calculated from the source), so that particular concern ("frame throbbing" - i.e. keyframe pulsing or grain/textured-background tearing) I agree with.

[edit] Actually, I have to hedge a little here, since I'm not 100% sure VIF (another VMAF component) has a temporal component to it. I don't think it does but I'm not 100% sure. [/edit]
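The two pooled values quoted above can be checked with a few lines of Python. This is the plain textbook harmonic mean from the standard library; actual VMAF tooling may pool slightly differently, so treat it as a check of the arithmetic only.

```python
from statistics import fmean, harmonic_mean

# Per-frame scores from the quoted example.
scores = [70, 65, 60, 60, 65, 60, 60]

arithmetic = fmean(scores)        # ordinary average
harmonic = harmonic_mean(scores)  # pulled towards the lower scores

print(f"arithmetic mean: {arithmetic:.2f}")
print(f"harmonic mean:   {harmonic:.2f}")
```

The harmonic mean comes out lower than the average, but only slightly for a spread this small, so the choice of pooling makes little difference in this particular example.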

Since we're on the subject, here's some more of my personal concerns about VMAF:
* it's luma-only;
* AQ (x264/5) or SAO (x265) appear to have a negative impact on vmaf score, which is inconsistent with the reported visual results. I do have more detailed thoughts on this but let's leave that for some other time;
* the actual MOS/VMAF correlation depends very strongly on the viewing environment and therefore on the used model file, but most people simply use the default model without knowing what viewing environment it represents.

Just to be clear, I'm not trying to talk badly about VMAF, I think it's a great tool, it's better than the alternatives and it's fantastic that they opensourced the library as well as the models so that we can learn and understand how it works and constructively critique it. Hopefully, over time, that will make it even better, which should be the ultimate goal.

Separately, I also do agree with you that in the end, we should probably make a distinction between codecs optimized using VMAF vs. those that did not. This isn't an excuse to suck at writing encoders or to not use VMAF when writing encoders, but at the end of the day, we have to acknowledge that as with any metric, we're assuming a perfect correlation between our metric-of-the-day and the visual experience (or MOS score). That correlation will in practice always be imperfect, and therefore tuning towards/using that metric needs to be done with care and with visual confirmation (otherwise cue up the incoming VMAF artifacts - I wonder what they will look like?).

MoSal · 30th January 2019, 21:57

@Beelzebubu

What's really funny, Netflix will not be using VMAF on their published AOM content as is. Why? Because of film grain synthesis

benwaggoner · 31st January 2019, 06:00

Quote:

Originally Posted by MoSal

@Beelzebubu

What's really funny, Netflix will not be using VMAF on their published AOM content as is. Why? Because of film grain synthesis

Well, if VMAF is used on the reconstructed video, it should be as good as VMAF is at dealing with film grain.

If VMAF is bad at dealing with film grain, they need to address that.

The nice thing about VMAF is that it's really a machine learning framework. They can keep on adding new clips and kinds of encodings and training it to rate those. The big expense is getting the subjective ratings to use as ground-truth data. But VMAF itself can always be as good as the ground truth data from subjective testing.

TD-Linux · 31st January 2019, 06:38

Quote:

Originally Posted by benwaggoner

The nice thing about VMAF is that it's really a machine learning framework. They can keep on adding new clips and kinds of encodings and training it to rate those. The big expense is getting the subjective ratings to use as ground-truth data. But VMAF itself can always be as good as the ground truth data from subjective testing.

The inputs to VMAF itself are the outputs of a bunch of simpler metrics. In that way, the machine-learned part is sort of a "meta-metric". That also means that if the input metrics all respond poorly to film grain, no amount of machine learning is going to be able to make sense of it. I think more work on the input metrics will be needed before VMAF can be used to make film grain decisions.
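The "meta-metric" point can be made concrete with a toy sketch. Everything here is hypothetical: `fuse` is a simple linear stand-in for the learned fusion stage, not VMAF's actual model, and the feature values are invented. The only claim is structural: if the elementary metrics give two encodes the same numbers, no choice of learned weights can separate them.

```python
def fuse(features, weights, bias=0.0):
    """Toy linear stand-in for a learned fusion stage (hypothetical, not VMAF's model)."""
    return bias + sum(w * f for w, f in zip(weights, features))


# Invented elementary-metric outputs for two encodes of the same clip.
# Assume one keeps the film grain and the other smears it, but the
# elementary metrics do not register the difference.
grain_kept = [0.81, 0.92, 4.0]
grain_smeared = [0.81, 0.92, 4.0]

# Try several arbitrary "trained" weight sets: the fused scores always match.
for weights in ([10.0, 50.0, 1.0], [40.0, 20.0, -2.0], [5.0, 5.0, 5.0]):
    kept = fuse(grain_kept, weights)
    smeared = fuse(grain_smeared, weights)
    print(weights, kept == smeared)
```

The same holds for any deterministic fusion function, linear or not, which is why retraining alone cannot add sensitivity that the input metrics lack.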

I don't know what Netflix currently does, but if I were them I would filter the grain from the video, run the VMAF-targeting dynamic optimizer to produce the rate controlled stream, and then add the noise parameters back as a final step.

LigH · 31st January 2019, 08:55

Quote:

Originally Posted by hajj_3

Franhaufer

Fraunhofer

Frau = woman
Hof = yard

TomV · 31st January 2019, 19:27

HEVC Advance standard essential patent owned by GE challenged as likely invalid

benwaggoner · 31st January 2019, 20:33

Quote:

Originally Posted by TD-Linux

The inputs to VMAF itself are the outputs of a bunch of simpler metrics. In that way, the machine-learned part is sort of a "meta-metric". That also means that if the input metrics all respond poorly to film grain, no amount of machine learning is going to be able to make sense of it. I think more work on the input metrics will be needed before VMAF can be used to make film grain decisions.

I don't know what Netflix currently does, but if I were them I would filter the grain from the video, run the VMAF-targeting dynamic optimizer to produce the rate controlled stream, and then add the noise parameters back as a final step.

Good point on the underlying metrics. In particular I think the temporal metric was quite weak. They changed it for the most recent VMAF, but I'm not confident it'll catch all common kinds of visible temporal distortions. Two frames can look equally "good" but switching between them can be terribly jarring. Open GOP and RADL exist in large part to smooth inter-GOP transitions. And that still requires some cleverness around GOP boundaries to do well.

Mr_Khyron · 1st February 2019, 20:13

https://github.com/OpenVisualCloud/SVT-AV1

Quote:

Welcome to the GitHub repo for the SVT-AV1 encoder! To see a list of feature requests and view what is planned for the SVT-AV1 encoder, visit our Trello page: http://bit.ly/SVT-AV1 Help us grow the community by subscribing to our SVT-AV1 mailing list

Quote:

Hardware

The SVT-AV1 Encoder library supports the x86 architecture

CPU Requirements

In order to achieve the performance targeted by the SVT-AV1 Encoder, the specific CPU model listed above would need to be used when running the encoder. Otherwise, the encoder runs on any 5th Generation Intel® Core™ processor, (Intel® Xeon® CPUs, E5-v4 or newer).

RAM Requirements

In order to run the highest resolution supported by the SVT-AV1 Encoder, at least 48GB of RAM is required to run a 4k 10bit stream multi-threading on a 112 logical core system. The SVT-AV1 Encoder application will display an error if the system does not have enough RAM to support this. The following table shows the minimum amount of RAM required for some standard resolutions of 10bit video per stream:
Resolution    Minimum Footprint (GB)
4k            48
1080p         16
720p          8
480p          4

Selur · 1st February 2019, 20:17

so an Intel only encoder?

nevcairiel · 1st February 2019, 20:20

Quote:

Originally Posted by Selur

so an Intel only encoder?

It should be able to run on any AVX2 CPU.

But the entire series of SVT encoders (SVT-HEVC is also a thing) is designed specifically for a use-case of running them on powerful datacenter systems with loads of memory and CPU cores.

benwaggoner · 1st February 2019, 20:47

Quote:

Originally Posted by Mr_Khyron

https://github.com/OpenVisualCloud/SVT-AV1

Wow, that's a LOT of RAM for 4K. But if it's somewhat proportional to number of cores, no biggie. Any 112 logical core system is going to have >> 48 GiB RAM. The biggest c5 instance today is 72 logical threads and 144 GiB RAM.

I don't think there's ever been an encoder that could usefully use anything like 112 cores except via GOP-level parallelism. But hey, it's Intel.

I've not been able to find much detailed documentation about the SVT HEVC or AV1 projects. Do they mean "Scalable Video" ala SVC and SHVC with enhancement layers, mainly used in videoconferencing? Or scalable in the sense of scaling with hardware?

Leveraging the new low-level encoder SDK from Intel offers some interesting potential for very fast initial estimates for encoding, leaving the CPU to focus more on refinement. There isn't an AV1 encoder in the current Intel CPUs, obviously, but perhaps some VP9 functionality added in Kaby/Coffee Lake can be leveraged. Certainly things like weighted prediction and coarse motion vectors could be reused to some degree. SVT HEVC has a full 8-bit HEVC encoder implementation to leverage in Skylake-S+ and 10-bit in Kaby/Coffee.

Unfortunately there aren't any Xeon processors with VP9 encoding yet. The best available is the 8/16 core i9-9900K. I don't see any public roadmap for when AV1 might be added. Ice Lake? I see that has an all new HEVC encoder at least. Although given tape-out schedules and how recently the AV1 bitstream was finalized, a full fixed-function implementation might not be there before Tiger Lake. (all just personal speculation fueled by Wikipedia).

I am very curious to see what comes out of the next generation of GPU-assisted software-defined encoding. Having it all on-die instead avoids the PCI bus latency challenges of past GPU+CPU implementations.

nevcairiel · 1st February 2019, 20:50

Quote:

Originally Posted by benwaggoner

I've not been able to find much detailed documentation about the SVT HEVC or AV1 projects. Do they mean "Scalable Video" ala SVC and SHVC with enhancement layers, mainly used in videoconferencing? Or scalable in the sense of scaling with hardware?

Hardware. It's not producing "scalable video".

TomV · 2nd February 2019, 00:26

Quote:

Originally Posted by benwaggoner

I've not been able to find much detailed documentation about the SVT HEVC or AV1 projects. Do they mean "Scalable Video" ala SVC and SHVC with enhancement layers, mainly used in videoconferencing? Or scalable in the sense of scaling with hardware?

No. Intel bought eBrisk (they already owned a good chunk of eBrisk, thanks to the Altera acquisition, because Altera had invested in eBrisk), and then open sourced their HEVC encoder. Then they started focusing on AV1, and now they've open sourced that encoder. The HEVC encoder is fast, but the video quality is not competitive. I'm not sure if it can beat x264 under equal conditions. It certainly can't beat x265 or Beamr 5 under any conditions.

TomV · 2nd February 2019, 00:26

Quote:

Originally Posted by nevcairiel

Hardware. It's not producing "scalable video".

Not hardware. Software.
