Doom9's Forum > Video Encoding > VP9 and AV1
Old 30th January 2019, 04:39   #1401  |  Link
mandarinka
Registered User
 
 
Join Date: Jan 2007
Posts: 729
Interesting that some of the champions supporting or helping AOM are in the problematic(?) group of unpooled HEVC licensors... you would think these companies support AV1 because they hated that situation.
Old 30th January 2019, 05:55   #1402  |  Link
kuchikirukia
Registered User
 
Join Date: Oct 2014
Posts: 476
Quote:
Originally Posted by TD-Linux View Post
Here's a link to AWCY as shown in Tim's presentation:

https://beta.arewecompressedyet.com/...y-525f981376bd

tl;dr it is better than x264 at every bitrate, but still worse than libvpx VP9. It is also currently about 10x slower than x264, which is blazing fast compared to libaom but still has a lot of room for improvement.
Taking x264 PSNR and SSIM values without setting --tune psnr or --tune ssim is a fail.
Old 30th January 2019, 10:35   #1403  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,342
Quote:
Originally Posted by kuchikirukia View Post
Taking x264 PSNR and SSIM values without setting --tune psnr or --tune ssim is a fail.
You get one encode to compare; do you really want that to be one tuned for PSNR? Because that would be the real fail.
One encode, several metrics. Not re-encoding targeted at each metric.
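As a reminder of what this metric tuning is actually about: PSNR is just log-scaled MSE against the source. A minimal pure-Python sketch for 8-bit samples (illustrative only; real tools compute this over full Y/U/V planes, per frame):

```python
import math

def psnr(ref, enc, peak=255):
    """PSNR in dB between a reference and an encoded sample list."""
    mse = sum((r - e) ** 2 for r, e in zip(ref, enc)) / len(ref)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

ref = [16, 50, 120, 200]
enc = [17, 49, 121, 199]   # every sample off by one -> MSE = 1
print(psnr(ref, enc))      # 20*log10(255), about 48.13 dB
```

This is also why the tuning matters: --tune psnr disables psychovisual optimizations that raise MSE, so an encode tuned for viewing tends to score worse on this particular number.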
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 30th January 2019, 10:42   #1404  |  Link
TD-Linux
Registered User
 
Join Date: Aug 2015
Posts: 34
Quote:
Originally Posted by kuchikirukia View Post
Taking x264 PSNR and SSIM values without setting --tune psnr or --tune ssim is a fail.
There are more metrics than those at the link, but fair. I compared against x264 --tune psnr as well, and it still beats x264 at PSNR:

https://beta.arewecompressedyet.com/...y-525f981376bd
Old 30th January 2019, 12:29   #1405  |  Link
MoSal
Registered User
 
Join Date: Jun 2013
Posts: 95
Quote:
Originally Posted by benwaggoner View Post
In particular I worry that VMAF is insufficiently sensitive to temporal shifts in video quality. A VMAF of 70, 65, 60, 60, 65, 60, 60 might come out as a nice "VMAF=65.3" but be annoying to watch. Frame strobing was a weakness of libvpx.
That's not really a problem. Per-frame data is available (I wrote this mostly in a couple of hours). It's even available for multiple metrics, which is nice.

The real problem is that VMAF is not that good. It, for example, spectacularly fails with samples that greatly benefit from AQ (yes, I know you already hinted at this).
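For example, libvmaf can emit per-frame scores as JSON, which makes dips visible instead of letting them average away. The schema below ("frames" -> "metrics" -> "vmaf") is what I'd expect from recent libvmaf builds, but check your version's actual output:

```python
import json

# Hypothetical per-frame output in the shape recent libvmaf builds emit.
sample = '''{"frames": [
  {"frameNum": 0, "metrics": {"vmaf": 70.0}},
  {"frameNum": 1, "metrics": {"vmaf": 65.0}},
  {"frameNum": 2, "metrics": {"vmaf": 60.0}}
]}'''

frames = json.loads(sample)["frames"]
scores = [f["metrics"]["vmaf"] for f in frames]
worst = min(range(len(scores)), key=scores.__getitem__)
print(f"mean={sum(scores) / len(scores):.2f}, worst frame {worst} at {scores[worst]}")
```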
__________________
https://github.com/MoSal
Old 30th January 2019, 13:57   #1406  |  Link
jonatans
Registered User
 
Join Date: Oct 2017
Posts: 56
Quote:
Originally Posted by TomV View Post
Thanks for posting an update, Jonatan. Your efforts on this, and in the MC-IF, are really appreciated.
Thank you Tom. And thanks for providing additional context to this interesting and complicated matter.

Quote:
Originally Posted by hajj_3 View Post
I think i read that Franhaufer sold their HEVC patents to General Electric, if true you should remove Franhaufer from your diagram.
This is correct. But my understanding is that Fraunhofer did not sell all their HEVC patents. They are listed as a licensor in HEVC Advance. In the latest patent list from HEVC Advance there are two Fraunhofer patents listed: https://www.hevcadvance.com/pdfnew/H...nuary-2019.pdf
__________________
Jonatan Samuelsson
Co-founder and CEO at Divideon

www.divideon.com | xvc.io
Old 30th January 2019, 17:56   #1407  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 109
Quote:
Originally Posted by benwaggoner View Post
In particular I worry that VMAF is insufficiently sensitive to temporal shifts in video quality. A VMAF of 70, 65, 60, 60, 65, 60, 60 might come out as a nice "VMAF=65.3" but be annoying to watch. Frame strobing was a weakness of libvpx.
First and foremost: yes! It's great to see some technical and independent thinking about how good VMAF really is.

Netflix uses "hVMAF" as the official notation in their charts. The "h" stands for "harmonic": it pools per-frame scores with the harmonic mean, which biases towards the least favourable values. So in your example, the harmonic mean would be 62.66, whereas the average would be 62.86. Neither of these is 65.3. So I'm personally not as concerned about the averaging mechanism aspect of your concern. (In the CLI, use --pool harmonic_mean or something similar, depending on which exact tool you use.) On the other hand, I don't believe that VMAF uses temporal consistency in the reconstruction (the "motion" component is calculated from the source), so that particular concern ("frame strobing", i.e. keyframe pulsing or grain/textured-background tearing) I agree with.
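The two pooled figures can be reproduced in a couple of lines from the per-frame sequence in the quoted example:

```python
# Per-frame VMAF scores from the quoted example.
scores = [70, 65, 60, 60, 65, 60, 60]

arith = sum(scores) / len(scores)                 # plain average
harm = len(scores) / sum(1 / s for s in scores)   # harmonic mean

print(f"arithmetic={arith:.2f} harmonic={harm:.2f}")  # 62.86 vs 62.66
```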

[edit] Actually, I have to hedge a little here, since I'm not 100% sure whether VIF (another VMAF component) has a temporal component to it. I don't think it does. [/edit]

Since we're on the subject, here's some more of my personal concerns about VMAF:
* it's luma-only;
* AQ (x264/x265) and SAO (x265) appear to have a negative impact on the VMAF score, which is inconsistent with the reported visual results. I do have more detailed thoughts on this but let's leave that for some other time;
* the actual MOS/VMAF correlation depends very strongly on the viewing environment and therefore on the model file used, but most people simply use the default model without knowing what viewing environment it represents.

Just to be clear, I'm not trying to talk badly about VMAF. I think it's a great tool, it's better than the alternatives, and it's fantastic that they open-sourced the library as well as the models so that we can learn and understand how it works and constructively critique it. Hopefully, over time, that will make it even better, which should be the ultimate goal.

Separately, I also agree with you that in the end, we should probably make a distinction between codecs optimized using VMAF vs. those that were not. This isn't an excuse to suck at writing encoders or to not use VMAF when writing encoders, but at the end of the day, we have to acknowledge that, as with any metric, we're assuming a perfect correlation between our metric-of-the-day and the visual experience (or MOS score). That correlation will in practice always be imperfect, and therefore tuning towards/using that metric needs to be done with care and with visual confirmation (otherwise cue up the incoming VMAF artifacts; I wonder what they will look like?).

Last edited by Beelzebubu; 30th January 2019 at 22:11.
Old 30th January 2019, 21:57   #1408  |  Link
MoSal
Registered User
 
Join Date: Jun 2013
Posts: 95
@Beelzebubu

What's really funny is that Netflix will not be using VMAF on their published AOM content as-is. Why? Because of film grain synthesis.
__________________
https://github.com/MoSal
Old 31st January 2019, 06:00   #1409  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by MoSal View Post
@Beelzebubu

What's really funny is that Netflix will not be using VMAF on their published AOM content as-is. Why? Because of film grain synthesis.
Well, if VMAF is used on the reconstructed video, it should be as good as VMAF is at dealing with film grain.

If VMAF is bad at dealing with film grain, they need to address that.

The nice thing about VMAF is that it's really a machine learning framework. They can keep adding new clips and kinds of encodings and training it to rate those. The big expense is getting the subjective ratings to use as ground-truth data. But VMAF itself can always be as good as the ground-truth data from subjective testing.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 31st January 2019, 06:38   #1410  |  Link
TD-Linux
Registered User
 
Join Date: Aug 2015
Posts: 34
Quote:
Originally Posted by benwaggoner View Post
The nice thing about VMAF is that it's really a machine learning framework. They can keep adding new clips and kinds of encodings and training it to rate those. The big expense is getting the subjective ratings to use as ground-truth data. But VMAF itself can always be as good as the ground-truth data from subjective testing.
The inputs to VMAF itself are the outputs of a bunch of simpler metrics. In that way, the machine-learned part is sort of a "meta-metric". That also means that if the input metrics all respond poorly to film grain, no amount of machine learning is going to be able to make sense of it. I think more work on the input metrics will be needed before VMAF can be used to make film grain decisions.
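The "meta-metric" structure can be sketched as a toy regression: elementary metric outputs in, predicted opinion score out. Everything below is synthetic and illustrative; VMAF's actual model is (as far as I know) an SVM regression over features like VIF, DLM and a motion score:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Made-up "VIF" scores for four clips and the MOS a viewing panel gave them.
vif = [0.50, 0.65, 0.80, 0.95]
mos = [2.0, 3.0, 4.0, 5.0]   # perfectly linear on purpose

a, b = fit_line(vif, mos)

def predict(x):
    return a * x + b

# The fitted model can only be as good as its inputs: if the feature metrics
# are blind to film grain, no amount of fitting recovers that information.
```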

I don't know what Netflix currently does, but if I were them I would filter the grain from the video, run the VMAF-targeting dynamic optimizer to produce the rate controlled stream, and then add the noise parameters back as a final step.
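A sketch of that grain-separation idea, with deliberately toy stand-ins (1-D "frames", a moving-average "denoiser", a variance-based "grain model"); every function name here is made up for illustration, not taken from any real pipeline:

```python
def split_grain(frame, radius=1):
    """Split a 1-D 'frame' into a smoothed base layer and a grain residual."""
    base = []
    for i in range(len(frame)):
        lo, hi = max(0, i - radius), min(len(frame), i + radius + 1)
        base.append(sum(frame[lo:hi]) / (hi - lo))
    return base, [x - b for x, b in zip(frame, base)]

def grain_params(grain):
    """Stand-in for estimating AV1 film grain synthesis parameters."""
    return {"strength": sum(g * g for g in grain) / len(grain)}

frame = [100, 104, 98, 102, 99, 103]
base, grain = split_grain(frame)
params = grain_params(grain)
# Run the VMAF-targeting optimizer on `base` only, then signal `params`
# so the decoder re-synthesizes grain -- the grain itself never has to
# survive (or distort) the rate-controlled encode.
```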
Old 31st January 2019, 08:55   #1411  |  Link
LigH
German doom9/Gleitz SuMo
 
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,752
Quote:
Originally Posted by hajj_3 View Post
Franhaufer
Fraunhofer

Frau = woman
Hof = yard
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
Old 31st January 2019, 19:27   #1412  |  Link
TomV
VP Eng, Kaleidescape
 
Join Date: Jan 2018
Location: Mt View, CA
Posts: 51
HEVC Advance standard essential patent owned by GE challenged as likely invalid
Old 31st January 2019, 20:33   #1413  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by TD-Linux View Post
The inputs to VMAF itself are the outputs of a bunch of simpler metrics. In that way, the machine-learned part is sort of a "meta-metric". That also means that if the input metrics all respond poorly to film grain, no amount of machine learning is going to be able to make sense of it. I think more work on the input metrics will be needed before VMAF can be used to make film grain decisions.

I don't know what Netflix currently does, but if I were them I would filter the grain from the video, run the VMAF-targeting dynamic optimizer to produce the rate controlled stream, and then add the noise parameters back as a final step.
Good point on the underlying metrics. In particular I think the temporal metric was quite weak. They changed it for the most recent VMAF, but I'm not confident it'll catch all common kinds of visible temporal distortions. Two frames can look equally "good" but switching between them can be terribly jarring. Open GOP and RADL exist in large part to smooth inter-GOP transitions. And that still requires some cleverness around GOP boundaries to do well.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 1st February 2019, 20:13   #1414  |  Link
Mr_Khyron
Member
 
 
Join Date: Nov 2002
Posts: 203
SVT-AV1 encoder!

https://github.com/OpenVisualCloud/SVT-AV1
Quote:
Welcome to the GitHub repo for the SVT-AV1 encoder! To see a list of feature requests and view what is planned for the SVT-AV1 encoder, visit our Trello page: http://bit.ly/SVT-AV1 Help us grow the community by subscribing to our SVT-AV1 mailing list.

Quote:
Hardware

The SVT-AV1 Encoder library supports the x86 architecture

CPU Requirements

In order to achieve the performance targeted by the SVT-AV1 Encoder, the specific CPU model listed above would need to be used when running the encoder. Otherwise, the encoder runs on any 5th Generation Intel® Core™ processor, (Intel® Xeon® CPUs, E5-v4 or newer).

RAM Requirements

In order to run the highest resolution supported by the SVT-AV1 Encoder, at least 48GB of RAM is required to run a 4k 10bit stream multi-threading on a 112 logical core system. The SVT-AV1 Encoder application will display an error if the system does not have enough RAM to support this. The following table shows the minimum amount of RAM required for some standard resolutions of 10bit video per stream:
Resolution | Minimum footprint (GB)
-----------|-----------------------
4k         | 48
1080p      | 16
720p       | 8
480p       | 4
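The table reads as a simple per-stream lookup; here it is as a helper. The figures come straight from the quoted README; the helper name and the linear scaling by stream count are my own assumptions:

```python
# Minimum RAM per 10-bit stream, from the SVT-AV1 README table above.
MIN_RAM_GB = {"4k": 48, "1080p": 16, "720p": 8, "480p": 4}

def min_ram_gb(resolution, streams=1):
    """Look up the README's minimum footprint, scaled by stream count."""
    try:
        return MIN_RAM_GB[resolution.lower()] * streams
    except KeyError:
        raise ValueError(f"no figure listed for {resolution!r}")

print(min_ram_gb("4k"))        # 48
print(min_ram_gb("1080p", 2))  # 32
```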
Old 1st February 2019, 20:17   #1415  |  Link
Selur
Registered User
 
 
Join Date: Oct 2001
Location: Germany
Posts: 7,259
So, an Intel-only encoder?
__________________
Hybrid here in the forum, homepage
Old 1st February 2019, 20:20   #1416  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,342
Quote:
Originally Posted by Selur View Post
So, an Intel-only encoder?
It should be able to run on any AVX2 CPU.

But the entire series of SVT encoders (SVT-HEVC is also a thing) is designed specifically for the use case of running on powerful datacenter systems with loads of memory and CPU cores.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 1st February 2019, 20:47   #1417  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by Mr_Khyron View Post
Wow, that's a LOT of RAM for 4K. But if it's somewhat proportional to the number of cores, no biggie. Any 112-logical-core system is going to have >> 48 GiB RAM. The biggest c5 instance today is 72 logical threads and 144 GiB RAM.

I don't think there's ever been an encoder that could usefully use anything like 112 cores except via GOP-level parallelism. But hey, it's Intel.

I've not been able to find much detailed documentation about the SVT HEVC or AV1 projects. Do they mean "Scalable Video" ala SVC and SHVC with enhancement layers, mainly used in videoconferencing? Or scalable in the sense of scaling with hardware?

Leveraging the new low-level encoder SDK from Intel offers some interesting potential for very fast initial estimates for encoding, leaving the CPU to focus more on refinement. There isn't an AV1 encoder in the current Intel CPUs, obviously, but perhaps some VP9 functionality added in Kaby/Coffee Lake can be leveraged. Certainly things like weighted prediction and coarse motion vectors could be reused to some degree. SVT HEVC has a full 8-bit HEVC encoder implementation to leverage in Skylake-S+ and 10-bit in Kaby/Coffee.

Unfortunately there aren't any Xeon processors with VP9 encoding yet. The best available is the 8/16 core i9-9900K. I don't see any public roadmap for when AV1 might be added. Ice Lake? I see that has an all-new HEVC encoder at least. Although given tape-out schedules and how recently the AV1 bitstream was finalized, a full fixed-function implementation might not be there before Tiger Lake. (All just personal speculation fueled by Wikipedia.)

I am very curious to see what comes out of the next generation of GPU-assisted software-defined encoding. Having it all on-die avoids the PCIe bus latency challenges of past GPU+CPU implementations.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 1st February 2019, 20:50   #1418  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,342
Quote:
Originally Posted by benwaggoner View Post
I've not been able to find much detailed documentation about the SVT HEVC or AV1 projects. Do they mean "Scalable Video" ala SVC and SHVC with enhancement layers, mainly used in videoconferencing? Or scalable in the sense of scaling with hardware?
Hardware. It's not producing "scalable video".
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 2nd February 2019, 00:26   #1419  |  Link
TomV
VP Eng, Kaleidescape
 
Join Date: Jan 2018
Location: Mt View, CA
Posts: 51
Quote:
Originally Posted by benwaggoner View Post
I've not been able to find much detailed documentation about the SVT HEVC or AV1 projects. Do they mean "Scalable Video" ala SVC and SHVC with enhancement layers, mainly used in videoconferencing? Or scalable in the sense of scaling with hardware?
No. Intel bought eBrisk (they already owned a good chunk of eBrisk, thanks to the Altera acquisition, because Altera had invested in eBrisk), and then open-sourced their HEVC encoder. Then they started focusing on AV1, and now they've open-sourced that encoder. The HEVC encoder is fast, but the video quality is not competitive. I'm not sure if it can beat x264 under equal conditions. It certainly can't beat x265 or Beamr 5 under any conditions.
Old 2nd February 2019, 00:26   #1420  |  Link
TomV
VP Eng, Kaleidescape
 
Join Date: Jan 2018
Location: Mt View, CA
Posts: 51
Quote:
Originally Posted by nevcairiel View Post
Hardware. It's not producing "scalable video".
Not hardware. Software.