Old 5th August 2021, 12:05   #1  |  Link
birdie
Artem S. Tashkinov
 
 
Join Date: Dec 2006
Posts: 337
AV1 successor: AV2

Some news about AV2: https://ottverse.com/av2-video-codec-evaluation/
Old 5th August 2021, 23:42   #2  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Some promising initial results. I appreciate the simplification work for hardware decoders; AV1 is rather a beast to implement in low-cost hardware.

Building off the libaom encoder makes for an enormously faster test encoder for this stage of development compared to MPEG codecs, which is very helpful for rapid iteration of experiments and testing a wide variety of content and scenarios.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 19th August 2021, 18:41   #3  |  Link
BlueLane
Registered User
 
Join Date: Jan 2021
Posts: 2
Ben, what about hardware encoding of AV1? Is that also a beast?

I wonder if AV2 simplifies hardware encoding over AV1.

The use case I'm thinking of most is security cameras, like the Ring systems, as well as more commercial/industrial systems. This means real-time encoding, not the kind of optimized, curated encoding Netflix or Prime Video would take its time with.

And it also means lower framerates in many cases. A default frame rate when no motion is detected might be something like 3 fps, just there as a backup to capture events that somehow didn't trigger the motion detectors. When the motion detectors fire, then the camera would flip into something like 15 or 20 fps, still lower than standard 30 fps video or 24 fps film. The motivation is to conserve storage capacity, since nothing eats storage like video, and higher framerates aren't necessary for surveillance.

I wonder what the codec design implications are for such a use case, and if any of the newer codecs contemplate these reduced framerate scenarios. The "trickle" or standby rate of ≈3 fps is interesting because it will almost always consist of invariant video, of just frame after frame of the same thing, with perhaps slight movement of things like leaves, grass, flags, etc. due to wind. Do most codecs handle invariance well? I assume that any kind of delta encoding would deliver huge compression ratios.
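To put a rough number on that delta-encoding assumption, here is a minimal sketch, not a real codec. It assumes a synthetic noise frame, one small changed patch standing in for wind-blown leaves, and `zlib` standing in for an entropy coder. The names `make_frame`, `perturb`, `delta`, and `apply_delta` are made up for illustration, and random noise is a worst case for intra coding, so the gap is exaggerated.

```python
import random
import zlib


def make_frame(width, height, seed):
    """Build a synthetic 8-bit grayscale frame filled with pseudo-random noise."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(width * height))


def perturb(frame, width, x0, y0, size, seed):
    """Return a copy of the frame with one size x size patch replaced by new values."""
    rng = random.Random(seed)
    out = bytearray(frame)
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            out[y * width + x] = rng.randrange(256)
    return bytes(out)


def delta(prev, cur):
    """Per-pixel difference modulo 256; unchanged pixels become zero."""
    return bytes((c - p) % 256 for p, c in zip(prev, cur))


def apply_delta(prev, diff):
    """Reconstruct the current frame from the previous frame and the delta."""
    return bytes((p + d) % 256 for p, d in zip(prev, diff))


WIDTH, HEIGHT = 160, 120
frame0 = make_frame(WIDTH, HEIGHT, seed=1)
frame1 = perturb(frame0, WIDTH, x0=20, y0=30, size=8, seed=2)

# "Intra" cost: compress the second frame on its own.
intra_size = len(zlib.compress(frame1))
# "Inter" cost: compress only the difference against the previous frame.
delta_size = len(zlib.compress(delta(frame0, frame1)))

print("intra bytes:", intra_size)
print("delta bytes:", delta_size)
```

The delta is almost entirely zeros, so it compresses to a small fraction of the standalone frame, and `apply_delta(frame0, delta(frame0, frame1))` gives back `frame1` exactly. Real camera footage has sensor noise in every frame, which this sketch ignores.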

And the 15-20 fps action streams are more the opposite in that there will be lots of bulk movement in the frame, with maybe only homogeneous perimeters. I'm not sure what the implications are for codecs. Efficiently encoding action is probably their main gig, though they also like to leverage the invariant parts of frames.

I wonder if the reduced framerate would make most codecs less efficient, because there's less interframe duplication per unit time. Given X amount of delivery driver movement in a one-second span, 30 frames should carry less interframe differentiation than 15 or 20 frames. In the latter case, each frame should be less similar to its predecessor than in the 30 fps case, right? That seems to imply less bitrate reduction per second in the 15-20 fps case, compared to 30 fps, as a percentage. How that plays out in terms of raw bitrate figures is interesting – could 30 fps end up costing the same bitrate as 15 or 20 fps with some codecs?
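One way to reason about that last question is a toy model, sketched here under strong assumptions: a one-dimensional frame, a single object moving at constant speed, and "cost" measured simply as the number of pixels that differ from the previous frame. The names `render` and `changed_pixels_per_second` and all the numbers are made up for illustration. Real codecs use motion compensation and carry per-frame overhead, both of which this ignores.

```python
def render(width, obj_width, position):
    """Render a 1-D frame: 1 where the object is, 0 for background."""
    return [1 if position <= x < position + obj_width else 0 for x in range(width)]


def changed_pixels_per_second(fps, width=600, obj_width=40, speed=300):
    """Sum the pixels that differ between consecutive frames over one second.

    The object starts at x = 0 and moves `speed` pixels per second.
    """
    total = 0
    prev = render(width, obj_width, 0)
    for i in range(1, fps + 1):
        position = speed * i // fps
        cur = render(width, obj_width, position)
        total += sum(a != b for a, b in zip(prev, cur))
        prev = cur
    return total


for fps in (3, 15, 20, 30):
    print(fps, "fps ->", changed_pixels_per_second(fps), "changed pixels per second")
```

In this model the per-second total is identical at 15, 20, and 30 fps (600 pixels), because lowering the frame rate raises the per-frame change by the same factor. It only drops at 3 fps (240 pixels), once the object jumps farther than its own width between frames. Real encoders will not behave this neatly, so treat it as intuition rather than a prediction.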
Old 27th August 2021, 21:12   #4  |  Link
nhw_pulsar
Registered User
 
Join Date: Apr 2017
Posts: 171
Hi,

Since nobody has answered you so far, I'll try to give you an answer, even though I am not an expert...

Yes, for 3 fps with static, invariant frames (except when there is occasional movement or some wind), I think a custom encoder would be more practical, since simple delta encoding will almost certainly be the most efficient approach in most cases. Is the security camera market waiting for the next AOM/MPEG standard, even though it is not those groups' main target market, or is a special custom encoder conceivable? I ask because I have made an extremely fast codec (NHW) that has competitive results on intra coding. Do you think your industry could look at NHW for a special custom codec?

Concerning AV2, yes, it will be more complex than AV1. As written in the article above, a 5-7% improvement over AV1 is a good start, and as the AOM and Google teams have said many times, there are other experiments underway. They have said that they clearly aim for AV2 to deliver 25-30% more compression than AV1 to be viable and to justify switching to a (slower) codec. Has this objective been scaled back now?

Last edited by nhw_pulsar; 27th August 2021 at 22:29.
Old 2nd September 2021, 19:10   #5  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by nhw_pulsar View Post
Concerning AV2, yes, it will be more complex than AV1. As written in the article above, a 5-7% improvement over AV1 is a good start, and as the AOM and Google teams have said many times, there are other experiments underway. They have said that they clearly aim for AV2 to deliver 25-30% more compression than AV1 to be viable and to justify switching to a (slower) codec. Has this objective been scaled back now?
A 25-30% improvement is really small potatoes for a new codec generation. Historically, we've seen broad, multi-industry adoption of new codecs when they offer a potential ~100% more compression than the prior standard technology.
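As a quick sanity check on these percentages, here is a small sketch. It assumes "X% more compression" means the same quality at 1/(1 + X) of the old bitrate, which is one common reading and not necessarily the one every published figure uses. The function names are made up for illustration.

```python
def bitrate_fraction(compression_gain):
    """Fraction of the old bitrate needed for a given compression gain (1.0 == 100%)."""
    return 1.0 / (1.0 + compression_gain)


def compression_gain(bitrate_saving):
    """Inverse view: compression gain implied by a given bitrate saving (0.5 == 50%)."""
    return 1.0 / (1.0 - bitrate_saving) - 1.0


for gain in (0.25, 0.30, 1.00):
    saving = 1.0 - bitrate_fraction(gain)
    print(f"{gain:.0%} more compression -> {saving:.1%} lower bitrate")
```

Under that reading, 100% more compression means half the bitrate, while 25-30% more compression works out to only about 20-23% lower bitrate. A figure quoted as "30% bitrate reduction" would instead correspond to roughly 43% more compression, so it matters which convention a given number uses.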

MPEG-2 to H.264 offered that kind of improvement, as did H.264 to HEVC, and HEVC to VVC. Codecs that offered ~20-30% improvements, like MPEG-4 Part 2, VC-1, VP3-9, and so far AV1, have never gotten much momentum outside of their primary sponsor companies and their specific ecosystems.

An AV2 30% better than AV1 would still be somewhat short of VVC. It's hard to see it becoming the "default" HW decoder unless VVC's licensing goes horribly pear-shaped.

Even with HEVC's licensing challenges, it has been ubiquitous in pretty much every video-capable SoC, GPU, CPU, etcetera for years now. It's pretty dominant for paid premium content, with user-generated and non-commercial content the main place VP9 and AV1 are used. Big markets, but still a relatively moderate slice of the overall video market pie.
Old 2nd September 2021, 19:57   #6  |  Link
nhw_pulsar
Registered User
 
Join Date: Apr 2017
Posts: 171
OK, but for AV2 to have 30-50% better compression than AV1, AOM will have to invent new breakthrough technology (and it must differ from the new patented technology in VVC). So far we can read that AOM has started to look at promising machine learning and neural network approaches (MPEG has also started this research), but a current problem with AI is that encoding will certainly be slower and, above all, decoding will be much slower. Will the complexity budget explode?

So for a 30-50% improvement over AV1, I think AV2 certainly needs more years of development and won't be ready in the near future?
Old 2nd September 2021, 21:41   #7  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,988
It seems that, near term, the lower-hanging fruit for compression gains will be using AI/ML to make smarter early exits and maybe some image pre-processing (or clever post-processing with side-channel data à la LC-EVC), plus of course good content-adaptive encoding strategies.

I'm not sure how much further we can take block sub-partitioning, motion prediction, etc.
__________________
These are all my personal statements, not those of my employer :)
Old 7th September 2021, 11:29   #8  |  Link
ksec
Registered User
 
Join Date: Mar 2020
Posts: 117
Quote:
Originally Posted by Blue_MiSfit View Post

I'm not sure how much further we can take block sub-partitioning and motion prediction etc
That is what I have been thinking about for quite some time as well. Law of diminishing returns.

What I really want to see is ML and post-processing / LC on top of VVC or AV1, especially LC on top of VVC. Could we squeeze another 20-30% bitrate reduction out of a leading-edge codec?
__________________
Previously iwod
Old 7th September 2021, 22:18   #9  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,988
LC-EVC absolutely allows for this.

I'm not aware of any production implementation, but in my conversations with the V-Nova folks they have implemented LC-EVC on top of VVC and AV1 for testing and got great improvements.
Old 8th September 2021, 15:10   #10  |  Link
Gravitator
Registered User
 
Join Date: May 2014
Posts: 292
A hardware analog of this kind of filter for encoding and decoding already exists: AMD FidelityFX, NVIDIA DLSS.
__________________
Win10x64, Xeon E5450, GTX 750 2GB, DDR3 8GB.

Last edited by Gravitator; 8th September 2021 at 17:36.
Old 8th September 2021, 19:37   #11  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by ksec View Post
That is what I have been thinking about for quite some time as well. Law of diminishing returns.

What I really want to see is ML and post-processing / LC on top of VVC or AV1, especially LC on top of VVC. Could we squeeze another 20-30% bitrate reduction out of a leading-edge codec?
This is exactly the approach MPAI is taking, albeit on top of EVC Baseline. But the concepts and approaches they develop there should be more broadly applicable.

As for the diminishing returns of block-based encoding, people have been predicting that for decades, but we're continuing to see 50% bitrate reductions at equal PSNR each decade, with even bigger psychovisual gains. We can do 4K HDR at a lower ABR than for SD MPEG-2 on DVD!

Maybe we'll hit a wall where a fundamentally new approach will be needed, but probably not for a while. There are lots of proposed block-based enhancements that aren't included in VVC or AV1.

Over the next decades, I can imagine a ML-powered fusion of block based encoding and GPU/VR-style rendering with texture mapping. So object detection and extraction for things that can be synthesized more efficiently than 2D rendered.
Old 14th September 2021, 09:24   #12  |  Link
ksec
Registered User
 
Join Date: Mar 2020
Posts: 117
Quote:
Originally Posted by Blue_MiSfit View Post
LC-EVC absolutely allows for this.

I'm not aware of any production implementation, but in my conversations with the V-Nova folks they have implemented LC-EVC on top of VVC and AV1 for testing and got great improvements.
I really hope they release more details soon. Although this is probably not the thread to be talking about LC and VVC.