View Full Version : Alliance for Open Media codecs


x265_Project
26th September 2017, 19:41
HS : x265_Project, is this someone from your team ? https://bambuser.com/v/6909221#t=205
Yes, that's Pradeep Ramachandran, who runs the x265 development team (also known as pradeeprama (https://forum.doom9.org/member.php?u=223284))

Clare
29th September 2017, 08:44
Evolution of the AV1 codec between October 2016 and July 2017

http://wyohknott.github.io/image-formats-comparison/comparison.html

I'll add another data point after bitstream freeze.

Tommy Carrot
29th September 2017, 16:59
Clare, in your image comparison page, the old and the new AV1 images are completely identical. Other than that, it's a very useful page, much appreciated.

Clare
29th September 2017, 21:39
Clare, in your image comparison page, the old and the new AV1 images are completely identical. Other than that, it's a very useful page, much appreciated.

I don't know how the files got mixed up, but it should work now.

mzso
29th September 2017, 21:43
I don't know how the files got mixed up, but it should work now.

Wow. The new one is even worse than the old one (which is already pretty bad) on that factory image: large, almost flat areas.

easyfab
30th September 2017, 13:56
@clare

Could you add PIK (https://github.com/google/pik) to a future comparison?

Tommy Carrot
30th September 2017, 14:56
Thanks Clare.

Wow. The new one is even worse than the old one (which is already pretty bad) on that factory image: large, almost flat areas.

Unfortunately, I have to agree. AV1 still image encoding has improved in some regards (less ringing), but it smooths a bit too much for my liking. In many cases the old AV1 looks better than the new in my opinion.

IgorC
5th October 2017, 00:27
I've looked at some images from here: http://wyohknott.github.io/image-formats-comparison/
And yes, AV1 and HEVC don't retain fine details as much as Daala, but they have less blocking.

Clare
8th October 2017, 12:59
@clare

Could you add PIK (https://github.com/google/pik) to a future comparison?

Yeah some guy from Mozilla asked for Pik too, but the thing seems to require a lot of CPU power and I just have a mobile CPU.

MoSal
9th October 2017, 19:11
Yeah some guy from Mozilla asked for Pik too, but the thing seems to require a lot of CPU power and I just have a mobile CPU.

@Clare Here you go:
subset1 (https://archive.org/download/unsorted_files/subset1-pik-2017.10.4-ab748ffe0fb62500baaee98be831bae6f4282d57.tar) | subset2 (https://archive.org/download/unsorted_files/subset2-pik-2017.10.4-ab748ffe0fb62500baaee98be831bae6f4282d57.tar)

Encoded with --distance 3 (maximum compression). Images are still larger than the ones used in your current comparison, especially the more compressible ones.

bstrobl
20th October 2017, 20:34
Demuxed 2017 videos are on their Youtube channel (https://www.youtube.com/channel/UCIc_DkRxo9UgUSTvWVNCmpA). Apparently the AV1 still image format will be called AVIF.

hajj_3
21st October 2017, 02:19
https://i.imgur.com/H5tly3v.png

This is a screenshot from the October 5th Demuxed video in the post above this. It's a shame that the x265 version they are comparing against is old, and that there is no x264 to compare to.

benwaggoner
22nd October 2017, 17:12
This is a screenshot from the October 5th Demuxed video in the post above this. It's a shame that the x265 version they are comparing against is old, and that there is no x264 to compare to.
Also, these aren't particularly great metrics for moving images; none of them include any temporal component to detect things like keyframe strobing.

DMOS is the gold standard of course, but is expensive and non-automatable. I would like to see at least VMAF scores, as VMAF looks to be substantially the best objective video metric available, at least for SDR <=1080p and >300 Kbps.

I am baffled by the use of such an old x265 build. Even in placebo it'll run a ton faster than the current AV1 reference implementation. So it'd be trivial to rerun the test at the same time the final tests for AV1 are done for a presentation like this.

Are the different command lines used for the different codecs listed somewhere?

dapperdan
24th October 2017, 14:15
In case anyone else was confused like I was, the still image format based on AV1 called AVIF is mentioned (briefly) in a video mostly on other topics, given by Netflix at the Demuxed conference, not in either of the two videos that mention AV1 in their titles:

https://www.youtube.com/watch?v=PSdhW-R9u6s

birdie
25th October 2017, 08:06
AV1 status report (Oct 20, 2017)

https://www.youtube.com/watch?v=yKEDf5-2sT4

mzso
25th October 2017, 09:26
In case anyone else was confused like I was, the still image format based on AV1 called AVIF is mentioned (briefly) in a video mostly on other topics, given by Netflix at the Demuxed conference, not in either of the two videos that mention AV1 in their titles:

https://www.youtube.com/watch?v=PSdhW-R9u6s

A timecoded link would have been nice. Or a mention of what was said. Not everyone has time for these long-ass videos.

dapperdan
25th October 2017, 13:20
It's at 1:30 through to about 4 minutes, though it pretty much only says "We expect to have a new image format based on [AV1] called AVIF", and at about 3 minutes it makes clear this is just an I-frame of AV1 and aimed towards distribution rather than storage.

dapperdan
25th October 2017, 17:44
AV1 status update talk from Gstreamer conference, very similar content to the Demuxed talk.

https://gstconf.ubicast.tv/videos/av1-the-quest-is-nearly-complete/

dapperdan
25th October 2017, 17:47
A PhD-level internship for working on AV1 at Mozilla is open for applications:

https://careers.mozilla.org/position/gh/881961

bstrobl
25th October 2017, 21:34
Hope not too many issues crop up in the reference implementation due to the final rush to get things done.

TD-Linux
27th October 2017, 17:36
I am baffled by the use of such an old x265 build. Even in placebo it'll run a ton faster than the current AV1 reference implementation. So it'd be trivial to rerun the test at the same time the final tests for AV1 are done for a presentation like this.

Are the different command lines used for the different codecs listed somewhere?

Yeah, using the old version is not great (which is why I was sure to list it). I want to provide better x265 numbers before presenting numbers with something closer to the "final" codec (the "AV1" here is missing features too). Note it's probably far from the only issue comparing x265 and AV1: both use very different reference frames and keyframe boosts, so doing a comparison with metrics is quite hazard-fraught. The VP9 numbers are much more comparable. The x265 parameters were:

--preset placebo --no-wpp --bframes 16 --merange 256 --min-keyint 1000 --keyint 1000 --no-scenecut --crf=$x

The bframes and merange options were determined through trial and error to improve x265's metric score. I've run x265 since but haven't tried to re-optimize these on a newer build.

This is done on the AWCY infrastructure. Here are two more recent results, still with the old x265 runs:

https://arewecompressedyet.com/?job=x265-1.9-hl-placebo-nowpp-b16-me256-objective-1-fast-3&job=debargha-adopted-0928%402017-09-28T21%3A43%3A40.468Z

with --tune psnr

https://arewecompressedyet.com/?job=x265-1.9-hl-placebo-nowpp-psnr-objective-1-fast-4&job=debargha-adopted-0928%402017-09-28T21%3A43%3A40.468Z

benwaggoner
27th October 2017, 19:09
Yeah, using the old version is not great (which is why I was sure to list it). I want to provide better x265 numbers before presenting numbers with something closer to the "final" codec (the "AV1" here is missing features too). Note it's probably far from the only issue comparing x265 and AV1: both use very different reference frames and keyframe boosts, so doing a comparison with metrics is quite hazard-fraught. The VP9 numbers are much more comparable. The x265 parameters were:
An apples-to-apples comparison of a bitstream is impossible. All that really is possible is to compare DMOS of an optimal encode from the best available encoders for each format targeting the same scenario.

--preset placebo --no-wpp --bframes 16 --merange 256 --min-keyint 1000 --keyint 1000 --no-scenecut --crf=$x
No IDR frames and no rate control isn't a real-world scenario. But assuming these are short clips

Why a CRF encode? If you want a particular bitrate to compare quality, you should use 2-pass VBR. Iteratively selecting a CRF to get the target file size is just a very slow way to get an identical result.
If you stick with single pass, use --rc-lookahead 250
You should add --cu-lossless to fully exercise x265 and HEVC features. It can improve efficiency with synthetic elements and when visually lossless.
You should add --tskip, which can often improve x265 with fine details.
You should try --subme 7; placebo defaults to 5. I don't know if it'll offer practical improvements, but may well for objective metrics when using --tune for those metrics.
You should try --me sea or even full if you want Full Placebo. They likely would help a bit with PSNR and objective metrics, although it doesn't seem that useful in the real world.
For lower syntax overhead, try --opt-cu-delta-qp.
You should be testing with the current release version of x265 for when you are running the comparison. AV1 isn't going to compete against last year's x265! The competition is between the best available encoders at the time of a comparison.
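
The suggestions in the list above can be collected into one command line. A minimal sketch in Python, assuming the flags are passed as an argument list; the file names and the CRF value are placeholders, and whether each suggested flag is accepted depends on the x265 build in use:

```python
import shlex

# Options from the posted AWCY command line.
BASELINE = [
    "--preset", "placebo", "--no-wpp", "--bframes", "16", "--merange", "256",
    "--min-keyint", "1000", "--keyint", "1000", "--no-scenecut",
]

# Additions suggested in the list above; availability depends on the x265 build.
SUGGESTED = [
    "--rc-lookahead", "250", "--cu-lossless", "--tskip",
    "--subme", "7", "--me", "sea", "--opt-cu-delta-qp",
]


def x265_command(source, output, crf, extra=()):
    """Return the argument list for one CRF encode (nothing is executed)."""
    return ["x265", *BASELINE, *extra, "--crf", str(crf),
            "--input", source, "--output", output]


# Hypothetical file names and CRF value, for illustration only.
cmd = x265_command("clip.y4m", "clip.hevc", 28, SUGGESTED)
print(shlex.join(cmd))
```

Adding the extra options one at a time (passing a shorter `extra` list) makes it easier to see which of them actually moves the metrics.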


The bframes and merange options were determined through trial and error to improve x265's metric score. I've run x265 since but haven't tried to re-optimize these on a newer build.
Those should be fine, albeit super slow. Even placebo only uses merange of 92. Tskip is going to offer a lot better quality @ speed improvement.

Also, none of the metrics I've seen published have particularly good correlation to DMOS. If you can't do a real subjective test, you should at least include VMAF, which is definitely the best available objective metric today.

This is done on the AWCY infrastructure. Here are two more recent results, still with the old x265 runs:

https://arewecompressedyet.com/?job=x265-1.9-hl-placebo-nowpp-b16-me256-objective-1-fast-3&job=debargha-adopted-0928%402017-09-28T21%3A43%3A40.468Z

with --tune psnr

https://arewecompressedyet.com/?job=x265-1.9-hl-placebo-nowpp-psnr-objective-1-fast-4&job=debargha-adopted-0928%402017-09-28T21%3A43%3A40.468Z
If you are testing SSIM, you need to use --tune ssim and --rd-ssim. --tune PSNR generally gives worse results with SSIM than not using any --tune, as your results show.

But, really, these results don't matter with an old version. There has been a ton of tuning since that x265 build. Ones relevant to placebo 8-bit SDR testing include:

Support for non-IDR I-frames (very relevant with --no-keyint)
new lambda tables
A bunch of syntax simplification tools

There is a temptation to assume there exists a way to compare bitstreams outside of their encoders, but there simply isn't. Which is why I recommend picking a real world scenario and tuning each encoder as best as possible for that scenario. Ignoring rate control, keyframes, and subjective quality excludes critical features always used in real-world encoding, and results in a less relevant real-world test.

And bitrates need to be used where all streams to be tested show some degree of compression artifacting. If any stream is visually lossless for any significant period, we don't know if the bitrate is higher than is needed to be visually lossless. That tends to underestimate the advantage of the more efficient encoder.

Two tests I like are:

VBR:
2-pass VBR
peak bitrate 2x average bitrate
VBV set to the max of the lowest Profile @ Level for the content frame size and fps
Open GOP with adaptive duration of a maximum of 5 seconds


CBR:
1-pass CBR
VBV set to 2x bitrate
Fixed 2 second Closed GOP
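
One possible reading of the two test scenarios above as x265 options, sketched in Python. This is an interpretation, not a tested recipe: "VBV set to 2x bitrate" is assumed to mean a buffer of two seconds at the target rate, the Profile @ Level buffer size is left as a caller-supplied number rather than looked up, and all bitrates in the example are hypothetical.

```python
def vbr_args(avg_kbps, fps, level_vbv_kbit, pass_no):
    """2-pass VBR: peak at 2x average, open GOP of at most 5 seconds."""
    return [
        "--pass", str(pass_no),
        "--bitrate", str(avg_kbps),
        "--vbv-maxrate", str(2 * avg_kbps),
        "--vbv-bufsize", str(level_vbv_kbit),  # assumed: taken from the level limits
        "--open-gop",
        "--keyint", str(round(5 * fps)),       # scenecut detection stays enabled
    ]


def cbr_args(kbps, fps):
    """1-pass CBR: maxrate equals bitrate, fixed 2-second closed GOP."""
    gop = round(2 * fps)
    return [
        "--bitrate", str(kbps),
        "--vbv-maxrate", str(kbps),
        "--vbv-bufsize", str(2 * kbps),        # assumed meaning of "2x bitrate"
        "--no-open-gop",
        "--keyint", str(gop), "--min-keyint", str(gop), "--no-scenecut",
    ]


# Hypothetical numbers: 3000 kbps target, 20000 kbit level buffer.
vbr_second_pass = vbr_args(3000, 24, 20000, pass_no=2)
cbr = cbr_args(3000, 25)
print(" ".join(vbr_second_pass))
print(" ".join(cbr))
```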


The minimum viable objective metric is VMAF, in my opinion. None of the ones currently used for this comparison include temporal comparisons (needed to catch keyframe strobing and grain swirling!), and all have lots of known defects resulting in poor subjective comparisons. Plus the average of squared errors of individual frames does nothing to pick up on quality variation inside of a clip. A clip that is 3 seconds great and 3 seconds awful can have a better PSNR and SSIM than a clip that is consistently mediocre, but consistently mediocre is a much better subjective experience. I understand that is why rate control is deactivated in codec comparisons, but that itself assumes that QP==constant quality, when we know it isn't (which is why CRF was invented).
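
To make the "3 seconds great, 3 seconds awful" point concrete, here is a small sketch with invented numbers. It assumes the clip score is the arithmetic mean of per-frame PSNR in dB; the frame rate and the PSNR values are made up for illustration and do not come from any real encode.

```python
from statistics import mean, pstdev

FPS = 25  # hypothetical frame rate

# Clip A: 3 seconds great, then 3 seconds awful. Clip B: 6 seconds mediocre.
uneven = [45.0] * (3 * FPS) + [27.0] * (3 * FPS)
steady = [34.0] * (6 * FPS)


def summarize(per_frame_psnr):
    """Pool per-frame PSNR values the way a simple clip-level report would."""
    return {
        "mean": mean(per_frame_psnr),
        "min": min(per_frame_psnr),
        "stdev": pstdev(per_frame_psnr),
    }


uneven_stats = summarize(uneven)
steady_stats = summarize(steady)
print("uneven:", uneven_stats)
print("steady:", steady_stats)
```

The uneven clip wins on mean PSNR (36 dB against 34 dB), yet its worst frames are 7 dB below anything in the steady clip. A single averaged score hides exactly the variation described above; reporting the minimum or the spread alongside the mean at least exposes it.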

The better the metrics being used, the more relevant the comparison. And it is also helpful for designers of new bitstreams and encoders to get them focused on optimizing the stuff that really matters. The whole approach of squared-error PSNR and no rate control biases codec development toward codecs where fixed QP optimizes for high mean PSNR per frame. And that just isn't something that matters in a real-world encoder with rate control and adaptive quantization, watched by humans.

It is illustrative to encode with --tune psnr and see the ways that subjective quality degrades when optimizing for that metric. Or just using fixed QP.

TD-Linux
27th October 2017, 23:32
An apples-to-apples comparison of a bitstream is impossible. All that really is possible is to compare DMOS of an optimal encode from the best available encoders for each format targeting the same scenario.

Agreed. I've set up some infrastructure that makes it possible to crowdsource this as well, so I'd like to do so once AV1's done.


No IDR frames and no rate control isn't a real-world scenario. But assuming these are short clips
Yup, they are only 60 frames long and intentionally don't have scenecuts. This is done primarily for performance reasons (it also approximately matches the HEVC test conditions, though whether that should be a goal is questionable). Note that the primary use case of this platform is to quickly iterate small changes - it is a slight abuse to be doing cross codec comparisons...

Why a CRF encode?

We're calculating bd-rate, which plots several rate points on a curve and integrates the area difference between them to get an "average" change in rate. Precisely hitting a rate target isn't important. In fact, using VBR is counterproductive as it often adds noise to the results for small changes, and also x265 and libaom's VBR modes aren't really comparable. It might make more sense to use VBR modes for visual comparisons, though (with careful consideration of the buffer models). And possibly with some objective metrics as well, but I think libaom's rate controller will need to be rewritten to have a more x265-like mode before this really produces interpretable results.
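
For readers unfamiliar with bd-rate, the idea described above can be sketched as follows. This is a simplified illustration, not the AWCY implementation: it interpolates log(bitrate) linearly between the measured points instead of using the usual polynomial fit, and the rate/quality points are invented.

```python
import math


def _log_rate_at(quality, qualities, log_rates):
    """Linearly interpolate log(bitrate) at a given quality score."""
    for i in range(len(qualities) - 1):
        q0, q1 = qualities[i], qualities[i + 1]
        if q0 <= quality <= q1:
            t = (quality - q0) / (q1 - q0)
            return log_rates[i] + t * (log_rates[i + 1] - log_rates[i])
    raise ValueError("quality outside the sampled range")


def _prepare(points):
    """Sort (bitrate, quality) points by bitrate; quality must keep rising."""
    pts = sorted(points)
    qualities = [q for _, q in pts]
    if any(b <= a for a, b in zip(qualities, qualities[1:])):
        raise ValueError("quality must increase strictly with bitrate")
    return qualities, [math.log(r) for r, _ in pts]


def bd_rate(anchor, test, steps=1000):
    """Average bitrate change (%) of `test` against `anchor` at equal quality."""
    qa, la = _prepare(anchor)
    qt, lt = _prepare(test)
    lo, hi = max(qa[0], qt[0]), min(qa[-1], qt[-1])
    if hi <= lo:
        raise ValueError("curves share no quality range")
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * width  # midpoint rule over the shared range
        total += _log_rate_at(q, qt, lt) - _log_rate_at(q, qa, la)
    return (math.exp(total / steps) - 1.0) * 100.0


# Invented points: the test encoder needs 20% fewer bits at every quality.
anchor = [(1000, 30.0), (2000, 33.0), (4000, 36.0), (8000, 39.0)]
test = [(rate * 0.8, quality) for rate, quality in anchor]
print(f"{bd_rate(anchor, test):.2f}%")  # about -20%
```

Because the result is an integral over the shared quality range, each encode only has to land somewhere on the curve, which is why hitting an exact rate target is not important here.
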
You should add --cu-lossless to fully exercise x265 and HEVC features. It can improve efficiency with synthetic elements and when visually lossless.
Noted. There are a couple of clips in the test set that might benefit.

You should add --tskip <snip>

Noted, I'll try iteratively adding them to the current release (previously I've found that some of these options were less well tested and didn't always offer improvements).


Also, none of the metrics I've seen published have particularly good correlation to DMOS. If you can't do a real subjective test, you should at least include VMAF, which is definitely the best available objective metric today.

I've actually added it already. The main issue I've run into is that VMAF tends to saturate quickly at higher qualities and actually goes down slightly, which causes bd-rate to become uncomputable. I need to come up with some sort of solution to this. Select VMAF and individual videos at this link to see the problem: https://arewecompressedyet.com/?job=x265-2.5-hl-placebo-nowpp-normal-tuning-objective-1-fast%402017-09-23T17%3A06%3A32.304Z
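
The saturation problem can be illustrated with a few lines of Python. The scores below are invented, and the truncation shown is only one conceivable workaround, not something AWCY is known to do: it keeps the rate points up to the first one where the score stops rising, so the remaining curve is strictly increasing and can be integrated.

```python
def increasing_prefix(points):
    """Keep (bitrate, score) points until the score stops rising with bitrate."""
    kept = []
    for rate, score in sorted(points):
        if kept and score <= kept[-1][1]:
            break  # the curve has saturated or dipped; drop the rest
        kept.append((rate, score))
    return kept


# Invented VMAF-like scores that flatten out and dip slightly at high rates.
curve = [(500, 80.0), (1000, 92.0), (2000, 97.5), (4000, 97.3), (8000, 97.4)]
usable = increasing_prefix(curve)
print(usable)
```

The obvious cost is that the high-rate end of the curve is thrown away, so two encoders end up compared only over the range where the metric still distinguishes them.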

I also do want to warn that I've seen VMAF give rather unexpected results. For example, AV1's CDEF filter produced clearly superior results in a subjective test but VMAF shows it as several percent worse. In addition, VMAF hates x264 and x265's AQ modes. The greatest strength of VMAF is that it tends to offer an absolute quality measure across different types of video - but locally it's far from perfect.

iwod
30th October 2017, 08:25
Hope not too many issues crop up in the reference implementation due to the final rush to get things done.

Which is exactly what I am feeling now.

But I don't think it matters; at least I hope that is the way the Alliance for Open Media sees it, if they are in it for the long run. AV1 doesn't really need to be perfect. But it should be a statement to the world that it is possible to do a video codec this way.

Then I hope they will start drafting an AV2 spec along with an ecosystem around it.

But given the way Google does things, I am not holding my breath.

benwaggoner
30th October 2017, 17:46
Yup, they are only 60 frames long and intentionally don't have scenecuts. This is done primarily for performance reasons (it also approximately matches the HEVC test conditions, though whether that should be a goal is questionable). Note that the primary use case of this platform is to quickly iterate small changes - it is a slight abuse to be doing cross codec comparisons...
Only 60 frame clips? No scenecut makes perfect sense in that case, but I worry about the broad applicability of the results. The VBV is a huge percentage of the bitrate at those low sizes, which can yield a lot of variability of the size of a compliant stream. And a 2-sec GOP is on the low end of typical usage anyway.

We're calculating bd-rate, which plots several rate points on a curve and integrates the area difference between them to get an "average" change in rate. Precisely hitting a rate target isn't important. In fact, using VBR is counterproductive as it often adds noise to the results for small changes, and also x265 and libaom's VBR modes aren't really comparable.
Perhaps the implementations aren't comparable, but the problem to solve is the same: maximum quality given a maximum file size and a maximum VBV. So I would think the results are comparable, even if the implementations are not.

It might make more sense to use VBR modes for visual comparisons, though (with careful consideration of the buffer models). And possibly with some objective metrics as well, but I think libaom's rate controller will need to be rewritten to have a more x265-like mode before this really produces interpretable results.
That is a core point. Are we comparing the ability of the bitstream or of the best reference encoder? I'd argue that doing the former is impossible to do well with high precision. But I can see the value in getting a snapshot estimate.

I've actually added it already. The main issue I've run into is that VMAF tends to saturate quickly at higher qualities and actually goes down slightly, which causes bd-rate to become uncomputable. I need to come up with some sort of solution to this.
If perceptual quality is saturating, then VMAF is yielding appropriate results. From a DMOS perspective, the closer we are to visually lossless, the smaller the relative efficiency differences between encoders become. We would expect meaningful distortion to flatten out at a certain point. It is kind of ridiculous that PSNR -10 to -30 has the same weight as PSNR -60 to -80, since the former is a huge visual quality difference and the latter is invisible.

I also do want to warn that I've seen VMAF give rather unexpected results. For example, AV1's CDEF filter produced clearly superior results in a subjective test but VMAF shows it as several percent worse. In addition, VMAF hates x264 and x265's AQ modes. The greatest strength of VMAF is that it tends to offer an absolute quality measure across different types of video - but locally it's far from perfect.
VMAF isn't perfect. Its test range was relatively limited (1080p down to ~300 Kbps IIRC) and was mainly done with x264. Since it was only trained on x264 style artifacts, types of artifacts only appearing in HEVC and/or AV1 might give weird results.

In theory, the VMAF framework can still be used, with new clips to retrain the ML model. In practice, I think VMAF will need an enhanced temporal component to capture a broader range of codecs and scenarios. In particular, I think the current VMAF will underestimate the perceptual experience hit of keyframe strobing, and thus wouldn't capture Open versus Closed GOP differences.

Overall, I think you're making a very good effort here. This is really hard stuff to do well. Fundamentally, there are lots of real-world things we want to predict from metrics that we simply can't predict well. Even DMOS has its limitations, as it takes so long that the results generally don't reflect the abilities of encoder implementations by the time they are published.

And golden eyes have all kinds of biases. I know I am so trained to pick up on classic forms of encoding issues that I notice things real-world customers almost never will.

We continue to slouch towards mediocrity.

bstrobl
1st November 2017, 11:04
Which is exactly what I am feeling now.

But I don't think it matters; at least I hope that is the way the Alliance for Open Media sees it, if they are in it for the long run. AV1 doesn't really need to be perfect. But it should be a statement to the world that it is possible to do a video codec this way.

Then I hope they will start drafting an AV2 spec along with an ecosystem around it.

But given the way Google does things, I am not holding my breath.

Google seems to be cooperating well enough with the group (along with everyone else, even if they are on opposing sides, e.g. Intel/AMD). AV1 is unlikely to have the same major issues as VP9, but it is slowly looking to become more of a stopgap for AV2 due to time pressure. I am seeing a lot of experiments being moved to AV2, which will hopefully be done right once the pressure to rush things out clears after AV1 is released.

Can't complain about the current state that much however, AV1 is still a very good codec :)

As for the ecosystem, I am hoping for a family of related container formats, something like:
webm - video (AV1/Opus)
webp - picture (AV1 Intra) - throw out VP8
weba - audio (Opus)

It would definitely help to market these codecs and formats a bit more to the general populace. Not sure AVIF is that great to pronounce ;)

hajj_3
1st November 2017, 13:13
As for the ecosystem, I am hoping for a family of related container formats, something like:
webm - video (AV1/Opus)
webp - picture (AV1 Intra) - throw out VP8
weba - audio (Opus)

It would definitely help to market these codecs and formats a bit more to the general populace. Not sure AVIF is that great to pronounce ;)

Using the extension .webp would be a bad idea, as people may try to view a VP8 .webp file in a viewer/browser that only supports AV1 .webp. They should use the extension .avif. AVIF will be good enough for a decade, as image compression isn't going to improve that much. JPEG has been around for over 20 years and AVIF is only going to be ~65% smaller. There isn't any point in making a new image format every ~3 years; there is a good benefit to making new AOM video codecs every 3 years, though.

mzso
1st November 2017, 13:25
I am seeing a lot of experiments being moved to AV2, which will hopefully be done right once the pressure to rush things out clears after AV1 is released.

I will be thoroughly disappointed if AV2 turns out to be another tweakfest for old ideas. I don't think there's a point in creating yet another format based on a similar concept. Development will have ever-diminishing returns anyway.

I think AV2 should be something fundamentally different (for the better) and they'll finish it whenever they do.

there is a good benefit to making new aom video codecs every 3yrs though.

I strongly disagree... In three years they can't even exploit the full potential of a codec format.
It's only good for HW manufacturers, who can resell the same crap, but with a newer HW decoder.

benwaggoner
1st November 2017, 18:51
As for the ecosystem, I am hoping for a family of related container formats, something like:
webm - video (AV1/Opus)
webp - picture (AV1 Intra) - throw out VP8
weba - audio (Opus)

Why not use MPEG-4 program streams for video, HEIF for picture, and MPEG-4 program streams for audio? Dropping in a new codec is a lot easier than full support for a new container format. And we have a LOT of mature muxing, distribution, and playback tech built around MPEG-4 PS.

benwaggoner
1st November 2017, 18:57
Using the extension .webp would be a bad idea, as people may try to view a VP8 .webp file in a viewer/browser that only supports AV1 .webp. They should use the extension .avif. AVIF will be good enough for a decade, as image compression isn't going to improve that much. JPEG has been around for over 20 years and AVIF is only going to be ~65% smaller. There isn't any point in making a new image format every ~3 years; there is a good benefit to making new AOM video codecs every 3 years, though.
Maybe a 65% size reduction for natural images, which are JPEG's sweet spot. But for sharp-edged and synthetic content like text, line art, screen shots, noise-free gradients, etcetera, JPEG is quite bad. I've been able to get 95% reductions for Manga-style art using HEIF HEVC versus JPEG. I imagine AV1 would be in the same rough ballpark.

The goal isn't just to replace JPEG, but also PNG and GIF. HEIF+HEVC is a superset of all of JPEG, PNG, and GIF, and offers better efficiency than any of them for all their scenarios, and adds a lot of additional scenarios as well (like real PQ Rec. 2020 style HDR). An HEIF+AV1 would likely have similar potential.

hajj_3
1st November 2017, 21:04
Maybe a 65% size reduction for natural images, which are JPEG's sweet spot. But for sharp-edged and synthetic content like text, line art, screen shots, noise-free gradients, etcetera, JPEG is quite bad. I've been able to get 95% reductions for Manga-style art using HEIF HEVC versus JPEG. I imagine AV1 would be in the same rough ballpark.

The goal isn't just to replace JPEG, but also PNG and GIF. HEIF+HEVC is a superset of all of JPEG, PNG, and GIF, and offers better efficiency than any of them for all their scenarios, and adds a lot of additional scenarios as well (like real PQ Rec. 2020 style HDR). An HEIF+AV1 would likely have similar potential.

HEIF requires licensing H.265 patents though, which can be expensive. AVIF will be patent free, so it is much better to use that.

bstrobl
2nd November 2017, 10:28
Why not use MPEG-4 program streams for video, HEIF for picture, and MPEG-4 program streams for audio? Dropping in a new codec is a lot easier than full support for a new container format. And we have a LOT of mature muxing, distribution, and playback tech built around MPEG-4 PS.


I guess the point is to use simplified containers dedicated to few codecs to ensure a decent chance of compatibility, as dropping in newer codecs onto old containers may cause frustrations in the future. HEIF is also a very large spec from what I can tell, would make sense to use a heavily simplified version to ensure compatibility and reliability for standard web distribution. WebP is still at version 0.6 so something could still be fixed up for version 1.0 rather than ditching the name. Ingraining format names like JPEG can be quite useful sometimes.

heif requires licensing h265 patents though which can be expensive. Avif will be patent free, so much better to use that.

The HEIF container itself should not have any associated patent fees, only the HEVC Intra part. Not sure what AVIF brings to the table in this case actually.

MoSal
2nd November 2017, 21:48
Why not use MPEG-4 program streams for video, HEIF for picture, and MPEG-4 program streams for audio? Dropping in a new codec is a lot easier than full support for a new container format. And we have a LOT of mature muxing, distribution, and playback tech built around MPEG-4 PS.

I'm not sure who is the target of those suggestions.

* For a new image format to gain relevant success, browser vendors will have to reach a consensus. Google and Mozilla are not going to support anything HEVC or MPEG based.

* weba is already the unofficial extension of YouTube's Opus WebM DASH streams. weba/Opus/~160kbps is the highest quality 2ch audio format available (MP4/AAC/256kbps is discontinued).

* For local storage, Matroska works, and will work, just fine with any codec combination.

* Browsers will support AV1/Opus in WebM by default (muxed or separate, including live DASH/WebM streams).

TD-Linux
2nd November 2017, 23:59
Why not use MPEG-4 program streams for video, HEIF for picture, and MPEG-4 program streams for audio? Dropping in a new codec is a lot easier than full support for a new container format. And we have a LOT of mature muxing, distribution, and playback tech built around MPEG-4 PS.

We plan to have first class support for ISOBMFF. There is already an Opus-in-MP4 mapping which would make a good pairing.

As far as a still image format, I dunno. HEIF is an option, though it has its own problems (it would be too easy for it to just be a single frame video track, so it's got a totally separate codec mapping).

LigH
8th November 2017, 10:22
Since MABS is able to build aomenc/aomdec as separate encoder/decoder application pair, I wanted to try it once with a small sample (Siemens "foreman", CIF PAL, 300 frames, Derf's Y4M fixed to 25 fps), running on an AMD Phenom-II X4 (other encoders want to use at most SSE2 here).

Basic batch:
aomenc -v --passes=2 --pass=1 --fpf=foreman_cif.basic.av1.fpf --target-bitrate=300 -o foreman_cif.basic.av1.webm foreman_cif_pal.y4m
aomenc -v --passes=2 --pass=2 --fpf=foreman_cif.basic.av1.fpf --target-bitrate=300 -o foreman_cif.basic.av1.webm foreman_cif_pal.y4m

Pass 1 finishes quickly, in only a few seconds. Pass 2, instead ... a previous run with a more explicit command line estimated several days to finish. So I reduced the parameter set and tried again. Still:

>aomenc -v --passes=2 --pass=1 --fpf=foreman_cif.basic.av1.fpf --target-bitrate=300 -o foreman_cif.basic.av1.webm foreman_cif_pal.y4m
Codec: AOMedia Project AV1 Encoder v0.1.0-6495-g63d190aea
Source file: foreman_cif_pal.y4m File Type: Y4M Format: I420
Destination file: foreman_cif.basic.av1.webm
Coding path: LBD
Encoder parameters:
g_usage = 0
g_threads = 8
g_profile = 0
g_w = 352
g_h = 288
g_bit_depth = 8
g_input_bit_depth = 8
g_timebase.num = 1
g_timebase.den = 25
g_error_resilient = 0
g_pass = 0
g_lag_in_frames = 19
rc_dropframe_thresh = 0
rc_resize_mode = 0
rc_resize_denominator = 8
rc_resize_kf_denominator = 8
rc_end_usage = 0
rc_target_bitrate = 300
rc_min_quantizer = 0
rc_max_quantizer = 63
rc_undershoot_pct = 25
rc_overshoot_pct = 25
rc_buf_sz = 6000
rc_buf_initial_sz = 4000
rc_buf_optimal_sz = 5000
rc_2pass_vbr_bias_pct = 50
rc_2pass_vbr_minsection_pct = 0
rc_2pass_vbr_maxsection_pct = 2000
kf_mode = 1
kf_min_dist = 0
kf_max_dist = 9999
Pass 1/2 frame 300/301 55384B 1476b/f 36922b/s 3788536 us (79.19 fps)

>aomenc -v --passes=2 --pass=2 --fpf=foreman_cif.basic.av1.fpf --target-bitrate=300 -o foreman_cif.basic.av1.webm foreman_cif_pal.y4m
Pass 2/2 frame 22/3 24753B 1018316 ms 1.30 fpm [ETA 6:55:11] 4F
Pass 2/2 frame 24/5 25314B 1435997 ms 1.00 fpm [ETA 8:18:27] 4F

The output seems to play with ANSI Escape sequences, which Windows 7 cmd may not support on its own...

The ETA will rise the longer I let it ... do whatever it does. Such a speed is ... "concerning", nicely said. And I wonder what the reason could be: Is there a minimum CPU requirement for assembly optimized routines way beyond SSE2? Or is there a possible mis-configuration in the build? Or are the computational efforts so extreme when not capped by a per-frame deadline?

Unfortunately, there is no support in ffmpeg yet (JEEB mentioned some shared variable names with libvpx in IRC), thus no playback in LAV Filters (e.g. in MPC-HC) yet, either. At most it is detected...

Codecs:
D..... = Decoding supported
.E.... = Encoding supported
..V... = Video codec
..A... = Audio codec
..S... = Subtitle codec
...I.. = Intra frame-only codec
....L. = Lossy compression
.....S = Lossless compression
-------
..V.L. av1 Alliance for Open Media AV1

BTW, the low target bitrate of 300 kbps is not the reason, it is just as slow for 3,000. And a deadline goal '--good' did not change the speed either.

Tommy Carrot
8th November 2017, 15:44
The ETA will rise the longer I let it ... do whatever it does. Such a speed is ... "concerning", nicely said. And I wonder what the reason could be: Is there a minimum CPU requirement for assembly optimized routines way beyond SSE2? Or is there a possible mis-configuration in the build? Or are the computational efforts so extreme when not capped by a per-frame deadline?

BTW, the low target bitrate of 300 kbps is not the reason, it is just as slow for 3,000. And a deadline goal '--good' did not change the speed either.
You need to play around with the "--cpu-used" option. It's basically a speed/quality tradeoff: 8 is the fastest, 0 (the default) is the slowest but is supposed to give the best quality.
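
To compare the levels systematically, the two-pass command lines from earlier in the thread can be generated once per "--cpu-used" value. The following Python sketch only prints the commands; the 0 to 8 range is taken from the description above and may differ between builds, and the per-level output file names are invented for this example.

```python
# Input name and bitrate follow the commands quoted earlier in the thread;
# the per-level output naming scheme is made up for this example.
SOURCE = "foreman_cif_pal.y4m"
BITRATE_KBPS = 300

def aomenc_commands(cpu_used):
    """Return the pass 1 and pass 2 command lines for one --cpu-used level."""
    base = "foreman_cif.cpu%d.av1" % cpu_used
    common = [
        "aomenc", "-v", "--passes=2",
        "--fpf=%s.fpf" % base,
        "--target-bitrate=%d" % BITRATE_KBPS,
        "--cpu-used=%d" % cpu_used,
        "-o", base + ".webm", SOURCE,
    ]
    # Insert --pass=1 / --pass=2 right after --passes=2, as in the log above.
    return [common[:3] + ["--pass=%d" % p] + common[3:] for p in (1, 2)]

for level in range(0, 9):
    for cmd in aomenc_commands(level):
        print(" ".join(cmd))
```

To actually run a sweep, each list could be handed to `subprocess.run` and timed; starting at the fast end keeps the first results from taking days.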

LigH
8th November 2017, 15:52
I do now ... for my hardware, 4 is still a challenge (~10 min for 300 frames CIF). And I guess to watch the result, I still need to decode it to Y4M again...

BTW, what's this hex code at the end of the progress line? A flag for developers, giving hints about the result type and efficiency?

benwaggoner
8th November 2017, 22:46
heif requires licensing h265 patents though which can be expensive. Avif will be patent free, so much better to use that.
HEIF is codec agnostic. It can use JPEG, H.264, HEVC, and it would be trivial to extend to AV1.

I don't have any specific thoughts about how HEVC and AV1 would compare technically as still image codecs.

Also, I would speculate that a lot of the HEVC IP wouldn't apply to still image encoding; so much is around interframe encoding.

LigH
9th November 2017, 20:58
Brief result: Tested --cpu-used 8..4 with rather low speed differences (every run around 10 minutes) and with quite good quality (no really annoying artefacts, except for floating textures). Will postpone further tests here, no immediate relevance for me, personally.

bstrobl
13th November 2017, 18:18
Facebook has joined as a founding member. Guess they want to cut costs on streaming and storing video too.

cheerow
13th November 2017, 18:47
I just did some test encodes with the recent git version. All the now default enabled features (experiments) have massively increased the VMAF score compared to a few months ago. Just judging by VMAF, as flawed as it may be, AV1 is now way above x265.

Also tried some options to speed up encodes. Seems like --cpu-used has very limited effect. The encoder always ran on just one CPU until I supplied --tile-columns=X, but that is bugged: only the leftmost column comes out correctly, the others are basically broken. And even then it doesn't saturate any CPU completely.

mzso
13th November 2017, 21:57
Also tried some options to speed up encodes. Seems like --cpu-used has very limited effect. The encoder always ran on just one CPU until I supplied --tile-columns=X, but that is bugged: only the leftmost column comes out correctly, the others are basically broken. And even then it doesn't saturate any CPU completely.

Well, vpx also has crappy multi-processing which never saturates my cpu, so not really surprising.
Hopefully eventually it'll get decent multi-processing like x264.

Blue_MiSfit
13th November 2017, 22:01
I wouldn't hold your breath. The main use case for AV1 and VP9 is cloud scale distributed encoding which can very happily be single threaded because inputs are broken into small chunks and encoded in parallel across many many many small cloud VMs. Given the availability of a system to do this, it's a much better solution than trying to do per-encode-instance multithreading because it's way simpler and has zero downsides.

Multi-threading is more important for live, and for desktop encoding.
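
The chunked approach described above can be sketched in a few lines of Python. This is an illustration of the idea only, not how any particular service implements it: `encode_chunk` is a hypothetical stand-in for launching one single-threaded encoder job, and a thread pool plays the role of the fleet of small VMs.

```python
from concurrent.futures import ThreadPoolExecutor

def plan_chunks(total_frames, chunk_size):
    """Split a clip into (first_frame, frame_count) pieces."""
    return [(start, min(chunk_size, total_frames - start))
            for start in range(0, total_frames, chunk_size)]

def encode_chunk(chunk):
    """Stand-in for one single-threaded encoder job (hypothetical).

    A real worker would launch an encoder process on this frame range and
    return the path of the encoded segment.
    """
    first, count = chunk
    return "segment_%05d_%05d" % (first, first + count - 1)

def encode_distributed(total_frames, chunk_size, workers):
    chunks = plan_chunks(total_frames, chunk_size)
    # Each job is independent, so the pool can run them in any order;
    # map() still returns the results in chunk order, ready to be joined.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_chunk, chunks))

print(encode_distributed(300, 64, workers=4))
```

Fixed-size chunks are used here only to keep the sketch short; a real system would more likely cut at points where each segment can start with a keyframe.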

LigH
13th November 2017, 22:19
The more complex an algorithm, the more probable are dependencies between intermediate results, the less probable is optimal parallelizability...

bstrobl
18th November 2017, 10:22
IETF 100: https://www.youtube.com/watch?v=_wRLR8ypCg0

wiak
20th November 2017, 18:11
Well, vpx also has crappy multi-processing which never saturates my cpu, so not really surprising.
Hopefully eventually it'll get decent multi-processing like x264.
there is hope now that vp9 has row-mt

And on a related note, the Streaming Media West 2017 session C104 video has a lot of AV1/HEVC talk:
http://streamingmedia.brightcovegallery.com/detail/videos/streaming-media-west-2017/video/5643859282001/c104:-the-future-of-video-codecs:-vp9-hevc-av1?autoStart=true&page=1

Clare
21st November 2017, 18:44
http://demo.bitmovin.com/public/firefox/av1/

LigH
21st November 2017, 19:50
I love sloppily dropped URLs I have to visit to know what to expect...

MPEG-DASH Adaptive Streaming with AV1 by Mozilla and Bitmovin

Content encoded in AV1 by Bitmovin Encoding and played back with Bitmovin Player (7.3.0-b7) in Firefox (**.*)

Seems to require a nightly Firefox to be installed.

mzso
21st November 2017, 19:57
http://demo.bitmovin.com/public/firefox/av1/

For me the video hangs every few seconds for several seconds in FF Nightly. Any better way to view it?

Can I download it somehow? It might be that something is glitchy with the networking.

LigH
21st November 2017, 20:00
Possibly if you somehow manage to record the MPEG DASH video stream locally and play it afterwards; but that's the opposite of the intended use: Video streaming. Who knows whether the streaming or the player is the bottleneck here (but it is hosted on Akamai network servers...).