Definitive link for AOM AV1 encoder parameter documentation?
benwaggoner
12th May 2021, 18:15
I spent some time trying to find the definitive link for description of what the myriad AV1 encoder options do. Most seem to be common across implementations.
I found https://github.com/master-of-zen/Av1an/blob/78fd7b1663f55ba2f3ab322534308ce684e57714/docs/Encoders/aomenc.md and https://ffmpeg.org/ffmpeg-codecs.html#librav1e. But where is the definitive kept-current source of that information?
Funky080900
12th May 2021, 21:53
Best I could find is the SVT-AV1 encoder guide (https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/svt-av1_encoder_user_guide.md). Might be relevant to aomenc too, since SVT-AV1 and aomenc share many settings.
benwaggoner
13th May 2021, 03:16
Best I could find is the SVT-AV1 encoder guide (https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/svt-av1_encoder_user_guide.md). Might be relevant to aomenc too, since SVT-AV1 and aomenc share many settings.
Yeah, I can find many things that document parameters derived from AOM, but there's no good way to tell if or how they'd modified those parameters or their behavior.
I hope someday there will be an AV1 equivalent of https://x265.readthedocs.io/en/master/cli.html. Which is the best encoder documentation that I've not written myself :sly:.
RanmaCanada
23rd May 2021, 18:07
Yeah, I can find many things that document parameters derived from AOM, but there's no good way to tell if or how they'd modified those parameters or their behavior.
I hope someday there will be an AV1 equivalent of https://x265.readthedocs.io/en/master/cli.html. Which is the best encoder documentation that I've not written myself :sly:.
I myself have been trying and trying to find said documentation and from what I can see, there is nothing that is definitive, nor helpful to someone who is trying to migrate over to AV1 from different codecs. I see a lot of gatekeeping, or snippets of information, but no one really going out of their way to explain things. It's possible I may be looking in the wrong areas, as I want to try using AV1, but the documentation is extremely lacking, and that makes it frustrating as I don't want to wait literally days to test out parameters. Attempted to encode a movie earlier and it gave me a timeline of 122 days! on my 5800x.
Marsu42
23rd May 2021, 21:37
Attempted to encode a movie earlier and it gave me a timeline of 122 days! on my 5800x.
Personally, I've gotten encoding guidelines from reddit's r/AV1. Besides enthusiasts who have figured out the settings, some devs write there (like BlueSwordM (https://www.reddit.com/user/BlueSwordM/)):
https://www.reddit.com/r/AV1/comments/n4si96/encoder_tuning_part_3_av1_grain_synthesis_how_it/
https://www.reddit.com/r/AV1/comments/lfheh9/encoder_tuning_part_2_making_aomencav1libaomav1/
RanmaCanada
25th May 2021, 01:46
And still no documentation as to why AV1 doesn't respect --threads. No matter what I do I cannot get it to use more than 1 thread. I would love to use the encoder, but there is still too much "magic" or "try settings and figure it out yourself" and when it's single threaded (because I can't get --threads to work on any builds, maybe I'm cursed!), that takes far, FAR too long. Maybe I'm spoiled because of the awesome documentation that x265 and x264 had.
benwaggoner
25th May 2021, 03:28
I suppose it is the Modern Way of Things, but I really appreciate a description of what parameters do with input from the people who implemented it. "Read the source" is gallows humor, not an actual plan ;).
Although only Media Cleaner Pro's manuals and the x265 readthedocs have really met that bar in the last 25 years. And those projects happened because the project leaders personally took responsibility to write a lot of the documentation themselves. The CEO of Terran Interactive (who made Media Cleaner Pro) liked to do product planning by writing the manual for the new version and having the developers implement that.
RanmaCanada
27th May 2021, 05:00
Yes. I think it is pretty stupid that all these companies want to make this the next big codec, but are refusing to explain to people how to use it. I guess it's just more proof that AV1 was never intended for the masses, and was just a way for the media cartels to stop paying royalties, like Disney is attempting to do with authors (https://www.theguardian.com/books/2021/apr/28/disneymustpay-authors-form-task-force-missing-payments-star-wars-alien-buffy).
Marsu42
27th May 2021, 11:17
I suppose it is the Modern Way of Things, but I really appreciate a description of what parameters do with input from the people who implemented it. "Read the source" is gallows humor, not an actual plan ;).
Well, I have certainly read the source to figure out the params, and we should rename RTFM to RTFS (https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libaomenc.c) :-)
Yes. I think it is pretty stupid that all these companies want to make this the next big codec, but are refusing to explain to people how to use it. I guess it's just more proof that AV1 was never intended for the masses, and was just a way for the media cartels to stop paying royalties
To be fair, at least AOM doesn't argue otherwise (https://aomedia.org/about/) and a multi-codec-world is the future (https://bitmovin.com/multi-codec-world-2020/) which includes the patented (and more efficient) mpeg codecs like VVC.
benwaggoner
28th May 2021, 00:13
Yes. I think it is pretty stupid that all these companies want to make this the next big codec, but are refusing to explain to people how to use it. I guess it's just more proof that AV1 was never intended for the masses, and was just a way for the media cartels to stop paying royalties, like Disney is attempting to do with authors (https://www.theguardian.com/books/2021/apr/28/disneymustpay-authors-form-task-force-missing-payments-star-wars-alien-buffy).
AV1 was really driven by companies focused on user generated content playing in web browsers: YouTube and Facebook. Those are all about bulk encoding in the cloud, not high-touch consumer tweaking like we can do with x264 and x265.
benwaggoner
28th May 2021, 00:27
To be fair, at least AOM doesn't argue otherwise (https://aomedia.org/about/) and a multi-codec-world is the future (https://bitmovin.com/multi-codec-world-2020/) which includes the patented (and more efficient) mpeg codecs like VVC.
Indeed. Although their assertion that VP9 is 40% better than H.264 and equivalent to HEVC is flat-out wrong for every scenario I've tested. Maybe PSNR at fixed QP or something, but certainly not in subjective MOS testing of subjectively-tuned encodes.
There's no doubt that, with a good Film Grain Synthesis implementation, AV1 can outperform HEVC for grainy content at lower bitrates. But that's really the only case where I've seen a reliable reduction in bitrate versus HEVC when compared at the same quality and encoding time. And the FGS implementations aren't quite to the point where they can be used automatically, but that's orthogonal to the codec bitstream itself, and I believe they will be continuously improved.
Of course, the FGS being orthogonal, decoders which support it with AV1 could easily support FGS for HEVC and VVC too. I quite like the AV1 FGS implementation, and it is probably technology that will be reused in other contexts. It's certainly much better than the H.264 FGS model that was available with HD-DVD players (although never used in any actual discs AFAIK).
Beelzebubu
4th June 2021, 12:50
And still no documentation as to why AV1 doesn't respect --threads. No matter what I do I cannot get it to use more than 1 thread. I would love to use the encoder, but there is still too much "magic" or "try settings and figure it out yourself" and when it's single threaded (because I can't get --threads to work on any builds, maybe I'm cursed!), that takes far, FAR too long. Maybe I'm spoiled because of the awesome documentation that x265 and x264 had.
Threading in aom is implemented using tiling, so you need tiling active in order for --threads to do anything. --tile-columns and --tile-rows are both in log2 units, and have a value of 0 as default. For example, to get 4 tiles & threads, use --tile-columns=2 --tile-rows=0 --threads=4 or --tile-columns=1 --tile-rows=1 --threads=4.
RanmaCanada
10th June 2021, 05:23
Threading in aom is implemented using tiling, so you need tiling active in order for --threads to do anything. --tile-columns and --tile-rows are both in log2 units, and have a value of 0 as default. For example, to get 4 tiles & threads, use --tile-columns=2 --tile-rows=0 --threads=4 or --tile-columns=1 --tile-rows=1 --threads=4.
And this is why better documentation is needed. But then what do the tile settings mean and do? All we get is "you should use this command line!". I've even tried reading reddit threads and people just say "do this" with no explanation as to why...
Thank you.
benwaggoner
10th June 2021, 19:18
And this is why better documentation is needed. But then what do the tile settings mean and do? All we get is "you should use this command line!". I've even tried reading reddit threads and people just say "do this" with no explanation as to why...
Exactly!
And I imagine some implementations would add frame-level or other kinds of threading. With a single definitive resource for aomenc, other encoders could explain their changes relative to that so we could actually track what's happening.
Self-documenting code is a myth, and doubly so in compression.
Beelzebubu
11th June 2021, 16:42
other encoders could explain their changes relative to that
But that suggests that other AV1 encoders are based on aomenc.
Beelzebubu
11th June 2021, 16:51
And this is why better documentation is needed.
I agree with the problem statement. I just don't think it's easily solved. I try to help with individual questions as I did here, but anything more than that would probably require much more effort than I'm willing to put into it... Libaom lacks the activist/enthusiastic volunteer contributor community that x264 had back in the day.
But then what do the tile settings mean and do?
--tile-columns=N means exp2(N) tile columns, and --tile-rows=M means exp2(M) tile rows. So, values of 0,1,2,3,4 for either one would imply 1,2,4,8,16 tile columns or rows. The total number of tiles in a frame is equal to tile_rows * tile_columns, so to get 4 tiles, you can use 2 tile columns and 2 tile rows (--tile-columns=1 --tile-rows=1), or 4 tile columns and 1 tile row (--tile-columns=2 --tile-rows=0), or 1 tile column and 4 tile rows (--tile-columns=0 --tile-rows=2).
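The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the log2 relationship described in this post; the helper names are made up and are not part of libaom, and it ignores any limits the encoder may apply to tile counts for a given resolution.

```python
from typing import List, Tuple


def tile_count(tile_columns_log2: int, tile_rows_log2: int) -> int:
    """Total tiles for given --tile-columns / --tile-rows values (log2 units)."""
    return (1 << tile_columns_log2) * (1 << tile_rows_log2)


def log2_settings_for(total_tiles: int) -> List[Tuple[int, int]]:
    """All (tile_columns, tile_rows) log2 pairs that give exactly total_tiles."""
    # Only powers of two are reachable, since both options are exponents of 2.
    if total_tiles < 1 or total_tiles & (total_tiles - 1):
        raise ValueError("total_tiles must be a power of two")
    total_log2 = total_tiles.bit_length() - 1
    return [(c, total_log2 - c) for c in range(total_log2 + 1)]


if __name__ == "__main__":
    # Print every option combination that yields 4 tiles, with a matching thread count.
    for cols, rows in log2_settings_for(4):
        threads = tile_count(cols, rows)
        print(f"--tile-columns={cols} --tile-rows={rows} --threads={threads}")
```

With the defaults (0 and 0) tile_count returns 1, which matches the single-thread behaviour complained about earlier in the thread.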
benwaggoner
11th June 2021, 21:13
The readthedocs.io system that is used for x265 documentation looks reasonably simple to get an outline in and fill out as needed. It used to allow user editing, which would be appropriate here.
https://x265.readthedocs.io/en/master/
It didn't start anything close to as comprehensive as that.
plonk420
25th June 2021, 18:16
Yes. I think it is pretty stupid that all these companies want to make this the next big codec, but are refusing to explain to people how to use it. I guess it's just more proof that AV1 was never intended for the masses, and was just a way for the media cartels to stop paying royalties, like Disney is attempting to do with authors (https://www.theguardian.com/books/2021/apr/28/disneymustpay-authors-form-task-force-missing-payments-star-wars-alien-buffy).
with the exception of lag-in-frames (kind of like a lookahead, but seems a little more effective?), and maybe 10-bit encoding, aomenc is pretty much like x264/x265. just stay with the defaults unless you know what you're changing. tho yeah, the "official" documentation COULD be a little better. maybe Blue's documentation can be expanded upon and be absorbed into "official."
give it time, however. x264 didn't have great documentation within the first couple of years, either, IIRC.
benwaggoner
25th June 2021, 19:04
with the exception of lag-in-frames (kind of like a lookahead, but seems a little more effective?), and maybe 10-bit encoding, aomenc is pretty much like x264/x265. just stay with the defaults unless you know what you're changing. tho yeah, the "official" documentation COULD be a little better. give it time, however. x264 didn't have great documentation within the first couple of years, either, IIRC.
I don't consider x264 to have great documentation even today, honestly. x265 is really the unique outlier for codec documentation. And I think having that great documentation has had a material impact on the quality of x265 and HEVC encoding.
Too much codec documentation, including AV1's, is largely just restating the name of the parameter. What's really useful is to explain the qualitative and quantitative impact of a given setting or its parameters. Stuff like "increases encoding time by 10-25% and reduces bitrate 1-3% for content with lots of sharp edges, like text and cel animation. Has little impact on natural image film and video content. Parameter X and Y also are helpful with similar content"
I can read the documentation and guess what might have been done. But each feature is in there based on testing and tuning, and sharing the results of that is very helpful. And also indicating features and parameter ranges that haven't been tested well.
The linked posts on film grain synthesis and general recommendations are great reading, and would make lovely deep-dive essays as part of documentation. But each parameter deserves a few hundred words about it specifically.
For an example from the ffmpeg documentation:
lag-in-frames
Set the maximum number of frames which the encoder may keep in flight at any one time for lookahead purposes. Defaults to the internal default of the library.
No suggestion of what happens if higher or lower values are used. Presumably higher increases latency. Does it reduce encoding speed? Increase memory allocation? Do lower values reduce quality? If so, by around how much, and does it vary with content types or other factors? Are there threshold values below which quality gets really bad, or above which higher values don't offer any improvement?
Just reading that, my first guess would be "2x keyint for optimal VoD quality" but that could be wildly wrong. And I don't know if that would increase compute or memory use by 10x or 10%.
Beelzebubu
26th June 2021, 15:23
No suggestion of what happens if higher or lower values are used.
It means more frames in memory, so especially at higher resolution (1080p/4k) and/or high bitdepth (10bit/HDR), this can significantly affect memory usage. You'll have to do your own measurements on how much exactly, but frames-in-memory (along with their associated metadata) tend to be one of the biggest memory consumers in encoders, so you want to keep this low if you can.
However, on the flip-side, this is used for adaptive/predictive encoding decision making, i.e. better quality and/or better quality/speed ratio (see "tpl" in various parts of the codebase, which means "temporal"). At minimum, this should be a few frames larger than the max. alt-ref frame group size (32? I think), but larger will be better (with diminishing returns). For real-time/low-latency coding, you can set this to 1, but it means no out-of-order frame coding.
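For a rough feel of the memory side, here is a back-of-the-envelope sketch in Python. It assumes 4:2:0 chroma subsampling and that samples above 8 bits are stored in 16 bits each, and it counts only the raw pixel data of the buffered frames. Padding, per-frame metadata and reference frames are ignored, so treat the result as a lower bound and not as a measurement of libaom. The lag value of 35 is just an example of "a few frames larger than 32".

```python
def raw_frame_bytes(width: int, height: int, bit_depth: int = 8) -> int:
    """Bytes of raw pixel data for one 4:2:0 frame (no padding, no metadata)."""
    # Assumption: samples deeper than 8 bits are stored in 2 bytes each.
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    luma = width * height
    # Two chroma planes, each subsampled by 2 horizontally and vertically.
    chroma = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return (luma + chroma) * bytes_per_sample


def lookahead_mib(width: int, height: int, bit_depth: int, lag_in_frames: int) -> float:
    """Lower-bound estimate (MiB) of pixel data held for lag_in_frames frames."""
    return raw_frame_bytes(width, height, bit_depth) * lag_in_frames / 2**20


if __name__ == "__main__":
    for label, w, h, depth in [("1080p 8-bit", 1920, 1080, 8),
                               ("4K 10-bit", 3840, 2160, 10)]:
        mib = lookahead_mib(w, h, depth, lag_in_frames=35)
        print(f"{label}: at least {mib:.0f} MiB of raw frames in the lookahead")
```

Actual usage will be higher, so measuring a real encode is still the only reliable way to size this.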
RanmaCanada
26th June 2021, 18:46
Ah yes the memory usage. I found that out when trying to burn in subtitles in a 4k HDR encode with AV1. I quickly ran out of memory as the encoding process was using 10 gigs! I thought I would be fine with 16gigs of ram, but it's obvious I need to double it at minimum. Now if only ECC ram didn't spike in cost..
Losko
27th September 2021, 09:17
Best I could find is the SVT-AV1 encoder guide (https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/svt-av1_encoder_user_guide.md). Might be relevant to aomenc too, since SVT-AV1 and aomenc share many settings.
Quite rich, but we're not there yet.
What I find most weird of all is the amount of options listing a bunch of allowed values (0, 1, ... 9) and the default value -1 : yes, -1 (and no description).
So, what is the encoder supposed to do with that particular option set to -1 ???
Beelzebubu
30th September 2021, 20:53
Quite rich, but we're not there yet.
What I find most weird of all is the amount of options listing a bunch of allowed values (0, 1, ... 9) and the default value -1 : yes, -1 (and no description).
So, what is the encoder supposed to do with that particular option set to -1 ???
The description for many options explains that -1 means "DEFAULT", which means it's a value that is filled in by the speed preset - each speed preset has a different value.
The speed preset docs will then explain what the default values are per speed preset.
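The "-1 means DEFAULT" pattern can be sketched as a simple lookup. The option name and the per-preset numbers below are invented purely for illustration and do not come from any encoder's documentation; only the mechanism (a sentinel that defers to the speed preset) is taken from the description above.

```python
from typing import Dict

DEFAULT = -1  # sentinel meaning "let the speed preset decide"

# HYPOTHETICAL option name and values, for illustration only.
PRESET_DEFAULTS: Dict[int, Dict[str, int]] = {
    0: {"example-option": 9},
    4: {"example-option": 5},
    8: {"example-option": 1},
}


def resolve(option: str, user_value: int, preset: int) -> int:
    """Return the effective value: the user's choice, or the preset's if -1."""
    if user_value != DEFAULT:
        return user_value  # an explicit setting always wins
    return PRESET_DEFAULTS[preset][option]


if __name__ == "__main__":
    print(resolve("example-option", DEFAULT, preset=4))  # falls back to the preset
    print(resolve("example-option", 3, preset=4))        # explicit value is kept
```

So an option left at -1 has no single value of its own; what it does depends on which preset is selected.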
rupeshforu3
2nd July 2022, 03:21
Hi, I am Rupesh. I successfully converted large MP4 AVC (x264) video files to AV1 using ffmpeg with libaom and libopus. The output file quality is acceptable to me, but I can't play these videos in VLC media player, especially on an Android smartphone and tablet.
By "can't play" I mean that when I fast forward to, say, 20 mins, the video stops and only the audio plays.
I heard that av1 codec has less support in many platforms but it will be supported in future.
Can you suggest how to play these videos in android using vlc or any other player.
RanmaCanada
4th July 2022, 02:16
Hi, I am Rupesh. I successfully converted large MP4 AVC (x264) video files to AV1 using ffmpeg with libaom and libopus. The output file quality is acceptable to me, but I can't play these videos in VLC media player, especially on an Android smartphone and tablet.
By "can't play" I mean that when I fast forward to, say, 20 mins, the video stops and only the audio plays.
I heard that av1 codec has less support in many platforms but it will be supported in future.
Can you suggest how to play these videos in android using vlc or any other player.
You need a platform that supports AV1 in hardware decoding, otherwise you will be doing software decoding which requires massive amounts of power. The S905x4 I know does it, and as for phones I only know of 4: Samsung Galaxy S21 series (Exynos 2100 variants), iQOO Z1 5G, Reno5 Pro and Realme X7 Pro. Qualcomm has stated they won't support it till at least 2023.
https://arstechnica.com/gadgets/2022/02/report-qualcomm-will-support-av1-video-codec-in-2023/
rupeshforu3
4th July 2022, 17:00
Hi, I am going to buy a new tablet and am looking at which is best. Can I expect any tablet to have AV1 hardware video decoding next year? I will wait until then.