Suggested aomenc settings [Archive]

Blue_MiSfit

12th February 2019, 03:19

Hey folks!

I thought it would be good to chat about recommended settings for using aomenc.

I'll be updating this as I go, based on findings.

What I'm most interested in understanding / documenting:

* Quality / speed tradeoff of various values for cpu-used
- Which value gives you most of the AV1 magic without being placebo (e.g. "veryslow" for x264)

- Which value should be considered the fastest acceptable without shooting yourself in the foot (e.g. "superfast" for x264)

- Is there a matrix describing which options cpu-used affects?

* Best practices for rate control settings like
- end-usage
Use fixed QP mode if you want constant quality. Rate control is not very good and should not be counted on.

- bias-pct
You generally shouldn't mess with this since RC is pretty bad still.

- resize options

- superres options

- buffering options

- CRF? I see it mentioned in the context of av1 in ffmpeg but not in aomenc?

* Best practices for bitstream / threading settings like
- auto-alt-ref
This is enabled by default, you shouldn't have to mess with it.

- lag-in-frames
This is maxed out by default (25 frames). You shouldn't mess with it unless you need to reduce encoder latency, e.g. for live streaming.

- aq-mode
This is not recommended as of February 2019.

- tile-columns and tile-rows

- row-mt

- Any other mechanism to effectively multithread? Or should you really not do this, and just go massively parallel (chunked encode) if you want speed?

* Maybe example commands of
- "good quality" CBR / capped VBR encoding
No go. As of February 2019, aomenc is not capable of good quality CBR / capped VBR encoding. If you need this use case you'll need to implement the "dynamic optimizer" approach of encoding each scene at various QPs and then stitching together depending on your requirements.

- "good quality" pure VBR encoding for offline viewing
Fixed QP mode is the only way to get consistent quality today

- "HDR10" encoding (can this even be done today?)

utack

12th February 2019, 04:30

I'd like to discuss "maxsection-pct"
Is there a default cap on how far the bitrate may go above the VBR target?
I ask because I had the feeling that, without explicitly setting it, aomenc can limit bitrate in 2-pass mode, butchering "difficult" scenes unreasonably.

Blue_MiSfit

14th February 2019, 20:40

My initial experiments, focusing on a pure VBR, low-bitrate use case (400 Kbps, 576p), have shown the following:

- Constant QP mode makes pretty nice output, but of course it's unsuitable for streaming

- Rate control is still quite bad. Using 2 pass VBR mode there are still a few frames that completely explode for no apparent reason, despite the average quality being pretty good, and sometimes quite a bit better than HEVC.

I used cpu-used=1 for all these tests, but I'm trying the latter with cpu-used=0 just to see if there's any difference. I'm guessing 1=veryslow 0=placebo but we'll see.

Blue_MiSfit

14th February 2019, 21:56

In chatting with TD-Linux on #aomedia I learned a few interesting things:

1) Rate control (VBR and CBR) is not very good.

2) Constant QP mode is a better solution. Most folks using AV1 use the "dynamic optimizer" approach of encoding each scene with multiple QPs and picking one per scene to hopefully get consistent quality while controlling bitrate. A good place to start would be using qp values like 20, 32, 43, 55, and 63 (for both 8 and 10 bit)

3) aq-mode is not recommended at this point

4) lag-in-frames controls the intrinsic delay. It defaults to the max and should only be reduced when targeting low latency (e.g. live encoding)

I'm doing a range of CQ encodes :)
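
To make the per-scene idea in point 2 more concrete, here is a minimal Python sketch of the selection step only. It is a simplification and not Netflix's actual dynamic optimizer: it assumes every scene has already been encoded at each QP, that some quality score has been measured per encode (VMAF is just one possibility), and that the goal is a fixed quality floor. The scene names, bit counts, scores and the `pick_qp` helper are all made up for illustration.

```python
# Candidate QPs suggested in the list above.
QPS = [20, 32, 43, 55, 63]

# Hypothetical results per scene: qp -> (size in bits, quality score).
# Every number here is invented; real values come from your own encodes.
scenes = {
    "scene_001": {20: (900_000, 98.0), 32: (500_000, 96.1), 43: (300_000, 93.5),
                  55: (150_000, 88.0), 63: (90_000, 80.2)},
    "scene_002": {20: (2_000_000, 96.0), 32: (1_200_000, 93.2), 43: (700_000, 89.0),
                  55: (400_000, 82.0), 63: (250_000, 74.0)},
    "scene_003": {20: (200_000, 99.0), 32: (120_000, 98.0), 43: (80_000, 96.5),
                  55: (40_000, 94.0), 63: (25_000, 90.0)},
}


def pick_qp(trials, min_quality):
    """Pick the highest QP whose encode still meets the quality floor.

    Assumes a higher QP gives a smaller file. Falls back to the lowest
    QP that was tried if no encode reaches the floor.
    """
    good_enough = [qp for qp, (_bits, quality) in trials.items()
                   if quality >= min_quality]
    return max(good_enough) if good_enough else min(trials)


# One QP per scene, then the size of the stitched result.
chosen = {name: pick_qp(trials, min_quality=93.0)
          for name, trials in scenes.items()}
total_bits = sum(scenes[name][qp][0] for name, qp in chosen.items())

for name, qp in chosen.items():
    print(f"{name}: qp={qp}")
print(f"total bits: {total_bits}")
```

With these invented numbers the easy scene ends up at a much higher QP than the hard one, which is the whole point: quality stays roughly level while bits go where they are needed. It also shows the cost, since every scene is encoded five times to keep one result.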

kanaka

15th February 2019, 10:01

2) Constant QP mode is a better solution. Most folks using AV1 use the "dynamic optimizer" approach of encoding each scene with multiple QPs and picking one per scene to hopefully get consistent quality while controlling bitrate. A good place to start would be using qp values like 20, 32, 43, 55, and 63 (for both 8 and 10 bit)

What metric do they use to pick the right QP? VMAF at 93-95?

Blue_MiSfit

15th February 2019, 21:27

I didn't ask, but given the Netflix tech blog post on this item I'd imagine your idea is in the ballpark :)

This is an almost inconceivably expensive solution, though. As if encoding AV1 isn't bad enough already, let's go ahead and do each representation 5 times! Multiply that by however many representations you want to provide and the necessary compute resources are astronomical, even just doing SD content!

In my test last night at q55 I got quite good results at just over 500 Kbps.

plonk420

22nd October 2019, 11:15

i'd sure love some help with settings...

i'm pretty comfortable with x264 via MeGUI, however ffmpeg syntax gives me a headache.

i was told to try encoding VP9 to get practice for AV1, but i couldn't control the filesize with Handbrake regardless of CRF 25, 36, or 42. but at least i got ~40% cpu usage. so then i switched to AV1 encoding directly from ffmpeg, but couldn't for the life of me get past frame 53 until i got rid of -b:v 0, which the @#@!#$ instructions TELL you to use (https://trac.ffmpeg.org/wiki/Encode/AV1). still not sure i'm going to get a usable file as it's not growing beyond 5kB, but we'll see in 10-30 minutes

Losko

1st May 2020, 16:16

I'm doing a range of CQ encodes :)

Do CQ encodes require 2-pass mode?
Or perhaps 2-pass mode is only needed when doing VBR?

Blue_MiSfit

1st May 2020, 21:53

I haven't thought about this in a bit :)

I can't see why you'd need 2 pass mode for fixed QP encoding. I could be wrong tho.

benwaggoner

2nd May 2020, 01:55

I haven't thought about this in a bit :)

I can't see why you'd need 2 pass mode for fixed QP encoding. I could be wrong tho.
If it's truly fixed QP encoding, then 2-pass can't even do anything, by definition.

Of course, no one actually needs or wants fixed QP encoding! Rate control and adaptive quantization are key features that make encoders good. Fixed QP is really only seen in JPEG images and in early stage codec testing before rate control is implemented.

It's honestly kind of dumb that we keep designing codecs with an implicit bias for fixed QP delivering optimal PSNR, which are things that don't correlate particularly well with human vision.

Sagittaire

2nd May 2020, 02:07

It's honestly kind of dumb that we keep designing codecs with an implicit bias for fixed QP delivering optimal PSNR, which are things that don't correlate particularly well with human vision.

I don't know. It's easier if you want to compare codecs at a constant quantizer.

For PSNR, it's always the same thing: if the delta is 0.2 dB, you can't conclude anything, but if the delta is 2.0 dB, there's no doubt, with psy optimisation or not.

Blue_MiSfit

2nd May 2020, 05:04

Of course, no one actually needs or wants fixed QP encoding! Rate control and adaptive quantization are key features that make encoders good. Fixed QP is really only seen in JPEG images and in early stage codec testing before rate control is implemented.

It's honestly kind of dumb that we keep designing codecs with an implicit bias for fixed QP delivering optimal PSNR, which are things that don't correlate particularly well with human vision.

Isn't it valid (if wasteful) to do your own brute force rate control by doing per-scene adaptive fixed QP encoding? This is precisely the technique Netflix outlined for optimized VP9 encoding. In theory it sounds like a great solution, just hugely expensive :)

https://netflixtechblog.com/optimized-shot-based-encodes-now-streaming-4b9464204830

Tadanobu

2nd May 2020, 08:49

Do CQ encodes require 2-pass mode?
Or perhaps 2-pass mode is only needed when doing VBR?

2-pass is default behavior for all aomenc encodes for a few months now, even for --end-usage=cq. But the first pass is fast (like 15x faster than 2nd pass) and worth it quality-wise.

Losko

2nd May 2020, 15:16

2-pass is default behavior for all aomenc encodes for a few months now, even for --end-usage=cq. But the first pass is fast (like 15x faster than 2nd pass) and worth it quality-wise.

True, I didn't manage to get aomenc to write anything until I added "--passes=1", which is not at all obvious.

Losko

2nd May 2020, 15:27

Anyway, a 2-pass encoding reached the end.
Slowly, of course.
$ ./aomenc -v --good --fps=25/1 --cpu-used=1 --row-mt=1 --threads=16 --passes=2 --pass=2 \
--fpf=aom1st.pass --limit=2000 --bit-depth=8 --end-usage=cq --cq-level=50 --tune=ssim \
--obu -o encoding_test_1.obu sample.y4m

I used --cpu-used=1 and --cq-level 50 and the result is quite ugly - so I will test different options.

What seems weird to me is the CPU load monitor, which shows this:
https://i.postimg.cc/HsS0yycc/Schermata-a-2020-05-01-18-01-44.png
This is very different from the graph generated during x265 encodings (100% the whole time) and makes me think that better options could bring the CPU load to its maximum and (hopefully) make the encoding finish faster.

But which options can tune this?

Tadanobu

2nd May 2020, 16:36

If you don't specify anything, aomenc will do 2-pass on its own (unless you are using a very old version). But you can manually set each pass.

Your command line is missing something. Either you want to use constant quality like this
--end-usage=q --cq-level=50

Or constrained quality like this
--end-usage=cq --cq-level=50 --target-bitrate=XXXX

I know, this is confusing. Also, --good is the default.

As for your target, 50 is quite low quality (best is 0, worst is 63). The 30-40 range should suit you better (depending on your source and what you're trying to achieve).

Aomenc is very bad at multi-threading. It will not use much more than one core. The current best option is to cut your video into chunks and encode them in parallel. That way, it can be as fast as high-quality x265. Tools like Neav1e and Av1an exist for that. Or you can try SVT-AV1.
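
Since the q / cq distinction is easy to get wrong, here is a small Python sketch that picks the rate-control flags for the two modes described in this post. Only the three flags already shown are used; the helper name `rate_control_args` and its parameters are hypothetical, and the bitrate is passed through in whatever unit your aomenc build expects (its help text is the place to check).

```python
def rate_control_args(cq_level, target_bitrate=None):
    """Return aomenc rate-control flags for constant or constrained quality.

    cq_level follows the 0 (best) to 63 (worst) range mentioned above.
    Without a target bitrate this is plain constant quality (--end-usage=q);
    with one it becomes constrained quality (--end-usage=cq).
    """
    if not 0 <= cq_level <= 63:
        raise ValueError("cq_level must be between 0 and 63")
    if target_bitrate is None:
        return ["--end-usage=q", f"--cq-level={cq_level}"]
    return [
        "--end-usage=cq",
        f"--cq-level={cq_level}",
        f"--target-bitrate={target_bitrate}",
    ]


# Constant quality in the suggested 30-40 range.
print(" ".join(rate_control_args(35)))
# Constrained quality; 1500 is an arbitrary example value.
print(" ".join(rate_control_args(35, target_bitrate=1500)))
```

The only thing this encodes is the rule from the post: --end-usage=cq without a --target-bitrate is the combination to avoid.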

Losko

2nd May 2020, 20:03

Either you want to use constant quality like this
--end-usage=q --cq-level=50

Or constrained quality like this
--end-usage=cq --cq-level=50 --target-bitrate=XXXX

Thanks, this was not clear.

Tools like Neav1e and Av1an are here for that.

Thank you again, I didn't know either tool. I think I will play a bit with Av1an.

benwaggoner

5th May 2020, 03:52

I don't know. It's easier if you want to compare codecs at a constant quantizer.
Yes, it's done because it is easy. Not because it is particularly relevant :sly:. One example of what that approach missed was the deficits in the VC-1 adaptive quantization feature. It turned out in practice that the signalling overhead of the QP variance was so high that it wound up eating up the psychovisual gains.

For PSNR, it's always the same thing: if the delta is 0.2 dB, you can't conclude anything, but if the delta is 2.0 dB, there's no doubt, with psy optimisation or not.
Oh, you can definitely find the better looking version having a mean PSNR >2 dB worse! Especially if you're talking about a clip >>10 seconds long. The mean of per-frame PSNR misses all sorts of utterly essential rate control features.

PSNR also has weaker psychovisual correlation with HDR content, and at 4K and beyond.

Sagittaire

6th May 2020, 20:00

Yes, it's done because it is easy. Not because it is particularly relevant :sly:. One example of what that approach missed was the deficits in the VC-1 adaptive quantization feature. It turned out in practice that the signalling overhead of the QP variance was so high that it wound up eating up the psychovisual gains.

Oh, you can definitely find the better looking version having a mean PSNR >2 dB worse! Especially if you're talking about a clip >>10 seconds long. The mean of per-frame PSNR misses all sorts of utterly essential rate control features.

PSNR also has weaker psychovisual correlation with HDR content, and at 4K and beyond.

Yes, perhaps with Average PSNR, but it's more difficult with Overall PSNR. OPSNR is a good rate control check (with the same settings and the same codec): if you introduce only one bad frame in a 1000-frame sequence, you destroy the OPSNR score but not the APSNR score.

For example, in your Benwaggoner challenge, OPSNR detected an RC bug in x265's rate control.
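
The difference between the two scores can be checked with a few lines of Python. This is only a sketch under the usual definitions: average PSNR is the mean of the per-frame PSNR values, overall PSNR is the PSNR of the mean squared error over the whole sequence. It assumes 8-bit video (peak 255) and equally sized frames, and the MSE values are invented.

```python
import math


def psnr(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit peak by default)."""
    return 10.0 * math.log10(peak * peak / mse)


def average_psnr(frame_mse):
    """APSNR: compute PSNR per frame, then take the arithmetic mean."""
    return sum(psnr(m) for m in frame_mse) / len(frame_mse)


def overall_psnr(frame_mse):
    """OPSNR: average the MSE over all frames first, then convert to dB."""
    return psnr(sum(frame_mse) / len(frame_mse))


# Invented data: 1000 good frames versus 999 good frames plus one disaster.
clean = [1.0] * 1000
one_bad = [1.0] * 999 + [1000.0]

apsnr_drop = average_psnr(clean) - average_psnr(one_bad)
opsnr_drop = overall_psnr(clean) - overall_psnr(one_bad)

print(f"APSNR drop: {apsnr_drop:.3f} dB")
print(f"OPSNR drop: {opsnr_drop:.3f} dB")
```

With these numbers the single bad frame costs about 0.03 dB of APSNR but about 3 dB of OPSNR, because averaging in the dB domain damps an outlier that averaging the error itself does not.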

Losko

8th May 2020, 08:26

... I think I will play a bit with Av1an.

And I think I won't anymore, IMHO the project is not ready for prime time.

Here are the facts.
I ran the encoding using my sample file (y4m, 1080p, 2000 frames, about 6 GB) and somehow the scene detector found 4 chunks. I would disagree with how those chunks were identified, but I will assume they're OK.
Encoding parameters:
--fps=25/1 --cpu-used=1 --row-mt=1 --passes=1 \
--bit-depth=8 --end-usage=q --cq-level=40 --tune=ssim
Then av1an split the source file and created 4 chunks in a temp directory (which is bad). They were uncompressed, like the source file, so they were just as big.
My CPU is an i5 6200U (2 cores with HT). Av1an decided to create 2 workers (and this is wrong; in fact the CPU monitor showed the load almost only on two "cpus").
The entire encoding took more than 66 hours (!!! and this is very bad).
[By default, av1an removes its temp files. It has an option to tell it not to, but by the time I discovered this, it was too late.]
The resulting concatenated file is horrible: it's more than 35 MB and it plays wrong, as if it had a very low frame rate, BUT it doesn't. It is 25 fps (as I requested on the command line) but has LOTS of repeated frames, and its duration is now more than 53 minutes (minutes! - did I say this is wrong already?)

Then, I did all this by hand.
I took the very same chunks av1an used and ran aomenc in four different terminals, just adding the options --skip and --limit. They worked as expected - no temp files, thanks.
They kept the four CPUs at 100% load, as I was aiming for; the longest encoding ended after 5.5 hours (no, not 55) and the four jobs all stayed around 0.03-0.04 fps (which is slow but reasonable).
The concatenated file is 7.5 MB. It has 2000 frames and plays wonderfully at 25 fps.
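
For anyone wanting to repeat the by-hand approach, here is a rough Python sketch that only builds and prints one aomenc command line per chunk, so they can be pasted into separate terminals. The flags are the ones already quoted in this thread. The chunk start frames, output file names and the `chunk_commands` helper are hypothetical, and the sketch assumes --limit counts frames after the skipped ones, which is worth verifying on your build.

```python
import shlex


def chunk_commands(source, chunk_starts, total_frames, cq_level=40):
    """Build one aomenc argument list per chunk, without running anything."""
    # Each chunk ends where the next one starts; the last ends at the clip end.
    chunk_ends = chunk_starts[1:] + [total_frames]
    commands = []
    for index, (start, end) in enumerate(zip(chunk_starts, chunk_ends)):
        commands.append([
            "aomenc", "--fps=25/1", "--cpu-used=1", "--row-mt=1", "--passes=1",
            "--bit-depth=8", "--end-usage=q", f"--cq-level={cq_level}",
            "--tune=ssim",
            f"--skip={start}",         # input frames to skip before encoding
            f"--limit={end - start}",  # frames to encode (assumed: after the skip)
            "--obu", "-o", f"chunk_{index:02d}.obu",  # hypothetical output name
            source,
        ])
    return commands


# Hypothetical boundaries: four equal chunks of a 2000-frame clip.
commands = chunk_commands("sample.y4m", [0, 500, 1000, 1500], total_frames=2000)
for cmd in commands:
    print(shlex.join(cmd))
```

Every job reads the same source file, so nothing is copied to a temp directory. The outputs still have to be joined afterwards, and that step is not covered here.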

Tadanobu

8th May 2020, 11:43

There are many options in av1an to set the number of workers (-w), the threshold for scene cut detection (-tr) and many other things. I recommend you read the documentation (https://github.com/master-of-zen/Av1an). The project is still in development tho, but so is av1.

It's hard to tell what went wrong in your case. We'd need more information about your system, the source... I really think it has to do with what settings you used and/or what you expected the tool to do. Many users (including myself) have been using this tool and have had good results. If you want to know more about the tool or av1 encoding, you can join the av1 discord (https://discord.com/invite/Ecu428C). The creator of av1an is often available there.

The video is indeed split in chunks, but not necessarily uncompressed. If you use an uncompressed file as a source, the chunks will be uncompressed. If you use a compressed file, the compressed chunks will be piped to aomenc.

What I don't get is what the problem is with working in a temp directory. Space?

Losko

11th May 2020, 14:19

I see your point but I'm not gonna invest time in av1, not now.
I just took it for a ride, encoding some clips, trying some tools and learning something new (av1 offers unbeatable quality per bit, for example).
Further, the advantages av1an offers are not enough for me, as encoding a video in separate chunks is no big deal (unless one wants to fill >20 cores - hey, at that point we're not in a home use case anymore).
[Having said that, however you encode your chunks, issues with decoding buffers may still arise at the join points]

No, today I see av1 is great but ATM I cannot afford it - 4 encode tasks @ 0.03/0.04 fps mean 0.12/0.16 global fps on a four-core cpu and this is way tooooo slow, even compared to x265 and its ~1 fps. I will reconsider av1 (and I will for sure, since I like using a free codec) when I buy a more powerful CPU (if the quality/size ratio didn't matter, I'd happily go with qsv encoding, but...).
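
As a sanity check on the throughput figures, here is the arithmetic in Python. It uses only numbers reported earlier in the thread for the four-terminal run (0.03 to 0.04 fps per job, 2000 frames, 5.5 hours for the longest job) plus the rough ~1 fps quoted for x265; the function names are just for illustration.

```python
def aggregate_fps(per_job_fps, jobs):
    """Combined throughput of several encodes running side by side."""
    return per_job_fps * jobs


def wall_clock_fps(total_frames, hours):
    """Effective throughput once the slowest job has finished."""
    return total_frames / (hours * 3600.0)


low = aggregate_fps(0.03, 4)
high = aggregate_fps(0.04, 4)
effective = wall_clock_fps(2000, 5.5)

print(f"aggregate while all four jobs run: {low:.2f} to {high:.2f} fps")
print(f"effective over the whole run:      {effective:.3f} fps")
print(f"x265 at ~1 fps is roughly {1.0 / high:.0f}x to {1.0 / low:.0f}x faster")
```

The effective figure comes out a little under the aggregate one. One plausible reason is that scene-based chunks are not equal in length, so some cores sit idle while the longest chunk finishes.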

About chunks: yes, I always want to avoid wasting space, because it is risky (what if the main application crashes?) and avoidable: every OS I know offers piping (which is fast and light), so why not? What would any encoder do with my bluray videos before encoding? They're >20GB, and making copies takes time, on top of disk space. No, thanks - kids play with temp files, adults pipe.

Sagittaire

16th May 2020, 11:14

Well when you see this really impressive result:
https://forum.doom9.org/showthread.php?p=1904806#post1904806

--denoise-noise-level is certainly one of the most impressive options for AV1. All the options for retaining grain are absolutely useless once you have that.