Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
#1 | Link |
Registered User
Join Date: Dec 2021
Location: Canada
Posts: 34
|
Encoder tuning Part 4: A 2nd generation guide to aomenc-av1, shooting for the stars!
So, this is a follow-up to the 2nd part guide regarding aomenc-av1, which can be found here:
https://old.reddit.com/r/AV1/comment...cav1libaomav1/

While that guide is still mostly fine at first glance, I've learned a lot since then about the pseudo-reference AV1 encoder: its options, its intricacies, and best of all, its shortcomings. I now understand much better what the options do, how to take advantage of them, when to actually use them, and how to get around their downsides, both through some clever option choices and through a custom WIP build that addresses aomenc-av1's greatest weakness: a surprising lack of deep psycho-visual optimizations (the intra-only path has a nice number of them, but barely any have video coding versions).

Before I begin, I have to add that this is not comprehensive documentation. A simple Reddit forum post is far too small for such a massive endeavour, so a separate post will follow with an entry on a dedicated wiki of some sort explaining what each and every option does in detail, including the speed features.

Now, on to the main subject of the post itself: the 2nd generation tuning guide for aomenc-av1!

Encoder speed preset

The encoder preset itself:
Code:
--cpu-used=X

For realtime purposes like streaming, the RT presets range from 5 to 10, with 5 being the slowest RT preset and 10 being the fastest. For reference, the default is 0. Not exactly optimal...

My general recommendation for choosing a preset is based on speed, usability and quality. In that context, all realtime presets are off the table until aomenc gets frame-threading merged into the mainline build, because of their low single-instance speed/quality ratio; in that sense you are better off using SVT-AV1 right now.

Otherwise, my recommendations sit in the middle:
- CPU-2 is the lowest preset I'd recommend actually using.
- CPU-3 is a good middle ground in general, since it keeps most of the juicy features on.
- CPU-4 is good for those wanting faster encoding than CPU-3 while not losing much.
- CPU-5 is where the tradeoffs start getting a bit more severe, because of heavier pruning and the disabling of features (particularly loop restoration filtering).
- CPU-6 is the fastest I'd go with aomenc. Any faster today, and going with SVT-AV1 is a better tradeoff.

General recommendations: `--cpu-used=2` for slow encoding, `--cpu-used=3` as the middle ground, and `--cpu-used=5` as the fast option.

Keyframe refresh intervals
Code:
--kf-max-dist=240 --kf-min-dist=12

For seeking purposes in most content, the standard recommendation is 10 seconds' worth of frames, with 300 frames usually being the maximum used to keep seeking performance good. So, my recommendations would be 240 frames for 24 FPS, 250 frames for 25 FPS, and 300 frames for content at 30 FPS and above.

As for kf-min-dist, it is the minimum number of frames that must pass before another keyframe can be placed. It is mainly a safeguard in case scene detection fails to insert intra-refreshes properly, or fails to recognize flashes and places unnecessary keyframes all over the place.

Threading options
Code:
--threads=cpu-threads --sb-size=64
Code:
--threads=cpu-threads --sb-size=64 --tile-columns=1
Code:
--threads=cpu-threads --sb-size=64 --tile-columns=2 --tile-rows=1
Code:
--threads=cpu-threads --tile-columns=2 --tile-rows=1
Code:
--threads=2 --sb-size=64

Now, threading in aomenc. What an interesting subject. Aomenc has access to these threading methods:
- Row threading
- Tile threading
- Smaller task threading
- Frame-threading (experimental, so it will not be tackled in this guide)

The AV1 standard has 2 superblock sizes, 64x64 and 128x128, the latter also allowing for larger partitions at higher resolutions. That is not very useful at standard HD resolutions (<=1080p), but it does exist for a good reason. In aomenc, the default behavior is to choose dynamically between 64x64 and 128x128 superblocks. This is good, as very large static SBs and partitions can be detrimental to speed and perceptual quality to a small extent.

Another side effect of using larger SBs is that row threading gets less effective. Tile threading can be used to balance that out, but as I've tested personally, the penalty for using static 64x64 SBs is lower than that of adding even one additional tile column. So if you care about encoder-side threading, set the encoder to use 64x64 SBs before adding tiles.

The main reason to add tiles would be to boost random access performance for the decoder, as frame threads are much higher latency than tile threads. Adding tiles boosts seeking performance.

Finally, tiles still follow the power-of-2 rule: the value you pass is the exponent. Therefore, `--tile-columns=1` = 2¹ = 2 tile columns. The total number of tiles is: # of tile columns * # of tile rows. Thus, `--tile-columns=2 --tile-rows=1` = 2² columns x 2¹ rows = 4x2 tiles = 8 tiles.

Rate control
Code:
--end-usage=q --cq-level=24

The Q rate control mode is basically a quantizer modulated by spatial adaptive quantization, temporal RDO, spatio-temporal AQ (deltaq-mode=1,2) and motion in general. Its closest equivalent is CRF, so use it if you target maximum quality encodes without a bitrate limit.

CQ is Constrained Quality: it is similar to Q, except that it can't go as high in quality because of its bitrate constraint, among other things. It is not recommended unless you have very specific requirements.

VBR and CBR are Variable and Constant Bitrate respectively. Unless you have a very recent aomenc build with the bitrate accuracy compiler flag enabled, I wouldn't recommend using them if you're trying to target a certain quality/bitrate ratio.

As for cq-level, it is how you choose your base quality level/modulated quantizer:
- 18 is where high quality encoding starts.
- 20 is usually a good target for higher quality encoding.
- 24 is usually a good target for encoding at a decent quality.
- 30 is where the low-mid quality threshold starts, and where aomenc-av1 really starts to pull ahead of other encoders in quality/bitrate.
- 35-40 is where YouTube quality can be achieved without using more exotic settings.
- Anything higher is where the low quality threshold starts.

Note that these guidelines are all for 8-bit SDR live-action/animation sources. Very high motion and high contrast sources like video games have entirely different requirements, and that's not even mentioning native 10-bit HDR sources with larger color gamuts. For video games, I usually recommend raising the Q level by 10-15 above the usual recommendations to achieve bitrates similar to those of easier content. As for HDR sources, keep reading.

Bit-depth and chroma subsampling
Code:
--bit-depth=10

In AV1, you have access to 8-bit coding and 16-bit coding. That leaves you with the bit-depths that the AV1 standard allows: 8-bit, 10-bit, and 12-bit.

I **always** recommend encoding in **10-bit**, even from an 8-bit source, particularly if your source is limited range YCbCr with 4:2:0 chroma subsampling. In other words, most video sources currently found on the Internet. Not only does encoding in 10-bit let the encoder process everything in 16-bit buffers (getting higher coding efficiency due to considerably less truncating/rounding off), but the much higher color depth of 10-bit coding and output allows for a more perceptually efficient result, **particularly in darker shades where differences are more easily noticeable by the human eye and where dithering is more prominent.** Also, since 8-bit YCbCr <> 8-bit RGB conversion is not lossless, unlike other transforms like YCoCg and XYB, 10-bit YCbCr allows for lossless RGB conversion to your screen.

As for other high bit-depth sources, keeping the same bit-depth is optimal, especially if you value general HW decoder compatibility. The same thing applies to chroma subsampling: unless you must support widespread HW decoders, keep the same chroma subsampling as the source.

Encoding passes and lookahead
Code:
--lag-in-frames=48

2-pass was extremely important in vpxenc-vp9: not only was it the only way for the encoder to utilize scene detection, it also allowed for the placement of alternate reference frames. Skipping it seriously cripples what the encoder can do. It also disables other stuff, but that applies to aomenc-av1 as well, so let's move back to the AV1 encoder.

In aomenc-av1, 2-pass allows for these things in particular:
- More advanced scene detection when the lookahead buffer is high enough.
- Partition recoding: the encoder itself can decide whether or not to redo partition selection based on the preset and other conditions, resulting in better partition selection.
- Better auto-alt-ref placement throughout the encoded stream.

It also does some more advanced things, so always use 2-pass if you can. Luckily, it's set by default in the standalone encoder, so you don't need to do anything if you use a utility like nmkoder or av1an.

As for lookahead, it is controlled through a parameter called --lag-in-frames. More lookahead in aomenc gives you:
- Better rate control.
- Better temporal RDO.
- Better frame placement.
- Generally more effective motion preservation, due to a combination of the previous points and other factors.

In default aomenc, the range of lag-in-frames is 0-48, with the default being 35. I always recommend setting it to 48, as it increases efficiency nicely without any significant penalty other than higher memory consumption.

Another effect of lag-in-frames is the kind of scene detection the encoder uses:
- 0-18: No scene detection.
- 19-32: Scene detection mode 1 is active (due to limited future frame prediction).
- 33 and higher: Scene detection mode 2 is active. The large number of future references allows for the highest level of scene detection present in aomenc, and more information is gathered.

Temporal filtering
Code:
--arnr-strength=2 --arnr-maxframes=3
Code:
--arnr-strength=1 --arnr-maxframes=3
Code:
--arnr-strength=0

Contrary to what I and many others believed, the arnr-maxframes=X parameter sadly does not affect the maximum number of alternate reference frames in the encoder's search space. So, the settings written above affect temporal filtering, and nothing else. Interestingly enough, temporal filtering isn't exclusive to AV1 encoders: it can be found in encoders for other standards and even in some HW encoders, but that's a discussion for another day.

That means `--arnr-strength=X` affects the strength of the filtering itself. Higher = stronger = fewer details/artifacts pass through at the same quantizer. I am of the philosophy that less is more, and if you want more filtering, you want external filtering, which has far more dials to turn to tweak the output. That said, the filtering within the encoder is simple, decently effective, and reasonably well tied to the encoding process (which can cause some problems of its own...): it lowers the filtering strength if your chosen quantizer is low enough. Of course, that adjustment isn't very large (1), so I prefer setting the strength lower myself.

As for arnr-maxframes, the trick is pretty simple: a lower number of frames gets you higher visual consistency, as with all spatio-temporal filtering, while a bigger filtering window gets you potentially higher quality filtering at the cost of a higher chance of temporal artifacts. I prefer a low number of frames for temporal filtering, for a more consistent look.

Animation is low variance by default, so there is no need to have temporal filtering on at all.

Spatial and spatio-temporal adaptive quantization
Code:
--aq-mode=1 --deltaq-mode=1
Code:
--aq-mode=1 --deltaq-mode=0
Code:
--aq-mode=1 --deltaq-mode=0 --enable-tpl-model=0

At very low bitrates, you can disable adaptive quantization entirely.

In aomenc, you have access to 3 spatial aq-modes:
I pretty much always recommend aq-mode=1, since encoders are usually not very good at giving bits to low variance spots, and aomenc is no exception (in fact, I'd argue it's not very good at it in the 1st place). It would be nice if aq-mode=1 also had an AC bias like x264/x265's aq-modes, but that's a topic for another day.

As for the spatio-temporal deltaq-mode=X options (1/2; 3/4 are currently meant for AVIF/all-intra), they do some rather interesting things.

deltaq-mode=1 is spatio-temporal adaptive quantization based on objective metrics. It works in tandem with temporal RDO (tpl-model) to get nice coding gains by weighing costs between inter and intra coding modes alongside temporal optimizations. It works well at low-mid bitrates, but at higher fidelity levels, and especially with grainy content, it can be a detriment to fidelity.

deltaq-mode=2 is supposed to be the perceptual version of this, but not only does it not work well currently, it also comes with a large speed penalty even at CPU-2/3, so I do not recommend using it at all as of March 2022.

Sorry if this is not the full post, but there are character limits on Doom9, and since I didn't post much until today, I need to wait for the mods' approval to post the 2nd part.

Last edited by BlueSwordM; 5th March 2022 at 06:38. Reason: Mistakes |
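The keyframe-interval and tile-count arithmetic from this post can be sketched in a few lines of Python. This is only an illustration of the rules of thumb stated above (10 seconds' worth of frames capped at 300, and tile options given as powers of 2); the function names are hypothetical and nothing here calls aomenc.

```python
def max_keyframe_distance(fps: float, seconds: float = 10.0, cap: int = 300) -> int:
    """Frames in `seconds` of video at `fps`, capped at `cap` frames.

    Mirrors the guideline above: about 10 seconds between keyframes,
    never more than 300 frames.
    """
    return min(round(fps * seconds), cap)


def total_tiles(tile_columns_log2: int, tile_rows_log2: int = 0) -> int:
    """Tile count when --tile-columns / --tile-rows are given as exponents of 2."""
    return (2 ** tile_columns_log2) * (2 ** tile_rows_log2)


if __name__ == "__main__":
    for fps in (24, 25, 30, 60):
        print(f"{fps} FPS: --kf-max-dist={max_keyframe_distance(fps)}")
    # --tile-columns=2 --tile-rows=1 gives 4 columns x 2 rows
    print("tiles:", total_tiles(2, 1))
```

The cap is what makes 30 FPS and 60 FPS content land on the same value; below 30 FPS the 10 second rule decides.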
#2 | Link |
Registered User
Join Date: Dec 2021
Location: Canada
Posts: 34
|
Sharpness
Code:
--sharpness=0
Code:
--sharpness=1

Before June 2021, the sharpness parameter affected how End of Block (EoB) optimizations were done and how high the RD multiplier offset was set (every sharpness uptick added +0.1 to the RD multiplier). This forced the encoder to utilize sharper transforms, leading to more of the original sharpness being kept, higher detail retention and, most importantly, better clarity in high motion segments.

After June 2021, the aomenc devs decided to F everything up. While trying to make good changes, and mostly succeeding, they removed the RD multiplier offset entirely, which made `--sharpness=1` equal to `--sharpness=2-5`. That made the parameter practically useless under our noses, until some of us noticed and I changed that BS behaviour in my aom-av1-psy fork.

Grain synthesis
Code:
--enable-dnl-denoising=0 --denoise-noise-level=5
Code:
--film-grain-table=photon-noise-isoXXX.tbl
Code:
--photon-noise=X

Since the grain synth guide is still valid, I'll just copy paste it from my 3rd generation guide:

For --denoise-noise-level=XX (crappy name, I know), a higher number dictates a larger amount of noise.

The default mode of operation (--enable-dnl-denoising=1) denoises the input in the 1st pass, after which the denoised stream is passed on to the encoder to do the rest of the job. It does an OK job at grain synthesis, but because of the denoising pass, not only does the 1st pass become agonizingly slow, practically doubling the already lengthened encoding process, it also gives a lower quality output than would be expected.

That is why a new option giving the user control to disable that pesky denoising was added in 2020: --enable-dnl-denoising=0. This bypasses the denoiser entirely, restoring the normal 1st pass speed, making the normal encoding process a bit faster, and giving a higher quality output. It does quite well in live-action content, which is why I always recommend enabling it for that kind of content.

Of course, the grain synth process in aomenc is still not threaded, so it can still cause some problems, as it is a latency bottleneck.

For photon noise, I'd rather link directly to my still valid old guide, since this post is getting long as is: https://old.reddit.com/r/AV1/comment...is_tables_for/

Rate distortion tuning
Code:
--tune=psnr

The SSIM RD tune is indeed superior, since it performs additional psy block distortion optimizations to distribute bitrate more evenly towards what we deem higher quality. I recommend it somewhat for live-action, but I will repeat myself: do not use it for animation :P

The VMAF tunes are all bad except for `--tune=vmaf_without_preprocessing`, but it's quite slow, so I wouldn't use it.

The butteraugli tune is the best, but it currently only works in 8-bit and on Linux builds, so I won't go into it here.

There is also another tune that is pretty decent and works on all OSes, but I will reserve that for another time.

Decoding optimizations
Code:
--enable-cdef=0 --enable-restoration=0

Restoration filters are filters that aomenc can use to get back some of the detail lost in the encoding process, such as Wiener restoration filtering and self-guided restoration filtering. They are normally quite useful, and at higher bitrates they usually back off in strength quite nicely. However, they can be decoding bottlenecks at high resolutions, in which case disabling them is a good idea. I personally recommend disabling restoration filtering first, and if really needed, you can disable CDEF filtering completely as well. You could also disable the loop filter, but honestly that is never a good idea unless you want your stream to look like x264 ultrafast.

Note: Starting at CPU-5, restoration filtering is disabled entirely, which is one of the main reasons CPU-5 is a decent bit faster than CPU-4.

Miscellaneous arguments

--tune-content=default --- Leave this at the default tune unless you encode pure screen content (screen sharing or Peppa Pig types of animation). For gaming, just leave the encoder to decide.

--enable-qm=1 --- This enables quantization matrices for aomenc. I have 0 idea why it's not enabled by default, as it provides free psy and coding gains. Always leave it on no matter what. There are no penalties for enabling it. For reference, the default min-qm table is 5, and the default max-qm table is 9, which is a good choice of constants.
Smaller QM table = steeper quantization matrix (bigger differences between each step)
Bigger QM table = flatter quantization matrix (smaller differences between each step)

--quant-b-adapt=0/1 --- This parameter, unlike what I said in the previous guide, does not enable a special adaptive quantization flag. Instead, it adaptively enables further block optimizations for "trellis" optimization. Enabling it does increase efficiency. It can decrease fidelity in some cases, but since it does not do so consistently, it's not bad for high fidelity. On or off doesn't matter too much unless you're at low bitrates, where enabling it does consistently help.

--enable-fwd-kf=1 --- This parameter enables bi-directional keyframes and open-GOP. Always leave it on, since there aren't any significant encoding or decoding penalties with it on. Even though chunked encoding makes bi-directional KFs much rarer, it still allows for open-GOP at the mini-GOP level, which gives a decent efficiency uplift.

--enable-chroma-deltaq=0 --- To those who read the previous guide, this might seem rather strange. Why would I stop recommending a parameter I recommended in the past? Well, it's because this parameter takes away chroma bits: specifically, it increases the Q by 2 for the chroma channels. I thought it was the opposite for a long time. Why does it behave this way? It was meant for 4:4:4 sources and was never tweaked beyond that. It is actually very good for 4:4:4 sources, where chroma resolution is plentiful. For 4:2:0 sources, where chroma data is scarce, utilizing such a parameter in default aomenc starves the chroma channels even more, creating even more distracting color artifacts. For that reason alone, I would not use it for video sources, where 4:2:0 is the most prevalent chroma subsampling. That might change in the near future, but sadly that is not currently the case.

--enable-keyframe-filtering=0/1/2 --- Use KF=2 if you can use av1an/nmkoder/aomenc-by-gop with MKVToolnix/MKVMerge to merge the clips; it is the most efficient. Use `--enable-keyframe-filtering=1 --arnr-strength=1` if you want to avoid the dreaded KF=1 low probability random BS artifacts (unless you use the aom-av1-psy build, which manages to fix them in a smart way), and KF=0 if you want to avoid all of that at a significant efficiency penalty.

--profile=0/1/2 --- Profile 0 for 10-bit 4:2:0, profile 1 for 10-bit 4:4:4, profile 2 for 12-bit and 4:2:2.

HDR encoding and metadata
Code:
--deltaq-mode=5 --color-primaries=bt2020 --transfer-characteristics=smpte2084 --matrix-coefficients=bt2020ncl

--deltaq-mode=5 is a deltaq mode that adjusts the luma and chroma quantizers per block according to the block's average luma in HDR, as defined in T-REC-H.Sup15.

Sorry for the much bigger walls of text, but I've amassed an immense amount of knowledge and experience since I wrote the 1st aomenc-av1 guide. As such, I had to be much more thorough in my writing, while also correcting my previous, rather naive mistakes caused by my lack of knowledge of the encoder and the standard itself. I'm actually surprised no one tried to correct me until a few months ago, which is when I started to write the 2nd generation aomenc-av1 guide.

Important note: These parameters are all meant for the mainline aomenc build. My current aom-av1-psy build is an entirely different monster that deserves its own separate post, since half of the post would be a rant.

Now, for the pièce de résistance: the settings you've been waiting for all along!

Settings for standalone aomenc that I use with default aomenc (mostly for chunked encoding in av1an/nmkoder with thread pinning and aomenc-by-gop) at 1080p:
Code:
--threads=2 --cpu-used=3 --end-usage=q --cq-level=24 --enable-fwd-kf=1 --aq-mode=1 --lag-in-frames=48 --bit-depth=10 --kf-max-dist=240 --kf-min-dist=12 --enable-qm=1 --sb-size=64 --enable-keyframe-filtering=2 --arnr-strength=2 --arnr-maxframes=3 --sharpness=1 --enable-dnl-denoising=0 --denoise-noise-level=5
Code:
--threads=2 --cpu-used=3 --end-usage=q --cq-level=18 --enable-fwd-kf=1 --aq-mode=1 --lag-in-frames=48 --bit-depth=10 --kf-max-dist=240 --kf-min-dist=12 --enable-qm=1 --sb-size=64 --enable-keyframe-filtering=2 --arnr-strength=1 --arnr-maxframes=3 --deltaq-mode=0 --sharpness=1 --enable-dnl-denoising=0 --denoise-noise-level=5

Highest fidelity
Code:
--threads=2 --cpu-used=3 --end-usage=q --cq-level=16 --enable-fwd-kf=1 --aq-mode=1 --lag-in-frames=48 --bit-depth=10 --kf-max-dist=240 --kf-min-dist=12 --enable-qm=1 --sb-size=64 --enable-keyframe-filtering=2 --arnr-strength=1 --arnr-maxframes=3 --enable-restoration=0 --deltaq-mode=0 --sharpness=1 --enable-dnl-denoising=0 --denoise-noise-level=5
Code:
--threads=2 --cpu-used=3 --end-usage=q --cq-level=18 --enable-fwd-kf=1 --aq-mode=1 --lag-in-frames=48 --bit-depth=10 --kf-max-dist=240 --kf-min-dist=12 --enable-qm=1 --sb-size=64 --arnr-strength=1 --arnr-maxframes=3 --deltaq-mode=0 --sharpness=1 --enable-dnl-denoising=0 --denoise-noise-level=5

If you're encoding at higher resolutions, you can raise that to 8 threads, drop grain synthesis if you like (since you're using higher bitrates), and raise `--tile-columns` to `--tile-columns=1`, or at 4K to `--tile-columns=2 --tile-rows=1`, to gain maximum decoding performance.

For 2D animation, just setting `--arnr-strength=0` is your best bet.

If you like to encode using ffmpeg, here are some base parameters you can play with (please use 2-pass ffmpeg if you want the most optimal encoding with aomenc; for simple encoding, just use SVT-AV1):
Code:
ffmpeg -i input.mkv -c:v libaom-av1 -cpu-used 3 -threads 8 -crf 18 -arnr-max-frames 3 -arnr-strength 1 -aq-mode 1 -lag-in-frames 48 -tile_columns 1 -aom-params sb-size=64:enable-qm=1:enable-dnl-denoising=0:denoise-noise-level=5:deltaq-mode=0 -g 240 -keyint_min 12 -pix_fmt yuv420p10le -c:a copy output.mkv

If you have any additional questions, or any corrections/clarifications you would like me to add, please leave them below. Criticisms welcome.

Last edited by BlueSwordM; 3rd March 2022 at 18:10. Reason: Clarification |
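For anyone scripting the standalone encoder, here is a minimal sketch of how the first 1080p preset above could be assembled into an argument list, with per-content overrides such as the `--arnr-strength=0` suggestion for 2D animation. The flag values are copied from the post; the function name, the dictionary, and the file names are hypothetical, and the script only builds and prints the command rather than running aomenc.

```python
import shlex

# Flags copied from the first 1080p preset in the post above.
BASE_FLAGS = {
    "threads": 2,
    "cpu-used": 3,
    "end-usage": "q",
    "cq-level": 24,
    "enable-fwd-kf": 1,
    "aq-mode": 1,
    "lag-in-frames": 48,
    "bit-depth": 10,
    "kf-max-dist": 240,
    "kf-min-dist": 12,
    "enable-qm": 1,
    "sb-size": 64,
    "enable-keyframe-filtering": 2,
    "arnr-strength": 2,
    "arnr-maxframes": 3,
    "sharpness": 1,
    "enable-dnl-denoising": 0,
    "denoise-noise-level": 5,
}


def build_aomenc_args(source, output, overrides=None):
    """Return an aomenc argument list: base flags, then overrides, then -o output source."""
    flags = dict(BASE_FLAGS)          # copy so the base preset is never mutated
    flags.update(overrides or {})     # e.g. {"arnr-strength": 0} for 2D animation
    args = ["aomenc"] + [f"--{name}={value}" for name, value in flags.items()]
    return args + ["-o", output, source]


if __name__ == "__main__":
    # Placeholder file names; pass the list to subprocess.run() to actually encode.
    animation = build_aomenc_args("input.y4m", "output.ivf", {"arnr-strength": 0})
    print(shlex.join(animation))
```

Keeping the preset as data makes it easy to diff the variants listed above (cq-level, arnr-strength, deltaq-mode) instead of maintaining several near-identical command lines.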
#4 | Link | |
Registered User
Join Date: Sep 2010
Posts: 34
|
Quote:
Many thanks
Last edited by rbauer; 3rd March 2022 at 15:41. |
#6 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,997
|
Great article! Very useful to the community.
I was bedeviled by one missing verb, though: Quote:
#8 | Link |
Registered User
Join Date: Dec 2021
Location: Canada
Posts: 34
|
@benwaggoner, the verb that was missing was raising the Q/CRF.
@rbauer, the recommended setting is keeping it on, AKA --enable-cdef=1.

Anyway, that's all from me for now. I have to finish an article for a website, and my 5th post about aom-av1-psy. |
#9 | Link | |
Registered User
Join Date: Sep 2010
Posts: 34
|
Quote:
I'm testing the latest ffmpeg (64-bit; latest auto-build, 2022-03-16, with libaom-av1 3.3.0; Win10 64-bit), trying to transcode an mp4 file (x264 video and AAC audio). I want to transcode the video stream and just copy the audio stream to an AV1/mkv file. The original video stream is 15 fps (PowerPoint slides, screen sharing, remote college lessons, etc.).

Command line:
Code:
>ffmpeg -i Test.mp4 -c:v libaom-av1 -pix_fmt yuv420p10le -cpu-used 3 -threads 6 -crf 41 -arnr-max-frames 3 -arnr-strength 1 -aq-mode 1 -denoise-noise-level=5 -lag-in-frames 48 -tile_columns 1 -aom-params sb-size=64:enable-qm=1:enable-dnl-denoising=0:deltaq-mode=0:quant-b-adapt=1:enable-keyframe-filtering=1:sharpness=1 -g 150 -keyint_min 12 -c:a copy Test.mkv

This works:
Code:
>ffmpeg -i Test.mp4 -c:v libaom-av1 -pix_fmt yuv420p10le -cpu-used 3 -threads 6 -crf 41 -arnr-max-frames 3 -arnr-strength 1 -aq-mode 1 -lag-in-frames 48 -tile_columns 1 -aom-params sb-size=64:enable-qm=1:enable-dnl-denoising=0:deltaq-mode=0:denoise-noise-level=5:quant-b-adapt=1:enable-keyframe-filtering=1:sharpness=1 -g 150 -keyint_min 12 -c:a copy Test.mkv

Unfortunately "enable-fwd-kf=1" doesn't work either on the ffmpeg command line (-enable-fwd-kf=1) or in -aom-params (:enable-fwd-kf=1): "Unrecognized option 'enable-fwd-kf=1'. Error splitting the argument list: Option not found".

I should probably use ffmpeg just to pass the file content (rawvideo or similar, I suppose) to your aom-av1 variant (aom-av1-psy): could you kindly suggest a good combination of ffmpeg and your aom-av1-psy for my use case?

Many thanks |
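The working command above moves encoder options into ffmpeg's `-aom-params`, which takes a single colon-separated list of key=value pairs. A stray space or an extra `=` inside that list changes how the command line is split, so a small helper can build the string and reject such values up front. This is a sketch with a hypothetical function name; it does not check whether libaom actually recognizes each key.

```python
def aom_params(options: dict) -> str:
    """Join options into the key=value:key=value form expected by -aom-params."""
    parts = []
    for key, value in options.items():
        key, value = str(key).strip(), str(value).strip()
        # Spaces, colons and equals signs would break the single-argument list.
        if not key or not value or any(ch in key + value for ch in " :="):
            raise ValueError(f"cannot encode option {key!r}={value!r}")
        parts.append(f"{key}={value}")
    return ":".join(parts)


if __name__ == "__main__":
    # Keys and values taken from the working command in the post above.
    print(aom_params({
        "sb-size": 64,
        "enable-qm": 1,
        "enable-dnl-denoising": 0,
        "deltaq-mode": 0,
        "denoise-noise-level": 5,
        "quant-b-adapt": 1,
        "enable-keyframe-filtering": 1,
        "sharpness": 1,
    }))
```

The result can be passed as the single argument that follows `-aom-params` in an ffmpeg argument list.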
#10 | Link | |||||
I might be autistic...
Join Date: Apr 2022
Location: 16-235
Posts: 34
|
Curious what you think! I get the feeling I'm just thinking through something you've already considered?
Quote:
Quote:
I tried looking through how AOMEnc handles input, and it does already have extra..."bits" for reading IVF, OBU, and WEBM, but I imagine it's only aomdec.exe that uses them. I'd try to tackle it myself if I knew even a "hello world" level of C/C++, but alas, even VapourSynth intimidates me. If it's not a batch or shell script, I probably can't figure out how it works.

As to why it's the only thing keeping what's already there from being practical...unless AOMEnc's grain synthesis function/table implementation while encoding is lacking - there's no easy way to pipe two different videos that I'm aware of. FFASTrans seems really fascinating, but I don't think it can do it either.

Piping one video? Easy. ffmpeg, or any of the avs2xyz pipes out there. in | out.

Two or more videos? Can they be AVIs? Rename to .avi if it's something that can use VFW. Or if ffmpeg is in there somewhere, compile it with AviSynth input support. Else, use something like AVFS...

Two or more videos as .yuv? I heard you liked filling up hard drives! 1,382,400 Y/U/V samples in 1280x720 4:2:0 x 10 bits per sample = 13,824,000 bits/frame...nearly 14 Mb per frame. x 23.976 fps / 8 bits per byte, blah blah blah, 41,430,528 bytes/sec. Oh. 40 MB/s isn't going to kill a mechanical hard drive, but it will fill it up pretty quickly. 23 GB to store a 10 minute 720p uncompressed "intermediate". 88.9 MB/s for 1080p. Uncompressed YUV is massive. And you'd better have multiple drives available for this, unless you're considering gobbling up all of those SSD write cycles for a few hundred gigabytes of raw video with sweet, sweet random access time...

So it already exists. It's just extremely impractical for anything but professional use.
Quote:
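The raw-video arithmetic in this post can be reproduced with a short sketch. It assumes, as the post does, 4:2:0 subsampling and tightly packed 10-bit samples; common raw 10-bit formats store each sample in a 16-bit word, so real files would be larger still. The function names are hypothetical, and the frame rate is passed as a string so the calculation stays exact.

```python
from fractions import Fraction


def samples_per_frame_420(width: int, height: int) -> int:
    """Luma samples plus two quarter-resolution chroma planes (4:2:0)."""
    return width * height * 3 // 2


def raw_bytes_per_second(width: int, height: int, bits_per_sample: int, fps: str) -> int:
    """Bytes per second of uncompressed 4:2:0 video, using exact rational arithmetic."""
    bits = samples_per_frame_420(width, height) * bits_per_sample * Fraction(fps)
    return int(bits / 8)


if __name__ == "__main__":
    rate_720p = raw_bytes_per_second(1280, 720, 10, "23.976")
    rate_1080p = raw_bytes_per_second(1920, 1080, 10, "23.976")
    print(f"720p:  {rate_720p:,} bytes/s")
    print(f"1080p: {rate_1080p / 2**20:.1f} MiB/s")
    print(f"10 minutes of 720p: {rate_720p * 600 / 2**30:.1f} GiB")
```

Swapping `bits_per_sample` to 16 shows what the same clip costs when stored in the usual 16-bit container for 10-bit samples.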
#11 | Link |
Registered User
Join Date: Dec 2021
Location: Canada
Posts: 34
|
@BuccoBruce, yeah it would work, but the main problem with aomenc's current grain synth implementation(and SVT-AV1's to a lesser extent), is that it's not very good.
That is mostly due to a lack of dynamic strength control with the normal video toolset and a suboptimal default random seed for noise generation (although that can be fixed on the code side, funnily enough, so it's not a problem). aomenc does have a grain synth estimation tool within the all-intra tools, but it was a bit buggy for non-all-intra sources the last time I tried it.

Last edited by BlueSwordM; 29th May 2022 at 05:32. Reason: Clarification and corrections |
#12 | Link | ||
I might be autistic...
Join Date: Apr 2022
Location: 16-235
Posts: 34
|
Quote:
Quote:
#14 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,997
|
My sense is that commercial encoder vendors are the ones who will probably come out with the first practical FGS toolchains. The grain classification and removal process in preprocessing is the hard part, and requires creative professional input to tune properly. The codec proper is blind to FGS, as it only ever would see the degrained output frames of the FGS preprocessing module.
#16 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,997
|
Quote:
Given that temporal randomness is one of the canonical features and challenges of grain, making it static doesn't seem very valuable; it's not like it would save on bits or compute. If a source has static grain, it's probably better to just encode it as is rather than use FGS, as static grain is much easier to encode. Decent temporal denoisers, like those used in FGS, generally wouldn't classify static grain as film grain due to the lack of motion. Instead it would be passed through as texture. |
#17 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,213
|
Hi there, guys,
I got asked to get ready to trial AV1 encodes for distribution, so I was starting to dig a bit into it. The problem is that while I'm used to tweaking x262, x264 and x265, I have absolutely no idea about AV1, what the standard is for it and what should be used.

To put this into context, this will only be for natural content (no animation): Full HD BT709 SDR 25p 4:2:0 8-bit. So it's supposed to be somewhat comparable to an H.264 1920x1080 25p High Profile Level 4.1, keyint 25, limited TV range, BT709 SDR encode at crf 22 with x264. The idea is that with AV1 we could get the same result as x264 at crf 18, but at the bitrate generally used for crf 22.

Now, from a quick look at the libaom-av1 documentation, I think something like this should be a good start:
Quote:
Now...
- How can I specify the access unit delimiters? In x264 I generally use --aud, but I couldn't find anything for AV1.
- How do I specify --overscan show to make sure decoders aren't cropping the result?
- In x264, I generally make sure slices are set to 4 for distribution with --slices 4. Is there such a requirement in official AV1 distribution specs?
- Is there a preset for AV1? I mean like the ones we have for x264 and x265, like medium, slow, slower, veryslow and placebo? And if so, how do I set one?
- In x264 and x265, I generally specify a closed GOP; for instance, in x265 command lines I have --no-open-gop. Is there such a requirement for hardware AV1 decoding in official distribution profiles? And if so, how do I set a closed GOP?
- Last but not least, in x265 I have to repeat headers for hardware distribution encodes, so I always have --repeat-headers turned on. Is there such a requirement for AV1 too? And if so, how do I turn repeat headers on?

Thank you in advance to everyone who wants to help, and sorry about my lack of info, but I've really never touched Google codecs, from VP8 and VP9 to AV1. |
#18 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,997
|
In AV1, you'd use tile-columns instead of slices for threading. I'm not aware of any HW decoders that require that for 1080p or lower. It could help improve multithreaded software decoding, though.
Are there really still devices that need --slices 4 for H.264 encoding? I don't think I've seen that outside of high bitrate Blu-ray encoding specs, and even then, all the players I can think of from the last 10+ years work just fine with single slices.

A lot of what you're interested in gets mentioned in the OP in this thread. |
#19 | Link | ||||
Registered User
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 113
|
Quote:
Quote:
Quote:
closed GOP is default in libaom, use --enable-fwd-kf to use open GOP. Quote:
Last edited by Beelzebubu; 20th December 2022 at 13:50. |