Old 23rd December 2022, 12:51   #21  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,664
Quote:
Originally Posted by benwaggoner View Post
Are there really still devices that need --slices 4 for H.264 encoding?
all the players I can think of from the last 10+ years work just fine with single slices.
Probably not, but you know, the spec was there back then and it's still there now.
Could it be tossed? Probably, but you know, who would take the risk when you have something in production that has been like this for years and years and suddenly *poof* you get that 1 customer coming back to you complaining eheheheheh
You know what they say "as long as it works, don't touch it" xD


@Beelzebubu... Thank you so much for the very detailed explanation!
Looks like libaom is already doing all the things I want by default (new sequence header for each keyframe, closed gop, temporal delimiter)!
Very very nice.
Now it's just a matter of me tweaking and trying to optimize it for our trailers (yes, that's gonna be the ONLY thing we'll trial with AV1 when we publish to the web, for now).
It's gonna be a good exercise for... next year!
It's December 23rd, so I think I can safely say this now: Merry Christmas and Happy New Year guys!!
Old 5th February 2023, 11:19   #22  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,664
Houston, we have a problem.

The encoding is going:

[image not preserved]

however at 0.1 fps and when I check the task manager...

[screenshot not preserved]

So... I've implemented AV1 encoding in FFAStrans (the free software I keep working on along with the community) with a simple workflow:

[workflow image not preserved]

which receives a request via REST API, indexes some XDCAM-50 files with FFVideoSource() and FFAudioSource(), grabs CH.1-2, deinterlaces to 25p if needed, downscales the chroma from YV16 to YV12 (creating the AVS script automatically), then encodes the video in AV1 and the audio in Opus, muxes everything in WebM, delivers the result and replies to the API request.

The following encoding settings are used:

Quote:
-strict -2 -c:v libaom-av1 -profile:v 0 -level:v 4.1 -refs 4 -crf 18 -keyint_min 25 -g 25 -enable-cdef true -enable-global-motion true -arnr-strength 4 -color_primaries bt709 -color_trc bt709 -colorspace bt709 -color_range tv -field_order progressive -c:a libopus -b:a 510k -ar 48000 -max_muxing_queue_size 700 -map_metadata -1 -f webm

The question is: is the fact that AV1 uses non-square blocks and macroblocks making it hard to parallelize?
I mean, I can of course play the parallelism card by encoding several files at the same time rather than one at a time, like I do with x262 for MPEG-2, but seeing a 56c/112th AVX-512 Intel Xeon sit idle with just 1 core at 100% on such a new codec breaks my heart a bit.

Anyway, I'm testing in production and so far so good.

Last edited by FranceBB; 13th February 2023 at 22:48.
Old 5th February 2023, 12:52   #23  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,080
Yeah, parallelism wasn't great with video encoders from Google (VP8, VP9) and it isn't great in aomenc either, but I don't think it's inherent to the format.
Assuming you don't want to switch to another encoder (e.g. svt-av1) you could also use Av1an, which uses chunked encoding and seems like a wrapper for ffmpeg.

Cu Selur
__________________
Hybrid here in the forum, homepage
Old 7th February 2023, 03:52   #24  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,571
Quote:
Originally Posted by Selur View Post
Yeah, parallelism wasn't great with video encoders from Google (VP8, VP9) and it isn't great in aomenc either, but I don't think it's inherent to the format.
Assuming you don't want to switch to another encoder (e.g. svt-av1) you could also use Av1an, which uses chunked encoding and seems like a wrapper for ffmpeg.
YouTube encoding is all about segment-level parallelism, run on whatever cores are free for long enough in Google's cloud. I don't think multithreading has ever been a key scenario for them.

There are AV1 encoders that do provide good parallelization, so I concur it's not codec-specific.

VP8 did have some seriously serialized aspects which hampered multithreaded encode and decode.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 7th February 2023, 17:25   #25  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 108
Quote:
Originally Posted by FranceBB View Post
Houston, we have a problem.
[..]
Code:
 -strict -2 -c:v libaom-av1 -profile:v 0 -level:v 4.1 -refs 4 -crf 18 -keyint_min 25 -g 25
 -enable-cdef true -enable-global-motion true -arnr-strength 4 -color_primaries bt709
 -color_trc bt709 -colorspace bt709 -color_range tv -field_order progressive
 -c:a libopus -b:a 510k -ar 48000 -max_muxing_queue_size 700 -map_metadata -1
 -metadata creation_time=now -f webm
Building a bit on what's said above: you're expected to provide parameters for parallelism in your encoder configuration when working with Google encoders. Please look into -tile-rows, -tile-columns etc., and make sure -row-mt 1 is also set (I don't recall if it's set by default). -threads should be set by default when encoding with FFmpeg. Note that -tile-columns and -tile-rows are in log2 units, so -tile-columns 3 means exp2(3) = 8 tile columns.

Each tile can be encoded independently, so with enough tiles, you can keep a number of cores active in parallel.
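
To make the log2 convention concrete, here is a minimal Python sketch that converts the -tile-columns / -tile-rows values into actual tile counts. The function name tile_count is made up for illustration, and the sketch does not model any limits the encoder itself may apply to the tile layout for a given frame size.

```python
def tile_count(tile_columns_log2: int, tile_rows_log2: int) -> tuple[int, int, int]:
    """Return (columns, rows, total tiles) for log2-style tile settings.

    Hypothetical helper: it only does the exp2() arithmetic described above.
    """
    columns = 1 << tile_columns_log2  # exp2(tile_columns_log2)
    rows = 1 << tile_rows_log2        # exp2(tile_rows_log2)
    return columns, rows, columns * rows


if __name__ == "__main__":
    for cols_log2, rows_log2 in [(0, 0), (1, 0), (2, 1), (3, 0)]:
        columns, rows, total = tile_count(cols_log2, rows_log2)
        print(f"-tile-columns {cols_log2} -tile-rows {rows_log2}: "
              f"{columns}x{rows} layout, {total} tile(s)")
```

Since each tile can be encoded independently, the total is roughly the upper bound on how many threads tiling alone can keep busy.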
Old 9th February 2023, 16:15   #26  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,080
@Beelzebubu: you can see in FranceBB's screenshot that the issue isn't that the cores aren't active, the issue is that they are hardly utilized.
=> The only way I see to properly utilize more CPU power at the same time and get good results faster is chunked encoding.
Tweaking
Code:
-t <arg>, --threads=<arg>             Max number of threads to use
--row-mt=<arg>              Enable row based multi-threading (0: off, 1: on (default))
--fp-mt=<arg>               Enable frame parallel multi-threading (0: off (default), 1: on)
--tile-columns=<arg>        Number of tile columns to use, log2
--tile-rows=<arg>           Number of tile rows to use, log2
doesn't really change much here.
Old 10th February 2023, 03:47   #27  |  Link
foxyshadis
ангел смерти
 
Join Date: Nov 2004
Location: Lost
Posts: 9,555
Av1an is a really nice little project that supports massively chunked encoding, via (as one option) a real-time scene detector lookahead, and a lot of other niceties in addition. It's more designed for VapourSynth, but simple workflows like that are a snap to convert from Avisynth.

SVT-AV1 on its own has much better thread-utilization capabilities on a single process than aom, but with that many cores it'll still run into limitations.
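
As a rough illustration of what chunked encoding means, the Python sketch below splits a clip into frame ranges that can be encoded as separate jobs. This is a deliberate simplification and not how Av1an works: Av1an picks split points with scene detection, whereas this sketch assumes a fixed keyframe interval (like the -g 25 -keyint_min 25 used earlier in the thread) and cuts on multiples of it. The name plan_chunks and the default of 10 GOPs per chunk are assumptions for the example.

```python
def plan_chunks(total_frames: int, gop: int = 25,
                gops_per_chunk: int = 10) -> list[tuple[int, int]]:
    """Split [0, total_frames) into half-open (start, end) frame ranges.

    Every chunk starts on a multiple of `gop`, so each one begins where a
    keyframe would fall with a fixed keyframe interval. Hypothetical helper.
    """
    if total_frames < 0 or gop <= 0 or gops_per_chunk <= 0:
        raise ValueError("total_frames must be >= 0, gop and gops_per_chunk > 0")
    chunk_len = gop * gops_per_chunk
    return [(start, min(start + chunk_len, total_frames))
            for start in range(0, total_frames, chunk_len)]


if __name__ == "__main__":
    # Example: a 2-minute trailer at 25p is 3000 frames.
    chunks = plan_chunks(3000)
    print(f"{len(chunks)} chunks, first {chunks[0]}, last {chunks[-1]}")
```

Each range would then be encoded by its own encoder process and the results concatenated, which is what lets a many-core machine stay busy even when a single encoder instance cannot.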
Old 13th February 2023, 22:46   #28  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,664
As a bit of an update on this, I've encoded 106 trailers in total from XDCAM-50 and PCM Stereo to AV1 + Opus Stereo, very briefly QCed 'em and they look fine.
Although it's true that a single encode didn't use all the cores on a server (and I have a farm dividing jobs as they come etc), I played the parallelism card in terms of files, so I encoded all 106 trailers at the same time.
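
File-level parallelism of this kind can be sketched in a few lines of Python. This is not how FFAStrans schedules jobs; it is a generic, hedged example that runs several external commands at once with a capped number of workers. The names run_jobs and dry_run are made up for illustration, and the demo deliberately uses a dry-run runner so that nothing is actually launched.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def _run(cmd: list[str]) -> int:
    """Launch one external command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def run_jobs(commands: list[list[str]], max_workers: int, runner=_run) -> list[int]:
    """Run the commands `max_workers` at a time; results keep the input order.

    Threads are enough here because each job is an external process
    (for example one ffmpeg instance per trailer), not Python-side work.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


def dry_run(cmd: list[str]) -> int:
    """Stand-in runner for the demo: print the command instead of running it."""
    print("would run:", " ".join(cmd))
    return 0


if __name__ == "__main__":
    # Placeholder commands; real jobs would be full ffmpeg command lines.
    jobs = [["ffmpeg", "-i", f"trailer_{n}.avs"] for n in range(4)]
    print(run_jobs(jobs, max_workers=2, runner=dry_run))
```

With a mostly single-threaded encoder, setting max_workers close to the number of physical cores is a natural starting point, but the right value depends on memory and I/O and would need to be measured.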
This was really part of a trial and I've sent those out to the distribution team who will... uhm... do something with them, dunno, but they will hopefully be distributed to end users if everything is ok.
As for FFAStrans, given that it will be used by other users as well, I'll try tweaking the parameters that Selur suggested above and see how it goes given that I'm trying to find the least painful way to implement the node for the users to be able to use it.

Anyway, thank you everyone for the support, you've been really kind and I'll let you know how the trial went once they tell me something (keep in mind that this will now go into distribution and I don't work there, so it's outside of my area, but I'll keep you posted).
Old 9th May 2023, 13:06   #29  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,664
And... I'm back, 'cause I have to encode more trailers for the website.
I'm experimenting a bit with -row-mt -tile-columns -tile-rows -tiles and -cpu-used.
For FULL HD 25p content, what do you think is the sweet spot between quality and speed?

I'm trying with:

Quote:
ffmpeg.exe -i "AVS Script.avs" -hide_banner -strict -2 -c:v libaom-av1 -profile:v 0 -level:v 4.1 -refs 4 -crf 22 -keyint_min 25 -g 25 -enable-cdef true -enable-global-motion true -arnr-strength 4 -color_primaries bt709 -color_trc bt709 -colorspace bt709 -color_range tv -field_order progressive -c:a libopus -b:a 510k -ar 48000 -max_muxing_queue_size 700 -map_metadata -1 -metadata creation_time=now -row-mt 1 -tile-columns 1 -tile-rows 0 -tiles 2x1 -cpu-used 5 -f webm -y "A:\MEDIA\temp\test.webm"

pause
on my PC (an old i7 5930K 6c/12th) and it definitely improved things. CPU usage is much higher at 68% and I'm seeing 22fps vs the original 0.8fps.

[screenshot not preserved]

and it's about the same on a 56c/112th Xeon server, with 23fps as top speed and 11% total usage:

[screenshot not preserved]

Did I get the settings right? Is it gonna affect the quality in a significant way or can I just set this as default?
I mean, 23fps is definitely way more acceptable than 0.8fps as it's at least almost 1:1 (25fps would be realtime).
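
The "almost 1:1" remark is simple arithmetic, sketched here in Python. The fps figures come from the post above; the 120-second trailer length is an assumption picked only to make the numbers tangible, and both function names are hypothetical.

```python
def realtime_factor(encode_fps: float, source_fps: float = 25.0) -> float:
    """Fraction of realtime speed: 1.0 means encoding as fast as playback."""
    return encode_fps / source_fps


def encode_minutes(duration_s: float, encode_fps: float,
                   source_fps: float = 25.0) -> float:
    """Wall-clock minutes needed to encode `duration_s` seconds of video."""
    total_frames = duration_s * source_fps
    return total_frames / encode_fps / 60.0


if __name__ == "__main__":
    trailer_s = 120  # assumed trailer length, not taken from the thread
    for fps in (0.8, 23.0):
        print(f"{fps} fps: {realtime_factor(fps):.3f}x realtime, "
              f"{encode_minutes(trailer_s, fps):.1f} min for a {trailer_s}s trailer")
```

At 23fps a 25p source encodes at 0.92x realtime, while 0.8fps is about 0.03x, which is the difference between minutes and roughly an hour for a clip of that assumed length.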
Old 9th May 2023, 19:46   #30  |  Link
Emulgator
Big Bit Savings Now !
 
Join Date: Feb 2007
Location: close to the wall
Posts: 1,401
AV1 23fps on a 5930K ? I gotta try AV1 once again.
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're working on that issue. Synce invntoin uf lingöage..."
Old 9th May 2023, 22:58   #31  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,664
Quote:
Originally Posted by Emulgator View Post
AV1 23fps on a 5930K ? I gotta try AV1 once again.
Eh, but with

Code:
-row-mt 1 -tile-columns 1 -tile-rows 0 -tiles 2x1 -cpu-used 5
which I don't know whether it's considered cheating or not 'cause I'm unfamiliar with the encoder.
I can see that the more you increase the -cpu-used parameter, the faster it is.
There are also the tiles to be taken into account but from the topics I read here on Doom9 some people were advocating against raising tiles, especially for FULL HD encodes, 'cause it degrades quality as the more you raise it the more the video is split in individual tiles encoded separately.
That's what I understood, but I might be wrong. I think someone who has much more experience than me should pop in and comment on that 'cause the documentation only tells you half the story.
Old 10th May 2023, 13:07   #32  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 108
Quote:
Originally Posted by FranceBB View Post
I'm experimenting a bit with -row-mt -tile-columns -tile-rows -tiles and -cpu-used.
For FULL HD 25p content, what do you think is the sweet spot between quality and speed?
[..]
Code:
-row-mt 1 -tile-columns 1 -tile-rows 0 -tiles 2x1 -cpu-used 5
[..]
it definitely improved things. CPU usage is much higher at 68% and I'm seeing 22fps vs the original 0.8fps.
The trade-off is up to you. For the second tile (2x1 instead of the "theoretically optimal" 1x1), you'll lose approximately 0.5-1.0% in BD-rate terms. -cpu-used 5, on the other hand, likely costs you something like 20% BD-rate. So if you look at it that way, you can definitely use more tiles to balance the trade-off between tiles and a higher -cpu-used preset: you might get the same runtime but better quality using e.g. 4 tiles and -cpu-used 4, or even 8 tiles and -cpu-used 3. But if you just want more speed, combining both is fine too (-cpu-used 5 and 8 tiles, so -tiles 4x2 or -tile-rows 1 -tile-columns 2 -row-mt 1).

The tile-or-no-tile discussion you're referring to here is kind of pointless in this particular situation, because libaom has limited other threading options. I believe it has frame threading nowadays, but the implementation is a bit different so it's not yet as good (speed/quality-wise) as what you'd see in x264/5 or other (commercial) encoders. But if you're looking for more speed and run out of tiles to add, it's another option to consider.
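
To compare the combinations mentioned above on your own content, a small Python helper can generate the relevant FFmpeg options for each candidate. This is only a sketch: libaom_speed_args is a made-up name, and the 2x2 layout for "4 tiles" is an assumption (the post only spells out 4x2 for 8 tiles). The options themselves (-row-mt, -tiles, -cpu-used) are the ones already used in this thread.

```python
def libaom_speed_args(tile_cols: int, tile_rows: int, cpu_used: int) -> list[str]:
    """Return the threading/speed options for one libaom-av1 candidate.

    Hypothetical helper: it only formats options, it does not run ffmpeg.
    """
    return ["-row-mt", "1",
            "-tiles", f"{tile_cols}x{tile_rows}",
            "-cpu-used", str(cpu_used)]


if __name__ == "__main__":
    # (tile columns, tile rows, cpu-used); the 2x2 layout is an assumption.
    candidates = [
        (2, 1, 5),  # the setting tried in the thread
        (2, 2, 4),  # 4 tiles, slower preset
        (4, 2, 3),  # 8 tiles, slower preset still
        (4, 2, 5),  # 8 tiles, same preset: speed only
    ]
    for cols, rows, cpu in candidates:
        print(" ".join(libaom_speed_args(cols, rows, cpu)))
```

Encoding the same trailer once per candidate and comparing runtime, file size and quality would show where the sweet spot sits for this kind of material.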
Old 10th May 2023, 15:22   #33  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,664
Ah, gotcha!
Thank you for the clarification on this. Again, I'm new to AV1 as I don't generally encode stuff for the internet (i.e. our website) but rather for hardware playout ports that are used to produce satellite feeds, so seeing how libaom works is definitely interesting.
Anyway, although I'm gonna lose some quality, I think for simple trailers -cpu-used 4 and 4 tiles is probably enough, especially compared to the savings made from moving away from H.264 which are still there.

Thanks!