28th October 2019, 07:26   #1
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,992
Intel SVT-AV1

I wanted to make a thread dedicated to discussing Intel's SVT AV1 encoder, as it seems to be quite good and rapidly improving. I'm seeing fantastic results using fixed QP encoding, though as far as I can tell rate control is pretty basic still.

For example, using fixed QP 44 encoding of a 1080p24 hand-drawn animation source, I was able to get very nice-looking video (average VMAF of 93.6) at 2.8 Mbps using enc-mode 1 (the second-slowest preset).

Speed is quite good for AV1 - I got 0.2 fps on my i7-7700k. This encoder scales very well and saturated my whole system!

What have others seen?

Last edited by Blue_MiSfit; 28th October 2019 at 07:36.
28th October 2019, 10:24   #2
Tadanobu
Registered User
 
Join Date: Sep 2019
Posts: 37
The last time I made subjective comparisons, SVT AV1 was behind libaom in terms of grain and very fine details, even at speed 0. The image looked a bit smoothed, like comparing stock x265 with stock x264. But for real-world scenarios, the trade-off between quality, speed, and compression was already quite good. Speeds 3/4 give very decent results and are way faster than libaom and rav1e. It's very promising.
28th October 2019, 19:21   #3
Blue_MiSfit
Any suggested encoding settings other than the usual "pick the slowest preset you can tolerate"?
29th October 2019, 04:14   #4
quietvoid
Registered User
 
Join Date: Jan 2019
Location: Canada
Posts: 574
My command line for 10-bit tests:
Quote:
SvtAv1EncApp.exe -i sample.y4m -enc-mode 4 -bit-depth 10 -irefresh-type 2 -q 17 -tile-rows 6 -tile-columns 6 -output-stat-file out.stats -input-stat-file out.stats -enc-mode-2p 6 -b out.ivf
Has to be executed twice for 2 pass QP.
30th October 2019, 18:12   #5
Blue_MiSfit
Thanks for sharing.

Just to be clear, do you use tile-rows and tile-columns to make decoding more parallel?

Also, you set irefresh-type to 2 (which makes the encoder use closed GOP), is this for an adaptive streaming use case?
30th October 2019, 18:53   #6
quietvoid
Using tiles makes both encoding and decoding "parallel"; decoding scales better.

From what I've read, irefresh-type 1 (default) produces a non-seekable output.
Also I omitted 'intra-period 24' because that's specific to my tests (just an input source divided in scenes of 24 frames), but that requires closed GOP.
30th October 2019, 20:40   #7
Blue_MiSfit
I see. And you see a quality benefit to 2 pass even when using fixed QP?

Last edited by Blue_MiSfit; 30th October 2019 at 20:51.
30th October 2019, 22:35   #8
quietvoid
Have not compared, but 2 pass is only fixed QP for now.
I don't even know if it's worth using because of that.

Last edited by quietvoid; 30th October 2019 at 22:40.
30th October 2019, 22:48   #9
Blue_MiSfit
Scratching my head a bit trying to understand the 2-pass procedure. You literally run that full command above two times (specifying enc-mode, enc-mode-2p, input-stat-file, and output-stat-file each time)?
30th October 2019, 23:01   #10
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,885
Quote:
Speed is quite good for AV1 - I got 0.2 fps on my i7-7700k. This encoder scales very well and saturated my whole system!
1) 0.2 fps on 4c/8t means that even on a 64c/128t EPYC you would get only about 3 fps. That's still pathetic encoding speed.
2) Saturating all 8 CPU threads during encoding is not a spectacular achievement. 100% CPU usage on a Ryzen 9 3950X would be something.
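
For reference, the extrapolation in point 1 can be checked with a quick back-of-the-envelope calculation. This is only a sketch that assumes throughput scales perfectly linearly with core count (an optimistic assumption, not a measurement); the function name is made up for illustration.

```python
def extrapolate_fps(measured_fps: float, measured_cores: int, target_cores: int) -> float:
    """Upper-bound estimate: assumes throughput scales linearly with core count."""
    return measured_fps * target_cores / measured_cores


# 0.2 fps measured on a 4-core i7-7700K, extrapolated to a 64-core EPYC.
estimate = extrapolate_fps(0.2, 4, 64)
print(f"{estimate:.1f} fps")
```

Ideal scaling lands at roughly 3.2 fps. Real encoders rarely scale perfectly, so the actual figure on such a machine would likely be somewhat lower.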
30th October 2019, 23:48   #11
quietvoid
Quote:
Originally Posted by Blue_MiSfit
Scratching my head a bit trying to understand the 2-pass procedure. You literally run that full command above two times (specifying enc-mode, enc-mode-2p, input-stat-file, and output-stat-file each time)?
Currently that's how it works; it will change to be more user-friendly.

However, I'm looking at the user guide and I think my 2-pass options are wrong.
-output-stat-file is for the first pass and -input-stat-file is for the second pass. The two should probably not be used at the same time like I did (I used an example command from an issue).
-enc-mode-2p is the preset at which the first pass is done, from what I understand.

Perhaps like this:

Quote:
First pass: SvtAv1EncApp.exe -i sample.y4m -enc-mode 4 -bit-depth 10 -irefresh-type 2 -q 17 -tile-rows 6 -tile-columns 6 -output-stat-file out.stats -enc-mode-2p 6 -b out.ivf
Second pass: SvtAv1EncApp.exe -i sample.y4m -enc-mode 4 -bit-depth 10 -irefresh-type 2 -q 17 -tile-rows 6 -tile-columns 6 -input-stat-file out.stats -enc-mode-2p 6 -b out.ivf

Last edited by quietvoid; 30th October 2019 at 23:55.
31st October 2019, 03:23   #12
Blue_MiSfit
Quote:
Originally Posted by Atak_Snajpera
1) 0.2 fps on 4c/8t means that even on a 64c/128t EPYC you would get only about 3 fps. That's still pathetic encoding speed.
2) Saturating all 8 CPU threads during encoding is not a spectacular achievement. 100% CPU usage on a Ryzen 9 3950X would be something.
That was with very high quality encoding. This is still AV1; 0.2 fps isn't too bad, to be honest.

Using faster settings (preset 5) I was able to achieve 4 fps, which is quite fast for 1080p AV1 with 4 cores. The fastest preset delivers 12 fps for me.

Regarding point 2, I think that's exactly what SVT AV1 is designed for, particularly when encoding multiple bitrates in parallel for ABR delivery. A full ABR ladder encoded live appears to be one of the goals of the project. libaom had terrible threading last I checked; many big OTT VOD services run its encodes single-threaded.

Last edited by Blue_MiSfit; 31st October 2019 at 03:28.
31st October 2019, 03:41   #13
Blue_MiSfit
Okay I finally got 2 pass working. @quietvoid you're correct, you have to use -enc-mode-2p with -output-stat-file for the first pass and -enc-mode with -input-stat-file for the second pass.

Here's a quick example using the fastest speed preset and raw 8-bit YUV input, targeting 1080p at 3 Mbps VBR.

First Pass
Code:
svtav1encapp -i Beauty.yuv -w 1920 -h 1080 -fps-num 24000 -fps-denom 1001 -intra-period 96 -irefresh-type 2 -rc 2 -tbr 3000000 -enc-mode-2p 8 -output-stat-file stats.file -b Beauty_svtav1_8_3000.ivf
Second Pass
Code:
svtav1encapp -i Beauty.yuv -w 1920 -h 1080 -fps-num 24000 -fps-denom 1001 -intra-period 96 -irefresh-type 2 -rc 2 -tbr 3000000 -enc-mode 8 -input-stat-file stats.file -b Beauty_svtav1_8_3000.ivf
I get an output that decodes okay (and looks awful, predictably), but curiously there's an error during the second pass:

Error in freed returnVal 0

Odd.

Rate control at least seems to hit the target: it achieved 2995 kbps, which is within tolerance.

Any ideas on that error?
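
For anyone scripting this, the two passes above can be sketched in Python roughly as follows. This is only an illustration built from the flags shown in this post (as they existed in the build discussed in this thread); the helper name `two_pass_commands` is hypothetical, and the frame rate and GOP settings are hard-coded to the example's values.

```python
import shlex


def two_pass_commands(source, width, height, target_bps, preset, stats_file, output,
                      encoder="svtav1encapp"):
    """Build the first- and second-pass argument lists for the procedure above."""
    common = [
        encoder, "-i", source,
        "-w", str(width), "-h", str(height),
        "-fps-num", "24000", "-fps-denom", "1001",
        "-intra-period", "96", "-irefresh-type", "2",
        "-rc", "2", "-tbr", str(target_bps),
    ]
    # First pass: the preset goes in -enc-mode-2p and the stats file is written.
    first = common + ["-enc-mode-2p", str(preset),
                      "-output-stat-file", stats_file, "-b", output]
    # Second pass: the preset goes in -enc-mode and the stats file is read back.
    second = common + ["-enc-mode", str(preset),
                       "-input-stat-file", stats_file, "-b", output]
    return first, second


first, second = two_pass_commands("Beauty.yuv", 1920, 1080, 3_000_000, 8,
                                  "stats.file", "Beauty_svtav1_8_3000.ivf")
print(shlex.join(first))
print(shlex.join(second))
```

The sketch only prints the two command lines. To actually run them, each list could be passed in order to `subprocess.run(..., check=True)`, assuming the encoder binary is on the PATH.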

Last edited by Blue_MiSfit; 31st October 2019 at 07:55.
31st October 2019, 15:49   #14
Tadanobu
I think I've read somewhere that high -tile-rows and -tile-columns values are bad for encode quality. Can somebody confirm this? I have deleted my test encodes, but I think I tried not to go above 2x2. Or is it CPU related?
1st November 2019, 07:29   #15
Blue_MiSfit
It makes sense to me that these would impact quality somewhat, since they're a bit like slices for MPEG codecs.

I don't know exactly how much they would impact quality.
3rd November 2019, 03:27   #16
Blue_MiSfit
Yikes. It's not too bad for small numbers of tiles, but it blows up really fast.

Using fixed QP 45, here are some results. It's not a full BD-rate analysis, but some food for thought. I used identical values for -tile-columns and -tile-rows in each of the examples below.

Code:
tile-rows/-columns | bitrate (kbps) | fps  | VMAF average
        0          |      4867      | 4.52 |    91.773
        1          |      4923      | 4.94 |    91.759
        2          |      5000      | 4.88 |    91.725
        4          |      5346      | 4.88 |    91.676
        6          |      5995      | 5.01 |    91.557
So, encoding quality is similar across the board, but there is an enormous bitrate cost when setting both -tile-rows and -tile-columns to 6 at 1080p: about 23% more bits than with both set to 0.

I imagine the impact would be less for UHD.
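
To make the bitrate cost in the table above easier to read, here is a small Python sketch that computes the overhead of each tile setting relative to the first row. The numbers are copied from the table; the function name is just for illustration.

```python
# (tile setting, bitrate) pairs copied from the table above.
results = [(0, 4867), (1, 4923), (2, 5000), (4, 5346), (6, 5995)]


def bitrate_overhead(rows):
    """Return {tile setting: percent bitrate increase relative to the first row}."""
    baseline = rows[0][1]
    return {setting: 100.0 * (bitrate - baseline) / baseline
            for setting, bitrate in rows}


overhead = bitrate_overhead(results)
for setting, pct in overhead.items():
    print(f"tile setting {setting}: +{pct:.1f}%")
```

The overhead stays under 3% up to a setting of 2, reaches about 9.8% at 4, and about 23.2% at 6, while VMAF barely moves.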

Last edited by Blue_MiSfit; 3rd November 2019 at 03:41.
3rd November 2019, 10:05   #17
Tadanobu
Yeah, I made some new tests and I can confirm that. Higher numbers give higher bitrates while the encoding speed is not significantly faster. I can't see any good reason to use it at the moment.

So, here are some screenshots for subjective comparison. The source is 8-bit 1080p, encoded on an i7-8750H using ffmpeg.

Speed went from 0.01 fps (-q 20 -enc-mode 0) to 8.07 fps (-q 50 -enc-mode 8). I personally find -q 20 -enc-mode 4 to be a good balance, but it is still as slow as 0.89 fps.

Cropped and zoomed in
https://i.imgur.com/Q0IDhhL.jpg
https://i.imgur.com/JuviCld.jpg
https://i.imgur.com/jBSR00P.jpg
https://i.imgur.com/XCZucKs.jpg
https://i.imgur.com/Sw9fTs5.jpg
https://i.imgur.com/fzT3jba.jpg
https://i.imgur.com/RMlXto3.jpg
https://i.imgur.com/FM7Iwj8.jpg
https://i.imgur.com/ZoNvbsh.jpg
https://i.imgur.com/YQiPuvE.jpg

Full frames
https://i.imgur.com/y6VX0j9.png
https://i.imgur.com/9QHBAnK.png
https://i.imgur.com/IuhcV8j.png
https://i.imgur.com/IkDEXT5.png
https://i.imgur.com/oHFdv3Q.png
https://i.imgur.com/mCavZbf.png
https://i.imgur.com/G4n1vtt.png
https://i.imgur.com/TJ72o5A.png
https://i.imgur.com/3wWDFld.png
https://i.imgur.com/B83rdXk.png
3rd November 2019, 11:38   #18
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,366
I would generally recommend using at least two tiles at this point for the significant decoding boost it gives. Two tiles, compared to only one, almost double the possible decoding speed, which for a new codec like this is quite significant.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
3rd November 2019, 21:26   #19
Blue_MiSfit
Interesting, so to make two tiles, would you set tile-row to 2 and leave tile-column at 0 (or vice versa)?

[edit]
Just did some research: the config params are log2 values.

e.g.
param -> actual
1 -> 2
2 -> 4
3 -> 8
4 -> 16

I have no idea WHY this is the case (seems to be in order to use the same syntax as libaom)...

Did a quick sanity check, using -tile-columns 1 gave slightly better metrics than -tile-rows 1, but that's just a couple quick tests. It will likely be content dependent based on motion.

[/edit]
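
As a rough illustration of the log2 mapping in the edit above, here is a small Python sketch. The function name is hypothetical, and it only computes the requested tile grid; the encoder may clamp the request depending on the frame size.

```python
def tile_grid(tile_rows_log2: int, tile_columns_log2: int):
    """Convert log2 -tile-rows / -tile-columns values into (rows, columns, total tiles)."""
    rows = 1 << tile_rows_log2
    columns = 1 << tile_columns_log2
    return rows, columns, rows * columns


# Reproduce the param -> actual mapping listed in the post.
for param in (1, 2, 3, 4):
    print(param, "->", 1 << param)

# Two tiles in total: leave -tile-rows at 0 and set -tile-columns to 1.
print(tile_grid(0, 1))
```

So to get exactly two tiles, one of the two parameters is set to 1 and the other is left at 0; setting both to 2 already requests a 4x4 grid of 16 tiles.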

Last edited by Blue_MiSfit; 3rd November 2019 at 23:10.
4th November 2019, 21:06   #20
Tadanobu
Thanks for the tiles trick. I'll use it from now on.

By the way, what do you guys think of SVT HEVC and SVT VP9? Is it my tests that are wrong, or do they compress better than SVT AV1?

Visually speaking, SVT HEVC has been giving me very nice results. SVT VP9 is a little bit behind in terms of compression, but I get almost similar results with faster encoding. Even -enc-mode 0 is ~3.5 fps, not bad. -enc-mode 10 goes up to 82 fps and the difference from -enc-mode 0 is barely perceptible. I need to do more tests, but seriously, I'm impressed.

Last edited by Tadanobu; 5th November 2019 at 19:19.