Doom9's Forum > Video Encoding > VP9 and AV1

Old 4th November 2018, 11:04   #1201  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 103
Quote:
Originally Posted by v0lt View Post
@SmilingWolf
In this case, a video stream encoded in multi-threaded mode can be worse than one encoded in single-threaded mode (at the same bitrate, of course), because the motion prediction algorithms cannot work as effectively.
Indeed, tile columns affect compression efficiency. However, they were also the only way to get threaded decoding, and with it smooth 1080p decoding, back when I began testing (well before I registered on doom9). Well, at least that was the case before dav1d, which seems to use a couple of different parallelization techniques.
I'll run a couple of simple test encodes and decodes and report back with some numbers.

Quote:
Originally Posted by v0lt View Post
Added:
I also noticed that some files are decoded by 2 threads, while others are always in single-threaded mode.
Do you have any samples? Files coming from YouTube, perhaps? I'd like to inspect them. I know at least some of the first videos they put online in the AV1 test playlist used a single tile column, which forced single-threaded decoding in anything using libaom (e.g. Firefox, Chrome, FFmpeg).
Old 4th November 2018, 12:13   #1202  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,492
@SmilingWolf
As far as I remember, streams obtained using rav1e v1.0.116 were decoded in single-threaded mode. But samples from elecard.com loaded at least 2 cores.
Now it is difficult for me to recheck it, because the decoder in the player works faster than before.
Old 4th November 2018, 12:20   #1203  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 103
Alright, rav1e doesn't support tiles yet, so every frame is a single big column.
I'll inspect the Elecard samples ASAP.

Meanwhile, my encodes are finishing up, so I'll post size, quality and decoding time differences when they're done.

Last edited by SmilingWolf; 4th November 2018 at 12:25.
Old 4th November 2018, 18:23   #1204  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 103
The clip used is the F.Y.C one I described some pages ago

aomenc/aomdec: 1.0.0-877-ge5761e020
dav1d: 0.0.1 e0c3186

Quality and sizes:
Code:
Sizes:
test.av1.cq20.tc0.ivf: 5956739
test.av1.cq20.tc2.ivf: 6001827 +0.75%
test.av1.cq20.tc6.ivf: 6091937 +2.22%

PSNR-HVS-M:
test.av1.cq20.tc0.ivf: 43.192
test.av1.cq20.tc2.ivf: 43.1736 -0.04%
test.av1.cq20.tc6.ivf: 43.1489 -0.10%

MS-SSIM:
test.av1.cq20.tc0.ivf: 26.5095
test.av1.cq20.tc2.ivf: 26.4895 -0.07%
test.av1.cq20.tc6.ivf: 26.467  -0.15%
Decoding:
Code:
# aomdec --threads=8 --progress -o /dev/null test.av1.cq20.tc0.ivf
480 decoded frames in 4660361 us (103.00 fps)

# aomdec --threads=8 --progress -o /dev/null test.av1.cq20.tc2.ivf
480 decoded frames in 3365067 us (142.64 fps) +27,79%

# aomdec --threads=8 --progress -o /dev/null test.av1.cq20.tc6.ivf
480 decoded frames in 3267103 us (146.92 fps) +29,89%

# time dav1d -i test.av1.cq20.tc0.ivf -o /dev/null --muxer yuv4mpeg2 -q --framethreads 8 --tilethreads 4
480 decoded frames in 1997 ms    (240,36 fps)

# time dav1d -i test.av1.cq20.tc2.ivf -o /dev/null --muxer yuv4mpeg2 -q --framethreads 8 --tilethreads 4
480 decoded frames in 1747 ms    (274,75 fps) +12,51%

# time dav1d -i test.av1.cq20.tc6.ivf -o /dev/null --muxer yuv4mpeg2 -q --framethreads 8 --tilethreads 4
480 decoded frames in 1763 ms    (272,26 fps) +11,71%
TC 0 means one single column (the whole frame).
TC 2 generates 4 columns.
TC 6 generates 10 columns.
I write that they "generate" N columns because there's an upper limit to how many columns fit in a given horizontal resolution. TC 6 allows a maximum of 2^6 = 64 columns, and I use it as a catch-all to generate as many columns as possible for my clips.
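
As a rough check on the column counts above, here is a minimal sketch of how the number of uniformly spaced tile columns falls out of the requested log2 value. It assumes 128x128 superblocks and the uniform tile spacing rules of the AV1 bitstream specification. Both are assumptions on my part rather than something stated in the post, libaom may pick the superblock size differently, and the function names are made up for illustration. The minimum column count imposed by the maximum tile width is ignored, since it does not matter at this resolution.

```python
MAX_TILE_COLS = 64  # upper bound on tile columns in the AV1 spec


def tile_log2(blk_size: int, target: int) -> int:
    """Smallest k such that (blk_size << k) >= target."""
    k = 0
    while (blk_size << k) < target:
        k += 1
    return k


def tile_columns(frame_width: int, requested_log2: int, sb_size: int = 128) -> int:
    """Number of uniform tile columns for a given --tile-columns value.

    sb_size=128 is an assumption; with 64-pixel superblocks the cap doubles.
    """
    sb_cols = -(-frame_width // sb_size)  # ceil: superblock columns in the frame
    max_log2 = tile_log2(1, min(sb_cols, MAX_TILE_COLS))
    log2_cols = min(requested_log2, max_log2)  # the request is clamped to the maximum
    tile_width_sb = (sb_cols + (1 << log2_cols) - 1) >> log2_cols
    return -(-sb_cols // tile_width_sb)  # ceil(sb_cols / tile_width_sb)


if __name__ == "__main__":
    for tc in (0, 2, 6):
        print(f"TC {tc}: {tile_columns(1280, tc)} columns")
```

For a 1280-pixel-wide clip this gives 1, 4 and 10 columns for TC 0, 2 and 6, which matches the numbers above.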

So the takeaways from all this:
  • for negligible quality and size differences you can cut decoding time by up to 30% on 720p (about 43% more fps with aomdec). I'd expect the gain to be even more noticeable at higher resolutions;
  • dav1d is now faster than libaom;
  • interestingly, dav1d gave slightly better results with fewer columns on this particular clip. This might warrant more thorough investigation in the future.
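
A quick sanity check of the decoding numbers, using only the aomdec timings quoted above. It suggests that the percentages in the decoding table are reductions in decode time, which is not the same figure as the gain in fps. The helper names are made up for this sketch.

```python
FRAMES = 480

# aomdec wall-clock decode times in microseconds, copied from the post above
AOMDEC_US = {"tc0": 4660361, "tc2": 3365067, "tc6": 3267103}


def time_saved_pct(base_us: int, new_us: int) -> float:
    """Percentage of decode time saved relative to the baseline."""
    return (base_us - new_us) / base_us * 100


def fps_gain_pct(base_us: int, new_us: int) -> float:
    """Percentage increase in frames per second relative to the baseline."""
    return (base_us / new_us - 1) * 100


if __name__ == "__main__":
    base = AOMDEC_US["tc0"]
    for name, us in AOMDEC_US.items():
        fps = FRAMES / (us / 1_000_000)
        print(f"{name}: {fps:.2f} fps, "
              f"{time_saved_pct(base, us):.2f}% less time, "
              f"{fps_gain_pct(base, us):.2f}% more fps")
```

With these inputs, tc2 and tc6 come out at roughly 28% and 30% less decode time than tc0, which corresponds to roughly 38% and 43% more fps.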

Also, RE: Elecard clips:
they use tile columns (5 for the 720p clips, 10 for the HD clips), which is why the player used more than one core on those.
Old 6th November 2018, 19:15   #1205  |  Link
Mr_Khyron
Member
 
Join Date: Nov 2002
Posts: 157
Quote:
ffmpeg -hide_banner -t 60 -c:v libdav1d -threads 16 -tilethreads 4 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
[libdav1d @ 000001ec27984180] libdav1d bd747b1
Input #0, matroska,webm, from 'Stream2_AV1_4K_22.7mbps.webm':
Metadata:
encoder : libwebm-0.2.1.0
Duration: 00:02:24.12, start: 0.000000, bitrate: 22728 kb/s
Stream #0:0(eng): Video: av1 (Main), yuv420p(tv), 3840x2160, SAR 1:1 DAR 16:9, 25 fps, 25 tbr, 1k tbn, 1k tbc (default)
[libdav1d @ 000001ec27a66b80] libdav1d bd747b1
Stream mapping:
Stream #0:0 -> #0:0 (av1 (libdav1d) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf58.22.100
Stream #0:0(eng): Video: wrapped_avframe, yuv420p, 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc (default)
Metadata:
encoder : Lavc58.39.100 wrapped_avframe
frame= 1500 fps= 77 q=-0.0 Lsize=N/A time=00:01:00.00 bitrate=N/A speed=3.09x
video:785kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=228.359s stime=38.609s rtime=19.687s
bench: maxrss=2776212kB
I tried benchmarking with ffmpeg 4.2
from 16fps with libaom to 77fps with Dav1d
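
The bench line in the log above can also be turned into an average decode rate and a rough count of busy cores. A minimal sketch with the numbers copied from that log; the function name is hypothetical, and the "busy cores" figure is simply CPU time divided by wall-clock time.

```python
import re

# bench line copied from the ffmpeg -benchmark output above
BENCH_LINE = "bench: utime=228.359s stime=38.609s rtime=19.687s"
FRAMES = 1500  # from the final "frame= 1500" status line


def parse_bench(line: str) -> dict[str, float]:
    """Extract utime/stime/rtime (in seconds) from an ffmpeg -benchmark line."""
    return {key: float(val) for key, val in re.findall(r"(\w+)=([\d.]+)s", line)}


if __name__ == "__main__":
    t = parse_bench(BENCH_LINE)
    avg_fps = FRAMES / t["rtime"]
    busy_cores = (t["utime"] + t["stime"]) / t["rtime"]
    print(f"average: {avg_fps:.1f} fps, about {busy_cores:.1f} cores busy")
```

The average rate lands close to the 77 fps shown in the status line, and the CPU-time ratio gives an idea of how many threads were actually kept busy during the run.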
Old 9th November 2018, 19:02   #1206  |  Link
Clare
Registered User
 
Join Date: Apr 2016
Posts: 61
Quote:
Originally Posted by SmilingWolf View Post
What does your command line look like? So far I've been unable to make row-mt work myself

My tries so far:
this one generates invalid bitstream:
../../bin8/aomenc --frame-parallel=0 --tile-columns=2 --tile-rows=2 --row-mt=1 --threads=4 --auto-alt-ref=1 --cpu-used=4 --tune=psnr --passes=2 --end-usage=q --cq-level=40 --test-decode=fatal -o test.av1.cq40.webm orig.i420.y4m
Pass 1/2 frame 480/481 92352B 1539b/f 36899b/s 17849 ms (26.89 fps)
Pass 2/2 frame 19/0 0B 17871 ms 1.06 fps [ETA unknown] 2423FFailed to decode frame 2 in stream 0: Corrupt frame detected
Failed to decode tile data

This one works but uses only 12-13% (one core) of my 4c/8t CPU:
../../bin8/aomenc --frame-parallel=0 --tile-columns=2 --tile-rows=2 --row-mt=1 --auto-alt-ref=1 --cpu-used=4 --tune=psnr --passes=2 --end-usage=q --cq-level=40 --test-decode=fatal -o test.av1.cq40.webm orig.i420.y4m
Pass 1/2 frame 480/481 92352B 1539b/f 36899b/s 18130 ms (26.47 fps)
Pass 2/2 frame 16/0 0B 18152 ms 52.88 fpm [ETA unknown]
aomenc -v --threads=8 --cpu-used=4 --row-mt=1 --lag-in-frames=25 --auto-alt-ref=1 --passes=2 --pass=2 --bit-depth=10 --input-bit-depth=10 --end-usage=q --cq-level=28 -o Chimera_DCI4k2398p_HDR_P3PQ.ivf Chimera_DCI4k2398p_HDR_P3PQ.y4m
Old 9th November 2018, 20:16   #1207  |  Link
Mr_Khyron
Member
 
Join Date: Nov 2002
Posts: 157
Microsoft releases AV1 video codec for Windows 10

https://mspoweruser.com/microsoft-re...or-windows-10/
Quote:
Microsoft has released support for the new AV1 royalty-free video codec for Windows 10 via the Microsoft Store.

AOMedia Video 1 (AV1), is an open, royalty-free video coding format designed for video transmissions over the Internet. It is being developed by the Alliance for Open Media (AOMedia) and is meant to be a successor to VP9 without relying on any MPEG patents.

The AV1 extension in the Microsoft Store is an early beta version of the AV1 software decoder. Since this is an early release, users may see some performance issues when playing AV1 videos.

Microsoft says they will be regularly updating the codec via automatic store updates.

Find the new codec in the Microsoft Store here.
Old 10th November 2018, 10:08   #1208  |  Link
hydra3333
Registered User
 
Join Date: Oct 2009
Location: crow-land
Posts: 540
he he, clicked on "Get" 30 times in the Microsoft Store and nothing happens... that may be saying something about quality.
Old 10th November 2018, 13:51   #1209  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,492
Quote:
Originally Posted by Mr_Khyron View Post
I tried benchmarking with ffmpeg 4.2
from 16fps with libaom to 77fps with Dav1d
Where can I download ffmpeg with libdav1d library?
Old 10th November 2018, 14:22   #1210  |  Link
lvqcl
Registered User
 
Join Date: Aug 2015
Posts: 149
Quote:
from 16fps with libaom to 77fps with Dav1d
AFAICS dav1d has only x86-64 AVX2 assembly code, right?
I wonder what their plans are for older hardware...
Old 10th November 2018, 15:04   #1211  |  Link
SmilingWolf
I am maddo saientisto!
 
Join Date: Aug 2018
Posts: 103
Quote:
Originally Posted by Clare View Post
aomenc -v --threads=8 --cpu-used=4 --row-mt=1 --lag-in-frames=25 --auto-alt-ref=1 --passes=2 --pass=2 --bit-depth=10 --input-bit-depth=10 --end-usage=q --cq-level=28 -o Chimera_DCI4k2398p_HDR_P3PQ.ivf Chimera_DCI4k2398p_HDR_P3PQ.y4m
That worked, thanks!

Quote:
Originally Posted by v0lt View Post
Where can I download ffmpeg with libdav1d library?
Win64 GCC 8.2 static build:
ffmpeg-4.2-92396-g55e021f39b: https://mega.nz/#!IgAAVayA!jpzHzBaE6...FnbjR-ruOD8lCI
- libaom 1.0.0-902-g03d8ebedc
- libdav1d 58fc516

Last edited by SmilingWolf; 10th November 2018 at 15:06.
Old 10th November 2018, 18:58   #1212  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
In order to GET the new MS AV1 codec from MS Store, you need to install the forbidden (banned) Windows October 2018 Update.

Test:
MS Windows October x64
Core i3 4170
DXVA Checker (new beta version)

Sample:
Chimera AV1 1080p 8bit (Netflix free sample)

LAV x64 0.73.1 vs MS MFT AV1

LAV x64 19/34/144 (min/avg/max fps) CPU Usage: 57/70/83 (%)

MS MFT AV1 15/26/156 CPU Usage: 50/68/81

It seems that the AOM AV1 codec is ~30% faster than MS MFT AV1 in average fps.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1)
HEVC decoding benchmarks
H.264 DXVA Benchmarks for all
Old 10th November 2018, 19:16   #1213  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,492
@SmilingWolf
Thanks.
But my results are different from those reported here.
I ran the following tests:
Code:
ffmpeg -hide_banner -t 10 -c:v libaom-av1 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -hide_banner -t 10 -c:v libdav1d -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -hide_banner -t 10 -c:v libdav1d -threads 4 -tilethreads 4 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
And got the following results:
libaom-av1 - max 14 fps
libdav1d - max 7.4 fps
libdav1d -threads 4 -tilethreads 4 - max 9.7 fps

Added:
Intel i5-3570k, Windows 7 Sp1 x64.

Last edited by v0lt; 13th November 2018 at 19:22.
Old 10th November 2018, 20:20   #1214  |  Link
richardpl
Registered User
 
Join Date: Jan 2012
Posts: 168
Probably because you are not using the right arch and CPU combo.
Old 11th November 2018, 04:36   #1215  |  Link
Wolfberry
Helenium(Easter)
 
Join Date: Aug 2017
Location: Hsinchu, Taiwan
Posts: 99
Quote:
Originally Posted by v0lt View Post
I ran the following tests:
Code:
ffmpeg -hide_banner -t 10 -c:v libaom-av1 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -hide_banner -t 10 -c:v libdav1d -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
ffmpeg -hide_banner -t 10 -c:v libdav1d -threads 4 -tilethreads 4 -i Stream2_AV1_4K_22.7mbps.webm -benchmark -f null -
I ran the same test as above and got 16/38/46 fps.
What CPU do you use for testing?
It might be related to the AVX2 code used in dav1d.
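
One way to check the AVX2 theory on a Linux box is to look at the CPU feature flags. A minimal sketch, assuming the usual /proc/cpuinfo layout with a "flags" line; it will not work as-is on Windows, and the helper name is made up.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags listed on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    flags = cpu_flags(path.read_text()) if path.exists() else set()
    for feature in ("sse2", "ssse3", "sse4_1", "avx", "avx2"):
        print(f"{feature}: {'yes' if feature in flags else 'no'}")
```

If "avx2" comes back as "no", a decoder whose hand-written assembly only targets AVX2 would be running its plain C paths on that machine.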
__________________
Monochrome Anomaly
Old 11th November 2018, 05:12   #1216  |  Link
v0lt
Registered User
 
Join Date: Dec 2008
Posts: 1,492
Quote:
Originally Posted by Wolfberry View Post
It might be related to the AVX2 code used in dav1d.
SSE2, SSE4.1?
Old 11th November 2018, 07:37   #1217  |  Link
Aleksoid1978
Registered User
 
Join Date: Apr 2008
Location: Russia, Vladivostok
Posts: 2,551
Very "good" optimisation dav1d - much slower on my system...
__________________
AMD Ryzen 5 3600 /GIGABYTE B450 Gaming X /AMD Radeon R9 16Gb@3200 /Kingston 500Gb M.2 /GTX 1650 /Samsung U28R550UQI /LG 47LM620T /Yamaha RX-V471 + NS-555 + NS-C444 + NS-333 + YST-SW215

Last edited by Aleksoid1978; 11th November 2018 at 08:59.
Old 11th November 2018, 09:04   #1218  |  Link
Nintendo Maniac 64
Registered User
 
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 444
Quote:
Originally Posted by lvqcl View Post
AFAICS dav1d has only x86-64 AVX2 assembly code, right?
I wonder what their plans are for older hardware...
Don't forget that Pentiums and Celerons don't support AVX, and this includes the models that use full-fat Sky/Kaby/Coffee cores such as the ever-popular 2c/4t G4560 and its successor the G5400 (as well as the variants with the faster iGPU like the G4600 and G5500).

And of course, it's those very same AVX-lacking Celerons and Pentiums and such that would stand to gain the biggest benefit from any such software decoder optimizations because those processors simply lack the raw "moar cores!" computational grunt that their i7 and Ryzen brethren have for brute-forcing their way through.

So needless to say, it'd be pretty disappointing to me if dav1d pretty much required having an AVX-capable CPU in order to have any benefit.


Quote:
Originally Posted by Aleksoid1978 View Post
Very "good" optimisation dav1d - mush slower on my system...
...that's not a Fernando Alonso reference, is it?
Old 11th November 2018, 13:14   #1219  |  Link
Mystery Keeper
Beyond Kawaii
 
Join Date: Feb 2008
Location: Russia
Posts: 718
I wish aomenc/vpxenc had GOP-level parallelism, where each thread encodes one GOP and the results are then stitched together. That would make use of all CPU power without compromising quality/compression.
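
The idea can be sketched as fixed-size chunks handed to a worker pool. This is only an illustration: `encode_chunk` is a placeholder that does no real encoding, the chunk length is an arbitrary assumption, and a real implementation would have to cut at keyframes and concatenate the resulting bitstreams.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(total_frames: int, chunk_len: int) -> list[tuple[int, int]]:
    """Split [0, total_frames) into (start, end) ranges of at most chunk_len frames."""
    return [(start, min(start + chunk_len, total_frames))
            for start in range(0, total_frames, chunk_len)]


def encode_chunk(frame_range: tuple[int, int]) -> str:
    """Placeholder: a real version would run the encoder on this frame range."""
    start, end = frame_range
    return f"chunk_{start:06d}_{end:06d}.ivf"  # hypothetical output name


def encode_parallel(total_frames: int, chunk_len: int, workers: int) -> list[str]:
    """Encode every chunk independently, keeping the outputs in frame order."""
    chunks = split_into_chunks(total_frames, chunk_len)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_chunk, chunks))  # map preserves input order


if __name__ == "__main__":
    for name in encode_parallel(total_frames=480, chunk_len=120, workers=4):
        print(name)
```

Because every chunk is encoded on its own, chunk boundaries are decided before encoding starts, so the encoder cannot move them later.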
__________________
...desu!
Old 11th November 2018, 13:25   #1220  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 6,231
Quote:
I wish aomenc/vpxenc had GOP-level parallelism.
That would require 2-pass encoding and a fixed GOP structure (with regard to the GOP sizes); IIRC the 2nd pass should normally be able to override the GOP structure to achieve VBV limits (not totally sure).
__________________
Hybrid here in the forum, homepage
Notice: Since email notifications do not work here any more, it might take me quite some time to notice a reply to a thread,..
All times are GMT +1.