View Full Version : Alliance for Open Media codecs
benwaggoner
18th April 2019, 18:53
Xiph update on one year of AV1 (slides from NAB 2019) [PDF]
https://people.xiph.org/~negge/NAB2019.pdf
There certainly has been a ton of progress in the last year!
Although I am again frustrated by the lack of a real apples-to-apples subjective quality comparison. The only one given was libaom versus HEVC HM for ultra low latency (Slide 38). I don't know that the HM is even optimized for low latency; libaom has a lot more rate control than the typical reference encoder.
And even then, we can see that while Y-PSNR showed a bitrate increase of 5%, subjective MOS testing showed a decrease of 4%. Metrics are not closely coupled!
The VVC JEM, conversely, showed a 32% decrease for Y-PSNR and 30% decrease for MOS; much better correlated.
benwaggoner
18th April 2019, 19:09
Cmdlines:
x264 --preset veryslow --tune ssim --crf 16 -o test.x264.crf16.264 orig.i420.y4m
x265 --preset veryslow --tune ssim --crf 16 -o test.x265.crf16.hevc orig.i420.y4m
Why --tune ssim if targeting VMAF?
We know that --tune ssim looks subjectively worse than --tune film in x264 and not using --tune at all in x265.
vpxenc --codec=vp9 --frame-parallel=0 --tile-columns=1 --auto-alt-ref=6 --good --cpu-used=0 --tune=psnr --passes=2 --threads=2 --end-usage=q --cq-level=20 --test-decode=fatal --ivf -o test.vp9.cq20.ivf orig.i420.y4m
And why a different tune, PSNR, here?
SvtAv1EncApp.exe -i orig.i420.yuv -b test.svtav1.cq20.ivf -w 1280 -h 720 -q 20 -enc-mode 3 -fps-num 24000 -fps-denom 1001 -intra-period 23
aomenc --frame-parallel=0 --tile-columns=1 --auto-alt-ref=1 --cpu-used=4 --tune=psnr --passes=2 --threads=2 --row-mt=1 --end-usage=q --cq-level=20 --test-decode=fatal -o test.av1.cq20.webm orig.i420.y4m
VMAF: model used: vmaf_v0.6.1, pooling: harmonic_mean
Also PSNR.
It seems like the same tuning should be used across all encoders! Although tuning for a given metric and then comparing with that metric is more a test of mathematical correctness of rate control than something that says much about viewer experience.
We've seen data that shows libaom underperforms HM and particularly the VVC JEM in subjective metrics versus objective metrics. I'm guessing because libaom has baked in a lot of VMAF-tuned optimizations.
The gold standard for AV1's current competitiveness would be a double-blind comparison of subjective quality at the same total encoding time.
I guess I'm unsure what exactly the goal of these particular tests is, or how they are expected to be fruitfully applied.
Double-blind testing is a whole lot of work, but inescapably necessary at this point in the codec universe. Things are going to be crazy over the next few years with H.264, HEVC, and AV1 today and with VVC, EVC, and AV2 on the horizon. VMAF is going to need a data set with subjective tests of the "flavor" of artifacts each produces to be able to make good inter-codec quality comparisons.
It'd be nice to know the relative encoding times as well.
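On the "pooling: harmonic_mean" setting in the VMAF line quoted above: a rough Python sketch of why the pooling choice matters. It assumes the +1 offset form of the harmonic mean that, as far as I recall, VMAF's tooling uses; check that against the version you run. The function names and frame scores are made up for illustration.

```python
def arithmetic_mean_pool(frame_scores):
    """Plain average of per-frame scores."""
    return sum(frame_scores) / len(frame_scores)


def harmonic_mean_pool(frame_scores):
    """Harmonic mean with a +1 offset (assumed convention, see lead-in).

    The offset keeps a frame score of 0 from causing a division by zero.
    """
    n = len(frame_scores)
    return n / sum(1.0 / (s + 1.0) for s in frame_scores) - 1.0


# Hypothetical clip: three good frames and one badly damaged frame.
scores = [95.0, 95.0, 95.0, 20.0]
print("arithmetic:", arithmetic_mean_pool(scores))
print("harmonic:  ", harmonic_mean_pool(scores))
```

The harmonic mean is pulled down much harder by the single bad frame, so two encoders with the same average score can rank differently under this pooling.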
SmilingWolf
18th April 2019, 19:29
--tune ssim on both x264 and x265 gave the best objective metrics scores for both PSNR-HVS-M and MS-SSIM (more than --tune psnr even when measuring PSNR-HVS-M).
VMAF has not been tested because it has been added to the encoding and scoring pipeline later than when I carried out the tune tests.
--tune psnr in the libvpx and libaom cmdlines is there as more of a way to make it explicit.
You can consider the whole "--tune" thing in those encoders as either a joke or a misnomer: instead of turning some knobs like they do on the x26X encoders, they set the RDO metric used during encoding.
To add insult to injury, in libaom out of 4 tunes (psnr, ssim, cdef-dist, daala-dist) 2 of them are usable only in single-threaded builds and give terrible results, and ssim is not even implemented, leaving --tune psnr as the only available one.
And it's not a question of setting it or leaving it alone, --tune psnr is the default (https://aomedia.googlesource.com/aom/+/08ee14a076e0a9f4405d14cd5d301bfa52f4cdf7/av1/av1_cx_iface.c#164) and there is no way to change or unset it. Whatever you do, whether you know or not, if you encode with libaom you're using --tune psnr.
Relative encoding times are unavailable (or, rather, unreliable) because the machine is under varying load: I use it while the encodes are running, and have set the pipeline to leave me at least a couple of free cores at all times.
sneaker_ger
19th April 2019, 00:18
Short decoding speed test on a 10 year old Intel T3400 (2C2T laptop CPU, SSSE3, no SSE4) (both zeranoe's ffmpeg 20190417-8a3ed5a-win64-static, dav1d 20190410-44d0de4 -threads 4 -tilethreads 2), Chimera 720p 8 bit:
libaom: 749.241 (12 fps)
dav1d: 293.281 (30 fps)
Nintendo Maniac 64
19th April 2019, 00:37
Short decoding speed test:
libaom: 749.241 (12 fps)
dav1d: 293.281 (30 fps)
OK, just how are you going about benchmarking this?
Last time I inquired about this, the best way was pretty much just trial and error by using something like mkvtoolnix to set a given frame rate and then use madvr's OSD to see if there were any dropped frames while playing it back.
10 year old Intel T3400 (2C2T laptop CPU, SSSE3, no SSE4)
Kind of odd that that CPU was released in late 2008 when it uses the same architecture (Merom) as the original Conroe/Merom (65nm) Core 2 Duo from 2006 (though being a Pentium it has less L2 cache).
Weirder yet considering that the Wolfdale/Penryn 45nm Intel CPUs were available by then, and Nehalem was even available on desktop.
sneaker_ger
19th April 2019, 00:39
OK, just how are you going about benchmarking this?
Last time I inquired about this, the best way was pretty much just trial and error by using something like mkvtoolnix to set a given frame rate and then use madvr's OSD to see if there were any dropped frames while playing it back.
This is just using ffmpeg -benchmark and the fps values are averages. I didn't test for framedrops during difficult scenes.
soresu
19th April 2019, 02:45
Hmmm, the commit here (https://aomedia.googlesource.com/aom/+/d85d2477239d9b9bf36d94daccec79f48ab3784d) on the libaom experimental branch has the title "Add comparison between cnn and cdef/restoration."
I wonder if this means they are targeting an ML tool to replace CDEF, which wouldn't surprise me considering how Tim Terriberry mentioned CDEF being evaluated for a more efficient replacement during the latter stages of AV1 development.
Nintendo Maniac 64
19th April 2019, 05:36
This is just using ffmpeg -benchmark
I'll be honest, I'm actually completely unfamiliar with using ffmpeg...I am at least familiar with how to use command line, but I've no idea what to actually input to get ffmpeg's benchmark argument to actually function.
Could you perhaps share the exact entire command you used? From there I should be able to figure out how to get things going over here.
(software is a bit of a weak point for me - hardware is much more of my specialty)
nevcairiel
19th April 2019, 08:55
Could you perhaps share the exact entire command you used? From there I should be able to figure out how to get things going over here.
If you want to benchmark solely decoding, something like this:
ffmpeg -benchmark -i file.mp4 -f null -
Fill in the filename, of course, but don't move its position in the command line. :)
If you want to benchmark DirectShow on Windows, a far better option than your madVR hack is to use GraphStudioNext, which has View -> Performance Test; it lets you specify a file and a decoder, and it'll run only that decoder, without rendering involved.
hajj_3
19th April 2019, 10:46
DAV1D decoder v0.2.2 has been released, here are the changes:
- Large improvement on MSAC decoding with SSE, bringing 4-6% speed increase. The impact is important on SSSE3, SSE4 and AVX-2 cpus
- SSSE3 optimizations for all blocks size in itx
- SSSE3 optimizations for ipred_paeth and ipred_cfl (420, 422 and 444)
- Speed improvements on CDEF for SSE4 CPUs
- NEON optimizations for SGR and loop filter
- Minor crashes, improvements and build changes
dapperdan
19th April 2019, 12:48
Double-blind testing is a whole lot of work, but inescapably necessary at this point in the codec universe. Things are going to be crazy over the next few years with H.264, HEVC, and AV1 today and with VVC, EVC, and AV2 on the horizon. VMAF is going to need a data set with subjective tests of the "flavor" of artifacts each produces to be able to make good inter-codec quality comparisons.
It'd be nice to know the relative encoding times as well.
MSU did some subjective testing with their subjectify.us platform for their recent HEVC tests. Interestingly VP9 improved more than x265 when you compare SSIM to the subjective scores, though two other HEVC encoders beat both.
I think Netflix did a talk about how to use machine learning to reduce the number of comparisons that the real humans needed to do, making this kind of thing more efficient.
Nintendo Maniac 64
19th April 2019, 20:26
ffmpeg -benchmark -i file.mp4 -f null -
Yep that's exactly what I needed, and things are working now!
...except that the ffmpeg build I used seems to use the AOMedia AV1 decoder rather than dav1d. So now the question is where are you getting your ffmpeg builds so that they actually use dav1d?
Beelzebubu
19th April 2019, 20:44
Yep that's exactly what I needed, and things are working now!
...except that the ffmpeg build I used seems to use the AOMedia AV1 decoder rather than dav1d. So now the question is where are you getting your ffmpeg builds so that they actually use dav1d?
They probably use both, but prefer aom. To use dav1d, try -c:v libdav1d before -i.
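Putting the two suggestions together (the decode-only benchmark command and -c:v libdav1d placed before -i), here is a small Python sketch for comparing decoders. The helper names are hypothetical, the file name is a placeholder, and libdav1d / libaom-av1 will only work if your ffmpeg build includes them.

```python
import subprocess


def decode_benchmark_cmd(path, decoder=None):
    """Build the ffmpeg command line for a decode-only benchmark."""
    cmd = ["ffmpeg", "-benchmark"]
    if decoder is not None:
        # Options placed before -i apply to that input, so this picks
        # the decoder rather than an encoder.
        cmd += ["-c:v", decoder]
    # "-f null -" decodes everything and throws the frames away.
    cmd += ["-i", path, "-f", "null", "-"]
    return cmd


def run_decode_benchmark(path, decoder=None):
    """Run the benchmark and return ffmpeg's log output (written to stderr)."""
    result = subprocess.run(
        decode_benchmark_cmd(path, decoder),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stderr


# Show the command lines that would be run for each decoder.
for name in ("libaom-av1", "libdav1d"):
    print(" ".join(decode_benchmark_cmd("file.mp4", name)))
```

Calling run_decode_benchmark("file.mp4", "libdav1d") runs the actual test; the timing summary appears near the end of the returned log.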
foxyshadis
20th April 2019, 13:14
I'd like to solicit opinions on splitting this thread up, especially into aom, rav1e, dav1d, still image (avif) news, as well as solicitations to get the best quality command lines. I'd like to create a separate AV1 forum entirely at this point, but one megathread does not a forum make.
SmilingWolf
20th April 2019, 14:33
VMAF isn't a still image metric. Has anyone run a correlation for VMAF against subjective testing for still images?
Here you go, based on the TID2013 dataset (http://www.ponomarenko.info/tid2013.htm):
Actual profile:
Spearman: | Kendall:
PSNRHA 0.938 | PSNRHA 0.787
PSNRHMA 0.934 | PSNRHMA 0.777
PSNRHVS 0.926 | PSNRHVS 0.766
PSNRHVSM 0.917 | PSNRHVSM 0.749
FSIMc 0.915 | FSIMc 0.742
FSIM 0.911 | FSIM 0.736
WSNR 0.897 | WSNR 0.718
MSSIM 0.887 | MSSIM 0.697
VSNR 0.882 | VSNR 0.690
VMAF_v0.6.1 0.863 | VMAF_v0.6.1 0.675
VMAF_rb_v0.6.3 0.862 | VMAF_rb_v0.6.3 0.674
NQM 0.857 | NQM 0.666
PSNR 0.825 | PSNR 0.624
VIFP 0.815 | VIFP 0.621
PSNRc 0.803 | PSNRc 0.596
SSIM 0.788 | SSIM 0.577
Simple profile:
Spearman: | Kendall:
PSNRHA 0.953 | PSNRHA 0.818
PSNRHVS 0.951 | PSNRHVS 0.809
FSIM 0.949 | FSIM 0.795
FSIMc 0.947 | FSIMc 0.792
PSNRHVSM 0.938 | PSNRHMA 0.785
PSNRHMA 0.937 | PSNRHVSM 0.780
WSNR 0.933 | WSNR 0.772
PSNR 0.913 | PSNR 0.745
VSNR 0.912 | VSNR 0.731
MSSIM 0.905 | MSSIM 0.720
VIFP 0.897 | VIFP 0.714
VMAF_rb_v0.6.3 0.891 | VMAF_rb_v0.6.3 0.698
VMAF_v0.6.1 0.889 | VMAF_v0.6.1 0.696
PSNRc 0.876 | PSNRc 0.689
NQM 0.875 | NQM 0.681
SSIM 0.837 | SSIM 0.628
Full profile:
Spearman: | Kendall:
FSIMc 0.851 | FSIMc 0.666
PSNRHA 0.819 | PSNRHA 0.643
PSNRHMA 0.813 | PSNRHMA 0.631
FSIM 0.801 | FSIM 0.629
MSSIM 0.787 | MSSIM 0.607
VMAF_rb_v0.6.3 0.749 | VMAF_rb_v0.6.3 0.564
VMAF_v0.6.1 0.748 | VMAF_v0.6.1 0.563
PSNRc 0.687 | VSNR 0.508
VSNR 0.681 | PSNRHVS 0.507
PSNRHVS 0.654 | PSNRc 0.496
PSNR 0.640 | PSNRHVSM 0.481
SSIM 0.637 | PSNR 0.470
NQM 0.635 | NQM 0.466
PSNRHVSM 0.625 | SSIM 0.463
VIFP 0.608 | VIFP 0.456
WSNR 0.580 | WSNR 0.446
All bitmap images have been converted to raw full range YUV444P with ffmpeg and then measured with the vmafossexec program.
ffmpeg.exe -i i01_01_1.bmp -vf "scale=flags=accurate_rnd+bitexact+full_chroma_int+full_chroma_inp,format=yuvj444p" i01_01_1.bmp.yuv
vmafossexec.exe yuv444p 512 384 reference_images/i01.bmp.yuv distorted_images/i01_01_1.bmp.yuv model/vmaf_v0.6.1.pkl
vmafossexec.exe yuv444p 512 384 reference_images/i01.bmp.yuv distorted_images/i01_01_1.bmp.yuv model/vmaf_rb_v0.6.3/vmaf_rb_v0.6.3.pkl --ci
I'm also attaching the raw scores, for completeness sake.
A note on how to read the numbers:
from the paper (http://www.ponomarenko.info/papers/tid2013.pdf) I get the following: a SROCC of 0.95 is considered excellent, 0.90 is good, and 0.85 is barely acceptable.
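For anyone wanting to reproduce rankings like these, here is a minimal plain-Python sketch of the two rank correlations. It is not the script used for the tables above: the function names and sample scores are made up, and Kendall is computed as the tau-b variant, which is an assumption since the post does not say which variant was used.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(metric, mos):
    """SROCC: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(metric), average_ranks(mos))


def kendall_tau_b(metric, mos):
    """Kendall tau-b: pairwise agreement, corrected for ties."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(metric)
    for i in range(n):
        for j in range(i + 1, n):
            dx = metric[i] - metric[j]
            dy = mos[i] - mos[j]
            if dx == 0 and dy == 0:
                continue  # tied in both: contributes to neither term
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + only_x_tied) * (untied + only_y_tied))
    return (concordant - discordant) / denom


# Hypothetical metric scores and MOS values for five distorted images.
metric_scores = [30.1, 35.4, 33.0, 41.2, 38.7]
mos_values = [2.1, 3.2, 4.0, 5.1, 5.9]
print("Spearman:", spearman(metric_scores, mos_values))
print("Kendall: ", kendall_tau_b(metric_scores, mos_values))
```

Both measures only look at ordering, so a metric does not need to be on the same scale as MOS to score well. Metrics where lower means better come out with a negative sign.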
bstrobl
20th April 2019, 16:32
I'd like to solicit opinions on splitting this thread up, especially into aom, rav1e, dav1d, still image (avif) news, as well as solicitations to get the best quality command lines. I'd like to create a separate AV1 forum entirely at this point, but one megathread does not a forum make.
Seems sensible, I would welcome a couple more threads.
TomV
20th April 2019, 17:59
I'd like to solicit opinions on splitting this thread up, especially into aom, rav1e, dav1d, still image (avif) news, as well as solicitations to get the best quality command lines. I'd like to create a separate AV1 forum entirely at this point, but one megathread does not a forum make.
Makes sense. Implementations should have separate threads from the main standardization effort and aomenc. AOM/AV1 news, legal discussions, etc. can be separate threads.
NikosD
20th April 2019, 18:31
I'd like to solicit opinions on splitting this thread up, especially into aom, rav1e, dav1d, still image (avif) news, as well as solicitations to get the best quality command lines. I'd like to create a separate AV1 forum entirely at this point, but one megathread does not a forum make.
Too much effort, too many double posts, too much separated information and too much overhead in general.
Probably a separation of AV1 encoding and AV1 decoding would be more than enough for AV1 codec.
Audionut
21st April 2019, 13:40
Too much effort, too many double posts, too much separated information and too much overhead in general.
Agreed. Not busy enough yet. You can come back after a couple of days and still might only have a full page to read.
dapperdan
21st April 2019, 14:42
VMAF isn't designed for still images, but they do provide the tools to create your own VMAF for specific use cases (e.g. anime on a phone screen, or video game content) so it surprises me that no one has taken the framework and applied it to still images yet.
It should in theory be able to fuse the results of those other still image tests and create something even better aligned with human reported scores than any one alone. Presumably not Netflix's main use case but you'd think they deliver enough still images to make it worthwhile since they already have the skills.
SmilingWolf
21st April 2019, 15:04
VMAF isn't designed for still images, but they do provide the tools to create your own VMAF for specific use cases (e.g. anime on a phone screen, or video game content) so it surprises me that no one has taken the framework and applied it to still images yet.
It should in theory be able to fuse the results of those other still image tests and create something even better aligned with human reported scores than any one alone. Presumably not Netflix's main use case but you'd think they deliver enough still images to make it worthwhile since they already have the skills.
I was looking into this very matter earlier today and the main problem is, as always for this kind of problem, the lack of high quality MOS datasets. In particular, the only "extensive" dataset I've found is TID2013, and even that only comprises 2 kinds of image compression distortions, for 25 images, at 5 intensities = 250 distorted images and their corresponding scores.
When calculating the SROCC for only the "compression" distortions (JPEG and J2K) these are the results:
--- top 33%
PSNRHA 0.9686
DSSIM -0.9683
PSNRHVS 0.9677
PSNRHMA 0.9651
PSNRHVSM 0.9603
FSIMc 0.9589
FSIM 0.9580
VMAF_rb_v0.6.3 0.9524
SSIMULACRA -0.9519
VMAF_v0.6.1 0.9505
WSNR 0.9468
--- middle
MSSIM 0.9427
VIFP 0.9380
--- low 33%
PSNRc 0.9200
CQM 0.9190
PSNR 0.9170
VSNR 0.9162
SSIM 0.9147
NQM 0.9023
I also tweeted to Jon Sneyers about the dataset they used to validate SSIMULACRA; we'll see if he can release it independently of a blog post that now, after two years, is probably not going to happen.
hajj_3
24th April 2019, 17:41
dav1d decoder v0.3.0 is out:
Changes for 0.3.0 'Sailfish':
------------------------------
This is the final release for the numerous speed improvements of 0.3.0-rc.
It mostly:
- Fixes an annoying crash on SSSE3 that happened in the itx functions
nevcairiel
24th April 2019, 19:14
Just because someone updates the changelog doesn't mean it has been released already. You can see actual release tags here, hopefully to help avoid confusing premature announcements:
https://code.videolan.org/videolan/dav1d/tags
There is no 0.3.0 yet. There will need to be one or two additional maintenance changes before that is the case. Probably in a day or two.
Motenai Yoda
24th April 2019, 19:47
there isn't a 0.3.0 tag yet, but you can always pull the master branch at the latest commit
dapperdan
25th April 2019, 12:35
Interesting snippet here:
https://www.streamingmedia.com/Articles/Editorial/Featured-Articles/NAB-2019-NGCodec-Talks-Hardware-Based-High-Quality-Live-Video-Encoding-131160.aspx
what we believe is, it's really a variant of their VP9.
Jan Ozer: When you say variant you mean...
Oliver Gunasekara: So Intel bought a company called eBrisk which they then open-sourced and that is the team that has delivered this. And what they did for time to market was take their VP9 implementation and just remove all the functionality that is not appropriate for AV1, tweak the syntax to have a legal AV1. So the end result is, it is an AV1 encoder but it doesn't perform anywhere near like the capabilities that AV1 can deliver. That will come in the future.
Basically claims the AV1-SVT encoder has barely begun development. If that's the case, it should be interesting to follow its progress.
I noticed a ticket on their tracker where people were asking them to tag a pre-release so they could begin the process of integrating with Austria etc and the Devs didn't think it was ready for even a pre-release status, then a press release came out announcing version 1.0 was ready.
clsid
25th April 2019, 15:17
- Fixes an annoying crash on SSSE3 that happened in the itx functions
This fix is only for non-Windows systems. So not important for most of us.
Mjpeg
25th April 2019, 15:51
Really nice writeup of adding tiles to rav1e - explains what tiles are nicely:
https://blog.rom1v.com/2019/04/implementing-tile-encoding-in-rav1e/
(credit: reddit av1 channel https://www.reddit.com/r/AV1/)
Beelzebubu
25th April 2019, 16:16
This fix is only for non-Windows systems. So not important for most of us.
It depends on the build configuration (stack alignment, to be exact), but this could trigger on all systems.
[edit] removed some nonsense because I misread your reply, sorry about that.
iwod
25th April 2019, 19:12
Interesting snippet here:
https://www.streamingmedia.com/Articles/Editorial/Featured-Articles/NAB-2019-Twitch-Talks-VP9-AV1-and-its-Five-Year-Encoding-Roadmap-131163.aspx
Basically claims the AV1-SVT encoder has barely begun development. If that's the case, it should be interesting to follow its progress.
I noticed a ticket on their tracker where people were asking them to tag a pre-release so they could begin the process of integrating with Austria etc and the Devs didn't think it was ready for even a pre-release status, then a press release came out announcing version 1.0 was ready.
I can't find the quoted snippet anymore.
dapperdan
25th April 2019, 21:52
Sorry, posted wrong link:
https://www.streamingmedia.com/Articles/Editorial/Featured-Articles/NAB-2019-NGCodec-Talks-Hardware-Based-High-Quality-Live-Video-Encoding-131160.aspx
Blue_MiSfit
26th April 2019, 00:11
Good interview. I spoke with Oliver from NGCodec at NAB this year and I agree with a lot that was said.
FPGA is neat and disruptive because it's cloud native now, so you can get a lot of the flexibility of a pure software solution. I think there's a span of a few years where FPGAs make a lot of sense for dense live encoding, but then eventually ASIC encoders get even better / faster / more power efficient, and software encoders continue to offer better quality.
I think offline encoding for VOD streaming will still be done in software no matter what. I thought maybe there'd be a use case for FPGA AV1 encoding in the next year or so while software encoders (and CPUs) get fast enough to make AV1 encoding practical, but Oliver didn't seem to think this was a great use case. In retrospect, I'm inclined to agree.
ShogoXT
27th April 2019, 21:09
I'd like to solicit opinions on splitting this thread up, especially into aom, rav1e, dav1d, still image (avif) news, as well as solicitations to get the best quality command lines. I'd like to create a separate AV1 forum entirely at this point, but one megathread does not a forum make.
I think for sure that AV1 needs its own forum section like HEVC has. Within it there could be separate threads for rav1e, media industry news, etc.
It's very difficult for a sporadic doom9 reader like myself to follow with what has been discussed in this thread...
VincAlastor
3rd May 2019, 08:50
dav1d 0.3.0 decodes AV1 videos 24% faster on SSSE3, 26% on SSE4.1 and 4% on AVX2 (all PC), and 12% faster on Arm64 (mobile).
https://medium.com/@ewoutterhoeven/dav1d-0-3-0-sailfish-armed-to-the-teeth-af5bbf845a16
singhkays
6th May 2019, 15:10
https://www.singhkays.com/blog/its-time-replace-gifs-with-av1-video/
I did a quick comparison of AV1 vs x264 vs VP9 at ultra low bitrates and how it can be used to replace GIFs in the browser
https://www.singhkays.com/blog/its-time-replace-gifs-with-av1-video/
I did a quick comparison of AV1 vs x264 vs VP9 at ultra low bitrates and how it can be used to replace GIFs in the browser
"80% better than H.264".
The claimed 50% bit rate reduction of VP9 vs. AVC is not substantiated by independent studies, or in practice by anyone. Also, you can't add bit rate reductions, you have to multiply bit rate ratios. If B encodes to the same quality as A at 0.5x the bit rate, and C encodes to the same quality as B with 0.7x the bit rate, C theoretically is 0.7 x 0.5 = 0.35x the bit rate of A... a 65% reduction, not 80%. But in practice studies have shown that AV1 is roughly on par with HEVC when measured with objective metrics (PSNR, SSIM, VMAF, etc.), delivering roughly a 50% bit rate reduction over AVC. However, measured subjectively it's behind HEVC. Although all of the above measures video, and not still image compression, I expect the results for image (I frame only) compression to be quite close. Mozilla published a study (https://research.mozilla.org/2013/10/17/studying-lossy-image-compression-efficiency/) in 2013 which confirmed the superiority of HEVC still image compression over other existing formats. Strangely, the link to the study no longer works, but I saved a copy. Maybe one of the Mozilla guys can reshare it.
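To make the "multiply ratios, don't add reductions" point concrete, a tiny Python sketch using the 0.5x and 0.7x figures from the paragraph above. The function name is made up for illustration.

```python
def combined_reduction(*ratios):
    """Chain bit rate ratios (each = new bit rate / old bit rate at equal
    quality) and return the overall reduction as a fraction."""
    total_ratio = 1.0
    for ratio in ratios:
        total_ratio *= ratio
    return 1.0 - total_ratio


# B needs 0.5x the bit rate of A, C needs 0.7x the bit rate of B.
multiplied = combined_reduction(0.5, 0.7)

# The mistaken approach: adding the 50% and 30% reductions.
added = (1.0 - 0.5) + (1.0 - 0.7)

print(f"multiplying ratios: {multiplied:.0%} reduction")
print(f"adding reductions:  {added:.0%} reduction (wrong)")
```

Adding reductions also breaks down quickly: three successive 40% reductions would "add up" to 120%, while multiplying gives 0.6 x 0.6 x 0.6 = 0.216x, a 78.4% reduction.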
I agree that content publishers and web sites should be leveraging more powerful video codecs for still image compression. They can start with AVC, as device support is ubiquitous, and it is an improvement over JPEG and GIF. If/when they add support for an advanced codec, they will want the largest range of devices to support that codec, and they will want hardware decoding (for speed and vastly reduced power consumption). They can leverage HEVC (in HEIC container files... based on the ISO Base Media File Format, the evolution of .mov and .mp4) for the majority of devices which already have hardware HEVC support.
See https://nokiatech.github.io/heif/comparison.html
Google developed the WebP standard (https://en.wikipedia.org/wiki/WebP) for still image compression, based on VP8 technology. It's only 10% more efficient than JPEG. What you're proposing would seem to be a new version of WebP.
sneaker_ger
6th May 2019, 22:27
Mozilla published a study (https://research.mozilla.org/2013/10/17/studying-lossy-image-compression-efficiency/) in 2013 which confirmed the superiority of HEVC still image compression over other existing formats. Strangely, the link to the study no longer works, but I saved a copy. Maybe one of the Mozilla guys can reshare it.
http://web.archive.org/web/20160312174628/http://people.mozilla.org/~josh/lossy_compressed_image_study_october_2013/
soresu
7th May 2019, 16:28
Possible partial GPU acceleration coming in dav1d during this year's GSoC.
I wonder how much latency is incurred for only partial GPU decode. Some guy going by atomnuker discussed possible GPU AV1 decoding at FOSSDEM last year, I think.
NikosD
7th May 2019, 17:45
Support for offloading some of the dav1d AV1 video decoder's work to GPUs using compute shaders in OpenGL/Vulkan/Metal/Direct3D. https://www.phoronix.com/scan.php?page=news_item&px=Google-GSoC-2019-Projects
nevcairiel
7th May 2019, 18:19
Don't get too excited quite yet, such efforts have in the past been problematic to get truly faster. We'll have to see.
soresu
7th May 2019, 19:57
I'd be less interested in faster playback using GPU in favor of lower power for phones prior to decoder ASIC rollouts, which probably won't be until at least next year.
I think Rockchip's recently announced RK3588 SoC for 2020 has an AV1 decoder, but details were sparse in the announcement.
The thing which concerns me most is the lack of any announcement from Qualcomm in support of AV1, considering their huge market share in Android devices.
sneaker_ger
7th May 2019, 21:02
Yes, hardware support is disappointing.
IIRC for HEVC the Samsung and LG TVs were the first to have hardware decoding in late 2013/early 2014 after HEVC approval by ITU in April 2013. Now AV1 finalization was in June 2018 and still no hardware in sight, really. Intel probably Tiger Lake Q2 2020 at the earliest. Nvidia with Ampere also 2020? Nothing from Qualcomm.
Looks like they all started working on it pretty late.
alex1399
8th May 2019, 16:39
However, the feature of VP8 hardware decoding was barely seen in commercial products. When does the VP9 finalization happen?
nevcairiel
8th May 2019, 19:02
When does the VP9 finalization happen?
What do you mean?
The majority of new devices have VP9 support now.
benwaggoner
8th May 2019, 19:35
I'd be less interested in faster playback using GPU in favor of lower power for phones prior to decoder ASIC rollouts, which probably won't be until at least next year.
I think Rockchip's recently announced RK3588 SoC for 2020 has an AV1 decoder, but details were sparse in the announcement.
The thing which concerns me most is the lack of any announcement from Qualcomm in support of AV1, considering their huge market share in Android devices.
I've heard indications that the extra transistors required for AV1 decoding are a lot higher than anticipated, and higher than the delta for HEVC. The cost in increased die size is a lot more than the savings from not paying MPEG-LA fees.
That would indicate a trend towards AV1 decode launching in high end chipsets first, and taking longer to get into lower-cost handsets.
benwaggoner
8th May 2019, 19:37
Yes, hardware support is disappointing.
IIRC for HEVC the Samsung and LG TVs were the first to have hardware decoding in late 2013/early 2014 after HEVC approval by ITU in April 2013. Now AV1 finalization was in June 2018 and still no hardware in sight, really. Intel probably Tiger Lake Q2 2020 at the earliest. Nvidia with Ampere also 2020? Nothing from Qualcomm.
Looks like they all started working on it pretty late.
The incremental cost to add AV1 into a GPU would be a lot lower than in a SoC, because a GPU already has so many transistors.
The make or break is how many extra mm^2 the decoder takes.
marcomsousa
9th May 2019, 16:19
Yes, hardware support is disappointing.
Now AV1 finalization was in June 2018 and still no hardware in sight, really. Intel probably Tiger Lake Q2 2020 at the earliest. Nvidia with Ampere also 2020? Nothing from Qualcomm.
Looks like they all started working on it pretty late.
What? HW support in less than one year? No way...
1st August 2018, 11:02
(...)
About the AV1 roadmap
We just complete phase 1.
In 1 year we complete phase 2.
In 2 to 3 years we complete phase 3.
In 4 to 5 years we complete phase 4.
https://forum.doom9.org/attachment.php?attachmentid=16445&stc=1&d=1533115434
So, now we are in phase 2. We need to wait 1 or 2 more years to have HW support. So we are on schedule.
* Next year we will start seeing some high end CPU/GPU with AV1 HW decode support.
* In 2021 HW decode for low end CPU/GPU and encode for high end CPU/GPU. (some TVs, consoles)
* And in 2022 for all cpu/gpu. (All modern TVs, consoles)
Note: The first HW with encoding support will have bad quality compared with software. It will have to mature over the years.
And if you are asking, all the nextgen consoles that will release next year will not have AV1 HW decode support (because they release with a cpu of this year).
EwoutH
9th May 2019, 16:57
I've heard indications that the extra transistors required for AV1 decoding are a lot higher than anticipated, and higher than the delta for HEVC.
Hmm, this is quite strange. From what I've heard hardware vendors had a large say at the table with AV1 design, with the purpose of minimizing the complexity of encoding and decoding hardware. Do you know which functions or aspects of the fixed function hardware take more die size than expected?
Also, two details from Google Stadia: At launch they will use VP9 hardware encoding and somewhere in the future they will switch to AV1, but only when hardware encoding is available.
Stadia Streaming Tech: A Deep Dive (Google I/O'19) (https://www.youtube.com/watch?v=9Htdhz6Op1I)
15:19 for VP9, 22:53 for AV1, 31:45 for hardware video encoders.
Mr_Khyron
10th May 2019, 07:33
https://www.reddit.com/r/AV1/comments/bmp5v9/allegro_dvt_announces_ale210_encoder_ip_with_av1/
Allegro DVT released the first AV1 hardware encoder IP publicly known. The AL-E210 succeeds the AL-E200 with the main addition being support for the AV1 codec. It supports Profile 0, meaning 4:2:0 chroma subsampling with 8 and 10 bit color depth.
It claims to support real-time encoding up to 4K, but with multiple cores up to 8K or (/ and?) 120fps. With the AL-E200 one core could encode 4K at 30fps, so with multiple cores this could be higher.
Press release: Allegro DVT Introduces the Industry First Real-Time AV1 Video Encoder Hardware IP for 4K/UHD Video Encoding Applications (http://www.allegrodvt.com/allegro-dvt-introduces-the-industry-first-real-time-av1-video-encoder-hardware-ip-for-4kuhd-video-encoding-applications/)
Product page: AL-E210 Encoder IP (http://www.allegrodvt.com/products/silicon-ips/al-e210/)
I've heard indications that the extra transistors required for AV1 decoding are a lot higher than anticipated, and higher than the delta for HEVC. The cost in increased die size is a lot more than the savings from not paying MPEG-LA fees.
now I'm wondering, whether PVQ from the daala folks would've made it smaller ^^ :p
IIRC PVQ was less complex so easier to realize by hw, but was decided against, because of needed additional development time, even though it would've increased efficiency, too.
We'll know when gen2 will be realized (in 5 or 10 years) hehe
They can start with AVC, as device support is ubiquitous, and it is an improvement over JPEG and GIF.
That was like 5 years ago hehe (for animated pictures on major sites)
https://blog.embed.ly/what-twitter-isnt-telling-you-about-gifs-e1b74068cebd
https://rigor.com/blog/optimizing-animated-gifs-with-html5-video
they do it for reduced memory footprint, pause/play, reduced size, better caching, hw decoding, partial decoding, fast first playthrough, higher possible bit depth for animations etc.
EDIT:
for regular pictures it needs to be an image format people can download and share
https://www.cnet.com/news/facebook-tries-googles-webp-image-format-users-squawk/
hevc in ISOBMFF (heic/heif) might turn out a good solution if windows users don't need to add support manually through the microsoft store and android switches to it.
then whatsapp could switch to it as well (with transcoding back to jpeg for older devices)
EDIT2:
at least Firefox finally added WebP support (including animations) so it counts as an alternative (especially when they update their internal codec again)
https://hacks.mozilla.org/2019/01/firefox-65-webp-flexbox-inspector-new-tooling/