Alliance for Open Media codecs [Archive] - Page 36

soresu

29th June 2019, 15:21

I noticed that several talks mentioned rav1e, but none directly covered it

Was I missing a video, or did the Mozilla/Xiph rav1e guys not get a talk at BAV?

birdie

30th June 2019, 11:23

Twitch's AV1 deployment roadmap (from Big Apple Video 2019)

https://i.redd.it/blmo96lxl2731.png

Beelzebubu

1st July 2019, 17:33

I noticed that several talks mentioned rav1e, but none directly covered it

Was I missing a video, or did the Mozilla/Xiph rav1e guys not get a talk at BAV?

I believe that because the conference was organized by Vimeo/Mozilla, who just announced (https://press.vimeo.com/61553-vimeo-introduces-support-for-royalty-free-video-codec-av1) a partnership around rav1e, they wanted to prevent a potential conflict of interest in talk selection and decided to not give a talk on it.

Beelzebubu

1st July 2019, 17:41

Having watched some (but not all) of the BAV presentations - I know that AV1 is currently not ideal for running a battery of tests at short notice, but did they really need to use such outdated versions of competing codecs?

I'm pretty sure that the x265 build was from January, and the libaom build from February in one of them.

Maybe I'm missing something and those builds were picked for stability?

The builds used in the Eve-AV1 talk (https://vimeo.com/344663992) were:

rav1e c68d68c6fa80dabf5e4ed9b379f090572eb43d96 (Mon Jun 3 2019)
libaom a385cc44e15833f56de45bbbc1cc6c474751ac9f (Wed Apr 24 2019)
x264 5493be84cdccecee613236c31b1e3227681ce428 (Thu Mar 14 2019)
x265 12522:10decf67c077 (Fri Jun 07 2019)
SVT-AV1 6fd564611bdb48a2a6d2c7b90a91b4b1bdbe74b9 (Mon Jun 10 2019)
libvpx f836d8ba87dcba437228580fe65afe151ccf7659 (Thu Apr 25 2019)

So basically - ignoring x264 for a second (which is pretty mature/stable) - some from late April and some from early June, none from January or February.

soresu

1st July 2019, 19:30

Ah, I must have misread the presentation then; it's easier for me to read from slides than from video. Still, the position of rav1e seems odd: on the graphs it is still hovering around x264. I could have sworn it passed x264 months ago, and nothing has been said since, despite all the commits being merged into it.

Is it still suffering from that regression from a while ago?

"I believe that because the conference was organized by Vimeo/Mozilla, who just announced a partnership around rav1e, they wanted to prevent a potential conflict of interest in talk selection and decided to not give a talk on it."

Yes that makes sense, just seemed a little odd, like going to WWDC and getting nothing from Apple - still they got a lot of mentions from the presenters in any case.

TD-Linux

1st July 2019, 21:55

The rav1e stats are correct. We still fall behind on high bitrate VMAF - most likely due to that being more sensitive to activity masking (aq) which is still in progress by s_p.

The multithreading performance is limited by a serialization point of the loop filters between frames (by far the slowest part of rav1e right now). There are some outstanding PRs to make it better, e.g. https://github.com/xiph/rav1e/pull/1396

dapperdan

1st July 2019, 22:02

The Visionular talk from Zoe Liu used builds from Jan and Feb.

I think the x265 release used was the last stable release so that doesn't seem too crazy. You could easily cry foul if someone used a non stable git commit and it performed worse than expected due to hitting a bug.

It's also worth bearing in mind that that was basically the same talk as given at the Agora.io thing, so some people tour these things around for a while, it's not ridiculous for them to reuse slides and not have something fresh for every talk they give. Fairly certain I'd seen the NGCodec talk slides before too.

benwaggoner

2nd July 2019, 17:53

The builds used:

rav1e c68d68c6fa80dabf5e4ed9b379f090572eb43d96 (Mon Jun 3 2019)
libaom a385cc44e15833f56de45bbbc1cc6c474751ac9f (Wed Apr 24 2019)
x264 5493be84cdccecee613236c31b1e3227681ce428 (Thu Mar 14 2019)
x265 12522:10decf67c077 (Fri Jun 07 2019)
SVT-AV1 6fd564611bdb48a2a6d2c7b90a91b4b1bdbe74b9 (Mon Jun 10 2019)
libvpx f836d8ba87dcba437228580fe65afe151ccf7659 (Thu Apr 25 2019)

Are the actual command line parameters used documented somewhere?

benwaggoner

2nd July 2019, 18:11

I wonder how much does VMAF really speak about visual quality and compression efficiency while keeping detail (as opposed to the usual issue with metrics, the "blur more for maximum PSNR/SSIM" effect), seeing how in those slides, *everything* except rav1e and x264 is shown as matching or outdoing x265. Well, I guess there's already the usual assertion/claim that x265 = libvpx-vp9 that raises questions. :rolleyes: I always stop wondering at that point in these presentations...
VMAF is the least-bad objective metric we've ever had, but it's still far from perfect.

Also, VMAF isn't static. Netflix comes out with new ML models that will give different (and more accurate) scores compared to older models. It can be estimated for mobile, 1080p, or UHD devices. The scores vary based on the resolution of comparison (720p tested at 720p will deliver higher scores than 720p tested at 1080p, compared to 1080p encoding). And it is a per-frame metric, and how to aggregate per-frame scores into an overall clip quality is an unanswered question. Using a harmonic mean helps, but even that is probably only useful for <20 second durations. A single VMAF score for a whole movie or episode could indicate highly variable quality or highly consistent quality.
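To make the aggregation point concrete, here is a minimal sketch of how two clips with the same arithmetic mean can pool very differently. The per-frame scores are made up for illustration, and this is plain Python rather than how any particular VMAF tool pools internally.

```python
from statistics import fmean, harmonic_mean

def pool_scores(per_frame):
    """Collapse per-frame quality scores into a few clip-level summaries.

    harmonic_mean() needs strictly positive inputs, so real per-frame
    scores that can reach 0 would need an offset or clamping first.
    """
    return {
        "arithmetic": fmean(per_frame),
        "harmonic": harmonic_mean(per_frame),
        "worst_frame": min(per_frame),
    }

# Hypothetical per-frame scores, not measured data.
steady = [80.0] * 10                # consistent quality throughout
uneven = [95.0] * 8 + [20.0] * 2    # mostly great, two very bad frames

steady_pooled = pool_scores(steady)
uneven_pooled = pool_scores(uneven)

for name, pooled in (("steady", steady_pooled), ("uneven", uneven_pooled)):
    print(name, {k: round(v, 2) for k, v in pooled.items()})
```

Both clips share the same arithmetic mean, but the harmonic mean drops sharply for the uneven one because bad frames are weighted more heavily. Neither number says where the bad frames are, which is why a single score for a long title hides so much.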

None of this is a diss on VMAF. Netflix did what they set out to do well, put a huge amount of effort into it, and made reasonable design decisions. But like all metrics, it measures what it is designed to measure, not what we wish it measured :).

I've seen VMAF do a poor job of detecting:

Banding in gradients
Detail in low luma
Adaptive quantization improvements
Artifacts in encoders/formats that weren't included in the VMAF training set
Differences between two pretty high quality encodes.

And it doesn't do HDR at all. It used to not do UHD, but does now.

Another problem with a popular metric is that developers start tuning for that metric instead of what the metric is supposed to measure (subjective quality in this case). When developers start tuning for metrics over eyes, the correlation of that metric to subjective quality actually gets WORSE. So, for an encoder like libaom that got tuning based on VMAF ratings, we'd expect that its VMAF scores would be higher relative to actual subjective quality than for encoders that weren't tuned that way. But it'll be better than ones tuned for PSNR, like the vp? series.

Not that tuning for VMAF is a bad strategy. But it does result in less meaningful VMAF scores.

benwaggoner

2nd July 2019, 18:17

My theory is that the people hired for the subjective tests that underlie the objective stats or that vote VP9 as very slightly better than x265 in the MSU tests on subjectify.us have a different notion of quality than the kind of person who is interested in codecs for their own sake.
These are double-blind tests; the people doing it just compare two encodes.

That said, I've not seen any study demonstrating better subjective quality from a well-tuned libvpx encode versus a well-tuned x265 encode, using the same bitrate @ time.

Like, I read a paper recently where someone was applying their grain synthesis approach to HEVC and the subjective tests they did to prove it worked showed they could get basically all the subjective benefit by just doing the noise removal step and not bothering to add the grain back in, something that could be done by any encoder, for any codec (and I'm guessing this makes up part of the secret sauce of some encoders).
This opens up the interesting question of no-reference quality versus creative intent. Someone just looking at a clip without grain might think it looks great. But if the creators meant there to be grain, then the output isn't accurate. That's something that some studies might not rate. And if customers dislike grain, they might rate the encoded version higher than the source!

But I guess someone who said they could get a massive increase in subjective quality via the Psy optimisation of basically blurring the input would get some pushback on that view in some quarters, even with subjective tests to back it up.
Well, that is what adaptive quantization is all about, really. Put the artifacts where they are less painful, and use the saved bits where they'll provide the most visible improvements.

There are similar debates about TV's default "vivid" mode. Some people claim to like it, even though what's displayed is manifestly wrong on many axes.

benwaggoner

2nd July 2019, 18:17

Great talk from Ronald. If only I had the time to do an evaluation of Eve_AV1
Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.

benwaggoner

2nd July 2019, 18:19

I think the x265 release used was the last stable release so that doesn't seem too crazy. You could easily cry foul if someone used a non stable git commit and it performed worse than expected due to hitting a bug.
And there weren't any substantial quality improvements in x265 between the Jan 2019 builds and the June 3.1 release.

How the encoders got tuned is what matters. And if quality is being compared at fixed encoding time, performance improvements become quality improvements.

dapperdan

2nd July 2019, 19:31

These are double-blind tests; the people doing it just compare two encodes.

My point still stands for tests that intend to be double-blind since some people's abilities and/or preferences would effectively unblind the test.

Imagine, for example, people who believe that tube amps or vinyl are better than digital audio. In a double-blind test they would probably still vote for the tube amp or the vinyl because it has distinctive audio characteristics that can't be removed without invalidating the test. They can hear things that they prefer and associate (consciously or not) with quality.

On the other hand, they would potentially be fooled by audio that had been processed to sound like tube amps or vinyl or passed through a digital chain before output.

I considered this possibility because two recent tests that were presented as being negative for AV1 specifically mentioned that some of their test participants were video engineers. They mentioned this as evidence that it was all done properly, but it seemed like an obvious test methodology failure to me.

I think it was Monty from Xiph that said his party trick used to be identifying the encoder used just by listening to mp3s, and I bet certain bitrates and content would let people here do the same with video codecs and there's a possibility their opinion scores would differ from Joe Public as a result.

dapperdan

2nd July 2019, 19:54

That said, I've not seen any study demonstrating better subjective quality from a well-tuned libvpx encode versus a well-tuned x265 encode, using the same bitrate @ time.

Have you read the full version of the last MSU subjective comparison? I've only read the free snippet, which doesn't have enough info to say either way, but it's possible that fits the criteria or is at least in the right ballpark, potentially a statistical tie:

http://www.compression.ru/video/codec_comparison/hevc_2018/#subjective_report

On the other hand, similar to how complaints about electric cars are now "I don't like the minimalism of their touchscreen interfaces" when not too long ago you'd hear how they were physical impossibilities, I think the fact that we're now at this level of complaint for the previous generation of royalty-free codecs is a testament to how far we've come.

soresu

2nd July 2019, 21:00

Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.

Seems like a wonky business model if one of Amazon's principal video engineers can't get their hands on a build of it to at least do some testing.

Blue_MiSfit

2nd July 2019, 21:39

Re: Eve evaluation, I've never tried, TBH. I've been wanting to spend more time looking at Beamr 5x (fantastic so far!) but have been quite busy.

benwaggoner

3rd July 2019, 01:30

Imagine, for example, people who believe that tube amps or vinyl are better than digital audio. In a double-blind test they would probably still vote for the tube amp or the vinyl because it has distinctive audio characteristics that can't be removed without invalidating the test. They can hear things that they prefer and associate (consciously or not) with quality.
Yep, and then we're back into "Zen and the Art of Motorcycle Maintenance" style philosophical ruminations on the nature and meaning of "quality." Which is unavoidable at a certain point, which is why we try to test for something more specific than just quality. Accuracy to a source and creative intent can be quite different from a no-reference "is this clip pleasing" or "do you see anything wrong with this clip?"

On the other hand, they would potentially be fooled by audio that had been processed to sound like tube amps or vinyl or passed through a digital chain before output.
Exactly. And it's not a particularly hard thing to synthesize. It's not like the film grain in Marvel movies is ACTUALLY grain-from-film. It's digitally synthesized. Grain helps make blending in VFX a lot easier, and allows for rendering at 2K instead of 4K.

I considered this possibility because two recent tests that were presented as being negative for AV1 specifically mentioned that some of their test participants were video engineers. They mentioned this as evidence that it was all done properly, but it seemed like an obvious test methodology failure to me.
It depends on what the question they were asking was intended to be. But yeah, having a bunch of video engineers look at something is very different than having the general public look at something, and can provide different (but both useful!) answers. Video engineers are going to pick up on more subtle things, and are going to care about accuracy and creative intent more.

Generally I'll have video experts do an initial pass on something to see "is there something that can be seen here?" and then use double-blind testing with a more general population to confirm details. The second is a LOT slower and more expensive than the first, of course.

I think it was Monty from Xiph that said his party trick used to be identifying the encoder used just by listening to mp3s, and I bet certain bitrates and content would let people here do the same with video codecs and there's a possibility their opinion scores would differ from Joe Public as a result.
Oh, no doubt. I've done that party trick **many** times. x264 versus WMV3 versus VC-1 versus Main Concept versus VP9; it's generally pretty obvious if you've been in the field for a while.

Beelzebubu

3rd July 2019, 13:45

Is Eve available for evaluation in any way? I've never been able to get my hands on a build, or clips encoded to my specifications.

Have you asked?

benwaggoner

3rd July 2019, 22:55

Seems like a wonky business model if one of Amazon's principal video engineers can't get their hands on a build of it to at least do some testing.
To be clear, I speak only for myself, not Amazon, on these forums. I actually tinker with video stuff on my off hours too. I should probably get out more.

Anyway, I requested an optimal Eve encoding for My encoding challenge (https://forum.doom9.org/showthread.php?t=175776&highlight=benwaggoner), but they declined to participate.

It is common for encoder vendors who think they are doing some magic things in the bitstream to want to have the bitstream output under NDA and such. I get the impulse, but it just isn't practical for doing actual comparisons or due diligence evaluation.

dapperdan

4th July 2019, 08:12

Ilya87

4th July 2019, 18:07

Hi guys, I've decided to make a comparison of the x264, rav1e and x265 encoders at 500, 600, 700, 800, 900 and 1000 kbit/s with the following settings:

rav1e -b $g --tiles 6 -s 5 --matrix BT470BG /D/sintel/sintel720.y4m --output sintel720_rav1e_s5_$g.ivf
rav1e -b $g --tiles 6 -s 3 --matrix BT470BG /D/sintel/sintel720.y4m --output sintel720_rav1e_s3_$g.ivf
x264 -t 2 -m 11 --me umh --weightp 2 --direct spatial --aq-mode 2 --b-adapt 2 -B $g -b 4 -r 6 -I 240 --b-pyramid normal --no-dct-decimate --no-fast-pskip -A all -o sintel720_x264_$g.264
x265 /D/sintel/sintel720.y4m --y4m -o "Sintel720_x265_$g.h265" --rd 3 -b 4 --b-adapt 2 --b-pyramid --ref 6 -I 240 --bitrate $g --aq-mode 2 --weightp --weightb -m 2 --no-early-skip --psy-rd 1 --me star
where $g stands for bitrate value
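For anyone wanting to repeat the sweep, a minimal sketch of expanding `$g` over the six bitrates for the two rav1e speed levels might look like this. It only builds and prints the command lines using the flags quoted above; putting the speed level into the output filename is my own choice so the two runs cannot overwrite each other.

```python
# Bitrates (kbit/s) and source path are taken from the post above.
BITRATES_KBPS = [500, 600, 700, 800, 900, 1000]
SOURCE = "/D/sintel/sintel720.y4m"

def rav1e_command(bitrate, speed):
    """Build one rav1e command line as an argument list.

    The output name includes the speed level (an assumption of this
    sketch) so that each (speed, bitrate) pair gets its own file.
    """
    return [
        "rav1e", "-b", str(bitrate), "--tiles", "6", "-s", str(speed),
        "--matrix", "BT470BG", SOURCE,
        "--output", f"sintel720_rav1e_s{speed}_{bitrate}.ivf",
    ]

commands = [rav1e_command(b, s) for s in (5, 3) for b in BITRATES_KBPS]

for cmd in commands:
    print(" ".join(cmd))
```

The argument lists can be passed to `subprocess.run` as they are if the encodes should actually be launched.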

My OS is Arch Linux x86_64 and the CPU is a Core i5 8600K. rav1e was built recently, and for testing 1191 frames of Sintel 1k 16bit (frames 12987 to 14177) were taken and converted to 720x306 yuv420p. x265 and x264 are from the distro's repository.

To measure MS-SSIM and PSNR-HVS-M daala's tools were used. To measure VMAF score I used ffmpeg's VMAF filter.

Results:

https://i110.fastpic.ru/big/2019/0704/69/c8b3817496c197135466a8ee32998c69.png
https://i110.fastpic.ru/big/2019/0704/a6/93b47b0ce22c8127bab17738dfee1ea6.png
https://i110.fastpic.ru/big/2019/0704/e6/151968254c3ab19c8abded7ee4a49ae6.png

x265 is a clear winner with 50.59-61.01 fps (lowest to highest bitrate settings)
x264 80.08-102.73 fps
rav1e s3 1.603-2.187 fps
rav1e s5 3.736-4.469 fps

Average CPU utilization of rav1e was 66%-70% (and I couldn't increase it).

marcomsousa

4th July 2019, 22:44

Intel SVT-AV1 0.6 Released With AV1 Decoding, SIMD Optimizations

https://www.phoronix.com/scan.php?page=news_item&px=Intel-SVT-AV1-0.6-Released

Ilya87

4th July 2019, 23:35

Intel SVT-AV1 0.6 Released With AV1 Decoding, SIMD Optimizations

https://www.phoronix.com/scan.php?page=news_item&px=Intel-SVT-AV1-0.6-Released

Dimensions that are a multiple of 2 are still not supported, only multiples of 8. It still segfaults. And there are many other bugs.

benwaggoner

7th July 2019, 20:23

Dimensions that are a multiple of 2 are still not supported, only multiples of 8. It still segfaults. And there are many other bugs.
It is a 0.6 release. I'd expect a smaller set of limitations like that and other issues will still be in 0.7.

Nintendo Maniac 64

8th July 2019, 05:39

soresu

9th July 2019, 05:46

Well there goes my bank account down the tubes after seeing those Ryzen 3000 results - roll on September so I can become poor and happy with my 3950X.

benwaggoner

10th July 2019, 17:38

While now a version old, Phoronix tested (on Linux) the encoding performance of SVT-AV1 v0.5 on the new 3rd gen AMD Ryzen 8core (3700X) and 12core (3900X) chips compared to existing Intel CPUs (primarily the 8core 9900K and 16core 7960X):

https://www.phoronix.com/scan.php?page=article&item=ryzen-3700x-3900x-linux&num=4
Wow, interesting result! And no way did Intel worry about AMD optimizations when compiling SVT :sly: I wonder what the comparison between cpu-tuned x265 and libaom would be like, which should tilt more in AMD's favor.

I note that the top Intel processor used has only half the cores as the top AMD, so this difference could easily be due to multithreading more than per-core performance improvements. But that in no way invalidates the price/performance delta.

Also, an AV1 encoder that's running only ~2.5x slower than a HEVC encoder! Of course, I have no idea if the output quality is similar. As always, the key metric is quality @ bitrate @ performance.

benwaggoner

10th July 2019, 17:45

Also, I note that the Intel processor used in comparison is from 2017. The current equivalent would probably be the i9-9980XE, which has two more cores and a 7% faster clock. That would probably have similar SVT performance to the Threadripper. At more than 2x the price, though (although for an encoding workstation/instance, the CPU is typically less than half the cost).

SmilingWolf

10th July 2019, 20:18

Status report!
"Yes I keep tweaking the params" edition

1st edition: https://forum.doom9.org/showthread.php?p=1852449#post1852449
2nd edition: https://forum.doom9.org/showthread.php?p=1857587#post1857587
3rd edition: https://forum.doom9.org/showthread.php?p=1860475#post1860475
4th edition: https://forum.doom9.org/showthread.php?p=1871939#post1871939
Whatever paragraph I don't repeat here can be assumed to be the same as in the aforementioned post

First of all: graphs!
Click to enlarge

Y axis: chosen metric
X axis: bits per pixel
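For reference, a minimal sketch of one common way to get bits per pixel from an encoded file. It assumes bpp means total bits divided by total coded luma pixels (width x height x frame count); the exact definition used for the graphs is not stated, and the numbers below are hypothetical.

```python
def bits_per_pixel(size_bytes, width, height, frames):
    """Total coded bits divided by the total number of luma pixels."""
    return size_bytes * 8 / (width * height * frames)

# Hypothetical example: a 1,000,000-byte encode of 240 frames at 1280x720.
example_bpp = bits_per_pixel(1_000_000, 1280, 720, 240)
print(f"{example_bpp:.4f} bits per pixel")
```

Normalizing by pixels rather than plotting raw bitrate is what lets clips of different resolution, length and frame rate share one axis.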

720p:
https://i.ibb.co/Rh3db1D/hvmaf-720.png (https://ibb.co/Rh3db1D) https://i.ibb.co/fGffmdM/msssim-720.png (https://ibb.co/fGffmdM) https://i.ibb.co/86Y4ssM/psnrhvsm-720.png (https://ibb.co/86Y4ssM)

1080p:
https://i.ibb.co/sCzgwpd/hvmaf-1080.png (https://ibb.co/sCzgwpd) https://i.ibb.co/BNj3DCR/msssim-1080.png (https://ibb.co/BNj3DCR) https://i.ibb.co/cJcN2by/psnrhvsm-1080.png (https://ibb.co/cJcN2by)

BD rates for 720p:
Codecs ladder:                  | x264 relative:
x264 -> svtav1                  | x264 -> svtav1
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   -10.5381   0.426713    | MSSSIM   -10.5381   0.426713
PSNRHVS  -11.296    0.557542    | PSNRHVS  -11.296    0.557542
HVMAF    -19.6867   0.689824    | HVMAF    -19.6867   0.689824
--------------------------------|--------------------------------
svtav1 -> vp9                   | x264 -> vp9
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   -12.4136   0.464516    | MSSSIM   -24.2802   1.23124
PSNRHVS  -13.288    0.615572    | PSNRHVS  -25.1991   1.68477
HVMAF    -14.5152   0.598246    | HVMAF    -26.3686   2.81799
--------------------------------|--------------------------------
vp9 -> x265                     | x264 -> x265
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   -1.73618   0.0667664   | MSSSIM   -26.2541   1.24552
PSNRHVS  -6.07444   0.298073    | PSNRHVS  -30.4815   1.87719
HVMAF    -9.04578   0.359953    | HVMAF    -31.4265   3.28152
--------------------------------|--------------------------------
x265 -> av1                     | x264 -> av1
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   -20.8531   0.881529    | MSSSIM   -39.9238   2.1343
PSNRHVS  -16.9627   0.860883    | PSNRHVS  -40.3335   2.76154
HVMAF    -23.5865   1.00102     | HVMAF    -48.1341   3.64521

BD rates for 1080p:
Codecs ladder:                  | x264 relative:
x264 -> svtav1                  | x264 -> svtav1
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   -14.3136   0.452642    | MSSSIM   -14.3136   0.452642
PSNRHVS  -10.1078   0.374405    | PSNRHVS  -10.1078   0.374405
HVMAF    -20.4048   0.58988     | HVMAF    -20.4048   0.58988
--------------------------------|--------------------------------
svtav1 -> vp9                   | x264 -> vp9
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   -19.1279   0.563386    | MSSSIM   -34.6951   1.70828
PSNRHVS  -21.5428   0.778635    | PSNRHVS  -33.6391   2.16168
HVMAF    -21.4399   0.750138    | HVMAF    -34.3162   3.93015
--------------------------------|--------------------------------
vp9 -> x265                     | x264 -> x265
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   8.56339    -0.282927   | MSSSIM   -30.5146   1.24699
PSNRHVS  3.02814    -0.139956   | PSNRHVS  -32.9536   1.71646
HVMAF    -3.70741   0.0299945   | HVMAF    -35.6727   3.2304
--------------------------------|--------------------------------
x265 -> av1                     | x264 -> av1
         RATE (%)   DSNR (dB)   |          RATE (%)   DSNR (dB)
MSSSIM   -28.044    1.00637     | MSSSIM   -47.6676   2.30149
PSNRHVS  -23.4583   0.991831    | PSNRHVS  -45.8303   2.79923
HVMAF    -26.6387   0.978822    | HVMAF    -51.9814   3.88658
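For readers unfamiliar with the RATE (%) column: a BD rate is the average bitrate difference between two rate/quality curves over the quality range they share, with negative meaning the second encoder needs fewer bits. The sketch below is a simplified, dependency-free variant that interpolates log-bitrate linearly between points; the usual Bjøntegaard calculation fits a cubic instead, and this is not the script used for the tables above. The sample points are hypothetical.

```python
import math

def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside the interpolation range")

def _mean_log_rate(points, lo, hi, steps=1000):
    """Average log-bitrate over the quality interval [lo, hi] (trapezoid rule)."""
    pts = sorted(points, key=lambda p: p[1])
    quality = [q for _, q in pts]
    log_rate = [math.log(r) for r, _ in pts]
    h = (hi - lo) / steps
    samples = [_interp(min(lo + k * h, hi), quality, log_rate)
               for k in range(steps + 1)]
    area = h * (sum(samples) - 0.5 * (samples[0] + samples[-1]))
    return area / (hi - lo)

def bd_rate(anchor, test):
    """Percent bitrate change of `test` versus `anchor` at equal quality.

    Each argument is a list of (bitrate, quality) points with distinct
    quality values. Negative results mean `test` needs fewer bits.
    """
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0

# Hypothetical curves: the test encoder hits each quality at half the bitrate.
anchor_points = [(1000, 30.0), (2000, 33.0), (4000, 36.0), (8000, 39.0)]
test_points = [(rate / 2, quality) for rate, quality in anchor_points]
print(round(bd_rate(anchor_points, test_points), 2))
```

Because only the overlapping quality range is integrated, the result depends on which operating points were encoded, which is one reason BD rates from different test setups are hard to compare directly.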

Encoders:
x264 157-2970-5493be8
x265 3.1-4-4f6dde51a5db
libvpx-vp9 1.8.0-591-g19bda215d
SVT-AV1 0.6.0-1424-8977f443
libaom 1.0.0-2036-ge2c1d5ef8

Cmdlines:
x264 --preset veryslow --tune ssim --crf 16 -o test.x264.crf16.264 orig.i420.y4m
x265 --preset veryslow --tune ssim --crf 16 -o test.x265.crf16.hevc orig.i420.y4m
vpxenc --codec=vp9 --frame-parallel=0 --tile-columns=0 --auto-alt-ref=6 --good --cpu-used=0 --tune=psnr --passes=2 --threads=1 --end-usage=q --cq-level=20 --test-decode=fatal --ivf -o test.vp9.cq20.ivf orig.i420.y4m
SvtAv1EncApp.exe -i orig.i420.yuv -b test.svtav1.cq20.ivf -w 1280 -h 720 -q 20 -enc-mode 3 -fps-num 24000 -fps-denom 1001 -intra-period 23
aomenc --frame-parallel=0 --tile-columns=0 --auto-alt-ref=1 --cpu-used=4 --tune=psnr --passes=2 --threads=2 --row-mt=1 --end-usage=q --cq-level=20 --test-decode=fatal -o test.av1.cq20.webm orig.i420.y4m
VMAF: model used: vmaf_b_v0.6.3, pooling: harmonic_mean, bagging score (arithmetic mean of 21 models' scores)

Notes:
TearsOfSteel720 and TheFifthElement, two clips in the 720p category, had a vertical resolution incompatible with SvtAv1EncApp (not divisible by 8).
They have been padded to 1280x536, so they have been included in this round of measurements again.
Meanwhile, rav1e still has a nasty bug that makes it bloat encodes, causing up to a 25% BD-rate regression, so it has been excluded from this edition.
Again, no timing info, because I use the PC while it encodes etc. etc.
If somebody REALLY wants encoding times, I can run a battery of encodes under ideal conditions on my favourite 1080p clip (PresageFlowerFight) and report the stats in a follow-up post (ping @benwaggoner)
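On the padding mentioned in the notes above: rounding a dimension up to the next multiple of 8 is a one-liner. In this sketch the 534-line input height is an assumption for illustration; only the padded 1280x536 size comes from the post.

```python
def pad_up(value, multiple=8):
    """Round value up to the nearest multiple (ceiling division)."""
    return -(-value // multiple) * multiple

def padded_size(width, height, multiple=8):
    """Return (width, height) rounded up so both divide evenly by multiple."""
    return pad_up(width, multiple), pad_up(height, multiple)

# Assumed source height of 534 lines; 1280 is already a multiple of 8.
print(padded_size(1280, 534))
```

The added rows cost a few bits and, depending on how the metrics are run, may need to be cropped off again before comparing against the unpadded source.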

This concludes this report.
As always, I'm open to any kind of feedback to improve my comparisons and my encodes.

Nintendo Maniac 64

10th July 2019, 20:41

I note that the top Intel processor used has only half the cores as the top AMD, so this difference could easily be due to multithreading more than per-core performance improvements.

Keep in mind that the 2990WX uses a very nontraditional CPU die topology that makes it more akin to something like a dual-socket system with two full CPUs. It's so nontraditional that you basically need to use Linux to get any semblance of good performance at all (Wendell from level1techs did a good analysis on the subject in this video here (https://www.youtube.com/watch?v=M2LOMTpCtLA)).

You can also see from the results that even normal Threadripper like the 12core 2920X (which still uses a somewhat nonstandard die configuration) is getting beaten by the 9900K and 3700X which both use a very traditional CPU core configuration by comparison (one could even argue that the separate I/O die on the 3700X is actually more traditional and is akin to the days of northbridges and external memory controllers ala Athlon XP and Core 2 Duo).

Nevertheless, there could very well be a point of diminishing returns in multicore scalability for SVT-AV1, such that 32c/64t just isn't seeing the utilization it otherwise could, and even more so with the nontraditional core arrangement of the 2990WX.

Also, I note that the Intel processor used in comparison is from 2017.
While true, keep in mind that the per-GHz performance on Intel has not changed at all and won't change until their 10nm parts.

The current equivalent would probably be the i9-9980XE, which has two more cores and a 7% faster clock.
The i9-7960X was not the flagship part of its generation - there was in fact an 18core 7980XE during that gen as well (albeit with a bit lower base clock).

This tells me that Phoronix wasn't actually trying to use the highest-end Intel CPU parts that are available, even within a given CPU generation.

mandarinka

12th July 2019, 23:11

When you say they "declined to participate" did they respond and say they didn't want to take part or did you just not hear from them after making a broad request in a forum post?

I believe the comment above yours saying ("Have you asked?") is written by a developer of EVE, which suggests they didn't know they'd been asked, so possibly an email has got lost in a spam trap.

I suspect that offer is for Amazon, not for the purposes of this open forum/us plain end users.

Recently somebody asked on AOM IRC whether it would be possible for the Parkjoy encode that Two Orioles showed in the recent conference presentation to be shared/uploaded. They were rejected (https://freenode.logbot.info/aomedia/20190704) - in their words the encodes are not actually "classified", but it is "too much work" (https://freenode.logbot.info/aomedia/20190705). Take it as you will, but AFAIK there is not a single public stream or file encoded by their software out in the open (or am I missing something?), so it might not be a coincidence or something that is going to change. I don't want this to sound like bashing, but perhaps NDAing all that is the business policy that they want/need/are forced to use by the nature of the field. A few years have passed and I can't see how some more transparent comparison test couldn't have been arranged one way or another, so I assume they just don't wish to do that. Sharing samples like the Beamr guys did is one way they could brag about their quality without the encoder leaving their hands...

You would probably have to get such sample from streaming/video services who are using the software, when content encoded with Eve appears via them.

Blue_MiSfit

15th July 2019, 23:37

Yeah what the heck, guys! Why can't they release some sample bitstreams of open source content? The pros among us are totally interested in commercial encoders, but only if they can compete honestly.

benwaggoner

17th July 2019, 02:31

I suspect that offer is for Amazon, not for the purposes of this open forum/us plain end users.
The shootout is a personal effort of mine, unrelated to my day job. And a direct request for an Eve sample was made and declined by someone personally familiar with us both.

Recently somebody asked on AOM IRC whether it would be possible for the Parkjoy encode that Two Orioles showed in the recent conference presentation to be shared/uploaded. They were rejected (https://freenode.logbot.info/aomedia/20190704) - in their words the encodes are not actually "classified", but it is "too much work" (https://freenode.logbot.info/aomedia/20190705). Take it as you will, but AFAIK there is not a single public stream or file encoded by their software out in the open (or am I missing something?), so it might not be a coincidence or something that is going to change. I don't want this to sound like bashing, but perhaps NDAing all that is the business policy that they want/need/are forced to use by the nature of the field. A few years have passed and I can't see how some more transparent comparison test couldn't have been arranged one way or another, so I assume they just don't wish to do that. Sharing samples like the Beamr guys did is one way they could brag about their quality without the encoder leaving their hands...
And Beamr has seen a lot of success. I'm not sure of anyone using Eve in production.

You would probably have to get such sample from streaming/video services who are using the software, when content encoded with Eve appears via them.
Do we know of any that have confirmed they are using Eve in production?

LigH

18th July 2019, 09:25

New uploads: (MSYS2; MinGW32 / MinGW64: GCC 9.1.0)

AOM v1.0.0-2084-g42451f74e (https://www.mediafire.com/file/a4w52hb8l2wbrzt/aom_v1.0.0-2084-g42451f74e.7z/file)

rav1e 0.1.0 (20190430-207-g6ac87d8) (https://www.mediafire.com/file/j1fqdmj9ye5ejm0/rav1e_0.1.0_20190430-207-g6ac87d8.7z/file) built 2019-07-17; new verbose version numbering

dav1d 0.3.1 (2019-07-17, g15a9386) (https://www.mediafire.com/file/ylq5u0k6af32flj/dav1d_0.3.1_2019-07-17_15a9386.7z/file)

IgorC

24th July 2019, 15:20

AV1 Ecosystem Update: June 2019
https://www.singhkays.com/blog/av1-ecosystem-update-june-2019/

Spyros

6th August 2019, 23:13

dav1d 0.4.0 'Cheetah' released (https://code.videolan.org/videolan/dav1d/-/tags/0.4.0)

It supports all the AV1 features and all bitdepths.

0.4.0 brings large improvements in speed on ARM64 (up to 25% speedup) and minor improvements on SSE and ARM. It also improves the RAM usage quite significantly, sometimes more than halving the RAM used.

FFmpeg 4.2 released (https://ffmpeg.org/index.html#pr4.2) with AV1 decoding support (through libdav1d) & more

LigH

8th August 2019, 08:06

Instead, MABS disabled ffmpeg support for librav1e because the previously working patch doesn't work anymore. I guess the two projects have to find a common and more stable API again.

Nintendo Maniac 64

8th August 2019, 19:10

Phoronix has a new review including more SVT-AV1 v0.5 encoding performance metrics (located ~1/3 the way down the page) as well as dav1d v0.3 decoding performance metrics (located ~2/3 the way down the page); do note that all this testing was conducted on Ubuntu Linux:

https://www.phoronix.com/scan.php?page=article&item=amd-epyc-7502-7742&num=4

This time they were testing the new Zen2-based AMD Epyc 32core (Epyc 7502) and 64core (Epyc 7742) chips against existing top-end Intel Xeon and AMD Epyc CPUs in both single-socket and dual-socket configurations.

dav1d v0.3 decoding was also included on the performance-per-dollar page, though oddly enough SVT-AV1 was not (scroll down to around half way down the page):

https://www.phoronix.com/scan.php?page=article&item=amd-epyc-7502-7742&num=9

And in the corresponding forum thread for that review, there's a post containing information on dav1d's decoding performance in fps at both 1080p and 4k as well as frames-per-dollar at 4k:

https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1118516-amd-epyc-7502-epyc-7742-linux-performance-benchmarks#post1118526

soresu

24th August 2019, 12:57

GitHub commits on rav1e have been fairly busy recently; any chance we can get a comparative improvement since the last result on this thread?

soresu

26th August 2019, 14:11

benwaggoner

27th August 2019, 20:56

I found a gitlab repo for the dav1d GPU acceleration GSoC, seems like SGR and CDEF have been implemented in Vulkan, and the same repo even has a GLES branch.

Link here (https://code.videolan.org/stebler/dav1d/commits/vulkan3).

It will be interesting to see if they can get weaker non-ASIC SoCs running well by taking advantage of the previously untapped GPU.
For all the attention encoding on GPU has had over the years, compression with a modern codec is actually about the worst video-related task to run on a GPU. Preprocessing, compositing, and decoding are all much more determinate and parallelizable processes than optimal encoding in complex modern codecs with so many interrelated mode decisions.

Nintendo Maniac 64

27th August 2019, 21:31

For all the attention encoding on GPU has had over the years, compression with a modern codec is actually about the worst video-related task to run on a GPU. Preprocessing, compositing, and decoding are all much more determinate and parallelizable processes than optimal encoding in complex modern codecs with so many interrelated mode decisions.

Considering that the post you quoted is referring to dav1d rather than rav1e, I would presume that they were in fact referring to GPU-accelerated decoding rather than GPU-accelerated encoding.

soresu

27th August 2019, 23:54

Considering that the post you quoted is referring to dav1d rather than rav1e, I would presume that they were in fact referring to GPU-accelerated decoding rather than GPU-accelerated encoding.

Correct, I understand there are limitations to what the average ARM SoC can do, but leaving the GPU running idle during decode seems a sad waste.

soresu

31st August 2019, 21:31

Given Qualcomm has still yet to join AOM, I wouldn't expect them to do a Hexagon DSP implementation of an AV1 decoder as they did with HEVC back in the day.

NikosD

1st September 2019, 08:43

@benwaggoner

There is no such thing as "GPU encoding" nowadays.
It's an old term referring to the old days when GPGPU processing was used for video encoding.

Nowadays all GPUs from the three major vendors (Intel, nVidia, AMD) contain a fixed-function hardware unit, an ASIC, just for the purpose of video decoding/encoding/pre-post processing.

The quality of H.265 hardware encoding of Turing cards aka Turing specific ASIC for video encoding is more than good enough and its speed is out of this world compared to software encoders.

Nintendo Maniac 64

2nd September 2019, 05:56

The quality of H.265 hardware encoding of Turing cards aka Turing specific ASIC for video encoding is more than good enough and its speed is out of this world compared to software encoders.

EposVox has similarly praised both the HEVC encoder on Navi and Turing's AVC encoder for being very fast with very good quality, especially Navi's HEVC encoder (though he has also mentioned that actually getting it to work is a bit of a pain).

(Navi's AVC encoder is still no better than previous AMD GPUs however, meaning you really shouldn't use it)

benwaggoner

3rd September 2019, 19:30

@benwaggoner

There is no such thing as "GPU encoding" nowadays.
It's an old term referring to the old days when GPGPU processing was used for video encoding.
There are still some products and ongoing experimentation for how to leverage GPU in parallel with CPU for improved encoding. But software has certainly pulled ahead in the last five years.

This could be of particular interest for new codecs like AV1, EVC, and VVC where mature fixed-function implementations aren't yet available. The rapid iteration possible with software is a huge benefit in early-stage codec development and deployment.

Nowadays all GPUs from the three major vendors (Intel, nVidia, AMD) contain a fixed-function hardware unit, an ASIC, just for the purpose of video decoding/encoding/pre-post processing.
Of course. And they are essential for things like game streaming where "good enough" quality without taxing GPU or CPU primary processing is needed. But the efficiency is going to be a lot lower than with a good software encode (Like 30%+ higher bitrates required).

The quality of H.265 hardware encoding of Turing cards aka Turing specific ASIC for video encoding is more than good enough and its speed is out of this world compared to software encoders.
Yes, certainly. The economics might not make sense for content that gets streamed multiple times, but for personal use when file size is less of a concern than encoding time, there's a place for it.

Although we don't have any GPUs with AV1 fixed function units yet, do we?
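
To put the "30%+ higher bitrates" point in concrete terms, here is a rough Python sketch of how that overhead adds up with repeated streaming. Only the 30% figure comes from the post above; the bitrate, duration and view count are hypothetical placeholders, and `extra_delivery_bytes` is a made-up helper name.

```python
def extra_delivery_bytes(software_kbps, overhead_percent, duration_s, views):
    """Extra bytes delivered when an encode needs `overhead_percent` more
    bitrate than a software encode at `software_kbps` for the same quality."""
    extra_bits_per_s = software_kbps * 1000 * overhead_percent / 100
    extra_bytes_per_view = extra_bits_per_s / 8 * duration_s
    return extra_bytes_per_view * views


if __name__ == "__main__":
    # Hypothetical title: 5000 kbps software encode, one hour long
    for views in (1, 1000):
        extra = extra_delivery_bytes(5000, 30, 3600, views)
        print(f"{views} view(s): {extra / 1e9:.3f} GB extra")
```

The overhead grows linearly with the number of views while encoding is paid for once, which is why the trade-off looks different for a personal one-off encode than for widely streamed content.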

NikosD

3rd September 2019, 19:39

This could be of particular interest for new codecs like AV1, EVC, and VVC where mature fixed-function implementations aren't yet available. The rapid iteration possible with software is a huge benefit in early-stage codec development and deployment.

Although we don't have any GPUs with AV1 fixed function units yet, do we?
I think there are no HW encoders or decoders for AV1 yet.
And I'm starting to believe that due to the complexity of the codec, it could be the first time that we will not see hybrid (GPU+CPU) decoders/ encoders and we will go straight to fixed-function decoders/encoders of AV1.

Agreed with your post above.

NikosD

3rd September 2019, 20:37

I think it's already been posted, but just as a reminder, we could also wait for Vulkan/OpenGL GPU-assisted hybrid decoding in dav1d.
I have my doubts, but OK:
https://www.phoronix.com/scan.php?page=news_item&px=DAV1D-Vulkan-GLES-Experiment

soresu

4th September 2019, 03:44

Not sure when the GSoC finishes, but he's still posting commits on that branch including CDEF opts.