Google VP9 "Next Generation Open Video" information posted [Archive] - Page 23

utack

14th December 2018, 05:34

Now VMAF gives different scores for mobile, 1080p, and UHD screen sizes. They don’t seem to specify which they used here.

>HVMAF is computed after scaling the encodes to the display resolution (assumed to be 1080p)

I’m having a hard time figuring out what settings they are actually using. It sounds like it’s fixed QP, with one psychovisual parameter added for each “perceptual” tuned mode. So are x264 and x265 fixed QP encodes with psy-rd=1? Is aq-mode=0? And really QP instead of CRF?
They are using the approach described here.
https://medium.com/netflix-techblog/dynamic-optimizer-a-perceptual-video-encoding-optimization-framework-e19f1e3a277f
As far as I understood it, it encodes each "shot" multiple times, adjusting the target bitrate until it hits a desired VMAF.
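A minimal sketch of what such a per-shot feedback loop could look like, assuming quality rises monotonically with bitrate. This is only a guess at the general shape, not Netflix's actual implementation; `measure_vmaf` is a hypothetical callback that would encode the shot at a given bitrate and return its VMAF, and `toy_vmaf` is a made-up stand-in so the example runs without an encoder.

```python
from typing import Callable


def find_bitrate_for_vmaf(
    measure_vmaf: Callable[[int], float],  # hypothetical: encode at this kbps, return VMAF
    target_vmaf: float,
    low_kbps: int = 100,
    high_kbps: int = 20000,
    tolerance_kbps: int = 50,
) -> int:
    """Bisect for (roughly) the lowest bitrate whose VMAF reaches the target."""
    if measure_vmaf(high_kbps) < target_vmaf:
        return high_kbps  # target not reachable in this range; return the upper bound
    while high_kbps - low_kbps > tolerance_kbps:
        mid = (low_kbps + high_kbps) // 2
        if measure_vmaf(mid) >= target_vmaf:
            high_kbps = mid  # good enough, try fewer bits
        else:
            low_kbps = mid   # not good enough, need more bits
    return high_kbps


def toy_vmaf(kbps: int) -> float:
    """Made-up saturating quality curve, standing in for a real encode + measurement."""
    return 100.0 * kbps / (kbps + 1000.0)


if __name__ == "__main__":
    print(find_bitrate_for_vmaf(toy_vmaf, target_vmaf=80.0))
```

Each probe in a real loop costs a full encode plus a VMAF run, so the number of iterations matters far more than it does in this toy.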

And odd libvpx gets to use 2 passes while everything else is 1
It is, probably because everything else has lookahead?

And when aiming for PSNR, why not --tune psnr, which is EXACTLY for that scenario! Psy-rd=0 is NOT tuning for PSNR!
They did tune for PSNR, in part1 "Results with the traditional approach"

This seems a quite poor study for predicting the real-world subjective quality different encoders can produce, as it will substantially underestimate the achievable perceptual quality the x26x codecs can deliver in the real world.

I don't think you can get any of the encoders to do much better, the article from above also goes into detail about how much the internal ratecontrol/quantization choice of all encoders could be improved with their "VMAF feedback loop".

benwaggoner

14th December 2018, 08:14

>HVMAF is computed after scaling the encodes to the display resolution (assumed to be 1080p)
That was the prior VMAF model. The new one also has a mobile and a UHD scale and comparison.

They are using the approach described here.
https://medium.com/netflix-techblog/dynamic-optimizer-a-perceptual-video-encoding-optimization-framework-e19f1e3a277f
As far as I understood it, it encodes each "shot" multiple times, adjusting the target bitrate until it hits a desired VMAF.
But that’s just tuning in BD rate. It isn’t actually tuning any of the other parameters.

Also, VMAF itself has plenty of flaws as a metric. It doesn’t do a good job differentiating between high quality encodes, doesn’t pick up on issues with gradients well, and has other flaws. It is the best objective metric we have, absolutely. But real MOS subjective testing is required to be talking about percentage differences in BD-rate.

It is, probably because everything else has lookahead?
Unless they are using a lookahead equal to clip duration, it wouldn’t be apples to apples.

They did tune for PSNR, in part1 "Results with the traditional approach"
No. If they were actually tuning for PSNR, they would have used --tune PSNR!

I don't think you can get any of the encoders to do much better, the article from above also goes into detail about how much the internal ratecontrol/quantization choice of all encoders could be improved with their "VMAF feedback loop".
With fixed QP and no adaptive quant? I can do a LOT better than that with some tuning, and encode a lot faster at the same time. A reasonably tuned --preset slower would look better and be a lot faster than the --preset placebo without real psychovisual tuning they did here. By the description, I guess they turned off psychovisual features that are on by default, like aq-mode > 0. And fixed QP is a silly rate control mechanism, since the optimal average QP per frame varies a fair amount by the content in the frame. Anime needs lower QP than a jungle scene, for example (even though it can deliver lower QP at a lot lower bitrate than the higher QP of the jungle scene takes).

Beelzebubu

14th December 2018, 16:37

I’m having a hard time figuring out what settings they are actually using. It sounds like it’s fixed QP [..] And really QP instead of CRF?

The paper says they use CRF for x264/5. I would recommend reading the paper if you want to know all the details, it's pretty detailed.

And odd libvpx gets to use 2 passes while everything else is 1. Although that wouldn’t really matter that much if it’s truly a fixed QP encode with no rate control.

The reason people don't use 1-pass CRF in libvpx is because it's pretty broken. Try it. Several bitstream (!!) features get *completely disabled* (!!) when using 1-pass encoding using libvpx. So you get several % BDRATE quality improvements when switching from 1-pass to 2-pass CRF.

mandarinka

14th December 2018, 19:01

It sort of conflates encoder psychovisual tuning and bitstream capabilities. I imagine a port of x264’s psychovisual stuff to a vp9 encoder would offer big improvements, like was seen when x264 algorithms got ported into x265.

If this QP + arbitrary options toggling is true, then WTF I don't even. After all these years of proper methods being discussed and after their inclusion in AOM which should give them more expertise in this, they should know better.

I would start to suspect actual malice here (marketing/PR deciding the results here?). Because they are really jumping through hoops here to not use the encoders in a way they are supposed with that QP bullshit.

mandarinka

14th December 2018, 19:05

With fixed QP and no adaptive quant? I can do a LOT better than that with some tuning

I think the main point is that the default settings which they made effort to override could do much better. CQP is inferior to CRF mode, that is common knowledge, one of the first things that a non-100%clueless x264/x265 user picks up. It has been established 10 years ago.

Beelzebubu

14th December 2018, 19:50

Is aq-mode=0?

I missed this one. According to the figure in the blog post, as well as the paper, the default encoder setting for aq-mode was used for x264/5. The only encoder where aq-mode was specifically disabled is vpxenc, probably because enabling AQ in libvpx causes BDRATE losses in both PSNR as well as VMAF.

mzso

20th January 2019, 12:32

Hi!

Is libvpx really poor at encoding grainy video, or is something wrong on my end? I sporadically encoded some videogame footage, or desktop screencasts, and it looked perfect to me with CRF 30.

But recently I tested with footage from a grainy film and it still looked horrible at CRF 20, and at only CRF 15 it started to look good, which is 5 times the bitrate...

The CLI is like this:
ffmpeg -i infile -g 30 -vcodec libvpx-vp9 -b:v 0 -threads 12 -row-mt 1 -cpu-used 1 -crf 30 out.webm

benwaggoner

21st January 2019, 20:37

If this QP + arbitrary options toggling is true, then WTF I don't even. After all these years of proper methods being discussed and after their inclusion in AOM which should give them more expertise in this, they should know better.

I would start to suspect actual malice here (marketing/PR deciding the results here?). Because they are really jumping through hoops here to not use the encoders in a way they are supposed with that QP bullshit.
It's not malice. Fixed QP without rate control has been a standard codec comparison method for ages. What I do get nervous about is results comparing encoders where one has a bunch of its features turned off to match the limitations of another, and then the results treated as meaningful.

One of the interesting artifacts of this kind of testing is we tend to build codecs where fixed QP optimizes for high mean PSNR and, to a lesser degree, high psychovisual quality. Which is a silly design goal in my opinion, because real-world encoders don't ever do that. Adaptive inter-frame and intra-frame quantization has been standard for professionally-created content for 10+ years.

So why compare codecs in a mode where EVERY block, be it in an I or non-ref B frame, has the exact same QP?

This will also overemphasize adaptive deadzone techniques, which don't show up in the bitstream. So it's not like there aren't ways to sneak in psychovisual or other optimizations; it's just that adaptive quant is disallowed.

benwaggoner

21st January 2019, 20:59

Is libvpx really poor at encoding grainy video, or is something wrong on my end? I sporadically encoded some videogame footage, or desktop screencasts, and it looked perfect to me with CRF 30.

But recently I tested with footage from a grainy film and it still looked horrible at CRF 20, and at only CRF 15 it started to look good, which is 5 times the bitrate...
Grainy video is quite hard to encode, and requires a lot of psychovisual tuning for it to look good at reasonable bitrates. It's a HUGE difference from grain/noise-free screen captures. You'll always have to use some more bits for grain to look good versus a very clean source.

Since YouTube is getting UGC and thus not a lot of grainy video, I wouldn't be surprised if libvpx hasn't gotten a lot of tuning around grain. And what grain they do get is going to almost always be synthesized as a video effect, not true grain like something shot on film would have.

AV1 has some grain synthesis features that would help a lot, but IIRC they weren't in VP9.

mzso

2nd February 2019, 17:16

Since YouTube is getting UGC and thus not a lot of grainy video, I wouldn't be surprised if libvpx hasn't gotten a lot of tuning around grain. And what grain they do get is going to almost always be synthesized as a video effect, not true grain like something shot on film would have.

AV1 has some grain synthesis features that would help a lot, but IIRC they weren't in VP9.

Too bad. It fails hard. It's essentially useless for grainy video. (And I guess we will never see whether Eve does any better)

mandarinka

2nd February 2019, 18:45

I always wondered if Two Orioles could release a public demo build of encoder so that people could validate their quality claims for themselves. Assuming they even want that of course.

My idea was that if they made a binary that lacked SIMD assembly, it would be slow enough to be unusable for production and also not be at risk of software piracy, but it could still be used to check and compare the quality.

utack

2nd February 2019, 23:11

benwaggoner

4th February 2019, 19:33

Some of the classic xiph samples encoded with EVE would also help
One screenshot on their website saying "hey this is better" does not really help
Yeah, EVE gets talked about a lot for something for which there isn't any apparent way to actually test.

utack

5th February 2019, 03:05

Libvpx 1.8
https://www.phoronix.com/scan.php?page=news_item&px=Libvpx-1.8-Released

Phanton_13

5th February 2019, 17:10

from: https://chromium.googlesource.com/webm/libvpx/+/refs/tags/v1.8.0
- Enhancements:
2 pass vp9 encoding has improved substantially. When using --auto-alt-ref=6,
we see approximately 8% for VBR and 10% for CQ. When using --auto-alt-ref=1,
the gains are approximately 4% for VBR and 5% for CQ.

For real-time encoding, speed 7 has improved by ~5-10%. Encodes targeted at
screen sharing have improved when the content changes significantly (slide
sharing) or scrolls. There is a new speed 9 setting for mobile devices which
is about 10-20% faster than speed 8.
What? "--auto-alt-ref=6"? Isn't "--auto-alt-ref" supposed to accept only 0 or 1 as valid values? The documentation doesn't mention any values other than 0 or 1. Does anyone have an explanation?

Tommy Carrot

5th February 2019, 18:16

I don't know what exactly --auto-alt-ref does, but setting it to 6 makes the popping or frame strobing effect, which is probably the biggest problem of VP9, significantly less noticeable.

Selur

5th February 2019, 19:55

some updated documentation really would help,... :/

hajj_3

5th February 2019, 20:30

2 pass vp9 encoding has improved substantially. When using --auto-alt-ref=6,
we see approximately 8% for VBR and 10% for CQ. When using --auto-alt-ref=1,
the gains are approximately 4% for VBR and 5% for CQ.

Are these percentages compression improvements or speed improvements? If compression improvements, then that would be very nice indeed. Are these benchmarks for live encoding or non-live encoding?

Selur

5th February 2019, 20:38

Maybe if some more folks vote https://bugs.chromium.org/p/webm/issues/detail?id=1597 up, they will update the documentation,...

LigH

6th February 2019, 18:30

New upload (MSYS2; MinGW32: GCC 7.4.0 / MinGW64: GCC 8.2.1):

VPx v1.8.0-142-gce4336c2a (https://www.mediafire.com/file/cpjkqwyxpppghwy/vpx_v1.8.0-142-gce4336c2a.7z)

benwaggoner

6th February 2019, 21:03

I don't know what exactly --auto-alt-ref does, but setting it to 6 makes the popping or frame strobing effect, which is probably the biggest problem of VP9, significantly less noticeable.
I believe it dynamically determines the optimal alt-ref ("golden") frame. Which should exactly reduce strobing artifacts, by providing a superior frame for other frames to be predicted from.

Over-tuning for mean per-frame PSNR tends to result in some strobing, because improving quality of reference frames pays off more than non-reference frames.

Dark Shikari had a great blog post about that way back when.

Beelzebubu

7th February 2019, 19:09

I believe it dynamically determines the optimal alt-ref ("golden") frame. Which should exactly reduce strobing artifacts, by providing a superior frame for other frames to be predicted from.

This is what --auto-alt-ref=1 does; --auto-alt-ref=N for N>1 enables multi-level hierarchical frame ordering, with the number of layers being equal to N. This is known to improve coding efficiency, HEVC/H264 use it also (pyramid frame ordering).
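To make "multi-level hierarchical frame ordering" concrete, here is a generic sketch of a strictly dyadic pyramid, the textbook layout also used for B-pyramids. It is an illustration only, not libvpx's actual frame-selection logic: it assumes a power-of-two group size and equal spacing between frames of the same level, and the function names are made up for this example.

```python
from typing import List


def pyramid_layers(group_size: int) -> List[int]:
    """Layer index per frame (display order) for a dyadic pyramid.

    Layer 0 is the base frame; higher numbers are referenced by fewer frames.
    """
    if group_size < 1 or group_size & (group_size - 1):
        raise ValueError("group_size must be a power of two")
    depth = group_size.bit_length() - 1  # log2(group_size)
    layers = [0]
    for i in range(1, group_size):
        trailing_zeros = (i & -i).bit_length() - 1
        layers.append(depth - trailing_zeros)
    return layers


def coding_order(group_size: int) -> List[int]:
    """Display indices sorted so that lower layers are coded first."""
    layers = pyramid_layers(group_size)
    return sorted(range(group_size), key=lambda i: (layers[i], i))


if __name__ == "__main__":
    print(pyramid_layers(8))  # layer of each frame in display order
    print(coding_order(8))    # order in which the frames would be coded
```

For a group of 8 frames the middle frame sits one level above the base, the quarter points one level above that, and the odd frames on top, so every frame can be predicted from nearby frames of a lower layer.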

Selur

7th February 2019, 19:10

Thanks for that info!

Blue_MiSfit

7th February 2019, 22:43

Typical Google - I really wish they would have usable documentation for all of this stuff! Who wants to dig through mailing lists and forum posts to have any idea how to properly do VP9 encoding??

Glad to see libvpx continuing to improve in any case, particularly when it comes to rate control.

benwaggoner

8th February 2019, 00:55

This is what --auto-alt-ref=1 does; --auto-alt-ref=N for N>1 enables multi-level hierarchical frame ordering, with the number of layers being equal to N. This is known to improve coding efficiency, HEVC/H264 use it also (pyramid frame ordering).
So, N=maximum number of alt-ref frames that can be currently used, ala the --ref parameter in x264?

I wish VP9 had a x265.readthedocs.io equivalent :).

And a translation guide would be helpful as well. The VPx codecs wind up trying to do the same essential thing as MPEG codecs, but structured differently to get around patents.

Mr_Khyron

17th February 2019, 15:10

https://github.com/OpenVisualCloud/SVT-VP9
The Scalable Video Technology for VP9 Encoder (SVT-VP9 Encoder) is a VP9-compliant encoder library core. The SVT-VP9 Encoder development is a work-in-progress targeting performance levels applicable to both VOD and Live encoding/transcoding video applications.

The SVT-VP9 Encoder is being optimized to achieve excellent performance levels currently supporting 10 density-quality presets (please refer to the user guide for more details) on a system with a dual Intel® Xeon® Scalable processor targeting:

Real-time encoding of up to two 4Kp60 streams on the Gold 6140 with M8.

SVT-VP9 Encoder also supports 3 modes:

A visually optimized mode for visual quality (-tune 0)

A PSNR/SSIM optimized mode for PSNR / SSIM benchmarking (-tune 1 (Default setting))

A VMAF optimized mode for VMAF benchmarking (-tune 2)

Mr_Khyron

18th February 2019, 01:03

https://phoronix.com/scan.php?page=news_item&px=SVT-VP9-Open-Source
At the start of the month Intel open-sourced SVT-AV1 aiming for high-performance AV1 video encoding on CPUs. That complemented their existing SVT-HEVC encoder for H.265 content and already SVT-AV1 has been seeing nice performance improvements. Intel now has released SVT-VP9 as a speedy open-source VP9 video encoder.

Uploaded on Friday was the initial public open-source commit of SVT-VP9, the Intel Scalable Video Technology VP9 encoder. With this encoder they are focusing on being able to provide real-time encoding of up to two 4Kp60 streams on an Intel Xeon Gold 6140 processor. SVT-VP9 is under a BSD-style license and currently runs on Windows and Linux.

Earlier today I added now the SVT-VP9 test profile for benchmarking this new VP9 encoder via the Phoronix Test Suite. I've been testing SVT-VP9 on a few Intel/AMD Linux systems so far today and the early results are extremely promising.

Beelzebubu

21st February 2019, 21:07

So, N=maximum number of alt-ref frames that can be currently used, ala the --ref parameter in x264?

Yes, it's somewhat similar to that. Note that the distance between equi-level ref-frames isn't necessarily the same, so it's not exactly the same. (I can explain this a bit further if I'm not making sense here.)

I wish VP9 had a x265.readthedocs.io equivalent :).

Good luck getting Google to spend effort on that :).

[edit]

Maybe I'm being unfair in that last comment, bit snarky also, so allow me to re-phrase that. I guess Google would have hoped that if they provide engineering talent, that other community contributors would have done these things that make it more widely useful but aren't necessary for Google internally. Obviously this never happened, but it would make sense from Google's perspective to hope for some more community participation than they've seen.

benwaggoner

21st February 2019, 21:26

Good luck getting Google to spend effort on that :).

[edit]

Maybe I'm being unfair in that last comment, bit snarky also, so allow me to re-phrase that. I guess Google would have hoped that if they provide engineering talent, that other community contributors would have done these things that make it more widely useful but aren't necessary for Google internally. Obviously this never happened, but it would make sense from Google's perspective to hope for some more community participation than they've seen.
Doing it with community support would make plenty of sense. It seems more likely for AV1, which has a much broader set of stakeholders and a lot more options to choose between.

dipje

1st March 2019, 00:26

With vpxenc (tried with 1.7.0 and 1.8.0) I get weird rate-control.. 1.7.0 was undershooting quite a bit, 1.8.0 seemed to do better but with the same source and less bitrate it starts to overshoot out of nowhere (by quite a lot).

vpxenc --cpu-used=9 --good --end-usage=vbr --auto-alt-ref=6 --target-bitrate=6000 --passes=2 --pass=1 (and 2) --aq-mode=1 --row-mt=1

I also tried adding '--cq-level=0' in there, which I've read about somewhere in this thread. Am I doing something wrong (I get a video stream of 6856 kbit/sec, while x264 hits 5847 and x265 hits 5888) or is this just a VP9 thing that the rate-control might be a bit all over the place?
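To put those numbers in proportion, a small sketch that expresses each result as a percentage deviation from the target. It assumes all three encoders were given the same 6000 kbit/s target, which the post implies but does not state outright.

```python
def deviation_pct(actual_kbps: float, target_kbps: float) -> float:
    """Signed deviation from the target bitrate, in percent (positive = overshoot)."""
    return 100.0 * (actual_kbps - target_kbps) / target_kbps


if __name__ == "__main__":
    target = 6000  # assumed common target, taken from --target-bitrate=6000
    for name, actual in [("vpxenc", 6856), ("x264", 5847), ("x265", 5888)]:
        print(f"{name}: {deviation_pct(actual, target):+.2f}%")
```

Under that assumption vpxenc overshoots by roughly 14%, while x264 and x265 both land within about 3% below the target.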

Asilurr

2nd March 2019, 06:49

vpxenc --cpu-used=9 --good --end-usage=vbr --auto-alt-ref=6 --target-bitrate=6000 --passes=2 --pass=1 (and 2) --aq-mode=1 --row-mt=1

1. For general-purpose encoding it is recommended to set --cpu-used to [0 .. 3]. The latter, namely 3, is already reasonably fast (when properly supported by multithreading, more on this later on) while still maintaining decent quality.
2. Do keep in mind that --auto-alt-ref accepts the full range of integers in [0 .. 6], not only 0/1/6. The benefits of switching from 0 to 1 are usually (most sources, at most resolutions) significantly greater than switching from 1 to any >1 value; higher values such as 6 may provide negligible gains over lower values such as 2/3, in which case the speed penalty may become undesirable. As you are already instructing the encoder to be fast with --cpu-used, it's probably a saner option to set --auto-alt-ref to 1. For general-purpose encoding, I'd look at auto-alt-ref 2/3 for cpu-used 1/2, thus reserving the highest possible value of 6 for cpu-used 0.
3. It's redundant to manually specify a --pass; by setting up --passes the encoder can automatically run multi-pass encodings.
4. While --aq-mode 1/2 will occasionally (some sources, at some resolutions) provide better results, accompanied or not by --alt-ref-aq >0, for general-purpose encoding it is strongly recommended to disable AQ by setting aq-mode to 0.
5. For general-purpose encoding it is recommended to explicitly use --lag-in-frames 25, and also --enable-tpl 1.

Properly enabling multithreading in vpxenc/libvpx requires several parameters, not just --row-mt on its own. In addition to setting row-mt to 1, it's needed to set up the tiles too and that can be achieved as follows: explicitly disable tile rows (--tile-rows 0), and explicitly enable tile columns instead (set --tile-columns to 5/6, 5 is already enough for 99.9999% of encodings; do note that the encoder will automatically use as many horizontal tiles as permitted by the horizontal resolution, i.e. width, of the source. In practical scenarios, the encoder will restrict tile-columns to [0 .. 3] much more often than not because that corresponds to sources up to 16:9 2160p). Lastly, explicitly enable --threads by setting a high value, for instance 64 (do NOT change --threads according to the number of actual logical cores of the machine used for encoding; set a high value and the encoder will automatically spawn as many threads as it needs). Wrapping it up, this is how multithreading should be set up for file-based encoding (nota bene, chunk-based encoding is addressed differently): --tile-rows=0 --tile-columns=5 --row-mt=1 --threads=64.
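The recommendations above can be gathered into one argument list, sketched here in Python. The helper names and the exact cpu-used to auto-alt-ref mapping are assumptions (one reading of "auto-alt-ref 2/3 for cpu-used 1/2"); the flags themselves are the ones named in this post, plus vpxenc's --codec and -o. Rate control options are deliberately left out.

```python
from typing import List


def auto_alt_ref_for(cpu_used: int) -> int:
    """One possible reading of the advice: slower speed settings get more alt-ref levels."""
    if cpu_used <= 0:
        return 6
    if cpu_used == 1:
        return 3
    if cpu_used == 2:
        return 2
    return 1


def vpxenc_command(src: str, dst: str, cpu_used: int = 1) -> List[str]:
    """Build a 2-pass vpxenc argument list following the suggestions in this post."""
    return [
        "vpxenc", "--codec=vp9", "--good",
        f"--cpu-used={cpu_used}",
        f"--auto-alt-ref={auto_alt_ref_for(cpu_used)}",
        "--passes=2",            # no explicit --pass needed
        "--aq-mode=0",           # AQ disabled for general-purpose encoding
        "--lag-in-frames=25",
        "--enable-tpl=1",
        # multithreading block: tile columns plus row-mt, generous thread cap
        "--tile-rows=0", "--tile-columns=5", "--row-mt=1", "--threads=64",
        "-o", dst, src,
    ]


if __name__ == "__main__":
    print(" ".join(vpxenc_command("input.y4m", "output.webm", cpu_used=1)))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.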

The rate control of vpxenc/libvpx is peculiar, to say the least. It works better with quantizers (--end-usage cq/q, plus --cq-level [desired CRF], plus --min-q and/or --max-q [desired QP]) than with bitrates (--end-usage vbr/cbr, plus --target-bitrate, plus --undershoot-pct and/or --overshoot-pct, plus --bias-pct). The absolute best results are achieved through chunk-based encoding (i.e. the very way vpxenc was designed to be used), rather than the file-based encoding that would be expected by normal people (a very non-specific alias for consumers). Yes, it's probably fair to state that "rate control in libvpx sucks", as long as it's understood that attempting to achieve strict rate control in file-based encoding will undershoot/overshoot much more often than not, especially when using lax parameters (for instance only --target-bitrate on its own, without the other RC tools).

Belated disclaimer: my insight of vpxenc/libvpx is purely empirical, gained from using them extensively myself on various sources, at various resolutions, with various parameter sets. Professionals such as Ronald (http://forum.doom9.org/member.php?u=27401) can provide technical insight, if you can actually persuade them to do so. :p

LigH

13th March 2019, 14:21

New upload (MSYS2; MinGW32: GCC 7.4.0 / MinGW64: GCC 8.3.0):

VPx v1.8.0-226-g7969c6e0b (https://www.mediafire.com/file/8t8kgvyttphg7qg/vpx_v1.8.0-226-g7969c6e0b.7z/file)

Leeloo Minaï

15th March 2019, 11:42

New upload (MSYS2; MinGW32: GCC 7.4.0 / MinGW64: GCC 8.3.0):

VPx v1.8.0-226-g7969c6e0b (https://www.mediafire.com/file/8t8kgvyttphg7qg/vpx_v1.8.0-226-g7969c6e0b.7z/file)

Thanks for your work !

Is there any changelog about VP9 evolution ?
Each new build is slower than its predecessor... I expected that there could be some speed optimization all over the time and not the contrary :scared:

LigH

18th March 2019, 02:57

The commit log is at https://chromium.googlesource.com/webm/libvpx

singhkays

13th April 2019, 02:04

I'm getting VMAF score of VP9 video which is less than H.264 at 500Kbps using highest quality settings on each. Anyone encounter this before?

the ffmpeg 2-pass settings for

x264 - VMAF score 72.8

./ffmpeg -i scene1.mp4 -c:v libx264 -an -b:v 500k -filter:v scale=720:-1 -preset placebo -pass 1 -f mp4 -y /dev/null \
&& ./ffmpeg -i scene1.mp4 -movflags +faststart -pix_fmt yuv420p -c:v libx264 -an -b:v 500k -filter:v scale=720:-1 -pass 2 -preset placebo ./x264/scene1/x264_500k.mp4

vp9 - VMAF score 67.1

./ffmpeg -i scene1.mp4 -c:v libvpx-vp9 -an -b:v 500k -filter:v scale=720:-1 -row-mt 1 -quality best -cpu-used 0 -tile-columns 2 -threads 8 -pass 1 -f webm -y /dev/null && \
./ffmpeg -i scene1.mp4 -pix_fmt yuv420p -c:v libvpx-vp9 -an -b:v 500k -filter:v scale=720:-1 -row-mt 1 -quality best -cpu-used 0 -tile-columns 2 -threads 8 -pass 2 ./vp9/scene1/vp9_500k.webm

This seems completely unexpected! I've attached both encoded files below

VMAF was then measured with the following command

../../ffmpeg -i vp9_500k.webm -i ../../scene1.mp4 -filter_complex "[0:v]scale=1920x800:flags=bicubic[main];[main][1:v]libvmaf=model_path=/home/kay/vmaf/model/vmaf_v0.6.1.pkl" -f null - >>vmaf-scene1.txt -hide_banner

The encoded files can be downloaded from here https://github.com/Netflix/vmaf/files/3075485/encoded-files.zip

poisondeathray

13th April 2019, 02:52

I'm getting VMAF score of VP9 video which is less than H.264 at 500Kbps using highest quality settings on each. Anyone encounter this before?

the ffmpeg 2-pass settings for

x264 - VMAF score 72.8

./ffmpeg -i scene1.mp4 -c:v libx264 -an -b:v 500k -filter:v scale=720:-1 -preset placebo -pass 1 -f mp4 -y /dev/null \
&& ./ffmpeg -i scene1.mp4 -movflags +faststart -pix_fmt yuv420p -c:v libx264 -an -b:v 500k -filter:v scale=720:-1 -pass 2 -preset placebo ./x264/scene1/x264_500k.mp4

vp9 - VMAF score 67.1

This seems completely unexpected! I've attached both encoded files below

VMAF was then measured with the following command

../../ffmpeg -i vp9_500k.webm -i ../../scene1.mp4 -filter_complex "[0:v]scale=1920x800:flags=bicubic[main];[main][1:v]libvmaf=model_path=/home/kay/vmaf/model/vmaf_v0.6.1.pkl" -f null - >>vmaf-scene1.txt -hide_banner

The encoded files can be downloaded from here https://github.com/Netflix/vmaf/files/3075485/encoded-files.zip

something buggy about your vp9 encode, and the frames are not aligned at the end between x264 and vp9. You didn't upload the immediate source, but both encodes have duplicate frames that aren't in the movie

You can decode to uncompressed or lossless I-frame like utvideo and compare them

ffmpeg gives warning when encoding the vp9 version too

#invalid length 0x11 > 0x3dfb4 in parent

Beelzebubu

13th April 2019, 14:50

something buggy about your vp9 encode

Sometimes the timebase is different. You can ignore timestamps by using "settb=1/30,setpts=N" as extra lavfilters in your commandline for each of the two video streams.
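A sketch of how those two filters could be slotted into a VMAF filtergraph like the one used earlier in the thread, built here as a Python string. The function name and its parameters are made up for illustration; the filters themselves (settb, setpts, scale, libvmaf) are the ones already named in this thread, and the model path is whatever your VMAF installation uses.

```python
def vmaf_filtergraph(width: int, height: int, model_path: str, tb_den: int = 30) -> str:
    """Build a -filter_complex string that discards timestamps on both inputs.

    Input 0 is the encode (scaled back up to the reference size),
    input 1 is the reference. setpts=N renumbers frames 0, 1, 2, ...
    so both streams are compared frame by frame regardless of their timebase.
    """
    normalize = f"settb=1/{tb_den},setpts=N"
    return (
        f"[0:v]{normalize},scale={width}x{height}:flags=bicubic[main];"
        f"[1:v]{normalize}[ref];"
        f"[main][ref]libvmaf=model_path={model_path}"
    )


if __name__ == "__main__":
    print(vmaf_filtergraph(1920, 800, "/path/to/vmaf_v0.6.1.pkl"))
```

This only removes timestamp mismatches. If the two files genuinely contain different frames (dropped or duplicated ones), the scores will still be off.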

poisondeathray

13th April 2019, 15:43

Sometimes the timebase is different. You can ignore timestamps by using "settb=1/30,setpts=N" as extra lavfilters in your commandline for each of the two video streams.

Still buggy . More frame drops now, gaps in motion, and the framecount does not match when setting timebase and pts

You can index it with avisynth/vapoursynth, but frames still don't match

I suspect he had issues with the immediate source

user1085

13th April 2019, 16:13

What is the proper way to multithread in 2019? Is it possible to achieve 100% cpu usage on 8 cores? Also, I read somewhere that frame parallel is not needed anymore. Is that correct?

singhkays

14th April 2019, 03:58

Still buggy . More frame drops now, gaps in motion, and the framecount does not match when setting timebase and pts

You can index it with avisynth/vapoursynth, but frames still don't match

I suspect he had issues with the immediate source

I think I fixed it. I changed the VP9 in WebM container to VP9 in MP4 container and getting expected VMAF scores now

72.8 for x264 vs 74.7 for VP9

Beelzebubu

16th April 2019, 12:45

What is the proper way to multithread in 2019? Is it possible to achieve 100% cpu usage on 8 cores?

Using aomenc, --tile-columns=3 --threads=8 should give you 8 tile col threads. Also try --row-mt and non-zero values for --tile-rows.

Also, I read somewhere that frame parallel is not needed anymore. Is that correct?

See last paragraph in https://forum.doom9.org/showthread.php?p=1859991#post1859991

singhkays

16th April 2019, 20:28

Still buggy . More frame drops now, gaps in motion, and the framecount does not match when setting timebase and pts

You can index it with avisynth/vapoursynth, but frames still don't match

I suspect he had issues with the immediate source

Sometimes the timebase is different. You can ignore timestamps by using "settb=1/30,setpts=N" as extra lavfilters in your commandline for each of the two video streams.

I'm seeing another issue with my VP9 encode. I'm getting the following message after 1st pass -"output file is empty". Is this expected for first pass?

[libvpx-vp9 @ 0x5f5d480] v1.8.0-366-gc46694c1d
Output #0, mp4, to '/dev/null':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.27.101
Stream #0:0(eng): Video: vp9 (libvpx-vp9) (vp09 / 0x39307076), yuv420p, 720x300 [SAR 1:1 DAR 12:5], q=-1--1, 500 kb/s, 24 fps, 12288 tbn, 24 tbc (default)
Metadata:
handler_name : VideoHandler
encoder : Lavc58.49.100 libvpx-vp9
Side data:
cpb: bitrate max/min/avg: 500000/500000/500000 buffer size: 1000000 vbv_delay: -1
frame= 158 fps=0.0 q=0.0 Lsize= 0kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Output file is empty, nothing was encoded

Here's the CLI for trying to get a 500K file
./ffmpeg -i scene6.mp4 -c:v libvpx-vp9 -pix_fmt yuv420p -auto-alt-ref 1 -lag-in-frames 25 -b:v 500k -filter:v scale=720:-1 -row-mt 1 -quality best -cpu-used 1 -tile-rows 2 -tile-columns 2 -threads 16 -enable-tpl 1 -pass 1 -minrate 500k -maxrate 500k -bufsize 1000k -g 240 -f mp4 -hide_banner -y /dev/null && \
./ffmpeg -i scene6.mp4 -c:v libvpx-vp9 -pix_fmt yuv420p -auto-alt-ref 1 -lag-in-frames 25 -b:v 500k -filter:v scale=720:-1 -row-mt 1 -quality best -cpu-used 1 -tile-rows 2 -tile-columns 2 -threads 16 -enable-tpl 1 -pass 2 -minrate 500k -maxrate 500k -bufsize 1000k -g 240 -hide_banner -movflags faststart ./vp9/scene6/vp9_500k.mp4

Asilurr

17th April 2019, 06:54

singhkays

17th April 2019, 07:39

These suggestions will probably help you:
1. With contemporary versions of libvpx there's no real benefit to using tile-rows>0. The combination of tile-rows=0/tile-columns>0/row-mt>0 is advisable.
2. Do not expect multithreading miracles when working with sources which have a paltry 720 pixels horizontal resolution. There are "thresholds" which allow specific values of tile-columns (the primary MT mechanism of libvpx), and then there's row-mt which basically "doubles" the thread count (it's not exactly a 2x count, it depends on the source). Practical example: with your 720 pixels horizontal resolution source, you input tile-columns=2 and threads=16 to the encoder. It accepts them without throwing errors back at you, but it will default to what it can actually use on your source: tile-columns=1 and threads~4 (assuming row-mt=1); the encoder simply can't "fit in" 16 threads working with a 720 width source.
horizontal resolution | superblock columns | horizontal tiles | tile-columns | rough thread count with row-mt=1
0001-0448 | 001-007 | 01 | 0 | ~02
0449-0960 | 008-015 | 02 | 1 | ~04
0961-1984 | 016-031 | 04 | 2 | ~08
1985-4032 | 032-063 | 08 | 3 | ~16
4033-8128 | 064-127 | 16 | 4 | ~32
3. Considering your current quality best, you don't really care about the encoding speed; however do keep in mind that quality good usually provides superior results (yes, really). As such, switching to quality good/cpu-used=0/auto-alt-ref=6 is advisable.
4. FFmpeg's "output file is empty" notification is entirely expected, you are outputting the first pass to /dev/null (https://en.wikipedia.org/wiki/Null_device).
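The width thresholds in the table under point 2 follow from 64-pixel superblocks and a minimum tile width of 4 superblocks (256 pixels). A small sketch that reproduces the table's tile-columns column; the function names are made up for this example, the cap of 6 reflects the largest value mentioned in this thread, and the thread estimate is only the rough "about double the tile count" rule quoted above.

```python
SUPERBLOCK_PX = 64        # VP9 superblock size
MIN_TILE_WIDTH_SB = 4     # a tile must be at least 4 superblocks (256 px) wide
MAX_TILE_COLUMNS_LOG2 = 6


def max_tile_columns(width: int) -> int:
    """Largest usable --tile-columns value (log2 of the tile count) for a given width."""
    sb_cols = (width + SUPERBLOCK_PX - 1) // SUPERBLOCK_PX  # ceil(width / 64)
    log2_cols = 0
    while (log2_cols < MAX_TILE_COLUMNS_LOG2
           and (sb_cols >> (log2_cols + 1)) >= MIN_TILE_WIDTH_SB):
        log2_cols += 1
    return log2_cols


def effective_tile_columns(requested: int, width: int) -> int:
    """What the encoder can actually use when a larger value is requested."""
    return min(requested, max_tile_columns(width))


def rough_threads_with_row_mt(width: int) -> int:
    """Very rough estimate from the table: about twice the number of tiles."""
    return 2 * (1 << max_tile_columns(width))


if __name__ == "__main__":
    for w in (448, 720, 1920, 3840, 7680):
        print(w, max_tile_columns(w), rough_threads_with_row_mt(w))
```

For the 720-pixel-wide source discussed here, a requested tile-columns=2 is effectively reduced to 1, which matches the explanation in point 2.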

Thank you for the recommendations, will update my scripts! :thanks:

Re: quality = good - wow! the more I delve into VP9, the more I'm surprised who came up with all these quirks! Is there more info on why quality = good is better?

Re: auto-alt-ref - Is there more info on what values 1 to 6 mean?

Re: output file - The reason I ask is that ffmpeg VP9 wiki states - "In pass 1, output to a null file descriptor, not an actual file. (This will generate a logfile that ffmpeg needs for the second pass.)" https://trac.ffmpeg.org/wiki/Encode/VP9

Few other questions

Do you have recommendations for aq-mode?
Any other recommendations on preventing the VP9 encoder from undershooting the target bitrate?

Leeloo Minaï

17th April 2019, 12:22

Re: output file - The reason I ask is that ffmpeg VP9 wiki states - "In pass 1, output to a null file descriptor, not an actual file. (This will generate a logfile that ffmpeg needs for the second pass.)" https://trac.ffmpeg.org/wiki/Encode/VP9

If you are under Windows, you may have not noticed the following note on https://trac.ffmpeg.org/wiki/Encode/VP9 :
Note: Windows users should use NUL instead of /dev/null

singhkays

17th April 2019, 18:32

If you are under Windows, you may have not noticed the following note on https://trac.ffmpeg.org/wiki/Encode/VP9 :

I did but I'm on Linux

LigH

18th July 2019, 08:20

New upload (MSYS2; MinGW32 / MinGW64: GCC 9.1.0):

VPx v1.8.1-34-g53dc2d9d9 (https://www.mediafire.com/file/g2ggjb01egj40yq/vpx_v1.8.1-34-g53dc2d9d9.7z/file)

utack

21st August 2019, 06:58

LigH

21st August 2019, 08:31

Apart from raising the bitrate? ... Reducing "noise" is one of the most important ways to encode efficiently; maybe you can change the amount with the VP9 specific parameter
--noise-sensitivity=<arg> Noise sensitivity (frames to blur)
but its default is already 0, so this may make it only worse...

Or it might be possible to move bitrate distribution away from usually preferred frames (keyframes, "golden frames"), which may hurt the quality in other scenes.

Unfortunately, not all encoders offer control over the same kind of algorithms. They may not even contain the same set of algorithms. If VPx has no exposed parameters to control rate distribution in relation to specific content dynamics, "enough bitrate" (or a forced limit for the maximum quantization) may be your only hope.

benwaggoner

22nd August 2019, 18:44

Is there a way to get libvpx not to "crush" flat or dark parts of the image so much? It is one of the major weak points it still has that can ruin certain parts of a stream and that x264 solved ages ago.
Both tune-content=film and aq-mode=1/2 do not seem to suffice to control this behaviour.
Is there another method, like limiting the quantizer range or such?
Thank you
This is where adaptive quantization algorithms shine, and where PSNR-tuned encoders can fall flat.

I've yet to see a VPx series encoder that had a good adaptive quant algorithm. I don't know that this is a limitation of the bitstream itself; it's more likely just a reflection of the psychovisual maturity of the encoders.