View Full Version : Alliance for Open Media codecs


MoSal
13th June 2018, 21:16
I bought an LG smart TV 1.5 years ago. This year, after the YouTube application was upgraded, it started supporting VP9 (stats for nerds reports it) and plays YouTube 4K without a single drop.

I tested a cheap locally-assembled TV with Chinese parts the other day. VP9 4K@60fps is supported out of the box. Opus was the codec that's not supported.

No one will forget AV1, not even the no name chip manufacturers. Here is hope, from now on, they will not forget Opus either.

amichaelt
14th June 2018, 03:13
It's not like VP9 isn't supported by any smart TVs.

I bought an LG smart TV 1.5 years ago. This year, after the YouTube application was upgraded, it started supporting VP9 (stats for nerds reports it) and plays YouTube 4K without a single drop.

But it’s not supported by everything whereas anything that can do 4K with a Netflix app has to have HEVC support.

Blue_MiSfit
15th June 2018, 09:46
anything that can do 4K with a Netflix app has to have HEVC support.

Exactly

blurred
22nd June 2018, 10:24
Interesting discussion regarding the choice of entropy coder for AV1: https://encode.ru/threads/1890-Benchmarking-Entropy-Coders?p=56945&viewfull=1#post56945

The Daala range coder, using 16 multiplications per symbol, has won over rANS, which uses 1 multiplication per symbol and has ~7x faster implementations: https://sites.google.com/site/powturbo/entropy-coder

Does anybody know why the slower and more costly one was chosen?

ps. This nibble adaptive rANS is e.g. used in recent open source Dropbox DivANS: https://blogs.dropbox.com/tech/2018/06/building-better-compression-together-with-divans/

nevcairiel
22nd June 2018, 11:43
Often the choice is for simpler hardware implementations, since that's really the future, not software. I'm also not convinced a generic benchmark can fully represent the performance characteristics of an actual codec.

Phanton_13
22nd June 2018, 12:13
If I remember correctly, rANS does some things in reverse, along with other things that raise the cost of a hardware implementation. In other words, implementing the 16 multipliers of the Daala range coder in silicon is cheaper than implementing the memory needed for rANS. It also appears that rANS can increase latency, especially at low rates, because of the buffers. Most of these problems are being tackled in a new generation of ANS coders, but they are not going to be ready in time for possible inclusion in AV1.

Also remember: faster/cheaper in software is not the same as faster/cheaper in hardware.

blurred
22nd June 2018, 17:03
But don't 16 multiplications cost more energy than one, paid for in power consumption and the battery life of our devices?

Phanton_13
23rd June 2018, 00:55
That only applies to software implementations. In hardware you don't think in terms of the number of instructions but in the number of gates/transistors, and the mapping is not that simple: sometimes you can implement a function with 9 multiplications in the same number of gates that it takes to implement 2 multiplications (this is an example from a real case). Memory also has a cost in gates and power, and you need to evaluate whether it's more efficient to spend the gates on memory or on processing. The result of this evaluation is what determined the use of the Daala range coder, as AV1 has been designed with hardware implementation in mind because it is critical for mobile phones.

blurred
24th June 2018, 12:32
(...)you can implement a function with 9 multiplications in the same number of gates that it takes to implement 2 multiplications(...)
Looks like you are referring to serial execution, which might require 16x frequency increase here (?)
And hardware decoding requires replacing current hardware - meanwhile (~5 years) it will be done in software, where being 7x slower seems a huge sacrifice.
Additionally, Google is still fighting for this ANS patent over dead bodies ( https://arstechnica.com/tech-policy/2018/06/inventor-says-google-is-patenting-work-he-put-in-the-public-domain/ ) - if it is not intended for AV1, will it prevent others using ANS in video compression?

Phanton_13
24th June 2018, 15:21
Looks like you are referring to serial execution, which might require 16x frequency increase here (?)
No, I am referring to reimplementing the function. In that case a 50 MHz FPGA implementation of the full ASIC was able to match a Core 2 Duo at 2 GHz running the same functionality in software, and in silicon the ASIC was running at 1 GHz.

Hardware design is very different from software development. For example, in the Daala range coder most multiplications are constant*value, and in hardware such a case does not always need a real multiplication. If the constant is "2", there are several variants: for unsigned values it is only a bit shift, which in hardware is even cheaper because you only reroute the data lines, and for signed values you use an adder or a modified shifter. For other constants there is usually an alternative, faster way to implement the operation than a full multiplier. Also, most of the time you don't need a full multiplication unit, as you only implement what you need: for example, you can build a 12-bit multiplier instead of a 16-bit one if your values always fit in 12 bits, or implement only the lower bits of a 16-bit multiplication and ignore anything above 16 bits...
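
As an illustration of the constant*value point, here is a minimal Python sketch. It is not taken from any real hardware design or from the Daala/AV1 sources, and the function names are made up for this example. It shows that multiplying by a known constant reduces to a few shifted copies of the input added together, and that a multiplier can be truncated to the bits you actually need.

```python
def shift_add_terms(constant: int) -> list[int]:
    """Bit positions set in `constant`: one shifted copy of the input per set bit."""
    return [bit for bit in range(constant.bit_length()) if (constant >> bit) & 1]


def multiply_by_constant(value: int, constant: int) -> int:
    """Multiply using only shifts and adds, as fixed wiring plus adders would."""
    return sum(value << bit for bit in shift_add_terms(constant))


def truncated_multiply(a: int, b: int, width: int = 16) -> int:
    """Keep only the low `width` bits of the product, like a partial multiplier."""
    return (a * b) & ((1 << width) - 1)


if __name__ == "__main__":
    # Multiplying by 2 needs a single shifted copy and no adder at all.
    print(shift_add_terms(2), multiply_by_constant(37, 2))
    # Multiplying by 10 needs two shifted copies (x<<1 and x<<3) and one adder.
    print(shift_add_terms(10), multiply_by_constant(123, 10))
```

In silicon a shift by a fixed amount is just wiring, so the cost of a constant multiplication is roughly the number of adders, which is one less than the number of set bits in the constant.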

In hardware design the frequency is a value derived from data propagation (delay, timing), and what you want is results: if an implementation runs at a lower frequency but produces the result faster, you go for it.


And hardware decoding requires replacing current hardware - meanwhile (~5 years) it will be done in software, where being 7x slower seems a huge sacrifice.

That is true, but it is more like 2-3 years for hardware to start appearing, and in this case that can be reduced to 1 year thanks to the various hardware designers and manufacturers in AOM.


Additionally, Google is still fighting for this ANS patent over dead bodies ( https://arstechnica.com/tech-policy/2018/06/inventor-says-google-is-patenting-work-he-put-in-the-public-domain/ ) - if it is not intended for AV1, will it prevent others using ANS in video compression?

No, having it refused can actually be good, because it would be detrimental if it were approved later for another entity: the refusal can then be used to call the patent office and that later approval into question and invalidate it. Moreover, this also demonstrates the dysfunction in both the patent system and the legal teams in companies.

blurred
24th June 2018, 16:17
(...)in the range coder of daala most multiplication are constant*value(...)
If I properly understand, there are 16 multiplications due to "maximal alphabet size" = 16 - it needs to multiply "range size" by CDF value for all 16 symbols.
In contrast, rANS needs to multiply by only one value (p[s] = CDF[s+1]-CDF[s]), where s is the currently decoded symbol.

CDF changes with data type (context), and can be adapted - these are definitely not constant values.
In hardware you can build 16 parallel multipliers not to increase frequency, but it would need 16x more gates, and most importantly: consume 16x more energy.
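
The difference in multiplication counts described above can be sketched in Python. This is a deliberately simplified toy, not AV1's actual entropy coder and not any of the benchmarked rANS implementations: the rANS state is an unbounded integer with no renormalization, the 15-bit probability scale and all function names are assumptions made for this example, and every symbol is assumed to have a nonzero frequency.

```python
import bisect

PROB_BITS = 15
TOTAL = 1 << PROB_BITS  # probabilities are scaled so they sum to 32768


def make_cdf(freqs):
    """Build a cumulative table [0, f0, f0+f1, ..., TOTAL] from frequencies."""
    cdf = [0]
    for f in freqs:
        cdf.append(cdf[-1] + f)
    if cdf[-1] != TOTAL:
        raise ValueError("frequencies must sum to TOTAL")
    return cdf


def range_search(rng, dif, cdf):
    """Range-coder style symbol search: scale each CDF entry by the range.

    Returns (symbol, multiplications). Hardware would evaluate all the
    products in parallel; here they are counted one by one.
    """
    mults = 0
    for s in range(len(cdf) - 1):
        upper = (rng * cdf[s + 1]) >> PROB_BITS
        mults += 1
        if dif < upper:
            return s, mults
    raise ValueError("dif must be smaller than rng")


def rans_encode(symbols, cdf):
    """Toy rANS encoder on an unbounded integer state (no renormalization).

    Symbols are pushed in reverse so that the decoder pops them in order.
    """
    x = 1
    for s in reversed(symbols):
        freq = cdf[s + 1] - cdf[s]
        x = (x // freq) * TOTAL + (x % freq) + cdf[s]
    return x


def rans_decode(x, cdf, count):
    """Toy rANS decoder: one table search and one multiplication per symbol."""
    out, mults = [], 0
    for _ in range(count):
        slot = x & (TOTAL - 1)
        s = bisect.bisect_right(cdf, slot) - 1
        x = (cdf[s + 1] - cdf[s]) * (x >> PROB_BITS) + slot - cdf[s]
        mults += 1
        out.append(s)
    return out, mults
```

With a 16-symbol alphabet, `range_search` needs up to 16 products of the range with a CDF entry to locate the symbol, while `rans_decode` locates it with a table search and then performs a single multiplication by the symbol's frequency. The sketch also shows why the rANS encoder has to see the symbols in reverse order.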

Phanton_13
24th June 2018, 19:45
You are partly right, but at the same time you are forgetting one thing: do those 16 parallel multipliers consume more energy than the extra memory needed for rANS? Hardware is inherently parallel, so is it possible to do the update in parallel with another task during the decoding process? Also, there is the possibility of optimizing those 16 parallel multiplications, as one operand is common to all. One thing that helps with hardware is not to think of it as a computer program but as a data flow between operands.

blurred
24th June 2018, 21:24
Such an additional buffer (for rANS) is only needed in the encoder, which for video compression is usually an order of magnitude more costly and, for example for YouTube or Netflix video, is used only once per thousands or millions of views (decodings).
And a video compressor seems to require huge flexible buffers for various modelling/prediction steps - is it a non-negligible cost to share a few kilobytes with the entropy coder?

Also, there is the possibility of optimizing those 16 parallel multiplications, as one operand is common to all.
Interesting, indeed the range is varying, but the same for all 16 multiplications.
Thinking about multiplication as shifts and additions, the cheap shifting part can be indeed shared, but it doesn't seem simple to get systematic optimization for separate additions - do you maybe know some paper showing how to optimize it?

Quikee
25th June 2018, 21:13
AV1 1.0.0 code tag (https://aomedia.googlesource.com/aom/+/v1.0.0)

Also specs (https://aomedia.org/av1-bitstream-and-decoding-process-specification/) don't have draft status anymore.

No official announcement yet..

GTPVHD
26th June 2018, 00:12
https://aomediacodec.github.io/av1-spec/

Still says Draft Document here.

TD-Linux
26th June 2018, 02:03
The Daala range coder, using 16 multiplications per symbol, has won over rANS, which uses 1 multiplication per symbol and has ~7x faster implementations: https://sites.google.com/site/powturbo/entropy-coder

Does anybody know why the slower and more costly one was chosen?

Firstly, the AV1 range coder only uses 1 multiplication per CDF entry, the 16 is the "worst case" (keep in mind that they can be done in parallel, e.g. with SIMD, so it's actually better to use more than less as the multiply is the cheapest part in software). Secondly, the difference is nowhere near 7x when we benchmarked the two - rANS was faster, but by a factor of about 2. However, the requirement to buffer and reverse the symbols was unfortunately insurmountable.

Also keep in mind that AV1 adjusts the probabilities on a per-symbol basis. The entropy coder CDFs are designed to make adapting the probabilities very fast (with only adds and shifts). This puts some constraints on the design that don't exist in the linked benchmark (which uses fixed probabilities as far as I can tell).
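
For readers wondering what per-symbol adaptation "with only adds and shifts" can look like, here is a minimal Python sketch. It is a generic illustration, not libaom's actual update routine: the table layout, the fixed `rate` parameter and the function name are assumptions made for this example.

```python
TOTAL = 1 << 15  # probabilities scaled to 32768


def adapt_cdf(cdf, symbol, rate=5):
    """Nudge a cumulative table toward the symbol that was just coded.

    cdf[i] holds TOTAL * P(value <= i); the last entry is always TOTAL.
    Only subtractions, additions and right shifts are used.
    """
    updated = list(cdf)
    for i in range(len(cdf) - 1):
        if i >= symbol:
            # Entries at or above the symbol move up toward TOTAL.
            updated[i] += (TOTAL - updated[i]) >> rate
        else:
            # Entries below the symbol move down toward zero.
            updated[i] -= updated[i] >> rate
    return updated
```

Each call raises the probability of the symbol that was just seen and lowers the others, and a larger `rate` makes the adaptation slower. Because every entry is updated independently with the same simple operation, the loop maps naturally onto SIMD lanes or parallel hardware.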

benwaggoner
26th June 2018, 04:50
Often the choice is for simpler hardware implementations, since that's really the future, not software. I'm also not convinced a generic benchmark can fully represent the performance characteristics of an actual codec.
I think you can remain happily convinced that a generic benchmark will NOT "represent the performance characteristics of an actual codec".

There is so much clever that gets done, even in decoders. And there are so many different kinds of parallelization, SIMD, ASIC, etcetera available. And surprising numbers of decoders don't implement basic stuff like skipping non-reference frames when doing seeking, due to the system layer and the decoder layers not being tightly coupled enough.

AV1 is way better designed for parallelized HW decoders than VP9 was, which was pretty painfully serialized compared to HEVC, with software decoders pretty dependent on fast single-core performance.

LigH
26th June 2018, 07:37
@ GTPVHD: Then the github site may have outdated content?

MABS also retrieves sources from GoogleSource. And had to disable a TESTS flag to continue compiling, 2 days ago.
__

P.S.: New upload:

AOM v1.0.0-6-gce8f4811b (https://www.mediafire.com/file/chl6dyq78ej8lt9/aom_v1.0.0-6-gce8f4811b.7z) (yes, v1.0.0+)

Phanton_13
26th June 2018, 11:17
do you maybe know some paper showing how to optimize it?

Unfortunately not for the general case. While searching for one I found a paper, "DAALA_EC in AV1", that has some data for hardware implementations:

Daala_ec decoder: 54k gates, performance 1 symbol per clock, decoding time 1 clock.
Daala_ec encoder: 9k gates, performance 1 symbol per clock, encoding time 1 clock.
ANS decoder: 49k gates, performance 1 symbol per clock, decoding time 1 clock.
ANS encoder: 25k gates, performance 1 symbol every 2 clocks, encoding time 2 clocks.

For reference, the VP9 G2 hardware codec has 2.60M gates (2160p@30fps content playback: ~250 MHz).

Basically, ANS does not decode faster than the Daala range coder once implemented in hardware, and is even slower in encoding. That speed differences in software implementations don't carry over to hardware implementations is common enough to call it the norm. Another thing is that it appears the hardware guys at ARM/AMD/Intel/Nvidia had a strong hand in the decision to use the Daala range coder.

Also, rANS is quite recent and highly optimised, plus it uses 32/64-bit arithmetic and SIMD instructions, while the Daala range coder uses only 16-bit arithmetic. And you can fit between 2 and 4 one-clock 16-bit multipliers in the same number of gates as a one-clock 32-bit multiplier.
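
A rough arithmetic intuition for the multiplier-size remark, as a Python sketch. This is only the schoolbook decomposition, not how real multipliers are laid out in silicon, and the function name is made up for this example.

```python
MASK16 = 0xFFFF


def mul32_from_mul16(a: int, b: int) -> int:
    """Compute a 32x32-bit product from four 16x16-bit partial products."""
    a_lo, a_hi = a & MASK16, a >> 16
    b_lo, b_hi = b & MASK16, b >> 16
    low_low = a_lo * b_lo    # first 16x16 multiplier
    low_high = a_lo * b_hi   # second
    high_low = a_hi * b_lo   # third
    high_high = a_hi * b_hi  # fourth
    # The shifts are free wiring in hardware; the additions need wide adders.
    return low_low + ((low_high + high_low) << 16) + (high_high << 32)
```

Doubling the operand width roughly quadruples the partial-product work, which is consistent with the "between 2 and 4" estimate above once sharing and optimized layouts are taken into account.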

Blue_MiSfit
27th June 2018, 08:11
Congrats to the AOM team for hitting 1.0! It's only a few months late ;)

Good stuff tho, looking forward to the encoder maturing. It's always great to see more options.

wiak
27th June 2018, 11:26
I tested a cheap locally-assembled TV with Chinese parts the other day. VP9 4K@60fps is supported out of the box. Opus was the codec that's not supported.

No one will forget AV1, not even the no name chip manufacturers. Here is hope, from now on, they will not forget Opus either.
LG OLED telly doesn't support Opus either sooo heh

LigH
27th June 2018, 11:34
Due to general interest:

My AOM builds are a result of jb-alvarado's media-autobuild_suite (https://github.com/jb-alvarado/media-autobuild_suite/), which has building ffmpeg as main purpose, but also offers several more features.

During the configuration, among several other options, I enabled building of separate executables (not only libraries used in ffmpeg), and building of AOM.

Tommy Carrot
27th June 2018, 17:18
I've done a few tests with the 1.0 build (thanks LigH!). I compared it to the 0.1.0-9043 build from April. I mainly tested --cpu-used 0 to 2, because above that the quality isn't really better than current-gen codecs, while it's still horrendously slow. :D Overall the encoding speed has improved to around twice as fast at each speed setting, while the quality is very similar (both file size and metrics). In some cases there are definite visual improvements, but in most cases it has more or less the same quality.

This encoder still needs a lot of work, it's still too slow even for short tests, not to mention everyday use. The quality is fairly impressive, definitely better than x265 or VP9, but IMO it falls short of XVC and VVC.

iwod
27th June 2018, 18:51
I've done a few tests with the 1.0 build (thanks LigH!). I compared it to the 0.1.0-9043 build from April. I mainly tested --cpu-used 0 to 2, because above that the quality isn't really better than current-gen codecs, while it's still horrendously slow. :D Overall the encoding speed has improved to around twice as fast at each speed setting, while the quality is very similar (both file size and metrics). In some cases there are definite visual improvements, but in most cases it has more or less the same quality.

This encoder still needs a lot of work, it's still too slow even for short tests, not to mention everyday use. The quality is fairly impressive, definitely better than x265 or VP9, but IMO it falls short of XVC and VVC.

Is there a VVC encoder already? Or do you mean XVC?

Tommy Carrot
27th June 2018, 18:55
Is there a VVC encoder already? Or do you mean XVC?
Well, the JVET encoder posted in the VVC thread. It should be more or less the same, just an earlier version.

blurred
27th June 2018, 20:41
Response by author of the benchmark from https://encode.ru/threads/1890-Benchmarking-Entropy-Coders?p=57093&viewfull=1#post57093

Firstly, the AV1 range coder only uses 1 multiplication per CDF entry, the 16 is the "worst case" (keep in mind that they can be done in parallel, e.g. with SIMD, so it's actually better to use more than less as the multiply is the cheapest part in software).
For SSE2 decoding in AV1 you need 4 SIMD multiplications (_mm_mullo_epi32) + 4 comparisons (_mm_cmpgt_epi32) + combining (after _mm_movemask_ps) 4 SSE2 registers
It is unlikely that this will be faster than scalar decoding.
For AV1 hardware implementations, you need 16 32x32 multipliers, otherwise parallel multiplications are not possible.
Also 16 comparisons + other operations are additionally required.

Secondly, the difference is nowhere near 7x when we benchmarked the two - rANS was faster, but by a factor of about 2.
For this benchmark and current implementations, rANS decoding is SEVEN times faster than AV1.
On ARM the scalar version is 5 times faster.
The AV1 nibble entropy coder is even slower than a bitwise range coder.

However, the requirement to buffer and reverse the symbols was unfortunately insurmountable.
This is only required in encoding which is usually done in software.
This irrelevant argument is always used in their discussions.
The benchmark shows that TurboANXN, even with reverse encoding, is more than 4 times faster than the current AOMedia AV1 encoder.

Also keep in mind that AV1 adjusts the probabilities on a per-symbol basis.
The entropy coder CDFs are designed to make adapting the probabilities very fast (with only adds and shifts).
This puts some constraints on the design that don't exist in the linked benchmark (which uses fixed probabilities as far as I can tell).
The benchmark is using adaptive probabilities.

There is so much clever that gets done, even in decoders.
And there are so many different kinds of parallelization, SIMD, ASIC, etcetera available.
And surprising numbers of decoders don't implement basic stuff like skipping non-reference frames when doing seeking, due to the system layer and the decoder layers not being tightly coupled enough.
This is independent of entropy coding. Here we are comparing the AV1 entropy coder against rANS and they are interchangeable.

Also, rANS is quite recent and highly optimised, plus it uses 32/64-bit arithmetic and SIMD instructions, while the Daala range coder uses only 16-bit arithmetic.
And you can fit between 2 and 4 one-clock 16-bit multipliers in the same number of gates as a one-clock 32-bit multiplier.
According to the AV1 source code, 32 bits operations are used. rANS is 32 bits only.

I think the decision against rANS is politically motivated (Not-invented-here-Syndrom (https://de.wikipedia.org/wiki/Not-invented-here-Syndrom)).
Otherwise, why not simply leave the (now removed) rANS version in the repository for comparisons?
Hardware comparisons (complexity, energy consumption, costs, ...) are only possible after implementing both optimized versions.

Note, we are considering here only adaptive rANS. Do not confuse this with block-based ANS as used in zstd, lzfse, lzturbo...

Phanton_13
28th June 2018, 18:22
Response by author of the benchmark...
According to the AV1 source code, 32 bits operations are used. rANS is 32 bits only.

They are actually doing 16-bit arithmetic using 32-bit operations, because modern processors are faster with 32-bit operations than with 16-bit operations. Also, various presentations and documents indicate that the Daala range coder uses 15x16 -> 31-bit multiplications.


I think the decision against rANS is politically motivated (Not-invented-here-Syndrom).
Otherwise, why not simply leave the (now removed) rANS version in the repository for comparisons?

The political motivation can also be viewed in reverse: if it had been included, one could say that was for political reasons... The inventor of something can always, and most of the time will, say that the reason others don't use it is politically motivated. Other times something can be included to make someone happy (this has happened in HEVC and x264).

Also, the patent situation, not only on Google's side but also on the side of others doing the same, taints its chances of inclusion.

To me, instead of whining about it not being included, the right thing is to keep perfecting it so it can be included in AV2.

foxyshadis
29th June 2018, 06:06
Response by author of the benchmark from https://encode.ru/threads/1890-Benchmarking-Entropy-Coders?p=57093&viewfull=1#post57093

For SSE2 decoding in AV1 you need ....

I'm sorry, tell us again about how this codec isn't designed for your 20-year-old Pentium 4. That has nothing to do with optimizability under AVX/AVX2 or Altivec, which are the only instruction sets that matter today.

mzso
29th June 2018, 08:40
AV1 development is becoming disappointing. With the promise of a February delivery for the final bitstream format I expected youtube providing AV1 streams by now. (At least for new/popular videos)

wiak
29th June 2018, 14:05
AV1 development is becoming disappointing. With the promise of a February delivery for the final bitstream format I expected youtube providing AV1 streams by now. (At least for new/popular videos)

they shot themselves in the foot during NAB, when they so-called released the codec

but currently most tools will not encode, case in point ffmpeg has strict mode on and doesn't even do webm, aomenc does webm, the encoding and decoding parts are too slow to be usable even on a modern ryzen 8-core

at least with 1.0.x series we can finally decode stuff encoded in older builds

still useless for anything other than tinkering

and this is from a user perspective

a proper roadmap with set dates on when stuff is getting implemented like faster encoding, multi-threading, browser support?

am convinced that aomedia runs on valve time https://developer.valvesoftware.com/wiki/Valve_Time

anyway, will check back in 3 months time, peace out (but i guess they are still more than a year off)

nevcairiel
29th June 2018, 14:59
but currently most tools will not encode, case in point ffmpeg has strict mode on and doesn't even do webm, aomenc does webm

The container bindings are not finalized yet, which is why tools don't really create those yet. But both MP4 and MKV/WebM bindings are being worked on right now, and once those are final, expect at least FFmpeg to pick them up too.

MoSal
30th June 2018, 01:30
the encoding and decoding parts are too slow to be usable even on a modern ryzen 8-core


No kidding. I tested with --cpu-used=8 --tile-columns=4 expecting acceptable speed and awful quality.

The opposite turned out to be true. The speed is still slow. And the quality wasn't bad. It wasn't very good either, but still beats a tuned x265-slower profile with the specific sample I tested (1Mbps / 1080p / 30fps).


aomenc -o t.webm t.y4m -t 4 --target-bitrate=256 --enable-qm=1 \
--aq-mode=1 --film-grain-test=1 --cpu-used=8 --tile-columns=4



ffmpeg -i t.y4m -c hevc -crf 38 -preset slower -x265-params \
sao=0:deblock=-2,-2:psy-rdoq=5:qcomp=.75:ipratio=1.25:pbratio=1.18 t.mkv


The quality of --film-grain-test=1 is impressive. Better than no test, but --film-grain-test=2 adds too much grain.

On the decoding side. ffav1 should be available soon-ish.

paul97
5th July 2018, 15:38
Has AOMedia already started optimizing AV1 (especially its speed) after the bitstream was frozen on the 25th of June?

iwod
5th July 2018, 18:06
Has AOMedia already started optimizing AV1 (especially its speed) after the bitstream was frozen on the 25th of June?

Well you can be assured they have plans. But it will take time, counted in months, not days or weeks.

Mjpeg
5th July 2018, 22:07
Great Chris Montgomery article on CDEF

https://hacks.mozilla.org/2018/06/av1-next-generation-video-the-constrained-directional-enhancement-filter/

I'm thrilled to see a royalty-free option here, so I'll try to be patient as they speed up the encoder.

TD-Linux
6th July 2018, 00:04
They are actually doing 16-bit arithmetic using 32-bit operations, because modern processors are faster with 32-bit operations than with 16-bit operations. Also, various presentations and documents indicate that the Daala range coder uses 15x16 -> 31-bit multiplications.

Since those slides were made, they shrank even further - they are only 7x9 multiplications. Lower latency for this multiply was critical for hardware throughput, so we shrank it as much as we could without significant compression losses. See https://aomedia.googlesource.com/aom/+/master/aom_dsp/entdec.c#199

Has AOMedia already started optimizing AV1 (especially its speed) after the bitstream was frozen on the 25th of June?

Yup! You can always follow the latest development of libaom on Gerrit. Here are the most recently merged changes: https://aomedia-review.googlesource.com/q/status:merged

MoSal
6th July 2018, 00:19
Had some time to do some tests with --cpu-used=4 and no tiles.

As the bitrate goes up, libaom continues to be superior compared to tuned x265 when it comes to I frames. For non-I frames, the story is different. Tuned x265 (see my previous comment) non-I frames look superior to libaom P frames which look terrible. Even when I give more bitrate to P frames (--max-intra-rate=700), they continue to look terrible.

The flexibility in the API and CLI options is quite lacking. The documentation is not great either (I had to check the source code to know what to pass to some options).

Again, this is with --cpu-used=4 which is probably analogous to veryfast in x264/x265. And it is needless to say that none of this is due to inherent limitations in the codec itself.

dapperdan
7th July 2018, 08:08
Great Chris Montgomery article on CDEF

https://hacks.mozilla.org/2018/06/av1-next-generation-video-the-constrained-directional-enhancement-filter/


Interesting tidbit from Monty in the comments, sounds like Mozilla is working on its own AV1 encoder.

"That’s much more to do with the way the encoder is tuned, and you’re seeing VPx’s traditional tuning biases here. The VP encoders have always been rather merciless to mid-frequency content, and I think the current AOM encoder isn’t yet using any sort of activity masking.

Compare AV1 to Daala on those images; the substantial difference there is difference in encoder tuning philosophy. We’ll be bringing that to the encoder we’re working on here at Moz."

Quikee
7th July 2018, 08:45
Interesting tidbit from Monty in the comments, sounds like Mozilla is working on its own AV1 encoder.

Yup - rav1e - https://github.com/xiph/rav1e

hajj_3
7th July 2018, 22:15
https://www.bbc.co.uk/rd/blog/2018-06-comparison-of-recent-video-coding-technologies-in-mpeg-and-aomedia

foxyshadis
9th July 2018, 06:28
Yup - rav1e - https://github.com/xiph/rav1e

Wow, I'm suddenly excited about AV1 again. Why is it that anything Monty does is like a kiss of gold to me?

Barough
9th July 2018, 15:47
AOM AV1 v1.0.0-82-gf77d93175 (http://www.mediafire.com/file/ad4l6rcz8iszznp/)
Built on July 09, 2018, GCC 7.3.0

https://aomedia.googlesource.com/aom

Zebulon84
9th July 2018, 16:19
AOM AV1 v1.0.0-82-gf77d93175 (http://www.mediafire.com/file/ad4l6rcz8iszznp/)
Built on April 19, 2018, GCC 7.3.0

I guess the "April 19, 2018" is just a copy-paste from that previous message (https://forum.doom9.org/showthread.php?p=1839762#post1839762). Files in the archive are from today, July 9, 2018.

Barough
9th July 2018, 17:50
I guess the "April 19, 2018" is just a copy-paste from that previous message (https://forum.doom9.org/showthread.php?p=1839762#post1839762). Files in the archive are from today, July 9, 2018.

Thnx for the heads up. As u said...... it's copy/paste from the old post. Fixed now. ;)

benwaggoner
9th July 2018, 19:27
Yup - rav1e - https://github.com/xiph/rav1e
The VPx series was quite unusual in that the bitstream proponent also made the primary encoder. For all the MPEG codecs, MPEG didn't worry much about the speed of the reference encoder because only 3rd party implementations are used in the market.

It's a good sign of the health of AV1 that we are seeing a variety of different encoders, both open-source and proprietary, being worked on. VPx never had enough interest to get that effort.

Competition drives tons of implementation innovations in encoding. That's why we're still seeing significant improvements in MPEG-2 after all these years.

utack
9th July 2018, 22:04
VPx never had enough interest to get that effort.

That is not entirely true, there is Eve (https://www.twoorioles.com/eve-for-vp9/), which Netflix used for a while

soresu
10th July 2018, 02:04
I was under the impression that Ronald Bultje from the libvpx team built Eve, so while technically separate it's practically incestuous in creation :p

iwod
10th July 2018, 07:21
Competition drives tons of implementation innovations in encoding. That's why we're still seeing significant improvements in MPEG-2 after all these years.

I think I have read this statement some time ago and may have asked a similar question before. What are the use cases of MPEG-2 today? From high end bitrate to low end I can't think of a single use case where it would be valuable.

Blue_MiSfit
10th July 2018, 08:29
I think I have read this statement some time ago and may have asked a similar question before. What are the use cases of MPEG-2 today? From high end bitrate to low end I can't think of a single use case where it would be valuable.

So many.

- Terrestrial broadcast in the US using ATSC is 100% MPEG-2 for now.
- Legacy cable / satellite networks with tons of existing decoders / professional IRDs in the field
- Legacy broadcast contribution, especially at high bitrates for backup.
- Legacy cable VOD (still pushing 15 Mbps 1080i MPEG-2 in most cases)
- Broadcast playout - quite heavily using 50 Mbps XDCAM HD422, sometimes even lower quality.

TBH, for anything new that's for distribution, yeah you wouldn't use MPEG-2 most likely.

However, legacy stuff has a habit of staying around forever.

Plenty of current premium satellite TV networks use MPEG-2 as their house format because surrounding standards like XDCAM HD422 in MXF have robust support for in-band metadata like captioning, timecode, AFD, etc, and also have broad support from playout server vendors, NLE / post production tools, and pro transcoder tools. The other benefit is that MPEG-2 is quite lightweight to decode these days, so a video server can be dense and cost effective and still perform perfect frame accurate seeking and smooth playback, saving CPU cycles for graphics etc.

Quality is definitely "good enough" (especially considering transmission encoding typically being 6-12 Mbps real-time encoded CBR H.264 with small GOPs and tight buffers).

More modern alternatives like J2K (a-la AS-02 style MXF) and AVC Intra do have benefits in certain cases, especially where very high quality is desired, but they typically come with additional cost in terms of processing power, software licensing, and storage capacity. Higher quality formats tend to live in acquisition and post production. When you deliver to playout, 1080i XDCAM HD422 is probably the most common standard - at least in the US.

The broadcast industry as a whole tends to cling to the melting icebergs of trust for as long as possible, and TBH this is for good reason. Any change that introduces potential risk is an extremely tough sell when you have 4 or 5 nines of uptime required in your SLA. It only takes a few minutes of downtime per year to start feeling the pain.
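
To put numbers on the "nines", here is a small Python sketch of the arithmetic. The function name is made up for this example, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines.

    For example, 4 nines means 99.99% uptime, i.e. 0.01% unavailability.
    """
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (4, 5):
        print(f"{n} nines: about {downtime_minutes_per_year(n):.1f} minutes per year")
```

Under these assumptions, 4 nines leaves roughly 53 minutes of downtime per year and 5 nines leaves roughly 5 minutes.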

I've heard of some improvements in MPEG-2 especially for H.264 -> MPEG-2 transcoders to glue new channels into legacy infrastructure, but TBH this was years ago. I haven't heard of anything major recently. Ben, care to share?

benwaggoner
10th July 2018, 21:49
I've heard of some improvements in MPEG-2 especially for H.264 -> MPEG-2 transcoders to glue new channels into legacy infrastructure, but TBH this was years ago. I haven't heard of anything major recently. Ben, care to share?
Elemental's encoder has delivered >20% bitrate reduction for statmuxed MPEG-2 over the last couple of years.

Granted statmuxed MPEG-2 is kind of a special case, except that it probably accounts for the majority of MPEG-2 eyeball hours these days.

(For those blissfully ignorant of the channel-based broadcast world, statmuxing is when multiple video streams are encoded in parallel to fit within a given amount of total bandwidth. Basically inter-stream VBR.)
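
A toy Python sketch of the statmux idea, for illustration only. Real statistical multiplexers use lookahead, rate control feedback and buffer models; here the per-channel "complexity" score, the guaranteed floor and the function name are all assumptions made for this example.

```python
def statmux_allocate(total_kbps, complexities, floor_kbps=0.0):
    """Split a fixed total bitrate across channels for one time slot.

    Every channel gets `floor_kbps`, and the remaining bandwidth is shared
    in proportion to each channel's current complexity score.
    """
    count = len(complexities)
    spare = total_kbps - floor_kbps * count
    if spare < 0:
        raise ValueError("floor does not fit into the total bandwidth")
    total_complexity = sum(complexities)
    if total_complexity == 0:
        # Nothing to tell the channels apart: share evenly.
        return [total_kbps / count] * count
    return [floor_kbps + spare * c / total_complexity for c in complexities]


if __name__ == "__main__":
    # Three channels sharing 30 Mbps; the third one is showing busy content.
    print(statmux_allocate(30000, [1, 2, 3], floor_kbps=1000))
```

The total never changes, but each channel's share moves from slot to slot as the content gets easier or harder, which is the inter-stream VBR behaviour described above.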