Alliance for Open Media codecs [Archive] - Page 45

benwaggoner

18th June 2020, 02:04

My 7900X (OC) can also play that video in 8K without drops, at 40% usage at most - straight in Chrome 83. Make sure its not your GPU that struggles downscaling the 8K video.

With Ryzen making 8 cores available to "mere mortals" for relatively low pricing, I don't think you need any particular "beast" right now, nevermind the next decade.
How well are the fast SW decoders scaling with multiple cores? VP9 struggled to get much parallelism due to some unfortunate serialization in the loop filters and the lack of a clear reference structure that allowed for parallel decoding of non-reference frames.

AV1 is certainly going to be more parallelizable than VP9. But how far have the best decoders gone so far?

huhn

18th June 2020, 04:00

gtx 1070 does not have a HW AV1 decoder...
And mpv uses the dav1d decoder which works on cpu only.

the video is as far as i can see only VP9 8 bit.
if there is an AV1 version i could try it with an R7 3700X which could yield up to ~2.5x more performance than an r5 2600 in this special case
How well are the fast SW decoders scaling with multiple cores? VP9 struggled to get much parallelism due to some unfortunate serialization in the loop filters and the lack of a clear reference structure that allowed for parallel decoding of non-reference frames.

AV1 is certainly going to be more parallelizable than VP9. But how far have the best decoders gone so far?
this doesn't seem to be the case anymore or it's not as bad as it was in the past. while the microsoft VP9 decoder has issues using more than 8 threads, ffmpeg in LAV Filters doesn't have these issues, easily using 70-90% CPU usage on an 8 core 16 thread CPU. it sometimes falls to 50% or lower CPU usage but usually with over 100 FPS.

ChaosKing

18th June 2020, 08:17

I think I activated AV1 on YT, it was under /testtube or so. Just google it.
I used youtube-dl.exe to download the video, it was the av1 version.

@nevcairiel I use Vivaldi and have currently over 100 tabs open xD It can have an effect on performance

AV1 is certainly going to be more parallelizable than VP9. But how far have the best decoders gone so far?

https://www.phoronix.com/scan.php?page=news_item&px=Dav1d-0.7-Performance

Over 300fps for 4k is not bad
I hope 10bit decoding will be faster on ARM soon too. My 4k fire tv stick struggles with 1080p 10bit video

huhn

18th June 2020, 09:26

i'm still confused why YTDL wasn't showing it before, but this makes far more sense to me now.
the stream is very easy to decode: i get about ~40 FPS with 50% CPU usage using LAV Filters from mpc-hc 1.9.4.

soresu

18th June 2020, 11:41

I hope 10bit decoding will be faster on ARM soon too. My 4k fire tv stick struggles with 1080p 10bit video

10 bit isn't likely to get much better on ARM64 as most of the NEON asm has been written at this point - the problem is that all Fire TV products use 32 bit ARM Android as a base due to laziness on their part IMHO, therefore you won't get as much performance out of it as a phone or tablet with the same HW spec.

They probably will fill out the ARM32 NEON asm over time, though I'm not sure if the performance will match the ARM64 equivalent code path.

10 bit on x86 is another matter entirely - as far as I am aware there are zero SIMD assembly optimisations currently for AVX2, SSSE3 or SSE2 where 10+ bpc video is concerned.

Beelzebubu

18th June 2020, 13:35

How well are the fast SW decoders scaling with multiple cores? VP9 struggled to get much parallelism due to some unfortunate serialization in the loop filters and the lack of a clear reference structure that allowed for parallel decoding of non-reference frames.

AV1 is certainly going to be more parallelizable than VP9. But how far have the best decoders gone so far?

I don't think this is true. From a decoder's point-of-view, the reference structure (most notably the cross-frame entropy dependency) and postfilter dependency (tile-crossing) between VP9 and AV1 are the same, and equally parallelizable. dav1d uses the same techniques for frame threading as FFmpeg's native VP9 decoder (ffvp9) and achieves similar concurrency multipliers. Tile threading works the same. See e.g. graphs at https://youtu.be/WgfklAi50nM?t=1378 (VP9 vs. AV1) and https://youtu.be/WgfklAi50nM?t=345 (AV1 tile vs. frame vs. both multithreading), and note that these results are over a year old; dav1d has surpassed ffhevc (SW) decoding speed since then. See also the documentation for the threading model (https://code.videolan.org/videolan/dav1d/-/wikis/Threading-model) in dav1d.

The practical problem in ffvp9 is that it decided (to fit in FFmpeg's more static design) to only allow one threading type (frame or tile) instead of multiple concurrently (frame and tile) like dav1d does. That's the only reason dav1d scales better with multi-threading. We could have resolved that, but it was decided that ffvp9 was fast enough and it wasn't worth it.

(I can explain libaom's and libvpx' threading models if you want to learn more, but since they are a subset of dav1d/ffvp9, I was assuming this would be enough. I'm not familiar with gav1's threading model.)
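
As a rough illustration of why allowing both threading types at once matters, here is a toy model of how many workers a decoder can keep busy. It is not dav1d's or ffvp9's actual scheduler: it assumes evenly sized tiles and ignores the entropy and postfilter dependencies mentioned above, and the function name and parameters are made up for this sketch.

```python
def usable_workers(cores, frames_in_flight, tiles_per_frame, mode):
    """Upper bound on busy workers in a toy model of decoder threading.

    Hypothetical helper for illustration only; real decoders are further
    limited by cross-frame and cross-tile dependencies.
    """
    if mode == "frame":
        units = frames_in_flight                    # one worker per frame
    elif mode == "tile":
        units = tiles_per_frame                     # one worker per tile of one frame
    elif mode == "both":
        units = frames_in_flight * tiles_per_frame  # tiles of several frames at once
    else:
        raise ValueError(f"unknown mode: {mode}")
    return min(cores, units)


for mode in ("frame", "tile", "both"):
    busy = usable_workers(cores=16, frames_in_flight=4, tiles_per_frame=4, mode=mode)
    print(mode, busy)
```

With these made-up inputs, either single threading type keeps only 4 of 16 cores busy, while the combination can in principle fill all 16.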

Blue_MiSfit

20th June 2020, 04:04

Great post! ^

Mr_Khyron

21st June 2020, 15:03

MainConcept Brings Fast, Efficient AV1 Encoding to More Video Platforms
https://www.prnewswire.com/news-releases/mainconcept-brings-fast-efficient-av1-encoding-to-more-video-platforms-301071452.html
SAN DIEGO, June 10, 2020 /PRNewswire/ --
MainConcept, a leading provider of codec and streaming technology to the professional and broadcast industries, today announced it has worked with Intel Corporation to integrate Scalable Video Technology for AV1 (SVT-AV1) encoder into the MainConcept codec portfolio. This move will allow content producers to better leverage the increased compression capacity of AV1 and bring high-performance, scalable and efficient encoding for video-on-demand (VOD) streaming to the ever-growing video delivery marketplace.

https://code.videolan.org/videolan/dav1d/-/tags/0.7.1
dav1d 0.7.1 'Frigatebird' the fast and lean AV1 decoder

This is a minor update of the dav1d decoder, from the 0.7.x branch.

This release increases the speed of decoding on ARM32 by up to 28%,
adds some SSE2 optimizations, some AVX2 for MC scaled
and fixes a couple of minor issues.

soresu

23rd June 2020, 16:44

And mpv uses the dav1d decoder which works on cpu only.

It has some code that can run on the GPU, but from what I can gather it only increases power efficiency on mobile systems.

Whether this is because they only tested 8 bpc content, which is optimized up the yin yang on most CPU SIMD ISAs at this point, I don't know. I have yet to see more comprehensive testing of the GPU code that covers 8 bpc and 10+ bpc content separately.

From what I can gather there is also not much code that runs on the GPU at this point, though this could always change in the future.

benwaggoner

23rd June 2020, 22:26

It has some code that can run on the GPU, but from what I can gather it only increases power efficiency on mobile systems.

Whether this is because they only tested 8 bpc content which is optimized up the yin yang on most CPU SIMD ISA's at this point I don't know, I have yet to see a more comprehensive testing of the GPU code shown which encompasses 8 bpc and 10+bpc content separately.

From what I can gather there is also not much code that runs on the GPU at this point, this could always change in the future.
GPU accelerated decode tends to only be a tactical thing used in the early days of a codec. If it catches on, it gets implemented in hardware (CPU, GPU, SoC). Otherwise CPUs get fast enough that all-software decode gets used.

Stuff like CABAC is not well suited to run on a GPU, so an all GPU compute decoder hasn't really been feasible or worthwhile.

NikosD

26th June 2020, 14:38

If it catches on, it gets implemented in software.
You mean hardware, obviously.

benwaggoner

26th June 2020, 19:13

You mean hardware, obviously.
Oops, yes indeed. Thanks for the catch; I have corrected it.

mzso

28th June 2020, 10:52

My 7900X (OC) can also play that video in 8K without drops, at 40% usage at most - straight in Chrome 83. Make sure its not your GPU that struggles downscaling the 8K video.

With Ryzen making 8 cores available to "mere mortals" for relatively low pricing, I don't think you need any particular "beast" right now, nevermind the next decade.

Well, I have an RX 580 with the R5 1600, and it's the CPU that I see saturated, with both AV1 and VP9. madVR shows decoder queue 1-4/4, upload/render 1-2/4, present 0-1/4.

It doesn't seem like the GPU gets a chance to be too slow.

By the way I tried these videos' 8k AV1/VP9 streams:
https://www.youtube.com/watch?v=zCLOJ9j1k2Y
https://www.youtube.com/watch?v=1La4QzGeaaQ

huhn

28th June 2020, 11:41

zen 1 and 1+ have a slow AVX/AVX2 implementation: they need 2 cycles for such an operation, unlike intel and zen 2 which can do it in 1, giving those up to 2x the throughput in the AVX2-heavy parts of decoding modern codecs.
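
To put a rough number on that, here is a back-of-the-envelope Amdahl's law estimate: if only part of the decode time is spent in 256-bit SIMD code, doubling SIMD throughput speeds up the whole decoder by less than 2x. The SIMD fractions below are made-up inputs for illustration, not measurements of any decoder.

```python
def overall_speedup(simd_fraction, simd_speedup=2.0):
    """Amdahl's law: whole-program speedup when only a fraction gets faster."""
    if not 0.0 <= simd_fraction <= 1.0:
        raise ValueError("simd_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - simd_fraction) + simd_fraction / simd_speedup)


# Hypothetical shares of decode time spent in 256-bit SIMD code.
for fraction in (0.5, 0.8, 1.0):
    print(f"{fraction:.0%} SIMD -> {overall_speedup(fraction):.2f}x overall")
```

Only when all of the time is in SIMD code does the full 2x show up; at a 50% share the estimate is about 1.33x.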

these are 8k 60, this is too much for my CPU too, it's not even close.
edit: i take it back, i get it barely working with EVR-CP.

marcomsousa

1st July 2020, 18:17

Chrome 85 will have AVIF support by default (Beta Jul 23, release Aug 25)

Yups

9th July 2020, 21:57

As for Intel's Gen12, their open source driver did enable AV1 decoding: https://github.com/intel/media-driver/commit/9491998f40d496fc458d282f213c0e9e945b8062

In one file (https://github.com/intel/media-driver/blob/7249f5c0d5185950da66ba9cd3d94defc19e2468/media_driver/agnostic/gen12/codec/shared/codec_def_common_av1.h) it says MaxTile is 4096x2304, this coincides with the leak about Tigerlake 4k60 video.

huhn

10th July 2020, 00:10

great, there can't be enough hardware decoders for promising codecs.

Yups

10th July 2020, 15:52

benwaggoner

10th July 2020, 17:22

Not even 8K 12 bit sounds strange, there is no real need for 12 bit. Actually it's similar to Gen9.5 which supports 8/10 Bit 4:2:0 VP9/HEVC, even though it did support 8K afaik. Apparently it's not a trivial VP9 copy and paste, you should be happy that at least one major GPU IHV is going to support AV1 this year even if it's not the super highest possible variant.
AV1 is a pretty complex bitstream to decode.

For moving image content for humans to watch at a distance where they can comfortably see the action in all four corners of the screen, 8K is pretty much indistinguishable from 4K. No one is actually mastering premium content in 8K (some 8K cameras get used, but all post is done in 4K, and often 2K for VFX).
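
A rough way to sanity-check this is pixels per degree of visual angle. The sketch below assumes a 40 degree horizontal field of view and the common rule of thumb that 20/20 vision resolves about 60 pixels per degree. Both numbers are assumptions for illustration, and the calculation averages across the screen rather than modelling the centre and edges separately.

```python
ACUITY_PPD = 60.0  # assumed rule-of-thumb limit for 20/20 vision


def pixels_per_degree(h_pixels, fov_deg):
    """Average horizontal pixels per degree of visual angle."""
    return h_pixels / fov_deg


# 40 degrees is an assumed "can see all four corners" viewing angle.
for name, width in (("4K", 3840), ("8K", 7680)):
    ppd = pixels_per_degree(width, 40.0)
    print(name, ppd, "above acuity limit" if ppd >= ACUITY_PPD else "below acuity limit")
```

Under these assumptions 4K already lands above the acuity limit, so the extra 8K pixels would be finer than the eye resolves at that distance.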

12-bit has some theoretical value, but it's not like there are native 12-bit panels to watch it on. Given all the dithering that goes on, 10-bit is sufficient in most cases. 12-bit's value would mainly be in not having to worry about dithering in post and encoding with HDR, like how 10-bit pretty much eliminates those concerns with SDR.

All that said, has anyone seen any compelling research on AV1 encoding in HDR or 4K? Pretty much everything I've read outside of marketing and demos has been SDR and 1080p or lower.

LigH

15th July 2020, 14:54

New uploads: (MSYS2; MinGW32 / MinGW64: GCC 10.1.0)

AOM v2.0.0-608-gcae201b6c (https://www.mediafire.com/file/hbra3g35y4eob50/aom_v2.0.0-608-gcae201b6c.7z/file)

rav1e 0.3.0 (01052fc / 2020-07-15) (https://www.mediafire.com/file/pdwer0ky1s7s8ep/rav1e_0.3.0_2020-07-15_01052fc.7z/file)

dav1d 0.7.1 (1317e61 / 2020-07-15) (https://www.mediafire.com/file/3jxrkg62tbp4c3t/dav1d_0.7.1_2020-07-15_1317e61.7z/file)

marcomsousa

17th July 2020, 13:39

YouTube is giving me AV1 videos without any configuration changes, on a fresh Windows 10 install, in Chrome, without logging in to YouTube, on a 499€ laptop with an i5-1035G1 CPU.

Just clicking on the first 5 videos from the main page: 3 are AV01 at 1080p and the others are VP9.

hbbs

17th July 2020, 13:54

Since AOM AV1 2.0.0 was released a couple of months ago, has anyone seen a YouTube AV01 video encoded using it?

I always thought AV01 meant AOM 1.X.X

marcomsousa

17th July 2020, 14:01

AV01 = AOM (AV1)
The next one, AV2, would be AV02 on YouTube.

Also, it is normal that Google isn't running exactly the same version as the public one. A big company normally adds small changes to the source code for its specific platform, or backports fixes.

Note: libaom 2.0.0 has nothing to do with AV02; it is just an improved version of the AV01 encoder.

hbbs

17th July 2020, 14:14

AV01 = AOM (AV1) - the version encoded will not be shared (Also, is normal they adds small changes to source code for specific Google Platform)

The next AV2, would be AV02 in youtube.
Thanks for replying.

Since you mentioned AV2: now that VVC is out, has anything been shared about the development of AV2?

I remember reading somewhere that Apple was pushing AV2 since they joined AOM late in the game.

nevcairiel

17th July 2020, 14:22

AV2 development has practically only just started, it'll be a while before anything substantial can be said about it.

mzso

18th July 2020, 08:52

AV2 development has practically only just started, it'll be a while before anything substantial can be said about it.

Will they do the same old? Stick with the basics and make it more convoluted and computation heavy?

(Somehow I doubt they'll try something different, like Daala did with lapped transforms)

benwaggoner

18th July 2020, 22:26

Will they do the same old? Stick with the basics and make it more convoluted and computation heavy?

(Somehow I doubt they'll try something different, like Daala did with lapped transforms)
More years of patents will have expired, so there are likely patented techniques they couldn't use in AV1 which are now available to use in AV2. Plus they have experience seeing what limitations there are holding back encoders and decoders, and they can engineer around those. Getting more HW decoder vendors involved early could help a LOT. MPEG codecs get way, way more input on how to optimize bitstream design to allow for low-cost HW implementations than AV1 did.

For all the complexity of VVC on the encode side, from everything I've heard it'll still be able to have cheaper, simpler decoders than AV1 can.

mzso

26th July 2020, 18:47

More years of patents will have expired, so there are likely patented techniques they couldn't use in AV1 which are now available to use in AV2. Plus they have experience seeing what limitations there are holding back encoders and decoders, and they can engineer around those. Getting more HW decoder vendors involved early could help a LOT. MPEG codecs get way, way more input on how to optimize bitstream design to allow for low-cost HW implementations than AV1 did.

For all the complexity of VVC on the encode side, from everything I've heard it'll still be able to have cheaper, simpler decoders than AV1 can.

One would hope that there's something better out there than making decades old heritage ever more complicated. Just because wavelets and lapped transforms didn't quite work out, it doesn't mean there isn't a better way to encode video.

Honestly, creating another similar (but more complicated) format AV2 seems really pointless at this point. Either create something revolutionary or keep optimizing AV1 encoding/decoding. AVC has been around for many years and will remain for many more. MPEG-2 and ASP didn't die out yet either.

marcomsousa

26th July 2020, 20:06

AV1 will continue to be optimized for the next 5-10 years without changing format.

Changing format is AV2. The format development process will take 2 to 3 years, plus 5 years for HW decoders and encoders. So development of the new AV2 can start anytime now.

Since software patents expire after 20 years, AV2 can use a lot of old techniques.

Marco Sousa

benwaggoner

27th July 2020, 02:12

One would hope that there's something better out there than making decades old heritage ever more complicated. Just because wavelets and lapped transforms didn't quite work out, it doesn't mean there isn't a better way to encode video.

Honestly, creating another similar (but more complicated) format AV2 seems really pointless at this point. Either create something revolutionary or keep optimizing AV1 encoding/decoding. AVC has been around for many years and will remain for many more. MPEG-2 and ASP didn't die out yet either.
Well, the thing is that iterations and refinement of block-based frequency transform coding keep on showing bigger potential gains than the Big Idea alternative transforms. Some of this could be because of the momentum of R&D around the traditional stuff. Or it could be that we luckily hit upon the right essential transform that balances spatial and temporal prediction better than available alternatives. Arguably modern compression is really "just" elaborations of JPEG and H.261. But oh, how elaborate!

mzso

27th July 2020, 15:03

Well, the thing is that iterations and refinement of block-based frequency transform coding keep on showing bigger potential gains than the Big Idea alternative transforms. Some of this could be because of the momentum of R&D around the traditional stuff. Or it could be that we luckily hit upon the right essential transform that balances spatial and temporal prediction better than available alternatives. Arguably modern compression is really "just" elaborations of JPEG and H.261. But oh, how elaborate!

"Or it could be that we luckily hit upon the right essential transform that balances spatial and temporal prediction better than available alternatives."

I highly doubt it. I think it's more likely the illusion of familiarity and the blinders that come with it.

"Some of this could be because of the momentum of R&D around the traditional stuff."

It certainly seems like the alternatives got only limited effort.

benwaggoner

27th July 2020, 20:47

"Or it could be that we luckily hit upon the right essential transform that balances spatial and temporal prediction better than available alternatives."

I highly doubt it. I think it's more likely the illusion of familiarity and the blinders that come with it.

"Some of this could be because of the momentum of R&D around the traditional stuff."

It certainly seems like that alternatives got only limited efforts on them.
Yeah. It's really hard to disprove the hypothesis that there could be better fundamental ways of encoding.

That said, wavelets sure got a lot of attention for image and motion coding. Good for images, but no one figured out an efficient motion compensation strategy for it.

Daala had a lot of really intriguing notions, but the most interesting stuff in it never really got to a promising proof of concept. Sure, maybe with 10 years of 1000 engineers something could be found. Any alternative transforms have to compete with decades of refinement of block-based frequency coding.

A lot of promising ideas get figured out how to port into a block-based structure. For example, HEVC's transform skip mode can make anime, graphics, and text way easier to encode at low bitrates and high quality. So new features, like have been seen in VVC and AV1, can get included as tools. Arguably, once you have 64x64 or bigger blocks, you've pretty much got all the advantages of wavelet coding already, within a block based model. And intra-frame prediction brings a lot of the potential value of fractal encoding.
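
To make the transform skip point concrete, here is a 1-D toy comparison. It uses a plain floating-point DCT-II, not HEVC's actual integer transform or its transform-skip signalling, and the sample values are made up. It only counts how many values are nonzero before and after the transform for a smooth block versus a single sharp spike, the kind of residual that text and line art tend to produce.

```python
import math


def dct2(samples):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(samples)
    out = []
    for k in range(n):
        total = sum(
            samples[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def nonzero_count(values, eps=1e-9):
    """Number of values that are not (numerically) zero."""
    return sum(1 for v in values if abs(v) > eps)


flat = [10] * 8                    # smooth, photo-like block
spike = [0, 0, 0, 50, 0, 0, 0, 0]  # sharp, text-like residual

for name, block in (("flat", flat), ("spike", spike)):
    print(name, "spatial:", nonzero_count(block), "dct:", nonzero_count(dct2(block)))
```

The flat block collapses from 8 nonzero samples to a single DC coefficient, while the spike goes the other way, from 1 nonzero sample to 8 nonzero coefficients. Skipping the transform for blocks like the second one is the intuition behind the tool.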

One exciting thing (to me at least) about Daala that didn't make it into AV1 was doing frequency-domain prediction, so there was no need to rasterize a frame that wasn't going to get displayed, and dithering didn't need to be included in quantization. It didn't work out for reasons I don't quite recall.

nhw_pulsar

29th July 2020, 20:17

Hello,

Thank you for your nice comment about wavelets for images (as I have also made a wavelet image codec, called NHW...).

You said: "but no one figured out an efficient motion compensation strategy for it". Do you think it is this aspect that prevents organizations such as Alliance for Open Media from starting and supporting a wavelet codec?

Cheers,
Raphael

foxyshadis

1st August 2020, 11:13

Arguably, once you have 64x64 or bigger blocks, you've pretty much got all the advantages of wavelet coding already, within a block based model.
I remember in the late 90's, first working with codecs in code, thinking that 8x8 must be some kind of fundamental limit of DCT, and wavelets must be superior since they can go from 128x128 all the way to 4096x4096 in JPEG2000. No, it turns out engineers were just excited about the new hotness instead of extending the old battleaxe, DCT, plus it would take at least until SSE2 to really be able to optimize transforms larger than 8x8.

I still think that *lets, curvelets, ridgelets, etc, could help further reduce still images/I-frames, but all the new prediction modes have really put a huge dent in how residuals look.

One exciting thing (to me at least) about Daala that didn't make it into AV1 was doing frequency-domain prediction, so there was no need to rasterize a frame that wasn't going to get displayed, and dithering didn't need to be included in quantization. It didn't work out for reasons I don't quite recall.

From the Graveyard of Dead Tools post (https://jmvalin.ca/daala/revisiting/), it just never worked as well as spatial-domain, since it was another NP-hard idea. It's notable that most of the dead tool ideas came from audio coding, which is Monty's real wheelhouse, but Xiph still managed to push the state of the art and conjure up a real codec; I'm still waiting for a good intra paint plugin for Photoshop, because that tool is amazing.

nhw_pulsar

2nd August 2020, 10:28

LigH

3rd August 2020, 07:43

Dead tools and video codecs and wavelets ... hmm ... I believe I still have a copy of Rududu.

nhw_pulsar

3rd August 2020, 09:33

I did not test the Rududu video codec, but I have tested the latest Rududu Image Codec (RIC) and it is very good. If I remember correctly, RIC is kind of an enhanced, state-of-the-art SPIHT, which is a different technology from NHW. As a little story: when the Rududu author released RIC in March 2008, I was totally blown away by its very impressive results on objective metrics like PSNR and by its very good precision. I then realized that I could not reach that level of PSNR and precision with NHW, and so I definitely decided to orient NHW towards neatness and visual aspect.

To come back on-topic: yes, possibly in the late 90's with JPEG2000 wavelets were the hotness, but frankly since 2001, DCT block-based intra prediction + residual coding is really the main research focus of the industry. Wavelet compression research has actually been abandoned for years (by industry): the last release of Dirac was in 2008, the last release of Rududu was also in 2008, Snow is from around 2008, and most of the main ideas of NHW were also developed in 2008...

Who believes organizations like AOM could restart wavelet compression technology today?

Cheers,
Raphael

Yups

7th August 2020, 01:25

A new slide appeared on imgur with the media capabilities of Tigerlake-U.

https://i.imgur.com/Udoa851.png

Previously it was 4k60 and here it's 8k30 AV1.
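
For scale, a quick pixel-rate comparison of the two figures. This assumes "4k" means 3840x2160 and "8k" means 7680x4320; the slide's exact resolutions are not stated here, so treat those as assumptions.

```python
def pixel_rate(width, height, fps):
    """Pixels the decoder must output per second."""
    return width * height * fps


rate_4k60 = pixel_rate(3840, 2160, 60)  # assumed UHD 4K dimensions
rate_8k30 = pixel_rate(7680, 4320, 30)  # assumed UHD 8K dimensions

print(rate_4k60, rate_8k30, rate_8k30 / rate_4k60)
```

Under those assumptions 8k30 is twice the pixel throughput of 4k60, so the new slide would describe a faster decoder, not just a different way of stating the same limit.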

NikosD

7th August 2020, 05:15

A new slide appeared on imgur with the media capabilities of Tigerlake-U.
They have increased the speed of the HEVC/VP9 decoder too, from 8K30 to 8K60.
They have added 12bit HEVC/VP9 decoding.
Also, the SCC in the table means Screen Content Coding; it's an HEVC profile/extension optimized for screen-captured content.
It could be used by streaming apps/services like YouTube, Skype, Zoom, Netflix etc., but I don't know the real use of this extension.
And it's the first time I have seen this in the supported features of any decoder.

foxyshadis

7th August 2020, 06:35

Hello @foxyshadis,

Hope that I am not trolling too much, I of course agree on the technical side with you and the other impressive reference members here, but I contacted you about Xiph as you seem to know well Monty and this organization.

Do you think Xiph can be interested in the NHW Project? Unfortunately I can not have contact with them, and maybe just like Alliance for Open Media, Xiph is not interested in NHW because it does not work for any image resolution? And that's why my submissions at Xiph and AOM are ignored? I thought that NHW could be a good project for Xiph... (that's only my opinion of course), and certainly a better fit than AOM, but maybe Xiph also only supports excellent codecs and they don't estimate that NHW is one of them?

Many thanks.
Cheers,
Raphael

If you have something that pushes the state of the art, especially if it can be dropped in to a small code segment, not the whole codebase, and you are willing to give it away patent-free and can verify that no one else has patents on it, AOM wants to hear from you.

But they had to deal with getting things encoded and decoded in a reasonable time. AV1 seems slow, but it's miles ahead of what it could have been. Like MPEG, it chops out anything that isn't fast enough to make the cut, and maybe a refinement will make it next generation.

nhw_pulsar

7th August 2020, 09:14

If you have something that pushes the state of the art, especially if it can be dropped in to a small code segment, not the whole codebase, and you are willing to give it away patent-free and can verify that no one else has patents on it, AOM wants to hear from you.

But they had to deal with getting things encoded and decoded in a reasonable time. AV1 seems slow, but it's miles ahead of what it could have been. Like MPEG, it chops out anything that isn't fast enough to make the cut, and maybe a refinement will make it next generation.

Many thanks for your answer.

Yes, I think there are new ideas/processings in the NHW Project that can give interesting "state-of-the-art" results. I don't think they are patented because I never saw them described on the Internet nor in the literature, and so I am totally willing to give them to AOM patent-free.

The "big" problem is that these new ideas/processings are completely tailored for wavelet coding and wavelet decomposition, I don't think they are adaptable/transposable to DCT AV1 codebase for example... And so that's maybe why AOM always answered me that they were not interested in NHW?

Cheers,
Raphael

nhw_pulsar

9th August 2020, 17:59

Hello,

Just a quick reply: it seems that wavelets are not well-suited for current highly-efficient video codecs with block-based motion compensation/estimation, and so I don't think AOM wants to include NHW in one of its video codecs...

However NHW seems well-suited for an image codec, because it has state-of-the-art results for 0.4bpp to 2bpp which is the Internet range (NHW is not good for extreme compression for now, which can also be a problem for a video codec...), it is also very fast which is an advantage for mobile devices...
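
To give a feel for what that bpp range means in file size, here is a small conversion from bits per pixel to bytes. The 1920x1080 image size is just an example chosen for illustration.

```python
def compressed_bytes(width, height, bpp):
    """Approximate compressed size in bytes for a given bits-per-pixel rate."""
    return width * height * bpp / 8


# Example image size (an assumption, not from the post above).
for bpp in (0.4, 2.0):
    size = compressed_bytes(1920, 1080, bpp)
    print(f"{bpp} bpp -> about {size / 1000:.0f} kB")
```

For a 1080p image the 0.4 to 2 bpp range works out to roughly 104 kB to 518 kB.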

Again I am totally open to give my technology to AOM for free, and maybe they'll review it, but for now, all the answers I had from AOM, Google, are: "sorry, we are not interested" or "sorry, we don't have time to study your work"... This is very brief... @foxyshadis, I am very sorry for my impoliteness, maybe you would have contact within AOM and maybe you could inform me what's blocking with NHW? What would need to be changed/improved? Because it would help me a lot to have such advice, and to eventually know what to improve and maybe then become of consideration/interest for AOM?

Cheers,
Raphael

benwaggoner

11th August 2020, 00:27

From the Graveyard of Dead Tools post (https://jmvalin.ca/daala/revisiting/), it just never worked as well as spatial-domain, since it was another NP-hard idea. It's notable that most of the dead tool ideas came from audio coding, which is Monty's real wheelhouse, but Xiph still managed to push the state of the art and conjure up a real codec; I'm still waiting for a good intra paint plugin for Photoshop, because that tool is amazing.
Anyone who has some idea they are sure is brilliant in video coding needs to read that Graveyard of Dead Tools post to see how all sorts of smart ideas wind up not being of practical advantage. It's a good reinforcer of humility.

++ on the intra paint plugin idea!

nhw_pulsar

11th August 2020, 10:16

Anyone who has some idea they are sure is brilliant in video coding needs to read that Graveyard of Dead Tools post to see how all sorts of smart ideas wind up not being of practical advantage. It's a good reinforcer of humility.

++ on the intra paint plugin idea!

Yes, I have also theoretical ideas for a wavelet video codec, but I also fear that they turn out of no practical advantage.

Very quickly, I wanted to rectify my previous post because I completely forgot that an engineer from Google told me that NHW has serious aliasing and discoloration artifacts that must be corrected. For aliasing, I thought about a post-processing function in the decoder which would detect aliasing and remove it from the decoded Y luma component, but I must admit that I am ultra lazy and also demotivated for now... Discoloration can be corrected, but I want to do this with a Chroma from Luma technique because it will also save quite a lot of bits.

So yes, NHW has some drawbacks, and there is a reason why the industry has chosen AVIF and JPEG XL as the new image compression standards. I think they have certainly evaluated the pros and cons of the different solutions/codecs and made that choice, and I totally respect it of course, because they are a lot more skilled than me at evaluating it.

Just to finish, if I can advertise my skills, I think I have a good knowledge of wavelet coding, and if you would have such projects, I am very interested and could work on it with a freelance contract for example... Image/video compression is a passion for me (and also as I struggle hard with jobs here), and I would like to live off it now...

I will try not to pollute that much the AOM thread now.

Cheers,
Raphael

benwaggoner

13th August 2020, 20:05

So yes NHW has some drawbacks, and there is a reason why the industry has chosen AVIF and JPEG XL as the new image compression standards. I think they have certainly evaluated the pros and the cons of the different solutions/codecs, and so made that choice, and I totally respect it of course because they are a lot more skilled than me to evaluate it.
There is also a HUGE advantage to technologies that get broadly implemented in HW decoders. The long term trend is absolutely towards using IDR frames of video codecs for still image encoding to maximize decode speed and reliability. JPEG in software is okay because it is very simple and fast to decode. But with more complex and efficient image coding, decoding complexity goes up and HW has an advantage. An individual frame isn't such a big deal, but doing things like generating lots of thumbnails from JPEG can be quite slow even on fast computers today.

Just to finish, if I can advertise my skills, I think I have a good knowledge of wavelet coding, and if you would have such projects, I am very interested and could work on it with a freelance contract for example... Image/video compression is a passion for me (and also as I struggle hard with jobs here), and I would like to live of it now...
For an individual contributor, the real money is in better implementation of standards than in trying to create new standards or formats. Figuring out how to tune video encoders for better still images would be a valuable offer as a contractor. While the bitstream is the same, there's lots of stuff that an encoder does to optimize for moving images that isn't appropriate for still images. Once interframe coherancy is irrelevant, lots of different choices become optimal. For example, x264's --tune stillimage mode really:

- stillimage (psy tuning):
--aq-strength 1.2
--deblock -3:-3
--psy-rd 2.0:0.7
And even those didn't get much empirical testing.
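
As a rough sketch, the three settings listed above can be spelled out explicitly when building an x264 command line, which makes it easy to try alternative values per image. The helper name `build_x264_args`, the `overrides` parameter, and the file names are hypothetical and only for illustration; the flags are the ones quoted above plus x264's `--output`.

```python
import shlex

# Settings applied by x264's "--tune stillimage", as listed in the post above.
STILLIMAGE_TUNE = {
    "--aq-strength": "1.2",
    "--deblock": "-3:-3",
    "--psy-rd": "2.0:0.7",
}


def build_x264_args(input_path, output_path, overrides=None):
    """Return an x264 argument list with the stillimage settings spelled out.

    `overrides` (hypothetical helper parameter) replaces individual values,
    e.g. {"--psy-rd": "1.0:0.5"}, so different tunings can be compared.
    """
    settings = dict(STILLIMAGE_TUNE)
    settings.update(overrides or {})
    args = ["x264"]
    for flag, value in settings.items():
        args += [flag, value]
    args += ["--output", output_path, input_path]
    return args


if __name__ == "__main__":
    # File names are placeholders; this only prints the command, it does not run x264.
    print(shlex.join(build_x264_args("still.y4m", "still.264")))
```

Sweeping `overrides` over a few values and comparing the results by eye would be one way to get the kind of empirical testing these defaults never had.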

Given the huge increase in tools available in AV1, HEVC, and VVC, I'm sure optimal tunings would be correspondingly more complex. And improving content adaptation is a huge deal. Coding a pure natural image photograph is very different from encoding a screenshot, which is different from an image that combines rendered text, graphics, and natural photography.

nhw_pulsar

13th August 2020, 20:47

There is also a HUGE advantage to technologies that get broadly implemented in HW decoders. The long term trend is absolutely towards using IDR frames of video codecs for still image encoding to maximize decode speed and reliability. JPEG in software is okay because it is very simple and fast to decode. But with more complex and efficient image coding, decoding complexity goes up and HW has an advantage. While an individual frame isn't such a big deal, doing things like generating lots of thumbnails from JPEG can be quite slow even on fast computers today.

Yes, I agree with you that HW decoders have an advantage. But I still wanted to emphasize that NHW is extremely fast to encode/decode, and I even think that a software NHW decoder will be much faster than hardware HEVC, AV1, and VVC decoders. For example, with the same level of (software) optimization, NHW is around 15x faster to decode than x265 (optimized HEVC)!

For an individual contributor, the real money is in better implementation of standards than in trying to create new standards or formats. Figuring out how to tune video encoders for better still images would be a valuable offer as a contractor. While the bitstream is the same, there's lots of stuff that an encoder does to optimize for moving images that isn't appropriate for still images.

Yes, it could be very interesting to tune video encoders for better still images, because I generally find that their output lacks neatness, at least as a still image. I have also developed processing steps that enhance neatness and that are not related to wavelet coding, so they are transposable to any compression scheme. Yes, neatness is very subjective, but for me, although NHW has far worse PSNR and SSIM scores than x265 and AVIF, I still find its results visually more pleasant. So I do think that psychovisual tuning for still images is very important, and it would be great to work on it.
(As a bit of backstory: I did not intend to create a new standard. I just had a very interesting university course on wavelets in 2004-2005 and had absolutely no knowledge of DCT, so I naturally gravitated towards wavelets and played with them at home to see how far they could go...)

Many thanks again for your answer and your time Sir.
Cheers,
Raphael

foxyshadis

21st August 2020, 05:46

Many thanks for your answer.

Yes, I think there are new ideas/processing steps in the NHW Project that can give interesting "state-of-the-art" results. I don't think they are patented, because I never saw them described on the Internet or in the literature, and so I am totally willing to give them to AOM patent-free.

The "big" problem is that these new ideas/processing steps are completely tailored for wavelet coding and wavelet decomposition; I don't think they are adaptable/transposable to the DCT-based AV1 codebase, for example... And so maybe that's why AOM always answered me that they were not interested in NHW?

Cheers,
Raphael

Unfortunately, before they're willing to consider the technical merits of your idea, you have to prove the legal merits, namely that it is not patented, that it's different enough from similar patents, or that you own a patent to the technology that you're willing to sign over to AOMedia. "Haven't seen it before" isn't enough of a guarantee, because there are just way too many niche things in journals and patents out there.

AV1 does have a few pieces that really were designed specifically to benefit non-standard use cases, like still images and desktop streaming, so it's not entirely done for. But they won't take anything for AV2 that doesn't pass the patent minefield.

And yeah, you'll have to make some attempt at integrating the tool to prove it can help some use case, otherwise it's just another idea on the Mount Everest of ideas.

nhw_pulsar

21st August 2020, 08:37

Unfortunately, before they're willing to consider the technical merits of your idea, you have to prove the legal merits, namely that it is not patented, that it's different enough from similar patents, or that you own a patent to the technology that you're willing to sign over to AOMedia. "Haven't seen it before" isn't enough of a guarantee, because there are just way too many niche things in journals and patents out there.

AV1 does have a few pieces that really were designed specifically to benefit non-standard use cases, like still images and desktop streaming, so it's not entirely done for. But they won't take anything for AV2 that doesn't pass the patent minefield.

Yes, you're right, and I completely understand AOM's very deep imperatives concerning patents. It's just that, for example, searching the whole US patent database for prior art/patents will be quite a hard task for me... and unfortunately I don't have the money to pay a patent lawyer for that either... Do you think I can receive some help with that task? Another thing that will be hard to defend is that most of the main ideas of NHW date from 2007-2008, but from 2007 to 2012 NHW was closed-source; it has only been open-source since 2012.

>AV1 does have a few pieces that really were designed specifically to benefit non-standard use cases

That's great news, and it gives me a little hope now with AOM, thank you for letting me know, even if I know that it will be very difficult. But first, how can I clear the patent concern?

And yeah, you'll have to make some attempt at integrating the tool to prove it can help some use case, otherwise it's just another idea on the Mount Everest of ideas.

Quite frankly, it will be very difficult for me to integrate NHW or some of its tools into the AV1 code... As for the tools, the main idea not related to wavelets that comes to mind is to perform a pre-sharpening (with a Laplacian kernel, quite an old technique) of the Y component (at the very beginning, just after colorspace conversion) to enhance the neatness of the results, but I don't know if it'll work with AV1, because I have read that DCT quantization naturally tends to sharpen the image...
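
To make the pre-sharpening idea concrete, here is a minimal generic sketch of Laplacian sharpening on the luma plane. It is not NHW's actual code: the BT.601 luma weights, the 4-neighbour kernel, the edge replication, and the `strength` parameter are all assumptions chosen for illustration.

```python
def rgb_to_luma(r, g, b):
    """BT.601 luma from 8-bit RGB (assumed conversion; the post does not name one)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def sharpen_luma(y, strength=1.0):
    """Sharpen a 2-D luma plane (list of rows) with a 4-neighbour Laplacian.

    out = y + strength * (4*y - up - down - left - right), clamped to 0..255.
    Border pixels are handled by replicating the nearest edge sample.
    """
    h, w = len(y), len(y[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            up = y[max(i - 1, 0)][j]
            down = y[min(i + 1, h - 1)][j]
            left = y[i][max(j - 1, 0)]
            right = y[i][min(j + 1, w - 1)]
            lap = 4 * y[i][j] - up - down - left - right
            row.append(min(255.0, max(0.0, y[i][j] + strength * lap)))
        out.append(row)
    return out


if __name__ == "__main__":
    # A small step edge: flat areas stay put, the edge gets over/undershoot.
    plane = [[50, 50, 200, 200]]
    print(sharpen_luma(plane, strength=0.25))
```

Flat regions are left unchanged and only edges are exaggerated, which is why the step would run before the transform and quantization stages of whatever codec follows.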

Cheers,
Raphael

nhw_pulsar

21st August 2020, 23:26

@foxyshadis (and to the other members),

I have read your post today that SVT-AV1 will become the AV1 "production" encoder because of its reasonable complexity, and you also wrote: "aomenc will continue as a research codec for AV2 development."

So AV2 will be based on aomenc, and so I guess encoding time certainly won't be the problem. I have even read that a researcher at an AOM founding member said that for AV2 they are deeply committed to introducing ML/AI, for example for a good representation (/segmentation) of objects/shapes, to better understand their motion and so further improve compression.

So from my understanding, AV2, based on aomenc, will be an experimental research codec that will compress further than AV1 and VVC and so will have exceptional PSNR and SSIM scores, at the expense of very "huge" encoder (/decoder) complexity/time. When I try to think about NHW in that picture, it actually seems to be the opposite, because NHW's strong point is an extremely fast encoder/decoder with a very good visual aspect but poor PSNR and SSIM scores.

So I am starting to have big doubts again that AOM could be interested in NHW for AV2. Add to that all the negative answers I have had from AOM over the last few years, and all this makes me very pessimistic again...

@foxyshadis, could you confirm what you said, and do you really think AOM could be interested in NHW for a few non-standard use cases like still images?

It would be great if you could give me your point of view on NHW and the AOM codecs, even if it is severe and negative (it would still help me).

Cheers,
Raphael