Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
28th July 2021, 19:42 | #61 | Link |
Registered User
Join Date: Apr 2017
Posts: 171
Hello,
Just a quick note, as I wanted to correct that I did not have a bad experience with AOM: recently the director of image compression at a big company that is an AOM founding member (I cannot give the name) told me that he found the NHW codec/technology very interesting, but he absolutely cannot guarantee that NHW will be considered/studied by AOM. And this is what I feel actually: NHW maybe didn't collect a majority of interest/votes among AOM members to go to an exploration phase, notably perhaps because of the established PSNR and SSIM evaluation and rating protocol. If you have any suggestions to "move the lines" so that NHW could enter the exploration phase of AOM, do not hesitate to speak up! Cheers, Raphael |
28th July 2021, 20:50 | #62 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Yeah, lots more tools get proposed for codecs than get adopted. And psychovisual-optimization-only tools get looked at later in development.
In proposing a new intra coding approach, it's essential to demonstrate that the new intra features provide good references for later predicted frames. Stuff that can optimize well for a single image may not be well suited for long-GOP interframe encoding. An I-frame needs to be a good reference for a P-frame, which needs to be a good reference for a later P-frame, which in turn needs to be a good reference for the B-frames between them. How your "neatness" approach would apply to interframe encoding isn't immediately obvious to me, nor would it be for others. No idea if your tech is good at that or not. Demonstrating that it IS good at that would be table stakes. |
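A toy numeric model can make the reference-chain argument concrete. Everything below is invented for illustration (these are not measurements of any real codec): each frame's error is a carried-over fraction of its reference's error plus fresh residual coding error, so a worse I-frame taxes the whole GOP:

```python
# Toy model of error drift along a GOP reference chain.
# Each predicted frame is coded on top of a lossy reference, so a
# fraction of the reference's error survives into every later frame
# until the next I-frame resets the chain.  All constants invented.

def gop_drift(gop_len, intra_err, residual_err=0.3, carry=0.5):
    """Return per-frame error for one GOP: an I-frame, then P-frames.

    carry: fraction of the reference's error that leaks into the next
    frame (motion compensation never cleans up perfectly).
    """
    errs = [intra_err]
    for _ in range(gop_len - 1):
        errs.append(carry * errs[-1] + residual_err)
    return errs

good_intra = gop_drift(8, intra_err=1.0)    # well-behaved reference
drifty_intra = gop_drift(8, intra_err=3.0)  # pretty but poor reference
# The worse I-frame raises the error of every frame in the GOP,
# which is why "good still image" is not enough on its own.
```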
28th July 2021, 21:33 | #63 | Link | |
Registered User
Join Date: Apr 2017
Posts: 171
Quote:
Thank you for your answer. Yes, you're right, but actually I don't have enough knowledge/background in inter-frame coding, so I don't know if NHW intra could efficiently code the residuals of predicted inter-frames. What I can say is that I think the results will be different; neatness/sharpness can be modulated in NHW according to the needs of inter-frame residuals, but clearly NHW washes details out, especially at high compression. So I just think that NHW will be different from the classic DCT block-based technique, but I cannot say if it will be better or worse. And that is why I expected that NHW could enter an exploration phase at AOM/MPEG, in order to find competent people who could have some "dedication" to make it work well in a video codec with inter-frames. |
30th July 2021, 16:05 | #64 | Link |
Registered User
Join Date: Apr 2017
Posts: 171
Hello,
Yes, we can read that the evolution of image/video compression will be AI, and that AOM and MPEG are definitely turned toward this goal. We can also read that this will happen in two phases: at first, the traditional codec structure will be kept but each processing step and tool will be replaced by a better AI/machine-learning equivalent, and then, as the technology matures, totally AI-native video codecs will appear. So yes, as some people have also told me, I should start to forget about my wavelet codec, even though it will still be extremely faster. But clearly you can also tell me that today, if you want a fast codec and speed is your need, then you stick with H.264. Do you think there is a market for an extremely low-power codec that is much faster than H.264 and has better visual quality than HEVC? Or does NHW definitely have no future? Could NHW intra find a special use case in the AI era? Any insights very welcome! |
30th July 2021, 20:23 | #65 | Link | ||
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
But there are compute limitations as well. ML models take a lot of FLOPS and watts to run, which is antithetical to the tremendous focus in codec development on keeping HW decoder implementations cheap and power-efficient. Modern codecs are really good at reproducing a good image with remarkably few bits. Downloading new ML models per-title could work, but those are bits as well. I can't imagine a new fundamental transform in ML would be practical; ML would make more sense for higher-level features. But the more models are updated and switched, the more model overhead will inflate the bitrate. And realtime generation of models during live encoding would not make sense as a hard requirement; the codec would still need to be able to fall back to a no-ML-required mode. Fixed ML models included in the decoders would get around the bitrate issue. But if there's a fixed set of models, a more efficient approach would be to reverse-engineer algorithms out of the models for massively improved decoder efficiency. A "pure" ML codec is hard to imagine. It's not like even machine vision puts an array of pixels into an ML model and goes from there. Lots of codec-like stuff like frequency transforms are applied to create the base data that the ML models receive as input. Expecting an ML model to rediscover chroma subsampling, DCT-like transforms, and the other stuff that makes a VVC possible seems really unlikely, and the compute required to play such a thing back would be orders of magnitude beyond what the market would accept. What is more likely is that the MPAI project will identify individual tools where having optional ML enhancement provides worthwhile benefits. Those will most likely be in the tools where individual parameters get sweated over, and there's a lot of complexity in tradeoffs between signaling costs and benefits of additional modes. Some of those tools will find benefit on the encoder side that doesn't require decoder-side ML.
For example, dynamic qcomp, loop filter strength, adaptive quantization modes and strengths, psychovisual tools, grain/noise modeling and optimization. Tools that are very content-dependent, difficult to find a single "correct" value for a given title, or impact multiple pixels are the most likely to have value. Sample Adaptive Offset comes to mind. ML-based parameter selection certainly would be a benefit, as current encoders just have an on/off switch. Quote:
So, I'm always glad to see experimentation with other fundamental approaches. But it'd take a pretty big team to have a chance of adapting wavelets into something promising. And now that we have intra-frame references in HEVC and later, a lot of those motion estimation refinements can be applied to still images as well, almost making the old "Fractal" IVF approach a subset. As software H.264 is universal, an alternative image codec would need substantially better compression efficiency than an H.264 IDR frame across a wide variety of content types, running in a JavaScript engine with performance at least as good as a software H.264 decoder. It wouldn't need to be just vanilla JS; WASM would be broadly available in this timeframe, and WebGPU may become so. Do you think your encoder could get there? |
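On the per-title model-download point above, some back-of-envelope arithmetic (all figures are invented for illustration) shows why model overhead depends so strongly on content length:

```python
# Rough cost of shipping a per-title ML model alongside the stream,
# expressed as a percentage of the stream's own bits.

def model_overhead_pct(model_mb, duration_s, stream_kbps):
    stream_bits = stream_kbps * 1000 * duration_s
    model_bits = model_mb * 8 * 1_000_000
    return 100.0 * model_bits / stream_bits

# Hypothetical 20 MB model, 5 Mbps stream:
movie_pct = model_overhead_pct(20, duration_s=2 * 3600, stream_kbps=5000)
clip_pct = model_overhead_pct(20, duration_s=30, stream_kbps=5000)
# ~0.4% of a two-hour movie, but more bits than a 30-second clip
# itself -- per-title models only amortize over long content.
```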
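The content-dependent tools listed above (adaptive quantization, loop filter strength, SAO) are per-block decisions that today run on hand-tuned heuristics, which is exactly the slot ML parameter selection would fill. A crude non-ML stand-in, with made-up thresholds, just to show the shape of the decision:

```python
# Sketch of the kind of content-adaptive decision an ML model might
# replace: choose an adaptive-quantization strength per block from
# local variance.  Pure heuristic with invented thresholds.
from statistics import pvariance

def aq_strength(block):
    """Flat blocks get protected (banding is visible there); busy
    blocks can hide quantization noise, so they get weaker AQ."""
    v = pvariance(block)
    if v < 10.0:
        return 1.0    # flat: spend bits to avoid banding
    if v < 100.0:
        return 0.5    # moderate texture
    return 0.1        # detailed/noisy: masking hides errors

flat_block = [128.0] * 256
busy_block = [(i * 37) % 256 for i in range(256)]  # pseudo-random texture
```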
30th July 2021, 21:15 | #66 | Link |
Registered User
Join Date: Apr 2017
Posts: 171
Thank you benwaggoner for your insightful, very professional opinion. Very instructive, even if I think that other experts in higher places don't share your opinion on AI compression (based on what they write on the Internet).
I was also thinking of using ML to tackle aliasing analysis/detection in NHW decoded images, with a relatively small model trained to detect aliasing patterns (often relatively similar) in NHW, provided it can also be fast to compute. >Do you think your encoder could get there? I have hardly any knowledge of motion estimation/compensation, so I cannot turn NHW into a good video codec. For the NHW intra image codec, I find that it is visually more pleasant than H.264, HEVC and AV1 because it has more neatness (but again that is only my personal opinion; for example, AOM and MPEG don't seem to share it). Concerning speed, it's clear there is no debate: NHW is a lot faster to encode/decode than H.264. NHW has strong points like speed and neatness, but these are not the ones that AOM and MPEG have chosen to focus on; they are more in a race to ever better PSNR and SSIM curves with ever more complex solutions. In this landscape, do you think NHW could have a (little) future? Many thanks again Sir. Cheers, Raphael |
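For what it's worth, the aliasing-detection idea can be prototyped without ML at all: stair-step aliasing concentrates energy at the very top of the spectrum, and the Nyquist component of a row is simply its correlation with the pattern +1, -1, +1, ... Below is a minimal sketch along those lines (the 0.25 threshold is an invented placeholder; a small trained model, as described above, could replace the whole heuristic):

```python
# Cheap aliasing flag: fraction of a row's AC energy sitting in the
# Nyquist (pixel-alternating) component.  Thresholds are invented.

def nyquist_ratio(row):
    mean = sum(row) / len(row)
    ac = [x - mean for x in row]                      # drop DC
    nyq = sum(((-1) ** i) * x for i, x in enumerate(ac)) / len(ac)
    total = sum(x * x for x in ac) / len(ac)
    return 0.0 if total == 0 else (nyq * nyq) / total

def looks_aliased(row, threshold=0.25):
    return nyquist_ratio(row) > threshold

smooth = [i / 63.0 for i in range(64)]        # gentle ramp: no flag
jaggy = [(i % 2) * 1.0 for i in range(64)]    # 1-pixel stair-step: flag
```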
31st July 2021, 11:23 | #67 | Link | |
Lost my old account :(
Join Date: Jul 2017
Posts: 325
Quote:
How does it perform for visually lossless compression? Could it be turned into a mezzanine/DI format? Contribution? I think you need to find a specific use case/product that this solves for it to become anything. For example, look at JPEG XS. As someone in the broadcasting world working with SDI, it's a tech I'm super hyped on, as we are currently migrating from baseband SDI to IP/Ethernet, and since a 1080 signal is either 1.5 Gbps or 3 Gbps, or 12 Gbps for UHD, we cannot fit our signals in standard GbE equipment. So something like the embronix/riedel SFPs is amazing: they can do realtime visually lossless compression at 7.5 W with only 20 lines of latency, so they can create a gateway that lets us transport 40 visually lossless 1080i signals over one 10Gb cable. It will be huge for transporting realtime video in our industry. What industry would benefit from your tech? What is it going to be used for? Just having good compression isn't enough; it needs to solve an issue people are having. If it could be used for visually lossless compression, one area where I would very much like to see more options is mezzanine/DI codecs. With the huge increase in resolutions over the last couple of years, the data rates of a lot of the current ones have become rather cumbersome; the ones that are still on the lower end of the spectrum are based on H.264/HEVC, and for most of the high-end formats (4:2:2, 10-bit etc.) hardware decoding and encoding isn't widely supported, making them a pain to work with. I would love to see something as lightweight as ProRes (or more so) at lower data rates. Working with an NLE vendor could be a possibility. Again, as someone in broadcasting working with XDCAM 50, I can tell you that it raised quite a few eyebrows when I tried to explain that UHD would create a massive increase in file sizes for us: "Wait, what, how can it increase by more than 4x? Isn't XDCAM like from the 90s?" Last edited by excellentswordfight; 31st July 2021 at 12:23. |
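The link-budget arithmetic in that example is worth writing down explicitly (nominal SDI rates; the visually lossless range for JPEG XS below is the commonly cited figure, not a measurement):

```python
# How much compression is needed to carry N SDI signals over a link?

def required_ratio(n_signals, gbps_per_signal, link_gbps=10.0):
    return n_signals * gbps_per_signal / link_gbps

hd40_over_10g = required_ratio(40, 1.5)  # 40 x 1080i HD-SDI signals
uhd_over_10g = required_ratio(1, 12.0)   # a single 12G-SDI UHD signal
# 40 x 1.5 Gb/s = 60 Gb/s into 10 GbE needs at least 6:1, which sits
# inside the ~6:1-10:1 range usually quoted as visually lossless for
# JPEG XS; a single UHD signal needs only a gentle 1.2:1.
```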
31st July 2021, 12:14 | #68 | Link | |||
Registered User
Join Date: Apr 2017
Posts: 171
Quote:
Quote:
Quote:
31st July 2021, 12:46 | #69 | Link | ||
Lost my old account :(
Join Date: Jul 2017
Posts: 325
Quote:
Quote:
If you can demonstrate a codec that vastly outperforms ProRes with fast encoding & decoding, I guess that Adobe (as they don't have any codec of their own, unlike some of the competition) and Blackmagic might be interested. Also note that multi-generational encoding is very important for a DI format (it should be possible to re-encode in the same format without much degradation). Last edited by excellentswordfight; 1st August 2021 at 09:16. |
31st July 2021, 14:07 | #70 | Link | ||
Registered User
Join Date: Apr 2017
Posts: 171
Thank you for your help and advice.
Quote:
Quote:
Cheers, Raphael |
3rd August 2021, 18:31 | #71 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
The "neatness" aspect wouldn't be appealing to the mezzanine market at all. The goal of a mezzanine codec is to serve as a high quality input to additional filtering and then recompression with minimum multigenerational loss. This is well beyond visually lossless, to visually lossless after >10 generations of recompression.
PSNR is a much more relevant metric to the mezz use case than for highly compressed distribution to end-users. Also, I don't think that MJPEG has been a computational limitation for even tiny devices for a long time. The MJPEG products were well established 25 years ago, and there have been MANY iterations of Moore's Law since then. |
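The multigeneration requirement is easy to demonstrate with a toy experiment (uniform quantization plus a mild filter stand in for a real process/re-encode cycle; every number here is illustrative, not a property of any actual mezzanine codec):

```python
# Toy multigeneration test: filter (as a post house would between
# generations), then re-quantize, ten times, tracking PSNR back to
# the original.  Real codecs drift for subtler reasons, but the
# compounding shape of the curve is the point.
import math
import random

def psnr(a, b):
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return float("inf") if mse == 0 else 10 * math.log10(255.0 ** 2 / mse)

def one_generation(sig, step=4.0):
    # mild circular 3-tap smoothing = "processing",
    # uniform quantization = "re-encode"
    n = len(sig)
    filtered = [(sig[i - 1] + 2 * sig[i] + sig[(i + 1) % n]) / 4.0
                for i in range(n)]
    return [round(x / step) * step for x in filtered]

rng = random.Random(1)
original = [rng.uniform(0.0, 255.0) for _ in range(256)]

sig, curve = original, []
for _ in range(10):
    sig = one_generation(sig)
    curve.append(psnr(original, sig))
# Loss compounds every generation; a codec aimed at mezzanine use
# has to keep this curve nearly flat for 10+ generations.
```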
3rd August 2021, 19:19 | #72 | Link | ||
Registered User
Join Date: Apr 2017
Posts: 171
Quote:
Quote:
But I am more and more resigned now; NHW will not be used at all by the industry, as was the case for very good codecs made by individuals before me, like the Rududu codec and DLI compression. The problem for me is that it's a real pity, because I love working in image/video compression, in contrast to the very few job positions I can get access to... because with my very bad CV I can't get access to job positions that are interesting (for me); that's also why, in a way, I may sound insistent about NHW and image/video compression on this forum... Last edited by nhw_pulsar; 4th August 2021 at 16:35. |
3rd August 2021, 20:52 | #73 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Your best bet to do something with NHW is to make a JavaScript-compatible implementation of it so people can view the files without needing to use extra software. That would encourage experimentation at least, and make it a potentially viable alternative to AVIF. I imagine NHW would be faster than AVIF due to lower complexity, even as WASM instead of dav1d's hand-tuned AVX2.
3rd August 2021, 21:31 | #74 | Link | |
Registered User
Join Date: Apr 2017
Posts: 171
Quote:
Yes, despite the extreme optimizations of AVIF, I do think that NHW is extremely faster to encode and much faster to decode... but I don't think that NHW will be an alternative to AVIF when all the compression bodies tell you that they are definitely not interested... Sounds incompatible/impossible... |
5th August 2021, 10:37 | #76 | Link | |
Registered User
Join Date: Apr 2017
Posts: 171
Quote:
Many thanks. Raphael |
5th August 2021, 23:48 | #77 | Link | |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Doing AVIF or HEIC in JavaScript is challenging. The relative simplicity of NHW could be an advantage if performant decode is available across a broad swath of web clients, including ARM devices. |
6th August 2021, 09:57 | #78 | Link | |
Registered User
Join Date: Apr 2017
Posts: 171
Quote:
No, there are no JavaScript or SIMD/ARM decoder implementations of NHW for now; this is beyond my skills. I've tried to develop a "good" demo version of NHW that could motivate the industry to look deeper at it. So far it hasn't worked, but yes, that's right, the most essential thing is to have fun; that's the case for now, though it's also true that I don't spend many hours on NHW currently... Cheers, Raphael |
11th August 2021, 18:39 | #79 | Link |
Registered User
Join Date: Apr 2017
Posts: 171
Hello,
As I am very actively searching for a freelance contract with NHW, I came across the LinkedIn profile of Mr. David Ronca, Facebook's Director of Video Encoding, and he says that they are working on low-power video codecs/processing that will consume much less energy for the same quality/compression. I know I come very, very late in the game, but do you think NHW could be interesting for their intra scheme, even if it is not completely validated for now? As Mr. Ronca's email is not public, I wanted to know first if it is worth disturbing these people, or if there is absolutely no way that Facebook would include NHW in their video encoder solution and I should totally forget about contacting them and instead, as was said, refocus on a niche use case? Any opinion is very welcome.
11th August 2021, 21:17 | #80 | Link |
Moderator
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
I believe that project is focused on lowering ENCODING time for video codecs. This would be stuff like the ASICs YouTube is building. Pixels-per-second per watt is quite important for UGC sites like Facebook and YouTube, as they do huge volumes of encoding of files, most of which get watched <5 times, and since the service is free, customer quality expectations are much lower. It's a very different market than for premium content.
Unless you have a compelling argument for why decode would be at least as low-power as AV1 software decoders, and a realistic plan for encoding much faster than AV1 at similar quality, it's probably not a good fit. At a minimum you'd need to show how interframe prediction would work, and work efficiently. |
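The economics behind that can be sketched with invented figures (none of these numbers come from Facebook, YouTube, or anyone else):

```python
# Energy per view = one-off encode cost amortized over views,
# plus a per-view decode cost.  All figures are made up.

def joules_per_view(encode_j, decode_j_per_view, views):
    return encode_j / views + decode_j_per_view

ugc = joules_per_view(encode_j=5_000.0, decode_j_per_view=50.0, views=3)
premium = joules_per_view(encode_j=50_000.0, decode_j_per_view=50.0,
                          views=1_000_000)
# UGC watched 3 times: encode cost dominates, so cheap fast encoding
# wins.  Premium watched a million times: even a 10x costlier encode
# vanishes, and decoder power is what matters.
```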