Doom9's Forum > Video Encoding > New and alternative video codecs
Old 14th June 2021, 12:31   #61  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 688
Quote:
Originally Posted by skal View Post
* video codecs usually don't care about lossless / alpha
Is it a scam, then, for the AVIF container to have lossless functionality added? Recently in the libheif discussion there was a question about adding the vvenc codec. There is no lossless feature in it, so it's not clear how to apply one. The VVC codec does support lossless, but only for test trials.
I don't know how VVC works, but apparently it has alpha channels built in, so the layers do not need separate channels.
Old 14th June 2021, 18:50   #62  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by Jamaika View Post
Is it a scam, then, for the AVIF container to have lossless functionality added? Recently in the libheif discussion there was a question about adding the vvenc codec. There is no lossless feature in it, so it's not clear how to apply one. The VVC codec does support lossless, but only for test trials.
I don't know how VVC works, but apparently it has alpha channels built in, so the layers do not need separate channels.
The AVIF container is HEIF. The HEVC HEIF (HEIC) already supports lossless, as HEVC supports lossless in all profiles. And it also supports alpha channels.

It's pretty trivial to combine a normal YUV image with a lossless luma-only image for the alpha. Most alpha channels will encode to a tiny size with HEVC's lossless mode.
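To make that concrete, the splitting step is straightforward (a rough Python sketch using Pillow; the file names are placeholders, and it is assumed the two planes are then handed to whatever HEVC encoder and HEIF muxer you use):

# Split an RGBA source into a normal color image (for the lossy YUV encode)
# and a luma-only alpha plane (for a lossless monochrome encode).
# Assumes Pillow is installed; "input.png" and the output names are placeholders.
from PIL import Image

rgba = Image.open("input.png").convert("RGBA")

color = rgba.convert("RGB")      # goes to the normal lossy HEVC encode
alpha = rgba.getchannel("A")     # 8-bit grayscale plane for the lossless alpha encode

color.save("color_plane.png")
alpha.save("alpha_plane.png")    # e.g. encode this as a monochrome auxiliary image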

It certainly isn't a "scam!"
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book

Last edited by benwaggoner; 15th June 2021 at 17:45. Reason: Fixed URL
Old 14th June 2021, 19:36   #63  |  Link
jon
Registered User
 
Join Date: Apr 2016
Posts: 4
Quote:
Originally Posted by skal View Post
Robustness to recompression has often been advanced as a critically missing feature. But in practice it's quite difficult to guarantee: any very common editing operation (cropping, resizing, adding text) will ruin any codec's good intentions. And it's not often the case that you need to recompress more than 10-20 times (is it?).
For editing/authoring workflows, lossless is needed exactly because of that: any operation that moves pixels around will produce generation loss in any lossy codec. Also you want to have more precision/dynamic range than what you'll eventually export as a lossy image for end-user delivery.

But generation loss resilience is still useful when further edits and recompression happen after end-user delivery. The typical example is a meme, where an image hops between social media sites (which typically all apply lossy recompression) and occasionally gets edited (typically by changing a text caption or something like that). In these scenarios there is often no editing at all, just recompression, and when there is editing it typically leaves most of the pixels intact (no cropping / rotation / rescaling etc., just replacing some text), so generation loss resilience can be useful.

A meme that goes viral can get recompressed MANY times - people download an image from Twitter, share it with a friend via Whatsapp, they post it on Instagram, then it goes to Facebook, back on Twitter, etc etc. When the artifacts get too bad, people might start looking for a 'cleaner' version, but I have seen images get to a point that probably corresponds to hundreds of generations...
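As a quick illustration of how that plays out (a minimal Python sketch assuming Pillow and numpy; the quality setting and generation count are arbitrary, and real platforms each use different encoders and settings):

# Simulate meme-style generation loss: re-encode the same image as JPEG
# over and over and watch PSNR against the original drop.
import io
import numpy as np
from PIL import Image

original = Image.open("photo.png").convert("RGB")
ref = np.asarray(original, dtype=np.float64)

img = original
for generation in range(1, 51):
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=80)   # one "social media" recompression
    buf.seek(0)
    img = Image.open(buf).convert("RGB")

    cur = np.asarray(img, dtype=np.float64)
    mse = np.mean((ref - cur) ** 2)
    psnr = 10 * np.log10(255.0 ** 2 / mse)
    print(f"generation {generation}: PSNR vs original = {psnr:.2f} dB")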
Old 14th June 2021, 20:32   #64  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by jon View Post
For editing/authoring workflows, lossless is needed exactly because of that: any operation that moves pixels around will produce generation loss in any lossy codec. Also you want to have more precision/dynamic range than what you'll eventually export as a lossy image for end-user delivery.

But generation loss resilience is still useful when further edits and recompression happen after end-user delivery. The typical example is a meme, where an image hops between social media sites (which typically all apply lossy recompression) and occasionally gets edited (typically by changing a text caption or something like that). In these scenarios there is often no editing at all, just recompression, and when there is editing it typically leaves most of the pixels intact (no cropping / rotation / rescaling etc., just replacing some text), so generation loss resilience can be useful.

A meme that goes viral can get recompressed MANY times - people download an image from Twitter, share it with a friend via Whatsapp, they post it on Instagram, then it goes to Facebook, back on Twitter, etc etc. When the artifacts get too bad, people might start looking for a 'cleaner' version, but I have seen images get to a point that probably corresponds to hundreds of generations...
Great insight there.

A great thing about using HEIC is that its lossless mode is quite efficient; a lot better than PNG or lossless JPEG and better than J2K.

There is also the option of mixing lossless and lossy blocks in the same image. For example, lossless for text or graphics and high quality lossy for natural images. And transform-skip blocks can be a great middle ground as well.

I've been able to get HEIC to match JPEG quality at 5% the bitrate for comic book art (all those sharp lines are a poor fit for JPEG).
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 15th June 2021, 10:44   #65  |  Link
Brazil2
Registered User
 
Join Date: Jul 2008
Posts: 530
Quote:
Originally Posted by benwaggoner View Post
file:///C:/Users/benwagg/Downloads/506_hevc_video_with_alpha.pdf
Linking to your hard drive won't work

Here is the public link :
https://devstreaming-cdn.apple.com/v...alpha.pdf?dl=1
Old 15th June 2021, 17:44   #66  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by Brazil2 View Post
file:///C:/Users/benwagg/Downloads/506_hevc_video_with_alpha.pdf
Linking to your hard drive won't work

Here is the public link :
https://devstreaming-cdn.apple.com/v...alpha.pdf?dl=1
DOH!

Thanks for the catch and the fix!
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 16th June 2021, 06:36   #67  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 688
What is in VVC that is not in the x265 codec? And why do libheif containers use a dual x265 codec for only three additional alpha layers?

("Profile", extendedProfile, ExtendedProfileName::NONE, "Profile name to use for encoding. Use [multilayer_]main_10[_444][_still_picture], auto, or none")
("MultiLayerEnabledFlag", m_multiLayerEnabledFlag, false, "Bitstream might contain more than one layer")
("AllLayersIndependentConstraintFlag", m_allLayersIndependentConstraintFlag, false, "Indicate that all layers are independent")
( "MaxLayers", m_maxLayers, 1, "Max number of layers" )
( "MaxTemporalLayer", m_maxTemporalLayer, 500, "Maximum temporal layer to be signalled in OPI" )
("LambdaModifierI,-LMI", cfg_adIntraLambdaModifier, cfg_adIntraLambdaModifier, "Lambda modifiers for Intra pictures, comma separated, up to one the number of temporal layer. If entry for temporalLayer exists, then use it, else if some are specified, use the last, else use the standard LambdaModifiers.")

("SEIACIEnabled", m_aciSEIEnabled, false, "Control generation of alpha channel information SEI message")
("SEIACICancelFlag", m_aciSEICancelFlag, false, "Specifies the persistence of any previous alpha channel information SEI message in output order")
("SEIACIUseIdc", m_aciSEIUseIdc, 0, "Specifies the usage of the auxiliary picture in the alpha channel information SEI message")
("SEIACIBitDepthMinus8", m_aciSEIBitDepthMinus8, 0, "Specifies the bit depth of the samples of the auxiliary picture in the alpha channel information SEI message")
("SEIACITransparentValue", m_aciSEITransparentValue, 0, "Specifies the transparent interpretation sample value in the alpha channel information SEI message")
("SEIACIOpaqueValue", m_aciSEIOpaqueValue, 0, "Specifies the opaque interpretation sample value in the alpha channel information SEI message")
("SEIACIIncrFlag", m_aciSEIIncrFlag, false, "Specifies the opaque interpretation sample value in the alpha channel information SEI message")
("SEIACIClipFlag", m_aciSEIClipFlag, false, "Specifies whether clipping operation is applied in the alpha channel information SEI message")
("SEIACIClipTypeFlag", m_aciSEIClipTypeFlag, false, "Specifies the type of clipping operation in the alpha channel information SEI message")
Old 16th June 2021, 18:38   #68  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by Jamaika View Post
What is in VVC that is not in the x265 codec? And why do libheif containers use a dual x265 codec for only three additional alpha layers?
x265 is a specific HEVC encoder. Others are available and can be used with HEIC, although x265 is quite good at the still-image use case. The HEIF wrapper can use H.264, JPEG, AV1 (AVIF is AV1 in HEIF), and is easily extensible to other still-image codecs.

I don't see where you are getting three additional alpha layers. MaxLayers=1 in your sample.

I hadn't looked at these before. It's clever to allow lambda modifiers to be specified at this level. Alpha channel encoding likely benefits from different lambda tables, and it's more efficient to store it once per HEIF than in the bitstream for each individual alpha channel.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 16th June 2021, 21:31   #69  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 688
Quote:
Originally Posted by benwaggoner View Post
x265 is a specific HEVC encoder. Others are available and can be used with HEIC, although x265 is quite good at the still-image use case. The HEIF wrapper can use H.264, JPEG, AV1 (AVIF is AV1 in HEIF), and is easily extensible to other still-image codecs.

I don't see where you are getting three additional alpha layers. MaxLayers=1 in your sample.

I hadn't looked at these before. It's clever to allow lambda modifiers to be specified at this level. Alpha channel encoding likely benefits from different lambda tables, and it's more efficient to store it once per HEIF than in the bitstream for each individual alpha channel.
The alpha channel for HEIF is a recently added feature. I don't know the spec, but I see a config like this:

two_layers.cfg
#======== Layers ===============
MaxLayers : 2
MaxSublayers : 7
DefaultPtlDpbHrdMaxTidFlag : 0
AllIndependentLayersFlag : 0
#======== OLSs ===============
EachLayerIsAnOlsFlag : 0
OlsModeIdc : 2
NumOutputLayerSets : 2
OlsOutputLayer1 : 1 0
NumPTLsInVPS : 2
#======== Layer-0 ===============
LayerId0 : 0
#======== Layer-1 ===============
LayerId1 : 1
NumRefLayers1 : 1
RefLayerIdx1 : 0
#======== OLS-0 ===============
OlsPTLIdx0 : 0
#======== OLS-1 ===============
LevelPTL1 : 6.2
OlsPTLIdx1 : 1
Old 20th June 2021, 20:50   #70  |  Link
birdie
Artem S. Tashkinov
 
birdie's Avatar
 
Join Date: Dec 2006
Posts: 334
@skal

Thank you for the insightful answer!
Old 21st June 2021, 22:32   #71  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by Jamaika View Post
The alpha channel for HEIF is a recently added feature. I don't know the spec, but I see a config like this:

two_layers.cfg
#======== Layers ===============
MaxLayers : 2
MaxSublayers : 7
DefaultPtlDpbHrdMaxTidFlag : 0
AllIndependentLayersFlag : 0
I've not read the spec to this level, but that looks like a maximum of two layers to me.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 22nd June 2021, 09:55   #72  |  Link
birdie
Artem S. Tashkinov
 
birdie's Avatar
 
Join Date: Dec 2006
Posts: 334
@skal

I've been thinking about your reply and I have an issue with it.

Most modern videos are encoded with I-frames inserted every 1 or 2 seconds, so the I-frame compression ratio is super important: you may optimize your P- and B-frames all you want, but if your I-frames are huge, it's all for naught.

Doesn't that mean that AV1 should surpass anything before it in terms of compression efficiency including WebP/WebP2?
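For scale, here's a back-of-the-envelope calculation of the I-frame share of the bitrate (a toy Python sketch; the frame-size ratio is an assumption for illustration only, not a measurement of any particular encoder):

# How much of the stream do forced I-frames account for, under toy assumptions?
fps = 30
gop_seconds = 2
gop_frames = fps * gop_seconds               # 60 frames per GOP

i_frame_bits = 10.0                          # assume an I-frame costs ~10x an average P/B frame
pb_frame_bits = 1.0

total = i_frame_bits + (gop_frames - 1) * pb_frame_bits
print(f"I-frame share of the bitrate: {i_frame_bits / total:.1%}")   # ~14.5% with these numbers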
Old 22nd June 2021, 17:26   #73  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by birdie View Post
Most modern videos are encoded with I-frames inserted every 1 or 2 seconds, so the I-frame compression ratio is super important: you may optimize your P- and B-frames all you want, but if your I-frames are huge, it's all for naught.

Doesn't that mean that AV1 should surpass anything before it in terms of compression efficiency including WebP/WebP2?
I don't know that it has been demonstrated that AV1 has the most efficient I-frame encoding to date, at least not psychovisually with still images. It might be true, but I'm not aware of any good double-blind testing showing it.

Also, image sizes vary far more than in video (>8K is quite common), still images cover a much wider variety of content, and they require different psychovisual tuning since there is no motion. As old as JPEG is, we've still seen substantial compression-efficiency improvements for it in the last decade.

I wouldn't expect a video encoder out of the box to do optimal image compression. We're probably comparing implementations as much as bitstream potentials.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 22nd June 2021, 19:09   #74  |  Link
nhw_pulsar
Registered User
 
Join Date: Apr 2017
Posts: 170
Video encoders are optimized for PSNR and SSIM, which is no guarantee that a (still) image and its artifacts will be visually pleasant... In my experience that's a big problem: I tried to convince experts that a 30 dB image was visually more pleasant than a 38 dB image, but that was inconceivable to them, and I was told they didn't trust my eyes and would stick with the PSNR numbers...
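For reference, PSNR is nothing more than a log-scaled per-pixel mean squared error, which is exactly why it says nothing about how grain or texture looks (a minimal Python sketch with numpy and Pillow; the file names are placeholders):

# PSNR: 10 * log10(MAX^2 / MSE) over raw pixel differences.
import numpy as np
from PIL import Image

def psnr(ref_path, test_path):
    ref = np.asarray(Image.open(ref_path).convert("RGB"), dtype=np.float64)
    test = np.asarray(Image.open(test_path).convert("RGB"), dtype=np.float64)
    mse = np.mean((ref - test) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

print(psnr("original.png", "decoded.png"))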

I don't find that AV1 surpasses WebP2. AV1 has notably better PSNR, but for those who care a lot about such things, WebP2, the last time I tested it, was far better at preserving details like grain and textures, which is very impressive.
Old 22nd June 2021, 23:31   #75  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by nhw_pulsar View Post
Video encoders are optimized for PSNR and SSIM, which is no guarantee that a (still) image and its artifacts will be visually pleasant... In my experience that's a big problem: I tried to convince experts that a 30 dB image was visually more pleasant than a 38 dB image, but that was inconceivable to them, and I was told they didn't trust my eyes and would stick with the PSNR numbers...
All too common a story. People with an engineering background but not an image-science background can often fall into the "I don't care if it looks bad in practice, it still looks good in theory!" trap.

One of the strengths of x264 and x265 is that they don't optimize for PSNR or SSIM directly, but for their own internal subjective quality metric.

Quote:
I don't find that AV1 surpasses WebP2. AV1 has notably better PSNR, but for those who care a lot about such things, WebP2, the last time I tested it, was far better at preserving details like grain and textures, which is very impressive.
Exactly. PSNR is quite lousy at measuring a lot of those factors, and SSIM is only somewhat better.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 23rd June 2021, 00:21   #76  |  Link
nhw_pulsar
Registered User
 
Join Date: Apr 2017
Posts: 170
Quote:
Originally Posted by benwaggoner View Post
All too common a story. People with an engineering background but not an image-science background can often fall into the "I don't care if it looks bad in practice, it still looks good in theory!" trap.

One of the strengths of x264 and x265 is that they don't optimize for PSNR or SSIM directly, but for their own internal subjective quality metric.


Exactly. PSNR is quite lousy at measuring a lot of those factors, and SSIM is only somewhat better.
Thank you for your support Sir.

However, I'm not saying that AVIF (AV1) is bad visually, absolutely not; it's just that, to my eye, it slightly decreases image neatness (which PSNR absolutely does not measure), and when you compare it with a codec like NHW, which increases neatness, you clearly see the difference, and, for example, more neatness is visually more pleasant despite a 5-6 dB PSNR gap between the two codecs...

Yes, BPG (x265) at the -m 1 fastest speed setting is very impressive, because it strikes an impressive balance between subjective visual quality, high PSNR and very fast speed, and I even wonder whether a similar setting is what smartphones implement for shooting photos (as speed is very important for portable devices)?

Coming back on topic, just as an indication, I looked again at the 8 images I tested with WebP2, AVIF and NHW at the -l9 high-compression setting. Comparing WebP2 and AVIF, it's visually tight: on 4 of the 8 images I slightly prefer WebP2 because its fine-detail preservation is impressive, and on the other 4 I prefer AVIF, but it's very subjective and quite close. It also seems that WebP2 has the most precise preservation/retention of details (like grain, textures and so on), so congratulations to the WebP2 team!

Cheers,
Raphael
Old 23rd June 2021, 10:21   #77  |  Link
skal
Registered User
 
Join Date: Jun 2003
Posts: 121
Quote:
Originally Posted by birdie View Post
@skal
Most modern videos are encoded with I-frames inserted every 1 or 2 seconds, so the I-frame compression ratio is super important: you may optimize your P- and B-frames all you want, but if your I-frames are huge, it's all for naught.

Doesn't that mean that AV1 should surpass anything before it in terms of compression efficiency including WebP/WebP2?
There are actually 3 situations where intra blocks are used:

a) Isolated intra blocks in a P/B frame: they are used to fill in "unpredictable" blocks (like resolving silhouettes, unseen background appearing behind an object during a pan, etc.)
b) intra blocks in a forced I-frame (or an intra-refresh sweep): these would ideally be inter blocks, but...
c) "Real" intra blocks on a real scene-change frame

The still-image case is really only c); the other cases require the intra-block specs (quant tables, how you predict from the context or bordering pixels, which are 'P-like' in essence) to be tuned more toward a typical P video frame.

The forced intra-frames inserted every 1-2 sec that you mentioned are mostly case b).

I'm thus not sure a video codec has to be the 'best still image' codec too, although it sure can't hurt!
Old 24th June 2021, 17:57   #78  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,725
Quote:
Originally Posted by skal View Post
There are actually 3 situations where intra blocks are used:

a) Isolated intra blocks in a P/B frame: they are used to fill in "unpredictable" blocks (like resolving silhouettes, unseen background appearing behind an object during a pan, etc.)
b) intra blocks in a forced I-frame (or an intra-refresh sweep): these would ideally be inter blocks, but...
c) "Real" intra blocks on a real scene-change frame

The still-image case is really only c); the other cases require the intra-block specs (quant tables, how you predict from the context or bordering pixels, which are 'P-like' in essence) to be tuned more toward a typical P video frame.

The forced intra-frames inserted every 1-2 sec that you mentioned are mostly case b).

I'm thus not sure a video codec has to be the 'best still image' codec too, although it sure can't hurt!
Plus HEVC added intra block copy (in the screen content coding extensions), so you can actually have inter-style predicted blocks in an IDR frame, referencing earlier parts of the same frame.

Sometimes I feel like we're walking backwards into fractal encoding, with all the new prediction modes getting implemented! If a codec can warp and rotate, all the ferns in a frame can be predicted from a section of the very first top-left fern.

Of course, having this all general purpose is way better than the old fractal encoding demos. A frame full of ferns and a graduate student to do motion estimation and rate control were the best case scenario...
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 6th September 2021, 10:39   #79  |  Link
Jamaika
Registered User
 
Join Date: Jul 2015
Posts: 688
I have a question about the compression add-ons for libtiff, zlib or Facebook's zstd. I know these aren't Google products. Will it be possible to compress libtiff photos with libwebp2? I mean adding the effort <0-9> option in the decoder.
Old 16th November 2021, 13:35   #80  |  Link
foxyshadis
ангел смерти
 
foxyshadis's Avatar
 
Join Date: Nov 2004
Location: Lost
Posts: 9,555
Quote:
Originally Posted by skal View Post
* video codecs are optimized for hardware, which is close to useless for image decoding. But once you mostly target software decoding, you can choose different algorithms that suit it better.
Downside: If you don't target hardware in some fashion, even if only to make it FPGA- or GPU-capable rather than a full fixed-function chip, you end up with no mobile adoption, and mobile is the majority of the market now. AV1 got extraordinarily lucky that HEVC licensing imploded; otherwise it would never have existed as a consumer product, because it requires herculean efforts to turn into hardware even with several advisors tweaking the final design. Cameras will want to offload as much encoding as possible, galleries and social media apps will want to offload as much decoding as possible, and even games will almost always use only what's natively supported.

It doesn't need to be like MPEG, where almost every decision is based on hardware, but I hope at some point you have a plan for hardware engineers to at least take a look and point out where something that works in software is going to be nigh-impossible in hardware before it's too late.

Quote:
Originally Posted by skal View Post
* video containers are not always the most efficient (size- and practicality-wise) when they target flexibility and editing ease.
I have a hard time squaring that idea with the fact that ISO-BMFF and Matroska have proven to be incredibly robust and versatile for images, with no more than a few bytes of overhead compared to every vaunted custom container. With each generation it gets more confusing why the wheel keeps being reinvented, only worse, if it's not just NIH. Then it has to be extended and re-extended until it finally has most of the features they could have had for free.

Let's face it, the Exif data and the thumbnail will always cost orders of magnitude more than the container overhead.