Any news about a potential successor to VVC / H.266? [Archive] - Page 2


ksec

13th February 2024, 10:15

Thank you very much!

In random access mode, unfortunately, it crashes at the 2nd frame every time, but I could get it to work in allintra mode, so I can at least examine how it performs for still image compression.

Last time I checked on it ( ECM4 I think in late 2022 or early 2023? ) It was ridiculously good. To the point I think we might as well forget VVC and move forward with FVC / H.267.

hajj_3

13th February 2024, 11:17

ecm paper: https://arxiv.org/pdf/2401.02145.pdf

Tommy Carrot

13th February 2024, 12:38

I didn't check codecs. I created encoder only in AVX. I had no time.
No worries mate, i'm grateful for your build anyway.
Last time I checked on it ( ECM4 I think in late 2022 or early 2023? ) It was ridiculously good. To the point I think we might as well forget VVC and move forward with FVC / H.267.
Well, I could not test random access mode yet, but allintra is definitely very promising: still image quality is very noticeably better than in VVC, and compared to HEIF it's in a completely different class.

kurkosdr

13th February 2024, 20:41

Last time I checked on it ( ECM4 I think in late 2022 or early 2023? ) It was ridiculously good. To the point I think we might as well forget VVC and move forward with FVC / H.267.
Personally, I still can't believe technology was able to improve significantly beyond HEVC. I had a university course that covered the basics up to AVC in 2014, I have read about the neat tricks HEVC employs to go beyond AVC, but VVC and beyond are a mystery to me. HEVC was already supposed to be at the limits of entropy and the limits of post-processing.

Also, are the gains for ECM consistent across the resolution range? I mean, AVC can already do 720x576p25 at 2 Mbps with decent quality, so does this mean that with ECM we'll be able to get 2*0.6*0.6*0.7 = 0.5 Mbps SD video with decent quality?
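As a rough sanity check of that arithmetic, here is a minimal Python sketch that chains per-generation bitrate ratios. The ratios 0.6, 0.6 and 0.7 are simply the ones assumed in the post for AVC to HEVC, HEVC to VVC and VVC to ECM; the function name is made up for illustration, and real gains vary with content and resolution.

```python
def chained_bitrate(base_mbps, ratios):
    """Multiply a starting bitrate by each generation's bitrate ratio in turn."""
    rate = base_mbps
    for r in ratios:
        rate *= r
    return rate

# Ratios assumed in the post: AVC -> HEVC, HEVC -> VVC, VVC -> ECM.
sd_ecm = chained_bitrate(2.0, [0.6, 0.6, 0.7])
print(f"{sd_ecm:.3f} Mbps")  # about 0.5 Mbps
```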

benwaggoner

14th February 2024, 20:04

Last time I checked on it ( ECM4 I think in late 2022 or early 2023? ) It was ridiculously good. To the point I think we might as well forget VVC and move forward with FVC / H.267.
VVC is a completed standard starting to go into silicon. H.267 may be the best codec to use in 2030, but I'd rather not have to stick to HEVC until then.

benwaggoner

14th February 2024, 20:14

Personally, I still can't believe technology was able to improve significantly beyond HEVC. I had a university course that covered the basics up to AVC in 2014, I have read about the neat tricks HEVC employs to go beyond AVC, but VVC and beyond are a mystery to me. HEVC was already supposed to be at the limits of entropy and the limits of post-processing.
Whoever said that had a limited imagination! There are tons of encoding techniques that haven't been implemented into a codec yet. Complex 3D warping. More elaborate forms of texture synthesis than just film grain.

A speculative codec from 25 years from now could be an ML kernel that turns into an optimized entropy decoder for other ML kernels that synthesize a movie (including actors, acting, lighting, sets, dialog), with residuals as other ML kernels that correct the first approximation. A future codec could be a superset of H.267, Unreal Engine, ChatGPT, and three other things we've never considered. A video codec is just a stream of bits that turn into a sequence of images.

Also, are the gains for ECM consistent across the resolution range? I mean, AVC can already do 720x576p25 at 2 Mbps with decent quality, so does this mean that with ECM we'll be able to get 2*0.6*0.6*0.7 = 0.5 Mbps SD video with decent quality?
Older codecs weren't as well optimized for higher resolutions, so you'll typically see bigger gains there. AVC High Profile, which added 8x8 blocks, was meant to make HD encoding more efficient; AVC Main was originally tuned much more for SD. HEVC can do 32x32 transform units, which is a whole lot better. VVC has some advantages over HEVC for 8K. Reference encoders are SLOW, so there has been a natural tendency to focus on lower resolutions, as experiments can be run much faster. 4K is 24x the pixels of 720x480!
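The pixel-count comparison is easy to reproduce. The short Python sketch below assumes "4K" means 3840x2160 UHD; the helper name is made up for illustration.

```python
def pixel_ratio(res_a, res_b):
    """Return how many times more pixels res_a has than res_b."""
    (wa, ha), (wb, hb) = res_a, res_b
    return (wa * ha) / (wb * hb)

uhd = (3840, 2160)  # "4K" taken as UHD here (assumption)
hd = (1920, 1080)
sd = (720, 480)

print(pixel_ratio(uhd, sd))  # 24.0
print(pixel_ratio(hd, sd))   # 6.0
```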

We're likely past the point of diminishing returns for improving higher resolutions more than lower ones. For moving images, it's really hard to resolve more than 4K anyway, and pretty much impossible for 24p with 1/48th sec motion blur.

kurkosdr

15th February 2024, 17:24

Whoever said that had a limited imagination! There are tons of encoding techniques that haven't been implemented into a codec yet. Complex 3D warping. More elaborate forms of texture synthesis than just film grain.

A speculative codec from 25 years from now could be an ML kernel that turns into an optimized entropy decoder for other ML kernels that synthesize a movie (including actors, acting, lighting, sets, dialog), with residuals as other ML kernels that correct the first approximation. A future codec could be a superset of H.267, Unreal Engine, ChatGPT, and three other things we've never considered. A video codec is just a stream of bits that turn into a sequence of images.
Personally, I'd never use an ML encoder, since it can create things that aren't there. I've made my peace with the concept of lossy compression by considering it a form of clever downscaling in the frequency domain (the weights of the higher frequencies are recorded with less accuracy, that's what the whole "integer-divide the DCT'ed block by the quantization table" does). But ML could create things that were never there. For example, how would such a video be admissible as evidence, and how could we claim any kind of creative intent is maintained? And let's be real: in the real world no residual will be recorded, in the name of bitrate reduction, much like both AVC and HEVC are typically pushed to their limits by content providers/broadcasters. But imagine that instead of loss of detail and blockiness you get rogue details.
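The "integer-divide by the quantization table" idea can be sketched on a single 8-sample row. This is an illustrative toy, not how any particular codec is specified: it uses a floating-point 1-D DCT rather than the 2-D transforms real codecs use, and the step sizes in QSTEPS and the pixel values are made up.

```python
import math

N = 8

def dct(x):
    """Orthonormal 1-D DCT-II of an N-sample row."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                               for n in range(N)))
    return out

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    out = []
    for n in range(N):
        s = c[0] * math.sqrt(1 / N)
        s += sum(c[k] * math.sqrt(2 / N) * math.cos(math.pi * (n + 0.5) * k / N)
                 for k in range(1, N))
        out.append(s)
    return out

# Hypothetical quantization steps: coarser for higher frequencies.
QSTEPS = [8, 8, 12, 16, 24, 32, 48, 64]

def quantize(coeffs):
    """Integer-divide each coefficient by its step (truncating toward zero)."""
    return [int(c / q) for c, q in zip(coeffs, QSTEPS)]

def dequantize(levels):
    """Scale the integer levels back up; the truncated remainder is lost."""
    return [level * q for level, q in zip(levels, QSTEPS)]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up pixel row
levels = quantize(dct(row))
recon = idct(dequantize(levels))
print(levels)                      # small integers, mostly zero at high frequencies
print([round(v) for v in recon])   # close to the input, but smoother
```

Only the integer levels would be stored; whatever the division threw away is exactly the detail that does not come back.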

Anyway, back on topic, any good info on what ECM does to achieve its compression gains over VVC (and VVC over HEVC)?

benwaggoner

15th February 2024, 20:16

Personally, I'd never use an ML encoder, since it can create things that aren't there. I've made my peace with the concept of lossy compression by considering it a form of clever downscaling in the frequency domain (the weights of the higher frequencies are recorded with less accuracy, that's what the whole "integer-divide the DCT'ed block by the quantization table" does).
Good description of pretty much all important codecs from JPEG on. It's really down to a perceptually optimized multidimensional scaling at heart.

But ML could create things that were never there. For example, how would such a video be admissible as evidence, and how could we claim any kind of creative intent is maintained? And let's be real: in the real world no residual will be recorded, in the name of bitrate reduction, much like both AVC and HEVC are typically pushed to their limits by content providers/broadcasters. But imagine that instead of loss of detail and blockiness you get rogue details.
Yeah, "hallucinations" become an increasingly big risk with ML. And even with non-ML techniques to some degree, as compression ratios get more and more complex. Film Grain Synthesis can create a different grain texture. Complex error concealment, deringing, or deblocking can interpolate detail that wasn't there originally instead of just losing detail.

In general principle, the more advanced compression gets, the more plausible the results of a bitstream error will be, as everything but entropy gets squeezed out. Bits spent that can indicate a later bit is wrong is a failure of arithmetic encoding.

One way around this would be deterministic ML: if we have models that always respond the same way to the same input, we can ensure that the pixels we see in QA are the same pixels that will be seen anywhere. We can think about this like the transition from MPEG-2's floating-point iDCT to the bit-exact integer transforms in modern codecs. We don't need future codecs to be compatible with general purpose ML engines for decode; we can continue to specify particular behavior in decoders.

ksec

18th February 2024, 06:48

VVC is a completed standard starting to go into silicon. H.267 may be the best codec to use in 2030, but I'd rather not have to stick to HEVC until then.

VVC 1.0 was standardised in June 2020, at which point it was VTM 9. But in reality it was 2.0, in April 2022, that had all the compliance and standard tests ready, which at the time was VTM 16.

I think FVC / ECM 11 right now is much closer to the VTM 9 / VVC 1.0 stage. And in terms of a next-gen codec, this is by far the earliest that stage has been reached by historical standards. Part of me thinks that is due to COVID, with researchers getting insanely productive while stuck at home.

I still remember being impressed by VVC during VTM development; well, it turns out we squeezed another ~30% BD-Rate out of it within 2-3 years.

But on the other hand it will likely be the first codec that requires a hardware decoder to work.

Jamaika

18th February 2024, 16:42

Test VVC encoder

vvencapp: VVenC, the Fraunhofer H.266/VVC Encoder, version 1.6.1-c5da1b5 [Windows][GCC 14.0.0][64 bit][SIMD=SSE42]

Encoder always converts videos to 10bit. Currently frame rate isn't read in libavcodec.

VVEncoderApp_avx.exe -i "113.yuv" -o "output_vvenc_10bit.vvc" -v 5 -t 4 -s 1280x720 --fps 3000/1001 -c yuv420 -ip 256 --internal-bitdepth 10 --bitrate 3Mbps -f 200 --passes 2 --preset medium --level auto --tier main

mp4box.exe -new -add output_vvenc_10bit.vvc output_vvenc_10bit.mp4
Track Importing VVC - Width 1280 Height 720 FPS 3000/1001
VVC Import results: 200 samples (377 NALUs) - Slices: 2 I 0 P 198 B - 0 SEI - 1 IDR - 0 CRA
VVC Stream uses forward prediction - stream CTS offset: 31 frames
[vvc @ 0000019cf30e4ef0] Intra Block Copy is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[vvc @ 0000019cf30e4ef0] frame 3, P( 0, 0) failed with -1163346256
[vvc @ 0000019cf30e4ef0] Intra Block Copy is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[vvc @ 0000019cf30e4ef0] frame 4, P( 5, 0) failed with -1163346256
[vvc @ 0000019cf30e4ef0] Intra Block Copy is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[vvc @ 0000019cf30e4ef0] Intra Block Copy is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[vvc @ 0000019cf30e4ef0] frame 6, P( 7, 0) failed with -1163346256
[vvc @ 0000019cf30e4ef0] Intra Block Copy is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
[vvc @ 0000019cf30e4ef0] frame 8, P( 0, 1) failed with -1163346256
[vvc @ 0000019cf30e4ef0] frame 5, P( 2, 0) failed with -1163346256
VVCSoftware: VTM Encoder Version 23.1-b4dd0e7 [Windows][GCC 14.0.0][64 bit] [SIMD=AVX]

ffvvc doesn't like VTM. Video doesn't play well.

VTMEncoderApp_avx.exe --SummaryVerboseness -c "encoder_randomaccess_vtm.cfg" --InputFile=114.yuv --BitstreamFile=output_vtm_10bit.vvc --SourceWidth=1280 --SourceHeight=720 --FrameRate=29.970 --InputBitDepth=10 --InternalBitDepth=10 --OutputBitDepth=10 --MSBExtendedBitDepth=10 --InputChromaFormat=420 --ChromaFormatIDC=420 --ConformanceWindowMode=1 --FramesToBeEncoded=200 --MatrixCoefficients=1 --InputColorPrimaries=-1 --LMCSSignalType=0 --Level=4 --BDPCM=1 --Tier=main --HashME=1 --IBC=1 --MaxCUWidth=16 --MaxCUHeight=16 --CTUSize=32 --QP=32 --RateControl=1 --TargetBitrate=3000000 --MaxBTLumaISlice=32 --MaxBTChromaISlice=32 --MaxBTNonISlice=32 --MaxTTLumaISlice=32 --MaxTTChromaISlice=32 --MaxTTNonISlice=32 --ColorTransform=0 --VideoFullRange=0 --InputSampleRange=0 --AspectRatioInfoPresent=1 --ChromaLocInfoPresent=1 --OverscanInfoPresent=1 --Log2MaxTbSize=5 --VirtualBoundariesPresentInSPSFlag=1 --EnableDecodingCapabilityInformation=1 --DecodingRefreshType=1 --HrdParametersPresent=0

mp4box.exe -new -add output_vtm_10bit.vvc output_vvenc_10bit.mp4
Track Importing VVC - Width 1280 Height 720 FPS 25000/1000
OpenGOP detected - adjusting file brand
VVC Import results: 48 samples (95 NALUs) - Slices: 3 I 0 P 45 B - 0 SEI - 1 IDR - 2 CRA
VVC Stream uses forward prediction - stream CTS offset: 31 frames

VVCSoftware: ECM Encoder Version 11.0-ce78934 (VTM-10.0-45dfe06) [Windows][GCC 14.0.0][64 bit] [SIMD=AVX]

ECM probably isn't a VVC codec. Non-standard.

ECMEncoderApp_avx.exe --SummaryVerboseness -c "encoder_randomaccess_ecm.cfg" --InputFile=114.yuv --BitstreamFile=output_ecm_10bit.vvc --SourceWidth=1280 --SourceHeight=720 --FrameRate=29.970 --InputBitDepth=10 --InternalBitDepth=10 --OutputBitDepth=10 --MSBExtendedBitDepth=10 --InputChromaFormat=420 --ChromaFormatIDC=420 --ConformanceWindowMode=1 --FramesToBeEncoded=200 --MatrixCoefficients=1 --InputColorPrimaries=-1 --LMCSSignalType=0 --Level=4 --BDPCM=1 --Tier=main --HashME=1 --IBC=1 --MaxCUWidth=16 --MaxCUHeight=16 --CTUSize=32 --QP=32 --RateControl=1 --TargetBitrate=3000000 --MaxBTLumaISlice=32 --MaxBTChromaISlice=32 --MaxBTNonISlice=32 --MaxTTLumaISlice=32 --MaxTTChromaISlice=32 --MaxTTNonISlice=32 --ColorTransform=0 --VideoFullRange=0 --InputSampleRange=0 --AspectRatioInfoPresent=1 --ChromaLocInfoPresent=1 --OverscanInfoPresent=1 --Log2MaxTbSize=5 --VirtualBoundariesPresentInSPSFlag=1 --EnableDecodingCapabilityInformation=1 --DecodingRefreshType=1 --HrdParametersPresent=0

mp4box.exe -new -add output_ecm_10bit.vvc output_ecm_10bit.mp4
[VVC] wrong num tile columns 207 in PPS
[VVC] Error parsing NAL unit type 16
[VVC] Error parsing Picture Param Set
[vvc @ 000001692f8cc2b0] sps_delta_qp_in_val_minus1[i][j] out of range: 610, but must be in [0,255].
[vvc @ 000001692f8cc2b0] Failed to read unit 1 (type 15).
[vvc @ 000001692f8cc2b0] Failed to parse picture unit.
[vvc @ 000001692f8c1760] Could not find codec parameters for stream 0 (Video: vvc, none): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options

https://www.sendspace.com/file/ar9enm

benwaggoner

19th February 2024, 21:17

VVC 1.0 was standardised in June 2020, at which point it was VTM 9. But in reality it was 2.0, in April 2022, that had all the compliance and standard tests ready, which at the time was VTM 16.

I think FVC / ECM 11 right now is much closer to the VTM 9 / VVC 1.0 stage. And in terms of a next-gen codec, this is by far the earliest that stage has been reached by historical standards. Part of me thinks that is due to COVID, with researchers getting insanely productive while stuck at home.

I still remember being impressed by VVC during VTM development; well, it turns out we squeezed another ~30% BD-Rate out of it within 2-3 years.
Codec developers have been worried we're running out of ideas for decades. But it never quite seems to happen, thank goodness. And thank Moore's Law for allowing us to keep on increasing decode compute requirements some and encode compute requirements a bunch.

But on the other hand it will likely be the first codec that requires a hardware decoder to work.
You think? Not having a software decoder for testing would add a lot of friction. Hopefully at least it could be done in a GPU accelerated implementation so we can get work done before HW decoders are available.

Jamaika

23rd February 2024, 18:55

Intra Block Copy VVC errors have been corrected. Videos can be played.
Converting VTM videos requires adding the framerate (-r). Video still stutters a bit.
https://github.com/ffvvc/FFmpeg/pull/198
In function 'prepare_intra_edge_params_8',
inlined from 'intra_pred_8' at extra/vvc_intra_template.c:627:5:
extra/vvc_intra_template.c:535:21: warning: writing 16 bytes into a region of size 0 [-Wstringop-overflow=]
535 | left[i] = top[i] = left[0];
| ~~~~~~~~^~~~~~~~~~~~~~~~~~
extra/vvc_intra_template.c: In function 'intra_pred_8':
extra/vvc_intra_template.c:625:21: note: at offset -105 into destination object 'edge' of size 3136
625 | IntraEdgeParams edge;
| ^~~~

https://www.sendspace.com/file/f2zj4b

ksec

17th March 2024, 06:41

You think? Not having a software decoder for testing would add a lot of friction. Hopefully at least it could be done in a GPU accelerated implementation so we can get work done before HW decoders are available.

It is definitely the largest increase in terms of decoder complexity. At least it's not a viable option for the vast majority of machines, especially mobile phones or laptops. I don't think we will ever get a dav1d-level decoder for VVC or FVC / H.267 (FVC not being an official name). And even if we assume someone does write the insane amount of hand-written assembly that dav1d has, and if we consider VVC to be at a similar level of complexity to AV1, we are looking at 8-10x the decoding complexity going from AV1 / VVC to FVC. You will need an 8x more powerful computer to decode an FVC file with a dav1d level of optimisation.

We are looking at an Apple M3, or a current top-tier x86 CPU, barely decoding 1080p FVC files at 30fps with 100% CPU usage with a dav1d-level decoder (which we likely won't get). I don't expect this level of CPU performance to filter through to the low end any time soon, or even within the next 10 years. And we haven't even talked about 4K.

So realistically any usage of FVC would require dedicated hardware acceleration.

But of course I hope I am utterly wrong.

ksec

15th May 2024, 07:29

From April's Meetings.

The rate reduction for natural sequences over VTM 11 in RA configuration for {Y, U, V} increased from ECM-11.0’s {-22.56%, -31.91%, -33.67%} to ECM-12.0’s {-24.01%, -33.20%, -35.34%}.
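
To see what the step from ECM-11.0 to ECM-12.0 amounts to on its own, the two sets of figures can be converted to relative bitrates and divided. This is only a back-of-envelope Python sketch: it treats the BD-rate figures quoted above as plain bitrate ratios against the same VTM anchor, which is an approximation, and the helper name is made up.

```python
def incremental_change(old_pct, new_pct):
    """Relative bitrate change (%) from old to new, both given as % vs. the same anchor."""
    old_ratio = 1 + old_pct / 100
    new_ratio = 1 + new_pct / 100
    return (new_ratio / old_ratio - 1) * 100

ecm11 = {"Y": -22.56, "U": -31.91, "V": -33.67}
ecm12 = {"Y": -24.01, "U": -33.20, "V": -35.34}

for comp in ("Y", "U", "V"):
    print(comp, f"{incremental_change(ecm11[comp], ecm12[comp]):.2f}%")
```

Under that approximation, ECM-12.0 needs roughly 2% fewer bits than ECM-11.0 for luma.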

Tommy Carrot

29th July 2024, 08:19

birdie

1st August 2024, 14:24

GeoffreyA

1st August 2024, 16:53

Sounds great, except VVenC loses badly to x265 under certain conditions, x266 is nowhere to be found, and you're already talking about H.267.

https://github.com/fraunhoferhhi/vvenc/discussions/389

I'm not a fan of how VVC has been deployed so far.

I think the problem is that VVenC denoises the video during encoding, leading to a softer picture. Recently, working with anime carrying artifacts and quantisation noise, I saw evidence of this. I found that libaom wiped the video clean, leading to an acceptable if soft picture. With VVC, I saw the same effect: the picture was cleaned up, though AV1 did a better job. (At low bitrates, too, it seems to inherit HEVC's, or x265's, characteristic artifacts along lines.) I don't know if VVenC exposes the option to disable denoising, but if it did, video might be sharper.

birdie

1st August 2024, 17:49

benwaggoner

1st August 2024, 18:08

I think the problem is that VVenC denoises the video during encoding, leading to a softer picture. Recently, working with anime carrying artifacts and quantisation noise, I saw evidence of this. I found that libaom wiped the video clean, leading to an acceptable if soft picture. With VVC, I saw the same effect: the picture was cleaned up, though AV1 did a better job. (At low bitrates, too, it seems to inherit HEVC's, or x265's, characteristic artifacts along lines.) I don't know if VVenC exposes the option to disable denoising, but if it did, video might be sharper.
I don't think it is denoising specifically, versus getting some higher QPs but with a codec that does a much better job of concealing block and particularly inter block artifacts.

Are you comparing AV1 and VVC at the same bitrates? I'd expect VVC to do somewhat better than AV1 at this in general. Of course, AV1 encoders are a lot more mature at this point.

Reencoding from a source that already has video encoding artifacts is always a tricky challenge. Generally the more similar the codecs are, the better; reencoding from H.264 to HEVC is generally cleaner than from H.264 to VP9 as HEVC can fall back to pretty much symmetrically encoding the input pixels. Similarly, I'd expect VP9 to reencode better to AV1 than VVC, as they share so much common architecture (but I've not tested that).

As a best practice, cleaning up artifacts before reencoding is preferred. Garbage in is always at least as much garbage out, and often worse than that. The bitrates required to not come out worse than a source already encoded for distribution are often as high or higher than the original bitrate anyway. Reference and enterprise encoders are always tested and tuned on uncompressed sources, as that's what premium content has. Rest assured CrunchyRoll's sources don't have those kinds of artifacts!

Given Google's heavy involvement in AV1 encoder development and use via YouTube, it would make absolute sense that libaom was tuned to handle artifacts typical in user-generated content, not just professional mezzanines or uncompressed test sources.

GeoffreyA

2nd August 2024, 12:48

Memory wasn't the best, so I did a quick test again. I would say that, at low bitrates, AV1 and VVC are on the same footing, but could be trading one artifact for another, and that will be subjective. What's certain, though, is that the bad source was "cleaned up," leading to better picture. At higher bitrates, both are preserving the artifacts of the source, leading to worse picture. So, this supports your suggestion that higher QPs, along with concealment, are saving the day. (It could also be that denoising, if present, is being cut down with more bitrate.)

The source is x264-encoded, 5 Mbps, 480p anime supposedly taken straight from the Blu-ray; it appears to be from the same master used for the DVD releases. It is not in the best shape: soft, along with subtle ringing, I'd say. I agree that one should clean up artifacts before encoding, but in this case, AV1 killed two birds with one stone: more compression and better picture (to a limit)!

Generally, though, I find that VVC is slightly ahead of AV1 and a touch sharper.

GeoffreyA

3rd August 2024, 20:04

Looks like there is some sort of denoising after all: MCTF. See birdie's link above as well as the following, p. 16:

https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9503377

FranceBB

3rd August 2024, 20:15

Rest assured CrunchyRoll's sources don't have those kinds of artifacts!

Yep, I worked there in 2013 and we used to get Apple ProRes HQ files at 23.976p. Despite not being lossless, they were high quality mezzanine files. The whole licensing thing for Japanese anime was a very big issue, though, which meant that, depending on which agreement you had, you could end up either getting a proper master or getting the same "TX Ready" master sent to Japanese broadcasters, and that was unfortunately XDCAM-50 (i.e. a 50 Mbit/s Long GOP MPEG-2 stream at 29.970 interlaced TFF with 3:2 pulldown). Luckily it was still 1920x1080, albeit 8bit and with lots of banding.

With anime being anime you also had the hard question of "what the heck do I do now" when some shows were clearly 23.976 with 3:2 pulldown BUT had either the opening or the ending in 29.970 progressive. Do I deinterlace everything to 29.970p to preserve the opening/ending? Do I just IVTC everything to 23.976p (thus decimating the opening/ending)? The guideline back then was to always encode at 23.976p in those situations, which is what we did.

One of the worst mezzanine files we used to receive was Naruto Shippuden (I also always found it odd that they kept using the same numbering as if the first series and Shippuden were the same thing, so it was like 220 + whatever episode number it was; episode 392 of Shippuden was 612). Why am I mentioning Naruto in particular? Well, because we didn't have access to the TX file it was coming from, and the guys at TV Tokyo were creating a high bitrate H.264 progressive file for all the streaming platforms licensing it for simulcast, but they were so unused to it that they didn't always encode it right, so you could end up with a 23.976fps progressive file in which some scenes had repeated frames as they were inverse telecined incorrectly.
If anything, I think they were deinterlacing to 29.970 and then decimating to 23.976p, so sometimes, if the pattern was right after they trimmed the clock, the colorbars etc., it matched, and other times, if it was off by a few frames, it didn't. :(
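
That phase problem can be sketched in a few lines of Python. This is a deliberately simplified model, not the actual workflow anyone used: it assumes that after pulldown and field matching every group of four film frames becomes five video frames with one duplicate at a fixed position, and that decimation blindly drops one frame at a fixed position in every cycle of five. The function names are made up for illustration.

```python
def pulldown_matched(film):
    """Simplified model: each group of 4 film frames becomes 5 video frames,
    with the 2nd frame of the group appearing twice."""
    video = []
    for i in range(0, len(film), 4):
        group = film[i:i + 4]
        if len(group) == 4:
            a, b, c, d = group
            video += [a, b, b, c, d]
        else:
            video += group  # leftover frames passed through unchanged
    return video

def decimate_fixed(video, drop_index):
    """Drop one frame out of every 5 at a fixed position in the cycle."""
    return [f for i, f in enumerate(video) if i % 5 != drop_index]

film = list("ABCDEFGH")
video = pulldown_matched(film)          # A B B C D E F F G H

in_phase = decimate_fixed(video, 2)     # the cycle lines up with the duplicates
off_phase = decimate_fixed(video[1:], 2)  # one frame trimmed off the head

print("".join(in_phase))   # ABCDEFGH
print("".join(off_phase))  # BBDEFFH
```

With the untrimmed stream the fixed decimation removes exactly the duplicates; after trimming one frame off the head, the same decimation throws away real frames and keeps the duplicates, which shows up as stutter.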

Anyway, from there our own mezzanine file was created (an H.264 1920x1080 level 4.1 4:2:0 8bit + AAC in mp4 with very very very high bitrate and almost always 23.976p) and then sent to the distribution encoder (along with the .ass) that would create the renditions automatically for the various resolutions and bitrates to populate the CDN. At that point, the publishing team would test those on the various devices to make sure everything was fine before the content was "unlocked" automatically at the scheduled time.

It feels like a lifetime ago, considering that I abandoned the streaming sector a long time ago to dedicate myself to linear broadcasting and I've been working at Sky for 8 and a half years.

Given Google's heavy involvement in AV1 encoder development and use via YouTube, it would make absolute sense that libaom was tuned to handle artifacts typical in user-generated content, not just professional mezzanines or uncompressed test sources.

Yep, and we can see this a lot in YouTube videos encoded in AV1. While VP9 often presents plenty of artifacts, AV1 streams are generally much softer. This is because the AV1 approach is to blur and average out details rather than show blocking while being bit-starved, so if the source already had artifacts it's very much possible that those ended up being "cleaned" by the same averaging out process. Whether that was intentional or not I'm not sure, but something tells me that AV1 was probably just trying to average out / smooth out problematic uncorrelated high frequency components and, due to more luck than anything, compression artifacts from low bitrate sources usually end up in that category.

Generally, though, I find that VVC is slightly ahead of AV1 and a touch sharper.

I would be surprised if it wasn't.
The goal by MPEG for H.266 VVC was to at the very least be 35% more efficient than H.265 HEVC (with 40% being the target) and in general VVEnc tests show it to be around 13% more efficient than AV1. Mind you, VVEnc is just 4 years old.

GeoffreyA

5th August 2024, 15:31

Yep, I worked there in 2013 and we used to get Apple ProRes HQ files at 23.976p.

It's interesting hearing about your experience in the industry. Most properly-encoded anime is almost always 23.976. Regarding that poor source I mentioned, namely "Sorcerer Hunters," there were other issues too. Inverse telecining in FFmpeg, with fieldmatch, decimate, and others, worked well for the main 26 episodes; but the OVA proved troublesome and I was left with combing in at least one scene. Same story in VapourSynth. So I've left it for now till I learn more.

Regarding AV1 and softness, I think future codecs will increasingly take this path because that is what is viewed, seemingly by many, as high quality these days. For my part, I find it regrettable.

nevcairiel

5th August 2024, 23:08

Regarding AV1 and softness, I think future codecs will increasingly take this path because that is what is viewed, seemingly by many, as high quality these days. For my part, I find it regrettable.

This is not new, it started with HEVC already, in contrast to H264.
Of course the alternative to softness is blocking artifacts, like H264 was famous for. I'd rather have softness than blocking artifacts.
That's why in many cases people instantly preferred HEVC at very low bitrates, since a soft image is more watchable than a blocky H264 encode.

The real solution is more bitrate, but we aren't getting that.

Of course that's the same reason new codecs may appear a bit sharper again: at the same bitrate you get a bit more quality and less need to reduce details, i.e. less softness.

ksec

6th August 2024, 12:28

I've been curious about ECM / H.267 for a while, and I could not find a working Windows build of this codec anywhere, and I don't want to beg for builds either, so with some reluctance I set up Visual Studio on my work PC (btw, it's cool and all, but I feel ~40 GB for it is a bit excessive). To my surprise, I could compile the ECM encoder without a hitch: no crashes, everything works as it should.

So I could finally try it myself. It is very slow, even slower than the AV2 encoder, so much so that testing it is kinda difficult (a 100-frame 720p clip took about 14 hours to finish). But the quality/efficiency is very impressive indeed. Nothing really comes close to it, especially at lower bitrates. It easily outclasses AV2 in its current stage.

H.267 ECM (I believe the original name for it was Future Video Codec, aka FVC, but I don't see it being used anywhere any more) has roughly 8-10x the encoding complexity of VVC's VTM.

Right now ECM is showing about 25% BD-R compared to VTM, and up to 50% for Text and Graphics with Motion.

It also increases decoding complexity by 8x, the highest we have seen in any codec generation. I am just not entirely sure how this will work on mobile. Maybe we could use it with LCEVC at 720p to reconstruct 1080p files?

Using VVC + LCEVC, which is what Brazil's TV 3.0 is going to be using in 2025, is proving to be high quality and extremely resource efficient. I remember earlier this year Brazil Government and University published a paper showing anywhere from -10% (using more bitrate) to a 60% reduction [1] compared to VVC alone. And this is with early-stage VVC and LCEVC encoders.

Hopefully we will have other encoders to play with soon. I hope there will be a Beamr for VVC.

[1] Somewhat interesting is that LCEVC is developed by V-Nova, a British company, and true to British fashion they absolutely downplay the potential of the tech. Which is quite amusing to me. XD

GeoffreyA

6th August 2024, 17:23

benwaggoner

8th August 2024, 21:15

This is not new, it started with HEVC already, in contrast to H264.
Of course the alternative to softness is blocking artifacts, like H264 was famous for. I'd rather have softness than blocking artifacts.
That's why in many cases people instantly preferred HEVC at very low bitrates, since a soft image is more watchable than a blocky H264 encode.

The real solution is more bitrate, but we aren't getting that.

Of course that's the same reason new codecs may appear a bit sharper again: at the same bitrate you get a bit more quality and less need to reduce details, i.e. less softness.
I don't think there's anything keeping modern codecs from having as much detail as older ones, and in some cases they have tools that allow even greater detail than before. HEVC's transform skip and lossless CU options available in all profiles mean complex graphics with very sharp edges can be encoded perfectly, which isn't always possible in any frequency transform only codec.

It's more that the more advanced the codec, the better in-loop artifact reduction tools it has. Before in-loop deblocking, once a reference frame hit too high a QP, quality was trashed for the rest of the GOP, and you had shorter max GOP durations. Once you could get soft instead of blocky, future frames that referenced a high QP frame could spend bits adding detail instead of trying to erase erroneous detail of blocks.

HEVC added SAO, which does similar stuff for ringing artifacts. VVC is able to do much better with motion vector artifacts than prior codecs, allowing high QP prediction to not look as artificial.

These sorts of tools all shift codecs towards having high QP result in just loss of detail instead of detail loss with introduction of wrong detail (ala MPEG-2, where 8x8 block patterns often became painfully visible). With a good encoder, that means it takes fewer bits to hit a certain level of high quality. For noisy content, it can also mean that it's not feasible to save all that many bits over prior codecs without losing some detail.

It also means that bitrates can be pushed much lower without introducing distracting artifacts. If a broadcaster is using fixed RF bandwidth, adding more, softer channels makes compelling economic sense, as you can get away with a lot more compression before customers start complaining. And even for IP streaming, bandwidth costs are a big part of the total cost of the business, so reducing them can help the bottom line a lot.

Still, it's not like we were getting artifact free high detail from those sectors before; it's pretty much premium content delivered over IP where the economics made sense for using enough bits to look consistently good. I'll take softer over soft-with-blocks any day. Although the psychovisual factors there can get complex; the added high frequencies of DCT artifacts in MPEG-2 offered a certain "sizzle" that some customers actually preferred over the uncompressed source.

Another reason we can see early-development encoders tend towards softness is that PSNR is intrinsically biased in that direction with per-frame QP. Encoders need psychovisually optimized adaptive quantization to lower QP in flatter areas to provide a more subjectively balanced encode. SSIM and VMAF are only a little better at properly accounting for the value of preserving detail more in low-detail areas.
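To make the adaptive quantization point concrete, here is a minimal sketch of a variance-based scheme that lowers QP in flat blocks and raises it in busy ones. The offset rule, the `strength` parameter, the 0-51 QP range and all function names are assumptions chosen for illustration; this is not the algorithm of any particular encoder.

```python
import math

def block_variance(block):
    """Population variance of a flat list of luma samples."""
    mean = sum(block) / len(block)
    return sum((s - mean) ** 2 for s in block) / len(block)

def aq_qp_offsets(blocks, strength=1.0):
    """Return one QP offset per block: negative for flat blocks, positive for busy ones.

    Hypothetical rule: offset = strength * (log2(var + 1) - average of log2(var + 1)).
    """
    energies = [math.log2(block_variance(b) + 1.0) for b in blocks]
    avg = sum(energies) / len(energies)
    return [strength * (e - avg) for e in energies]

def apply_aq(frame_qp, blocks, strength=1.0, qp_min=0, qp_max=51):
    """Per-block QPs: the frame QP plus the offset, rounded and clamped.

    The 0..51 range is an assumption borrowed from H.264/HEVC-style QP scales.
    """
    return [
        max(qp_min, min(qp_max, round(frame_qp + off)))
        for off in aq_qp_offsets(blocks, strength)
    ]

flat_block = [128, 129, 128, 127] * 4   # near-constant area, e.g. sky
busy_block = [16, 240, 32, 200] * 4     # high-contrast texture
print(apply_aq(30, [flat_block, busy_block]))
```

With a single per-frame QP both blocks would be quantized identically; here the flat block ends up below the frame QP and the busy block above it, which is the rebalancing described above.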

ksec

23rd January 2025, 15:09

INSIGHT: Future of Video Compression ITU/ISO Workshop

MPEG and ITU organized a workshop in Geneva titled “Future Video Coding – Advanced Signal Processing, AI, and Standards” as part of preparations for defining the H.267 codec.

I cover here the requirements, with presentations made by Samsung, Amazon, China Mobile, and MainConcept. https://bit.ly/3PClCSq

Samsung
Samsung highlighted the deployment of different codecs across various markets: the MPEG family is used for broadcast/pay TV/OTT, while AOM codecs are prevalent in OTT/social media platforms. The company emphasized that codec efficiency is no longer the highest priority due to advancements in network capabilities (e.g., 5G, Fiber). Samsung's key priorities are:
- Licensing cost (Samsung has implemented all existing codecs: MPEG-2, VP8, VP9, AVC, HEVC, AV1).
- Low complexity (important for encoding on smartphones, smart glasses, and high-end TVs).
Samsung also discussed the evolution of traditional codecs: moving from pre-H.266 codecs based on a collection of algorithmic tools to hybrid deterministic/ML-based tools for H.267, and eventually, to fully end-to-end ML-based encoders (autoencoders) by the 2030 timeframe.

Amazon
Amazon presented various points of innovation and improvement:
- Film grain synthesis (introduced in AV1 and later adopted by VVC).
- Subjective evaluation of new codecs, considering all levels of the ABR ladder.
- Native support for multiview video.
- Error resilience in UDP transport.
- Enhanced handling of banding and low-light conditions.
- Increased use of subjective metrics, such as VMAF.
- Advocacy for AI-based compression that can be updated even after the standard is published.

China Mobile
China Mobile focused on UHD (4K/8K), 5G, and emphasized:
- Lower latency and increased parallel processing.
- UHD capture using portable devices.
- Real-time adaptation of compression parameters based on varying network conditions.
- Balancing compression efficiency with reduced complexity (challenging with AI).
- Including both peak and average bitrates in benchmarks.
- Adaptive compression based on image content, network conditions, device type, and unicast demands.
- Proper handling of AI-generated video.
- Using subjective quality measurements for evaluations.

MainConcept
MainConcept’s presentation was highly grounded, reminding the audience that a codec takes 10–15 years to progress from standardization to large-scale deployment. This means H.267 may only see widespread adoption around 2035–2040.
Their recommendations for H.267 included:
- No additional compression gain compared to VVC.
- Prioritizing resolutions like 1080p and 4K.
- Encoding tools focused on improving quality rather than reducing bitrate.
- Ultra-low latency (sub-frame).
- CPU-friendly decoding.
- Parallel processing support.
- Shorter time-to-market cycles.

https://www.linkedin.com/feed/update/urn:li:activity:7286232368439341056/

Pretty much echoing what I have been saying about compression efficiency, complexity, 5G and quality / bitrate issues for years.

kurkosdr

23rd January 2025, 16:59

Samsung also discussed the evolution of traditional codecs: moving from pre-H.266 codecs based on a collection of algorithmic tools to hybrid deterministic/ML-based tools for H.267, and eventually, to fully end-to-end ML-based encoders (autoencoders) by the 2030 timeframe.
Yay! I can't wait for decoders that hallucinate details that weren't there in the first place. Also, I can't wait for the inevitable scandal when some EBU broadcaster re-broadcasts Euronews or BBC World terrestrially at a low bitrate and the decoder hallucinates details that were never there over some politically important footage and the re-broadcaster gets blamed for messing with the footage. And yes, some EBU broadcasters re-broadcast Euronews or other EBU broadcasters terrestrially (for example, Greece's ERT re-broadcasts BBC World terrestrially).

At least artifacts are recognizable as artifacts, they don't hallucinate things that were never there, not plausibly at least.

FranceBB

23rd January 2025, 22:06

Yeah... I'm also a bit skeptical about the introduction of machine learning in H.267 encoders, which is basically what Samsung wants to do. The main reason is that they would inevitably be subject to hallucinations, and while the current compression artifacts are easily recognizable by everyone, adding machine learning to an encoder could lead to hallucinations, which are absolutely critical to avoid, especially in high-importance contexts like news and media archival. Even nowadays we have all kinds of trickery implemented on consumer devices, like linear interpolation to increase the framerate and the various upscaling and post-processing techniques implemented by TVs, but they're on the consumer devices and they can always be disabled. If we put those things in the encoders, then there's gonna be no escape from it for anyone: from mezzanine files to distribution files, anything can hallucinate, and obviously the lower the bitrate, the greater the chances of it happening. If they wanna introduce them, fine, but they should be encoder specific, it should be possible to disable them, and they should only be allowed for things like helping with predictions for better motion compensation etc.

By the way, there's also something that wasn't included in the report which I think is important: H.267 expectations are for it to provide a 25% bitrate reduction compared to H.266, which just shows how difficult it's getting to create new more efficient codecs after H.264.

MPEG-2 is 30% more efficient than MPEG-1.
xvid is 10% more efficient than MPEG-2.
H.264 is 40% more efficient than xvid.
H.265 is 35% more efficient than H.264.
H.266 is 30% more efficient than H.265.
H.267 will be 25% more efficient than H.266.

Codec     Efficiency gain (vs. predecessor)
MPEG-1    ~0% (no predecessor)
MPEG-2    ~30%
Xvid      ~10%
H.264     ~40%
H.265     ~35%
H.266     ~30%
H.267     ~25% (estimated)

Are we just going towards a saturation point? Perhaps that would explain why they wanna integrate machine learning stuff (which I'm against obviously).
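As a rough illustration of the saturation question, the per-generation figures above can be chained into a cumulative bitrate relative to MPEG-1. The percentages are the claims from this post, not measured data, and the function name is made up for the sketch.

```python
# Per-generation bitrate savings as claimed in the post above (not measurements).
claimed_savings = [
    ("MPEG-2", 0.30),
    ("Xvid", 0.10),
    ("H.264", 0.40),
    ("H.265", 0.35),
    ("H.266", 0.30),
    ("H.267", 0.25),  # an expectation, not a result
]

def cumulative_bitrate(savings, baseline=1.0):
    """Return (codec, bitrate relative to baseline) by chaining (1 - saving) factors."""
    out = []
    rate = baseline
    for codec, saving in savings:
        rate *= 1.0 - saving
        out.append((codec, rate))
    return out

for codec, rate in cumulative_bitrate(claimed_savings):
    print(f"{codec:7s} {rate * 100:5.1f}% of the MPEG-1 bitrate")
```

Under these assumed figures, H.266 would sit at roughly 17% of the MPEG-1 bitrate and H.267 at roughly 13%, so the last step removes only about 4% of the original bitrate even though it is a 25% relative gain.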

rwill

23rd January 2025, 23:14

Can't wait for Xerox Moments (https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_are_switching_written_numbers_when_scanning) when encoding...

Otherwise just a note: Mpeg-1 and Mpeg-2 have around the same efficiency. It is just that Mpeg-2 supports interlaced.

Mpeg-1 is quite superior to H.261 (50%+ reduction?) though, because H.261 has very limited motion compensation. It is H.261 that has no predecessor.

benwaggoner

24th January 2025, 20:18

rwill

24th January 2025, 20:39

I am pretty positive that Mpeg-1 already had Half Pel MC, having implemented Mpeg-1/2 encoders, decoders and the like.

benwaggoner

27th January 2025, 18:44

I am pretty positive that Mpeg-1 already had Half Pel MC, having implemented Mpeg-1/2 encoders, decoders and the like.
You are correct.

And sheesh, that was so long ago now!

modus-ms325c

28th January 2025, 02:03

maybe it's about time we brought MPEG-1/2 back from the ashes...

kurkosdr

28th January 2025, 18:10

maybe it's about time we brought MPEG-1/2 back from the ashes...
What do you mean by "bring MPEG-1/2 back from the ashes"? MPEG-1 or MPEG-2 are still used when compatibility with old equipment or old formats (VCD and DVD-Video respectively) is desired, they never went away. Especially not MPEG-2.

But use them for modern things like HD? Hell no. Even if you want something royalty-free and ISO standard, MPEG-4 Part 2 is reasonably well-supported and is royalty-free except in Brazil (and will be fully royalty-free sometime in 2026). MPEG-2 is royalty-free except in Malaysia (and won't be royalty-free until sometime in 2035 due to submarine patents in that country). There is no reason to use MPEG-1 or MPEG-2 other than compatibility with old equipment or old formats.

FranceBB

28th January 2025, 23:38

Well, as someone who encodes MPEG-2 stuff regularly, I wouldn't mind if someone got the old encoders polished (HC-ENC, x262, lavc). Ideally it would be nice to have a properly multithreaded, NUMA-aware open source MPEG-2 encoder. Sure, the encoding complexity of MPEG-2 is very low for modern CPUs, but the problem is that most of the time we're limited in the amount of resources used. On top of that, most encoders are either plain C / C++ only or have only old assembly optimizations like SSE2, and could benefit from multithreading and assembly optimizations up to AVX512, but I guess no one will ever spend time doing that...

rwill

29th January 2025, 05:09

Don't think that Mpeg-2 can benefit from anything above SSE2. 16x16 Macroblocks and all.

Regarding good Mpeg-2 encoders, I have done multiple tests over the years and I am still looking for some free encoder to beat my y262 encoder in metrics/subjective quality... but one does not simply switch horses at the end of the race, right?

modus-ms325c

29th January 2025, 16:30

Regarding good Mpeg-2 encoders, I have done multiple tests over the years and I am still looking for some free encoder to beat my y262 encoder in metrics/subjective quality... but one does not simply switch horses at the end of the race right?
idk man, maybe work on y262's missing features first since no one else will or is unable to do so for you

rwill

29th January 2025, 18:19

idk man, maybe work on y262's missing features first since no one else will or is unable to do so for you

So... what's missing really?

benwaggoner

29th January 2025, 21:12

What do you mean by "bring MPEG-1/2 back from the ashes"? MPEG-1 or MPEG-2 are still used when compatibility with old equipment or old formats (VCD and DVD-Video respectively) is desired, they never went away. Especially not MPEG-2.
Are you aware of anyone still making VCDs this decade? If so, do you know why? I would think that those old pre-DVD VCD players used for movie piracy back in the day would have all broken down ages ago.

It's delightful to see ancient tech still being used!

Last I heard, the US Navy was still using VC-1 on submarines.

rwill

30th January 2025, 06:16

High Flyer – Etihad’s inflight entertainment (https://www.broadcastprome.com/case-studies/high-flyer-etihads-inflight-entertainment/)

Thales and Panasonic have different audio and video encoding requirements. Panasonic requires its video files to be encoded at MPEG 4 1.5mbps with 16:9 aspect ratio, and its audio files in mp3 at 128kbps. Thales requires Hollywood movies to be encoded in MPEG 1 at 1.5mbps with an aspect ratio of 4:3 and MPEG 2 in 16:9. All other video content for Thales is encoded in MPEG 1 at 1.5mbps aspect ratio 4:3, while audio files have the same format as Panasonic.

Article is from 2017 and airplanes do not get upgrades often?

modus-ms325c

30th January 2025, 16:01

So... what's missing really?
https://files.catbox.moe/kvqsdx.png

rwill

30th January 2025, 17:27

https://files.catbox.moe/kvqsdx.png

It does not have Frame Threading because it has Slice Threading. Slice Threading is the more sane choice for Mpeg-2.
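For readers unfamiliar with the distinction: slice threading splits one picture into independently coded slices and hands them to different threads, while frame threading lets threads work on different pictures at once. A toy sketch of the slice approach, assuming one slice per 16-line macroblock row and a placeholder `encode_slice` (both are assumptions for illustration, not how y262 is actually structured):

```python
from concurrent.futures import ThreadPoolExecutor

MB_SIZE = 16  # MPEG-2 macroblocks cover 16x16 luma samples

def split_into_slices(frame):
    """Split a frame (a list of pixel rows) into one slice per macroblock row."""
    return [frame[y:y + MB_SIZE] for y in range(0, len(frame), MB_SIZE)]

def encode_slice(rows):
    """Placeholder for real slice encoding: one byte per pixel row.

    It only reads its own rows, which is what makes slices independent.
    """
    return bytes(sum(row) % 256 for row in rows)

def encode_picture_slice_threaded(frame, workers=4):
    """Encode all slices of one picture in parallel, then concatenate in order."""
    slices = split_into_slices(frame)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        encoded = list(pool.map(encode_slice, slices))  # map() preserves input order
    return b"".join(encoded)

frame = [[(x + y) % 256 for x in range(64)] for y in range(48)]
bitstream = encode_picture_slice_threaded(frame)
```

The sketch only shows the partitioning and the in-order reassembly; Python threads will not give a real speedup for CPU-bound work like this, and a real encoder also has to deal with rate control across slices.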

About Dual Prime and Field Pictures, do you know what these are and where they have a real world use case?

modus-ms325c

31st January 2025, 01:40

I don't know what they are, haven't seen what they do in practice, and was most definitely not the writer of a README that consists of these phrases.

rwill

31st January 2025, 06:53

I don't know what they are, haven't seen what they do in practice, and was most definitely not the writer of a README that consists of these phrases.

Then please be more careful with statements like these:

idk man, maybe work on y262's missing features first since no one else will or is unable to do so for you

modus-ms325c

31st January 2025, 15:06

*** edited ***

avih

31st January 2025, 19:56

@modus-ms325c next time it's a ban. Please behave.

benwaggoner

3rd February 2025, 18:59

High Flyer – Etihad’s inflight entertainment (https://www.broadcastprome.com/case-studies/high-flyer-etihads-inflight-entertainment/)

Article is from 2017 and airplanes do not get upgrades often?
The in-seat entertainment gets updated more frequently than you might think, as newer generations can save a lot of weight, thus fuel, and thus operating expenses. Some of the older systems were quite a few kilos per seat, including wiring.

Some airlines, like Alaska, have ditched in-seat entertainment entirely and just give everyone free WiFi access to the preloaded content library on the plane. That lets them use H.264 + AAC-LC at the minimum, and it would probably work to make it HEVC today. That said, I don't think compression efficiency is directly tied to cost savings anymore, so they'll emphasize compatibility over that. Once you've got the WiFi to handle one codec, a more efficient one doesn't help that much. And while better compression could store more titles, storage is also getting cheaper on its own. I expect that they're more limited by licensing than storage anyway.

kurkosdr

4th February 2025, 00:54

It does not have Frame Threading because it has Slice Threading. Slice Threading is the more sane choice for Mpeg-2.

About Dual Prime and Field Pictures, do you know what these are and where they have a real world use case?

If memory serves me well, a "field-picture" encodes a pair of interlaced fields (top and bottom field). Basically, in MPEG-2, in interlaced mode, you can either have a picture encoded as a progressive frame or as a pair of fields. The first type of picture is useful for "fake interlace" (needed to get a 25p video on DVD-Video because DVD-Video doesn't support progressive mode for example) or for scenes with zero motion in true interlaced video, and the other is for scenes with motion in true interlaced video.

Note: when I say "true interlaced video" I mean the kind of interlaced video that would give you those annoying comb artifacts during motion if you tried to weave/overlay its field-pairs into frames.

If so, does it mean y262 can't encode true interlaced video?
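The weave/comb behaviour described in the note above can be shown with a small sketch. It models a frame as a list of rows, splits it into top (even rows) and bottom (odd rows) fields, and compares a "fake interlace" frame, where both fields come from the same instant, with a true interlaced one, where the bottom field is captured after the object has moved. All names and the simple comb metric are made up for illustration.

```python
def split_fields(frame):
    """Return (top_field, bottom_field): even-numbered and odd-numbered rows."""
    return frame[0::2], frame[1::2]

def weave(top, bottom):
    """Interleave two fields back into one frame."""
    frame = []
    for t, b in zip(top, bottom):
        frame.append(t)
        frame.append(b)
    return frame

def render_bar(x_pos, width=16, height=8, bar_w=4):
    """Synthetic scene: a white vertical bar at x_pos on a black background."""
    return [[255 if x_pos <= x < x_pos + bar_w else 0 for x in range(width)]
            for _ in range(height)]

def capture_interlaced(scene_t0, scene_t1):
    """Top field sampled from the scene at t0, bottom field from the scene at t1."""
    top, _ = split_fields(scene_t0)
    _, bottom = split_fields(scene_t1)
    return weave(top, bottom)

def comb_score(frame):
    """Mean absolute difference between vertically adjacent rows (crude comb metric)."""
    total = 0
    count = 0
    for upper, lower in zip(frame, frame[1:]):
        total += sum(abs(a - b) for a, b in zip(upper, lower))
        count += len(upper)
    return total / count

static = capture_interlaced(render_bar(2), render_bar(2))   # no motion between fields
moving = capture_interlaced(render_bar(2), render_bar(8))   # bar moved between fields
print(comb_score(static), comb_score(moving))
```

The static frame weaves cleanly, while the moving one shows large differences between adjacent lines, which is the comb artifact. That is the case where coding the two fields separately tends to pay off.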