
View Full Version : Intel SVT-AV1



benwaggoner
12th May 2025, 16:59
I think the problem is temporal or keyframe filtering, which can get even worse in libaom. SVT-AV1-PSY seems to have largely overcome it, but is, perhaps, still constrained by AV1's nature. VVC, through vvenc, comes out even worse in this regard. It's also a disaster for many of these encoders, x264 included, in two-pass mode. x265's rate control was solid in two passes.
That kind of temporal discontinuity has always been an encoder problem, not a codec problem, in my experience. We even had it working well in WMV9 Advanced Profile/VC-1, where we got interframe delta QP for I-frames enabled (it was only P and B in WMV9 Main; an oversight).

benwaggoner
12th May 2025, 17:00
With FFmpeg, that's the default behavior, or maybe the only behavior: the bit-depth conversion code of x265 is not part of the "core encoding function" that FFmpeg wraps, so FFmpeg only tries to keep the input pixel format, and the profile is determined automatically from that.
That's probably also how almost all of the encoders in FFmpeg work: when an input format is not supported by an encoder, the pixel format conversion is done by an auto-inserted swscale filter.
So if the x265 library linked into FFmpeg is compiled with 10-bit support, the Main10 profile will be used in this case. The profile parameter doesn't work: even if you specify Main10, with 8-bit input the result is Main profile.
Oh, that's annoying! I didn't realize as I always pipe to x265 when I am using ffmpeg, so I can use the same x265 syntax regardless of the host app.
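
A minimal sketch of the piping approach mentioned above, assuming both `ffmpeg` and a standalone `x265` binary built with 10-bit support are on the PATH. The file names and the helper `build_pipe_commands` are hypothetical. It only builds and prints the two command lines, so the bit depth and profile are chosen explicitly on the x265 side instead of being inferred by FFmpeg's wrapper.

```python
import shlex


def build_pipe_commands(src, dst, ten_bit=True):
    """Return (ffmpeg_argv, x265_argv) for an `ffmpeg | x265` pipeline.

    src and dst are hypothetical file names chosen by the caller.
    """
    pix_fmt = "yuv420p10le" if ten_bit else "yuv420p"
    ffmpeg_cmd = [
        "ffmpeg", "-i", src,
        "-pix_fmt", pix_fmt,      # convert explicitly instead of relying on auto-inserted swscale
        "-strict", "-1",          # lets the y4m muxer accept high-bit-depth pixel formats
        "-f", "yuv4mpegpipe", "-",
    ]
    x265_cmd = [
        "x265", "--y4m", "--input", "-",
        "--output-depth", "10" if ten_bit else "8",
        "--profile", "main10" if ten_bit else "main",
        "--output", dst,
    ]
    return ffmpeg_cmd, x265_cmd


ffmpeg_cmd, x265_cmd = build_pipe_commands("input.mkv", "output.hevc")
print(shlex.join(ffmpeg_cmd), "|", shlex.join(x265_cmd))
```

The same x265 argument list can then be reused unchanged with any other frame server that can write y4m to stdout.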

benwaggoner
12th May 2025, 17:07
Yesterday, I tested this out, and found that disabling TF and the loop filters did not eliminate the temporal "flickering," squaring with your notes that psy-rd was key.
Yeah, all low-to-moderate bitrate video encoding quality is reliant on having psychovisual rate distortion optimization. Reference-encoder-style fixed GOP at fixed QP is useful for early-stage development, but never delivers actual real-world quality-at-bitrate improvements compared to what a refined encoder for a prior-generation codec can deliver.
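
As a toy illustration of why a psychovisual term changes encoder decisions, here is a sketch of a rate-distortion cost with an optional "keep the texture energy" penalty. This is not the psy-rd formula of x264, x265, or SVT-AV1-PSY; the energy measure, the sample values, the bit counts, and the lambda are all assumptions picked for the example.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ac_energy(block):
    """Crude texture measure: total absolute deviation from the block mean."""
    mean = sum(block) / len(block)
    return sum(abs(x - mean) for x in block)


def rd_cost(src, recon, bits, lam, psy=0.0):
    """Distortion + psy * |energy mismatch| + lambda * rate (toy model)."""
    energy_gap = abs(ac_energy(src) - ac_energy(recon))
    return ssd(src, recon) + psy * energy_gap + lam * bits


SRC = [10, 30] * 4                       # textured source row
CANDIDATES = {
    "blurred": ([20] * 8, 2),            # cheap, flat reconstruction
    "detailed": ([14, 26] * 4, 20),      # costlier, keeps most of the texture
}
LAMBDA = 50


def pick(psy):
    """Return the name of the candidate with the lowest cost."""
    return min(CANDIDATES, key=lambda k: rd_cost(SRC, *CANDIDATES[k], LAMBDA, psy))


print("psy off:", pick(0.0))
print("psy on: ", pick(10.0))
```

With the psy weight at zero the flat candidate wins on pure SSD plus rate; with the penalty enabled the textured candidate wins, which is the kind of trade a fixed-QP reference encoder never makes.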

Video codecs are psychovisual optimizations all the way down, including such basic elements as sRGB, gamma, and quant/lambda tables. Even an uncompressed .BMP file relies on embedded psychovisual optimization to get an 8-bit value to reasonably map to human visual perception.
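
The point about 8-bit values can be made concrete with the standard sRGB transfer function. The sketch below converts 8-bit code values to linear light and back; the function names are my own. It shows that code 128 is only about 22% of peak linear light, and that roughly a third of the 256 codes are spent on the darkest 10% of linear light, where vision is most sensitive to steps.

```python
def srgb_to_linear(code):
    """Map an 8-bit sRGB code value (0-255) to linear light in [0, 1]."""
    c = code / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb_code(linear):
    """Map linear light in [0, 1] to the nearest 8-bit sRGB code value."""
    if linear <= 0.0031308:
        c = linear * 12.92
    else:
        c = 1.055 * linear ** (1 / 2.4) - 0.055
    return round(c * 255)


mid_gray = srgb_to_linear(128)
codes_for_darkest_tenth = linear_to_srgb_code(0.10)
print(f"code 128 -> {mid_gray:.3f} linear")
print(f"darkest 10% of linear light uses codes 0..{codes_for_darkest_tenth}")
```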

oibaf
2nd June 2025, 11:00
FYI: SVT-AV1-PSY -> SVT-AV1 merge progress report (https://gitlab.com/AOMediaCodec/SVT-AV1/-/issues/2269).

ShortKatz
11th June 2025, 22:25
I wouldn't have thought that the film-grain option would be so computationally intensive. If I encode a movie with film-grain=12, encoding takes twice as long as without. It is clear to me that film-grain is computationally intensive, but my surprise is that it is so extreme that the encoding time doubles.

ShortKatz
13th June 2025, 23:10
In recent SVT-AV1 master variance boost is broken: https://gitlab.com/AOMediaCodec/SVT-AV1/-/issues/2273

oibaf
15th July 2025, 08:28
In recent SVT-AV1 master variance boost is broken: https://gitlab.com/AOMediaCodec/SVT-AV1/-/issues/2273

Fixed: https://gitlab.com/AOMediaCodec/SVT-AV1/-/merge_requests/2458

(BTW the issue was just in git, not in any official release)

benwaggoner
17th July 2025, 17:54
I wouldn't have thought that the film-grain option would be so computationally intensive. If I encode a movie with film-grain=12, encoding takes twice as long as without. It is clear to me that film-grain is computationally intensive, but my surprise is that it is so extreme that the encoding time doubles.
FGS requires analyzing video, identifying what is grain versus content, parameterizing that grain, and removing the grain that the parameters can reproduce. That's all out-of-band of the encoder itself, using quite different algorithms that haven't gotten years of hand-tuned assembly optimization. And it isn't based on any previously existing codebase, the way AV1 had VP9.

FGS is great stuff, but is really orthogonal technology to AV1-the-codec, and in a much earlier stage of refinement and adoption. I'm not aware of any big commercial deployments of FGS due to a variety of encoding and playback challenges.

Once the algorithms themselves have been proven and the playback made sufficiently reliable, real optimization can begin.
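
The steps described above (analyze, separate grain from content, parameterize, remove, resynthesize) can be sketched on a 1-D signal. This is a deliberately crude toy under stated assumptions: a box blur stands in for the denoiser and a single standard deviation stands in for the grain parameters. AV1's actual film grain model is an auto-regressive pattern plus an intensity-dependent scaling function, which this does not attempt to reproduce.

```python
import random
import statistics


def box_blur(signal, radius=2):
    """Stand-in denoiser: moving average with a clamped window at the edges."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def analyze_grain(noisy, radius=2):
    """Split the input into a denoised signal and one grain-strength parameter."""
    denoised = box_blur(noisy, radius)
    residual = [n - d for n, d in zip(noisy, denoised)]
    return denoised, statistics.pstdev(residual)


def synthesize_grain(clean, strength, seed=0):
    """Decoder-side step: add fresh noise of the signalled strength."""
    rng = random.Random(seed)
    return [c + rng.gauss(0.0, strength) for c in clean]


rng = random.Random(1)
content = [100.0 + 0.5 * i for i in range(200)]      # smooth ramp as "content"
noisy = [c + rng.gauss(0.0, 4.0) for c in content]   # ramp plus synthetic "grain"

denoised, strength = analyze_grain(noisy)            # encoder side
regrained = synthesize_grain(denoised, strength)     # decoder side
print(f"estimated grain strength: {strength:.2f}")
```

Even in this toy the estimate comes out below the true noise level because the blur leaves some noise behind, which hints at why classification and removal have to be developed together.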

Boulder
17th July 2025, 18:38
FGS is great stuff, but is really orthogonal technology to AV1-the-codec, and in a much earlier stage of refinement and adoption. I'm not aware of any big commercial deployments of FGS due to a variety of encoding and playback challenges.

Netflix is going to be using FGS.

https://netflixtechblog.com/av1-scale-film-grain-synthesis-the-awakening-ee09cfdff40b

oibaf
17th July 2025, 20:31
Also: DVB releases findings from Film Grain Synthesis study mission (https://dvb.org/news/dvb-releases-findings-from-film-grain-synthesis-study-mission/).

benwaggoner
21st July 2025, 22:56
Netflix is going to be using FGS.

https://netflixtechblog.com/av1-scale-film-grain-synthesis-the-awakening-ee09cfdff40b
And they list some early titles using it! I'll check it out.

I think everyone doing film content wants to be doing FGS once we get sufficient maturity.

Z2697
22nd July 2025, 09:28
SVT-AV1 by default doesn't apply denoising to the video when film-grain parameter is set. (for a long time now, it used to)
The change is made because the built in denoiser isn't great, but what is it analyzing against then?
That's an essential part of the FGS process right?
It doesn't seem like it's analyzing against the "coded" image either, you'll still get noises even with lossless encoding.
So, is it just adding arbitrary type and amounts of noise?

There's no option for providing a "FGS reference" input, you'll have to use a much more sophisticated method if you want that.
Maybe some encoding script, GUI or anything by AV1 enthusiasts already have done that. IDK.
And I'm sure big companies like Netflix must have that figured out?

Anyway, that's the reason why I just go like @!#%#$&% when I see some people saying "hey just turn on a bit of film-grain for everything and it makes everything better" online. No it's not (yet? (hopefully)).

Boulder
22nd July 2025, 09:34
SVT-AV1 by default doesn't apply denoising to the video when film-grain parameter is set. (for a long time now, it used to)
The change is made because the built in denoiser isn't great, but what is it analyzing against then?
That's an essential part of the FGS process right?
It doesn't seem like it's analyzing against the "coded" image either, you'll still get noises even with lossless encoding.
So, is it just adding arbitrary type and amounts of noise?

There's no option for providing a "FGS reference" input, you'll have to use a much more sophisticated method if you want that.
Maybe some encoding script, GUI or anything by AV1 enthusiasts already have done that. IDK.
And I'm sure big companies like Netflix must have that figured out?

Anyway, that's the reason why I just go like @!#%#$&% when I see some people saying "hey just turn on a bit of film-grain for everything and it makes everything better" online. No it's not (yet? (hopefully)).
It's doing some kind of denoising for analysis. I've seen someone mention what it does, but it wasn't the same as what --film-grain-denoise uses (and yes, it's crap and should never be used).

It definitely does not add any lost details back but creates grain adaptively based on the amount of grain/noise the original clip has. In that sense it's rather safe to use at low levels like 6-10, it won't make a clip super grainy if it's originally clean, but it can hide artifacts like banding and add a touch of fake details in the moving image.
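
For reference, a sketch of what such a low-level setting looks like on the command line, assuming the stock `SvtAv1EncApp` binary. The helper name, file names, preset, and CRF are hypothetical; `--film-grain` and `--film-grain-denoise` are the options discussed in this thread. The helper only builds the argument list.

```python
def svt_film_grain_args(src, dst, grain=8, denoise=False, preset=6, crf=30):
    """Build an SvtAv1EncApp argument list with film grain synthesis enabled.

    preset and crf are placeholder values, not recommendations.
    """
    if not 0 <= grain <= 50:
        raise ValueError("film-grain level must be in 0..50")
    return [
        "SvtAv1EncApp",
        "-i", src,
        "-b", dst,
        "--preset", str(preset),
        "--crf", str(crf),
        "--film-grain", str(grain),                        # 0 disables grain synthesis
        "--film-grain-denoise", "1" if denoise else "0",   # built-in denoiser left off
    ]


print(" ".join(svt_film_grain_args("input.y4m", "output.ivf")))
```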

grav1synth can be used to create a grain table based on a diff of two clips. It can also be used to extract, add or remove the FGS data from a clip. Doesn't work with all videos though, there are odd crashes every now and then.

Z2697
22nd July 2025, 09:50
It's doing some kind of denoising for analysis. I've seen someone mention what it does, but it wasn't the same as what --film-grain-denoise uses (and yes, it's crap and should never be used).

Applying FGS header without the original noise being removed is also crap... IMO.

Boulder
22nd July 2025, 10:34
Applying FGS header without the original noise being removed is also crap... IMO.

The encoder removes a fair amount of noise by default so I don't see a problem there. Without a fork with psy-rd capabilities, the amount of blurring is quite substantial.

benwaggoner
24th July 2025, 01:33
Applying FGS header without the original noise being removed is also crap... IMO.
Is anyone doing that?

benwaggoner
24th July 2025, 01:36
The encoder removes a fair amount of noise by default so I don't see a problem there. Without a fork with psy-rd capabilities, the amount of blurring is quite substantial.
No one should be making AV1 with an encoder with poor psy-rd! Getting encoders to good psychovisual maturity is a major factor in the delay between a codec spec being released and it being used for real-world content.

But in any case, maintaining creative intent requires putting the same kind of grain back, and real-world grain varies a lot. So you really can't do good classification without developing most of a removal algorithm as well.

Z2697
24th July 2025, 17:28
Is anyone doing that?

Anyone using SVT-AV1's built-in FGS feature, without enabling the denoising (which is off by default), is doing that, technically.
But the denoising is crap, which in turn makes whole thing crappier, so who can blame them.

Maybe it makes sense (very little, however) that the encoder "conveniently" blurs the video enough, but what I mean is, this is misuse, this feature should be used more carefully.

Just my 2 cents.

benwaggoner
24th July 2025, 18:35
Anyone using SVT-AV1's built-in FGS feature, without enabling the denoising (which is off by default), is doing that, technically.
But the denoising is crap, which in turn makes whole thing crappier, so who can blame them.

Maybe it makes sense (very little, however) that the encoder "conveniently" blurs the video enough, but what I mean is, this is misuse, this feature should be used more carefully.

Just my 2 cents.
I personally wouldn't consider mainline SVT-AV1's quality sufficient for premium content in any case. And FGS is orthogonal to the codec itself. I'm not aware of any particularly good open source implementations for FGS parameterization and removal.

oibaf
2nd August 2025, 11:00
[3.1.0] - 2025-7-24 (https://gitlab.com/AOMediaCodec/SVT-AV1/-/releases#310---2025-7-24)

API updates


Added new flags for --chroma-qm-min and --chroma-qm-max from SVT-AV1-PSY (!2442)
Introducing --rtc flag to set the default parameters for an improved RTC performance (!2443)
Enabled M11 and M12 presets for rtc mode for faster speed levels (!2452)


Encoder


Improved mid and high quality presets quality vs speed tradeoffs for fast-decode 0,1,2 modes in random access (!2443):
~15-25% speedup for M1-M5 at the same quality levels for fast-decode 0
~15-20% speedup for M3-M7 at the same quality levels for fast-decode 1,2
1-1.5% BD-Rate improvement for M0 MR
Significant improvements in Low Delay mode and enabling presets 0-6 by enabling missing coding features
Improved performance of the RTC mode with ~5-10% BD-Rate improvements at similar complexity across presets M7-M10 (!2452)
Further Arm Neon and SVE2 optimizations that improve high bitdepth encoding by an average of ~5% in low resolutions
Added S-Frame support for random access mode (!2451)
Additional improvements / porting of features from SVT-AV1-PSY for variance boost (!2431, !2432)


Cleanup, build, bug fixes, and documentation


General testing improvements and fixes (!2406, !2454)
Deprecated unused avx512{er,pf} as they were never used and also removed with GCC 15 (!2415)
Visual console display fixes (!2420, !2423)
Fixed compilation bugs and cleanup with Arm (!2417, #2259, !2427, !2434, !2438, !2439)
Fixed some formulas in the documentation (!2444)
Added new options to slim down SVT-AV1 for RTC use cases (!2456, !2457, !2459)
Fixed some issues with QP handling, vbr stability, and screen content (!2458, #2262, #2272, #2273)
Fixes issue with resize-mode (!2463, #2282, #2260)
Removed cpuinfo dependency and instead use cpu detection code from aom (!2426, !2453)


Arm Improvements


Speed comparison was done against v3.0.2 on AWS Graviton4 instances with Clang 20
Uplifts are geometric means across presets 0-10


Landscape video:


1080p: +4%
720p: +6%
480p: +6%
360p: +3%
240p: +4%


Portrait video:


1080p: +8%
720p: +4%
480p: +3%
360p: +7%
240p: +4%
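
Since the release notes report each uplift as a geometric mean across presets, here is a minimal sketch of that aggregation. The per-preset speed ratios below are made up for illustration and are not from the release notes; `uplift_percent` is a hypothetical helper name.

```python
from statistics import geometric_mean


def uplift_percent(speed_ratios):
    """Geometric-mean speedup, as a percentage over the baseline.

    Each ratio is new_speed / old_speed for one preset.
    """
    return (geometric_mean(speed_ratios) - 1.0) * 100.0


# Hypothetical ratios for two presets: one unchanged, one 21% faster.
hypothetical_ratios = [1.00, 1.21]
print(f"uplift: {uplift_percent(hypothetical_ratios):.1f}%")
```

The geometric mean is the usual choice for averaging ratios because a 2x gain on one preset and a 0.5x loss on another cancel out to no change, which an arithmetic mean would not show.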