Benwaggoner HEVC encoding challenge [Archive] - Page 5

benwaggoner

12th February 2023, 04:25

As I see it, modern video cameras have made good progress with internal denoisers for capturing 'real world' images, so the broadcasting world is really moving in this direction. When you equip your studio with a new set of lower-noise video cameras, you also get a big benefit in the quality of the content compressed by MPEG encoders for broadcast. But the same effect can be simulated with a (much cheaper) hardware denoiser unit in front of your master MPEG encoder, or even with zero-cost software if you do file-based processing and use a software MPEG encoder.

Also, internally both a temporal denoiser and an MPEG encoder are based on the same ideas of motion tracking (block- or object-based), so they may use the same hardware for motion estimation (and the MPEG encoder may perhaps reuse the motion estimation from the denoiser). Maybe this is already implemented in modern codecs after HEVC, like AV1.

So the 'preprocessing' between the real world and the MPEG encoder is really an essential part of the whole moving-picture compression process: it removes 'real world' random data and separates the visual information actually required about the scene from the random noise of the intermediate scene-to-view transfer medium, the photon flux.
The inter-frame 'denoiser' simply simulates a video camera with a longer accumulation time for each scene object compared with the 'primary, frame-based' camera. The primary camera (if it does not use internal inter-frame digital denoising) is limited to the inter-frame time interval for accumulating photons. It also cannot track individual scene objects.

The 'motion-compensated denoiser' can extend the accumulation time to roughly the total visibility time of the object in the cut-scene, and can track each scene object individually when it is not static relative to the primary video camera. So it simulates a massive array (equal to the number of blocks in a block-based denoiser) of 'secondary video cameras' with individual tracking and a much longer data accumulation time, without motion blur.

After this two-stage transform of the scene data (physical plus simulated secondary video cameras), you have cleaner scene data to pass to a simple enough MPEG encoder, and you get better output quality because the encoder can now spend more bits on encoding the real scene objects instead of the residual noise left after incomplete motion compensation of noisy blocks.
Yeah, the line between encoding and preprocessing was never all that clear, and certainly has been becoming less so as codecs and encoders advance.

Still, allowing unfettered preprocessing seems risky, given techniques like the contrast boosting that encoders were using to goose VMAF scores.

The spirit of the test is "what can deliver output most like the source" - which has unavoidable subjective elements.

I'm open to suggestions on how best to address this in an updated version of the challenge.

I'm thinking of using StEM2 10-bit, with separate 1080p SDR and 2160p HDR targets. Thoughts?

DTL

23rd February 2023, 11:51

I'm open to suggestions on how best to address this in an updated version of the challenge.

The idea is to take a noise-free source, test the encoder after adding some 'natural' noise, and compare the result with the 'clean' source.

As natural photon-shot noise follows a Poisson distribution at very low photon counts and is close to Gaussian at medium and high counts, we can simulate it more or less acceptably with the AddGrain AVS plugin (which is supposed to produce Gaussian noise).

For the noise-free source we can take either a 100% synthetic render (not a very realistic natural source) or natural camera shots that are as clean as possible.
Downsizing can be used as an additional noise-cleaning step. An 8x downsize adds about +18 dB to the SNR (with respect to photon-shot noise), and for 8-bit and maybe 10-bit output the result is close to quantization noise only, given good enough daylight camera shots at 0 dB gain or even some negative gain.
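
The +18 dB figure can be checked with a quick calculation. This is a minimal sketch in Python (not part of the original workflow), assuming the noise is uncorrelated between pixels and that the downsize behaves like a plain box average, which the actual resize kernel only approximates:

```python
import math

def snr_gain_db(samples_averaged):
    """SNR gain in dB from averaging N samples with independent zero-mean noise.

    The signal level is unchanged while the noise power drops by a factor
    of N, so the gain is 10*log10(N).
    """
    return 10.0 * math.log10(samples_averaged)

factor = 8                            # 8x downsize in each dimension
pixels_per_output = factor * factor   # 64 source pixels feed one output pixel
gain = snr_gain_db(pixels_per_output)
print(f"{factor}x downsize: +{gain:.1f} dB")
```

The same formula applies to temporal averaging over N motion-compensated frames, under the same independence assumption.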

To get a test frame that is not too small, the downsized parts of the whole clip can be tiled. This also creates a frame stream of 'average' complexity over the whole clip (which typically contains cut-scenes of different complexity) that is short enough to encode.

So with the current AVS+ 3.7.3 test 6 release from https://forum.doom9.org/showthread.php?p=1983250#post1983250
I created this source cleaning and conditioning script:

LoadPlugin("ffms2.dll")
LoadPlugin("avsresize.dll")
LoadPlugin("AddGrainC.dll")

FFmpegSource2("04.ts")

ConvertBits(16)

LetterBox(8, 8, 8, 8)

fc=last.FrameCount
fco=fc/16

drw=last.width / 8
drh=last.height / 8

c00=Trim(0 * fco, fco)
c01=Trim(1 * fco, 1 * fco + fco)
c02=Trim(2 * fco, 2 * fco + fco)
c03=Trim(3 * fco, 3 * fco + fco)

c10=Trim(4 * fco, 4 * fco + fco)
c11=Trim(5 * fco, 5 * fco + fco)
c12=Trim(6 * fco, 6 * fco + fco)
c13=Trim(7 * fco, 7 * fco + fco)

c20=Trim(8 * fco, 8 * fco + fco)
c21=Trim(9 * fco, 9 * fco + fco)
c22=Trim(10 * fco, 10 * fco + fco)
c23=Trim(11 * fco, 11 * fco + fco)

c30=Trim(12 * fco, 12 * fco + fco)
c31=Trim(13 * fco, 13 * fco + fco)
c32=Trim(14 * fco, 14 * fco + fco)
c33=Trim(15 * fco, 15 * fco + fco)

bp=105 # average sharpness
cp=0

c00=UserDefined2Resize(c00, drw, drh, b=bp, c=cp)
c01=UserDefined2Resize(c01, drw, drh, b=bp, c=cp)
c02=UserDefined2Resize(c02, drw, drh, b=bp, c=cp)
c03=UserDefined2Resize(c03, drw, drh, b=bp, c=cp)

c10=UserDefined2Resize(c10, drw, drh, b=bp, c=cp)
c11=UserDefined2Resize(c11, drw, drh, b=bp, c=cp)
c12=UserDefined2Resize(c12, drw, drh, b=bp, c=cp)
c13=UserDefined2Resize(c13, drw, drh, b=bp, c=cp)

c20=UserDefined2Resize(c20, drw, drh, b=bp, c=cp)
c21=UserDefined2Resize(c21, drw, drh, b=bp, c=cp)
c22=UserDefined2Resize(c22, drw, drh, b=bp, c=cp)
c23=UserDefined2Resize(c23, drw, drh, b=bp, c=cp)

c30=UserDefined2Resize(c30, drw, drh, b=bp, c=cp)
c31=UserDefined2Resize(c31, drw, drh, b=bp, c=cp)
c32=UserDefined2Resize(c32, drw, drh, b=bp, c=cp)
c33=UserDefined2Resize(c33, drw, drh, b=bp, c=cp)

r0=StackHorizontal(c00, c01, c02, c03)
r1=StackHorizontal(c10, c11, c12, c13)
r2=StackHorizontal(c20, c21, c22, c23)
r3=StackHorizontal(c30, c31, c32, c33)

StackVertical(r0, r1, r2, r3)

ConvertBits(8) # use target bitdepth conversion

#AddGrain(10)

Trim(0, 5000)

Prefetch(2)

This creates a low-noise test clip with a 1920x1080 output frame size from 4K footage. The b/c parameters for the downsampler were chosen for less than maximal 'video-makeup' sharpness, so that the transients to black between the 'picture in picture' elements do not overshoot too much. LetterBox in AVS still has issues with transient creation (https://github.com/AviSynth/AviSynthPlus/issues/339). The script runs slowly, so it is better to save its output to a losslessly compressed file with 16 bits per sample (like FFV1 version 3 from ffmpeg, yuv420p16le) and use that for testing MPEG encoders (with or without dithering to the target bit depth of 8, 10, or more).
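
The tile geometry and segment boundaries the script produces can be sketched in Python. This is only an illustration of the arithmetic, assuming a 3840x2160 source; the frame count of 80000 is a made-up value, since the real one is not given in the post:

```python
def tile_layout(src_w, src_h, frame_count, grid=4, downsize=8):
    """Mirror the script's arithmetic: tile size, output size, Trim ranges."""
    tile_w = src_w // downsize           # "drw" in the script
    tile_h = src_h // downsize           # "drh" in the script
    seg = frame_count // (grid * grid)   # "fco" in the script
    # Trim(first, last) is inclusive at both ends, so each range below
    # names seg + 1 frames, exactly as the script's Trim calls do.
    trims = [(i * seg, i * seg + seg) for i in range(grid * grid)]
    return tile_w, tile_h, tile_w * grid, tile_h * grid, trims

# frame_count is hypothetical; the real value depends on the source file.
tile_w, tile_h, out_w, out_h, trims = tile_layout(3840, 2160, frame_count=80000)
print(f"tile {tile_w}x{tile_h}, output {out_w}x{out_h}")
print("first three Trim ranges:", trims[:3])
```

With a UHD source, sixteen 480x270 tiles in a 4x4 grid give the 1920x1080 frame mentioned above.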

The test footage was 4K UHD SDR 'A Wild Year on Earth', UK, Northern Pictures, 2020 https://www.imdb.com/title/tt13715870/, episode 04 https://www.imdb.com/title/tt13716068/?ref_=ttep_ep4 .

For speed (and because 10-bit x265 again would not start via avspipemod for me), the MPEG encoder was 8-bit x264 with these settings:

--profile high --crf 18 --ref 4 -b 12 --direct auto --me esa --subme 11 --no-fast-pskip --trellis 2
--merange 24 --deblock -2:-2 --b-adapt 2 --transfer "bt709" --colorprim "bt709" --colormatrix "bt709" --psnr

Results for different AddGrain(N) settings are:
No grain: 10640K, PSNR Y Mean 43.4 dB, encoding performance about 5 fps
AddGrain(2): 12347K, PSNR Y Mean 40.7 dB, encoding performance 4.34 fps
AddGrain(5): 17807K, PSNR Y Mean 38.7 dB, encoding performance 3.43 fps
AddGrain(10): 33969K, PSNR Y Mean 36.7 dB, encoding performance 2.85 fps
AddGrain(20): 69177K, PSNR Y Mean 35.4 dB, encoding performance 2.42 fps

So residual noise in the content fed to the MPEG encoder significantly changes the output bitrate under fixed-CRF encoding (and even at the higher bitrate the PSNR is still lower).
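
To make the trend easier to read, the figures above can be expressed relative to the no-grain encode. A small Python sketch using only the numbers reported in this post (the size unit "K" is kept as reported):

```python
# (label, output size in K as reported, PSNR Y mean in dB)
results = [
    ("No grain", 10640, 43.4),
    ("AddGrain(2)", 12347, 40.7),
    ("AddGrain(5)", 17807, 38.7),
    ("AddGrain(10)", 33969, 36.7),
    ("AddGrain(20)", 69177, 35.4),
]

base_size, base_psnr = results[0][1], results[0][2]

# Size growth factor and PSNR change versus the no-grain encode
summary = [
    (label, size / base_size, psnr - base_psnr)
    for label, size, psnr in results
]

for label, growth, psnr_delta in summary:
    print(f"{label:13s} size x{growth:.2f}  PSNR {psnr_delta:+.1f} dB")
```

Note that the PSNR in these runs is computed by x264 against its noisy input, not against the clean source.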

In a 'perfect world', an MPEG encoder with perfect internal noise reduction should output a stable, lowest-possible bitrate at the highest PSNR, because the real clean content is unchanged and only additive random noise with zero mean (I hope) is added. Also, the quality metric (PSNR/SSIM/VIF/VMAF/...) of the MPEG encoding result must be computed against the 'clean source' from before the noise was added.

damian101

28th February 2023, 12:58

What does peak bitrate refer to exactly, and what is VBV in regards to VP9/AV1?

damian101

3rd March 2023, 11:49

SolLevante in under 2 Mbit/s: https://mega.nz/file/9LYXgY4b#xF1PHse3hW4uQxecGwu4rTZ6dkmU4398ShdgwtLqS2o
I did not limit peak bitrate in any way, but followed the other constraints of the challenge.

damian101

3rd March 2023, 12:06

Tears of Steel in under 1 Mbit/s (slightly above with container overhead): https://mega.nz/file/RLhSmZzb#QU0wLBz-gUqjr0JpnB4dLy0Tu4BC75zU1B2pp5QusFs
I again did not concern myself with peak bitrate here.

damian101

3rd March 2023, 15:05

And here's 2 Mbit/s Tears of Steel: https://mega.nz/file/RTwknTrZ#7AamrIj0Vn9nsMb0yHv1yY4fTuQyJfKQ_2B0WwJbWOk

benwaggoner

6th March 2023, 01:51

What does peak bitrate refer to exactly, and what is VBV in regards to VP9/AV1?
I don't know the libvpx/libaom syntax for peak bitrate control, but it should use whatever yields the same results as the specified --vbv-bufsize and --vbv-maxrate values.

damian101

16th March 2023, 19:42

After further optimization, I encoded again, and this time with bitrate constrained to 4000 kbit/s (it sometimes overshoots, but so does x265, should be quite comparable).

2 Mbit/s Tears of Steel:
https://mega.nz/file/1WJkVB4B#QpNxZsEAlXfKzYHOxdkmYKvAGIfGSecFA_Pbn0M3dyw

1 Mbit/s Tears of Steel:
https://mega.nz/file/8fYlFIIL#V7EmKPpK9VfXBRN0M5BLjq33itUzR466CYbLNohyk0I

2 Mbit/s SolLevante:
https://mega.nz/file/0GhAVJZR#5BQ9YU1aarBilmALkwH7NkDKW1tQaXbVIeS1HvwLq7E

1 Mbit/s SolLevante:
https://mega.nz/file/UWwDCBAZ#3edv_dteag28_nwnsJR-L-IVsnLL7J6I2vZrDEzXgFo

At least the 1Mbit/s SolLevante could have been better if I had adapted my parameters, but I wanted to keep parameters identical for all the samples, quality control aside of course, and I generally don't care much about extremely low quality targets. As a result, all the samples don't use some features like Wiener filter and CDEF, because I don't like how they tend to remove too much detail because they're guided by crappy MSE.

benwaggoner

17th March 2023, 01:54

Sol Levante is a pretty uniquely weird and challenging source. It does great in demonstrating how gracefully an encoder degrades when it's not possible to do a good job. And it is a severe test of 2-pass VBR, as half of it is really easy-to-encode credits.

rwill

10th March 2024, 10:11

benwaggoner

12th March 2024, 01:39

Oh well how time flies...

I did 1Mbit HEVC and VVC encodes of Tears of Steel and Sol Levante using my stuff..

https://drive.google.com/drive/folders/1_y9yduWjj89VkPfIxC9FmggDIV78HJc3?usp=sharing

Where I tried to make VVC look like HEVC.

I noticed that adding more of the advanced tools to VVC gives the picture some sort of 'plastic', 'female instagram model' or 'make up' look which I did not like at all.

So anyway, after years of ToS and Sol in SDR maybe some new sequence is in order... Ben?
StEM2 is some great footage that's pretty typical for modern 24p stuff without much grain:

https://theasc.com/society/stem2

Great licensing terms and available in a bunch of formats. And recent enough that encoders haven't been overtuned to death on it. Not great if you want to show off film grain synthesis, but great to show off how the encoding underneath that hood would work.

Thoughts?

rwill

12th March 2024, 18:21

So I got myself ASC_StEM2_178_UHD_ST2084_1000nits_Rec2020_Stereo_ProRes4444XQ.mov and converted it to yuv420p10le with ffmpeg like so:

../ffmpeg.exe -i <file> -pix_fmt yuv420p10le -acodec none -vcodec rawvideo -f rawvideo - | encoder...

and decided to do a HEVC Main10 HDR10 encode.

I tried to figure out a bitrate. Given the 1Mbit average target from ToS I figured going from 1920x800 to 3840x2160 .. that 4Mbit average would be somewhat reasonable and did an encode with a 120 pictures GOP, 12.5Mbit maxrate and 25Mbit buffer which more or less targets Level 5. I did not see any severe quality problems in the encode.
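
The reasoning behind those numbers can be laid out as a quick calculation. A rough Python sketch; the 24 fps frame rate is an assumption taken from the "24p" description of StEM2 earlier in the thread:

```python
def pixel_count(width, height):
    return width * height

# Resolution scaling from the Tears of Steel target to the StEM2 UHD target
scale = pixel_count(3840, 2160) / pixel_count(1920, 800)

tos_target_mbit = 1.0
linear_target_mbit = tos_target_mbit * scale   # if bitrate scaled 1:1 with pixels
chosen_target_mbit = 4.0                       # what was actually used

maxrate_mbit = 12.5
buffer_mbit = 25.0
buffer_seconds = buffer_mbit / maxrate_mbit    # buffer expressed as time at maxrate

fps = 24                                       # assumed from "24p"
gop_seconds = 120 / fps                        # 120-picture GOP

print(f"pixel ratio {scale:.1f}x -> linear target {linear_target_mbit:.1f} Mbit")
print(f"chosen {chosen_target_mbit} Mbit, buffer {buffer_seconds:.0f} s, GOP {gop_seconds:.0f} s")
```

So 4 Mbit is somewhat below a purely linear scaling by pixel count, which is the usual expectation since bitrate needs tend to grow more slowly than resolution.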

So I dropped the average rate to 2Mbit and started to see slight decreases in quality in flat areas.

If someone is interested, the encodes are at the following link:
https://drive.google.com/drive/folders/1G_dP4YOLvUjKVNM3TnHuHqeCifbAWKM9?usp=sharing

So while from a content perspective it may be challenging with the saturated colors and contrast, for a modern encoder it is not so much and it might be needed to decrease rate further to show encoder differences.

One thing I noticed is that film grain content can still trip up less tuned encoders with disastrous effects, but such content may be hard to get to the public for testing...

benwaggoner

13th March 2024, 17:31

Yeah, that's the thing about test content - you've got to pick what you want to test.

So, what kind of content are people interested in testing? We could potentially assemble a sequence of different kinds of content.

KarthikTdk

23rd May 2024, 10:36

Hi all, does anyone have C language code for H.265 (HEVC)?

rwill

12th August 2024, 19:16

Regarding Content ..

By lurking Reddit I stumbled upon the Google Search term "archive.org prores trailer". It appears some people have been ripping Digital Cinema Trailers to Prores and uploaded them to Archive.org.

Now I have no clue about the movies the trailers are about and there is no license attached, so it's kind of problematic .. but from the samples I pulled it looked like OK'ish recent movie content.

Sagittaire

31st October 2024, 02:00

Someone test H266 for this challenge?

benwaggoner

7th November 2024, 01:06

Someone test H266 for this challenge?
I'd welcome a contribution! VVCEnc seems to be the closest to a x26? sort of tool, but it's still not as refined as x265 was in 2015.

Sagittaire

7th November 2024, 21:26

I'd welcome a contribution! VVCEnc seems to be the closest to a x26? sort of tool, but it's still not as refined as x265 was in 2015.

Bad multithreading optimisation for VVCEnc but I have solution for that ... ;-)

Z2697

7th November 2024, 21:53

IIRC VVdeC/VVenC are optimized versions of VTM? Or one of them is.

rwill

29th January 2025, 05:55

Someone test H266 for this challenge?

Is this size easier to read? (https://forum.doom9.org/showthread.php?p=1998891#post1998891)

benwaggoner

29th January 2025, 21:14

Is this size easier to read? (https://forum.doom9.org/showthread.php?p=1998891#post1998891)
Yeah. I've hoped someone else would, but I may get curious enough to do it myself one of these days.

x265 and Beamr have also gotten better since the most recent HEVC submissions, so fresh ones of those would make sense.

Z2697

30th January 2025, 18:38

Yeah. I've hoped someone else would, but I may get curious enough to do it myself one of these days.

x265 and Beamr have also gotten better since the most recent HEVC submissions, so fresh ones of those would make sense.

I think the "core functions" of x265 haven't been updated for years, so there will be no difference if the parameters are identical.

benwaggoner

3rd February 2025, 18:38

I think the "core functions" of x265 haven't been updated for years, so there will be no difference if the parameters are identical.
There have been a decent number of minor bug fixes that can slightly improve quality since, say, five years ago. The bigger differences will be from us having learned more about how to tune x265, and having new parameters to use. That said, a lot of newer stuff is about improving quality @ perf, not improving quality at ultra placebo settings.

This test is, sheesh 6.5 years old now. 8-bit HD SDR really isn't the cutting edge anymore!

We should do a new challenge using StEM 2 as source (https://dpel.aswf.io/asc-stem2/).

For the modern era, what? Just throwing the below as a starting point for discussion:

StEM v2 "Quicktime with stereo, 1.78 @ UHD, 1000 nits ProRes4444XQ, Rec2020"
(should we make a shorter edited version for faster encoding and easier testing? The full thing is 23 minutes long)
5 Mbps ABR, 10 Mbps peak (challenging enough)
24.00 fps
3840x2160p
Frame resizing techniques allowed
Film grain synthesis allowed? A with-FGS and without-FGS version?
10-bit HDR PQ Rec. 2020 primaries limited range
2-5 second variable GOP duration (fixed 2 seconds is more challenging, but increasingly less mainstream)
Closed GOP for adaptive streaming (techniques like RADL allowed as long as each fragment is independently decodable)
Objective metrics don't matter, subjective evaluation only.
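
For a sense of scale, here is what those proposed numbers work out to per frame and per pixel. A quick Python sketch, assuming a 3840x2160 frame:

```python
FPS = 24
AVG_BPS = 5_000_000      # 5 Mbps ABR
PEAK_BPS = 10_000_000    # 10 Mbps peak
WIDTH, HEIGHT = 3840, 2160

bits_per_frame = AVG_BPS / FPS
bits_per_pixel = bits_per_frame / (WIDTH * HEIGHT)
peak_to_average = PEAK_BPS / AVG_BPS

# 2-5 second variable GOP expressed in frames
gop_frames = (2 * FPS, 5 * FPS)

print(f"average: {bits_per_frame:,.0f} bits/frame, {bits_per_pixel:.4f} bits/pixel")
print(f"peak/average ratio: {peak_to_average:.1f}")
print(f"GOP length: {gop_frames[0]}-{gop_frames[1]} frames")
```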

How's that look? Any questions or suggested modifications? This represents all of 5 minutes of my consideration, so I hope there are!

tormento

3rd February 2025, 19:32

Film grain synthesis allowed? A with-FGS and without-FGS version?
I haven't yet understood if FGS is really a thing or not in x265 and, actually, I don't know of players that can recreate it when reproducing HEVC video.

Does it work as in AV1, where the grain is removed and then reconstructed at playback, or in some other way?

Can you give us some examples of how to create a proper FGS video with x265?

Z2697

5th February 2025, 08:03

126 GiB, that's a challenge by itself :p

Z2697

5th February 2025, 10:02

Eh, I downloaded it and it's 17:26 long, did I download the wrong one?

The pixel format is YUV444P12, shall we specify the method of subsampling and dithering? Or even encode as is?

excellentswordfight

5th February 2025, 10:35

126 GiB, that's a challenge by itself :p
Eh, I downloaded it and it's 17:26 long, did I download the wrong one?

The pixel format is YUV444P12, shall we specify the method of subsampling and dithering? Or even encode as is?
Sounds like you downloaded the IMF-version (J2K compression). And yes that is the length of it.

I think StEM2 is a pretty neat source; it's very representative of "modern content", i.e. HDR UHD shot digitally with a rather clean image. I also like that there are both SDR and HDR versions available, which can make for some nice comparisons. My only complaint would be that the credits are a fairly big portion of the title.

I have been using the IMF versions with this commandline:

"ffmpeg.exe" -probesize 1000MB -ss 00:00:08 -i "hdr\StEM2_HDR_Rec2020PQ_444F_IMF_2160p24_178.mxf" -vf scale=out_color_matrix=bt2020ncut_h_chr_pos=0ut_v_chr_pos=0 -pix_fmt yuv420p10le -an -f yuv4mpegpipe -strict -1 - | "x265.exe" --y4m ...

And using the SDR version for 1080p tests:

"ffmpeg.exe" -probesize 1000MB -ss 00:00:08 -i "sdr\StEM2_SDR_Rec709_444F_IMF_2160p24_178.mxf" -s 1920x1080 -sws_flags spline -pix_fmt yuv420p10le -an -f yuv4mpegpipe -strict -1 - | "x264.exe" --demuxer y4m ...

rwill

5th February 2025, 10:51

We should do a new challenge using StEM 2 as source (https://dpel.aswf.io/asc-stem2/).

I don't know about the 5Mbit.. have you seen my StEM 2 HEVC encodes on the previous page? Either I just cannot see the problems at 4Mbit or 4Mbit was too much already.

We need a Video Buffer Size too.. I suggest 2sec of peak rate.

Z2697 already raised a valid concern about 444 -> 420 and bitdepth reduction being an undefined preprocessing step. So someone has to make a 10bit 420 of the source or alternatively just let everyone do their own thing. Or just define some recent ffmpeg version + cmdline to get to yuv420p10le.

@Z2697: 17:26 min duration sounds about right.

Z2697

5th February 2025, 11:23

I downloaded the ProRes4444XQ 4K HDR version, the IMF version is much larger.

A fairly big portion of the file is credits, that's "time speaking", the bitrate of the credits portion is much lower compared to the main portion.
That's probably also why the seemingly low bitrate looks OK: it's averaged out by the credits part, so the main part gets a somewhat higher average bitrate.

If the encoding time is a concern, perhaps we can cut the credits part off... however I'm not sure if it will violate the license, it does say "Redistributions of these digital assets or any part of them must include the above copyright notice" though.

excellentswordfight

5th February 2025, 11:47

I downloaded the ProRes4444XQ 4K HDR version, the IMF version is much larger.

Is it? StEM2_HDR_Rec2020PQ_444F_IMF_2160p24_178.mxf is 124GiB over here. Maybe they have updated the packages? Or maybe the IMF download includes both 4K and UHD versions, it was a few years ago I downloaded it.

I haven't encoded the UHD HDR version at bitrates below 10 Mbps, but I agree with the points above: as the source is rather easy to compress, we need to choose a bitrate where modern codecs struggle, or at least show differences that are relevant under normal viewing conditions. Has someone done a sanity check at 5 Mbps?

edit.
I also now remembered this post i made: https://forum.doom9.org/showthread.php?p=1977513#post1977513 and especially this:

"Its still a bit interesting that even now when when I convert it in rather controlled fassion, there is both a luminance shift and colorshift (mostly reds as usual) compared to the Prores & AVC version that they offer. But I also saw encoding errors on the prores version, so Im not sure how carefully they have treated those versions... "

That was for the SDR version, mind you, but you might want to double-check the ProRes HDR source as well (this is how the SDR ProRes looked: https://ibb.co/MCFvGF7, and this is how it should look: https://ibb.co/dkNhgqB).

So please be wary of that. If I remember correctly, there is also at least one bad frame in all sources that looks like an in-camera/capture issue.

rwill

5th February 2025, 12:37

Is it? StEM2_HDR_Rec2020PQ_444F_IMF_2160p24_178.mxf is 124GiB over here.

I haven't encoded the UHD HDR version at bitrates below 10 Mbps, but I agree with the points above: as the source is rather easy to compress, we need to choose a bitrate where modern codecs struggle, or at least show differences that are relevant under normal viewing conditions. Has someone done a sanity check at 5 Mbps?

You can always click on the link Ben has in his post to see the more or less official releases. There the ProRes is 126 GB and the IMF is 118, 243 or 554 GB.

Regarding a 5Mbit encode you can, as I stated above, always go one page back in this thread and check out my 4 and 2 Mbit encode.

excellentswordfight

5th February 2025, 14:05

You can always click on the link Ben has in his post to see the more or less official releases. There the ProRes is 126 GB and the IMF is 118, 243 or 554 GB.

The 118 GB IMF is the SDR version (which I have also downloaded); I was more confused that none of them matched the size of the 124 GiB version that I have. But I edited my post: I think the 243 GB one might include both a (DCI) 4K and a UHD version.

Edit: Yes, that's the case. I still had the XML files, and they reference a 4K file as well: StEM2_HDR_Rec2020PQ_444F_IMF_1716p24_239.mxf, along with StEM2_HDR_Rec2020PQ_444F_IMF_2160p24_178 (the one that I kept).

excellentswordfight

7th February 2025, 13:42

I did a "baseline" encode using your criteria @benwaggoner (I selected a fixed 4 s GOP of IDR I-frames, with adaptive placement of non-IDR I-frames), with just straight-up preset slower and no tweaking.

"x265.exe" --y4m --preset slower --profile main10 --level-idc 50 --bitrate 5000 --vbv-maxrate 10000 --vbv-bufsize 10000 --keyint 96 --min-keyint 96 --rc-lookahead 96 --no-open-gop --pass 1 --hdr10-opt --range limited --colorprim bt2020 --transfer smpte2084 --colormatrix bt2020nc --master-display "G(8500,39850)B(6550,2300)R(35400,14600)WP(15635,16450)L(10000000,50)" - -o NUL
"x265.exe" --y4m --preset slower --profile main10 --level-idc 50 --bitrate 5000 --vbv-maxrate 10000 --vbv-bufsize 10000 --keyint 96 --min-keyint 96 --rc-lookahead 96 --no-open-gop --pass 2 --hdr10-opt --range limited --colorprim bt2020 --transfer smpte2084 --colormatrix bt2020nc --master-display "G(8500,39850)B(6550,2300)R(35400,14600)WP(15635,16450)L(10000000,50)" - -o out
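
For anyone adapting these command lines, the --keyint and --master-display values can be derived rather than copied. A small Python sketch; the helper names are my own, and it assumes the units x265 documents for --master-display (chromaticity in steps of 0.00002, luminance in steps of 0.0001 cd/m2). It reproduces the values used above from the BT.2020 primaries, a D65 white point, and a 1000 / 0.005 nit luminance range:

```python
CHROMA_STEP = 0.00002   # x265 --master-display chromaticity unit
LUMA_STEP = 0.0001      # x265 --master-display luminance unit, in cd/m2

def keyint_frames(gop_seconds, fps):
    """Fixed GOP length in frames for --keyint / --min-keyint."""
    return round(gop_seconds * fps)

def master_display(green, blue, red, white, max_nits, min_nits):
    """Build the --master-display string from CIE xy coordinates and nits."""
    def xy(point):
        return "(%d,%d)" % (round(point[0] / CHROMA_STEP),
                            round(point[1] / CHROMA_STEP))
    lum = "(%d,%d)" % (round(max_nits / LUMA_STEP), round(min_nits / LUMA_STEP))
    return "G" + xy(green) + "B" + xy(blue) + "R" + xy(red) + "WP" + xy(white) + "L" + lum

keyint = keyint_frames(gop_seconds=4, fps=24)
md = master_display(
    green=(0.170, 0.797), blue=(0.131, 0.046), red=(0.708, 0.292),  # BT.2020
    white=(0.3127, 0.3290),                                          # D65
    max_nits=1000, min_nits=0.005,
)
print(keyint)
print(md)
```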

https://limewire.com/d/3010926b-e7fa-47d0-b13d-3d09a80569cc#oIw07--JjhlUvwhIxHu7gxvvemAQjThVOwMp-JGr6qk

And yes, we are going to need a harder sample that stresses the encoder more. Even at 5 Mbps, which is a relatively low bitrate for UHD, it's going to be hard to see any meaningful differences, at least for good encoders.

benwaggoner

7th February 2025, 19:33

I haven't yet understood if FGS is really a thing or not in x265 and, actually, I don't know of players that can recreate it when reproducing HEVC video.

Does it work as in AV1, where the grain is removed and then reconstructed at playback, or in some other way?

Can you give us some examples of how to create a proper FGS video with x265?
Yeah, it would work the same as AV1, using a similar workflow.

x265 can mux FGS metadata from a sidecar file, so it is a thing to that degree. You'll need a tool to create the FGS metadata and players that can correctly parse that and generate the FGS overlay, which is the harder part. AV1 is really the only codec where most player software implements the FGS rendering.

There's been discussion of using either the old MPEG AVC FGS technology or the AV1 one with HEVC, VVC, and other codecs. The AV1 one is "funky" in that a lot of random seeds produce weird patterning. Tools are getting much improved year-on-year, but I'm not aware of any mainstream streaming service using AV1 FGS yet; some early devices shipped with defective implementations, not all of which have been patched since.

I've not spent enough time with practical implementations of the MPEG approach to know how well it compares, or how good the available tools are. The hardest part is the grain-parametrization-and-removal analysis preprocessing in the end. Grain removal hasn't historically been something you'd batch process whole titles through. AI is proving to be a good tool to use in this, and I think it's on the cusp of being production ready.

benwaggoner

7th February 2025, 19:40

I don't know about the 5Mbit.. have you seen my StEM 2 HEVC encodes on the previous page? Either I just cannot see the problems at 4Mbit or 4Mbit was too much already.

We need a Video Buffer Size too.. I suggest 2sec of peak rate.
I was thinking the lower of 25 Mb (max VBV for HEVC Level 5.0) or the max GOP length at peak rate. Larger allows for better rate control and multipass/lookahead encoding to improve quality. Of course, lower also stresses how well the codec handles QP spikes. Open to discussion here.
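
That rule is simple enough to write down. A minimal Python sketch, using the 25 Mb Level 5.0 cap quoted above and the 10 Mbps peak and 2-5 second GOP range from the proposed spec; the function name is made up for illustration:

```python
LEVEL_5_0_MAX_VBV_KBIT = 25_000   # cap quoted above for HEVC Level 5.0

def vbv_bufsize_kbit(peak_kbps, max_gop_seconds,
                     level_cap_kbit=LEVEL_5_0_MAX_VBV_KBIT):
    """Lower of the level cap and one maximum-length GOP at peak rate."""
    return min(level_cap_kbit, peak_kbps * max_gop_seconds)

# 10 Mbps peak with the longest allowed GOP (5 s): the level cap wins
print(vbv_bufsize_kbit(10_000, 5))
# 10 Mbps peak with a fixed 2 s GOP: the GOP term wins, which matches
# the "2 sec of peak rate" suggestion
print(vbv_bufsize_kbit(10_000, 2))
```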

Z2697 already raised a valid concern about 444 -> 420 and bitdepth reduction being an undefined preprocessing step. So someone has to make a 10bit 420 of the source or alternatively just let everyone do their own thing. Or just define some recent ffmpeg version + cmdline to get to yuv420p10le.
For the first test I made a canonical .y4m to download, which could make sense to eliminate any risk of divergent processing. And probably edit it down some from the full length (speed up the credits?). That length wouldn't be a serious issue with HEVC or AV1 these days, but could be for VVC and AV2 due to lack of encoder maturity.

We need to be cognizant of the specifics of the Creative Commons license being used.

benwaggoner

7th February 2025, 19:42

I haven't encoded the UHD HDR version at bitrates below 10 Mbps, but I agree with the points above: as the source is rather easy to compress, we need to choose a bitrate where modern codecs struggle, or at least show differences that are relevant under normal viewing conditions. Has someone done a sanity check at 5 Mbps?

Good point. Should we identify some content with film grain? Alternatively, we could argue that in the future everything would use FGS, so how we encode no-low grain content is really the important question.

Thoughts? Any alternatives to suggest?

tormento

8th February 2025, 15:24

Yeah, it would work the same as AV1, using a similar workflow.
Given your last paragraph conclusion, do you have any idea about how to do it in real terms? Having working grain generation in hevc would be really a gamechanger.

benwaggoner

10th February 2025, 20:33

Given your last paragraph conclusion, do you have any idea about how to do it in real terms? Having working grain generation in hevc would be really a gamechanger.
I haven't messed with tools for the MPEG FGS since the HD-DVD era.

tormento

10th May 2025, 13:09

Still trying to understand how to properly use grain removal and synthesis with x265.

Any help will be really appreciated.

benwaggoner

12th May 2025, 16:46

Still trying to understand how to properly use grain removal and synthesis with x265.

Any help will be really appreciated.
Given effectively no real-world decoders support HEVC grain synthesis, the most help I could offer is avoiding that path ;).

Note that you can use AV1 FGS tools with HEVC using the AFGS1 standard. It's actually better than the original AV1 FGS as it allows the grain rendering resolution to be at display resolution while keeping grain size consistent between different resolutions. It's a relatively easy feature to implement for hardware that has both an HEVC and an AV1 decoder, as AV1 FGS is entirely out-of-band preprocessing and postprocessing.

tormento

12th May 2025, 19:18

Note that you can use AV1 FGS tools with HEVC using the AFGS1 standard.
Can you please explicit this sentence? Explain me like the noob that I am :)

Boulder

12th May 2025, 19:21

With x265, the same grain table file that can be used with SVT-AV1 or aomenc, does not work.

benwaggoner

13th May 2025, 20:02

With x265, the same grain table file that can be used with SVT-AV1 or aomenc, does not work.
That would need specific AFGS1 support in x265, and more importantly players that use that metadata. Neither are hard, but are still work that needs to be done. I don't see any commits for it in x265 yet.

benwaggoner

13th May 2025, 20:15

Can you please explicit this sentence? Explain me like the noob that I am :)
This can probably help: https://aomedia.org/blog%20posts/new-film-grain-synthesis-specification-now-available/

Big picture, AV1 has an out-of-loop film grain synthesis feature. The workflow is that a grainy source has its grain removed, and parameters that describe the removed grain are generated. Then the de-grained video frames are encoded as normal. The grain metadata is inserted into the final bitstream like any other time-based metadata.

On playback, the decoder decodes the AV1 stream as normal. If there is grain metadata, then it synthesizes new grain hopefully matching the removed grain, which is then composited on top of the decoded video.

Note in the above, the actual codec itself has nothing to do with grain removal or synthesis. It's preprocessing, metadata, and postprocessing, entirely out of the codec's compression/decompression loop.

Because of that the AV1 grain removal can be used with any input video, and the synthesis applied to the decoded frames of any codec.
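
The shape of that workflow can be shown with a toy example. This is a deliberately simplified Python sketch, not how AV1 FGS actually models grain: it works on a 1-D list of samples, uses a moving average as a stand-in for real grain removal, and describes the grain with a single standard deviation instead of the real parameter set. All function names are made up for illustration.

```python
import random
import statistics

def degrain(samples, radius=4):
    """Moving-average filter standing in for a real grain-removal step."""
    out = []
    for i in range(len(samples)):
        lo = max(0, i - radius)
        hi = min(len(samples), i + radius + 1)
        window = samples[lo:hi]
        out.append(sum(window) / len(window))
    return out

def analyze_grain(source, degrained):
    """Describe the removed grain with one parameter: its standard deviation."""
    removed = [s - d for s, d in zip(source, degrained)]
    return {"sigma": statistics.pstdev(removed)}

def synthesize_grain(decoded, metadata, seed):
    """Composite freshly generated grain on top of the decoded samples."""
    rng = random.Random(seed)
    return [d + rng.gauss(0.0, metadata["sigma"]) for d in decoded]

# Encoder side: grainy source -> de-grained samples plus grain metadata
rng = random.Random(1)
source = [100.0 + rng.gauss(0.0, 4.0) for _ in range(2000)]  # flat scene + grain
clean = degrain(source)
metadata = analyze_grain(source, clean)

# Any codec could sit here; this sketch pretends the encode is lossless.
decoded = clean

# Player side: new grain that statistically matches what was removed
output = synthesize_grain(decoded, metadata, seed=2)
print(f"grain sigma carried as metadata: {metadata['sigma']:.2f}")
```

The point of the sketch is only the data flow: the codec in the middle never sees the grain, and the player never sees the original grain, only its description.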

AFGS1 is AOM's specification for doing that. It also includes another grain synthesis mode where the synthesis is done at the display resolution, not the decoded frame resolution.

This fixes a big oversight in the original design. When the grain is rendered at the decoded frame resolution, it loses detail at lower resolutions, and it also needs to be generated for each input resolution.

That meant that fine source grain would become coarse at low bitrates and resolutions. Grain detail would change a lot in adaptive streaming, and the goal of preserving the original grain would be lost when encoding at anything but the source resolution. With the new, AFGS1-only mode, grain can always be rendered in the same way at the display resolution, so grain detail doesn't change when the bitrate does, making for a much more consistent and accurate image.

This can be especially great for content that is mostly noise, little signal, like 80's Super35 movies. At UHD, something like Ghostbusters is pretty much 720p content with a 2160p layer of grain on top of it. So Ghostbusters can be encoded at the resolution the de-grained content needs and the grain can be reconstructed in full detail.

This is all awesome, but still somewhat theoretical as AFGS1 implementations aren't really in the wild yet, and a number of early AV1 implementations shipped with defective film grain synthesis. So no one is really delivering content that uses FGS at scale yet. But the promise in terms of improved experience and bitrate reduction for the hardest to encode content is huge, so I expect it to be used a lot as the ecosystem matures.