Doom9's Forum > Video Encoding > High Efficiency Video Coding (HEVC)
Old 12th February 2023, 04:25   #201  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
Quote:
Originally Posted by DTL View Post
As I see it, modern video cameras for capturing 'real world' images have made good progress with internal denoisers, so broadcasting really is moving in this direction. When you buy a new set of lower-noise cameras for your studio, you also get a great benefit in the quality of the MPEG-compressed content for broadcast. But the same effect can be achieved with a (much cheaper) denoising hardware unit in front of your master MPEG coder, or even with zero-cost software if you do file-based processing with a software MPEG encoder.

Internally, both a temporal denoiser and an MPEG encoder are based on the same idea of motion tracking (block- or object-based), so they could share hardware for motion estimation (and the MPEG encoder might reuse the denoiser's motion estimation). Maybe this is already implemented in codecs after HEVC, like AV1.

So the 'preprocessing' between the real world and the MPEG coder is an essential part of the total moving-picture compression process: it removes 'real world' random data and separates the actually required visual information about the scene from the random noise of the photon-flux transfer medium.
An inter-frame 'denoiser' simply simulates a video camera with a longer accumulation time per scene object compared with the primary frame-based camera. The primary camera (without internal inter-frame digital denoising) is limited to the inter-frame interval for accumulating photons, and it cannot track individual scene objects.

A 'motion compensated denoiser' can extend the accumulation time to roughly the total visibility time of an object within the cut-scene, and can track each scene object individually when it is not static relative to the primary camera. It thus simulates a massive array of 'secondary video cameras' (one per block in a block-based denoiser), each with individual tracking and a much longer accumulation time, without motion blur.

After this two-stage transform (physical camera plus simulated secondary cameras), you feed cleaner scene data to the MPEG encoder and get better output quality, because the encoder can now spend its bits on encoding the real scene objects rather than on the residual noise left after incomplete motion compensation of noisy blocks.
Yeah, the line between encoding and preprocessing was never all that clear, and certainly has been becoming less so as codecs and encoders advance.

Still, allowing unfettered preprocessing seems risky; techniques like the contrast boosting some encoders used to goose VMAF scores are exactly the hazard.

The spirit of the test is "what can deliver output most like the source" - which has unavoidable subjective elements.

I'm open to suggestions on how best to address this in an updated version of the challenge.

I'm thinking using StEM2 10-bit, with separate 1080p SDR and 2160p HDR targets. Thoughts?
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 23rd February 2023, 11:51   #202  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,074
Quote:
Originally Posted by benwaggoner View Post
I'm open to suggestions on how best to address this in an updated version of the challenge.
The idea is to take a noise-free source, add some 'natural' noise, test the encoder, and compare the result against the 'clean' source.

As natural photon-shot noise has a Poisson distribution at very low photon counts and is close to Gaussian at medium and high counts, we can simulate it more or less acceptably with the AddGrain AVS plugin (which promises Gaussian noise).
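A quick sanity check of that approximation (my own illustration, not from the post): a Poisson distribution's variance equals its mean, so at a bright-pixel photon count the matching Gaussian uses sigma = sqrt(mean), and the two distributions become hard to tell apart:

```python
import numpy as np

rng = np.random.default_rng(0)
mean_photons = 1000  # a reasonably bright pixel

# Photon-shot noise is Poisson; at this count it is close to Gaussian
poisson = rng.poisson(mean_photons, 100_000)
# Matched Gaussian: sigma = sqrt(mean), since Poisson variance = mean
gauss = rng.normal(mean_photons, np.sqrt(mean_photons), 100_000)

# Both standard deviations come out near sqrt(1000) ~ 31.6
print(round(poisson.std(), 1), round(gauss.std(), 1))
```

At very low means (a few photons) the Poisson skew becomes visible and the Gaussian model, and hence AddGrain, is only a rough stand-in.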

For a noise-free source we can either take a 100% synthetic render (not a very realistic natural source) or try to get natural camera shots that are as clean as possible.
As an additional noise-cleaning step, downsizing may be used. An 8x downsize adds about +18 dB to the SNR (relative to photon-shot noise), which for 8-bit, and maybe 10-bit, gets good daylight camera shots at 0 dB (or even negative) gain close to quantization-noise-only.
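The +18 dB figure is consistent with simple averaging: an 8x downsize averages on the order of an 8x8 = 64-sample neighborhood, cutting the noise standard deviation by sqrt(64) = 8, i.e. 20*log10(8) ≈ 18.06 dB. A quick check under a box-filter assumption (my own sketch; real resamplers weight their taps, so the gain differs slightly):

```python
import math

import numpy as np

n = 8 * 8  # samples averaged per output pixel in an 8x box downsize
predicted_db = 20 * math.log10(math.sqrt(n))
print(round(predicted_db, 2))  # 18.06 dB

# Empirical check: box-average 8x8 blocks of unit Gaussian noise
rng = np.random.default_rng(0)
noise = rng.normal(0, 1, (2048, 2048))
down = noise.reshape(256, 8, 256, 8).mean(axis=(1, 3))
measured_db = 20 * math.log10(noise.std() / down.std())
print(round(measured_db, 2))  # close to the predicted 18.06
```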

And to arrange a not-too-small frame for testing, the downsized parts of the whole clip may be tiled. This also creates an 'average complexity' frame stream from the whole clip (which typically contains cut-scenes of differing complexity) that is short enough to encode.

So with the current 3.7.3 test 6 AVS+ release from https://forum.doom9.org/showthread.p...50#post1983250
the following source cleaning and conditioning script was created:
Code:
LoadPlugin("ffms2.dll")
LoadPlugin("avsresize.dll")
LoadPlugin("AddGrainC.dll")

FFmpegSource2("04.ts")

ConvertBits(16)

LetterBox(8, 8, 8, 8)

fc=last.FrameCount
fco=fc/16

drw=last.width / 8
drh=last.height / 8

c00=Trim(0 * fco, fco)
c01=Trim(1 * fco, 1 * fco + fco)
c02=Trim(2 * fco, 2 * fco + fco)
c03=Trim(3 * fco, 3 * fco + fco)

c10=Trim(4 * fco, 4 * fco + fco)
c11=Trim(5 * fco, 5 * fco + fco)
c12=Trim(6 * fco, 6 * fco + fco)
c13=Trim(7 * fco, 7 * fco + fco)

c20=Trim(8 * fco, 8 * fco + fco)
c21=Trim(9 * fco, 9 * fco + fco)
c22=Trim(10 * fco, 10 * fco + fco)
c23=Trim(11 * fco, 11 * fco + fco)

c30=Trim(12 * fco, 12 * fco + fco)
c31=Trim(13 * fco, 13 * fco + fco)
c32=Trim(14 * fco, 14 * fco + fco)
c33=Trim(15 * fco, 15 * fco + fco)

bp=105 # average sharpness
cp=0

c00=UserDefined2Resize(c00, drw, drh, b=bp, c=cp)
c01=UserDefined2Resize(c01, drw, drh, b=bp, c=cp)
c02=UserDefined2Resize(c02, drw, drh, b=bp, c=cp)
c03=UserDefined2Resize(c03, drw, drh, b=bp, c=cp)

c10=UserDefined2Resize(c10, drw, drh, b=bp, c=cp)
c11=UserDefined2Resize(c11, drw, drh, b=bp, c=cp)
c12=UserDefined2Resize(c12, drw, drh, b=bp, c=cp)
c13=UserDefined2Resize(c13, drw, drh, b=bp, c=cp)

c20=UserDefined2Resize(c20, drw, drh, b=bp, c=cp)
c21=UserDefined2Resize(c21, drw, drh, b=bp, c=cp)
c22=UserDefined2Resize(c22, drw, drh, b=bp, c=cp)
c23=UserDefined2Resize(c23, drw, drh, b=bp, c=cp)

c30=UserDefined2Resize(c30, drw, drh, b=bp, c=cp)
c31=UserDefined2Resize(c31, drw, drh, b=bp, c=cp)
c32=UserDefined2Resize(c32, drw, drh, b=bp, c=cp)
c33=UserDefined2Resize(c33, drw, drh, b=bp, c=cp)


r0=StackHorizontal(c00, c01, c02, c03)
r1=StackHorizontal(c10, c11, c12, c13)
r2=StackHorizontal(c20, c21, c22, c23)
r3=StackHorizontal(c30, c31, c32, c33)

StackVertical(r0, r1, r2, r3)

ConvertBits(8) # use target bitdepth conversion

#AddGrain(10)

Trim(0, 5000)

Prefetch(2)
This creates a low-noise test clip with a 1920x1080 output frame from 4K footage. The b/c parameters for the downsampler were chosen for moderate 'video-makeup' sharpness, so that the transients to black between the 'picture in picture' elements do not overshoot too much. LetterBox in AVS still has issues with transient creation (https://github.com/AviSynth/AviSynthPlus/issues/339). The script runs slowly enough that it is better to save the result to a losslessly compressed 16-bit-per-sample file (e.g. FFV1 version 3 from ffmpeg, yuv420p16le) and use that for testing MPEG encoders (after dithering, or not, to a target bit depth of 8, 10 or more).
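The geometry the script produces can be cross-checked with a little arithmetic (assuming a 3840x2160 source; the LetterBox borders do not change the frame size): each tile is the source downsized 8x, and the 4x4 grid of tiles rebuilds a half-size frame:

```python
src_w, src_h = 3840, 2160   # UHD source
down, grid = 8, 4           # 8x downsize, 4x4 tiling as in the script

tile_w, tile_h = src_w // down, src_h // down
out_w, out_h = tile_w * grid, tile_h * grid
print(tile_w, tile_h)  # 480 270
print(out_w, out_h)    # 1920 1080
```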

The test footage was the 4K UHD SDR series 'A Wild Year on Earth' (Northern Pictures, UK, 2020), https://www.imdb.com/title/tt13715870/, episode 04: https://www.imdb.com/title/tt13716068/?ref_=ttep_ep4 .

For faster speed (and because 10-bit x265 again would not start via avs2pipemod for me), the MPEG encoder was 8-bit x264 with these settings:
Code:
--profile high --crf 18 --ref 4 -b 12 --direct auto --me esa --subme 11 --no-fast-pskip --trellis 2 
--merange 24 --deblock -2:-2 --b-adapt 2 --transfer "bt709" --colorprim "bt709" --colormatrix "bt709" --psnr
Results for different AddGrain(N) settings are:
No grain: 10640K, PSNR Y Mean 43.4 dB, encoding performance about 5 fps
AddGrain(2): 12347K, PSNR Y Mean 40.7 dB, encoding performance 4.34 fps
AddGrain(5): 17807K, PSNR Y Mean 38.7 dB, encoding performance 3.43 fps
AddGrain(10): 33969K, PSNR Y Mean 36.7 dB, encoding performance 2.85 fps
AddGrain(20): 69177K, PSNR Y Mean 35.4 dB, encoding performance 2.42 fps
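For scale, the bitrate inflation relative to the no-grain encode, computed from the figures above:

```python
base_kbps = 10640  # the no-grain encode above
results = {2: 12347, 5: 17807, 10: 33969, 20: 69177}  # AddGrain(N) -> kbps

factors = {g: round(k / base_kbps, 2) for g, k in results.items()}
print(factors)  # {2: 1.16, 5: 1.67, 10: 3.19, 20: 6.5}
```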

So residual noise in the content fed to an MPEG encoder significantly changes the output bitrate under fixed CRF-type encoding (and even at the higher bitrate, the PSNR is still lower).

In a 'perfect world', an MPEG encoder with perfect internal noise reduction should output a stable, lowest-possible bitrate at the highest PSNR, because the real clean content is unchanged; only additive random noise with (I hope) zero mean was added. Also, the quality metric (PSNR/SSIM/VIF/VMAF/...) of the encoding result must be computed against the 'clean source' from before noise addition.
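That last point can be made concrete with a small PSNR helper (an illustrative numpy sketch; the array size and noise sigma are my own choices, not from the post). The key is which reference you pass in: scoring against the clean original rather than the noisy encoder input is the comparison proposed above.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """PSNR of `test` against reference `ref` (both treated as float arrays)."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak * peak / mse)

rng = np.random.default_rng(1)
clean = rng.integers(16, 236, (64, 64)).astype(float)   # stand-in 'clean source'
noisy = clean + rng.normal(0, 5, clean.shape)           # source after noise addition

# Score against `clean`, not against `noisy`:
print(round(psnr(clean, noisy), 1))  # around 34 dB for sigma=5 noise at 8-bit peak
```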

Last edited by DTL; 23rd February 2023 at 13:08.
Old 28th February 2023, 12:58   #203  |  Link
damian101
Registered User
 
Join Date: Feb 2021
Location: Germany
Posts: 18
What does peak bitrate refer to exactly, and what is VBV in regards to VP9/AV1?
Old 3rd March 2023, 11:49   #204  |  Link
damian101
Registered User
 
Join Date: Feb 2021
Location: Germany
Posts: 18
SolLevante in under 2 Mbit/s: https://mega.nz/file/9LYXgY4b#xF1PHs...398ShdgwtLqS2o
I did not limit peak bitrate in any way, but followed the other constraints of the challenge.
Old 3rd March 2023, 12:06   #205  |  Link
damian101
Registered User
 
Join Date: Feb 2021
Location: Germany
Posts: 18
Tears of Steel in under 1 Mbit/s (slightly above with container overhead): https://mega.nz/file/RLhSmZzb#QU0wLB...5zU1B2pp5QusFs
I again did not concern myself with peak bitrate here.

Last edited by damian101; 3rd March 2023 at 12:11.
Old 3rd March 2023, 15:05   #206  |  Link
damian101
Registered User
 
Join Date: Feb 2021
Location: Germany
Posts: 18
And here's 2 Mbit/s Tears of Steel: https://mega.nz/file/RTwknTrZ#7AamrI...fKQ_2B0WwJbWOk
Old 6th March 2023, 01:51   #207  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
Quote:
Originally Posted by damian101 View Post
What does peak bitrate refer to exactly, and what is VBV in regards to VP9/AV1?
I don't know the libvpx/libaom syntax for peak bitrate control, but it should use whatever yields the same results as the specified --vbv-bufsize and --vbv-maxrate values.
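For context on what VBV constrains: it is a leaky-bucket model of the hypothetical decoder buffer, defined by a fill rate (the maxrate) and a capacity (the bufsize). A toy compliance check, my own sketch rather than any encoder's actual code:

```python
def vbv_compliant(frame_bits, maxrate_bps, bufsize_bits, fps):
    """Leaky-bucket check: bits arrive at maxrate, each frame instantly
    drains its coded size; an underflow means the peak constraint broke."""
    fill = bufsize_bits  # assume the buffer starts full
    per_frame = maxrate_bps / fps
    for bits in frame_bits:
        fill = min(fill + per_frame, bufsize_bits)  # refill, capped at bufsize
        fill -= bits
        if fill < 0:
            return False
    return True

# A steady 1 Mbit/s stream at 25 fps fits a 1 Mbit/s maxrate, 2 Mbit buffer...
print(vbv_compliant([40_000] * 100, 1_000_000, 2_000_000, 25))  # True
# ...but a single 3 Mbit spike underflows the buffer
print(vbv_compliant([40_000] * 50 + [3_000_000], 1_000_000, 2_000_000, 25))  # False
```

So "peak bitrate" in the challenge is shorthand for this pair of constraints, whatever knob names a given VP9/AV1 encoder exposes for them.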
Old 16th March 2023, 19:42   #208  |  Link
damian101
Registered User
 
Join Date: Feb 2021
Location: Germany
Posts: 18
After further optimization I encoded again, this time with the peak bitrate constrained to 4000 kbit/s (it sometimes overshoots, but so does x265, so the results should be quite comparable).

2 Mbit/s Tears of Steel:
https://mega.nz/file/1WJkVB4B#QpNxZs...ecFA_Pbn0M3dyw

1 Mbit/s Tears of Steel:
https://mega.nz/file/8fYlFIIL#V7EmKP...466CYbLNohyk0I

2 Mbit/s SolLevante:
https://mega.nz/file/0GhAVJZR#5BQ9YU...XbVIeS1HvwLq7E

1 Mbit/s SolLevante:
https://mega.nz/file/UWwDCBAZ#3edv_d...J6I2vZrDEzXgFo

At least the 1 Mbit/s SolLevante could have been better if I had adapted my parameters, but I wanted to keep the parameters identical for all the samples (quality control aside, of course), and I generally don't care much about extremely low quality targets. As a result, none of the samples use some features like the Wiener filter and CDEF; I don't like how they tend to remove too much detail, since they're guided by crappy MSE.

Last edited by damian101; 16th March 2023 at 19:50.
Old 17th March 2023, 01:54   #209  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
Sol Levante is a pretty uniquely weird and challenging source. It does a great job demonstrating how gracefully an encoder degrades when it's not possible to do a good job. And it's a severe test of 2-pass VBR, since half of it is really-easy-to-encode credits.
Old 10th March 2024, 10:11   #210  |  Link
rwill
Registered User
 
Join Date: Dec 2013
Posts: 349
Oh well how time flies...

I did 1Mbit HEVC and VVC encodes of Tears of Steel and Sol Levante using my stuff..

https://drive.google.com/drive/folde...c3?usp=sharing

Where I tried to make VVC look like HEVC.

I noticed that adding more of the advanced tools to VVC gives the picture some sort of 'plastic', 'female instagram model' or 'make up' look which I did not like at all.

So anyway, after years of ToS and Sol in SDR maybe some new sequence is in order... Ben?
Old 12th March 2024, 01:39   #211  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
Quote:
Originally Posted by rwill View Post
Oh well how time flies...

I did 1Mbit HEVC and VVC encodes of Tears of Steel and Sol Levante using my stuff..

https://drive.google.com/drive/folde...c3?usp=sharing

Where I tried to make VVC look like HEVC.

I noticed that adding more of the advanced tools to VVC gives the picture some sort of 'plastic', 'female instagram model' or 'make up' look which I did not like at all.

So anyway, after years of ToS and Sol in SDR maybe some new sequence is in order... Ben?
StEM2 is some great footage that's pretty typical for modern 24p stuff without much grain:

https://theasc.com/society/stem2

Great licensing terms and available in a bunch of formats. And recent enough that encoders haven't been overtuned to death on it. Not great if you want to show off film grain synthesis, but great for showing how the encoding under the hood works.

Thoughts?
Old 12th March 2024, 18:21   #212  |  Link
rwill
Registered User
 
Join Date: Dec 2013
Posts: 349
So I got myself ASC_StEM2_178_UHD_ST2084_1000nits_Rec2020_Stereo_ProRes4444XQ.mov and converted it to yuv420p10le with ffmpeg like so:

../ffmpeg.exe -i <file> -pix_fmt yuv420p10le -an -vcodec rawvideo -f rawvideo - | encoder...

and decided to do a HEVC Main10 HDR10 encode.

I tried to figure out a bitrate. Given the 1 Mbit average target for ToS, and going from 1920x800 to 3840x2160, I figured a 4 Mbit average would be somewhat reasonable, and did an encode with a 120-picture GOP, 12.5 Mbit maxrate and a 25 Mbit buffer, which more or less targets Level 5. I did not see any severe quality problems in the encode.
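As a sanity check on that extrapolation (my own arithmetic, not from the post): the UHD frame has 5.4x the pixels of ToS's 1920x800 active picture, so linear scaling of the 1 Mbit target would give 5.4 Mbit; since bitrate demand usually grows sub-linearly with resolution, rounding down to 4 Mbit is plausible:

```python
tos_px = 1920 * 800      # Tears of Steel active picture
uhd_px = 3840 * 2160     # StEM2 UHD frame
ratio = uhd_px / tos_px

print(ratio)                  # 5.4x the pixels
print(round(1.0 * ratio, 1))  # linear scaling of the 1 Mbit target: 5.4 Mbit
```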

So I dropped the average rate to 2Mbit and started to see slight decreases in quality in flat areas.

If someone is interested, the encodes are at the following link:
https://drive.google.com/drive/folde...M9?usp=sharing

So while from a content perspective it may be challenging, with the saturated colors and contrast, for a modern encoder it is not so much, and the rate might need to be decreased further to show encoder differences.

One thing I noticed is that film-grain content can still trip up less-tuned encoders with disastrous effects, but such content may be hard to get to the public for testing...
Old 13th March 2024, 17:31   #213  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,771
Yeah, that's the thing about test content - you've got to pick what you want to test.

So, what kind of content are people interested in testing? We could potentially assemble a sequence of different kinds of content.