View Full Version : Benwaggoner HEVC encoding challenge
The Beamr encodes look good - I spent some time this evening looking at the 1.5 Mbps encode. One area where I think x265 is still superior is in the scene with the robot legs around frame 1232 - the background has very distracting visual artifacts (mostly looking like mosquito noise) in the Beamr version that are not present in the x265 version.
I saw some similar issues in the scene around frame 4514, including a keyframe pop.
However, the tables turn in a dark scene around frame 3651 where x265 has similar artifacts in the background and on the sniper's face, but Beamr is quite good. Another example of this is in the (bright) scene around frame 7527 where Tom's face is more detailed and more stable in the Beamr version.
I need to spend more time reviewing these - but Beamr definitely makes a very good showing! :)
Thanks for doing this, Tom!
Thanks for the feedback Derek. These encodes were done with a pre-release (development) version of Beamr 5, v4.5. I'm in St. Petersburg Russia this week, working with our development team. We will take a look at your feedback and will produce a fresh set of encodes with the production release version, which is a few weeks away from being finished.
I can't find any robot legs or distracting artifacts around frame 1232... can you double-check the position of the issue you're talking about? I see the quality change in the wall behind the guy at frame 4519 (a keyframe), and some "live walls" behind the characters in the scene before that (frame 4136-4398).
By the way, the best way to compare video quality side by side is to use Beamr View, our frame-synchronized dual video player. I know Derek has a copy, but if any other video professionals need copies of Beamr View, ping me (tv at beamr dot com).
Blue_MiSfit
15th May 2019, 18:41
My bad - frame 11232
Emulgator
15th May 2019, 22:20
Beamr 4501: Impressive !
Until now I only looked at 4501 1Mbps, pixel peeping mode.
Remarkably kept/simulated texture, sometimes glued onto the underlying surface, but nicely done.
The expected artifacts at such low bitrate are unexpectedly small, and well shifted to domains where it doesn't hurt.
My favourite ToS warping artifact "robot arm moving before leaves" is very well suppressed,
as is all the contour warping stuff that x265 still has, well, at these misery bitrates.
Nice research, never believed that would be possible !
IgorC
26th May 2019, 21:08
Here are my Beamr 5 encodes (https://drive.google.com/open?id=12rOGjYJI2pmKUVZYgGiQD8E1awI7e75l) (the Google Drive includes copies of Ben's x265 UltraPlacebo encodes). I look forward to feedback.
Beamr looks absolutely better than x265 ver 2.8. Not sure how x265 3.0 would perform.
Also, a relevant article here
https://www.streamingmedia.com/Articles/Post/Blog/HEVC-IP-Owners-Are-Killing-the-Golden-Goose-Over-Royalties-131923.aspx
mandarinka
6th August 2019, 18:29
I didn't do any content specific encoding in these. I just did the slowest, highest quality encode 2-pass I had the patience for. Basically
--preset placebo
--cu-lossless
--tskip
-F 1
--ref 6
--bframes 16
--aq-mode 3
--rd-refine
If I had infinite patience I'd add --me sea and --subme 7 for PlusUltraPlacebo.
According to the x265 documentation, CU-lossless requires extra signaling, so it adds some bitrate overhead whether it is useful or not. It is usually nigh impossible to check whether it helps at higher bitrates, because rate control variation makes some frames look worse and some better, and you have no idea which is which. But I tried to measure SSIM at 10 Mbps and 18 Mbps with a Blu-ray source (1440x1080p24), and despite these high bitrates I got an SSIM drop from CU-lossless.
I observed SSIM dropping from 0.9872928 without cu-lossless to 0.9872879 with cu-lossless at 18mbits, and from 0.9820309 to 0.9820127 at 10mbits.
Are you sure using this tool on normal lossy encodes (as opposed to extreme-bitrate archival encodes or hybrid lossless) really helps visual quality per unit of bitrate? Or perhaps it helps with a clean uncompressed source but harms with compressed sources?
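To put the SSIM figures above on a more readable scale, here is a minimal sketch that converts them to dB. It assumes the usual -10*log10(1 - SSIM) mapping, which is consistent with the "(14.607 dB)" that x265 prints next to SSIM 0.9653839 in a log later in this thread; the function name is only for illustration.

```python
import math

def ssim_to_db(ssim: float) -> float:
    """Map an SSIM score to dB using -10*log10(1 - SSIM)."""
    return -10.0 * math.log10(1.0 - ssim)

# SSIM pairs quoted in the post: (without cu-lossless, with cu-lossless)
pairs = {
    "18 Mbps": (0.9872928, 0.9872879),
    "10 Mbps": (0.9820309, 0.9820127),
}

for label, (off, on) in pairs.items():
    drop_db = ssim_to_db(off) - ssim_to_db(on)
    print(f"{label}: {ssim_to_db(off):.3f} dB -> {ssim_to_db(on):.3f} dB "
          f"(drop {drop_db:.4f} dB)")
```

On this scale both drops come out at a few thousandths of a dB, so the measurement mostly shows that cu-lossless did not help here, not that it hurt visibly.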
benwaggoner
8th August 2019, 20:56
Are you sure using this tool on normal lossy encodes (as opposed to extreme-bitrate archival encodes or hybrid lossless) really helps visual quality per unit of bitrate? Or perhaps it helps with a clean uncompressed source but harms with compressed sources?
I am not sure if it does, no. It's more likely to help with zero-noise synthetic images, like the credits, but I didn't do a lot of testing. Like I said, it was just an experiment to see what stuck after throwing things at a wall.
I definitely want to try again with the new aq-mode and some other x265 improvements.
mandarinka
8th August 2019, 21:48
I see, thanks.
Funky080900
31st August 2019, 13:37
Hi there
here is my VVC encode using VTM Encoder Version 6.0. The original files are 37MB@403kbps. Encoding time ~ 1 week on i7-4720HQ.
I split the movie to encode in parallel and couldn't figure out how to merge the .bin files, so I recommend downloading the lossless h.264 video (5GB).
I don't know how representative my results are for VVC but since it took so long to encode I thought I'd share it anyway.
Files: https://drive.google.com/open?id=1XsHs9tMEaHMKhasqjXfc_CZWXWtkLzJF
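As a quick sanity check on the "37MB@403kbps" figure, the sketch below relates file size, average bitrate and duration. It assumes decimal megabytes, the 24 fps implied by the --fps=24/1 used elsewhere in this thread, and the 17620-frame count that x265 reports for this clip; the function name is illustrative.

```python
FPS = 24          # assumed from the --fps=24/1 used elsewhere in the thread
FRAMES = 17620    # frame count x265 reports for this clip

def duration_from_size(size_mb: float, kbps: float) -> float:
    """Seconds of video implied by a file size (decimal MB) and an average bitrate (kbit/s)."""
    return size_mb * 8_000 / kbps

clip_seconds = FRAMES / FPS
implied_seconds = duration_from_size(37, 403)

print(f"clip length from frame count: {clip_seconds:.1f} s")
print(f"length implied by 37 MB at 403 kbps: {implied_seconds:.1f} s")
```

The two durations agree to within a second, which suggests the quoted size and bitrate cover the whole clip.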
benwaggoner
3rd September 2019, 19:23
Awesome, thank you!
Greenhorn
26th November 2019, 08:23
So I'm not a VP9 expert by any stretch of the word, but I thought I'd take a stab at this as I'd been meaning to probe libvpx's options for a while anyway. Maybe someone'll find this useful.
https://mega.nz/#!RR9F0IYS!kAoSuKEHzdxQR-rl3qVDhKzVciAuaaO1eSa-0qgJ1rg
settings: --target-bitrate=2000 --buf-optimal-sz=2000 --kf-max-dist=120 --max-gf-interval=23 --tile-columns=0 --row-mt=1 --auto-alt-ref=6 --arnr-maxframes=15 --arnr-type=3 --frame-boost=1 --aq-mode=1. Overall encode speed was 1.56FPS (with ~20% utilization) on a Ryzen 7 3700X.
Average bitrate is 1998 kbps. I don't know how to measure the maximum local bitrate, sorry.
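One way to estimate the maximum local bitrate is a sliding window over per-frame sizes. This is only a sketch: it assumes you already have the compressed size of each frame in bytes (for example from ffprobe's per-packet output), and the function name and one-second default window are illustrative choices, not anything libvpx defines.

```python
from collections import deque

def peak_bitrate_kbps(frame_bytes, fps, window_seconds=1.0):
    """Highest average bitrate (kbit/s) over any full window of consecutive frames."""
    window = max(1, round(fps * window_seconds))
    if len(frame_bytes) < window:
        raise ValueError("need at least one full window of frames")
    recent = deque()
    running = 0          # bytes currently inside the window
    peak = 0.0
    for size in frame_bytes:
        recent.append(size)
        running += size
        if len(recent) > window:
            running -= recent.popleft()
        if len(recent) == window:
            kbps = running * 8 / 1000 / (window / fps)
            peak = max(peak, kbps)
    return peak

# Synthetic example: one busy second between two quiet ones at 24 fps.
sizes = [1000] * 24 + [5000] * 24 + [1000] * 24
print(peak_bitrate_kbps(sizes, fps=24))
```

A longer window (say 2 to 5 seconds) gives a figure closer to what a decoder buffer of that length would actually have to absorb.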
Notes/explanations/attempted justifications. Apologies in advance for the bad formatting and wordiness.
0) libvpx defaults to its highest-quality preset, and there doesn't seem to be a lot you can do to improve visuals sadly.
1) --threads appears to be the size of the worker pool, not the number of frame threads. I don't think there's any frame-level parallelism going on in the encoder, actually, as --frame-parallel enables/disables parallel decoding.
2) For a given --threads count, different --tile-columns counts will produce files that look more or less the same to my eyes and the basic ffmpeg metrics, but differ slightly in file size. I left --tile-rows at zero.
3) For a given --tile-columns count, altering --threads has the same effect.
4) Row-based multithreading (--row-mt) appears to be similar to WPP in HEVC; with tiles disabled, enabling it actually produced slightly smaller files with slightly higher visual metric scores, in addition to boosting encode speed from ~40FPM to ~1.5FPS.
5) Relative to the default settings (which appear to be --row-mt=0 --threads=8 --tile-rows=0 --tile-columns=6), --threads 1 --tile-columns 0 produced files that were on average 1% smaller for identical metric scores. (Caveat: this was done with several ~30second test clips extracted with FFMPEG.)
6) I really couldn't spot a visual difference between no-parallelism encodes and encodes with substantially more threads and tiles enabled than default, if I'm being honest.
7) There doesn't appear to be a strict analogue to --vbv-maxrate in libvpx, so I just set the "optimal buffer size" to 2000 milliseconds.
8) If I didn't set either a maximum or minimum quantizer value, bitrate would frequently rocket to absurdly high levels, then plummet down to equally absurd levels for an absurdly long amount of time. I believe this loose rate control may be the cause of artifacting like you see in SmilingWolf's encodes from last year. There's still some in scenes like the sniper descending down his rope, but it does seem to be reined in. Also: the overshoot-pct/undershoot-pct settings, which you'd think might control or influence this, appear to do precisely nothing.
9) Golden Frames and Alternate Reference Frames appear to be the same thing. By default they occur at a non-fixed interval of up to 16 frames (sort of a secondary GOP within the GOP . . .); I hope that increasing this to 23 (the maximum) was permissible, as it was one of the only tweaks I found that seemed to have a notable effect on the output files' visuals or metric scores.
10) I'm a bit fuzzy as to what it actually does in a technical sense, but setting auto-alt-ref to 6 was the other really universally positive tweak I found.
11) AQ appears to be a little half-baked in libvpx (common theme . . .), but seemed to have an overall positive effect. Based on the code, aq-mode 1 is fairly similar to aq-mode 2 in the x26x encoders, while aq-mode 2 is similar to their aq-mode 1. AQ3 is silently disabled if cpu-used is less than 5, and I really have no idea what AQ4 is trying to do but it didn't help anything for this clip.
12) None of the --tune-content settings had anything like a positive effect. "film" actually appears to be similar to "rc-grain" in x265, while "screen" just murdered quality.
13) The "--tune" option will always be set to "psnr"; setting it to "ssim" seemed like low-hanging fruit initially, but it appears that the actual code for the option was never ported from VP8 to VP9.
14) All of the "arnr-" settings had minuscule but actual effects on output. Setting arnr-type to three enables a bidirectional filter, and I assume that a larger sample size should be better.
15) --frame-boost seems like it might be silently ignored, but the intended effect (raise or lower QP of alt-ref frames based on 2-pass analysis) seemed good enough that I enabled it anyway.
16) "--sharpness" is actually an inverse to the strength of the loop filter (I think); reducing it increased filesize by about 10% and worsened artifacts.
I think the final encode is actually pretty OK, if not anything special.
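To relate the "optimal buffer size" in note 7 to x265-style VBV numbers, here is a small conversion sketch. It assumes libvpx buffer sizes are expressed as milliseconds of data at the target bitrate, and that x265's --vbv-bufsize is in kbit; the function names are illustrative.

```python
def buffer_ms_to_kbit(buffer_ms: float, bitrate_kbps: float) -> float:
    """Buffer size in kbit for a buffer given as milliseconds of data at bitrate_kbps."""
    return bitrate_kbps * buffer_ms / 1000.0

def buffer_kbit_to_ms(buffer_kbit: float, rate_kbps: float) -> float:
    """Milliseconds needed to fill a buffer of buffer_kbit at rate_kbps."""
    return buffer_kbit / rate_kbps * 1000.0

# libvpx settings from this post: --target-bitrate=2000 --buf-optimal-sz=2000
print(buffer_ms_to_kbit(2000, 2000), "kbit")

# x265 settings posted later in the thread: --vbv-maxrate 4000 --vbv-bufsize 12000
print(buffer_kbit_to_ms(12000, 4000), "ms at maxrate")
```

Note that the two encoders treat these values differently (a target in libvpx, a hard constraint in x265), so matching the numbers does not guarantee matching behaviour.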
Tadanobu
24th March 2020, 05:49
Could you please explain how you generated that film grain table? Grain retention is one of the problems with aomenc, at least with default settings.
Funky080900
24th March 2020, 15:27
First denoise the video
ffmpeg -i ToS.y4m -vf nlmeans=s=1.5 denoised.yuv
Then use the noise_model application located under ./examples/noise_model: https://aomedia.googlesource.com/aom/+/master/examples/noise_model.c
noise_model --fps=24/1 --width=1920 --height=800 --i420 --input-denoised=denoised.yuv --input=ToS.yuv --output-grain-table=film_grain.tbl
I only kept sY sCb and sCr because the generated .tbl didn't look good.
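The steps above can be strung together as a small script. This is a sketch that only assembles and prints the command lines rather than running them. The first and third commands use the options from this post; the second (dumping the untouched source to ToS.yuv) is an assumption about how the raw input was produced, and the function name and defaults are illustrative.

```python
import shlex

def grain_table_commands(source_y4m="ToS.y4m", width=1920, height=800, fps="24/1"):
    """Return the command lines for the grain-table workflow, in order."""
    raw = "ToS.yuv"
    denoised = "denoised.yuv"
    table = "film_grain.tbl"
    return [
        # 1. Denoise the source (same nlmeans strength as in the post).
        ["ffmpeg", "-i", source_y4m, "-vf", "nlmeans=s=1.5", denoised],
        # 2. Assumption: plain raw dump of the untouched source for comparison.
        ["ffmpeg", "-i", source_y4m, raw],
        # 3. Estimate the grain table from the source/denoised pair.
        ["noise_model", f"--fps={fps}", f"--width={width}", f"--height={height}",
         "--i420", f"--input-denoised={denoised}", f"--input={raw}",
         f"--output-grain-table={table}"],
    ]

for cmd in grain_table_commands():
    print(shlex.join(cmd))
```

The resulting film_grain.tbl is what gets passed to the encoder, while the encoder itself still reads the original, undenoised source.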
Blue_MiSfit
24th March 2020, 18:27
WOW.
That is truly impressive. This reinforces my belief that film grain modeling is enormously important and will really make 1 Mbps 1080p totally viable. This is a game changing feature for AV1!
Do any of the encoders have / plan to integrate this in-loop during encoding? Working with YUV intermediates is pretty painful.
quietvoid
25th March 2020, 20:55
aomenc/libaom supports this with the option --denoise-noise-level=[0..50]
Anything higher than 0 enables denoising and film grain modeling, however there is not much control on the denoising strength/algorithms used.
foxyshadis
27th March 2020, 00:06
SVT-AV1 also has it, but it's mostly a clone of aomenc's. There have been a few minor changes that should show up in the next release, but FGM is one of those things that engineers are very loath to touch, since it's all-but-untestable and intentionally introduces randomness.
nevcairiel
27th March 2020, 01:08
This is a game changing feature for AV1!
H264 has Film Grain Modelling, just no one ever used it because it's hard to use correctly.
Maybe we get more lucky this time around and enough engineering time is put into it...
Blue_MiSfit
27th March 2020, 02:23
Right, I remember a few devices did but very few.
It's so incredibly useful when delivering streaming video though!
I wonder, everyone agreed that HE-AAC was better than AAC at lower bitrates. That was never controversial. Since FGM is fundamentally similar, why the resistance from engineers?
benwaggoner
30th March 2020, 06:30
H264 has Film Grain Modelling, just no one ever used it because it's hard to use correctly.
Maybe we get more lucky this time around and enough engineering time is put into it...
AV1's implementation is quite a bit better, from people I've talked to who have looked at both. Also, grain removal itself is quite computationally expensive and algorithmically complex, and is a lot more feasible today than it was in 2006 when FGM was a required (but never used) feature of H.264 for HD-DVD.
It wasn't mandatory anywhere else, which also was a huge barrier to adoption. It likely could have become quite useful for 720p streaming circa 2010. Fewer pixels, more MIPS, bigger bitrate challenges.
It's easy to forget that we have >100x more compute available per pixel for 1080p than we did when the HD optical formats launched. I bet the typical AV1 pixel gets >>1000x more MIPS than a launch Blu-ray or HD-DVD.
Sagittaire
23rd April 2020, 11:37
libaom 1Mbps: https://drive.google.com/file/d/1gEwfqVcjZZcjDeQKMmvgk6shYQHxidU0/view?usp=sharing
aomenc --passes=2 --pass=2 --fpf=firstpass.log --target-bitrate=1220 --kf-max-dist=120 --cpu-used=0 -t 4 --deltaq-mode=2 --film-grain-table=film_grain.tbl -o AV1-0.ivf input.y4m
The result is clearly impressive, but by definition this encoding technique involves pre-processing. With a dithered source, that is a considerable advantage. If I denoise (cutting the dither's high frequencies), HEVC will no doubt look much better too.
@ benwaggoner
For a challenge at really low bitrates, a dithered source is not a good idea, because local high-frequency complexity is very high. I would never encode a source like this directly, without pre-processing.
Anyway, I will try ... I currently have time to kill ... ;-)
Blue_MiSfit
23rd April 2020, 23:22
Clearly impressive result but by definition you make pre-process with this encoding technique. And with source with dither it's a considerable advantage. If I make denoising (dithering high frequency cut), no doubt that HEVC will be really better too.
I'd very much like to see denoised HEVC (static pre-processing) vs AV1 with FGM!
benwaggoner
24th April 2020, 00:03
For this challenge at really low bitrate, use source with dithering is not really good idea, because local complexity frequency is really high. Never, I will make direct encoding without preprocess with this source like this.
Banding is pretty hard to encode as well. Depending on the content and codec, dithering can allow for better quality at lower bitrates than not dithering.
I could probably knock out a non-dithered version if you'd like to try one.
Sagittaire
24th April 2020, 11:28
Banding is pretty hard to encode as well. Depending on the content and codec, dithering can allow for better quality at lower bitrates than not dithering.
I could probably knock out a non-dithered version if you'd like to try one.
No, a challenge is a challenge ... ;-)
1) After a first analysis, the VBV (4 Mbps, 12 Mbit) seems irrelevant for the 1000 kbps encode. x265's default crf mode does not seem to saturate the VBV in complex scenes.
2) Paradoxically, the most difficult encode will certainly be the 2000 kbps one with constrained VBV, because the buffer saturates heavily in complex scenes. But I have a solution for that ... ;-)
Funky080900
24th April 2020, 20:24
Clearly impressive result but by definition you make pre-process with this encoding technique. And with source with dither it's a considerable advantage. If I make denoising (dithering high frequency cut), no doubt that HEVC will be really better too.
@ benwaggoner
For this challenge at really low bitrate, use source with dithering is not really good idea, because local complexity frequency is really high. Never, I will make direct encoding without preprocess with this source like this.
Anyway I will try ... I have actually time to lose ... ;-)
The denoised video was only used for generating the film grain table and not for encoding.
Libaom only ever saw the unaltered source file provided by Benwaggoner.
Sagittaire
25th April 2020, 13:11
@benwaggoner
Is using the zones option legal for this challenge?
After analysis, the end credits are a really high-bitrate zone in constant quality mode (crf, cq or multipass encoding).
Before the credit zone (frame 14135), the bitrate must be about 850 kbps, and the credit zone is unusually long, at something like 15% of total frames.
benwaggoner
27th April 2020, 23:15
@benwaggoner
use zone option is legal for this challenge?
It isn't legal in the base version, but I don't mind adding a version with content-specific temporal parameters. The basic test was really to exercise how well the encoder can do with static settings across variable content. But what is possible no-holds-barred is also interesting.
After analyse, end credit is really high bitrate zone in constant quality mode (crf, cq or multipass encoding).
before credit zone (frame 14135), bitrate must be at 850 kbps and credit zone is inhabitually long with something like 15% of total frames
Yeah, that is a weirdness about this clip. Something with classic scrolling text would give an encoder opportunity to really squeeze and redistribute bits out of what can be really cranked down. All in all, though, ToS was the best broadly available and freely licensed source I found.
benwaggoner
28th April 2020, 02:31
I finally got around to posting some encodes comparing x265 --hevc-aq with --aq-mode 4. I reran my old tests using the latest x265 and new parameters. These are 1 Mbps. Just 2-pass VBR at veryslow; the UltraPlacebo versions to follow.
HEVC-AQ (https://1drv.ms/v/s!AlvIQZWsyeO-k7tgvI1tlwmSgRwd4g?e=PzhpcX)
AQ-MODE 4 (https://1drv.ms/v/s!AlvIQZWsyeO-k7thJC1wvob8RmFFkQ?e=GNYKTi)
Curious on which the rest of you think is superior.
benwaggoner
28th April 2020, 02:56
Oh, and I see I had a few old oddball things I didn't post before.
Best possible WMV encodes, via Expression Encoder and its built-in version of the VC-1 Professional Encoder SDK.
1000 Kbps (https://1drv.ms/v/s!AlvIQZWsyeO-kKps2VqtsCjMwPF2cg?e=I4jPFg)
1500 Kbps (https://1drv.ms/v/s!AlvIQZWsyeO-kKpuw9v9wyKH5rfDlA?e=AK6OCF)
2000 Kbps (https://1drv.ms/v/s!AlvIQZWsyeO-kKptiVfSb2xEL1Wwqg?e=E5hsfS)
Sagittaire
28th April 2020, 08:04
I finally got around to posting some encodes comparing x265 --hevc-aq with --aq-mode 4. I reran my old tests using the latest x265 and new parameters. These are 1 Mbps. Just 2-pass VBR at veryslow; the UltraPlacebo versions to follow.
HEVC-AQ (https://1drv.ms/v/s!AlvIQZWsyeO-k7tgvI1tlwmSgRwd4g?e=PzhpcX)
AQ-MODE 4 (https://1drv.ms/v/s!AlvIQZWsyeO-k7thJC1wvob8RmFFkQ?e=GNYKTi)
Curious on which the rest of you think is superior.
http://jfl1974.free.fr/Videos/ToS/ToS-2.mkv
Here the result is really better, with more temporal stability and much better edge reproduction. I get a much better result in the introduction scene on the bridge (between 26 sec and 40 sec, for example). We can see (real) mosquitoes flying around the actress' head. To my eyes it is also much better than your previous placebo encode.
- Better subjective result
- Better objective result (than Ma's encode, for example).
encoded 17620 frames in 29373.04s (0.60 fps), 1000.00 kb/s, Avg QP:30.93, Global PSNR: 42.353, SSIM Mean Y: 0.9653839 (14.607 dB)
I muxed it into an MKV file with HE-AAC 5.1 audio at 128 kbps. I think it's really good quality for only a 1000 kbps encode at 1080p with 5.1 audio.
Test in progress with better psy settings for a better subjective result, if I can ...
Sagittaire
28th April 2020, 08:44
@ benwaggoner
Well, it's really curious. I checked Ma's and TomV's previous encodes, and Ma's encode, TomV's encode and my encode all seem to have much better quality than your placebo encode (particularly in low-motion parts). Is there a problem with your encoding profile? Are you sure that OneDrive doesn't re-encode (at upload or at download)? The quality is really bad to me.
And the Beamr 5 encodes are the best to my eyes: really good preservation of complexity. Certainly strong (and good) psy optimisation. I will try to produce a better result.
Opmox
29th April 2020, 04:19
deleted
Boulder
30th April 2020, 10:53
rd 4 instead of 6 and psy-rdoq 0 because (in my experience) it makes too much onion gradient artifacts (https://slow.pics/c/mx2EEPX6), maybe there's another way to fix it but I don't know.
I happened to run into a frame with some of that onion gradient in my tests yesterday. It was fixed by using --rd-refine. In my opinion, --rd 4 loses too much detail at least at higher bitrates but I'm still testing.
benwaggoner
30th April 2020, 22:50
@ benwaggoner
Well it's really curious. I Check previous Ma's enconding or TomV's encoding. And Ma's enconding, TomV's encoding or my encoding seem have really better quality than your placebo encoding (particulary in low motion part). You don't have problem with your encoding profil? You are sure that onedrive don't make reencoding (at upload or at download?). Quality is really bad for me.
and Beamr 5 encoding are the best for my eyes: really good complexity conservation. Certainely high psy (and good) optimisation. I will try to produce better result.
Which iteration of the placebo settings? The only placebo I think I posted was several years ago with a much older version of x265. The ones I posted this last week were veryslow. I'm still rendering out the UltraPlacebo ones, on an older machine where it looks like it'll take a full week.
Sagittaire
1st May 2020, 12:30
Which iteration of the placebo settings? The only placebo I think I posted was several years ago with a much older version of x265. The ones I posted this last week were veryslow. I'm rendering out the UltraPlacebo ones still, on a older machine where it looks like it'll take a full week.
Well, I don't think it's an x265 version problem. Your recent 2-pass x265 encode has these problems too:
https://onedrive.live.com/?authkey=%21ALyNbZcJkoEcHeI&cid=BEE3C9AC9541C85B&id=BEE3C9AC9541C85B%21318944&parId=BEE3C9AC9541C85B%21267619&o=OneUp
Really bad quality in the introduction scenes:
Frame 442 to 559: rocket takeoff, really blocky rocket and temporal blocking. I didn't check the quantizer, but it is certainly too high here. Perhaps a rate control problem.
Frame 600 to 965: bridge scene, problems on edges (around the actress's head or the actor's arm, for example). Certainly too high a quantizer here. Perhaps a rate control problem too.
I don't have these problems, and the other x265 encodes from me, Ma or Opmox don't have them either. It's really curious.
benwaggoner
1st May 2020, 16:24
Frame 442 to 559 : rocket takeoff, really blocky rocket and temporal blocking, I don't check quantizer but it's certainely too high value here. Perhaps rate control problem.
Frame 600 to 965 : bridge scene, problem on the edge (around the actress’s head or from the actor's arm for example). Certainely too high quantizer here. Perhaps rate control problem too.
I don't have these problem and the other encoding from me, or Ma or Opmox with x265 don't have these problem. It's really curious.
Yeah, these were really encodes meant to test all the new recent features of x265 together. I am sure there are oddities.
Here's the full parameters blob I did, used in both passes. No implication these are appropriate settings for production use; I just stuck new parameters into an old ToS script from a couple of years ago as I was heading off to bed.
--level-idc 4.0 --profile main --preset veryslow -F 2 --selective-sao 2 --sar 1 --rd-refine --scenecut-aware-qp --hist-scenecut --hme --multi-pass-opt-analysis --multi-pass-opt-distortion --tskip --tskip-fast --keyint 120 --tu-intra 4 --tu-inter 4 --hevc-aq --hrd --aud --single-sei --bitrate 1000 --vbv-maxrate 4000 --vbv-bufsize 12000 --colorprim bt709 --transfer bt709 --colormatrix bt709
And below full blob for the "UltraPlacebo" version I tried just to see what insanity may be feasible. The first pass only has about 80 hours left to go on my older personal computer (i7-6800K, which was actually a decent encoding box when I first built it). I'll try it on my actual workstation once I've got a break in my day job renders.
There are reasons some parameters aren't used even in --preset placebo: some because they are newer than the last preset refactoring (like HME), and some because the vanishingly small quality gains can't justify the huge speed cost.
--level-idc 4.0 --profile main --preset placebo --sar 1 -F 1 --ref 6 --bframes 16 --subme 7 --rd-refine --scenecut-aware-qp --hist-scenecut --hme --hme 5, 5, 5 --multi-pass-opt-analysis --multi-pass-opt-distortion --tskip --cu-lossless --keyint 120 --tu-intra 4 --tu-inter 4 --hevc-aq --hrd --aud --single-sei --bitrate 1000 --vbv-maxrate 4000 --vbv-bufsize 12000 --colorprim bt709 --transfer bt709 --colormatrix bt709
quietvoid
1st May 2020, 18:28
I would blame the default --max-qp-delta (with --scenecut-aware-qp) and --hist-threshold (with --hist-scenecut). It's pretty awful to increase QP by 5 for 500ms every (detected) scene cut.
Increasing the --hist-threshold just to 0.02 already cuts the I frames by 3x or something, someone had posted about it as well in the x265 thread.
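To get a feel for how much of the clip a "QP +5 for 500 ms after every detected cut" rule can touch, here is a rough sketch. It assumes 24 fps and the 17620-frame length reported elsewhere in this thread. The cut count of 221 is purely hypothetical: it is borrowed from the I-frame count of a different encode's log in this thread and only stands in for the unknown number of detected cuts.

```python
import math

FPS = 24                # assumed from the --fps=24/1 used earlier in the thread
TOTAL_FRAMES = 17620    # frame count from the x265 logs in this thread

def frames_in_window(window_ms, fps):
    """Number of frames that fall inside a window of window_ms milliseconds."""
    return math.ceil(window_ms * fps / 1000)

def affected_share(scene_cuts, window_frames, total_frames):
    """Upper bound on the share of frames inside some window (windows may overlap)."""
    return min(1.0, scene_cuts * window_frames / total_frames)

window_frames = frames_in_window(500, FPS)
# Hypothetical cut count, used only to illustrate the arithmetic.
share = affected_share(221, window_frames, TOTAL_FRAMES)
print(f"{window_frames} frames per window, up to {share:.1%} of the clip affected")
```

Over-detection by the histogram method multiplies this share directly, which is why raising --hist-threshold is worth trying first.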
Sagittaire
1st May 2020, 19:01
I would blame the default --max-qp-delta (with --scenecut-aware-qp) and --hist-threshold (with --hist-scene-cut). It's pretty awful to increase QP by 5 for 500ms every (detected) scene cut.
Increasing the --hist-threshold just to 0.02 already cuts the I frames by 3x or something, someone had posted about it as well in the x265 thread.
I will check that but it's certainely something like that.
benwaggoner
2nd May 2020, 01:38
I will check that but it's certainely something like that.
I'm reencoding with --hist-threshold 0.02 to see if that helps.
benwaggoner
2nd May 2020, 01:57
Hmmm, check this out: https://bitbucket.org/multicoreware/x265/commits/eca79c2880129eab0380d5f1220b4ea15f233abe
Kirithika Kalirathnam committed eca79c2
2019-10-18
Fix the RC Pass2 ABR
This commit does the following changes:
1. Fix the order of RC Pass 1 stats Analysis in Pass2
2. Fix the aggressive Qp tuning for I/P frames in Pass2
Sagittaire
6th May 2020, 10:34
Hmmm, check this out: https://bitbucket.org/multicoreware/x265/commits/eca79c2880129eab0380d5f1220b4ea15f233abe
Kirithika Kalirathnam committed eca79c2
2019-10-18
Fix the RC Pass2 ABR
This commit does the following changes:
1. Fix the order of RC Pass 1 stats Analysis in Pass2
2. Fix the aggressive Qp tuning for I/P frames in Pass2
Well, that seems to be the problem. At the currently committed version (3.3+26), x265 seems to compress too aggressively at the start of a 2-pass encode. The problem doesn't occur with 3 passes or in crf mode.
And it's really curious, but crf mode and multipass encoding also seem to have really different rate control strategies:
- crf mode uses a really aggressive quantizer for B-frames: really high metrics but lower quality for B-frames.
- multipass is less aggressive on B-frames: lower metrics but higher subjective quality, with more stable quality over time.
This implies that it's difficult to compare crf and multipass modes, because the RC strategies are different (quality can really change between frame N and frame N+1, and so can the conclusion).
Sagittaire
6th May 2020, 13:13
Well, for HEVC I will try to make:
- highest possible quality metric (crf mode)
- highest possible "temporal" quality metric (multipass mode)
- highest subjective quality (with my best psy mode activated)
- Ultra fast encoding demonstration in "VOD production mode"
These encodes will have much higher overall quality than benwaggoner's old UltraPlacebo encode (in metric quality or subjective quality)
benwaggoner
6th May 2020, 20:25
Well for HEVC I will try to make:
- highest possible quality metric (crf mode)
- highest possible "temporal" quality metric (multipass mode)
- highest subjective quality (with my best psy mode actived)
- Ultra fast encoding demonstration in "VOD production mode"
These encoding will have really higher overall quality than old ultraplacebo benwaggoner encoding (for metric quality or subjective quality)
I expect they would be! Those were done using a 2018 x265 build and plenty of fixes and other improvements have come in since then. Plus that was really an attempt to figure out what maximum encoder complexity could do. Plenty of content-specific tweaks remain to be discovered and applied.
I look forward to seeing your results!
Sagittaire
9th May 2020, 11:04
I expect they would be! Those were done using a 2018 x265 build and plenty of fixes and other improvements have come in since then. Plus that was really an attempt to figure out what maximum encoder complexity could do. Plenty of content-specific tweaks remain to be discovered and applied.
I look forward to seeing your results!
Actually, more than 100 encodes to find the best profile. I had not encoded with x265 for quite a long time and there was plenty of functionality to test. I am currently finishing the 1000 kbps encodes with the final profiles:
- crf mode with highest metric score: my profile will beat all previous HEVC encodes here by a large margin
- ABR with 3 passes: my best visual compromise
- Reuse ABR: encoding that reuses the statistics from the 3-pass ABR analysis, but at the fastest speed (~50x speed improvement vs slower, ~100x vs veryslow, ~200x vs placebo ... with comparable quality)
- ABR 3 passes with default RC, AQ and PSY settings (but the same search level for ME, CU, TU ...) for comparison.
Anyway, some small conclusions:
- the hevc-aq option produces better metrics, by far
- Rate control in multipass encoding doesn't work very well: 2 passes don't produce a constant-quality encode. You must use 3 passes for better bitrate distribution, and sometimes even 4 passes if you want the same bitrate distribution as crf mode.
- crf mode has the best metrics but with a really aggressive quantizer on B-frames: the difference in quality between P-frames and B-frames is really large. Multipass ABR encoding doesn't have this "problem".
- the placebo preset is not really useful (less than 1% size gain at the same "metric" quality versus the veryslow preset). For example, I obtain much more constant quality with "3 pass preset slower" than with "2 pass preset placebo", simply because rate control is not really good in 2-pass mode.
crf mode:
x265 [info]: frame I: 221, Avg QP:27.27 kb/s: 11216.30 PSNR Mean: Y:43.857 U:46.103 V:46.112 SSIM Mean: 0.973372 (15.747dB)
x265 [info]: frame P: 3910, Avg QP:29.09 kb/s: 2844.64 PSNR Mean: Y:43.033 U:46.037 V:45.997 SSIM Mean: 0.971443 (15.443dB)
x265 [info]: frame B: 13489, Avg QP:36.49 kb/s: 296.98 PSNR Mean: Y:41.376 U:45.256 V:45.240 SSIM Mean: 0.967579 (14.892dB)
x265 [info]: Weighted P-Frames: Y:6.0% UV:4.8%
x265 [info]: Weighted B-Frames: Y:4.1% UV:2.8%
x265 [info]: consecutive B-frames: 10.8% 7.6% 5.9% 24.9% 21.6% 29.2%
encoded 17620 frames in 30340.70s (0.58 fps), 999.28 kb/s, Avg QP:34.73, Global PSNR: 42.688, SSIM Mean Y: 0.9685090 (15.018 dB)
3-pass ABR mode with exactly the same settings:
x265 [info]: frame I: 221, Avg QP:26.58 kb/s: 10729.30 PSNR Mean: Y:43.864 U:46.118 V:46.127 SSIM Mean: 0.972412 (15.593dB)
x265 [info]: frame P: 3910, Avg QP:29.26 kb/s: 2266.71 PSNR Mean: Y:42.187 U:45.456 V:45.336 SSIM Mean: 0.966785 (14.787dB)
x265 [info]: frame B: 13489, Avg QP:31.23 kb/s: 469.21 PSNR Mean: Y:41.386 U:44.917 V:44.851 SSIM Mean: 0.965572 (14.631dB)
x265 [info]: Weighted P-Frames: Y:6.0% UV:4.7%
x265 [info]: Weighted B-Frames: Y:4.1% UV:2.8%
x265 [info]: consecutive B-frames: 10.8% 7.6% 5.9% 24.9% 21.6% 29.2%
encoded 17620 frames in 34506.66s (0.51 fps), 996.78 kb/s, Avg QP:30.73, Global PSNR: 42.449, SSIM Mean Y: 0.9659273 (14.676 dB)
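The P-frame versus B-frame quantizer gap mentioned in the conclusions can be read directly from the two summaries. Below is a minimal Python sketch, assuming the per-frame-type summary lines keep the format shown in the logs above. The helper names (`parse_summary`, `pb_qp_gap`, `ssim_db`) are hypothetical and not part of x265. It also checks the dB figure printed next to SSIM, which on these numbers matches -10*log10(1 - SSIM).

```python
import math
import re

# Per-frame-type lines copied from the two logs above (PSNR fields omitted).
CRF_LOG = """\
x265 [info]: frame I: 221, Avg QP:27.27 kb/s: 11216.30
x265 [info]: frame P: 3910, Avg QP:29.09 kb/s: 2844.64
x265 [info]: frame B: 13489, Avg QP:36.49 kb/s: 296.98
"""
ABR_LOG = """\
x265 [info]: frame I: 221, Avg QP:26.58 kb/s: 10729.30
x265 [info]: frame P: 3910, Avg QP:29.26 kb/s: 2266.71
x265 [info]: frame B: 13489, Avg QP:31.23 kb/s: 469.21
"""

# Assumed line layout: "frame <type>: <count>, Avg QP:<qp> kb/s: <rate>"
LINE_RE = re.compile(
    r"frame ([IPB]):\s*(\d+), Avg QP:([\d.]+)\s+kb/s:\s*([\d.]+)"
)


def parse_summary(log):
    """Map frame type -> (frame count, average QP, kb/s)."""
    stats = {}
    for ftype, count, qp, kbps in LINE_RE.findall(log):
        stats[ftype] = (int(count), float(qp), float(kbps))
    return stats


def pb_qp_gap(log):
    """Average B-frame QP minus average P-frame QP."""
    stats = parse_summary(log)
    return stats["B"][1] - stats["P"][1]


def ssim_db(ssim):
    """Convert an SSIM score to dB as -10*log10(1 - ssim)."""
    return -10.0 * math.log10(1.0 - ssim)


if __name__ == "__main__":
    print(f"CRF B-P QP gap: {pb_qp_gap(CRF_LOG):.2f}")
    print(f"ABR B-P QP gap: {pb_qp_gap(ABR_LOG):.2f}")
    print(f"SSIM 0.973372 in dB: {ssim_db(0.973372):.3f}")
```

On the logs above, the gap is 7.40 QP for the CRF encode and 1.97 QP for the 3-pass ABR encode.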
Sagittaire
9th May 2020, 17:38
I also checked the Beamr 5 encoded stream: I think this encoder uses "zone encoding" or a really constrained rate control to get higher quality in the less complex parts. But quality in complex parts such as high motion is really low. The compromise looks good to my eyes. Anyway, you can't directly compare the Beamr encode and the x265 encodes because of that.
Size of the credits zone, frames 14135 -> 15062 (a really complex part with high bitrate in constant quantizer mode):
- x265, Placebo "Benwaggoner" encoding: 19 514 KB (1300 Kbps)
- x265, Veryslow "Ma" encoding: 18 465 KB
- x265, my 3-pass ABR encoding: 19 357 KB
- x265, my CRF encoding: 23 523 KB (1568 Kbps)
- Beamr encoding: 15 145 KB (1009 Kbps)
That implies 980 Kbps for the first 14135 frames (without the credits) for Beamr and 930 Kbps for the first 14135 frames (without the credits) for x265. It's about 6% in size. That is the difference between the slow and placebo quality presets for the x265 encoder. And at a really low bitrate (1000 Kbps for 1080p), it's a particularly large difference.
benwaggoner
9th May 2020, 22:34
Wow, some good and interesting results. Thank you! And an excellent demonstration of the weakness of considering --preset a primary determinant of quality. For nearly all encodes, there are other tweaks that can be made that will produce better quality, faster than just cranking up preset. More passes are a classic example of using MIPS smarter instead of just harder. Heck, a simple --nr-inter 150 can be a bigger improvement than slower to placebo.
The only scenario I've seen where placebo really delivers materially better efficiency than slower is in lossless encoding, where placebo can be nearly 10% better than even veryslow with some sorts of content.
Can you share your settings for your best result encodes?
Sagittaire
9th May 2020, 22:56
Yes, I will make a complete post when all my tests are finished. Right now I get much better quality than your old placebo encode (by metrics and to my eyes), and by a large margin at 1000 kbps.
Here are all my encodes for 1000 kbps:
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@REM >> ToS encoding, 3 pass ABR, veryslow profile, high metric profile
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-abr.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 1 --slow-firstpass --stats ToS-1000-abr.log --preset veryslow --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --no-limit-modes --ref 5 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --tskip --qg-size 64 --merange 64 --deblock -1,-1 --limit-sao
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-abr.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 3 --stats ToS-1000-abr.log --preset veryslow --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --no-limit-modes --ref 5 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --tskip --qg-size 64 --merange 64 --deblock -1,-1 --limit-sao
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-abr.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 2 --stats ToS-1000-abr.log --preset veryslow --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --no-limit-modes --ref 5 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --tskip --qg-size 64 --merange 64 --deblock -1,-1 --limit-sao --analysis-save ToS_1000_R10-abr_analysis.dat --analysis-save-reuse-level 10 --csv ToS-1000-abr.csv --csv-log-level 1
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@REM >> ToS encoding, 3 pass ABR, veryslow profile, high speed reuse from 1000 kbps ABR encoding
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-hs.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 3 --stats ToS-1000-abr.log --preset veryslow --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --no-limit-modes --ref 5 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --tskip --qg-size 64 --merange 64 --deblock -1,-1 --limit-sao --analysis-load ToS_1000_R10-abr_analysis.dat --analysis-load-reuse-level 10
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-hs.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 3 --stats ToS-1000-abr.log --preset veryslow --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --no-limit-modes --ref 5 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --tskip --qg-size 64 --merange 64 --deblock -1,-1 --limit-sao --analysis-load ToS_1000_R10-abr_analysis.dat --analysis-load-reuse-level 10
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-hs.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 3 --stats ToS-1000-abr.log --preset veryslow --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --no-limit-modes --ref 5 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --tskip --qg-size 64 --merange 64 --deblock -1,-1 --limit-sao --analysis-load ToS_1000_R10-abr_analysis.dat --analysis-load-reuse-level 10 --csv ToS-1000-hs.csv --csv-log-level 1
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@REM >> ToS encoding, 3 pass ABR, veryslow profile, default RC and PSY settings
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-psy.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 1 --slow-firstpass --stats ToS-1000-psy.log --preset veryslow --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --no-limit-modes --ref 5 --rd-refine --tskip --qg-size 64 --merange 64
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-psy.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 3 --stats ToS-1000-psy.log --preset veryslow --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --no-limit-modes --ref 5 --rd-refine --tskip --qg-size 64 --merange 64
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-psy.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1000 --pass 2 --stats ToS-1000-psy.log --preset veryslow --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --no-limit-modes --ref 5 --rd-refine --tskip --qg-size 64 --merange 64 --analysis-save ToS_1000_R10-psy_analysis.dat --analysis-save-reuse-level 10 --csv ToS-1000-psy.csv --csv-log-level 1
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@REM >> ToS encoding, 1 pass CRF, veryslow profile, highest metric profile
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1000-crf.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --crf 23.1 --pass 1 --slow-firstpass --stats ToS-1000-crf.log --preset veryslow --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --no-limit-modes --ref 5 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --tskip --qg-size 64 --merange 64 --deblock -1,-1 --limit-sao --analysis-save ToS_1000_R10-crf_analysis.dat --analysis-save-reuse-level 10 --csv ToS-1000-crf.csv --csv-log-level 1
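The pass order in the ABR batches above (1, then 3, then 2) is intentional: in x265, `--pass 1` creates the stats file, `--pass 3` reads it and overwrites it with refined statistics, and `--pass 2` reads it for the final encode without overwriting it. Here is a small Python sketch that generates the same sequence from one shared option list. It carries only a subset of the flags from the post, and the function name `three_pass_commands` is hypothetical.

```python
# Options shared by every pass (a subset of the flags used in the post).
COMMON = [
    "--input", r"C:\ToS_1920x800_xdither.y4m",
    "--bitrate", "1000",
    "--preset", "veryslow",
    "--stats", "ToS-1000-abr.log",
]


def three_pass_commands(output, common=COMMON):
    """Return the x265 command lines in the order they should be run."""
    commands = []
    # 1 = create stats, 3 = refine stats, 2 = final encode from the stats.
    for pass_no in (1, 3, 2):
        cmd = ["x265.exe", *common, "--output", output, "--pass", str(pass_no)]
        if pass_no == 1:
            # As in the post: run the first pass with full analysis.
            cmd.append("--slow-firstpass")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in three_pass_commands("ToS-1000-abr.265"):
        print(" ".join(cmd))
```

Inserting more `--pass 3` runs before the final `--pass 2` would give the 4-pass variant mentioned earlier in the thread.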
benwaggoner
10th May 2020, 00:01
Interesting. I'm curious about what drove qg-size of 32 versus 64 and b-frames of 5.
FWIW, 6 reference frames are actually legal here, since it is 1920x800 instead of 1080. Just as long as you don't apply DRM and try to play on a Qualcomm SoC :).
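The frame-size dependence comes from the decoded picture buffer rule in Annex A of the HEVC specification: smaller pictures are allowed a larger buffer at the same level. The Python sketch below applies that rule. The level is an assumption, since the thread does not state it: the MaxLumaPs value used is the one for Level 4 / 4.1. The result is the size of the picture buffer, not the `--ref` value itself. The number of usable references is lower, and encoder or player limits may be stricter.

```python
# MaxLumaPs for HEVC Level 4 and 4.1 (assumed level, not stated in the thread).
MAX_LUMA_PS_LEVEL_4 = 2_228_224
# maxDpbPicBuf constant from the HEVC general tier and level limits.
MAX_DPB_PIC_BUF = 6


def max_dpb_size(width, height, max_luma_ps=MAX_LUMA_PS_LEVEL_4):
    """Maximum decoded picture buffer size, in pictures, for a coded size.

    width and height should be the coded luma dimensions, i.e. already
    rounded up to a multiple of the minimum coding block size.
    """
    pic_size = width * height
    if pic_size <= (max_luma_ps >> 2):
        return min(4 * MAX_DPB_PIC_BUF, 16)
    if pic_size <= (max_luma_ps >> 1):
        return min(2 * MAX_DPB_PIC_BUF, 16)
    if pic_size <= ((3 * max_luma_ps) >> 2):
        return min((4 * MAX_DPB_PIC_BUF) // 3, 16)
    return MAX_DPB_PIC_BUF


if __name__ == "__main__":
    for w, h in ((1920, 800), (1920, 1080)):
        print(f"{w}x{h}: MaxDpbSize = {max_dpb_size(w, h)}")
```

Under this assumption, 1920x800 (1,536,000 luma samples) falls in the three-quarters bracket and gets a buffer of 8 pictures, while 1920x1080 stays at 6.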
Sagittaire
10th May 2020, 00:08
Yes, I tried it, but it saves perhaps less than 0.1% in size. The same goes for 16 B-frames.
Sagittaire
10th May 2020, 00:10
Careful with analysis-load-reuse-level, in my tests I had like ~800 psnr less, you can prevent this loss with refine-intra and refine-inter
Yes, but refine causes a dramatic speed loss ... ;-)
https://forum.doom9.org/showthread.php?p=1911278#post1911278
Sagittaire
10th May 2020, 01:02
Interesting. I'm curious about what drove qg-size of 32 versus 64 and b-frames of 5.
Well, I tried that too with ToS ... ;-)
--qg-size 64 saves 2% in size (from memory)
I tried --hme --hme-search 0,1,3 --hme-range 24,48,64 versus --merange 64 and got really good results too. But hme is not compatible with analysis reuse (bug?).
This slower profile seems like a really good compromise between speed and quality. I will test it for the 1500 and 2000 encodes.
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
@REM >> ToS encoding, 3 pass ABR, slower profile, high metric profile
@REM >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1500-abr.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1500 --pass 1 --slow-firstpass --stats ToS-1500-abr.log --preset slower --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --ref 5 --limit-refs 3 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --qg-size 64 --hme --hme-search 0,1,3 --hme-range 24,48,64 --deblock -1,-1 --limit-sao
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1500-abr.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1500 --pass 3 --stats ToS-1500-abr.log --preset slower --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --ref 5 --limit-refs 3 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --qg-size 64 --hme --hme-search 0,1,3 --hme-range 24,48,64 --deblock -1,-1 --limit-sao
x265.exe --input C:\ToS_1920x800_xdither.y4m --output ToS-1500-abr.265 --input-res 1920x800 --output-depth 10 --fps 24000/1000 --bitrate 1500 --pass 2 --stats ToS-1500-abr.log --preset slower --qcomp 0.50 --bframes 5 --b-adapt 2 --min-keyint 1 --keyint 120 --rc-lookahead 60 --vbv-maxrate 4000 --vbv-bufsize 12000 --psnr --ssim --tune ssim --ref 5 --limit-refs 3 --rd-refine --hevc-aq --qp-adaptation-range 1.0 --qg-size 64 --hme --hme-search 0,1,3 --hme-range 24,48,64 --deblock -1,-1 --limit-sao --csv ToS-1500-abr.csv --csv-log-level 1