x265 Sharpness and Detail (Best Settings?) [Archive] - Page 2

madey83

1st December 2022, 12:30

--rc-lookahead is capped at --keyint in any case, so it doesn't really matter.

I've certainly seen valuable quality improvements (specifically big reductions in quality variations that yield visibly poor quality) when increasing keyint and rc-lookahead from 2 to 5 seconds, and that was mostly due to the higher --rc-lookahead (comparing --keyint 120 --rc-lookahead 48 versus 128). That said, --rc-lookahead is most impactful with single-pass --crf encodes. It offers benefit with 1-pass CBR, but a full two-pass encode is basically rc-lookahead=total frames.
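To make the seconds-to-frames relationship concrete, here is a minimal Python sketch. It assumes a constant 24 fps source (which is what makes --keyint 120 equal to 5 seconds) and models the cap described above, where the effective lookahead never exceeds keyint. The function names are hypothetical helpers for illustration, not part of x265.

```python
def seconds_to_frames(seconds: float, fps: float) -> int:
    """Convert a duration in seconds to a whole number of frames."""
    return round(seconds * fps)


def effective_lookahead(rc_lookahead: int, keyint: int) -> int:
    """Model the cap described above: lookahead never exceeds keyint."""
    return min(rc_lookahead, keyint)


fps = 24.0  # assumed frame rate, not stated in the post
keyint = seconds_to_frames(5, fps)  # 5-second GOP
for requested in (48, 128):
    print(f"--rc-lookahead {requested} -> effective {effective_lookahead(requested, keyint)}")
```

Under these assumptions, asking for 128 frames of lookahead with --keyint 120 behaves like asking for 120.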

--aq-mode 3 is identical to --aq-mode 2 except that it lowers QP as luma values get nearer to black. This gets around Rec. 709's perceptual non-uniformity, where the difference between luma of 16 and 17 is much more visible than between, say 216 and 217. So the same compression artifacts can be much more visible near black than near white.

--aq-mode 3 can really help shadow detail and reduce banding near black. However, those lower QPs use more bits. For CBR and VBV-limited encodes, this increases the QP of brighter frames, which can introduce new quality issues. In CRF mode, it can instead increase the average bitrate quite a bit. It's a great tool when there's enough bandwidth, but needs to be used delicately at lower bitrates. It's not a safe feature to just leave always on when targeting best possible quality at low bitrates. It helps sometimes, and hurts others. Lower --aq-strength values also tend to be optimal with --aq-mode 3 compared with --aq-mode 2, as high values can really starve brighter macroblocks of bits.

It's also not appropriate for HDR-10, which is much more perceptually uniform, and where the visible difference between luma values is pretty much identical at any brightness.

10-bit encoding can make --aq-mode less needed as well, as the difference between 8-bit 16 and 17 now is the difference between 64 and 68, and those four extra steps make for more efficient encoding, reduce the need for dithering, and generally make for much less visible banding and blocking.
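The 16/17 versus 64/68 arithmetic can be checked with a few lines of Python. This is only a sketch of the usual code value scaling (a left shift by two bits, i.e. multiplying by 4); the function name is a hypothetical helper.

```python
def scale_8_to_10_bit(value: int) -> int:
    """Map an 8-bit code value to its 10-bit equivalent (multiply by 4)."""
    return value << 2


dark_a = scale_8_to_10_bit(16)
dark_b = scale_8_to_10_bit(17)
# Number of 10-bit steps covering one 8-bit step near black
print(dark_a, dark_b, dark_b - dark_a)
```

So one 8-bit step near black is covered by four 10-bit steps, which is where the finer gradation comes from.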

I'm confident that --aq-mode 3 could be improved to be more content-adaptive. I'd prefer x264 to have aq-mode algorithm selection decoupled from luma bias. So we could do stuff like

--aq-mode 4 --low-luma-bias 0.7

And make --aq-mode 3 essentially an alias to

--aq-mode 2 --low-luma-bias 1.0

Hi,

What settings would you recommend for a Dolby Vision encode to get a fairly small file (1-1.5 GB) for a 43-minute episode?

HD MOVIE SOURCE

4th December 2022, 06:40

Very valuable info, thank you. I'd be interested in trying the 5x keyint for lookahead. Does higher lookahead require much more processing time? If so, it really depends on what can be gained from 5x lookahead.

I have thought about brighter scenes being bit-rate starved, but I haven't found it to be an issue when bit-rates are high enough. I guess it could be an issue for bit-rate-restricted content. I typically find that black-level detail can still be an issue even on 4K discs. So even using it as a tool to distribute more bits into black-level detail is useful. Do you think this is why it potentially could improve gradients from black to full color? Because there are more bits being distributed in the low end of the spectrum and now the gradients look smoother?

benwaggoner

7th December 2022, 20:10

For example jpsdr's recent x265 mods have this kind of a feature. There is the parameter --aq-bias-strength which is a multiplier on top of aq-strength. I believe it should make --sbrc work more efficiently. (aq-mode 5 mentioned here is aq-mode 4 + low luma bias)

--aq-bias-strength <float>
Adjust the strength of dark scene bias in AQ modes 3 and 5. Setting this
to 0 will disable the dark scene bias, meaning modes will be equivalent to
their unbiased counterparts (2 and 4).
Default 1.0.
Well, that's just lovely! Have these changes been contributed back to MCW for x265? I've discussed exactly this feature with MCW developers before, and they showed some interest.

I don't really use other private forks much to keep my settings compatible with other x265 implementations. I'm going to try this one out; I have some content types where those tweaks could really make a difference.
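As a reading aid for the --aq-bias-strength help text quoted above, here is a small Python sketch of the equivalence it describes. It only restates the quoted help text for jpsdr's mod (modes 3 and 5 are the biased versions of 2 and 4, and a bias strength of 0 removes the bias). The function and table names are hypothetical, and nothing here reflects how the encoder implements it internally.

```python
# Biased AQ modes and their unbiased counterparts, per the quoted help text.
UNBIASED_COUNTERPART = {3: 2, 5: 4}


def equivalent_aq_mode(aq_mode: int, aq_bias_strength: float = 1.0) -> int:
    """Return the AQ mode a setting is equivalent to.

    With a bias strength of 0, the dark scene bias is disabled, so modes
    3 and 5 behave like modes 2 and 4. Any other combination is unchanged.
    """
    if aq_mode in UNBIASED_COUNTERPART and aq_bias_strength == 0:
        return UNBIASED_COUNTERPART[aq_mode]
    return aq_mode


print(equivalent_aq_mode(3, 0.0))  # behaves like the unbiased mode
print(equivalent_aq_mode(3, 0.7))  # still biased, just weaker
```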

Boulder

8th December 2022, 06:00

Unfortunately not, at least I've not seen it discussed in the x265-devel list. In my opinion, it is too difficult to contribute since it's not just about submitting a pull request but requires more effort.

benwaggoner

9th December 2022, 22:18

What settings would you recommend for a Dolby Vision encode to get a fairly small file (1-1.5 GB) for a 43-minute episode?
Profile 5 or 8.1?

Those file sizes give around 3-4.5 Mbps; pretty darn low for 4K. I'd look at scaling down to 1080p or maybe 1440p. I think only clean animation would encode well at those bitrates. Something with even light noise might require --crf 30 at full resolution. 2-pass encoding could be preferable as final file sizes would vary a lot depending on source specifics.
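The bitrate figure can be reproduced with a quick calculation. This Python sketch assumes decimal gigabytes (1 GB = 10^9 bytes) and treats the whole file as video, so any audio track would lower the video share further. The function name is a hypothetical helper.

```python
def average_bitrate_mbps(size_gb: float, minutes: float) -> float:
    """Average bitrate in Mbps for a file of size_gb decimal gigabytes."""
    bits = size_gb * 1e9 * 8   # assumed: 1 GB = 10^9 bytes
    seconds = minutes * 60
    return bits / seconds / 1e6


for size in (1.0, 1.5):
    print(f"{size} GB over 43 minutes: {average_bitrate_mbps(size, 43):.2f} Mbps")
```

That gives roughly 3.1 and 4.65 Mbps for the two ends of the range, in line with the estimate above.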

The job is more about artifact suppression than fine detail retention at those bitrates. So you'll definitely want --sao, default in-loop deblocking, some --nr-intra, and maybe some --nr-inter below 100. And I'd start with --preset slower at a minimum.

benwaggoner

9th December 2022, 22:27

Very valuable info, thank you. I'd be interested in trying the 5x keyint for lookahead. Does higher lookahead require much more processing time? If so, it really depends on what can be gained from 5x lookahead.
--rc-lookahead can't be higher than --keyint. For single pass encodes when I have enough RAM and time, I try to use keyint=rc-lookahead.

There is some slowdown with higher values, but I find it's relatively cheap as quality/speed tradeoff features go. The bigger limitation is the RAM requirements, which can increase almost linearly with --rc-lookahead. I've had a 60 GB single x265 process doing 8K 10-bit encoding with --rc-lookahead 96.
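To get a feel for why memory grows with lookahead, here is a rough Python estimate of the raw picture storage alone. It assumes 4:2:0 chroma subsampling and two bytes per sample for anything above 8-bit; both are assumptions, and the function name is hypothetical. Real encoder memory use includes many more buffers than this, so treat it as a floor rather than a prediction.

```python
def raw_frame_bytes(width: int, height: int, bit_depth: int) -> int:
    """Bytes needed to hold one raw 4:2:0 frame (assumed layout)."""
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)  # two quarter-size planes
    return (luma + chroma) * bytes_per_sample


frame = raw_frame_bytes(7680, 4320, 10)
for lookahead in (48, 96):
    total_gb = frame * lookahead / 1e9
    print(f"--rc-lookahead {lookahead}: about {total_gb:.1f} GB of raw pictures")
```

For 8K 10-bit with --rc-lookahead 96 this comes to about 9.6 GB for the source pictures alone, and it doubles when the lookahead doubles, which fits the near-linear scaling described above even though the total process size is much larger.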

I have thought about brighter scenes being bit-rate starved, but I haven't found it to be an issue when bit-rates are high enough. I guess it could be an issue for bit-rate-restricted content. I typically find that black-level detail can still be an issue even on 4K discs. So even using it as a tool to distribute more bits into black-level detail is useful. Do you think this is why it potentially could improve gradients from black to full color? Because there are more bits being distributed in the low end of the spectrum and now the gradients look smoother?
Yeah, --aq-mode 3 lowers QP near black, and so a gradient that has one end that dark will benefit from it.

Using 10-bit instead of 8-bit can also really help gradients in general. And use the --dither parameter if you're encoding at a lower color depth than the source, which can help reduce 8-bit SDR banding as well.

Lots of these things are bitrate related. If you have enough bits, they may not be much of an issue. But if trying to really maximize bang for the bit, careful tuning can really help. Optimal --aq-mode and --aq-strength can vary quite a bit in different shots and scenes.

A whole lot of deep amazing things can be done using x265 as .dll instead of an .exe, where some parameters can be varied by GOP, frame, and even CU.

It'd be lovely to have some sort of standard XML syntax for .dll specific tweaks with an .exe that would parse those and pass them on to the .dll in a predictable and portable way.

madey83

10th December 2022, 21:28

Hi,
Sorry, I did not mention that I will downscale to 1080p, keeping Profile 5 as in the original source, and for noisy sources I'd probably use a denoise filter.

My preference for the encode is CRF.

HD MOVIE SOURCE

10th December 2022, 22:33

Great info, thank you.

benwaggoner

12th December 2022, 22:03

Hi,
Sorry, I did not mention that I will downscale to 1080p, keeping Profile 5 as in the original source, and for noisy sources I'd probably use a denoise filter.

My preference for the encode is CRF.
Profile 5 can be somewhat challenging for an encoder, as it shifts the PQ EOTF to maximize the dynamic range of the current shot. So, if encoding something that's only 100 nits, it'll expand the Y' range higher so there are more code values.

This looks a lot like fades to the encoder, so having p- and b-frame weighted prediction on is important. The --fades feature could help as well, although I've never actually gotten it to result in different output than not using it.

The expanded range can also throw off tools that presume they are encoding actually perceptually linear code values. The impact of --nr-* could vary in strength on the same content if it got mapped to different code values. The broader the range of code values the image gets mapped to, the stronger the AC coefficients will be, and so the less impact a given --nr-* strength would have.

Long way of saying that it might be better to apply denoising in uncompressed PQ ICtCp before or after scaling, and send denoised frames to the encoder instead of using the built-in tools.

I've not actually tested this empirically, but it seems kind of inevitable barring some DoVi Profile 5 specific optimization in x265 that would compensate for the range expansion.

madey83

13th December 2022, 11:11

Hi,

Thank you for this, but the explanation is too deep for me.
I found the settings below and would like to ask whether they are good, or whether something should be changed to get better results:

Cry Macho 2021 1080p UHD BluRay DD+5.1 DoVi x265-DON

Writing library : x265 3.5+39-931178347:[Windows][MSVC 1933][64 bit] 10bit
Encoding settings : cpuid=1049583 / frame-threads=4 / numa-pools=24 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x804 / interlace=0 / total-frames=0 / level-idc=51 / high-tier=1 / uhd-bd=0 / ref=6 / no-allow-non-conformance / repeat-headers / annexb / aud / no-eob / no-eos / hrd / info / hash=0 / no-temporal-layers / no-open-gop / min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=16 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=40 / lookahead-slices=4 / scenecut=40 / no-hist-scenecut / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=4 / tu-intra-depth=4 / limit-tu=4 / rdoq-level=2 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=4 / limit-refs=1 / limit-modes / me=3 / subme=5 / merange=57 / temporal-mvp / no-frame-dup / no-hme / weightp / weightb / no-analyze-src-pics / deblock=-3:-3 / no-sao / no-sao-non-deblock / rd=4 / selective-sao=0 / no-early-skip / rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=2.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=13.7 / qcomp=0.80 / qpstep=4 / stats-write=0 / stats-read=0 / vbv-maxrate=160000 / vbv-bufsize=160000 / vbv-init=0.9 / min-vbv-fullness=50.0 / max-vbv-fullness=80.0 / crf-max=0.0 / crf-min=0.0 / ipratio=1.20 / pbratio=1.10 / aq-mode=1 / aq-strength=0.85 / no-cutree / zone-count=8 / zones: / start-frame=112 / end-frame=430 / bitrate-factor=4.000000 / zones: / start-frame=14684 / end-frame=15005 / bitrate-factor=7.000000 / zones: / start-frame=16507 / end-frame=17008 / bitrate-factor=1.500000 / zones: / start-frame=17009 / end-frame=17145 / bitrate-factor=2.000000 / zones: / start-frame=17601 / end-frame=23536 / bitrate-factor=2.500000 / zones: / start-frame=31365 / end-frame=37290 / bitrate-factor=3.000000 / 
zones: / start-frame=44733 / end-frame=50608 / bitrate-factor=3.000000 / zones: / start-frame=140360 / end-frame=140415 / bitrate-factor=3.000000 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / no-const-vbv / sar=0 / overscan=0 / videoformat=5 / range=0 / colorprim=9 / transfer=16 / colormatrix=9 / chromaloc=1 / chromaloc-top=2 / chromaloc-bottom=2 / display-window=0 / master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50) / cll=1134,104 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / hist-threshold=0.03 / no-opt-cu-delta-qp / no-aq-motion / hdr10 / hdr10-opt / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=0 / analysis-save-reuse-level=0 / analysis-load-reuse-level=0 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00 / scenecut-aware-qp=0conformance-window-offsets / right=0 / bottom=0 / decoder-max-rate=0 / no-vbv-live-multi-pass

Of course without providing zones to encoder.

benwaggoner

13th December 2022, 16:11

Could you give me your command line instead? That's easier to parse.

madey83

13th December 2022, 16:54

Could you give me your command line instead? That's easier to parse.

This is not mine. I found it on the internet.

HD MOVIE SOURCE

30th December 2022, 18:37

Do you have any other settings that could improve detail? How about rskip=0? Is this a setting that is good for grain retention? If it's good for grain retention, could it also be good for small details that are not grain?

Thanks.

benwaggoner

3rd January 2023, 18:56

Do you have any other settings that could improve detail? How about rskip=0? Is this a setting that is good for grain retention? If it's good for grain retention, could it also be good for small details that are not grain?

I've found --rskip 2 with --rskip-edge-threshold 2 or 3 works well with typical film grain, without as big a speed hit as --rskip 0.

Raising --psy-rd and --psy-rdoq some can help. Try 3 for each as a starting point.

--ctu 32 for sure

The built-in --nr-inter and --nr-intra aren't particularly advanced denoising filters, but as they operate at the quantization phase, they can offer bigger compression efficiency benefits than other methods. Start with something like --nr-intra 100 --nr-inter 400. --nr-intra is a spatial denoising filter and --nr-inter is temporal, which is why --nr-inter gets a good multiple higher, as film grain is all temporal.
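Pulling the suggestions in this post together, here is a Python sketch that assembles them into a single x265 argument list. It is a starting point under stated assumptions, not a recommended final command: the file names are placeholders, the helper function is hypothetical, and the threshold option is spelled --rskip-edge-threshold as in the x265 documentation.

```python
import shlex


def grain_tuning_args(rskip_edge_threshold: int = 2) -> list[str]:
    """Starting-point options for grainy film content, per the post above."""
    return [
        "--rskip", "2",
        "--rskip-edge-threshold", str(rskip_edge_threshold),  # 2 or 3 suggested
        "--psy-rd", "3",
        "--psy-rdoq", "3",
        "--ctu", "32",
        "--nr-intra", "100",   # spatial
        "--nr-inter", "400",   # temporal, a good multiple of --nr-intra
    ]


# Placeholder file names; substitute a real source and output path.
command = ["x265", *grain_tuning_args(), "--input", "input.y4m", "--output", "output.hevc"]
print(shlex.join(command))
```

Keeping the options in a list makes it easy to change one value at a time and compare encodes, which matters here because most of these settings are content dependent.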

HD MOVIE SOURCE

4th January 2023, 19:52

Okay, thank you. Now, what is it about ctu=32 that has a benefit over ctu=64? This isn't something I know anything about. I just assumed it defaults to 64 and that's where it should be.

Does it just give more control over each block? Would this increase bit-rates?

Thanks.

HD MOVIE SOURCE

5th January 2023, 19:16

tskip can be useful, and --tskip-fast gives much of the benefit with fewer resources. For rskip, are 0, 1, and 2 just progressively lower levels of rskip, or are they completely different modes meant for different content? What I'm really asking is whether there is an rskip-fast like tskip-fast, because that is extremely useful.

I wanted to ask a bit more about aq-mode=3 and aq-strength. I believe aq-strength ranges from 0 (off) to 3 (full). But when you set aq-strength to 3, what is that really doing? Is it trying to place 100% of its bit-rate in darker areas? Or is there a cap, so that even when set to 3 only, say, 25% of its bit-rate goes to darker content?

Let's just say it directed 25% of its bit-rate to dark areas. If I lower aq-strength from 3 to 1.5, would that mean it now puts 12.5% of its bit-rate into dark areas?

I'm just trying to understand how aq-strength works in relation to aq-mode 3.

Thanks.

benwaggoner

6th January 2023, 23:59

All your questions and more are answered at: https://x265.readthedocs.io/en/master/cli.html

It specifies the data type of each parameter.

For --rskip, --rskip 2 is the mode you want to use (deprecating the older, poor at grain default --rskip 1). You control speed/quality tradeoff with --rskip 2 with the --rskip-edge-threshold parameter. Lower values are slower but more accurate. The default is 5, and I like to use 2-3 for high quality encoding.

I'm not aware of any methods to estimate the bitrate impact of different --aq-modes, as they are quite content dependent.

HD MOVIE SOURCE

10th January 2023, 23:06

I appreciate it, thank you.

HD MOVIE SOURCE

28th January 2023, 18:10

With all the settings I'm using to pull out as much detail as possible using x265, one setting that I've forgotten about is cu-lossless. Can this help pull out more detail in grain and fine particles?

benwaggoner

1st February 2023, 00:39

With all the settings I'm using to pull out as much detail as possible using x265, one setting that I've forgotten about is cu-lossless. Can this help pull out more detail in grain and fine particles?
No, --cu-lossless just tries mathematically lossless encoding for a given CU, and picks it if it is lower bitrate/distortion. In essence it's a really specialized and compute-intensive subset of --tskip. It would only be used with very high bitrates or some very specialized patterns, like a fully noise-free digitally rendered black/white checkerboard pattern, color bars, or something artificial like that. Grainy content is pretty much the inverse of when it might be helpful.

I've never found a real-world scenario where --cu-lossless would be useful for a distribution encode. Perhaps for a mezzanine file. --tskip should capture most or all of the value at a much lower performance hit. There's no --cu-lossless-fast equivalent to --tskip-fast.

HD MOVIE SOURCE

22nd February 2023, 05:19

Thanks, will run some more encodes to see what it's giving me. I'll also run some film grain tests, but as you said, it's very compute-intensive, and you need a monster setup to run this. It's okay for 4-minute content to test, but for full movies, it's a real struggle.

I appreciate the feedback.

HD MOVIE SOURCE

24th February 2023, 18:42

You're right, I tested this on grain content, and I got the exact same results, right down to the average QP. Hmm, interesting, good to know. I was also using tskip, and both encodes with cu-lossless and without have the exact same results. It may as well have been the exact same file. Very interesting.

benwaggoner

4th March 2023, 23:06

You're right, I tested this on grain content, and I got the exact same results, right down to the average QP. Hmm, interesting, good to know. I was also using tskip, and both encodes with cu-lossless and without have the exact same results. It may as well have been the exact same file. Very interesting.
Yep, that's what will happen if x265 doesn't detect any CUs where lossless encoding has better rate/distortion. Which I would expect if the clip is all grainy frames.

You may not be getting any benefit from --tskip either. It can help with noise-free credits in grainy movies, but isn't going to be useful for any CU with significant random noise. It's pretty hard to simply spend more compute to make grainy content look better than --preset slower. What improves grain is using the --nr-i* features, lowering ipratio and pbratio, and other things that are more qualitative tuning, not just more expensive modes.

HD MOVIE SOURCE

21st March 2023, 04:22

Good to know, and thank you for the advice.

HD MOVIE SOURCE

4th May 2023, 16:28

Hey, was just running some tests for ctu=64 and ctu=32, and ctu=32 was minorly better. Is this a simple...test on the content first? To find out what's the best methodology? Are there rules like, animation is best on ctu=64 and film is best on ctu=32? Because I was testing film content and 32, again, was better, but by a hair.

Would going lower than this be viable? Have you ever done that?

Also, based on the CTU size, does this affect tu-intra-depth and tu-inter-depth? From my understanding, tu-intra-depth and tu-inter-depth are how small into the details the encode goes, is that correct? So for fine details, film grain and things like that, tu-intra-depth=4 and tu-inter-depth=4 would typically be better.
Does using ctu=32 affect the size of the intra and inter depth?

benwaggoner

6th May 2023, 07:51

Hey, was just running some tests for ctu=64 and ctu=32, and ctu=32 was minorly better. Is this a simple...test on the content first? To find out what's the best methodology? Are there rules like, animation is best on ctu=64 and film is best on ctu=32? Because I was testing film content and 32, again, was better, but by a hair.

Would going lower than this be viable? Have you ever done that?
I've never seen any benefit to using ctu below 32 myself, other than squeezing out a last bit of extra encoder speed and parallelization.

Also, based on the CTU size, does this affect tu-intra-depth and tu-inter-depth? From my understanding, tu-intra-depth and tu-inter-depth are how small into the details the encode goes, is that correct? So for fine details, film grain and things like that, tu-intra-depth=4 and tu-inter-depth=4 would typically be better.
Does using ctu=32 affect the size of the intra and inter depth?
They are relative to ctu, so going all the way down to depth 4 is only possible with ctu 64. With ctu 32 the limit is 3, and with ctu 16 it is 2.

Boulder

6th May 2023, 18:24

They are relative to ctu, so going all the way down to depth 4 is only possible with ctu 64. With ctu 32 the limit is 3, and with ctu 16 it is 2.
They are relative to TU. The default TU should be the same as CTU as per the docs, but it actually is 32 unless CTU is lower. 32 is enough to use depth 4. I use CTU 32, TU 32, depth 4 in my encodes.
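
To make the "32 is enough to use depth 4" point concrete, here is a rough sketch of the arithmetic. It is a simplified model, not x265's actual decision logic: it assumes each extra depth level halves the TU edge, that the starting TU is capped at the maximum TU size (32), and that HEVC's smallest transform is 4x4. The function name `reachable_tu_sizes` is made up for illustration.

```python
def reachable_tu_sizes(cu_size, depth, max_tu_size=32, min_tu_size=4):
    """Simplified model: TU edge lengths that a given tu-*-depth allows for one CU."""
    size = min(cu_size, max_tu_size)  # a TU cannot be larger than the maximum TU size
    sizes = [size]
    for _ in range(depth - 1):  # depth 1 means no extra split below the starting size
        if size // 2 < min_tu_size:
            break  # cannot split below the 4x4 minimum transform
        size //= 2
        sizes.append(size)
    return sizes


for ctu in (64, 32, 16):
    print(ctu, reachable_tu_sizes(ctu, depth=4))
```

Under these assumptions both ctu 64 and ctu 32 start at a 32x32 TU and reach 4x4 at depth 4, while ctu 16 runs out of splits after three sizes.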

HD MOVIE SOURCE

8th May 2023, 04:04

They are relative to TU. The default TU should be the same as CTU as per the docs, but it actually is 32 unless CTU is lower. 32 is enough to use depth 4. I use CTU 32, TU 32, depth 4 in my encodes.

Interesting, are you seeing better encodes using CTU=32? I've seen one encode where the QP only dropped by 0.1 on average, so I didn't know whether to use it or not.

In theory can CTU=32 grab more detail than 64? I might just run some tests for even lower than 32 just to see it.

Boulder

8th May 2023, 05:03

Interesting, are you seeing better encodes using CTU=32? I've seen one encode where the QP only dropped by 0.1 on average, so I didn't know whether to use it or not.

In theory can CTU=32 grab more detail than 64? I might just run some tests for even lower than 32 just to see it.

Let's say that I've not seen any quality loss because of it. I mostly do encodes between 720p and 1440p and the lower value improves multithreading a lot.

There is also something strange happening with CTU 64 + limit-tu 0 + rskip 2 and MultiCoreWare just ignored the report. I don't trust that it wouldn't appear also in other cases. CTU 32 doesn't have it.

https://forum.doom9.org/showthread.php?p=1919347&highlight=--limit-tu%2A0#post1919347

excellentswordfight

8th May 2023, 09:23

Interesting, are you seeing better encodes using CTU=32? I've seen one encode where the QP only dropped by 0.1 on average, so I didn't know whether to use it or not.
What resolution are you encoding at? There are benefits to using CTU 64 at high resolutions, i.e. 4K/UHD. Many years ago I asked someone at MultiCoreWare (when they were active here) why they didn't automatically lower it to 32 for lower resolutions, since 64 seems a bit overkill there and lowering it improves multithreading so much. They replied with some numbers showing increased compression efficiency even at 720p/1080p. It was minor, mind you, and tbh I have never seen any kind of noticeable improvement that was worth the speed penalty on high-thread-count systems.
In theory can CTU=32 grab more detail than 64? I might just run some tests for even lower than 32 just to see it.
The theory is that larger CU sizes can compress the image more efficiently, and so should actually increase detail, as the encoder can spend the saved bits where they are more needed instead. But as with a lot of other parameters for improved compression, they might not yield "better" results when encoding for visually lossless. I'm not sure why it would hurt quality in this case, though. I've seen people mention that it would improve grain encoding, but I have yet to see it demonstrated.
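
A quick way to see why a smaller CTU helps multithreading: x265's wavefront parallel processing (WPP) works on rows of CTUs, so the number of CTU rows in a frame is a rough upper bound on how many rows can be in flight at once. The sketch below only counts rows; it says nothing about real-world speed, and the helper name `ctu_rows` is made up for illustration.

```python
import math


def ctu_rows(height, ctu):
    """Number of CTU rows needed to cover a frame of the given height."""
    return math.ceil(height / ctu)


for height in (720, 1080, 2160):
    print(height, {ctu: ctu_rows(height, ctu) for ctu in (64, 32)})
```

At 1080p this gives 17 rows with ctu 64 against 34 with ctu 32, which fits the observation that the lower value matters most on high-thread-count systems.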

benwaggoner

17th May 2023, 18:20

Interesting, are you seeing better encodes using CTU=32? I've seen one encode where the QP only dropped by 0.1 on average, so I didn't know whether to use it or not.

In theory can CTU=32 grab more detail than 64? I might just run some tests for even lower than 32 just to see it.
I have seen --ctu 32 provide better quality with very grainy content at 4K and HD resolutions.

HD MOVIE SOURCE

3rd June 2023, 04:27

I have seen --ctu 32 provide better quality with very grainy content at 4K and HD resolutions.

That's interesting. Does CTU=32 typically cost more bitrate to use than CTU=64? Is CTU=64 typically better for animation with plain objects, like plain skies, where CTU=64 would be more useful?

HD MOVIE SOURCE

18th June 2023, 04:02

I have seen --ctu 32 provide better quality with very grainy content at 4K and HD resolutions.

I noticed that --hme has some settings that go with it.

--hme-search between 0,1,2. Is 2 the best? / Highest setting?

--hme-range between 0,1,2. Same question.

Is 2 for both settings maxed out?

benwaggoner

19th June 2023, 17:59

That's interesting. Does CTU=32 typically cost more bitrate to use than CTU=64? Is CTU=64 typically better for animation with plain objects, like plain skies, where CTU=64 would be more useful?
Yes, its main value is increased compression efficiency for big flatter/gradient areas. It's not a big improvement; 32x32 is already much bigger than prior codecs supported.

benwaggoner

19th June 2023, 18:06

I noticed that --hme has some settings that go with it.

--hme-search between 0,1,2. Is 2 the best? / Highest setting?

--hme-range between 0,1,2. Same question.

Is 2 for both settings maxed out?
They are documented here: https://x265.readthedocs.io/en/master/cli.html#temporal-motion-search-options

--hme-search is 0-5, just replicating the --me options.
--hme-range replicates --merange, and so can be anything from 0 to 32768

HD MOVIE SOURCE

26th June 2023, 21:29

They are documented here: https://x265.readthedocs.io/en/master/cli.html#temporal-motion-search-options

--hme-search is 0-5, just replicating the --me options.
--hme-range replicates --merange, and so can be anything from 0 to 32768

I appreciate it thanks. Have you experimented with any specific settings here? Are the defaults more than fine?

Is search best at 5, but say, the hme range to follow whichever settings you use for merange? Which for me is 57.

benwaggoner

27th June 2023, 22:41

I appreciate it thanks. Have you experimented with any specific settings here? Are the defaults more than fine?

Is search best at 5, but say, the hme range to follow whichever settings you use for merange? Which for me is 57.
I've not found any material quality or performance benefit from using --hme instead of not. x265 doesn't seem to be doing frame or GOP level parallelism for the different sizes, so practical speed benefits are a real challenge. I believe the engineering was more focused on the UHDKit product's ability to simultaneously encode, for example, a 540p, 1080p, and 2160p with each higher resolution reusing the motion search of the prior resolution, and getting parallelism from that.

I can see some combination of coarser --me with a wider net (note that 24 px at the 25% size would be 96 px in the final frame) search range for the lower resolutions, and a more precise me with a smaller search range at the higher makes intuitive sense as a potential quality/speed tradeoff improvement. I've not dived deep to try and find that combination, however.

There was some early talk from MCW some years ago that --hme would help improve quality with very grainy content, as the low pass filtering of the downscales would exclude false positive motion vectors resulting from random grain matches, with the full resolution pass refining on the initial matches. Which makes intuitive sense, but I never saw an actual PoC of this benefit.

I didn't find any benefit in a couple of rounds of initial testing. It's on my backlog of things to noodle with further time permitting.

HD MOVIE SOURCE

29th June 2023, 03:57

I've not found any material quality or performance benefit from using --hme instead of not. x265 doesn't seem to be doing frame or GOP level parallelism for the different sizes, so practical speed benefits are a real challenge. I believe the engineering was more focused on the UHDKit product's ability to simultaneously encode, for example, a 540p, 1080p, and 2160p with each higher resolution reusing the motion search of the prior resolution, and getting parallelism from that.

I can see some combination of coarser --me with a wider net (note that 24 px at the 25% size would be 96 px in the final frame) search range for the lower resolutions, and a more precise me with a smaller search range at the higher makes intuitive sense as a potential quality/speed tradeoff improvement. I've not dived deep to try and find that combination, however.

There was some early talk from MCW some years ago that --hme would help improve quality with very grainy content, as the low pass filtering of the downscales would exclude false positive motion vectors resulting from random grain matches, with the full resolution pass refining on the initial matches. Which makes intuitive sense, but I never saw an actual PoC of this benefit.

I didn't find any benefit in a couple of rounds of initial testing. It's on my backlog of things to noodle with further time permitting.

Interesting, thanks.

HD MOVIE SOURCE

1st July 2023, 21:01

I've not found any material quality or performance benefit from using --hme instead of not. x265 doesn't seem to be doing frame or GOP level parallelism for the different sizes, so practical speed benefits are a real challenge. I believe the engineering was more focused on the UHDKit product's ability to simultaneously encode, for example, a 540p, 1080p, and 2160p with each higher resolution reusing the motion search of the prior resolution, and getting parallelism from that.

I can see some combination of coarser --me with a wider net (note that 24 px at the 25% size would be 96 px in the final frame) search range for the lower resolutions, and a more precise me with a smaller search range at the higher makes intuitive sense as a potential quality/speed tradeoff improvement. I've not dived deep to try and find that combination, however.

There was some early talk from MCW some years ago that --hme would help improve quality with very grainy content, as the low pass filtering of the downscales would exclude false positive motion vectors resulting from random grain matches, with the full resolution pass refining on the initial matches. Which makes intuitive sense, but I never saw an actual PoC of this benefit.

I didn't find any benefit in a couple of rounds of initial testing. It's on my backlog of things to noodle with further time permitting.

One thing I've noticed, and I think it's when I started using hme, and that's...Even when I set my encoder to placebo, me=2 gets set to 2 and I believe placebo should be set to 5. merange=48 gets set to 48 and placebo is 92.

I re-encoded with me=5 and that got set fine. However, I set merange to the default of 57 and it still gets set to 48. Is hme controlling this?

microchip8

1st July 2023, 22:05

One thing I've noticed, and I think it's when I started using hme, and that's...Even when I set my encoder to placebo, me=2 gets set to 2 and I believe placebo should be set to 5. merange=48 gets set to 48 and placebo is 92.

I re-encoded with me=5 and that got set fine. However, I set merange to the default of 57 and it still gets set to 48. Is hme controlling this?

hme has its own search radius, which you can set using --hme-range

HD MOVIE SOURCE

3rd July 2023, 01:59

hme has its own search radius, which you can set using --hme-range

Okay, thank you. I haven't had much experience with this so I'm still learning how hme affects other settings. Thanks

benwaggoner

3rd July 2023, 17:46

One thing I've noticed, and I think it's when I started using hme, and that's...Even when I set my encoder to placebo, me=2 gets set to 2 and I believe placebo should be set to 5. merange=48 gets set to 48 and placebo is 92.

I re-encoded with me=5 and that got set fine. However, I set merange to the default of 57 and it still gets set to 48. Is hme controlling this?
When you're using --hme, the --hme-search and --hme-range override --me and --merange. So --preset has no impact on those.

To get a "placebo" --hme, you'd probably use

--hme-search 3,3,3 --hme-range 92,92,92

it's the last digit of each that's most important, as those are for the full resolution final pass. I imagine that

--hme-search 2,2,3 --hme-range 57,57,92

Would give you essentially equal results.

I don't think that --hme would offer much benefit at all in placebo. To get a better speed-quality benefit, something like the below would be more likely to be net beneficial. HME techniques are generally about improving quality @ speed, not maximum quality when not speed bound.

--hme-search 2,2,3 --hme-range 25,25,26

Where the motion search range can be constrained for the higher resolution passes as the coarser stages were able to identify good matches to refine in the later stages. a 25 range at quarter resolution maps to 100 at the final resolution.

(25 constrains motion search to 32x32 in hex, as does 26 to 32x32 in star. That allows another 32x32 row to be parallelized in WPP versus higher values).
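
The scaling in that last point is easy to tabulate. This is just the arithmetic from the post, assuming the three HME levels are quarter, half, and full resolution per dimension as described above; the names `LEVEL_SCALE` and `full_res_reach` are made up for illustration.

```python
# Per-dimension downscale factor of each HME level, as described in this thread:
# level 0 = quarter resolution, level 1 = half resolution, level 2 = full resolution.
LEVEL_SCALE = (4, 2, 1)


def full_res_reach(hme_range):
    """Convert per-level search ranges into their full-resolution pixel equivalents."""
    return tuple(r * s for r, s in zip(hme_range, LEVEL_SCALE))


print(full_res_reach((25, 25, 26)))  # (100, 50, 26)
print(full_res_reach((16, 32, 48)))  # (64, 64, 48)
```

So the constrained 25,25,26 setting still reaches 100 px at the coarsest level, while the later levels only refine around what was already found.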

HD MOVIE SOURCE

11th July 2023, 22:07

When you're using --hme, the --hme-search and --hme-range override --me and --merange. So --preset has no impact on those.

To get a "placebo" --hme, you'd probably use

--hme-search 3,3,3 --hme-range 92,92,92

it's the last digit of each that's most important, as those are for the full resolution final pass. I imagine that

--hme-search 2,2,3 --hme-range 57,57,92

Would give you essentially equal results.

I don't think that --hme would offer much benefit at all in placebo. To get a better speed-quality benefit, something like the below would be more likely to be net beneficial. HME techniques are generally about improving quality @ speed, not maximum quality when not speed bound.

--hme-search 2,2,3 --hme-range 25,25,26

Where the motion search range can be constrained for the higher resolution passes as the coarser stages were able to identify good matches to refine in the later stages. a 25 range at quarter resolution maps to 100 at the final resolution.

(25 constrains motion search to 32x32 in hex, as does 26 to 32x32 in star. That allows another 32x32 row to be parallelized in WPP versus higher values).

I really appreciate the explanation on this, because I've been testing some search and range settings for hme and my encodes have been crashing. Now, you've provided the command lines to be more understandable I'll give them a try.

I've recently been testing plain placebo against placebo with hme. Going by the figures below, hme changed the average QP by only about 0.01 compared with plain placebo, which uses me=5 and 92 for the merange.
Interestingly enough, the hme encode was faster than placebo when just stating hme and not stating the range or search values. So, I think you might be right that under placebo it may not be needed. However, now knowing what I can change, I might be able to eke more out of it, LOL.

Placebo with no hme = Job completed (Elapsed Time: 8h 16m)
Placebo with basic hme turned on = Job completed (Elapsed Time: 7h 09m)

Placebo with no hme = Avg QP: 8.74
Placebo with basic hme turned on = Avg QP: 8.73

So, unless you can tune hme significantly with the settings you gave (--hme-search 3,3,3 --hme-range 92,92,92), maybe there's some more quality to be gained? But I assume you'd lose speed with these settings vs the default, which is 16,32,48 for the range, but I'm not sure what it is for the search.

My media info says hme / Level / merange / L0,L1,L2=16,32,48, which is me just stating hme in the command line. I did a search for hme-search and that isn't there. I assume that you have to manually state that?

The big thing is, is there a visual difference when using hme vs not using it when using placebo? Also, I wonder when the speed crosses over? Meaning, I wonder at which preset hme becomes slower? Because the merange is always 57 on every other preset. If hme-range is the equivalent and by default only goes to 48, is hme always better to use?

For hme-range <integer>,<integer>,<integer>, is the last number the most important one? Do the first and second numbers matter if they are lower-resolution searches?

Thanks.

microchip8

12th July 2023, 08:06

If you're encoding Full HD or lower, I wouldn't bother with HME.

benwaggoner

12th July 2023, 22:25

My media info says hme / Level / merange / L0,L1,L2=16,32,48, which is me just stating hme in the command line. I did a search for hme-search and that isn't there. I assume that you have to manually state that?
It's the same values as --me and --merange, so

--hme-search:
0: dia
1: hex (default)
2: umh
3: star
4: sea
5: full

Thus the default search modes for HME are Dia for 1/4 res, Hex for 1/2 res, and UMH for full res. --preset placebo uses 3: Star. So for placebo-equivalent you'd want at least the third --hme-search value to be 3.

The big thing is, is there a visual difference when using hme vs not using it when using placebo? Also, I wonder when the speed crosses over? Meaning, I wonder at which preset hme becomes slower? Because the merange is always 57 on every other preset. If hme-range is the equivalent and by default only goes to 48, is hme always better to use?
You can always use --merange 48. But the key goal of a HME algorithm is that you can do an initial coarse search at a lower resolution, and then refine with more accuracy but smaller search range as the resolutions go up. So, in theory, a --merange 92 gets handled by having the first motion search range be 92/4=23. There are intricacies between search modes and such, but that's the basic idea.

For hme-range <integer>,<integer>,<integer>, is the last number the most important one? Do the first and second numbers matter if they are lower-resolution searches?

I'm not sure which number is the most "important" - it gets down to finding the right combination of quality, speed.

Sure, --hme-search 3,3,3 --hme-range 24,48,92 should match placebo quality, but I don't know that it would look better than placebo, or that it would save any encoding time.
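
For anyone scripting test encodes, here is a small sketch that turns search mode names into the numeric form used above. It only builds the option string from the 0-5 table in this post and does not run x265 or validate the values against it; `ME_MODES` and `hme_args` are names made up for illustration.

```python
# Search mode numbers as listed in the post above (same numbering as --me).
ME_MODES = {"dia": 0, "hex": 1, "umh": 2, "star": 3, "sea": 4, "full": 5}


def hme_args(search, ranges):
    """Build the --hme option string from three mode names and three ranges."""
    if len(search) != 3 or len(ranges) != 3:
        raise ValueError("HME takes exactly three values, one per level")
    search_ids = ",".join(str(ME_MODES[name]) for name in search)
    range_ids = ",".join(str(r) for r in ranges)
    return f"--hme --hme-search {search_ids} --hme-range {range_ids}"


# The "should match placebo" combination from the post above.
print(hme_args(("star", "star", "star"), (24, 48, 92)))
```

Keeping the names in one table makes it harder to mix up, say, 2 (umh) and 3 (star) when comparing runs.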

N'Cha

23rd September 2023, 16:50

Hi,

I'm looking to find settings that increase detail and sharpness. In turn, I'm also looking for settings that apply blurring or softness and can be turned off.

Here are the settings that I know of so far that do this.

subme=7 (A sharpener, detail increase)
no-deblock / deblock (Blurring/detail loss the more deblock is used)
no-sao / sao (High Blur)
no-strong-intra-smoothing / strong-intra-smoothing (Smoother/Blur)
psy-rd=-- & psy-rdoq=-- rdoq-level=-- (Detail increase as the cost of accuracy - Can look over-digital)

I am unaware of any evidence that --strong-intra-smoothing reduces visible detail in any scenario.
Nor does --deblock if used appropriately. Both will reduce detail at moderate-low bitrates, as they reduce compression efficiency and drive QPs higher, which itself is a low-pass filter. The HEVC built-in deblocking filter is better tuned than H.264's.

--subme only goes up to 5 in --preset placebo. I've not seen visible quality improvements from using higher values.

If you have sharp synthetic details (text, line art, noise-free anime), --tskip can increase detail.

In theory --hme can help with very noisy content.
--amp and --rect can improve detail by allowing TUs to better match the shape of content, for example encoding a mild horizontal curve as 32x8 instead of forcing it to be square.

But in general, a lot of these things are controlled by the preset. It's best to pick the slowest preset that is fast enough, and start tweaking from there.

That "blurs" differences in QP to reduce quality fluctuations. It has no direct impact on sharpness.

I'd start doing a test encode at --preset slower --crf 18 --no-sao and see if the detail retention and file size are okay for you. If not, tweak from there. Jumping in with parameters without having some reference encodes to compare with the source and each other makes for a lot more work.

Is it still your opinion that subme 7 is not sharper/more detailed than subme 5? (especially for anime)

benwaggoner

26th September 2023, 01:11

Is it still your opinion that subme 7 is not sharper/more detailed than subme 5? (especially for anime)
I haven't tested this in many years, but I'm not aware of any changes that could make a difference. MCW's testing showed that --subme 7 wasn't even worth using in --preset placebo.

If anyone has more recent or specific results, I'd love to hear about them!

brad86

22nd January 2024, 09:58

Recently found a good balance of settings and quality that I would like for my encodes, except for one issue that I cannot seem to fix.
I'm getting a great image overall with these settings, other than blurring near the top of the canvas. It's like it doesn't know how to handle the edge very well.
I would love to find a setting that can stop this. A lower CRF value (even 0 for testing) does not stop it. aq-strength, is really the only tweak I make between different films (1.0-1.3), but again, even a high value here does not help.

You can see the issue I'm having below.

https://i.ibb.co/qsSjqs0/mpv-shot0001.jpg

My current settings are:
10-bit. CRF 17. Slow preset.
selective-sao=2:no-strong-intra-smoothing:rskip=2:rskip-edge-threshold=3:aq-mode=3:aq-strength=1.2:deblock=-3:-3

Boulder

22nd January 2024, 10:50

Borders cropped off before encoding? Mod-8/16 vertical resolution?

brad86

22nd January 2024, 11:00

Borders cropped off before encoding? Mod-8/16 vertical resolution?

No cropping. I despise doing such a thing. It is as it is on my disc.