x265 HEVC Encoder [Archive] - Page 170

jauh

6th February 2022, 23:24

Try switching to CTU 32 and I'm sure it will look better.

I'll give that a try, but I think it's chasing the wrong goal: rskip >0 gives an encoding speed improvement at the cost of coding efficiency, and lowering CTU to 32 lowers the latter further, whereas limit-tu 1 stops recursion based on the relative cost of a TU split. So, so far as I can tell you get better, albeit slower, coding efficiency at CTU=64 with rskip=0 and limit-tu=1 cf. CTU=32, rskip=1, and limit-tu=0.

I could be wrong, so I'll try it, just need to wait for the current encode to finish and it's chugging along at 1.43fps...

jauh

7th February 2022, 02:12

Try switching to CTU 32 and I'm sure it will look better.

So, I gave CTU=32 a go, and as you hypothesised, no paintbrush smear, but:-

CTU=64,QG=64,rskip=0,limit-tu=1:

x265 [info]: frame I: 7, Avg QP:18.46 kb/s: 33742.92
x265 [info]: frame P: 95, Avg QP:19.68 kb/s: 21594.99
x265 [info]: frame B: 398, Avg QP:21.23 kb/s: 7305.39
x265 [info]: Weighted P-Frames: Y:2.1% UV:2.1%
x265 [info]: Weighted B-Frames: Y:3.0% UV:1.3%
x265 [info]: consecutive B-frames: 7.8% 4.9% 2.9% 37.3% 12.7% 18.6% 9.8% 2.9% 0.0% 0.0% 0.0% 0.0% 0.0% 1.0% 0.0% 0.0% 2.0%
encoded 500 frames in 366.59s (1.36 fps), 10390.54 kb/s, Avg QP:20.89

CTU=32,QG=32,rskip=1,limit-tu=0:

x265 [info]: frame I: 7, Avg QP:18.46 kb/s: 33622.09
x265 [info]: frame P: 95, Avg QP:19.60 kb/s: 22259.21
x265 [info]: frame B: 398, Avg QP:21.10 kb/s: 8580.94
x265 [info]: Weighted P-Frames: Y:2.1% UV:2.1%
x265 [info]: Weighted B-Frames: Y:3.0% UV:1.3%
x265 [info]: consecutive B-frames: 7.8% 4.9% 2.9% 37.3% 12.7% 18.6% 9.8% 2.9% 0.0% 0.0% 0.0% 0.0% 0.0% 1.0% 0.0% 0.0% 2.0%
encoded 500 frames in 374.67s (1.33 fps), 11530.39 kb/s, Avg QP:20.78

CTU=32,QG=32,rskip=0,limit-tu=0:

x265 [info]: frame I: 7, Avg QP:18.46 kb/s: 33622.09
x265 [info]: frame P: 95, Avg QP:19.61 kb/s: 22226.53
x265 [info]: frame B: 398, Avg QP:21.10 kb/s: 8577.44
x265 [info]: Weighted P-Frames: Y:2.1% UV:2.1%
x265 [info]: Weighted B-Frames: Y:3.0% UV:1.3%
x265 [info]: consecutive B-frames: 7.8% 4.9% 2.9% 37.3% 12.7% 18.6% 9.8% 2.9% 0.0% 0.0% 0.0% 0.0% 0.0% 1.0% 0.0% 0.0% 2.0%
encoded 500 frames in 372.18s (1.34 fps), 11521.40 kb/s, Avg QP:20.78

So it does help, but at a cost... In the circumstances, I think CTU=QG=64,rskip=0,limit-tu=1 is the optimal variant...
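
To put a number on "at a cost", here is a small sketch that compares the average bitrates from the three summary lines above. The dictionary keys and the helper name are made up for illustration; only the kb/s figures come from the logs.

```python
# Average bitrates (kb/s) copied from the three "encoded 500 frames" lines above.
# The labels are hypothetical names chosen for this sketch.
bitrates = {
    "ctu64_qg64_rskip0_limit-tu1": 10390.54,
    "ctu32_qg32_rskip1_limit-tu0": 11530.39,
    "ctu32_qg32_rskip0_limit-tu0": 11521.40,
}


def percent_increase(baseline, other):
    """Relative bitrate increase of `other` over `baseline`, in percent."""
    return (other / baseline - 1.0) * 100.0


baseline = bitrates["ctu64_qg64_rskip0_limit-tu1"]
for label, kbps in bitrates.items():
    print(f"{label}: {kbps:.2f} kb/s ({percent_increase(baseline, kbps):+.2f}%)")
```

On this 500-frame sample, both CTU=32 runs come out roughly 11% larger than the CTU=64 run at the same CRF.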

Boulder

7th February 2022, 06:07

asarian

7th February 2022, 09:03

jauh

7th February 2022, 10:48

Looks like all recent x265 binaries are broken. Whichever one I use, keeps saying

y4m [info]: 3840x2160 fps 24000/1001 i420p10 unknown frame count
raw [info]: output file: q:\video\arti.hevc
x265 [info]: HEVC encoder version 3.5+20-17839cc0d
x265 [info]: build info [Windows][GCC 11.2.0][64 bit] 10bit

(With --frames on the command line, of course).

Only the yuuki one I still have, from April last year, still shows the frames. Did something change?

It's not a bug, it's literally telling you that the pipe input doesn't have a clue about how many frames it has (how would a pipe know how many frames exist?) The progress bar will display [n/specified frames] because that's how many frames x265 is expecting. If you specify a yuv/y4m file (not a pipe) x265 will work out how many frames the file has based on the file size. At least that's my experience.
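
As a rough illustration of the "frame count from file size" idea, here is a sketch for a headerless raw 4:2:0 YUV file. This is not x265's actual code; the function names are hypothetical, and a real y4m file also carries a stream header and per-frame markers that this ignores.

```python
def frame_bytes_420(width, height, bit_depth):
    """Bytes in one raw 4:2:0 frame (samples above 8 bits take 2 bytes each)."""
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    luma = width * height
    # Each chroma plane is subsampled by 2 in both directions (rounded up).
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return (luma + 2 * chroma) * bytes_per_sample


def frames_in_raw_yuv(file_size, width, height, bit_depth):
    """Whole frames that fit in a headerless raw YUV file of `file_size` bytes."""
    return file_size // frame_bytes_420(width, height, bit_depth)


# A pipe has no size to inspect, so this calculation is only possible for a file.
print(frame_bytes_420(3840, 2160, 10))
```

With a pipe there is nothing to divide, which is why the reader can only report an unknown frame count unless something else supplies it.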

jauh

7th February 2022, 10:52

If the filesize bothers you, you can always raise CRF to compensate. The fact just is that CTU 64 has these strange side effects in x265. Personally I go for quality first while the x265 development has gone size first :p

I think I was being too laconic which led to a possible confusion: CTU=QG=32 eliminates the paintbrush smear with limit-tu=0 and rskip!=1, but visually, the result is indistinguishable from CTU=QG=64 with limit-tu=1 and rskip=0. The size of 32 only increases the signalling cost. ;)

asarian

7th February 2022, 10:54

It's not a bug, it's literally telling you that the pipe input doesn't have a clue about how many frames it has (how would a pipe know how many frames exist?) The progress bar will display [n/specified frames] because that's how many frames x265 is expecting. If you specify a yuv/y4m file (not a pipe) x265 will work out how many frames the file has based on the file size. At least that's my experience.

Oh, it's a bug alright. :)

The pipe knows, the way it always does: because I tell it how many frames there are (like I said); like:

VSPipe -c y4m "f:\jobs\contact.vpy" - | x265 --y4m --input - --preset medium --input-depth 8 --output-depth 10 --crf 10 --frames 215323 --output "g:\video\contact.hevc"

It's worked like that for years.

jauh

7th February 2022, 11:00

Oh, it's a bug alright. :)

The pipe knows, the way it always does: because I tell it how many frames there are (like I said); like:

VSPipe -c y4m "f:\jobs\contact.vpy" - | x265 --y4m --input - --preset medium --input-depth 8 --output-depth 10 --crf 10 --frames 215323 --output "g:\video\contact.hevc"

It's worked like that for years.

--frames tells x265 how many frames to expect to encode (so you get a % progress report), a pipe can never know how many things will be fed to it in the future, it only knows what it's been fed so far.

asarian

7th February 2022, 11:14

--frames tells x265 how many frames to expect to encode (so you get a % progress report), a pipe can never know how many things will be fed to it in the future, it only knows what it's been fed so far.

Exactly. --frames is passed to x265. Look how the yuuki build does it:

y4m [info]: 3840x2160 fps 25/1 i420p10 frames 0 - 10194 of 10195
x265 [info]: Using preset medium & tune none
raw [info]: output file: q:\video\opera.hevc
x265 [info]: HEVC encoder version 3.5+2-g2b25c9ba0+45
x265 [info]: build info [Windows][GCC 10.2.0][64 bit] Yuuki 10bit

And as long as memory serves, x265 has always shown the frames like that, whatever build (until recently, that is).

Boulder

7th February 2022, 11:43

I think I was being too laconic which led to a possible confusion: CTU=QG=32 eliminates the paintbrush smear with limit-tu=0 and rskip!=1, but visually, the result is indistinguishable from CTU=QG=64 with limit-tu=1 and rskip=0. The size of 32 only increases the signalling cost. ;)

CTU 64 will make a mess out of noisy flat backgrounds compared to CTU 32 (qg-size 32 in both cases since x265 doesn't use 64 by default for CTU 64).

Boulder

7th February 2022, 11:44

Exactly. --frames is passed to x265. Look how the yuuki build does it:

y4m [info]: 3840x2160 fps 25/1 i420p10 frames 0 - 10194 of 10195
x265 [info]: Using preset medium & tune none
raw [info]: output file: q:\video\opera.hevc
x265 [info]: HEVC encoder version 3.5+2-g2b25c9ba0+45
x265 [info]: build info [Windows][GCC 10.2.0][64 bit] Yuuki 10bit

And as long as memory serves, x265 has always shown the frames like that, whatever build (until recently, that is).

That build probably has a patch applied, in vanilla x265 it's always been like it is now as far as I can remember.

I don't know why it is so important to you since the only place where it really matters is the ETA calculation which works regardless of the piping info string.

jauh

7th February 2022, 11:49

CTU 64 will make a mess out of noisy flat backgrounds compared to CTU 32 (qg-size 32 in both cases since x265 doesn't use 64 by default for CTU 64).

It doesn't, that's what I mean by visually indistinguishable.

Boulder

7th February 2022, 12:08

It doesn't, that's what I mean by visually indistinguishable.

Do you have any samples? It would be interesting to see them as my experiences are the exact opposite, also based on recent tests.

jauh

7th February 2022, 12:38

Do you have any samples? It would be interesting to see them as my experiences are the exact opposite, also based on recent tests.

TBBT season 1 episode 2 is great with (the first 500 frames are more than enough to see that flat noisy areas are just fine):-

--crf 20 \
--ref 6 --limit-refs 0 \
--bframes 16 --weightb --b-intra --b-adapt 2 \
--lookahead-slices 1 --rc-lookahead 240 \
--min-keyint 24 --keyint 240 \
--pmode --pme \
--rect --amp --no-limit-modes \
--me star --merange 58 --subme 5 --max-merge 5 \
--analyze-src-pics --no-early-skip --rskip 0 --fades \
--tu-intra-depth 4 --tu-inter-depth 4 --limit-tu 1 \
--rdoq-level 0 --psy-rd 4 --rd-refine \
--deblock -2:-1 \
--no-cutree \
--aq-mode 2 --aq-strength 1.0 \
--no-sao \
--ctu 64 --qg-size 64 --qpstep 8 \
--opt-qp-pps --opt-ref-list-length-pps ...

Boulder

7th February 2022, 13:16

TBBT season 1 episode 2 is great with (the first 500 frames are more than enough to see that flat noisy areas are just fine):-

--crf 20 \
--ref 6 --limit-refs 0 \
--bframes 16 --weightb --b-intra --b-adapt 2 \
--lookahead-slices 1 --rc-lookahead 240 \
--min-keyint 24 --keyint 240 \
--pmode --pme \
--rect --amp --no-limit-modes \
--me star --merange 58 --subme 5 --max-merge 5 \
--analyze-src-pics --no-early-skip --rskip 0 --fades \
--tu-intra-depth 4 --tu-inter-depth 4 --limit-tu 1 \
--rdoq-level 0 --psy-rd 4 --rd-refine \
--deblock -2:-1 \
--no-cutree \
--aq-mode 2 --aq-strength 1.0 \
--no-sao \
--ctu 64 --qg-size 64 --qpstep 8 \
--opt-qp-pps --opt-ref-list-length-pps ...

I'd rather you post some samples so the rest of us don't need to dig the source from anywhere. I can myself compare on the Chernobyl sample I posted in that linked answer earlier and which I have already available.

AQ-mode 2 is already something I avoid along with a large max-merge. I tested disabling cu-tree just a while ago and it caused quite a lot of lost details in places. Of course, these are all subjective things.

jauh

7th February 2022, 13:20

I'd rather you post some samples so the rest of us don't need to dig the source from anywhere. I can myself compare on the Chernobyl sample I posted in that linked answer earlier and which I have already available.

AQ-mode 2 is already something I avoid along with a large max-merge. I tested disabling cu-tree just a while ago and it caused quite a lot of lost details in places. Of course, these are all subjective things.

What can take a 43MB APNG file? You've disabled things that I've expressly enabled on purpose and vice versa, of course your results are going to be wildly different to mine...

edit: here you go:-
edit 2: uploaded a longer clip (5s) with a scene change so that ffmpeg has to render all frames and not stop abruptly without properly resolving all references

https://ufile.io/pusv5xt0

Boulder

7th February 2022, 14:36

What can take a 43MB APNG file? You've disabled things that I've expressly enabled on purpose and vice versa, of course your results are going to be wildly different to mine...

edit: here you go:-
edit 2: uploaded a longer clip (5s) with a scene change so that ffmpeg has to render all frames and not stop abruptly without properly resolving all references

https://ufile.io/pusv5xt0

Why not just regular source and result samples?

jauh

7th February 2022, 14:44

Why not just regular source and result samples?

What do you expect to see??? It's far simpler for you to encode your problematic scene with the options that I specified and see if it makes a difference, is it not?

Boulder

7th February 2022, 15:06

Then, please provide an unprocessed source as is.

jauh

7th February 2022, 15:16

Then, please provide an unprocessed source as is.

Why? I'm not making an archive copy of the BD (I have the BD for that!), is there anything in the clip that worsens your psychovisual experience or is it that it's not got the artefacts that you expected and you now want to nitpick that the tiniest portions of grain are out of phase? But here, I've no dog in the fight:-

https://ufile.io/01rl8ik1

I'll even tell you the clip zone for the episode:- TBBT S1 E2 @

-ss 00:00:08 -t 5 -vf crop=600:200:1200:0

Boulder

7th February 2022, 15:37

Why? I'm not making an archive copy of the BD (I have the BD for that!), is there anything in the clip that worsens your psychovisual experience or is it that it's not got the artefacts that you expected and you now want to nitpick that the tiniest portions of grain are out of phase? But here, I've no dog in the fight:-

https://ufile.io/01rl8ik1

I'll even tell you the clip zone for the episode:- TBBT S1 E2 @

-ss 00:00:08 -t 5 -vf crop=600:200:1200:0

I'm not going to investigate ways of opening APNG files in Avisynth, simple as that. As I've said, I have my preferences and you probably have different ones. I am just interested in seeing what kind of results your settings produce on the source compared to mine - maybe there is something to look deeper in.

jauh

7th February 2022, 15:48

I'm not going to investigate ways of opening APNG files in Avisynth, simple as that. As I've said, I have my preferences and you probably have different ones. I am just interested in seeing what kind of results your settings produce on the source compared to mine - maybe there is something to look deeper in.

You don't need to use Avisynth for APNG, MS Edge plays them fine (drag and drop), and so would any Chromium clone, by extension, I suspect... And if you really wanted to see the difference between the two settings, there's literally nothing stopping you from encoding a scene of your choosing from your own sources and seeing the difference...

Boulder

7th February 2022, 15:55

You don't need to use Avisynth for APNG, MS Edge plays them fine (drag and drop), and so would any Chromium clone, by extension, I suspect... And if you really wanted to see the difference between the two settings, there's literally nothing stopping you from encoding a scene of your choosing from your own sources and seeing the difference...

Well, the point was to take a look at your results as they often depend on the source. And as I mentioned, I can compare things on the couple of sample clips I have ready.

Boulder

7th February 2022, 16:23

TBBT season 1 episode 2 is great with (the first 500 frames are more than enough to see that flat noisy areas are just fine):-

--crf 20 \
--ref 6 --limit-refs 0 \
--bframes 16 --weightb --b-intra --b-adapt 2 \
--lookahead-slices 1 --rc-lookahead 240 \
--min-keyint 24 --keyint 240 \
--pmode --pme \
--rect --amp --no-limit-modes \
--me star --merange 58 --subme 5 --max-merge 5 \
--analyze-src-pics --no-early-skip --rskip 0 --fades \
--tu-intra-depth 4 --tu-inter-depth 4 --limit-tu 1 \
--rdoq-level 0 --psy-rd 4 --rd-refine \
--deblock -2:-1 \
--no-cutree \
--aq-mode 2 --aq-strength 1.0 \
--no-sao \
--ctu 64 --qg-size 64 --qpstep 8 \
--opt-qp-pps --opt-ref-list-length-pps ...

Are you sure you didn't mess something up in those settings? I did a test encode using them and using my baseline, and your encode is a lot smaller and looks just really ugly in motion. I did compare the MediaInfo output to make sure I didn't misplace something, but that was not the case.

Mine: https://drive.google.com/file/d/1JvmaK9YKUGqcDdd_J2fTvGb8z63yCJox/view?usp=sharing
Yours: https://drive.google.com/file/d/1qP0-McP8aJzC6M_URd2kf_BvHVQpBwzJ/view?usp=sharing

My settings from MediaInfo: https://pastebin.com/bsqX1S4c

jauh

7th February 2022, 18:12

Are you sure you didn't mess something up in those settings? I did a test encode using them and using my baseline, and your encode is a lot smaller and looks just really ugly in motion. I did compare the MediaInfo output to make sure I didn't misplace something, but that was not the case.

Mine: https://drive.google.com/file/d/1JvmaK9YKUGqcDdd_J2fTvGb8z63yCJox/view?usp=sharing
Yours: https://drive.google.com/file/d/1qP0-McP8aJzC6M_URd2kf_BvHVQpBwzJ/view?usp=sharing

My settings from MediaInfo: https://pastebin.com/bsqX1S4c

Well, for starters you're massively(!) overcompensating on chroma QP offsets (I have no idea what effect that has on CRF decisions) whereas I'm not, you also set frame-threads=4, I do not, and claim they are my settings.

Conversely, your "default" is at CRF=18, mine's at 20, c'mon, that's not even like for like, my settings explode to similar bitrates at CRF 18, so if you're going to compare and whine at least try to compare like for like!

edit: as it happens, I have Chernobyl, so just tell me what episode and what frame range that clip is, and I'll do it on my end...

Boulder

7th February 2022, 18:50

Well, for starters you're massively(!) overcompensating on chroma QP offsets (I have no idea what effect that has on CRF decisions) whereas I'm not, you also set frame-threads=4, I do not, and claim they are my settings.

Conversely, your "default" is at CRF=18, mine's at 20, c'mon, that's not even like for like, my settings explode to similar bitrates at CRF 18, so if you're going to compare and whine at least try to compare like for like!

Chill out, even if this is the internet. The point has been comparing the results, which I did and I was only asking if the settings are correct.

Besides, you said that those setting produce a satisfying output, so naturally I try them first. They work much better with my Hot Fuzz (crisp film) and The Hobbit (all CGI) samples but there's a lot of motion there or very little noise or actual detail so it is much harder to find out the problems. With x265, the hardest parts are flat, darker coloured surfaces since the floating noise patterns start appearing quite easily, especially with a lower quality source where some might already be present.

Your settings are so different that the same CRF does not apply. In x264, there is a clear correlation; basically more effective options just reduce the filesize and keep a similar quality level, but it's not there in x265. Just to please you, I did try CRF 18 and it produced a much bigger file than CRF 20, but the defects are still there if less obvious. The difference between the base encodes probably comes mostly from the AQ mode. I did test aq-mode 2 with my settings and it fails to produce a satisfying result and the filesize is a lot smaller than with mode 1.

The chroma QP offsets do very little to the average bitrate. The difference when removing the offset is -2,3% in this case.

Frame threads from 4 to 1 affect the filesize by a whopping 0,01% in this case so it's meaningless and makes the encoding only slower.

One final problem with your settings is that they are mostly so placebo that the encoding gets dead slow, 2.45 fps vs 7.09 fps. Add any filtering to the formula and it's definitely not for everyday use.

Here are the original clips I've got:
Hot Fuzz sample: https://drive.google.com/file/d/19cIzFGAocSHYlBNJdZJ08OaQbfBIHUjm/view?usp=sharing
Chernobyl sample: https://drive.google.com/file/d/1nEa9H70d7Ekeue2omnwLDToVK7LaxZ_t/view?usp=sharing
The Hobbit sample: https://drive.google.com/file/d/1LHYBAHsKLAeHuSYu79r_uoRTM4iyjC6G/view?usp=sharing

jauh

7th February 2022, 18:59

Chill out, even if this is the internet. The point has been comparing the results, which I did and I was only asking if the settings are correct.

Besides, you said that those setting produce a satisfying output, so naturally I try them first. They work much better with my Hot Fuzz (crisp film) and The Hobbit (all CGI) samples but there's a lot of motion there or very little noise or actual detail so it is much harder to find out the problems. With x265, the hardest parts are flat, darker coloured surfaces since the floating noise patterns start appearing quite easily, especially with a lower quality source where some might already be present.

Your settings are so different that the same CRF does not apply. In x264, there is a clear correlation; basically more effective options just reduce the filesize and keep a similar quality level, but it's not there in x265. Just to please you, I did try CRF 18 and it produced a much bigger file than CRF 20, but the defects are still there if less obvious. The difference between the base encodes probably comes mostly from the AQ mode. I did test aq-mode 2 with my settings and it fails to produce a satisfying result and the filesize is a lot smaller than with mode 1.

The chroma QP offsets do very little to the average bitrate. The difference when removing the offset is -2,3% in this case.

Frame threads from 4 to 1 affect the filesize by a whopping 0,01% in this case so it's meaningless and makes the encoding only slower.

One final problem with your settings is that they are mostly so placebo that the encoding gets dead slow, 2.45 fps vs 7.09 fps. Add any filtering to the formula and it's definitely not for everyday use.

Here are the original clips I've got:
Hot Fuzz sample: https://drive.google.com/file/d/19cIzFGAocSHYlBNJdZJ08OaQbfBIHUjm/view?usp=sharing
Chernobyl sample: https://drive.google.com/file/d/1nEa9H70d7Ekeue2omnwLDToVK7LaxZ_t/view?usp=sharing
The Hobbit sample: https://drive.google.com/file/d/1LHYBAHsKLAeHuSYu79r_uoRTM4iyjC6G/view?usp=sharing

Those settings are completely inappropriate for crisp/CGI films, I'd definitely not use them for that!

But given that the episodes of TBBT on my settings are churning out at about 1MBps (8000kbps) and grain/noise is well preserved, I don't know what you're doing on your end but the SEI info in cherno_jauh.hevc is that it's not what I'm doing, which is why I said give me the episode number and frame range and I'll encode it on my end... <-- never mind, found it, I'll encode the scene (together with scene cuts) next.

Edit: are you also saying that you're filtering prior to encoding???

Boulder

7th February 2022, 19:58

Edit: are you also saying that you're filtering prior to encoding???

Not in these tests, but normally yes. Ultra light denoising and usually downscaling to 720p (HD) or 1080p/1440p (UHD).

benwaggoner

8th February 2022, 00:31

So, I gave CTU=32 a go, and as you hypothesised, no paintbrush smear, but:-

So it does help, but at a cost... In the circumstances, I think CTU=QG=64,rskip=0,limit-tu=1 is the optimal variant...
You'll want to increase --tu-intra-depth and --tu-inter-depth by 1 for --ctu 64 versus --ctu 32, so you can have the same minimum TU size.
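
A back-of-the-envelope sketch of why the depth needs to go up by one. This assumes a simplified model in which the TU tree starts at a CU as large as the CTU, depth 1 means no extra split, and each further depth step halves the TU size down to a 4x4 floor. It ignores any cap on the maximum TU size and is not taken from the x265 source; the function name is hypothetical.

```python
def smallest_tu(cu_size, tu_depth, min_tu=4):
    """Smallest TU edge reachable from a CU of `cu_size` under the simplified model.

    tu_depth follows the 1..4 convention of --tu-intra-depth/--tu-inter-depth:
    depth 1 allows no extra split, each additional step halves the TU once.
    """
    return max(min_tu, cu_size >> (tu_depth - 1))


for ctu, depth in [(64, 4), (32, 4), (32, 3), (64, 3)]:
    print(f"ctu={ctu} tu-depth={depth} -> smallest TU {smallest_tu(ctu, depth)}")
```

Under that model, --ctu 64 with depth 4 and --ctu 32 with depth 3 bottom out at the same TU size, which is the like-for-like comparison being suggested.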

--qg-size 64 seems too big to be actually optimal, unless you're going for very little adaptive quant. I've never seen 64/64 be better than 32/32 , and 32/32 can certainly be better than 64/64.

If you're doing --csv-log-level 2, you'll get per frame breakdowns of block types. Comparing different --limit-tu and --tu-*-depth modes can yield some interesting results in that data, which correlate with visible results in even more interesting ways.

jauh

8th February 2022, 04:03

Chill out, even if this is the internet. The point has been comparing the results, which I did and I was only asking if the settings are correct.

<snip>

So I literally have no idea how you used my settings and got a clip at around 2.7MB whereas when I encoded the same clip (well, you didn't give me the start frame and duration so I had to guess, I also noticed your encode is at 25fps whereas my BD source is 24000/1001 (so actually fewer frames than you would've needed), my encode starts from frame 52600 and lasts 600 frames) from BD source I got a much bigger (~13.5MB) file...

https://ufile.io/sjwmnuqu

The slight wobble of the two lines of the panel on the door (top-centre frame, from 10s till 17s) looks like a bug in x265 that my settings hit and yours don't, because I can't think of a reason why things would move in a frame during coding when they actually haven't moved at all in the master...

As for grinding to a halt, you can remove --pmode --pme --rect and --amp if you don't like the speed, I don't mind them as otherwise my cores sit there twiddling their... bits...

@benwaggoner
just for completeness I ran an encode with --qg-size 32 as well, that resulted in a slightly lower signalling cost, but still with the wobble:

https://ufile.io/zmdq52d7

Boulder

8th February 2022, 06:40

So I literally have no idea how you used my settings and got a clip at around 2.7MB whereas when I encoded the same clip (well, you didn't give me the start frame and duration so I had to guess, I also noticed your encode is at 25fps whereas my BD source is 24000/1001 (so actually fewer frames than you would've needed), my encode starts from frame 52600 and lasts 600 frames) from BD source I got a much bigger (~13.5MB) file...
Different releases so there might be a big difference in bitrate. Naturally this can affect the source quality quite a lot as well and present problems to the encoder.

The slight wobble of the two lines of the panel on the door (top-centre frame, from 10s till 17s) looks like a bug in x265 that my settings hit and yours don't, because I can't think of a reason why things would move in a frame during coding when they actually haven't moved at all in the master...

That's a side effect of the "floating noise" problem. If you look at my original sample clip, the lines already flicker slightly in it and encoding just amplifies it a lot.

Aq-mode > 1 is very prone to make it happen. I think the low-frequency noise is a hard case for those auto variance methods and the encoder just quantizes those CTUs to death. If I use my settings but change to aq-mode 2, the bitrate drops by ~30% at the same CRF so they are not very comparable. Then, in the Hobbit and Hot Fuzz clips, it overshoots the bitrate (+29% and +24%) without bringing anything extra to the overall quality.

I still don't understand why that mode was made default at the last big changes they made. They didn't even do anything to the actual method, just changed the default value.

jauh

8th February 2022, 13:03

<snip>

That's a side effect of the "floating noise" problem. If you look at my original sample clip, the lines already flicker slightly in it and encoding just amplifies it a lot.

<snip>

Try this one, see what you think:

https://ufile.io/l5o1hv52

Boulder

8th February 2022, 17:52

benwaggoner

9th February 2022, 02:11

Increasing --psy-rd and --psy-rdoq can help with grain, as they preserve the energy and the general texture. That can help with visible banding, grain swirling, etcetera.

I suggest always testing with 2-pass VBR with grain tuning, as lots of the knobs to tweak can change both file size and CRF requirements significantly.

jauh

9th February 2022, 13:44

Definitely looks better. Your source probably has higher frequency grain, at least it looks like that so it won't start floating nearly as bad as the low freq type. Still the aq-mode 2 induced static areas are there, but they are not as apparent. I don't know how much the psychovisual options could help with that though, I've never really tested switching psy-rdoq off like you have and using only psy-rd. I borrowed my psy settings from littlepox's "tune film" thread.

It's the same source as produced noticeable wobble with other settings. The switch that made the difference was --aq-motion.

Increasing --psy-rd and --psy-rdoq can help with grain, as they preserve the energy and the general texture. That can help with visible banding, grain swirling, etcetera.

I suggest always testing with 2-pass VBR with grain tuning, as lots of the knobs to tweak can change both file size and CRF requirements significantly.

In the x264 days, I used to do a 1st pass CRF then a second pass using the ABR from the CRF pass, I tried the same with x265's 3.5+20 version, but every single encode I tried, the video starts to frame skip at non-deterministic timecodes... Is this a known issue?

Boulder

9th February 2022, 16:42

In the x264 days, I used to do a 1st pass CRF then a second pass using the ABR from the CRF pass, I tried the same with x265's 3.5+20 version, but every single encode I tried, the video starts to frame skip at non-deterministic timecodes... Is this a known issue?

I've done this and it did work, albeit not with a recent build. It definitely should work since the stats file is the only thing used by the following passes and the rest comes from your source anyway.

benwaggoner

9th February 2022, 23:23

It's the same source as produced noticeable wobble with other settings. The switch that made the difference was --aq-motion.
Yeah, --aq-motion was an experiment abandoned years ago. It was introduced back five years ago in x265 2.3, and I don't know that it ever got any more engineering work. I'd anticipate some weird and suboptimal behaviors with changes and new parameters introduced since. And it started weird and suboptimal.

--aq-motion has the Experimental Feature warning for a reason! It never got close to something that could be used as a default setting. --aq-mode 4 gives better and much more reliable improvements.

benwaggoner

9th February 2022, 23:24

In the x264 days, I used to do a 1st pass CRF then a second pass using the ABR from the CRF pass, I tried the same with x265's 3.5+20 version, but every single encode I tried, the video starts to frame skip at non-deterministic timecodes... Is this a known issue?
I don't know what the issue is, but I'd recommend against it. Reusing first pass data is okay if only bitrate is changing, but things can diverge when other parameters are being played with.

jauh

10th February 2022, 19:09

Yeah, --aq-motion was an experiment abandoned years ago. It was introduced back five years ago in x265 2.3, and I don't know that it ever got any more engineering work. I'd anticipate some weird and suboptimal behaviors with changes and new parameters introduced since. And it started weird and suboptimal.

--aq-motion has the Experimental Feature warning for a reason! It never got close to something that could be used as a default setting. --aq-mode 4 gives better and much more reliable improvements.

That's the thing: --aq-motion is the thing that stopped the wobble!

I don't know what the issue is, but I'd recommend against it. Reusing first pass data is okay if only bitrate is changing, but things can diverge when other parameters are being played with.

I feared someone might say something along those lines.

I'm trialling a new strategy: avoiding psy altogether and going for plain CQP.

jauh

13th February 2022, 13:41

Does x265 have a calculation error in PSNR? For example in the Y-plane, encoder.cpp has (just grepping through source; similar for Cr and Cb planes):

int maxvalY = 255 << (X265_DEPTH - 8);
...
double refValueY = (double)maxvalY * maxvalY * size;
...
psnrY = (ssdY ? 10.0 * log10(refValueY / (double)ssdY) : 99.99);

except that in material with Bt.709 limited range, none of the planes can ever swing the full range of byte values, so does MAX_I not end up being off by quite a bit, esp. in the Y plane?
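
For anyone who wants to play with the numbers, here is a Python transcription of the three quoted lines. It is only a sketch of the formula as quoted, not the surrounding x265 code; `bit_depth` stands in for X265_DEPTH and the function name is made up.

```python
import math


def psnr_from_ssd(ssd, num_samples, bit_depth):
    """PSNR in dB from a sum of squared differences, following the quoted formula."""
    maxval = 255 << (bit_depth - 8)  # 255 for 8-bit, 1020 for 10-bit
    if ssd == 0:
        return 99.99  # the quoted code reports 99.99 for identical planes
    ref_value = float(maxval) * maxval * num_samples
    return 10.0 * math.log10(ref_value / ssd)


# Example: 100 samples with a mean squared error of 5 (ssd = 500).
print(round(psnr_from_ssd(500, 100, 8), 4))
print(round(psnr_from_ssd(500, 100, 10), 4))
```

Note that the peak here is tied to the bit depth only; nothing in the quoted lines looks at whether the content is limited or full range.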

rwill

13th February 2022, 15:50

Does x265 have a calculation error in PSNR?

No. It does not.

jauh

13th February 2022, 16:14

No. It does not.

How so, when max effective value (nominal peak - black) for the Y plane, for example, is 219 for 8bit (cf. 255) and 876 for 10bit (cf. 1024)?

rwill

13th February 2022, 21:56

How so, when max effective value (nominal peak - black) for the Y plane, for example, is 219 for 8bit (cf. 255) and 876 for 10bit (cf. 1024)?

That's not how it works you know...

The 255 << (BITDEPTH-8) comes from the reference software.
One could argue that it should be (1<<BITDEPTH)-1 but what you are proposing is .. I don't know what to write.. baseless ?

jauh

14th February 2022, 03:08

That's not how it works you know...

The 255 << (BITDEPTH-8) comes from the reference software.
One could argue that it should be (1<<BITDEPTH)-1 but what you are proposing is .. I don't know what to write.. baseless ?

Why is it baseless, you don't have (1<<BITDEPTH)-1 possible values in the limited range, you have far fewer, so the magnitude of error relative to the total range is greater, i. e. 20log10(255)-10log10(5) != 20log10(219)-10log10(5)?

nevcairiel

14th February 2022, 08:48

jauh

14th February 2022, 14:32

Pixel values can still use the full range in an image, only limited by the actual physical bitdepth of 8 or 10-bit. If you signal the image to be "limited range", then these values are called BTB (Blacker than Black) or WTW (Whiter than White)
That's why you always use the full 8-bit range for PSNR calculations, in every reference you can find for PSNR.

Even if you ignore that, the difference between a 255 range and a 219 range is barely even a third of a bit.

That explanation makes no sense: if you signal outside the permitted range the receiver clips the signal and neither BTB nor WTW convey anything meaningful (think speakers clip sound when DAC generates a wave outside speakers' range) so why would meaningless data contribute to SNR computation, let alone to the bettering of the metric, when it's actually axiomatically noise. Think about it: if you have a perfect sine wave between ±1 and then add noise to the wave, the peaks will overshoot the ±1 peaks of the sine wave, arguing that the overshooting conveys meaning is curious.

The argument that 255 vs 219 is barely even a third of a bit is neither here nor there: the peak of the signal affects PSNR. Let's take the peak at 255 and MSE at 5, the PSNR is 20log10(255)-10log10(5), or 41.1411 dB, whereas if you take peak at 219, the PSNR is 20log10(219)-10log10(5), or 39.8192 dB.

Just because some code does it, even if the majority does it, doesn't mean it's the right way to do it (the bandwagon fallacy); it's certainly the right way to compute PSNR for the full swing of 8 bits, but there's nothing to say the code wasn't simply lifted from an image compression comparison code without considering the peaks of the signal that is being measured. Case in point: ffmpeg still creates mp4 files with 'isom' as major_brand, for example, when 14496-12 has for a long time (at least six years) said "[Annex E] brands should not be used as the major brand..."
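
The two figures quoted above are easy to check with a few lines of Python. This is just the textbook PSNR formula with the peak as a parameter; the function name is made up and none of it comes from x265.

```python
import math


def psnr_db(peak, mse):
    """PSNR in dB for a given signal peak and mean squared error."""
    return 20.0 * math.log10(peak) - 10.0 * math.log10(mse)


full = psnr_db(255, 5)     # peak taken as the full 8-bit swing
limited = psnr_db(219, 5)  # peak taken as the limited-range luma swing
print(f"peak 255: {full:.4f} dB")
print(f"peak 219: {limited:.4f} dB")

# The gap depends only on the ratio of the two peaks, not on the MSE.
print(f"gap: {full - limited:.4f} dB")
print(f"gap at MSE 50: {psnr_db(255, 50) - psnr_db(219, 50):.4f} dB")
```

Whichever peak is used, the choice shifts every score by the same constant, 20log10(255/219), or about 1.32 dB, so it changes the absolute numbers but not the ranking of two encodes of the same source.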