Does scenecut parameter work as intended?


hellgauss
9th June 2024, 15:30
I have a question about the behavior of the scenecut parameter. According to this thread ( https://forum.doom9.org/showthread.php?t=121116 ), which I find a little confusing, it seems that whether a frame becomes K/I is decided by a formula like

IF (some calculation which does not depend explicitly on scenecut) < scenecut THEN choose K/I frame
ELSE choose P/B

This means, except perhaps in some very unusual cases, that a lower value of scenecut should decrease the total number of K/I frames, since the above condition gets stricter. (Note: I assume that the dependence of the l.h.s. on scenecut via the "distance from previous keyframe" is negligible.)
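The expectation above can be illustrated with a minimal Python sketch. It assumes the simplified rule exactly as written in this post (a per-frame value that does not depend on scenecut, compared against scenecut). The `lhs_values` list is made up purely for illustration and is not measured data.

```python
def count_keyframes(lhs_values, scenecut):
    """Count frames that pass the simplified rule `lhs < scenecut`."""
    return sum(1 for lhs in lhs_values if lhs < scenecut)

# Hypothetical per-frame values (illustration only, not from a real encode).
lhs_values = [3, 12, 18, 25, 41, 55, 70, 95]

counts = [count_keyframes(lhs_values, sc) for sc in (5, 20, 40, 60, 100)]
print(counts)

# Under this reading, raising scenecut can never lower the count.
assert counts == sorted(counts)
```

Whatever values are plugged in, the count can only stay equal or grow as scenecut grows, which is the behavior this post expects from the encoder.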

I performed some tests on a 1420-second 1080p anime (high bitrate source, 34049 frames), with me=hex and subme=8, obtaining the following
(SCENECUT - Number of K/I Frames - Bitrate in kbps)


5 - 321 - 4210.93
10 - 326 - 4215.83
15 - 326 - 4218.32
20 - 320 - 4218.99
40 - 294 - 4219.84
60 - 283 - 4225.26
100 - 244 - 4264.55
150 - 192 - 4286.93


Same test and source as above but with me=tesa subme=10:


20 - 320 - 3819.79
40 - 298 - 3820.87
60 - 283 - 3825.66


The results do not seem to match the above formula: the number of K/I frames mostly decreases as scenecut increases, even though the bitrate increases (except for scenecut=5/10, which is perhaps too low).
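The trend can be read off the first table (me=hex, subme=8) with a short Python sketch. The numbers are copied from that table; the helper name `direction_changes` is only an illustrative choice.

```python
# (scenecut, number of K/I frames, bitrate in kbps) from the me=hex, subme=8 test.
results = [
    (5, 321, 4210.93), (10, 326, 4215.83), (15, 326, 4218.32), (20, 320, 4218.99),
    (40, 294, 4219.84), (60, 283, 4225.26), (100, 244, 4264.55), (150, 192, 4286.93),
]

def direction_changes(values):
    """Return +1, -1 or 0 for each consecutive pair: up, down or unchanged."""
    return [(b > a) - (b < a) for a, b in zip(values, values[1:])]

iframe_steps = direction_changes([n for _, n, _ in results])
bitrate_steps = direction_changes([kbps for _, _, kbps in results])
print("I-frame steps:", iframe_steps)
print("bitrate steps:", bitrate_steps)
```

The I-frame count rises only in the first step and falls from scenecut=15 onwards, while the bitrate rises at every step.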

For all encodes I used ffmpeg 7.0.1 (AnimMouse stable build from GitHub) with this command line:


ffmpeg -y -bitexact -ss 00:00:01 -t 1420.1 -i "%tmpdir%\t_%name%_video0.mkv" -vf "removegrain=0:1:1" -sws_flags accurate_rnd ^
-c:v libx264 -level 40 -preset veryslow -me_method ** -subq ** -me_range 56 ^
-crf 16.1 -tune animation -psy-rd 0.45:0 -aq-mode 3 -aq-strength 0.65 -threads 12 -chromaoffset 1 -x264-params scenecut=** ^
-bitexact "%tmpdir%\t_%pref%%name%%suff%.mkv" 2>&1 | tee "%logdir%\full-log_%name%.txt"


According to the log data and the SEI codec string in the video, the scenecut parameter seems to be set correctly. The frame counts in the log are confirmed by avinaptic, although it reports a different average QP.

Is the formula right? Or am I doing something wrong in my test?

rwill
9th June 2024, 16:05
The scenecut parameter works as intended. Whether a frame becomes a K frame or not depends on what you shortened to "(some calculation which does not depend explicitly on scenecut)".

Please look at the pseudo code from akupenguin's original post again.

rwill
9th June 2024, 17:41
Ah sorry, better look here:

https://code.videolan.org/videolan/x264/-/blob/master/encoder/slicetype.c?ref_type=heads

See the function scenecut_internal, where it's

res = pcost >= (1.0 - f_bias) * icost;

and f_bias is somewhat like the i_scenecut_threshold param.

How do you count the K/I in your tests? I just did some tests here and higher scenecut parameter values result in higher I picture counts.

hellgauss
9th June 2024, 19:12
Thanks for your replies.

The number of I frames (which also includes K frames) is reported in the stderr of ffmpeg by libx264, and is confirmed by the avinaptic report.

I'll do more tests with a different source and a different x264 build. The build in this test is "core 164 r3108M 31e19f9"

Final avinaptic report of encodes:
https://filetransfer.io/data-package/sw1RMCsh#link

rwill
9th June 2024, 19:53
Does ffmpeg support the x264 setting "--log-level debug" ? This should print a line like

x264 [debug]: scene cut at 500 Icost:2315846 Pcost:2167955 ratio:0.0639 bias:0.2133 gop:78 (imb:7496 pmb:292)

per detected scenecut, among other things, and gives some insight into why it's one...

*edit*
It's "-loglevel debug" for ffmpeg that also prints the x264 debug info
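A minimal Python sketch for pulling the numbers out of such a line. The regular expression is written against the single example line quoted in this post, so other x264 builds may format it differently. Reading `ratio` as 1 - Pcost/Icost is an assumption that happens to match this one line.

```python
import re

# Debug line quoted in this post, copied verbatim.
LINE = ("x264 [debug]: scene cut at 500 Icost:2315846 Pcost:2167955 "
        "ratio:0.0639 bias:0.2133 gop:78 (imb:7496 pmb:292)")

PATTERN = re.compile(
    r"scene cut at (?P<frame>\d+) Icost:(?P<icost>\d+) Pcost:(?P<pcost>\d+) "
    r"ratio:(?P<ratio>[\d.]+) bias:(?P<bias>[\d.]+) gop:(?P<gop>\d+)"
)

def parse_scenecut_line(line):
    """Return the numeric fields of one 'scene cut at' line, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    f = m.groupdict()
    return {
        "frame": int(f["frame"]), "icost": int(f["icost"]), "pcost": int(f["pcost"]),
        "ratio": float(f["ratio"]), "bias": float(f["bias"]), "gop": int(f["gop"]),
    }

rec = parse_scenecut_line(LINE)
computed_ratio = 1 - rec["pcost"] / rec["icost"]
print(rec["frame"], rec["gop"], round(computed_ratio, 4), rec["ratio"] < rec["bias"])
```

For this line the recomputed ratio agrees with the printed one, and the ratio is below the bias, which is the same statement as pcost >= (1.0 - f_bias) * icost.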

hellgauss
14th June 2024, 09:33
I tried the official ffmpeg 7.0.1 (gyan full build) with libx264-164-r3191-4613ac3 as well as the latest x264.exe from videolan. The number and type of frames do not change, although there is a very small difference in bitrate with respect to r3108.

Then I tried another, publicly available, source (1280x720 CG animation, 14315 frames): http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4

The Windows 10 script used is:

for %%a in (10 15 18 20 35 40 45 80 60 100 150 5 666) do (

x264-r3191-4613ac3 "BigBuckBunny.mp4" --log-level debug --no-progress --level 40 --preset veryslow --me hex --subme 8 --merange 56 --crf 16.5 ^
--tune animation --threads 12 --chroma-qp-offset 1 --scenecut %%a -o "output\BBB_sc%%a.mkv" 2>&1 | tee "output\full-log_BBB_sc%%a.txt"

findstr /c:"scene cut at" "output\full-log_BBB_sc%%a.txt" >"output\log_BBB_x264-3191_sc%%a.txt"

avinaptic2-cli --drf --show-progress --out=output\avinaptic_BBB_sc%%a.txt output\BBB_sc%%a.mkv

)

pause

For each value of scenecut the script produces an output mkv, a full log, a short scenecut log and an avinaptic report. Log outputs are available here: https://www.mediafire.com/file/meh9u05y7ie8m96/output.zip/file

Results:
(SCENECUT - Number of K/I Frames - Bitrate in kbps)

5 - 119 - 3043.75
10 - 138 - 3044.44
15 - 137 - 3045.05
18 - 137 - 3044.93
20 - 137 - 3046.16
35 - 136 - 3046.65
40 - 136 - 3046.65
45 - 136 - 3048.72
60 - 132 - 3051.81
80 - 124 - 3056.16
100 - 115 - 3073.80
150 - 100 - 3089.49
666 - 58 - 3100.40

DISCUSSION

- The results are qualitatively similar to my previous test: the number of K/I frames DECREASES as scenecut increases, while the bitrate increases (with a few exceptions).

- If scenecut is high, a lot of scenecuts are triggered at gop=249, i.e. key_max - 1.

- At a scenecut, Pcost is never greater than Icost. With standard values of scenecut, Pcost is usually about 90-99% of Icost.

CONCLUSION

Either there is a bug in the handling of the scenecut parameter or I am doing something wrong. I'll perform one last test similar to this one but with --threads 1.

benwaggoner
14th June 2024, 17:09
hellgauss wrote:
- The results are qualitatively similar to my previous test: the number of K/I frames DECREASES as scenecut increases, while the bitrate increases (with a few exceptions).

- If scenecut is high, a lot of scenecuts are triggered at gop=249, i.e. key_max - 1.

- At a scenecut, Pcost is never greater than Icost. With standard values of scenecut, Pcost is usually about 90-99% of Icost.

That is all as I would expect. As scenecut goes up, the threshold to trigger a scenecut becomes higher, and thus fewer frames qualify for a "natural" scenecut. When a natural scenecut hasn't happened within keyint, a keyframe is automatically inserted at that interval (frame 249 by default).

Pcost would never be higher than Icost, as a P-frame can be made of intra blocks as well as predicted ones, and there's a little extra overhead from being an IDR frame. An IDR needs to contain all the stream information needed to start decoding from that point.

Have you tried encoding with 2-pass? That basically provides a full-file lookahead, and thus gives as good an automatic keyframe placement as x264 can give. If you need to use 1-pass, increasing --rc-lookahead may help as well (I don't recall if it includes frame type decisions in x264). If you've got the RAM for it, --rc-lookahead can be as high as --keyint.

hellgauss
14th June 2024, 18:39
benwaggoner wrote:
That is all as I would expect. As scenecut goes up, the threshold to trigger a scenecut becomes higher, and thus fewer frames qualify for a "natural" scenecut.

This is indeed what I would expect from my test results. However, it contradicts the following four points:

1)
My post #1, which takes the "<" sign from akupenguin's formula:
IF (some calculation which does not depend explicitly on scenecut) < scenecut THEN choose K/I frame

A higher scenecut should make the inequality easier to satisfy, giving a higher probability of a K/I frame. Note that in akupenguin's formula the P-frame size appears with a "-" sign on the l.h.s.:
(1 - (bit size of P-frame) / (bit size of I-frame)) < (scenecut / 100) * (distance from previous keyframe) / keyint
---------------------
2)
rwill's results, which contradict mine (I think that at this point it is better to use a publicly available source to compare results):
"I just did some tests here and higher scenecut parameter values result in higher I picture counts"
------------------------------
3)
MeGUI wiki ( https://en.wikibooks.org/w/index.php?title=MeGUI/x264_Settings&oldid=2201086 )
Higher values of scenecut increase the number of scenecuts detected.
---------------------------------
4)
Source code linked by rwill
{
    f_bias = f_thresh_min
             + ( f_thresh_max - f_thresh_min )
             * ( i_gop_size - h->param.i_keyint_min )
             / ( h->param.i_keyint_max - h->param.i_keyint_min );
}

res = pcost >= (1.0 - f_bias) * icost;
if( res && real_scenecut ) etc...

Note that f_bias should increase with the scenecut parameter and appears with a "-" sign on the r.h.s. of the inequality. Of course, to get a clear picture the whole code should be analyzed. I'm a hobbyist programmer (better: a script writer), so I'm not able to do that.
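The excerpt in point 4 can be tried out with a small Python port. This is only a sketch under stated assumptions: the excerpt does not show how f_thresh_max and f_thresh_min are derived, so the code assumes f_thresh_max = scenecut / 100 and f_thresh_min = f_thresh_max / 4. It also assumes keyint_min=25 and keyint_max=250, and the cost values are hypothetical.

```python
def f_bias(scenecut, gop_size, keyint_min=25, keyint_max=250):
    """Port of the quoted branch (meant for keyint_min < gop_size <= keyint_max).

    Assumptions not shown in the quoted excerpt:
    f_thresh_max = scenecut / 100 and f_thresh_min = f_thresh_max / 4.
    """
    f_thresh_max = scenecut / 100.0
    f_thresh_min = f_thresh_max * 0.25
    return (f_thresh_min
            + (f_thresh_max - f_thresh_min)
            * (gop_size - keyint_min)
            / (keyint_max - keyint_min))

def is_scenecut(pcost, icost, bias):
    """The quoted condition: res = pcost >= (1.0 - f_bias) * icost."""
    return pcost >= (1.0 - bias) * icost

# Hypothetical costs: the P-frame costs 85% of the I-frame.
icost, pcost = 1000, 850
for sc in (20, 40, 80):
    bias = f_bias(sc, gop_size=100)
    print(sc, round(bias, 4), is_scenecut(pcost, icost, bias))
```

Under these assumptions the bias grows with scenecut, so this single-frame test on its own becomes easier to pass as scenecut goes up. The sketch covers only this one condition and none of the surrounding logic in the linked file.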

-------------------

I think that most scenecuts at GOP: 249 are a bug. If the scenecut is not triggered, it is not reported in the log: there are more I frames in the final file than scenecuts detected. Probably those are scenes longer than key_max, so that a K frame is automatically inserted.

I'm still performing the threads=1 test. I will also try a higher rc-lookahead. Thank you for your suggestion.

hellgauss
3rd July 2024, 23:39
I tried several other options, such as a high rc-lookahead, threads=1, and an old x264 build, obtaining similar results.

Then I tried another approach: a systematic test using faster options, and I found what causes the weird results.

A higher number of b-frames in combination with b-adapt=2 causes the unwanted behavior of the scenecut parameter. More precisely, the only way to make it work as intended seems to be to disable b-frames.

I performed the following tests, varying number of bframes and scenecut parameter. Source: http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4

CPU setup:
using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
Number of threads: 12
Batch file (Windows 10):
for %%b in (0 1 2 3 4 5 6 8 10) do (

for %%a in (0 5 10 15 20 30 40 60 80 100 150 666) do (

x264-r3191-4613ac3 "BigBuckBunny.mp4" --log-level debug --no-progress --b-adapt 2 --bframes %%b --scenecut %%a -o "output_bf\BBB_bf%%b_sc%%a.mkv" 2>&1 | tee "output_bf\full-log_BBB_bf%%b_sc%%a.txt"

findstr /c:"scene cut at" "output_bf\full-log_BBB_bf%%b_sc%%a.txt" >"output_bf\log_BBB_bf%%b_sc%%a.txt"

avinaptic2-cli --drf --show-progress --out=output_bf\avinaptic_BBB_bf%%b_sc%%a.txt output_bf\BBB_bf%%b_sc%%a.mkv
)
)

Similar tests have been performed with --b-adapt 0 and without --b-adapt (i.e. b-adapt = 1 as default). For bframes=0 the b-adapt parameter is not being set, as expected. The number of I frames, as reported by avinaptic, is given in the following tables and graphs. The first column is the scenecut parameter (X-axis); the other columns are the total number of I frames for each bframes value tested (Y-axis).

b-adapt=0
sc    bf=0  bf=1  bf=2  bf=3  bf=4  bf=5  bf=6  bf=8 bf=10
0       58    58    58    58    58    58    58    58    58
5      126   123   122   124   124   126   124   127   126
10     147   143   142   146   146   146   146   146   145
15     151   144   144   147   148   150   151   152   151
20     157   150   151   154   153   155   156   156   155
30     161   151   153   155   156   158   158   159   159
40     166   152   150   156   156   158   158   159   159
60     173   152   151   157   160   164   161   163   162
80     192   163   160   166   169   172   170   171   170
100    216   174   171   177   178   182   180   184   179
150    278   209   215   215   216   218   215   218   217
666   7624    67    67    67    67    67    67    67    67
------------------------------------
b-adapt=1 (default for normal preset)
sc    bf=0  bf=1  bf=2  bf=3  bf=4  bf=5  bf=6  bf=8 bf=10
0       58    58    58    58    58    58    58    58    58
5      126   123   123   124   124   124   124   124   124
10     147   142   143   143   143   143   143   143   143
15     151   145   145   145   145   145   145   145   145
20     157   151   151   151   151   151   151   151   151
30     161   152   152   152   152   152   152   152   152
40     166   152   152   152   152   152   152   152   152
60     173   154   154   154   155   155   155   155   155
80     192   164   163   164   164   164   164   164   164
100    216   176   175   175   175   175   175   175   175
150    278   216   216   216   216   216   216   216   216
666   7624    67    67    67    67    67    67    67    67
------------------------------------
b-adapt=2
sc    bf=0  bf=1  bf=2  bf=3  bf=4  bf=5  bf=6  bf=8 bf=10
0       58    58    58    58    58    58    58    58    58
5      126   122   121   120   121   121   120   119   119
10     147   142   141   141   140   140   140   141   138
15     151   145   143   144   140   140   140   141   137
20     157   151   146   145   143   142   140   141   137
30     161   152   148   147   144   141   142   140   137
40     166   152   149   144   141   140   141   140   136
60     173   154   152   145   142   141   140   136   132
80     192   164   154   145   143   142   138   132   124
100    216   175   158   150   143   136   130   122   115
150    278   216   152   137   124   122   117   104   100
666   7624    67    65    64    64    61    58    58    58

Graph (b-adapt=0): https://thumbs4.imagebam.com/d0/2c/6c/MEUGOIB_t.jpg (https://www.imagebam.com/view/MEUGOIB)
Graph (b-adapt=1): https://thumbs4.imagebam.com/80/f8/5b/MEUGOIC_t.jpg (https://www.imagebam.com/view/MEUGOIC)
Graph (b-adapt=2): https://thumbs4.imagebam.com/c4/33/81/MEUGOID_t.jpg (https://www.imagebam.com/view/MEUGOID)

REMARKS
- The bframes=0 curve is reported in all graphs for reference.
- Assuming that the scenecut parameter represents a percentage and that reasonable values are in the 20-80 range (the default of 40, divided or multiplied by 2), the only way to make a higher value increase the number of I-frames is to set bframes=0. Most curves are more or less constant in that range. For a high number of b-frames with b-adapt=2, which is the standard for slower presets and the animation profile, the correlation is inverse (this could also explain the different results obtained by rwill).
- All tests result in fewer I-frames than the "correct" (in my opinion) behavior for bf=0.
- The 7624 result for scenecut=666, used for test purposes, shows that the algorithm is very different without b-frames.
- For b-adapt=1 the results are nearly independent of bframes. For b-adapt=0 the qualitative behaviour is always the same, although the numbers are slightly different.
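A side check on the constant 58 in every scenecut=0 row: it is what one would get if keyframes came only from the maximum GOP length. The sketch below assumes keyint=250 (not set explicitly in the batch file, so this is an assumption about the default) and uses the 14315 frames of this source.

```python
import math

def forced_keyframes(n_frames, keyint):
    """Keyframes when only the keyint limit triggers: frames 0, keyint, 2*keyint, ..."""
    return math.ceil(n_frames / keyint)

print(forced_keyframes(14315, 250))
```

With these inputs the result is 58 (frames 0, 250, ..., 14250), matching the scenecut=0 rows, and also the bf=6, 8 and 10 entries for scenecut=666 with b-adapt=2.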

CONCLUSION
This is a possible bug that should be investigated.

MasterNobody
8th July 2024, 21:45
There are no bugs here with adequate values of scenecut, in the range from 0 to 100 (or maybe even less than 100). But when you, with your inadequate values, actually force the scenecut to trigger on almost every frame, then the flash protection simply starts to work: https://code.videolan.org/videolan/x264/-/blob/master/encoder/slicetype.c?ref_type=heads#L1436

hellgauss
9th July 2024, 07:31
I do not think it is the flash protection. How is it supposed to be related to the number of b-frames and to b-adapt?

I also think scenecut=20-80 is quite "adequate". Values outside that range are only for testing.

Why, in the 20-80 range, does the number of I-frames depend on the number of b-frames and on b-adapt? Why, for b-adapt=2, is the behaviour so QUALITATIVELY different when varying bf (slightly increasing for bf=1 vs. decreasing for bf=10)? Why is the behaviour so different for bf=0?

I started to investigate the scenecut parameter to mitigate the banding effect in anime. I use the "veryslow" preset together with anime tuning, so for me bf=10 and b-adapt=2 is "standard" (gray line in the third graph).
The mechanism is this:
Scenecut missed --> I frame coded as P frame instead --> Higher QP frame --> Banding --> Banding is inherited by next frames

I found a partial workaround with b-adapt=1: results seem better, and I can also tune scenecut somewhat safely. However, I would like to get the same curve as bf=0, but with bf=10.

MasterNobody
11th July 2024, 00:38
hellgauss wrote:
I do not think it is the flash protection. How is it supposed to be related to the number of b-frames and to b-adapt?

Did you look at the code in my link? There you can immediately see that with b-adapt 2 and with a large number of B-frames, flash protection becomes more aggressive, since it can look further ahead.
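A toy Python model of this point, not x264's actual code. It assumes, as one reading of the linked function, that the look-ahead covers `bframes` extra frames with b-adapt 2 and one extra frame otherwise, and that it is skipped entirely when bframes=0. Frames are represented by scene labels instead of real I/P costs, and all function names are hypothetical.

```python
def lookahead_window(bframes, b_adapt):
    """Assumed number of frames after p0 inspected by the flash protection."""
    if bframes == 0:
        return 0  # assumed: no flash protection without B-frames
    return 1 + (bframes if b_adapt == 2 else 1)

def is_rejected_as_flash(scenes, p0, bframes, b_adapt):
    """True if the scene at p0 reappears after the candidate cut, inside the window."""
    last = min(p0 + lookahead_window(bframes, b_adapt), len(scenes) - 1)
    return any(scenes[k] == scenes[p0] for k in range(p0 + 2, last + 1))

scenes = "AAAAAABBBAAAAAA"  # scene B lasts only three frames
p0 = 5                      # last frame of the first A run; candidate cut at frame 6
for bf, ba in ((0, 2), (3, 1), (3, 2), (10, 2)):
    print(bf, ba, is_rejected_as_flash(scenes, p0, bf, ba))
```

In this toy the three-frame scene is kept as a cut with bframes=0 or with b-adapt 1, and rejected as a flash with b-adapt 2 and three or more B-frames. That is at least qualitatively in line with the tables, where only the b-adapt=2 counts depend strongly on the number of B-frames.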