View Full Version : GMSD and SSIM Quality Metrics
ChaosKing
16th March 2019, 23:16
Is that with seekmode=0, threads=1 too ?
Did you examine that mkv to see what characteristics cause the problem ?
Can you share the file ? It would be a good debugging test clip
Not yet. I will see if I can make a cut or maybe if remuxing the file solves the issue :D
poisondeathray
16th March 2019, 23:20
I did some tests too. With SSIM and GMSD the first two runs gave identical results, but the third one gave a different one. The difference is not big - starts with the 6th decimal - but it should not happen.
stop 49.08846907182173 3.484012159548981 3762.3334999999825
stop 49.088468378240414 3.4840229354201178 3428.1400000000417
There's no RGB conversion involved here so it has to be something else.
Looking at the individual frame results:
0; 0.9794317688604798; 0.07659219540647606; 0.0;
1; 0.9793688841540404; 0.07778895381762808; 0.0;
2; 0.9801722825175584; 0.07465046759649577; 0.0;
3; 0.9805196896947995; 0.07238442343729419; 0.0;
0; 0.9794317688604798; 0.0765921875457907; 0.0; <--- GMSD different from 8th digit
1; 0.9793688841540404; 0.07778895381762808; 0.0;
2; 0.9801722825175584; 0.07465046759649577; 0.0;
3; 0.9805189190488873; 0.07238807894499658; 0.0; <--- SSIM and GMSD different from 6th digit
Looks like there are differences on some frames only. Does not look like a seeking issue either, the errors are much smaller than what would happen if the frame was different.
I remember seeing this kind of issue when I first tested SSIM metric with Vapoursynth but then the error disappeared and I forgot about it.
I triple checked this on 2 different sources, and I cannot reproduce
The only "weird" thing is sometimes the log file frame number is misordered , but values match every run when you check in a spread sheet
same result run CK's seek test script (on the same encoded video file), it is always 3 frames off...
I also ran the #1 FFMS2 seek test on the original download of "crowd_run_1080p50.y4m" - zero issues found.
Alright, so I assume my encodes are bad...
The Y4M is I-frame only , so you should not get any seek issues
I doubt your encodes are "bad" . More likely that ffms2 build is "bad"
Iron_Mike
16th March 2019, 23:27
I triple checked this on 2 different sources, and I cannot reproduce
The only "weird" thing is sometimes the log file frame number is misordered , but values match every run when you check in a spread sheet
you need to run many tests... as shown in one of my earlier posts, sometimes it outputs the exact same result 4-5 times in a row and then it's off...
The Y4M is I-frame only , so you should not get any seek issues
I doubt your encodes are "bad" . More likely that ffms2 build is "bad"
hah, that would explain it ! :D
@WorBry:
(1) what OS are you using to do your tests ?
(2) can you run the seek-test.py script from CK on one of your crowdrun encodes and see if it reports seeking errors ?
poisondeathray
16th March 2019, 23:31
you need to run many tests... as shown in one of my earlier posts, sometimes it outputs the exact same result 4-5 times in a row and then it's off...
Well I assumed 5 times was enough...I'll run 5 more for a total of 10 and report back if there are differences
But I don't have seek errors either...
WorBry
16th March 2019, 23:35
Probably better that I run the tests again with crowd_run_1080p50.y4m as source and post my results for x265 CRF28, rather than introducing another variable.
So, I encoded crowd_run_1080p50.y4m to x265 CRF28 mp4:
ffmpeg -i {Path}:/crowd_run_1080p50.y4m -vcodec libx265 -preset slow -crf 28 -pix_fmt yuv420p -r 50/1 -x265-params colorprim=1:transfer=1:colormatrix=1 {Path}:/crowd_run_1080p50_x265_CRF28.mp4
Used Zeranoe nightly build - ffmpeg-20190312-d227ed5-win64-static.
Ran the SSIM and GMSD tests as before:
Downsample=True:
SSIM;GMSD (together): 475.68958616657005; 45.96858528255874
SSIM (alone): 475.68958616657005
GMSD (alone): 45.96858528255874
Downsample=False:
SSIM;GMSD (together): 436.07619188850276; 83.3646771904963
SSIM (alone): 436.07619188850276
GMSD (alone): 83.3646771904963
Edit: Observed no change in any of the scores over ten consecutive runs.
ChaosKing
16th March 2019, 23:40
OK, so with the "old" ffms2 from here https://github.com/FFMS/ffms2/releases my h264 file has no seeking issues. So I guess the newer ffmpeg version or something else is causing this problem. But other h264 files are ok.
5 sec test file link (https://www.dropbox.com/s/tdaaygh1wqfgtax/ffms2_seeking_issue.mkv?dl=1)
Maybe it would be a good idea to gather some files and make a script to test new releases against them.
poisondeathray
16th March 2019, 23:45
10 runs all exactly the same, not even the 6th or 8th decimal place variation zorr is reporting
I used 2 different test clips, a 300 frame 1920x1080, 1000 frame clip 1920x1080 (not crowdrun) . Both x265 encoded with default settings
@Iron_Mike - What ffms2 version are you using that has seek issues ?
Can you post a "problem" file that exhibits seek issues ?
Thanks CK for the h264 "seek" problem testclip . But to be clear, Iron_Mike is using h265
Iron_Mike
17th March 2019, 00:38
@Iron_Mike - What ffms2 version are you using that has seek issues ?
can VS output the version number of a plugin ? if so, what is the command to get the version number of a specific VS plugin ?
OK, so with the "old" ffms2 from here https://github.com/FFMS/ffms2/releases my h264 file has no seeking issues. So I guess the newer ffmpeg version or something else is causing this problem. But other h264 files are ok.
Thank you for that, this FFMS2 v2.23.1 does not have any seek issues. Ran your seek test script in mode #1 and zero issues !
So, I encoded crowd_run_1080p50.y4m to x265 CRF28 mp4:
Used Zeranoe nightly build - ffmpeg-20190312-d227ed5-win64-static.
Ran the SSIM and GMSD tests as before:
Downsample=True:
SSIM;GMSD (together): 475.68958616657005; 45.96858528255874
SSIM (alone): 475.68958616657005
GMSD (alone): 45.96858528255874
Downsample=False:
SSIM;GMSD (together): 436.07619188850276; 83.3646771904963
SSIM (alone): 436.07619188850276
GMSD (alone): 83.3646771904963
Edit: Observed no change in any of the scores over ten consecutive runs.
used same Zeranoe build (20190312-d227ed5) and encoded with same settings, then tried to run the GMSD/SSIM tests...
with the old FFMS2 version (that does not have seeking issues), vspipe crashes now and vseditor crashes as well when executing the .vpy script...
how are you reading the encoded files, are you using ffms2 as well ? if so, the ffms2 Wolfberry build you mentioned earlier does have seeking issues...
so which source filter to use to read in the files ?
ifb
17th March 2019, 00:50
I'm not sure that intra coding in HEVC is that much better than AVC? Of course it might be a moot point if XF-HEVC was crippled for the sake of reducing complexity/power consumption.
To sorta answer my own question, HEVC intra is supposed to need between 17.3% (https://www.researchgate.net/profile/Detlev_Marpe/publication/241631765_Performance_analysis_of_HEVC-based_intra_coding_for_still_image_compression/links/58de431daca27206a8a4e8b9/Performance-analysis-of-HEVC-based-intra-coding-for-still-image-compression.pdf) and 22.3% (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.352.3008&rep=rep1&type=pdf)/22.4% (https://www.researchgate.net/publication/327987700_Intra_Coding_Performance_Comparison_of_HEVC_H264AVC_Motion-JPEG2000_and_JPEGXR_Encoders/download) less bitrate than AVC intra for equivalent quality (BD-BR). I haven't found any numbers for AV1 that aren't paywalled yet.
WorBry
17th March 2019, 01:42
how are you reading the encoded files, are you using ffms2 as well ? if so, the ffms2 Wolfberry build you mentioned earlier does have seeking issues...
Yes, I'm using Wolfberry's ffms2 build.
I'm using ChaosKing's Portable Fatpack VapourSynth and running the scripts with VSEditor > (F7) Benchmark. Now using Wolfberry's FFMS2 build..
Iron_Mike
17th March 2019, 02:09
Yes, I'm using Wolfberry's ffms2 build.
I've used the one from your link, and it also had seeking issues...
can you test one of your encodes w/ CK's seek test script to see if they're also 3 frames off ?
ChaosKing
17th March 2019, 02:17
And have you tried lsmash, that should be ok.
WorBry
17th March 2019, 02:22
can you test one of your encodes w/ CK's seek test script to see if they're also 3 frames off ?
Yes, I've just done that with the Crowd Run x265 CRF28.mp4 encode and it (#5 - seekmode=0) reports 'No seeking issues found'. I'll test #1 also.
Edit: With #1, it's reporting the 3 frames off thing.
Iron_Mike
17th March 2019, 02:44
Yes, I've just done that with the Crowd Run x265 CRF28.mp4 encode and it (#5 - seekmode=0) reports 'No seeking issues found'. I'll test #1 also.
Edit: With #1, it's reporting the 3 frames off thing.
alright, so that matches my results over here.
so, does that mean we should not use this ffms2 version to read source files ?
how many of your tests in this thread were done w/ this ffms2 version ? I believe you initially stated you were reading files via vdub ?
is there a VS command to print the version of a specific plug ?
WorBry
17th March 2019, 02:49
I only started using Wolfberry's ffms2 build today, for these tests. All previous tests were with the ffms2 build that came with the VapourSynth Fatpack Portable package, including the time that I was running the metric tests with the installed version of VapourSynth and VirtualDub2.
Iron_Mike
17th March 2019, 03:53
And have you tried lsmash, that should be ok.
just did that, it reports "[hevc @ 0000009525d3eb20] missing picture in access unit"
but the seek test was successful (no errors).
WorBry
17th March 2019, 04:41
I only started using Wolfberry's ffms2 build today, for these tests. All previous tests were with the ffms2 build that came with the VapourSynth Fatpack Portable package, including the time that I was running the metric tests with the installed version of VapourSynth and VirtualDub2.
I've re-tested the CrowdRun x265 CRF28.mp4 encode from..
https://forum.doom9.org/showthread.php?p=1869094#post1869094
....with the original ffms2 version that came with VapourSynth Fatpack Portable and L-Smash Source (LWLibavSource) and both produce exactly the same SSIM and GMSD scores as Wolfberry's ffms2 did:
Downsample=True:
SSIM;GMSD (together): 475.68958616657005; 45.96858528255874
SSIM (alone): 475.68958616657005
GMSD (alone): 45.96858528255874
Downsample=False:
SSIM;GMSD (together): 436.07619188850276; 83.3646771904963
SSIM (alone): 436.07619188850276
GMSD (alone): 83.3646771904963
And the seek-test script reports 'No seeking issues found' for option # 2 'L-Smash-Works'.
WorBry
17th March 2019, 05:44
Can ffmpeg even ingest Canon XF-HEVC (10bit 422) camera footage yet ?
Yes. It was added (http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=f95aee2b72535e14b7463750fd7afb6d1cdbe4d4) to the MXF demuxer 8 days ago.
Samples (https://www.dropbox.com/sh/uam3s1bvralba07/AADfc7RvwmhEA-rLJ8pDjz8la?dl=0)
Nice.
Iron_Mike
17th March 2019, 06:09
Alright, I made some progress in figuring out what causes the issue w/ these fluctuations in GMSD/SSIM results.
it appears that when using the keyframes option (specifying a GOP size), e.g. -x265-params "keyint=100:min-keyint=100:rc-lookahead=100", ffmpeg produces a file that some VS plug/module has an issue with, which then yields these inconsistent GMSD/SSIM metrics...
As a test to match WorBrys results, I made two encodes - in both cases I used the same source (crowdrun 1080p50), the same ffmpeg Zeranoe build, and the same ffmpeg settings that WorBry used.
But in one encode I added the keyframes/GOPsize, which is case (A).
# (A) here are the GMSD/SSIM (Downsample=False) results of video encoded w/ keyframes being set - 4 consecutive metrics calculation using the exact same vpy script
stop 122.04975401899155 236.93252855842488
stop 119.92695987646071 248.23415308822817
stop 82.89543157743563 436.64183973524297
stop 124.03537883254465 227.42573587299864
--> results in random metrics score
# (B) here are the GMSD/SSIM (Downsample=False) results of video encoded w/o setting keyframes - 10 consecutive metrics calculation using the exact same vpy script
stop 83.3646771904963 436.0761547550152
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.0761547550152
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.0761547550152
stop 83.3646771904963 436.0761547550152
stop 83.3646771904963 436.0761547550152
stop 83.36467719049627 436.07615475501524
stop 83.3646771904963 436.0761547550152
--> while the encode w/o keyframes (B) produces pretty much the same metrics repeatedly, two things to note:
(1) 5 out of 10 tests have more precision - more decimal points
(2) the SSIM only matches the first few digits of WorBrys results...
Downsample=False:
SSIM;GMSD (together): 436.07619188850276; 83.3646771904963
SSIM (alone): 436.07619188850276
GMSD (alone): 83.3646771904963
Edit: Observed no change in any of the scores over ten consecutive runs.
why do these plugs not consistently produce the same result ?
Iron_Mike
17th March 2019, 06:28
I've re-tested the CrowdRun x265 CRF28.mp4 encode from..
https://forum.doom9.org/showthread.php?p=1869094#post1869094
....with the original ffms2 version that came with VapourSynth Fatpack Portable and L-Smash Source (LWLibavSource) and both produce exactly the same SSIM and GMSD scores as Wolfberry's ffms2 did:
And the seek-test script reports 'No seeking issues found' for option # 2 'L-Smash-Works'.
option #2 always resulted in success, #1 is the one that reports seek errors...
question is if the seek-test script is not correct, or if all plugs u mentioned above all are 3 frames off... then they would all produce the same results, which are ultimately incorrect
WorBry
17th March 2019, 07:06
option #2 always resulted in success, #1 is the one that reports seek errors...
question is if the seek-test script is not correct, or if all plugs u mentioned above all are 3 frames off... then they would all produce the same results, which are ultimately incorrect
On the contrary, it shows that regardless of what seek errors the seek-test script is reporting for ffms2, the metric scores come out correct - i.e. they are exactly the same as obtained with LWLibavSource.
Why do you say 'or if all plugs u mentioned above all are 3 frames off' when the seek-test reports no issues with LWLibavSource ?
Also what is the conclusion when (ffms2) option #5 (seekmode=0) reports 'No seeking issues found' but (ffms2) option #1 reports seek errors - in this case the "3 frames off" thing ?
WorBry
17th March 2019, 07:15
Encoded a file w/ the same settings and same ffmpeg Zeranoe build as WorBry did using the 1080p50 crowdrun as ref file. Made two encodes, one with and one without keyframes/GOPsize specified.
When I encode w/ the keyframes option (specifying a GOP size) -x265-params "keyint=100:min-keyint=100:rc-lookahead=100".....
(2) the SSIM only matches the first few digits of WorBrys results...
Were they the exact same settings that I used:
So, I encoded crowd_run_1080p50.y4m to x265 CRF28 mp4:
ffmpeg -i {Path}:/crowd_run_1080p50.y4m -vcodec libx265 -preset slow -crf 28 -pix_fmt yuv420p -r 50/1 -x265-params colorprim=1:transfer=1:colormatrix=1 {Path}:/crowd_run_1080p50_x265_CRF28.mp4
...or did you add "-x265-params rc-lookahead=100" ?
Iron_Mike
17th March 2019, 09:12
On the contrary, it shows that regardless of what seek errors the seek-test script is reporting for ffms2, the metric scores come out correct - i.e. they are exactly the same as obtained with LWLibavSource.
Why do you say 'or if all plugs u mentioned above all are 3 frames off' when the seek-test reports no issues with LWLibavSource ?
must have missed the connection of the LWLibavSource, I don't know what that is... is that the lib for LSmash-Works ? (I'm not familiar w/ many of these plugs)
Alright, so, if the seek script reports no error for using LSmash, but does report errors via ffms2, yet the GMSD/SSIM scores are the exact same when the vid files are read via ffms2 or Lsmash, then one would assume that the seek test script has either flawed logic or at least the way it uses ffms2 in the #1 test is different from the way it is used when the vid files are read in your vpy script...
What is the command syntax for reading a vid file via Lsmash/LWLibavSource ?
Also what is the conclusion when (ffms2) option #5 (seekmode=0) reports 'No seeking issues found' but (ffms2) option #1 reports seek errors - in this case the "3 frames off" thing ?
+1 on that question. Since #5 is labeled as "slow, but more safe" I assumed that this is the better verification but then CK said to test via #1 as well...
Were they the exact same settings that I used:
...or did you add "-x265-params rc-lookahead=100" ?
...
edit: I edited my original post for clarity, I hope it makes more sense now ;-) https://forum.doom9.org/showthread.php?p=1869119#post1869119
...
one encode was exactly your settings - as you can see the GMSD matches but the SSIM only matches on the first 4 decimal places (436.0761...)... (that again is odd, considering we used same source, same ffmpeg, same settings)
then another encode w/ your settings but in addition setting the keyframes/GOPsize... the resulting file always produces different GMSD/SSIM results, meaning whatever plugs/modules/classes are utilized to obtain these metrics, at least one of them has a problem w/ that file (or the keyframes in that file)... I just have zero idea why it produces always different results - maybe the frame seek returns a random frame, hence the random metric score...
btw, I ran a verification test on that encode to check if all keyframes are in the correct position (every 100 frames, hence every 2 secs) - and they are... I don't think it's a bad encode, it just triggers bad logic in at least one of the involved plugs/modules/classes...
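The post does not say which tool was used for that keyframe check. A minimal sketch of one way to do it, assuming ffprobe is on the PATH and that the per-frame `pict_type` it reports is a good enough indicator of where keyframes sit; both helper names are made up for illustration:

```python
import subprocess

def misplaced_keyframes(pict_types, keyint=100):
    """Return frame indices where an I-frame is missing or unexpected
    for a fixed GOP size (keyint == min-keyint)."""
    bad = []
    for n, pict_type in enumerate(pict_types):
        expected_i = (n % keyint == 0)
        if (pict_type == "I") != expected_i:
            bad.append(n)
    return bad

def read_pict_types(path):
    """Ask ffprobe for the picture type of every video frame.
    Assumption: ffprobe is installed and lists frames in output order."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "frame=pict_type", "-of", "csv=p=0", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip().strip(",") for line in out.splitlines() if line.strip()]

# Usage (path is a placeholder):
# print(misplaced_keyframes(read_pict_types("/path/to/encode.mp4"), keyint=100))
```

An empty list would mean every 100th frame, and only every 100th frame, is an I-frame.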
ChaosKing
17th March 2019, 11:30
+1 on that question. Since #5 is labeled as "slow, but more safe" I assumed that this is the better verification but then CK said to test via #1 as well...
seekmode is a ffms2 parameter, see here https://github.com/FFMS/ffms2/blob/master/doc/ffms2-avisynth.md#int-seekmode--1
Iron_Mike
17th March 2019, 13:17
seekmode is a ffms2 parameter, see here https://github.com/FFMS/ffms2/blob/master/doc/ffms2-avisynth.md#int-seekmode--1
ah, thanks for that.
so I just re-ran the GMSD/SSIM tests but opened both files w/ ffms and seekmode=0, and the metrics score is the same.
so I guess seekmode is not used when these tests are run (since the seek test script indicated that seek mode is 3 frames off, the scores would be different)...
ChaosKing
17th March 2019, 13:48
Can you upload a small cut of your video file that you're testing?
WorBry
17th March 2019, 15:21
must have missed the connection of the LWLibavSource.
I've re-tested the CrowdRun x265 CRF28.mp4 encode from..
https://forum.doom9.org/showthread.php?p=1869094#post1869094
....with the original ffms2 version that came with VapourSynth Fatpack Portable and L-Smash Source (LWLibavSource) and both produce exactly the same SSIM and GMSD scores as Wolfberry's ffms2 did
I don't know what that is... is that the lib for LSmash-Works ? (I'm not familiar w/ many of these plugs)
You said you tested it, but got an error.
And have you tried lsmash, that should be ok.
just did that, it reports "[hevc @ 0000009525d3eb20] missing picture in access unit"
but the seek test was successful (no errors).
What is the command syntax for reading a vid file via Lsmash/LWLibavSource ?
clip = core.lsmas.LWLibavSource(source=r'{Path}:/video')
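For side-by-side comparisons, the two source filters discussed in this thread can be wrapped in one helper. This is only a sketch: `open_clip` is a made-up name, and it assumes the lsmas and ffms2 plugins are loaded in the VapourSynth core that is passed in. The seekmode=0 / threads=1 values are the ones asked about earlier in the thread.

```python
def open_clip(core, path, source_filter="lsmas"):
    """Open `path` with L-SMASH Works or FFMS2.

    `core` is a VapourSynth core, e.g.:
        import vapoursynth as vs
        core = vs.core
    """
    if source_filter == "lsmas":
        return core.lsmas.LWLibavSource(source=path)
    if source_filter == "ffms2":
        # seekmode=0 forces linear access; threads=1 removes
        # multithreaded decoding as a variable.
        return core.ffms2.Source(source=path, seekmode=0, threads=1)
    raise ValueError(f"unknown source filter: {source_filter}")

# Usage inside a .vpy script (path is a placeholder):
# clip = open_clip(core, r"/path/to/encode.mp4", "ffms2")
```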
edit: I edited my original post for clarity, I hope it makes more sense now ;-) https://forum.doom9.org/showthread.php?p=1869119#post1869119
I don't know why you get inconsistent results on consecutive metric testing. I don't.
Please, no more requests for me to test this and that.
Over to you ChaosKing.
Can you upload a small cut of your video file that you're testing?
Edit: One last test.
I ran the SSIM:GMSD script with the 1080/50p x265 CRF28.mp4 encode imported with ffms2 and LWLibavSource - both plugins being the versions that came with VapourSynth Fatpack Portable.
I used the updated Zoptilib.py that provides the per frame metric scores:
https://forum.doom9.org/showthread.php?p=1868821#post1868821
In both tests, opened up the Previews in VSEditor and 'sent' to Frame #50. Here are the frame grabs:
http://i.imgur.com/xHJRr0Zm.jpg (https://imgur.com/xHJRr0Z)
Click on image to enlarge and (+) cursor to enlarge further.
Both tests exported exact same frame and per-frame SSIM/GMSD scores. Same thing stepping through the frames.
I also examined the x265 CRF26 encode in Elecard StreamEye and TMPGEnc SmartRenderer5 and they display the same frame for frame #50 i.e. frame #50 in the display sequence, but #52 in the stream sequence - it's a B frame.
Iron_Mike
17th March 2019, 23:52
You said you tested it, but got an error.
yes, using the option in the seek-test script, as CK suggested that. I couldn't run a metric test b/c I didn't have the command syntax.
clip = core.lsmas.LWLibavSource(source=r'{Path}:/video')
thank you for that.
Please, no more requests for me to test this and that.
thanks for running a couple of tests and confirming your settings.
I don't know why you get inconsistent results on consecutive metric testing. I don't.
as outlined in my other post, it's FFMS2 w/ seekmode=1 (default) that has a problem with keyframes - which will provide random, inconsistent GMSD/SSIM results.
This can be a problem for peeps running GMSD/SSIM tests on files they did not encode themselves and may not be aware of the issues w/ seekmode=1 (default).
So I ran a few more tests to confirm this.
Using ffmpeg Zeranoe 20190312-d227ed5 (https://ffmpeg.zeranoe.com/builds/win64/static/ffmpeg-20190312-d227ed5-win64-static.zip), CK's Fatpack (https://forum.doom9.org/showthread.php?t=175529), and FFMS2 from Wolfberry (https://drive.google.com/open?id=1UKBIocylymTJY_Q5-OjnDTIDzGfdyvG0) and the crowd_run_1080p50.y4m (https://media.xiph.org/video/derf/y4m/crowd_run_1080p50.y4m) 1080p50 source (500 frames / 10 secs), I created three encodes.
(1) encode w/o keyint, min-keyint, rc-lookahead parameters:
ffmpeg command
<ffmpeg path> -i "/path/to/crowd_run_1080p50.y4m" -c:v libx265 -preset slow -crf 28 -pix_fmt yuv420p -x265-params "colorprim=1:transfer=1:colormatrix=1" -r 50/1 "/path/to/encode.mp4"
GMSD/SSIM results - 5 consecutive metrics tests each - source filter varied
Lsmash:
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.36467719049627 436.0761547550152
stop 83.3646771904963 436.07615475501524
FFMS2 (Wolfberry) - seekmode=0:
stop 83.3646771904963 436.0761547550152
stop 83.36467719049627 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
FFMS2 (Wolfberry) - seekmode=1 (default):
stop 83.36467719049627 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.0761547550152
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.0761547550152
--> as you can see, w/o keyframes specified in the encode, seekmode causes no problem.
(2) encode w/ keyint, min-keyint, rc-lookahead:
ffmpeg command
<ffmpeg path> -i "/path/to/crowd_run_1080p50.y4m" -c:v libx265 -preset slow -crf 28 -pix_fmt yuv420p -x265-params "keyint=100:min-keyint=100:rc-lookahead=100:colorprim=1:transfer=1:colormatrix=1" -r 50/1 "/path/to/encode.mp4"
Lsmash:
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.64183973524297
FFMS2 (Wolfberry) - seekmode=0:
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.641839735243
stop 82.89543157743563 436.641839735243
FFMS2 (Wolfberry) - seekmode=1 (default):
stop 123.93944837285996 227.9372376995322
stop 123.74699115224307 228.86426005799092
stop 82.89543157743563 436.641839735243
stop 121.66415998563879 239.07092253508384
stop 122.1422683865401 236.40679167570886
--> keyframes were specified in the encode, seekmode=1 (default) provides inconsistent metrics
And as you can see ONE of the 5 metrics (w/ seekmode=1) matches the consistent metrics from Lsmash, so this can be very misleading if you only run one test and it happens to match...
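Since a single run can match by chance, it helps to compare several runs automatically. A minimal sketch, using the GMSD and SSIM totals listed above; `runs_agree` and the tolerance are assumptions, chosen so that last-digit noise passes while the seekmode=1 swings do not:

```python
import math

def runs_agree(totals, rel_tol=1e-12):
    """True if every run's total matches the first one within rel_tol."""
    first = totals[0]
    return all(math.isclose(first, t, rel_tol=rel_tol) for t in totals[1:])

# Encode (2), SSIM totals via Lsmash: only the last digits wobble.
lsmash_ssim = [436.641839735243] * 4 + [436.64183973524297]

# Encode (2), GMSD totals via FFMS2 seekmode=1 (default).
ffms2_seekmode1_gmsd = [
    123.93944837285996,
    123.74699115224307,
    82.89543157743563,
    121.66415998563879,
    122.1422683865401,
]

print(runs_agree(lsmash_ssim))           # last-digit noise is tolerated
print(runs_agree(ffms2_seekmode1_gmsd))  # large differences are flagged
```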
(3) encode w/ keyint, min-keyint but w/o rc-lookahead:
--> wanted to see if it is specifically the rc-lookahead setting that causes the issues w/ seekmode=1
ffmpeg command
<ffmpeg path> -i "/path/to/crowd_run_1080p50.y4m" -c:v libx265 -preset slow -crf 28 -pix_fmt yuv420p -x265-params "keyint=100:min-keyint=100:colorprim=1:transfer=1:colormatrix=1" -r 50/1 "/path/to/encode.mp4"
Lsmash:
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
FFMS2 (Wolfberry) - seekmode=0:
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841787 435.99382191599165
FFMS2 (Wolfberry) - seekmode=1 (default):
stop 108.17858520287257 307.76386120454794
stop 124.03602892099701 228.48869961962274
stop 83.19534706841786 435.99382191599165
stop 122.13314288222833 236.93921207380862
stop 123.97733586255691 229.28604315863697
--> same issues as (2)
Notes:
(a) you need to use seekmode=0 if using FFMS2
(b) the code sometimes prints an extra digit or shows very minor variations in the last decimal places, e.g. 83.36467719049627 vs. 83.3646771904963 above
(c) although the three encodes share the same source and differ only in the keyframe parameters, the consistent metrics (via Lsmash or FFMS2 w/ seekmode=0) give different GMSD/SSIM results for each encode
83.3646771904963 vs. 82.89543157743563 vs. 83.19534706841786 # GMSD
436.07615475501524 vs. 436.641839735243 vs. 435.99382191599165 # SSIM
although these are minor variations considering one would divide the total metric by num frames, it is another factor that affects the score, and it shouldn't IMO
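To put those totals on a per-frame scale, here is a small sketch that divides by the frame count. It assumes each "stop" total is a plain sum over the 500 frames of the clip, as the remark about dividing by num frames suggests; the labels are only shorthand for the three encodes.

```python
NUM_FRAMES = 500  # crowd_run_1080p50.y4m: 500 frames / 10 secs

# (GMSD total, SSIM total) from the consistent Lsmash runs above
totals = {
    "(1) no keyint":               (83.3646771904963, 436.07615475501524),
    "(2) keyint + rc-lookahead":   (82.89543157743563, 436.641839735243),
    "(3) keyint, no rc-lookahead": (83.19534706841786, 435.99382191599165),
}

def per_frame_means(totals, num_frames):
    """Divide each (GMSD, SSIM) total by the number of frames."""
    return {name: (g / num_frames, s / num_frames)
            for name, (g, s) in totals.items()}

means = per_frame_means(totals, NUM_FRAMES)
for name, (gmsd, ssim) in means.items():
    print(f"{name}: mean GMSD {gmsd:.6f}, mean SSIM {ssim:.6f}")
```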
ChaosKing
18th March 2019, 01:15
I recreated your tests and came to the same conclusion.
I checked it also like this in excel:
         run1                                       run2                                               run1 - run2
frame    gmsd                 ssim                  frame    gmsd                 ssim                  gmsd          ssim
255      0.1665046186678081   0.8689233699845679    255      0.1665046186678081   0.8689233699845679    0             0
256      0.30940736771850696  0.8740746768904321    256      0.16450349612365942  0.8740746768904321    0.144903872   0
257      0.3089268132506728   0.8666172357253087    257      0.16778360296358616  0.8666172357253087    0.14114321    0
...
264      0.2645300712830965   0.8644133391203703    264      0.16818244844402383  0.8644133391203703    0.096347623   0
265      0.2663959966212291   0.3675497926311728    265      0.1650834566291641   0.8708016251929013    0.10131254    -0.503251833
I noticed that the first 255 frames are ok. This is also why the seek test said no seeking issues found, bcs it tests (see bat file) only the first 100 frames. Setting it to 500 showed that ffms2 also has seeking issues with the encoded x265 mp4 file.
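The same comparison can be done without a spreadsheet. A minimal sketch, assuming the per-frame log lines look like the ones posted earlier in the thread ("frame; SSIM; GMSD; ...;"); the function names and the tolerance are made up for illustration. Keying by frame number also sidesteps the misordered log lines mentioned before.

```python
def parse_log(text):
    """Parse 'frame; score; score; ...;' lines into {frame: (scores...)}."""
    rows = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(";") if f.strip()]
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        rows[int(fields[0])] = tuple(float(x) for x in fields[1:])
    return rows

def differing_frames(run1, run2, tol=1e-9):
    """Frames present in both runs whose scores differ by more than tol."""
    out = []
    for frame in sorted(run1.keys() & run2.keys()):
        if any(abs(a - b) > tol for a, b in zip(run1[frame], run2[frame])):
            out.append(frame)
    return out

run1 = parse_log("""255; 0.8689233699845679; 0.1665046186678081; 0.0;
256; 0.8740746768904321; 0.30940736771850696; 0.0;""")
run2 = parse_log("""256; 0.8740746768904321; 0.16450349612365942; 0.0;
255; 0.8689233699845679; 0.1665046186678081; 0.0;""")

print(differing_frames(run1, run2))  # frames where the two runs disagree
```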
So basically only DGDecodeNV was 100% correct, but it cuts off the last 2 frames for some reason, so I had to test with 498 frames. (maybe ffmpeg produces a "broken" stream?)
p.s. I only tested with the latest ffms2 version.
pp.s. It would be good if zopti would sort by frame before saving the log file :D
pppppp.s
stop 83.36467719049627 436.0761547550152
This indicates that it could be a decoding issue. Like 1 pixel off or something. This could be a case for inspector butteraugli.
Iron_Mike
18th March 2019, 01:49
This is also why the seek test said no seeking issues found, bcs it tests (see bat file) only the first 100 frames. Setting it to 500 showed that ffms2 has also seeking issues with the encoded x265 mp4 file.
Not sure I understand you correctly, but using Wolfberry's FFMS2 w/ this command
py seek-test.py "/path/to/crowd_run_1080p50_encode.mp4" 0 499
option #1 - seekmode=1: showed seeking issues, always 3 frames off
option #5 - seekmode=0: showed no seeking issues
pp.s. It would be good if zopti would sort by frame before saving the log file :D
+1
even better: include a header line that states the Zopti and muvsfunc version that were used in the test and specifically state which metrics are listed in the file (GMSD|SSIM|MDSI|etc) and the order in which they are listed...
otherwise nobody knows what is what unless you wrote down the metrics collection passed to the Zopti class... ;-)
zorr
18th March 2019, 01:52
This indicates that it could be a decoding issue. Like 1 pixel off or something. This could be a case for inspector butteraugli.
A small difference like that can happen when summing floating point numbers in a different order. Most likely the frame order is a bit different sometimes. That could also be solved by sorting by frame numbers before summing them up.
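A tiny illustration of that effect, using three made-up per-frame scores rather than real metric values. The explicit loop is deliberate: it adds the floats one after another, which is the behaviour being described.

```python
def naive_sum(values):
    """Add floats left to right, one at a time."""
    total = 0.0
    for v in values:
        total += v
    return total

def total_sorted_by_frame(scores):
    """scores: iterable of (frame_number, score) in any order."""
    return naive_sum(score for _, score in sorted(scores))

in_order = [(0, 0.1), (1, 0.2), (2, 0.3)]
shuffled = [(2, 0.3), (1, 0.2), (0, 0.1)]

# Same numbers, different order, different last digit:
print(naive_sum(s for _, s in in_order))   # 0.6000000000000001
print(naive_sum(s for _, s in shuffled))   # 0.6

# Sorting by frame number first makes the total repeatable:
print(total_sorted_by_frame(in_order) == total_sorted_by_frame(shuffled))  # True
```

Summing with math.fsum would be another way to get an order-independent total, at the cost of changing the last digits relative to earlier logs.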
zorr
18th March 2019, 01:59
even better: include a header line that states the Zopti and muvsfunc version that were used in the test and specifically state which metrics are listed in the file (GMSD|SSIM|MDSI|etc) and the order in which they are listed...
otherwise nobody knows what is what unless you wrote down the metrics collection passed to the Zopti class... ;-)
Zoptilib was written for the Zopti optimizer and the log file was not meant for human consumption. :) Zopti reads the used metrics from the source so it knows. But I can see that Zoptilib can be useful without Zopti and the header would make it better. I just need to update Zopti so that it doesn't choke on the header.
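A rough sketch of what such a header plus frame-sorted output could look like. This is not Zoptilib's actual format: the "#" header lines, the function name and the version strings are all hypothetical; only the "frame; score; score;" row layout follows the log lines posted earlier in the thread.

```python
def format_log(results, metrics, versions):
    """Build log text with a header and rows sorted by frame number.

    results:  {frame_number: [score, ...]} in any order
    metrics:  metric names, in the same order as the scores
    versions: {"tool name": "version string"} (hypothetical header content)
    """
    lines = ["# " + "; ".join(f"{k}={v}" for k, v in versions.items()),
             "# columns: frame; " + "; ".join(metrics)]
    for frame in sorted(results):
        scores = "; ".join(repr(s) for s in results[frame])
        lines.append(f"{frame}; {scores};")
    return "\n".join(lines) + "\n"

text = format_log(
    {1: [0.9793688841540404, 0.07778895381762808],
     0: [0.9794317688604798, 0.07659219540647606]},
    metrics=["ssim", "gmsd"],
    versions={"zoptilib": "x.y", "muvsfunc": "x.y"},  # placeholder versions
)
print(text)
```

A reader that has to stay compatible with old logs would only need to skip lines starting with "#".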
Iron_Mike
18th March 2019, 02:19
Zoptilib was written for the Zopti optimizer and the log file was not meant for human consumption. :) Zopti reads the used metrics from the source so it knows. But I can see that Zoptilib can be useful without Zopti and the header would make it better. I just need to update Zopti so that it doesn't choke on the header.
:thanks:
Iron_Mike
18th March 2019, 02:42
Zoptilib was written for the Zopti optimizer and the log file was not meant for human consumption. :) Zopti reads the used metrics from the source so it knows. But I can see that Zoptilib can be useful without Zopti and the header would make it better. I just need to update Zopti so that it doesn't choke on the header.
btw, if it is possible for you to identify whether the metrics were calculated via YUV or RGB - since once MDSI is in the metrics list, everything goes RGB and the values change - that would also be an important parameter to state in the header...
WorBry
18th March 2019, 06:02
Using ffmpeg Zeranoe 20190312-d227ed5 (https://ffmpeg.zeranoe.com/builds/win64/static/ffmpeg-20190312-d227ed5-win64-static.zip), CK's Fatpack (https://forum.doom9.org/showthread.php?t=175529), and FFMS2 from Wolfberry (https://drive.google.com/open?id=1UKBIocylymTJY_Q5-OjnDTIDzGfdyvG0) and the crowd_run_1080p50.y4m (https://media.xiph.org/video/derf/y4m/crowd_run_1080p50.y4m) 1080p50 source (500 frames / 10 secs), I created three encodes.
(1) encode w/o keyint, min-keyint, rc-lookahead parameters:
ffmpeg command
<ffmpeg path> -i "/path/to/crowd_run_1080p50.y4m" -c:v libx265 -preset slow -crf 28 -pix_fmt yuv420p -x265-params "colorprim=1:transfer=1:colormatrix=1" -r 50/1 "/path/to/encode.mp4"
GMSD/SSIM results - 5 consecutive metrics tests each - source filter varied
Lsmash:
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.36467719049627 436.0761547550152
stop 83.3646771904963 436.07615475501524
FFMS2 (Wolfberry) - seekmode=0:
stop 83.3646771904963 436.0761547550152
stop 83.36467719049627 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.07615475501524
FFMS2 (Wolfberry) - seekmode=1 (default):
stop 83.36467719049627 436.07615475501524
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.0761547550152
stop 83.3646771904963 436.07615475501524
stop 83.3646771904963 436.0761547550152
Well, I can only reiterate that in my tests I do not get these inconsistencies.
Same source, same x265 CRF28.mp4 command line (i.e. mine). Five consecutive runs of SSIM:GMSD (default, with downsample):
LWLibavSource:
Downsample=True (Default)
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
Downsample=False
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
FFMS2 (Wolfberry) -default
Downsample=True (Default)
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
stop 475.68958616657005 45.96858528255874
Downsample=False
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
stop 436.07619188850276 83.3646771904963
I assume you tested with Downsample=False. Interesting that your SSIM scores, 436.0761547550152(4), are also marginally lower than mine, 436.07619188850276, whereas the GMSD scores are the same, allowing for those inconsistencies in the last digit.
I should mention also that I ran the series twice - first, retaining the ffindex and lwi files from the first test for the subsequent tests and then creating a fresh index file for each successive test (deleting the last one, of course). There was absolutely no difference in the results. What did you do in your tests - use the same index files or generate a fresh one for each test ?
And as a double check, I've just now imported the score data from the log files (I retained), into Excel (well, LibreOffice Calc) and, like ChaosKing, run difference calcs on the per frame scores - within each series and between the ffms2 and LWLibavSource series - and I find zero differences.
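The spreadsheet difference check described above can also be scripted. The following is only a sketch, assuming the per-frame log format quoted earlier in the thread (frame number followed by semicolon-separated scores); `parse_frames` and `frame_differences` are hypothetical helper names, and the sample values are the two differing runs posted earlier.

```python
def parse_frames(lines):
    """Map frame number -> list of float scores for each per-frame log line."""
    frames = {}
    for line in lines:
        fields = [f.strip() for f in line.split(";") if f.strip()]
        # Per-frame lines start with an integer; "stop ..." summary lines do not.
        if fields and fields[0].isdigit():
            frames[int(fields[0])] = [float(x) for x in fields[1:]]
    return frames


def frame_differences(run_a, run_b):
    """Return {frame: [a - b per column]} for frames whose scores differ."""
    a, b = parse_frames(run_a), parse_frames(run_b)
    diffs = {}
    for n in sorted(a.keys() & b.keys()):
        if a[n] != b[n]:
            diffs[n] = [x - y for x, y in zip(a[n], b[n])]
    return diffs


run_1 = [
    "0; 0.9794317688604798; 0.07659219540647606; 0.0;",
    "1; 0.9793688841540404; 0.07778895381762808; 0.0;",
    "3; 0.9805196896947995; 0.07238442343729419; 0.0;",
]
run_2 = [
    "0; 0.9794317688604798; 0.0765921875457907; 0.0;",
    "1; 0.9793688841540404; 0.07778895381762808; 0.0;",
    "3; 0.9805189190488873; 0.07238807894499658; 0.0;",
]

diffs = frame_differences(run_1, run_2)
for frame, delta in diffs.items():
    print(frame, delta)
```

An empty result means the two runs match on every frame they have in common.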
I didn't run any metric tests with ffms2 seekmode=0, but why would I ? Nor have I run any tests with x265 encoded with custom key frame and rc-lookahead settings, because I have no interest in doing so.
It would be good if zopti would sort by frame before saving the log file :D
The frames in my SSIM:GMSD test log files are all in sequence.
ChaosKing
18th March 2019, 10:17
But have you also used the ffmpeg Zeranoe 20190312-d227ed5 build?
EDIT
re-muxing the mp4 to mkv fixes everything.
The frames in my SSIM:GMSD test log files are all in sequence.
I have a Ryzen 1700 with 16 threads (8 real cores). As far as I remember vsedit makes a get_frameAsync(i) call. So an unordered list can be expected.
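If frames are written to the log in completion order, the lines can be put back in sequence afterwards. A minimal sketch, assuming per-frame lines begin with the frame number followed by a semicolon (as in the logs quoted earlier) and that anything else, such as the "stop" summary line, should be kept at the end; `sort_log_lines` is a hypothetical helper, not a Zoptilib function.

```python
def sort_log_lines(lines):
    """Return per-frame lines ordered by frame number, other lines kept at the end."""
    frame_lines, other_lines = [], []
    for line in lines:
        first_field = line.split(";", 1)[0].strip()
        if first_field.isdigit():
            frame_lines.append((int(first_field), line))
        else:
            other_lines.append(line)  # e.g. the "stop ..." summary line
    frame_lines.sort(key=lambda pair: pair[0])
    return [line for _, line in frame_lines] + other_lines


log = [
    "2; 0.9801722825175584; 0.07465046759649577; 0.0;",
    "0; 0.9794317688604798; 0.07659219540647606; 0.0;",
    "stop 49.08846907182173 3.484012159548981 3762.3334999999825",
    "1; 0.9793688841540404; 0.07778895381762808; 0.0;",
]
for line in sort_log_lines(log):
    print(line)
```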
Iron_Mike
18th March 2019, 11:32
Well, I can only reiterate that in my tests I do not get these inconsistencies.
well, did you actually read my post... ? ;-)
it is the x265 KEYFRAMES parameter that creates an encode that will throw off seekmode... all explained in detail above... simply use the provided ffmpeg commands in my post to re-create the problem
Iron_Mike
18th March 2019, 11:35
ran a couple of tests to see how presets behave for x265...
here I tested medium vs. fast... same range as WorBry CRF 0-30, used the same graph layout
https://i.imgur.com/d6WMXYh.png
ChaosKing
18th March 2019, 13:39
https://github.com/theChaosCoder/zoptilib
Zoptilib now writes an ordered log file.
But sadly it still shows small differences at the end sometimes. I will look for a more precise sum() function.
stop 83.3646771904963 stop 83.36467719049628
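A last-digit difference like the one above is what one would expect if the total is a plain float sum taken in varying frame order, because floating-point addition is not associative. The sketch below assumes that is how the total is computed (the thread does not show Zoptilib's code) and uses made-up scores. `math.fsum` from the standard library returns a correctly rounded sum, so the result does not depend on input order.

```python
import math
import random

# Hypothetical per-frame scores; real values would come from the metric log.
random.seed(0)
scores = [random.uniform(0.07, 0.08) for _ in range(500)]

shuffled = scores[:]
random.shuffle(shuffled)

# Naive left-to-right sums can differ in the last digits when the order changes.
naive_a = sum(scores)
naive_b = sum(shuffled)

# math.fsum tracks the rounding error and returns the correctly rounded total,
# so both orderings give the same float.
exact_a = math.fsum(scores)
exact_b = math.fsum(shuffled)

print(naive_a, naive_b)    # may or may not be identical
print(exact_a == exact_b)
```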
WorBry
18th March 2019, 13:55
well, did you actually read my post... ? ;-)
it is the x265 KEYFRAMES parameter that creates an encode that will throw off seekmode... all explained in detail above... simply use the provided ffmpeg commands in my post to re-create the problem
Yes, I read your post in great detail. And did you read mine ? I only referred to your first set of data with the x265 CRF26 created with the command line I gave you. "Simply use the provided ffmpeg commands in my post to re-create the problem" indeed - you've got a nerve. Yes I did, and I cannot replicate your inconsistent results - I get the exact same scores every time and a marginally higher SSIM score than yours.
Read:
Well, I can only reiterate that in my tests I do not get these inconsistencies.
Interesting that your SSIM scores, 436.0761547550152(4), are also marginally lower than mine, 436.07619188850276, whereas the GMSD scores are the same, allowing for those inconsistencies in the last digit.
Yet you seem intent on asserting that your experience is somehow the norm i.e. inconsistencies are to be expected - when there is quite clearly something wrong.
And you didn't answer my question:
I should mention also that I ran the series twice - first, retaining the ffindex and lwi files from the first test for the subsequent tests and then creating a fresh index file for each successive test (deleting the last one, of course). There was absolutely no difference in the results. What did you do in your tests - use the same index files or generate a fresh one for each test ?
Do you get the same dodgy results retaining the ffindex and lwi index files from the first test in a series vs creating a fresh index file for each ?
WorBry
18th March 2019, 14:14
But have you also used the ffmpeg Zeranoe 20190312-d227ed5 build?
It's the other way round. He used Zeranoe 20190312-d227ed5 to replicate my test procedure:
https://forum.doom9.org/showthread.php?p=1869094#post1869094
As a test to match WorBrys results, I made two encodes - in both cases I used the same source (crowdrun 1080p50), the same ffmpeg Zeranoe build, and the same ffmpeg settings that WorBry used.
I get consistent results on serial testing. He gets these dodgy inconsistent results. How so ?
WorBry
18th March 2019, 16:05
...it is the x265 KEYFRAMES parameter that creates an encode that will throw off seekmode...
...Nor have I run any tests with x265 encoded with custom key frame and rc-lookahead settings, because I have no interest in doing so
Well, I have now run the tests with x265 CRF28 encoded with custom key frame interval, exactly as per your script:
<ffmpeg path> -i "/path/to/crowd_run_1080p50.y4m" -c:v libx265 -preset slow -crf 28 -pix_fmt yuv420p -x265-params "keyint=100:min-keyint=100:colorprim=1:transfer=1:colormatrix=1" -r 50/1 "/path/to/encode.mp4"
And these are the results I get on serial testing with SSIM:GMSD (Downsample=False):
LWLibavSource:
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
ffms2- Wolfberry (default, i.e. seekmode=1)
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
stop 435.99385609567906 83.19534706841786
No inconsistencies at all. And the same results regardless of whether the ffindex and lwi files from the first test in each series were retained or a fresh index file was generated for each test.
Compare that to your results.
(3) encode w/ keyint, min-keyint but w/o rc-lookahead:
--> wanted to see if it is specifically the rc-lookahead setting that causes the issues w/ seekmode=1
ffmpeg command
<ffmpeg path> -i "/path/to/crowd_run_1080p50.y4m" -c:v libx265 -preset slow -crf 28 -pix_fmt yuv420p -x265-params "keyint=100:min-keyint=100:colorprim=1:transfer=1:colormatrix=1" -r 50/1 "/path/to/encode.mp4"
Lsmash:
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
FFMS2 (Wolfberry) - seekmode=0:
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841786 435.99382191599165
stop 83.19534706841787 435.99382191599165
FFMS2 (Wolfberry) - seekmode=1 (default):
stop 108.17858520287257 307.76386120454794
stop 124.03602892099701 228.48869961962274
stop 83.19534706841786 435.99382191599165
stop 122.13314288222833 236.93921207380862
stop 123.97733586255691 229.28604315863697
Also, as before, I extracted the log file data into LibreOffice Calc to run difference checks on the per-frame scores and I could find zero differences at all between the serial tests and between the LWLibavSource and ffms2 test series. And the frames were listed in correct sequence (0-499) in the log files.
Go figure.
poisondeathray
18th March 2019, 16:48
What hardware were you guys running on ?
It might partially have to do with threads and cores ; The more threads, the more requests, the higher chance of frame mismatches and seek errors if you're not using a robust seek method or indexing like dgsource or seekmode=0, threads=1
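For reference, the conservative FFMS2 settings mentioned above can be written as follows. This is a sketch, assuming the FFMS2 VapourSynth plugin is loaded as `core.ffms2` and that its `Source` function takes the `seekmode` and `threads` arguments used throughout this thread; `open_linear` is a hypothetical helper name.

```python
def open_linear(core, path):
    """Open `path` via FFMS2 with linear access and one decoder thread.

    `core` is expected to be a VapourSynth core with the ffms2 plugin loaded;
    `open_linear` itself is a hypothetical helper, not part of any library.
    """
    # seekmode=0 is FFMS2's linear access mode; threads=1 limits decoding to a
    # single thread. Both trade speed for frame accuracy.
    return core.ffms2.Source(source=path, seekmode=0, threads=1)

# In a real script (assumes VapourSynth and the FFMS2 plugin are installed):
#   import vapoursynth as vs
#   clip = open_linear(vs.core, "/path/to/encode.mp4")
```

Passing the core in as an argument keeps the helper independent of how the script obtains it.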
CK's seektest is like wild random seeks to simulate very bad case; but usually "normal" encoding scenarios , "normal scripts", even "normal" metric testing occurs in linear order .
But small inconsistencies, like 7th decimal place etc... suggest something else wrong. Unless you have duplicate frames in the test sequence (and the mild variation is just lossy encoding differences of duplicate frames), there is something else going on. I would start to look at other things, hardware issues, run memory diagnostics
WorBry
18th March 2019, 17:26
What hardware were you guys running on ?
Actually, I've been running these tests on an older PC (6-core AMD FX-6300 Vishera 3500 Mhz, NVidia GeForce GT730, Win10 x64) that I keep for grunt encoding/processing and linux stuff (dual boot). A bit sluggish with 4K , but it gets there. For the metric tests I've let it run on 6 threads.
But small inconsistencies, like 7th decimal place etc... suggest something else wrong. Unless you have duplicate frames in the test sequence (and the mild variation is just lossy encoding differences of duplicate frames), there is something else going on. I would start to look at other things, hardware issues, run memory diagnostics
There are no duplicate frames in the 10 sec Crowd Run sequence.
poisondeathray
18th March 2019, 17:36
Note the files you guys encoded are going to be slightly different too. Even with the same command line. Because of encoding threads. Unless you guys have the same core and thread count on your hardware, or you explicitly set threads
ChaosKing
18th March 2019, 20:02
I added Decimal() for testing purposes https://github.com/theChaosCoder/zoptilib
The metric values are now much longer, sum should always be the same between runs. Please test.
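Why this can make the sum reproducible: converting a float to `Decimal` is exact (hence the longer values), and with enough context precision the additions are exact as well, so frame order no longer matters. The sketch below assumes per-frame scores are floats between 0 and 1. The precision of 100 digits is an arbitrary generous choice for this illustration, not necessarily what Zoptilib uses, and `decimal_total` is a hypothetical name.

```python
from decimal import Decimal, localcontext
import random


def decimal_total(values, prec=100):
    """Sum floats as Decimals; prec=100 is an assumed, generous precision."""
    with localcontext() as ctx:
        ctx.prec = prec
        total = Decimal(0)
        for v in values:
            total += Decimal(v)  # Decimal(float) converts exactly, hence the long digits
        return total


random.seed(1)
scores = [random.random() for _ in range(500)]  # hypothetical per-frame scores
reordered = sorted(scores)

print(decimal_total(scores) == decimal_total(reordered))
```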
Iron_Mike
18th March 2019, 21:05
Yes, I read your post in great detail. And did you read mine ? I only referred to your first set of data with the x265 CRF26 created with the command line I gave you. "Simply use the provided ffmpeg commands in my post to re-create the problem" indeed - you've got a nerve. Yes I did, and I cannot replicate your inconsistent results - I get the exact same scores every time and a marginally higher SSIM score than yours.
well, if you read the post then you may wanna allude to the point that the keyframes encode setting does not trip up seekmode on your system, b/c clearly as I outlined, the keyframes setting is the only differentiating factor in the encodes I made.
if you followed exactly the commands I posted and used the exact same sw packages I posted and linked to, then it may be a system difference, as poisondeathray pointed to. I asked you pages ago, what OS you're using...
But, point remains:
I can consistently repeat the problem w/ seekmode=1 (and the keyframes encode).
CK, who repeated the procedure from my post, ended up experiencing the exact same issues on his system.
So there is an issue, and clearly this is of interest for others, who may run into the same problems...
btw, this is not a ffmpeg specific version problem, I tested w/ three different ffmpeg versions..
Iron_Mike
18th March 2019, 21:11
Well, I have now run the tests with x265 CRF28 encoded with custom key frame interval, exactly as per your script
did you use the download of crowd_run_1080p50.y4m or did you use your own encode from the 2160p version as ref file ?
Iron_Mike
18th March 2019, 21:13
Do you get the same dodgy results retaining the ffindex and lwi index files from the first test in a series vs creating a fresh index file for each ?
tried both ways, did not make a difference w/ seekmode=1 on the keyframes encoded file...
WorBry
18th March 2019, 21:44
if you followed exactly the commands I posted and used the exact same sw packages I posted and linked ..
Good grief. You mean the sw packages I pointed you to in the first place.
I asked you pages ago, what OS you're using...
Actually, I've been running these tests on an older PC (6-core AMD FX-6300 Vishera 3500 Mhz, NVidia GeForce GT730, Win10 x64) that I keep for grunt encoding/processing and linux stuff (dual boot).
And what OS and hardware are you running the tests on ?
So there is an issue, and clearly this is of interest for others, who may run into the same problems...
Not denying there is an issue, but I'm not seeing it.