
View Full Version : Exploring the motion vector functions


markfilipak
29th April 2021, 01:12
Hello, All,

I'm exploring the motion vector functions to see if I can't improve on the results I've gotten so far from InterFrame. I have some technical questions that I hope someone will graciously help me with.

The bottom line is: I want to convert 24000/1001fps to 60000/1001fps but without any anti-aliasing or other "cosmetic" operations. If I have to use 'pel:1', that's okay. If I have to restrict 'mode' to 'mode:0' and 'algo' to 'algo:2', that's okay. It's just an experiment in order to 'see' what Super(), Analyse(), and Smooth() are actually doing.
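To make the timing of that conversion concrete, here is a minimal Python sketch of where each 60000/1001fps output frame falls between the 24000/1001fps source frames. It only does the arithmetic; it does not call InterFrame or MVTools, and the name `source_position` is made up for illustration.

```python
from fractions import Fraction

SRC_FPS = Fraction(24000, 1001)
DST_FPS = Fraction(60000, 1001)

def source_position(n):
    """Map output frame n to (previous source frame, fractional offset)."""
    pos = n * SRC_FPS / DST_FPS          # exact position in source-frame units
    prev = pos.numerator // pos.denominator
    return prev, pos - prev

for n in range(5):
    prev, t = source_position(n)
    print(f"out {n}: between source {prev} and {prev + 1}, offset {t}")
```

The offsets repeat every 5 output frames (0, 2/5, 4/5, 1/5, 3/5), so only one output frame in five coincides with a source frame; the other four have to be synthesized.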

It occurs to me that other folks may have already devised methods that avoid "smoothing". Do you have a favorite?

zorr
29th April 2021, 21:38
My favourite technique is to use MCompensate to reconstruct the intermediate frame using the previous and next frames. An example of what that looks like can be found here (https://forum.doom9.org/showthread.php?p=1855378#post1855378). MCompensate is an MVTools2 function, I'm not sure if SVPFlow has a similar one.

And you could try a semi-automatic way to find good parameters for your specific video using Zopti (https://forum.doom9.org/showthread.php?t=176076).

markfilipak
1st May 2021, 11:37
My favourite technique is to use MCompensate to reconstruct the intermediate frame using the previous and next frames. An example of what that looks like can be found here (https://forum.doom9.org/showthread.php?p=1855378#post1855378). MCompensate is an MVTools2 function, I'm not sure if SVPFlow has a similar one.
Oh, dear, it does look like I'm going to have to get into AviSynth. (Oh, wait, didn't I read that VapourSynth can now run AviSynth scripts?)
And you could try a semi-automatic way to find good parameters for your specific video using Zopti (https://forum.doom9.org/showthread.php?t=176076).
Zopti looks to be pretty brilliant. Thanks so much!

Well, I've been up all night (again!), and I'm too toasted to deal with AviSynth & Zopti this morning. I'll dive into both in the pseudo-morning (which, to me, will be this afternoon).

zorr
1st May 2021, 12:38
Oh, dear, it does look like I'm going to have to get into AviSynth.

Not really. MVTools is available (https://github.com/dubhater/vapoursynth-mvtools) for VapourSynth as well. And Zopti works with VapourSynth (https://forum.doom9.org/showthread.php?t=176076) too. The VapourSynth version of MVTools has some differences and the Avisynth version has been developed further, but I'd say the differences are very small.

Zopti looks to be pretty brilliant. Thanks so much!

Zopti was actually created for the same problem you're having now so it should be a pretty good tool for the job. Hopefully it doesn't suck up all of your time though, because running those optimizations can take a lot of time... :D But I can help you get started by providing an augmented MVTools script for Zopti. Or if you still prefer SVPFlow, that could be optimized with Zopti too, but I don't have a ready-made script for that one.

markfilipak
1st May 2021, 20:51
Not really. MVTools is available (https://github.com/dubhater/vapoursynth-mvtools)for VapourSynth as well. And Zopti works with VapourSynth (https://forum.doom9.org/showthread.php?t=176076) too. The VapourSynth version of MVTools has some differences and Avisynth version has been developed further but I'd say the differences are very small.

Zopti was actually created for the same problem you're having now so it should be a pretty good tool for the job. Hopefully it doesn't suck all of your time though because running those optimizations can take a lot of time... :D But I can can help you get started by providing an augmented MVTools script for Zopti. Or if you still prefer SVPFlow that could be optimized with Zopti too but I don't have a ready-made script for that one.

Would you help get me started? That would be kind. I'm capable of understanding the motivation of most function arguments because I've read the ITU's H264 specification (at least down to the macroblock level) for example and therefore comprehend the architecture underlying the processes, but I have difficulty putting concepts together because developers tend to not provide full, formal documentation. For example, here: https://github.com/mysteryx93/FrameRateConverter, Etienne Charland documents the work he's done: FrameRateConverter(), InterpolateDoubles(), StripeMask() (which I'm especially interested in), ConvertFpsLimit(), and ConditionalFilterMT(), but he doesn't indicate which of them return a value (or object, like an intermediate 'clip') so implementing his functions is somewhat mysterious to me (and I haven't figured out how to contact Etienne through github).

Thanks, zorr.

Regarding SVPflow, here: https://forum.doom9.org/showthread.php?p=1805342#post1805342, Sharc (who appears to be very experienced) rates InterFrame/SVPflow at the bottom of his/her 4-way ranking. That makes sense because SVP is designed for real-time MV interpolation and may have made (reasonable, necessary) compromises to maximize bit rate.

Regarding Zopti, assuming that I can cull MV function arguments to a subset that's manageable, I presume I'll be able to write a script to iterate on the subset, let it run overnight, and then view the resulting videos in the morning. Over time I may be able to reduce the subset to a few (or maybe a single!) 'best' set of arguments: A single (or small set) of workflows.

Thanks for sharing your insights! (BTW, are you the developer of Zopti?)

PS: It looks like MVTools for VapourSynth is available solely as source code. That leaves me out. Too bad.

poisondeathray
1st May 2021, 22:23
I have difficulty putting concepts together because developers tend to not provide full, formal documentation. For example, here: https://github.com/mysteryx93/FrameRateConverter, Etienne Charland documents the work he's done: FrameRateConverter(), InterpolateDoubles(), StripeMask() (which I'm especially interested in), ConvertFpsLimit(), and ConditionalFilterMT(), but he doesn't indicate which of them return a value (or object, like an intermediate 'clip') so implementing his functions is somewhat mysterious to me (and I haven't figured out how to contact Etienne through github).


Your difficulty probably has more to do with being new to avisynth. The documentation is clear if the reader has some general familiarity with avisynth. The plugin author should not have to document basic concepts, otherwise each plugin would have 364 pages of redundant info


All of them return video such as "clip", but you can assign any name if you like , such as "mask1", or "myclip2", or "originalsource" - if it helps you organize things better

StripeMask() returns a mask. It says it "builds a mask".

There is an assumption made that the user is familiar with basic avisynth, video, masks, and compositing. A mask is a white / black video that defines areas of inclusion and exclusion. It's the same concept as an "alpha channel" in video. If there are no shades of "grey", it's referred to as a "binary mask".
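As an illustration of what "combining with a mask" means, here is a minimal sketch of a per-pixel masked merge on tiny 8-bit greyscale "frames" held as nested lists. The function name `masked_merge` and the sample values are made up for this example; it is not the code any AviSynth plugin uses.

```python
def masked_merge(base, overlay, mask, peak=255):
    """Per-pixel blend: mask 0 keeps base, mask == peak takes overlay."""
    return [
        [(b * (peak - m) + o * m + peak // 2) // peak   # rounded integer blend
         for b, o, m in zip(brow, orow, mrow)]
        for brow, orow, mrow in zip(base, overlay, mask)
    ]

interpolated = [[10, 10, 10], [10, 10, 10]]
blended      = [[200, 200, 200], [200, 200, 200]]
mask         = [[0, 255, 128], [0, 0, 255]]      # black, white, mid grey
print(masked_merge(interpolated, blended, mask))
```

White mask pixels take the overlay, black ones keep the base, and grey ones mix the two, which is why a non-binary mask produces soft transitions.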

InterpolateDoubles() - "Replace double frames with interpolated frames using FrameRateConverter". If duplicates are detected, based on the threshold setting, it replaces the 2nd frame of the pair with an interpolated one. E.g. if you had A,B,B,C,D... and the second B met the threshold value setting for similarity, it would return A,B,(B/C),C,D. Otherwise, it would return the original if no duplicate is detected
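The A,B,B,C,D example can be sketched in a few lines of Python. This is only an outline of the idea, with assumptions: frames are short lists of pixel values, the similarity test is a mean absolute difference, `thr` is a hypothetical threshold, and a plain average of the neighbours stands in for the real motion-interpolated frame.

```python
def mean_abs_diff(a, b):
    """Average per-pixel difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def replace_doubles(frames, thr=1.0):
    """Replace the 2nd frame of each near-duplicate pair with a stand-in
    'interpolation' (here just the average of its two neighbours)."""
    out = list(frames)
    for i in range(1, len(frames) - 1):
        if mean_abs_diff(frames[i - 1], frames[i]) < thr:
            prev, nxt = frames[i - 1], frames[i + 1]
            out[i] = [(p + q) / 2 for p, q in zip(prev, nxt)]
    return out

A, B, C, D = [0, 0], [10, 10], [20, 20], [30, 30]
print(replace_doubles([A, B, B, C, D]))
```

A sequence without duplicates passes through unchanged.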

ConditionalFilterMT() is similar to ConditionalFilter, but optimized for threading. It's like an IF/THEN statement. It analyzes a video "A", and returns video "B" if the specified condition is met, otherwise video "C".




PS: It looks like MVTools for VapourSynth is available solely as source code. That leaves me out. Too bad.

Compiled .dll's for MVTools2/Vapoursynth are in the releases section
https://github.com/dubhater/vapoursynth-mvtools/releases

markfilipak
2nd May 2021, 00:43
Your difficulty probably has more to do with being new to avisynth...

You're such a good 'pusher'.

Can I impose more? ..."No" is an acceptable answer. ...;)
You're so experienced that my questions may be simple to you. I hope so because, as I wrote, I have not found a way to contact Etienne Charland with questions.

Here:
https://raw.githubusercontent.com/mysteryx93/FrameRateConverter/master/FrameRateConverter.avsi
I see 3 functions:
function FrameRateConverter(), and
function InterpolateDoubles(), and
function FRC_GaussianBlur42().
But at the 'parent' page, here:
https://github.com/mysteryx93/FrameRateConverter
I see
FrameRateConverter, and
InterpolateDoubles, and
StripeMask, and
ConvertFpsLimit, and
ConditionalFilterMT.
Do you have any idea (or can guess) what's what?

PS: I successfully got vapoursynth-mvtools-v23-win64.7z. Thanks for the link. I don't know why I just can't figure out github. There are times when "releases" goes to source code, only, or goes to distributables, or goes around in circles.

poisondeathray
2nd May 2021, 01:29
Here:
https://raw.githubusercontent.com/mysteryx93/FrameRateConverter/master/FrameRateConverter.avsi
I see 3 functions:
function FrameRateConverter(), and
function InterpolateDoubles(), and
function FRC_GaussianBlur42().
But at the 'parent' page, here:
https://github.com/mysteryx93/FrameRateConverter
I see
FrameRateConverter, and
InterpolateDoubles, and
StripeMask, and
ConvertFpsLimit, and
ConditionalFilterMT.
Do you have any idea (or can guess) what's what?


Some functions are embedded in the .dll. If you download the zip file from the "releases", FRC consists of a .dll and an .avsi, both are required for it to work

If you want to examine in more detail each function, just look at the source folder at github ("Src" folder for this project)

zorr
2nd May 2021, 02:21
Would you help get me started? That would be kind.

Sure. Just let me know what you'd like to start with:
A) MVTools2 / Avisynth script for Zopti
B) MVTools2 / VapourSynth script for Zopti
C) SVPFlow / VapourSynth script for Zopti
D) FrameRateConverter / AviSynth script for Zopti
E) FrameRateConverter / VapourSynth script for Zopti (yes that one is available for VS as well)

Perhaps starting with FrameRateConverter would not be a bad idea since it has far fewer parameters. It would also be a good baseline if you also want to try the other methods.

I'm capable of understanding the motivation of most function arguments because I've read the ITU's H264 specification (at least down to the macroblock level) for example and therefore comprehend the architecture underlaying the processes, but I have difficulty putting concepts together

While I don't want to discourage you (or anyone) from learning the concepts, it's not strictly necessary if you want to use Zopti. That's the whole premise: let the machine figure out what parameters work best. Basically you just take the script, change your source to whatever video you want to optimize for and start running Zopti.


Regarding SVPflow, here: https://forum.doom9.org/showthread.php?p=1805342#post1805342, Sharc (who appears to be very experienced) rates InterFrame/SVPflow at the bottom of his/her 4-way ranking. That makes sense because SVP is designed for real-time MV interpolation and may have made (reasonable, necessary) compromises to maximize bit rate.

Yes and those tests were most likely run with default values. Those might be targeted towards the real-time use case and not the best possible quality. I also tested InterFrame (based on SVPFlow) back when I investigated MVTools and its competitors and found the quality to be lacking (again using the default values). It would be interesting to compare its maximum quality against the current MVTools2.

Regarding Zopti, assuming that I can cull MV function arguments to a subset that's manageable, I presume I'll be able to write a script to iterate on the subset, let it run overnight, and then view the resulting videos in the morning.

The culling would certainly help, but the idea is to do a heuristic search and try to find the best parameter combination. Obviously you cannot try every combination so there's no guarantee that you'll get the "best" combination or even anywhere close to it, but I can say that the process works well enough to produce useful results even with a huge search space such as MVTools2.
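For readers unfamiliar with heuristic search, the general idea can be sketched as a simple mutate-and-keep-if-better loop. This is not Zopti's code or its actual algorithm; the parameter names only mimic MVTools2-style arguments, and `toy_score` is a made-up stand-in for "render the script and measure the difference from the reference frames".

```python
import random

# Hypothetical search space, not the real MVTools2 parameter list
SPACE = {"blksize": [4, 8, 16, 32], "overlap": [0, 2], "pel": [1, 2, 4]}

def toy_score(p):
    """Stand-in for a quality measurement (lower is better)."""
    return abs(p["blksize"] - 16) + abs(p["overlap"] - 2) + abs(p["pel"] - 2)

def mutate(p, rng):
    """Copy p and re-roll one randomly chosen parameter."""
    q = dict(p)
    key = rng.choice(sorted(SPACE))
    q[key] = rng.choice(SPACE[key])
    return q

def search(iterations=200, seed=0):
    rng = random.Random(seed)
    best = {k: rng.choice(v) for k, v in SPACE.items()}
    best_score = toy_score(best)
    for _ in range(iterations):
        cand = mutate(best, rng)
        score = toy_score(cand)
        if score < best_score:            # keep only improvements
            best, best_score = cand, score
    return best, best_score

print(search())
```

Each evaluation in a real run means rendering frames, which is why the number of iterations, not the search logic, dominates the running time.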

Over time I may be able to reduce the subset to a few (or maybe a single!) 'best' set of arguments: A single (or small set) of workflows.

That's pretty much what I envisioned Zopti could help with. At this time it's not generally known what MVTools2 parameters are good for different content besides some general rules such as "bigger block size for bigger resolution". But I think there should be a small subset of parameter ranges that will work for most videos and once those are figured out it would be a much faster process to run through them with Zopti and pick the best ones.

(BTW, are you the developer of Zopti?)

Yes. All feedback on Zopti is welcome. :)

poisondeathray
2nd May 2021, 05:14
Yes and those tests were most likely run with default values. Those might be targeted towards the real-time use case and not the best possible quality. I also tested InterFrame (based on SVPFlow) back when I investigated MVTools and its competitors and found the quality to be lacking (again using the default values). It would be interesting to compare its maximum quality against the current MVTools2.


zorr did you ever test mvtools2 + zopti with that clip or similar "fence" clips ? (ie. prototypical "picket fence" optical flow "fails")

I think selur took the video down, but here is a mirror in case anyone is interested in testing. The "goal" was a 29.97 => 59.94 2x interpolation
https://www.mediafire.com/file/hrfz3lw8v5i3a7a/forInterpolation.mp4/file

For me, all those 4 choices were "fails" that sharc mentioned. They either blended, or reduced artifacts with blending, or had severe artifacts. I played with a few settings, but there is nothing in mvtools2 that would make it usable in my opinion. The best would be doing nothing.

But I didn't run it through zopti , and this might be a good time to familiarize myself with it - if it can "solve" this clip or similar ones

I think downscaling to 1920x1080 for testing would be a good idea, as the UHD clip is soft and low quality for UHD ( plus it would speed up testing runs)

Or if Mark wants to upload a sample of his problem scene, maybe it could be tested too.

markfilipak
2nd May 2021, 13:59
zorr did you ever test mvtools2 + zopti with that clip or similar "fence" clips ? (ie. prototypical "picket fence" optical flow "fails")

I think selur took the video down, but here is a mirror in case anyone is interested in testing. The "goal" was a 29.97 => 59.94 2x interpolation
https://www.mediafire.com/file/hrfz3lw8v5i3a7a/forInterpolation.mp4/file

For me, all those 4 choices were "fails" that sharc mentioned. They either blended, or reduced artifacts with blending, or had severe artifacts. I played with a few settings, but there is nothing in mvtools2 that would make it usable in my opinion. The best would be doing nothing.

Did you try StripeMask (https://github.com/mysteryx93/FrameRateConverter#stripemask)?

But I didn't run it through zopti , and this might be a good time to familiarize myself with it - if it can "solve" this clip or similar ones

I think downscaling to 1920x1080 for testing would be a good idea, as the UHD clip is soft and low quality for UHD ( plus it would speed up testing runs)

Or if Mark wants to upload a sample of his problem scene, maybe it could be tested too.

'halos_ghosts_judder.23fps[23pps].59fps[59pps].mkv', the source after InterFrame.
https://i.ibb.co/2FF087z/halos-ghosts-judder.jpg
Left: 'Halo' around tracking object --------------------- Right: 'Ghost' in stationary "picket fence"
There's a 3rd part (not shown) that is a judder test.

The source, 'halos_ghosts_judder.23fps[23pps].mkv', is here: https://www.mediafire.com/file/s01cxnv4ll8ngox/halos_ghosts_judder.23fps%255B23pps%255D.mkv/file.
It's 7 seconds, 1920x1080, 24000/1001fps, 3,635,214 bytes.

The problem: Increasing InterFrame, OverrideArea to 300 minimizes the 'halo' and eliminates the 'ghost' but it creates a 12Hz (i.e. 5 frame) judder (seen in the 3rd part). The judder was not the usual judder. Instead of being caused by uneven frame rate cadence, this judder is caused by unstable/inconsistent blending. That's why I also seek a way to eliminate all blending and anti-aliasing so that I can see what's actually going on.

poisondeathray
2nd May 2021, 16:42
Did you try StripeMask (https://github.com/mysteryx93/FrameRateConverter#stripemask)?




Yes - that function builds a mask based on "stripe" detection - but what are you going to do with it ? You need to combine the mask with "something".

FRC uses stripemask as part of its masking function when output mode is set to "auto" (default). It's a combination of blended frames with the artifact mask and the stripe mask - so the artifacts are less severe.

When stp=true (default) for FRC, it uses the stripemask. The inclusion areas detected in the stripe mask are composited with convertfps frames (ie. blended frames)

You can visualize the masks it uses by setting the output mode to "over" - areas in yellow are the stripemask , areas in cyan are the artifact masks.

That's its "claim to fame" - the detection of artifact interpreted areas, auto masking and blending of artifacts. It might be more pleasing to watch, but for my purposes - that's still a "fail". On something like a fence pan, the entire fence is essentially blended. I'm seeking cleaner "auto" interpolation (without resorting to other programs), and maybe zopti is the ticket

poisondeathray
2nd May 2021, 18:48
The problem: Increasing InterFrame, OverrideArea to 300 minimizes the 'halo' and eliminates the 'ghost' but it creates a 12Hz (i.e. 5 frame) judder (seen in the 3rd part). The judder was not the usual judder. Instead of being caused by uneven frame rate cadence, this judder is caused by unstable/inconsistent blending. That's why I also seek a way to eliminate all blending and anti-aliasing so that I can see what's actually going on.


One option is to use different settings for different scenes. Apply different filters or settings. ie. Trim and Splice . Default settings are ok for the 3rd scene in vapoursynth interframe in terms of judder - and they are fairly clean interpolations (minor frame edge artifacts) , not overly blended.

The scene change is blended, but there are other options

For me, the 1st and 2nd scene would still be unusable with OverrideArea=300 . Too many artifacts


RE scene changes -

what do you do at scene changes when the frame count for cycle cadence is interrupted ? ie. the scene doesn't fall on 10 frame boundaries in your 23.976 => 59.94 case

You don't have the full set of motion vectors. The end frame before the next scene does not have a forward vector. The start frame of next scene does not have a backward vector

In general the options are 1) place duplicates, 2) a blended scene change, or 3) duplicate, then replace the 1st duplicate with another inbetween frame (so it's essentially a deceleration move). There are pros/cons to each, but right at the scene change, casual viewers are generally less sensitive to a duplicate or blend
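Option 1 (duplicates at the cut) can be sketched as a per-frame plan for the 23.976 => 59.94 case. This is only an illustration: `plan_frame` and `scene_starts` are hypothetical names, and the scene-change positions are assumed to be known already rather than detected.

```python
from fractions import Fraction

RATIO = Fraction(2, 5)   # source frames advanced per output frame (24000/60000)

def plan_frame(n, scene_starts):
    """Return ('copy', src) or ('interp', prev, nxt, t) for output frame n.

    scene_starts: set of source frame numbers that begin a new scene.
    """
    pos = n * RATIO
    prev = pos.numerator // pos.denominator
    t = pos - prev
    if t == 0:
        return ("copy", prev)             # lands exactly on a source frame
    if prev + 1 in scene_starts:          # no usable vectors across the cut
        return ("copy", prev if t < Fraction(1, 2) else prev + 1)
    return ("interp", prev, prev + 1, t)

for n in range(5, 11):
    print(n, plan_frame(n, scene_starts={3}))
```

With a cut at source frame 3, the two output frames that would straddle it become copies of frames 2 and 3, and normal interpolation resumes on the far side.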

zorr
2nd May 2021, 21:36
zorr did you ever test mvtools2 + zopti with that clip or similar "fence" clips ? (ie. prototypical "picket fence" optical flow "fails")

No, didn't try that one but I did try one which had a similar problem area. I didn't get very far with that one (had something else to do). I think those would benefit from the MCompensate trick to make the errors less horrible looking. :)

I think selur took the video down, but here is a mirror...

I took a glance and that one really looks like the worst case - the fence looks like it's moving into opposite direction than the rest of the scene. There's no way MVTools algorithms can figure out that it should actually move the other way. Perhaps the only chance is to use the DePan filter to detect the global motion and generate the fence part and combine it using the Stripe mask. Maybe.

But I didn't run it through zopti , and this might be a good time to familiarize myself with it - if it can "solve" this clip or similar ones

Perhaps Zopti can find settings that are the most pleasing but I doubt that the fence can be completely fixed without some additional logic.

I think downscaling to 1920x1080 for testing would be a good idea, as the UHD clip is soft and low quality for UHD ( plus it would speed up testing runs)

Yes, it's always a good idea to think of ways to make the runs faster. I have also used a cropped portion of a short segment, typically 10 frames in the beginning. If/when good parameters are found they can be adjusted further by running longer segments and full frames.

Or if Mark wants to upload a sample of his problem scene, maybe it could be tested too.

Mark's sample looks much easier to handle. I will prepare some scripts and see how it goes.

markfilipak
4th May 2021, 02:08
... (BTW, are you the developer of Zopti?)
... Yes. All feedback on Zopti is welcome. :)

May I send you a personal message (PM)? ...no strings attached.

feisty2
4th May 2021, 06:04
No, didn't try that one but I did try one which had a similar problem area. I didn't get very far with that one (had something else to do). I think those would benefit from the MCompensate trick to make the errors less horrible looking. :)




if you like the result of MCompensate, have you tried MBlockFPS (with overlap > 0)? It doesn't have the "liquefying" artifacts like those Flow functions.
BlockFPS has been overlooked for many years because it wasn't very useful in the original avisynth MVTools (cannot handle overlapped blocks), but that restriction has been lifted from the VS MVTools since forever.

poisondeathray
4th May 2021, 07:12
No, didn't try that one but I did try one which had a similar problem area. I didn't get very far with that one (had something else to do). I think those would benefit from the MCompensate trick to make the errors less horrible looking. :)

I took a glance and that one really looks like the worst case - the fence looks like it's moving into opposite direction than the rest of the scene. There's no way MVTools algorithms can figure out that it should actually move the other way. Perhaps the only chance is to use the DePan filter to detect the global motion and generate the fence part and combine it using the Stripe mask. Maybe.



mcompensate is used in jm_fps and a number of similar related interpolation functions - it really doesn't help with that one. It does help with mark's clip: the 2nd scene's "paper window" (the right side of the frame) is basically clean with default jm_fps settings. But the wooden background on the left is not - the problem is object boundary demarcation vs. background.

Yes that fence clip is very difficult - but it is possible in other programs with some motion tracking to help guide the motion estimation and mattes. It just would be nice to have a cleaner starting point - less work to do in other programs - that's why I'm asking about zopti. Maybe there is some mvtools2 setting I'm missing (I've tested quite a few)


Speaking of cleaner starting points - the newer DNN methods tend to produce cleaner object edges than mvtools2. Especially DAIN and RIFE. DAIN is very slow, but RIFE is usable with a decent GPU. Both run on python; it would be nice to see a vapoursynth version. They still "fail" on things like "picket fences" and those situations that all optical flow fails on - but there are 3 areas where they generally do better than mvtools2: 1) cleaner object edges, 2) y-axis rotational movements (such as a spin), 3) organic movements like flag waving, water. That's with the default publicly distributed models; you could probably train custom models for specific things like "picket fence"

Besides the slower speed, another downside is DAIN/RIFE are 2x, 4x... interpolators. But I've found if you use RIFE or DAIN followed by mvtools2, the results are significantly better than just mvtools2 alone , when you have a situation where you need to interpolate a non pow2 rate



if you like the result of MCompensate, have you tried MBlockFPS (with overlap > 0)? It doesn't have the "liquefying" artifacts like those Flow functions.
BlockFPS has been overlooked for many years because it wasn't very useful in the original avisynth MVTools (cannot handle overlapped blocks), but that restriction has been lifted from the VS MVTools since forever.


I didn't know vpy version of mblockfps was different. I have tried smoothfps/smoothfps2 and several similar variants that all use mblockfps with mrecalculate , instead of mflowfps, but I've always used them in avisynth . I'll play with the vpy version and report back if there are interesting findings

poisondeathray
4th May 2021, 07:25
I didn't know vpy version of mblockfps was different. I have tried smoothfps/smoothfps2 and several similar variants that all use mblockfps with mrecalculate , instead of mflowfps, but I've always used them in avisynth . I'll play with the vpy version and report back if there are interesting findings

Not much difference on several different test clips - the object boundary artifacts are still there, but less gloopy, more "blocky." When coupled with a "picket fence" type background such as the examples linked in this thread - "blocky" might even be worse

The fundamental problem is the inability to separate FG from BG (or other intermediate object "layers" cleanly.) Object separation is one area where DAIN/RIFE/ Nvidia Super SloMo tend to perform slightly better than mvtools2 (or other traditional methods)

zorr
4th May 2021, 10:05
May I send you a personal message (PM)? ...no strings attached.

Sure go ahead.

feisty2
4th May 2021, 15:51
Not much difference on several different test clips - the object boundary artifacts are still there, but less gloopy, more "blocky." When coupled with a "picket fence" type background such as the examples linked in this thread - "blocky" might even be worse


there's a "ghosting" vs "distortion" tradeoff for BlockFPS, you could try increasing the block size, something like blksize=32 (with no additional MRecalculate), if you're seeing too much distortion.

poisondeathray
4th May 2021, 19:04
there's a "ghosting" vs "distortion" tradeoff for BlockFPS, you could try increasing the block size, something like blksize=32 (with no additional MRecalculate), if you're seeing too much distortion.

Yes, just trading off artifacts. I've tried many settings and combinations with/without mrecalculate and varying blocksizes, and other settings. Not much difference between vpy and avs version with mblockfps - the main problem in these problem scenario clips is the FG/BG occlusions , object boundaries, "halo" artifacts. "mblock" vs. "mflow" won't help with that. Maybe some better analyze, or additional filtering or masks might. Maybe some magical zopti config could improve it, I haven't played with zopti much yet

Occlusions/FG/BG layers - those are one of the common failings with mvtools2 (and many other optical flow approaches) - I've posted about this before in other threads in more detail.

Masking and using blends like FRC's approach might make it less jarring when watching normally, but blended rings/halos and textures don't look that great either.

My "go to" for some of the problem scenarios (especially the 3 listed in the previous post) is RIFE, with some clean up in other programs. The edge quality alone is worth it. Even if it "fails" in other areas, when you combine/composite approaches, you can get a usable solve with much less work.

Here are some apng demos highlighting edge quality of occlusions, FG objects on Mark's clip. I've brightened up the 2nd one. RIFE used jm_fps to bring it from 47.952=>59.94 . So the RIFE result is still "exposed" to mvtools2, but since the object and "frame samples" are closer at 47.952, the 2nd mvtools2 interpolation is cleaner, than if it had to do the full 23.976=>59.94. The clip Mark uploaded was a lower quality re-encode, the edges should be cleaner if the original video was used
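The arithmetic behind the two-stage approach can be checked with a short sketch. It only compares frame timing for the direct 23.976=>59.94 path against the 47.952=>59.94 second stage; the helper names `offsets` and `gap_ms` are made up, and nothing here calls RIFE or mvtools2.

```python
from fractions import Fraction

def offsets(src_fps, dst_fps):
    """Distinct fractional positions (in source-frame units) of output frames."""
    step = src_fps / dst_fps
    cycle = step.denominator               # pattern repeats after this many outputs
    return sorted({(n * step) % 1 for n in range(cycle)})

def gap_ms(fps):
    """Time between consecutive input frames, in milliseconds."""
    return float(1000 / fps)

direct = Fraction(24000, 1001)
doubled = 2 * direct                        # after a 2x interpolator
target = Fraction(60000, 1001)

print(offsets(direct, target), gap_ms(direct))
print(offsets(doubled, target), gap_ms(doubled))
```

Both paths need the same set of in-between positions (multiples of 1/5), but in the second stage the two input frames are half as far apart in time, so each interpolation has to bridge half the motion.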

The apng's in the zip archive should animate in most browsers, just open up in a tab
https://www.mediafire.com/file/55znmxsvl8nn6eb/oflow_comparison_apng.zip/file

Another "negative" to DAIN/RIFE is that the built in scene detection is quite poor compared to mvtools2 based scripts. It would be nice to have a native RIFE vapoursynth implementation with built in scene detection

zorr
4th May 2021, 23:20
if you like the result of MCompensate, have you tried MBlockFPS (with overlap > 0)?

I was just yesterday looking at the latest MVTools docs and wondered why I haven't tested it. Perhaps I overlooked it because I was looking for a block version of MFlowInter back when I started testing MVTools in 2018.

BlockFPS has been overlooked for many years because it wasn't very useful in the original avisynth MVTools (cannot handle overlapped blocks), but that restriction has been lifted from the VS MVTools since forever.

Yes, I definitely have to try it. Although now that I saw poisondeathray's example of BlockFPS it looks like it won't give the same caliber of improvement as the MCompensate trick does. But we'll find out. Also I didn't see any mention of overlapping restrictions on the Avisynth version; is the VS version the only one that handles overlapping?

mcompensate is used in jm_fps and a number of similar related interpolation functions

I think you mean MRecalculate. jm_fps doesn't have MCompensate, at least the version I have.

the problem is object boundary demarcation vs. background .

Yes that's probably the most common source of artifacts. I have pondered that if we could separate the foreground and background with a mask then the motion interpolation could be applied to them separately. It should be possible to detect the foreground vs background, after all we have the motion vectors and they point to different direction in the foreground than on the background. The problem is their effective resolution is the block size (like 16x16) so they cannot alone be used to separate an accurate foreground mask. I tested an approach where the SAD map from MMask is used to detect the boundary blocks and then combined with an edge detector but that doesn't give very clean results either.
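The resolution limit is easy to see in a toy example. The sketch below marks blocks whose vector differs from the median vector (assumed here to be the background motion) and expands that to pixel resolution. The vector grid, the threshold `thr`, and both function names are invented for illustration; this is not the MMask-based approach mentioned above.

```python
from statistics import median

def foreground_blocks(vectors, thr=2):
    """Mark blocks whose vector differs from the median (assumed background)."""
    bg_x = median(vx for row in vectors for vx, _ in row)
    bg_y = median(vy for row in vectors for _, vy in row)
    return [[abs(vx - bg_x) + abs(vy - bg_y) > thr for vx, vy in row]
            for row in vectors]

def upsample(block_mask, blksize):
    """Expand a per-block mask to per-pixel resolution (nearest neighbour)."""
    return [[cell for cell in row for _ in range(blksize)]
            for row in block_mask for _ in range(blksize)]

# 2x3 grid of block vectors: background pans by (4, 0), one block moves by (-3, 0)
vectors = [[(4, 0), (4, 0), (4, 0)],
           [(4, 0), (-3, 0), (4, 0)]]
mask = upsample(foreground_blocks(vectors), blksize=4)
for row in mask:
    print("".join("#" if m else "." for m in row))
```

The printed mask can only ever be a union of whole blocks, so an object edge that cuts through a block is either fully included or fully excluded.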

the newer DNN methods tend to produce cleaner object edges than mvtools2 . Especially DAIN and RIFE. DAIN is very slow , but RIFE is usable with a decent GPU. Both run on python, it would be nice to see a vapoursynth version.

I agree! I haven't tested those myself but it seems likely that the best quality will use some kind of neural network, they are getting better. They still may not work for every kind of material as it depends a lot on what they have been trained on.

you could probably train custom models for specific things "picket fence"

Does the released code already support custom training? How long would it take to train them? The examples you posted looked very good.

poisondeathray
5th May 2021, 01:58
I think you mean MRecalculate. jm_fps doesn't have MCompensate, at least the version I have.


Yes, my bad

Lets see if any combinations can help in some of these tricky situations. But I have a feeling that no settings will help much for the occlusions, moving object boundaries on a "picket fence" type background scenario



I agree! I haven't tested those myself but it seems likely that the best quality will use some kind of neural network, they are getting better. They still may not work for every kind of material as it depends a lot on what they have been trained on.


Definitely - and for general use I would say mvtools2 is more balanced, feature full and tweakable. But RIFE edge quality is slightly better than mvtools2 overall, especially in those occlusion scenarios where objects pass over one another. I can post other "non picket fence" examples too, but the 2 videos in this thread share similar characteristics.

eg. RIFE "failed" in the 2nd scene on the right side of the frame, where jm_fps breezed through without much trouble. You have to combine results to get optimal results. I don't expect mvtools2 to work best with 1 set of settings on different types of characteristics either, or denoising with 1 set of settings on different types of noise across different scenes - so it shouldn't be surprising.

Pros/cons - RIFE/DAIN has those other major downsides mentioned earlier too (slow, 2x multiples, limited "tweakable" settings, scenechanges not as clean, harder to use in a sense - although there are dedicated GUI's for RIFE/DAIN - so for general public they are actually easier to use than avisynth or vapoursynth)



Does the released code already support custom training? How long would it take to train them? The examples you posted looked very good.

Yes RIFE has training code, but people have had limited success with different data sets. I haven't tried it. You can look at the issues tracker - the author is quite active (and has posted updated models a few times over the last year). Maybe something was lost in translation but he posted in one of the threads that the data set used has limited value - I don't quite understand that rationale.

DAIN - because it's sooo slow I would avoid if possible. But there are some situations and frames where it does slightly better (or worse) than RIFE.

poisondeathray
5th May 2021, 02:59
Here is the apng treatment with part of the fence clip, jm_fps vs. RIFE. (Same deal, open in a browser tab)
https://www.mediafire.com/file/2cwoe8ycnotr4c3/fence_jm_vs_rife_apng.zip/file

Clearly the RIFE foreground object boundaries are cleaner with significantly reduced "ring" of artifacts.

Both have problems on other parts of the fence (not shown in the apng) - but RIFE a better starting point in this situation because there is less cleanup to do in other programs.

zorr
7th May 2021, 01:27
I have some results to report. But first, let's go briefly over the setup. I decided to use the AviSynth version of MVTools2 as I think it might be the definitive version, created by master pinterf himself and thoroughly tested by myself. I will not include the script I used with Zopti here, but I can link to it if anyone is interested. The script uses MFlowFPS & Co to create intermediate frames that in the end can be compared to the original frames of the source. The basic idea is: (I may not have the charting skills of Mark but I can try...)

 0         1         2         3          time (frames)
----------------------------------------------------------------------
[A ]      [B ]      [C ]      [D ]        original frames
[A ] [AB] [B ] [BC] [C ] [CD] [D ]        1) interpolate new frames (MFlowFPS)
     [AB]      [BC]      [CD]             2) remove original frames (SelectOdd)
     [AB] [BB] [BC] [CC] [CD]             3) interpolate new frames again (MFlowFPS)
          [BB]      [CC]                  4) leave only latest interpolated frames (SelectOdd)

          [B ]      [C ]                  5) compare original and interpolated frames (GMSD)
          [BB]      [CC]
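
The double-interpolation scheme in the chart can be checked with a small Python sketch. It is not the Zopti script: frames are represented only by their time position, and `interpolate` and `select_odd` are hypothetical stand-ins for a frame-doubling MFlowFPS call and SelectOdd.

```python
def interpolate(times):
    """Insert a midpoint between each pair of neighbouring frame times
    (stand-in for doubling the frame rate with MFlowFPS)."""
    out = []
    for a, b in zip(times, times[1:]):
        out += [a, (a + b) / 2]
    out.append(times[-1])
    return out


def select_odd(frames):
    """Keep frames 1, 3, 5, ... (stand-in for SelectOdd)."""
    return frames[1::2]


original = [0.0, 1.0, 2.0, 3.0]      # frames A, B, C, D
step1 = interpolate(original)        # A AB B BC C CD D
step2 = select_odd(step1)            # AB BC CD
step3 = interpolate(step2)           # AB BB BC CC CD
step4 = select_odd(step3)            # BB CC

print(step4)                         # [1.0, 2.0]
```

The twice-interpolated frames land exactly on the time positions of the original frames B and C, which is what makes a direct comparison against the source possible.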

I focused on the difficult part of the scene (the guy moving in front of background) so I cropped it and selected a smaller frame range. If this part is OK, the rest of the frame will be too.

I wanted to get something running quickly so I used an old template I had which doesn't have all the parameters of MAnalyse. Notably levels, pzero, pglobal, dct and scaleCSAD are missing. I will add those in the coming runs. I used the GMSD metric as I know from experience that it's more sensitive than SSIM. MDSI might be good also but GMSD is a bit more convenient to use as it doesn't need RGB clips.
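
For readers unfamiliar with GMSD, here is a rough pure-Python sketch of the idea (gradient magnitude similarity, then its standard deviation). It is a simplification and not the implementation used in the tests: it skips the downsampling step of the published metric, only looks at interior pixels of one plane, and the constant `c=170` is the value commonly quoted for 8-bit data, taken here as an assumption.

```python
import math


def gradient_magnitude(img):
    """Prewitt gradient magnitude for the interior pixels of a 2D list."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            gx = sum(img[y + d][x + 1] - img[y + d][x - 1] for d in (-1, 0, 1)) / 3
            gy = sum(img[y + 1][x + d] - img[y - 1][x + d] for d in (-1, 0, 1)) / 3
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def gmsd(ref, dist, c=170.0):
    """Standard deviation of the per-pixel gradient magnitude similarity.
    0 means identical gradients; larger means less similar."""
    mr, md = gradient_magnitude(ref), gradient_magnitude(dist)
    gms = [(2 * a * b + c) / (a * a + b * b + c)
           for row_r, row_d in zip(mr, md)
           for a, b in zip(row_r, row_d)]
    mean = sum(gms) / len(gms)
    return math.sqrt(sum((g - mean) ** 2 for g in gms) / len(gms))


# Hypothetical 6x6 test images: a sharp vertical edge and a softened one.
sharp_edge = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
soft_edge = [[0, 0, 128, 255, 255, 255] for _ in range(6)]

print(gmsd(sharp_edge, sharp_edge))   # 0.0
print(gmsd(sharp_edge, soft_edge) > 0)
```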

I started with five relatively short runs (10 000 iterations each) to see how consistent the results are. If they are it's a good sign that the search is long enough. That's not the case here though.

10 000 iterations

              run 1       run 2       run 3       run 4       run 5
GMSD          3.0752501   2.995727    2.922894    2.9211512   2.9990165

MSuper
pel           4           4           4           4           4
sharp         2           1           1           1           2
rfilter       2           2           4           4           1

MAnalyse
blksize       24          8           16          16          16
search        3           3           5           3           5
searchparam   6           1           4           5           3
pelsearch     9           9           9           8           13
lambda        11 (99)     20 (20)     10 (40)     6 (24)      22 (88)
lsad          7847        10006       5789        6936        10576
pnew          251         180         247         240         254
plevel        2           0           0           2           0
overlap       12          2           0           0           6
overlapv      0           4           4           0           4
divide        2           0           2           2           0
global        true        true        true        true        true
badSAD        8779        1938        1445        8315        6540
badrange      22          7           10          33          11
meander       true        true        true        true        true
temporal      true        true        false       true        true
trymany       false       false       false       false       false

MFlowFPS
ml            209         87          50          84          19
mask          2           2           2           2           2

With GMSD smaller values mean more similar frames. The best result was 2.9211512 on run 4. There's quite a bit of variation both in GMSD and the chosen parameter values. But it looks like pel, global, meander, trymany and mask are the same in every run, so at least they agree on something. The lambda row has two values for each run: the first one is what Zopti returned and the second value is what was actually input to MAnalyse, as it was scaled by blksize*blksize/64.
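
The lambda scaling can be reproduced with a few lines of Python. The function name `effective_lambda` is made up for this sketch, and the use of integer division is an assumption (all values in the table happen to divide evenly).

```python
def effective_lambda(zopti_lambda, blksize):
    """Scale the optimizer's lambda by blksize*blksize/64 before it goes to MAnalyse."""
    return zopti_lambda * blksize * blksize // 64


# (lambda returned by Zopti, blksize) for runs 1..5 of the 10 000 iteration table
runs = [(11, 24), (20, 8), (10, 16), (6, 16), (22, 16)]

print([effective_lambda(lam, blk) for lam, blk in runs])   # [99, 20, 40, 24, 88]
```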

The next phase was running three longer optimizations, 100k iterations each.

100 000 iterations

              run 1         run 2         run 3
GMSD          2.843625      2.8578067     2.9011016

MSuper
pel           4             4             4
sharp         2             1             1
rfilter       1             2             4

MAnalyse
blksize       16            16            16
search        4             4             5
searchparam   2             2             4
pelsearch     14            10            6
lambda        3 (12)        2 (8)         284 (1136)
lsad          10249         15931         10
pnew          97            101           230
plevel        0             1             1
overlap       0             0             0
overlapv      0             0             0
divide        2             2             2
global        true          true          true
badSAD        79            842           5749
badrange      5             7             18
meander       true          true          true
temporal      true          true          false
trymany       false         false         false

MFlowFPS
ml            56            48            21
mask          2             2             2

Best result is now 2.843625. The second run got almost as good a result but the third one was much further behind. This just shows that good results are not guaranteed even with this many iterations. The results now also agree on blksize, overlap, overlapv and divide. Also search, searchparam and temporal are equal in the best two results. It could also be a fluke as we only have three runs to compare. There are also other values in close proximity within runs 1 and 2: lambda, pnew and badrange.
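
Checking which parameters agree across runs is easy to automate. The sketch below uses a subset of the values from the 100 000 iteration table; `agreed_parameters` is a hypothetical helper, not part of Zopti.

```python
def agreed_parameters(runs):
    """Return the parameters that have the same value in every run."""
    first, *rest = runs
    return {name: value for name, value in first.items()
            if all(run[name] == value for run in rest)}


# Subset of the 100 000 iteration results (run 1, run 2, run 3).
run1 = {"pel": 4, "sharp": 2, "rfilter": 1, "blksize": 16, "search": 4,
        "searchparam": 2, "overlap": 0, "overlapv": 0, "divide": 2, "temporal": True}
run2 = {"pel": 4, "sharp": 1, "rfilter": 2, "blksize": 16, "search": 4,
        "searchparam": 2, "overlap": 0, "overlapv": 0, "divide": 2, "temporal": True}
run3 = {"pel": 4, "sharp": 1, "rfilter": 4, "blksize": 16, "search": 5,
        "searchparam": 4, "overlap": 0, "overlapv": 0, "divide": 2, "temporal": False}

all_three = agreed_parameters([run1, run2, run3])
best_two = agreed_parameters([run1, run2])

print(sorted(all_three))                    # agreed in all runs
print(sorted(set(best_two) - set(all_three)))   # additionally agreed in runs 1 and 2
```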

This is what the best result (so far) looks like (note that this is only showing the interpolated frames)

https://i.postimg.cc/W3xtqNJt/halos-ghosts-judder-PARETO-01-2-843625-2160.gif

As you can see it cannot compete with RIFE but at least it's better than the plain jm_fps result (I assume that's what was used) poisondeathray showed. Also it's clear to me that it's not even possible to get a nice foreground/background separation by swizzling 16x16 blocks around. This clearly needs MRecalculate to use smaller blocks.

I was curious about the chosen blksize 16 and why wouldn't for example 8x8 give better results. Perhaps the smaller blocksize cannot track the background coherently even though the object boundaries would look better. So I started some short runs (10 000 iterations again) to see what 8x8 blocks look like.

10 000 iterations

              run 1       run 2       run 3       run 4       run 5
GMSD          2.9374497   2.9236088   2.9058511   2.8954031   2.9246368

MSuper
pel           4           4           4           4           4
sharp         2           2           2           2           2
rfilter       3           3           2           2           1

MAnalyse
blksize       8 (LOCKED)
search        3           4           2           4           5
searchparam   1           1           1           1           1
pelsearch     2           27          4           9           12
lambda        0           17          8           6           15
lsad          2313        7154        15093       10296       1975
pnew          248         242         250         244         186
plevel        1           0           1           0           0
overlap       2           2           2           2           0
overlapv      2           2           2           2           0
divide        0           0           0           0           2
global        false       false       false       false       true
badSAD        3171        3176        3149        3049        1967
badrange      12          21          13          15          10
meander       true        true        true        true        true
temporal      true        false       true        true        true
trymany       false       false       false       false       false

MFlowFPS
ml            253         106         183         84          115
mask          2           2           2           2           2


Curiously the results on average are better than the first 5 runs. It could be for two reasons: the search space is now smaller which helps Zopti find the good combinations faster OR the 8x8 blocks really are better but they are very hard to find. I will do 100k iterations next to find out.

It occurred to me that it's not that difficult to improve the 16x16 block output. The idea is to run the same interpolation multiple times with different x,y offsets within the 16x16 block and combine the results by taking the median value of each pixel. For example this is what it looks like with 9 samples (offsets 0, 6, 12 in each axis):

https://i.postimg.cc/5ycNNsZC/overlapping-MFlow-FPS-PARETO-01-2-843625-2160.gif

This is obviously 9 times slower but is another nice trick we can use. My original idea was to use the SAD values of MMask to select only the blocks with best vectors but that turned out to be too blocky.
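
The combining step of the multi-offset trick can be sketched in Python as a per-pixel median. Only the median part is shown; the nine `samples` are hypothetical tiny frames standing in for nine interpolation results that have already been shifted back into alignment.

```python
from statistics import median


def median_combine(frames):
    """Per-pixel median over a list of equally sized 2D frames."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(w)]
            for y in range(h)]


# Offsets 0, 6, 12 in each axis give 9 interpolation passes.
offsets = [(dx, dy) for dx in (0, 6, 12) for dy in (0, 6, 12)]

# Hypothetical results: 7 passes agree, 2 passes have a block artifact
# (value 255) in different pixels.
clean = [[100, 100], [100, 100]]
artifact_a = [[255, 100], [100, 100]]
artifact_b = [[100, 100], [100, 255]]
samples = [clean] * 7 + [artifact_a, artifact_b]

combined = median_combine(samples)
print(len(offsets), combined)
```

As long as an artifact shows up in fewer than half of the passes at a given pixel, the median discards it, which is the reason for preferring it over a plain average here.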

zorr
25th May 2021, 01:27
Time for a little status update. I continued with the longer (100000 iterations) 8x8 block tests. The results were:

100 000 iterations
              run 1       run 2       run 3
GMSD          2.8606837   2.8584862   2.8553133

MSuper
pel           4           4           4
sharp         2           1           1
rfilter       1           1           1

MAnalyse
blksize       8 (LOCKED)
search        4           1           1
searchparam   1           1           1
pelsearch     7           2           2
lambda        7           6           3
lsad          18930       932         2084
pnew          241         249         249
plevel        1           0           1
overlap       2           2           2
overlapv      2           2           2
divide        0           0           0
global        false       false       false
badSAD        3234        3167        3170
badrange      13          12          12
meander       true        true        true
temporal      true        true        true
trymany       false       false       false

MFlowFPS
ml            40          41          41
mask          2           2           2

So unfortunately 8x8 was not better than 16x16, the best 8x8 result 2.8553133 did not beat the champion 16x16 result 2.843625.

I was a bit puzzled as to why the optimal searchparam value (which controls the search radius) is very low, just 1 in the 8x8 case and 2 for the 16x16 case. I would have expected that larger radius cannot hurt the results but it looks like it can, and a lot!

I did some exhaustive searches around the optimal values just changing searchparam and pelsearch and drew some heat maps for each search algorithm. Results below (note that searchparam is named searchRange and pelsearch is named searchRangeFinest, it controls the search radius at the finest level):

https://i.postimg.cc/8PvMwqXp/algorithm-comparison-part-1.png
https://i.postimg.cc/TwhWbSW-X/algorithm-comparison-part-2.png

The brighter the color, the better the GMSD score. The best value is highlighted by a red dot (or dots if there are multiple best results). The decimal numbers above the maps are the worst and the best found GMSD score.

This confirms that there is no better alternative to these low search radii. This is my guess as to why a larger radius doesn't work very well here: the background has a very repetitive pattern and that means there can be very good SAD scores at far away locations. So with a large search range some blocks will get the best score at those far away locations and some will find the closest "correct" location, and the mixing up of these far/close vectors looks ugly. The following gif demonstrates what it looks like with search ranges 2 (optimal), 4 and 8:

https://i.postimg.cc/jSQ358fX/search-Range-effect.gif
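
The guess about repetitive patterns can be illustrated with a one-dimensional toy example in Python. This is not how MAnalyse searches; it is a brute-force SAD scan over a hypothetical "picket fence" row with a period of 6 pixels that moves 1 pixel between frames.

```python
def sad(a, b):
    """Sum of absolute differences between two equally long pixel runs."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_offsets(prev, cur, pos, size, radius):
    """All offsets within +-radius whose SAD ties for the minimum."""
    block = cur[pos:pos + size]
    scores = {d: sad(block, prev[pos + d:pos + d + size])
              for d in range(-radius, radius + 1)}
    best = min(scores.values())
    return [d for d, s in scores.items() if s == best]


fence = [255, 255, 0, 0, 0, 0]            # one period of the pattern
prev = fence * 10                          # previous frame (one row)
cur = prev[-1:] + prev[:-1]                # same row moved right by 1 pixel

print(best_offsets(prev, cur, pos=24, size=8, radius=2))   # [-1]
print(best_offsets(prev, cur, pos=24, size=8, radius=8))   # [-7, -1, 5]
```

With radius 2 only the true offset is found. With radius 8 the offsets one period away match equally well, so which one a real search picks depends on its predictors and penalties, and neighbouring blocks can end up pointing in opposite directions.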

Another mystery is that the search algorithm 3 (exhaustive search) was not the best one in either 8x8 or 16x16 case. The docs state that "It is slow, but it gives the best results, SAD-wise". Perhaps there are other factors involved and the best SAD is not all that matters.

Algorithm 4 (hexagon search) is a curious case, the pelsearch parameter has very little effect on the result and the best result is at pelsearch 14-17 while with other algorithms it is at around 2 (algorithm 5 is an exception as well). I guess this explains why it was found quite often as the best algorithm.

It's also easy to see that with 16x16 blocks it's overall easier to find good results: there is a steep drop to bad results with searchRange > 9 with 8x8 blocks while that dropoff happens at searchRange > 17 with 16x16 blocks. Also the bad results with 8x8 blocks are much worse (around 4.4) than with 16x16 blocks (around 3.5).

I have more results to share but that's enough for now. :)

zorr
26th May 2021, 00:28
Some further analysis about the search range: looking at the vectors shows what is actually going on:

https://i.postimg.cc/59zLz7hF/vectors.gif

These are the backward vectors of a single frame with search range 2 (good) and search range 8 (bad). On the left are the vectors drawn with MShow and details on one single block. On the right is the horizontal vector direction displayed as brightness using MMask.

What happens with search range 8 is that some vectors switch to opposite direction (the vx changes from -83 to 64). The same however doesn't happen on the forward vectors. MFlowFPS then has to deal with two conflicting vector directions and the end result is that some pixels are not moving anywhere (or very little) while those around them are.

Usually this kind of problem is avoided by making the vectors more coherent using truemotion=true or setting lambda, lsad, pnew, plevel and global individually. Perhaps Zopti used a shortcut here and avoided the problem by using a very focused search range. It would be interesting to force a larger search range and let Zopti figure out good values for the aforementioned parameters. Or perhaps it's simply not possible to achieve as good of a GMSD score using a larger search range because more coherent vectors make object boundaries harder to follow.

EDIT: The default search range of MVTools2 is 2 so what is used here is actually very normal. A larger search range would only be useful with faster motion.

zorr
19th June 2021, 01:06
The tests so far have been somewhat informative but they're not really telling about the full capabilities of MVTools because we didn't use all the possible parameters. The most important ones missing are levels, pzero, pglobal, dct and scaleCSAD. MSuper has parameters vpad and hpad but they only affect the edge blocks so we'll use the default values for those. But as we're adding five new variables and making the search even harder it's also a good idea to see what we've learned so far and try to eliminate parameters from the search that have known good values.

I identified a couple of parameters that always have the same value when we look at the best results: pel (of MSuper), meander (of MAnalyse) and mask (of MFlowFPS). Pel is 4, meander is true and mask is 2. I usually also look at how much better the best value is compared to other values and the heatmap gives a good overview.

https://i.postimg.cc/ZYLTYJHD/meander-super-pel-mask-fps.png

The best combinations are the white blocks and the worst are the dark ones. The color of each block reflects the best found result in each.

Meander=false results in much worse results, somewhere around 3.29. Meander is "Alternate blocks scan in rows from left to right and from right to left." and defaults to true. I'm not quite sure why this has such a large effect on the quality, but since it does we can simply "lock" this parameter to true.

Pel is another parameter where value 4 gives clearly better results than 2 (and I didn't even test 1) at least on this particular test case. Pel controls the precision of the motion vectors so it's quite understandable that a higher precision helps. It's not necessarily always the case though but it's safe to lock the value to 4 for these tests.

Mask (mask_fps in the heatmap) with value 2 seems to give much better results than 0 or 1. Mask selects the processing mode of MFlowFPS, while the separate ml parameter scales the strength of the occlusion mask (the lower the value, the stronger the occlusion). We lock mask to 2 and keep ml in the search.

While there weren't other parameters with a single clearly best value there are others where we can limit the range of tested values. In the initial runs I tested blockSizes 8,16,24,32,48,64 and all the search algorithms. Let's see a heatmap of those:

https://i.postimg.cc/wBJpCz32/block-Size-search-Algo.png

Looks like blockSize 16 is much better than any other. I still wouldn't lock it down but we can at least remove the block sizes 32 and above, leaving just 8, 16 and 24. I'm also adding new block sizes 3, 4 and 6 (those didn't exist when I originally made the Zopti script). This also means the maximum values for overlap and overlapv are now 12 (half of 24).

Search algorithms 6 and 7 are not quite as good as the others. Algorithm 6 is "pure horizontal exhaustive search" and 7 is "pure vertical exhaustive search" which explains why. We can leave them out from the next tests.

Next we'll look at the search ranges (searchparam and pelsearch). The heatmap shows only the top 50% of results so we can see them better.

https://i.postimg.cc/B6BGvCT1/search-Range-search-Range-Finest.png

The original search range was 1..30 for the search range and 1..60 for the finest level. The optimal values were 2-4 and 6-14 respectively. The best value of 2 for searchparam is clearly visible, for pelsearch many values give almost as good result as the best value of 14. Let's limit the search range to 1..8 and the finest level to 1..30.

I did an exhaustive scan of parameter LSAD to see how changing just that one parameter will change the result. The other parameters were using the best values found for 8x8 block size.

https://i.postimg.cc/cHZVP8p1/halos-ghosts-judder2-ex3-2021-05-10-13-07-15-optimize-exhaustive-run-01.png

The red portion of the line is the best result and looks like many values give the exact same result. There are other sections where changing LSAD doesn't change the result at all. Looks like it doesn't pay to try every possible LSAD value, we can add a filter to only test values divisible by 100. Also the results are not changing at all after a bit over 8000 so the search range can be limited to max 10000. It's also worth mentioning that the effect of LSAD is very small, even the worst possible LSAD drops the result to 2.864 from 2.855.

What about badSAD?

https://i.postimg.cc/hvZ6JZfh/halos-ghosts-judder2-ex2-2021-05-10-10-51-03-optimize-exhaustive-run-01.png

The effect of badSAD is much stronger, choose it badly and the result drops to 3.7. BadSAD doesn't have flat areas but at least we can set the maximum to 8000.

Lastly we look at lambda. I used lambda range 0..20000 in the first tests but the best lambdas were very small, just 12 in the best found result. We'll limit lambda range to 0..200. Below the best result per lambda value (the blue line shows the number of tests per lambda value). Note that this chart was not from an exhaustive search but just shows the best found result from the original search runs, that's why the line jumps so wildly.

https://i.postimg.cc/mDb5mW9t/lambda.png

One more change is that we allow negative values for parameter badRange which is the range of wide search for bad blocks. The negative sign indicates that the algorithm is switched to exhaustive search.
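
To get a feel for how much the tighter ranges shrink the search, here is a small Python calculation using only the three ranges whose old and new limits are stated above (searchparam, pelsearch and lambda). The other parameters are left out, so the absolute numbers are illustrative only.

```python
from math import prod

# name: ((old_min, old_max), (new_min, new_max)), all ranges inclusive
ranges = {
    "searchparam": ((1, 30), (1, 8)),
    "pelsearch": ((1, 60), (1, 30)),
    "lambda": ((0, 20000), (0, 200)),
}


def size(lo, hi):
    """Number of integer values in an inclusive range."""
    return hi - lo + 1


before = prod(size(*old) for old, _ in ranges.values())
after = prod(size(*new) for _, new in ranges.values())

print(before, after, round(before / after))
```

For these three parameters alone the number of combinations drops from about 36 million to about 48 thousand.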

To be continued...

Boulder
19th June 2021, 11:45
Very interesting results indeed. Most of the MVTools parameters are quite fuzzy so it's nice to see something like this.
As you seem to enjoy testing, incorporating MRecalculate could be a logical next step after the normal MAnalyse path. I've noticed (just by looking at MShow results) that it stabilises the vector field quite a lot.

zorr
19th June 2021, 23:50
incorporating MRecalculate could be a logical next step after the normal MAnalyse path

Yes that's almost the plan. After MAnalyse+MFlowFPS we'll take a little detour and compare it to MBlockFPS. After that it's time for MRecalculate with which ever FPS function was better. :)

zorr
22nd June 2021, 00:37
Let's see the results, finally.

As usual I did some 10000 iteration runs first to get a little taste of things to come. The best result was 2.7709792 with blockSize 8. Very promising, that's already better than what we could achieve before!

Next up was three runs of 100 000 iterations.

2.707417 blockSize 8
2.8104672 blockSize 16
2.7966063 blockSize 16

Still better, but only one of the three runs made progress. The search would often get stuck on blockSize 16 and then results would not be as good as with blockSize 8. Ok, let's lock the blockSize to 8 and see what we get with another 100 000 iterations:

2.730084

Ok, not bad but it's not better than the previous best result 2.707417. At this point I decided to try Zopti's dynamic iteration count, more specifically the backtracking variation. It will keep going as long as there are new better results within a certain iteration count and adjust the mutation rate along the way (large mutations in the beginning and small ones at the end). I chose to use 10 dynamic phases with 360 iterations per phase (that is, if there's no better result in 360 iterations, move to the next phase; if a better result is found, go back to the previous phase). The results were

2.701151 in 117240 iterations

Now that's more like it, we got a new record. I tried again and got

2.718272 in 165720 iterations
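
The phase logic of the dynamic iteration count can be sketched in Python roughly as follows. This is a toy one-parameter optimizer written from the description above, not Zopti's actual algorithm: the halving mutation size per phase and the quadratic test function are assumptions made for the illustration.

```python
import random


def dynamic_search(score, start, phases=10, patience=360, seed=0):
    """Minimize score(x) with phase-based backtracking.

    A phase ends after `patience` iterations without improvement.
    An improvement steps back one phase (larger mutations again)."""
    rng = random.Random(seed)
    best, best_score = start, score(start)
    phase, stale, iterations = 0, 0, 0
    while phase < phases:
        step = 2.0 ** -phase            # assumed schedule: large first, small later
        candidate = best + rng.uniform(-step, step)
        iterations += 1
        s = score(candidate)
        if s < best_score:
            best, best_score, stale = candidate, s, 0
            phase = max(phase - 1, 0)   # backtrack to the previous phase
        else:
            stale += 1
            if stale >= patience:
                phase, stale = phase + 1, 0
    return best, best_score, iterations


best, best_score, iterations = dynamic_search(lambda x: (x - 3.0) ** 2, start=0.0)
print(best, best_score, iterations)
```

The total iteration count is not fixed in advance, which matches the uneven counts reported above: the search only stops once every phase has run dry.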

Let's see what parameters the top 3 results found:

                run 1                 run 2                 run 3
  GMSD          2.707417              2.701151              2.718272
  iterations    100 000               117 240 (dynamic)     165 720 (dynamic)

MSuper
  pel           4 (LOCKED)            4 (LOCKED)            4 (LOCKED)
  sharp         1                     1                     1
  rfilter       2                     2                     0

MAnalyse
  blksize       8                     8 (LOCKED)            8 (LOCKED)
* levels        0                     -1                    -2
  search        5                     5                     5
  searchparam   1                     1                     1
  pelsearch     3                     6                     4
  lambda        3                     6                     5
  lsad          4900                  8200                  4100
  pnew          159                   161                   168
* pzero         6                     34                    33
* pglobal       3                     11                    137
  plevel        0                     2                     1
  overlap       0                     0                     2
  overlapv      4                     4                     2
* dct           1                     1                     1
  divide        0                     0                     0
  global        true                  true                  true
  badSAD        1346                  943                   1712
  badrange      -8                    2                     -9
  meander       true (LOCKED)         true (LOCKED)         true (LOCKED)
  temporal      false                 false                 false
  trymany       false                 false                 false
* scaleCSAD     2                     2                     2

MFlowFPS
  ml            36                    18                    29
  mask          2 (LOCKED)            2 (LOCKED)            2 (LOCKED)


The parameters marked with * are the new MAnalyse params which we didn't use earlier.

Since 8x8 blockSize was better than 16x16 but was kinda hard to find until we locked it down, perhaps we should also try locking the blockSize to 6x6:

2.7766657 in 214080 iterations
Hmmm nope. And that took a s**tload of iterations.

Let's get back to those top 3 results. There are again parameters with the same value in all three and we can probably try some tighter search ranges as well.
LOCKED:
sharp = 1
searchAlgo = 5
searchRange = 1
dct = 1
divide = 0
globalMotion = true
temporal = false
trymany = false
scaleCSAD = 2

Range limited:
searchRangeFinest 1..20
lambda 0..12
pnew 130..200
pzero 0..100
badSAD 100..3000
badRange -20..20
maskScale 1..100

Another dynamic iteration search and the result was

2.6988451 in 101880 iterations
A small improvement again. The parameters were

run 1
GMSD 2.6988451
iterations 101 880 (dynamic)

MSuper
pel 4 (LOCKED)
sharp 1 (LOCKED)
rfilter 2

MAnalyse
blksize 8 (LOCKED)
* levels -2
search 5 (LOCKED)
searchparam 1 (LOCKED)
pelsearch 2
lambda 1
lsad 400
pnew 158
* pzero 3
* pglobal 3
plevel 2
overlap 0
overlapv 4
* dct 1 (LOCKED)
divide 0 (LOCKED)
global true (LOCKED)
badSAD 1344
badrange -7
meander true (LOCKED)
temporal false (LOCKED)
trymany false (LOCKED)
* scaleCSAD 2 (LOCKED)

MFlowFPS
ml 30
mask 2 (LOCKED)
The parameter levels is -2, that means two coarse levels are not used in the hierarchical analysis made while searching for motion vectors. That's also the smallest value I allowed so of course now I had to do another test allowing for even more negative levels. I set the range to -6..0 and the result was

2.7016597
This result used levels -4 but it wasn't better than before.

All right, this concludes the tests on MAnalyse + MFlowFPS. Some random notes about the parameters we found:


A bit surprising that a negative levels value was needed for the best result. MVTools docs mention that "Sometimes levels is useful to prevent large (false) vectors", perhaps that's what it is doing here.
Optimal overlap (0) is different from overlapv (4). This could be due to having mostly horizontal motion in this test clip. Perhaps the optimal block size is not square either. I have limited testing to square ones because it's a bit difficult to define all the valid sizes for the optimization.
dct=1 means only frequency domain data is used for SAD calculation.
Best badrange is negative, meaning the exhaustive algorithm works better than the original UMH search.
trymany=false is surprising, or perhaps I don't understand what it means. The docs say "try to start searches around many predictors". So it sounds like trymany is doing more work to try to find the best possible block matches but somehow this ruins quality. Perhaps it's the same phenomenon we saw with the large search range (which also destroys quality).
scaleCSAD=2 means the luma:chroma ratio used in SAD calculations is 4:8 so chroma is twice as important as luma. That's a bit of a surprise. In my script the video is converted to YV24 before going into MVTools so chroma already has the same resolution as luma.


ScaleCSAD and trymany can be seen below as a heatmap. The larger the weight of chroma the better the result. And trymany=true consistently results in worse score.

https://i.postimg.cc/4yTrJJZf/allparams-scale-CSAD-trymany.png

So how much better is 2.6988451 than 2.843625? What do these values even mean? The script calculates the total GMSD score. There are 27 measured frames so the average per frame is almost precisely 0.01. It's also important to remember that we're measuring the quality after two interpolations while the frame-doubling only needs one so the quality in practical use is perhaps twice better, about 0.005 per frame.

I wanted to see how the optimized script compares to RIFE so I did a similar double-interpolation using RIFE ncnn Vulkan (https://github.com/nihui/rife-ncnn-vulkan) with the latest v3.1 model and measured the GMSD. RIFE's total was 2.192934 (NOTE: RIFE works with images and round-trip YV24->png->YV24 gives GMSD penalty of 0.336003). The chart below shows the per frame GMSD scores for the initial version, the latest version with all the parameters and RIFE.

https://i.postimg.cc/8kBYRgNv/gmsd-per-frame-initial-all-params-rife.png

On some frames MVTools actually pulls ahead but RIFE is still better overall. Also notable that the latest MVTools script is actually worse than before on some frames and two worst single frame GMSD scores are from this latest version. We could of course optimize for minimizing the largest per frame GMSD and get different results.
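
The difference between optimizing the total and optimizing the worst frame can be shown with a tiny Python example. The per-frame scores below are invented for the illustration and are not measurements from these tests.

```python
def total_score(per_frame):
    """Objective used so far: sum of per-frame GMSD."""
    return sum(per_frame)


def worst_frame_score(per_frame):
    """Alternative objective: the single worst frame."""
    return max(per_frame)


# Hypothetical per-frame GMSD values for two parameter sets.
candidates = {
    "A": [0.08, 0.09, 0.20, 0.08],   # good on average, one bad frame
    "B": [0.12, 0.12, 0.12, 0.12],   # slightly worse on average, no outlier
}

best_by_total = min(candidates, key=lambda k: total_score(candidates[k]))
best_by_worst = min(candidates, key=lambda k: worst_frame_score(candidates[k]))

print(best_by_total, best_by_worst)   # A B
```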

Let's compare some frames where the latest script did better (frame 8) and one where it did worse (frame 11). These are the doubly-interpolated frames.

https://i.postimg.cc/NMvNXpnb/initial-allparams.gif

And here's a comparison of the latest MVTools script and RIFE. Frame 3 shows where RIFE is much better and frame 18 shows where MVTools beats RIFE.

https://i.postimg.cc/sDWcRtgX/allparams-rife.gif

Here's the complete MVTools result with score 2.6988451, again only showing the interpolated frames.

https://i.postimg.cc/d3Y0Zkrg/halos-ghosts-judder-mvtools-allparams-1x.gif


The next episode is about MBlockFPS, can it beat the best result of MFlowFPS?

markfilipak
8th September 2021, 18:33
@zorr, this is all quite amazing, and so are you -- I'm becoming a fan. ;)

I'm digesting it. My mind is somewhat boggled by the tech jargon.

I've found that, rather than going from 24000/1001fps to 60000/1001fps (i.e. 2.5x), I get better results by forcing 24000/1001fps to 24fps -- 0.1% metadata speed up (not a transcode) to cinema running time -- then going from 24fps to 120fps (i.e. 5x) and then letting the TV drop alternate frames to 60fps. I'm still using SVPflow because it's all I know, but it does seem to do a better job if the interpolation is integer (e.g. 5x).

Selur
8th September 2021, 21:03
I think selur took the video down,
Totally missed that, I also got some additional test clips (https://drive.google.com/drive/folders/1hG4-vEDn4l_B0QTlsrUnyl8JrOAuRgBy?usp=sharing) regarding stripes for interpolation tests,... :)

zorr
8th September 2021, 23:07
@zorr, this is all quite amazing

Nice to see you again. And good to know that at least someone found these ramblings interesting. ;)

I've found that, rather than going from 24000/1001fps to 60000/1001fps (i.e. 2.5x), I get better results by forcing 24000/1001fps to 24fps

It may have something to do with the fact that in order to speed up clip 5x you need to generate 4 new frames between each original frame. But to speed up 2.5x - you'd need to generate 1.5 frames. I'm not sure how the interpolation is handled in such cases but seems it has to drop some original frames as well.
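
A quick way to see which original frames can survive each conversion is to compute where the output frames fall on the source timeline. The Python sketch below assumes a uniform output grid that starts on the first source frame; how SVPflow actually places frames is not known here.

```python
from fractions import Fraction


def source_positions(ratio, count):
    """Position of each output frame, measured in source frames."""
    return [Fraction(k) / ratio for k in range(count)]


def kept_originals(ratio, source_frames):
    """Source frames that coincide exactly with an output frame."""
    count = int(ratio * (source_frames - 1)) + 1
    return sorted(int(p) for p in source_positions(ratio, count)
                  if p.denominator == 1)


print(kept_originals(Fraction(5, 2), 5))   # 2.5x: [0, 2, 4]
print(kept_originals(5, 5))                # 5x:   [0, 1, 2, 3, 4]
```

Under that assumption the 2.5x conversion keeps only every second original frame and has to synthesize everything else, while the 5x conversion passes every original frame through untouched.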

I also got some additional test clips (https://drive.google.com/drive/folders/1hG4-vEDn4l_B0QTlsrUnyl8JrOAuRgBy?usp=sharing) regarding stripes for interpolation tests,... :)

Thanks for those, they look like excellent material for StripeMask tests... whenever I have enough time to conduct those. :D

markfilipak
9th September 2021, 17:37
... I also got some additional test clips (https://drive.google.com/drive/folders/1hG4-vEDn4l_B0QTlsrUnyl8JrOAuRgBy?usp=sharing) regarding stripes for interpolation tests,... :)
Thanks for those.

FFprobe appears to be unreliable. Do you know of another way to probe video streams?
Also, Selur, is there a reason for "DAR 5:4" instead of "DAR 4:3"?

Input #0, matroska,webm, from 'Fensterladen.mkv':
Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 720x576, 25 fps, 25 tbr, 1k tbn (default)
NOTE: DAR,SAR missing

Input #0, matroska,webm, from 'Ferdinand06.mkv':
Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 1280x720 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 1k tbn (default)

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'forInterpolation.mp4':
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 3840x2160 [SAR 1:1 DAR 16:9], 62614 kb/s, 29.97 fps, 29.97 tbr, 30k tbn (default)

Input #0, matroska,webm, from 'Hamsterzaun.mkv':
Stream #0:0: Video: h264 (High), yuv420p(top first), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 1k tbn (default)
NOTE: 'interlaced_frame=1','top_field_first=1','repeat_pict=0' but is not interlaced.

Input #0, matroska,webm, from 'Hochhaus.mkv':
Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, top first), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 1k tbn (default)
NOTE: 'interlaced_frame=1','top_field_first=1','repeat_pict=0' but is not interlaced.

Input #0, matroska,webm, from 'Jalousie.mkv':
Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 720x576, SAR 1:1 DAR 5:4, 25 fps, 25 tbr, 1k tbn (default)

Input #0, matroska,webm, from 'Krone 25 fps.mkv':
Stream #0:0: Video: h264 (High), yuv420p(tv, bt470bg, progressive), 720x576 [SAR 12:11 DAR 15:11], SAR 1:1 DAR 5:4, 25 fps, 25 tbr, 1k tbn (default)
NOTE: differing DAR,SAR

Input #0, matroska,webm, from 'MusteramArm.mkv':
Stream #0:0: Video: h264 (High 10), yuv420p10le(progressive), 720x576 [SAR 1:1 DAR 5:4], 25 fps, 25 tbr, 1k tbn (default)

Input #0, matroska,webm, from 'Torzaun.mkv':
Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 720x576, 25 fps, 25 tbr, 1k tbn (default)
NOTE: DAR,SAR missing

Input #0, matroska,webm, from 'Vogelkaefig.mkv':
Stream #0:0: Video: h264 (High 4:4:4 Predictive), yuv420p(progressive), 720x576, SAR 1:1 DAR 5:4, 25 fps, 25 tbr, 1k tbn (default)

Selur
9th September 2021, 18:17
Do you know of another way to probe video streams?
MediaInfo and mplayer/mpv would be alternatives to get stream parameters.

Also, Selur, is there a reason for "DAR 5:4" instead of "DAR 4:3"?
Got those streams for a while now and collected most of them over the years, never really looked at the DAR, so no real reason. :)