MVTools, Depan, DepanEstimate for VapourSynth [Archive] - Page 10

edcrfv94

29th October 2018, 07:28

vapoursynth mvtools v20

import vapoursynth as vs
core = vs.core

c_in = src  # src is the source clip, loaded elsewhere

sup_a = core.mv.Super(c_in, pel=2)
sup = sup_a

analyse_args_df = dict(blksize=16, overlap=8, search=5, searchparam=4, dct=5)
bVec1 = core.mv.Analyse(sup_a, isb=True, delta=1, **analyse_args_df)
fVec1 = core.mv.Analyse(sup_a, isb=False, delta=1, **analyse_args_df)
bVec2 = core.mv.Analyse(sup_a, isb=True, delta=2, **analyse_args_df)
fVec2 = core.mv.Analyse(sup_a, isb=False, delta=2, **analyse_args_df)
bVec3 = core.mv.Analyse(sup_a, isb=True, delta=3, **analyse_args_df)
fVec3 = core.mv.Analyse(sup_a, isb=False, delta=3, **analyse_args_df)

compensate_args_df = dict(thsad=400)
bc1 = core.mv.Compensate(c_in, sup, bVec1, **compensate_args_df)
fc1 = core.mv.Compensate(c_in, sup, fVec1, **compensate_args_df)
bc2 = core.mv.Compensate(c_in, sup, bVec2, **compensate_args_df)
fc2 = core.mv.Compensate(c_in, sup, fVec2, **compensate_args_df)
bc3 = core.mv.Compensate(c_in, sup, bVec3, **compensate_args_df)
fc3 = core.mv.Compensate(c_in, sup, fVec3, **compensate_args_df)

cmp = core.std.Interleave([bc3, bc2, bc1, c_in, fc1, fc2, fc3])
#cmp = core.std.Interleave([fc3, fc2, fc1, c_in, bc1, bc2, bc3])

AviSynth+ mvtools-2.7.33

c_in = last

sup_a = c_in.MSuper(pel=2)
sup = sup_a

vec_norm = sup_a.MAnalyse(multi=true, delta=3, blksize=16, overlap=8, search=5, searchparam=4, DCT=5)
cmp = c_in.MCompensate(sup, vec_norm, tr=3, thSAD=400)

or

c_in = last

sup_a = c_in.MSuper(pel=2)
sup = sup_a

#vec = sup_a.MAnalyse(multi=true, delta=3, blksize=16, overlap=8, search=5, searchparam=4, DCT=5)
#cmp = c_in.MCompensate(sup, vec, tr=3, thSAD=400)

bVec1 = MAnalyse(sup_a, isb=True, delta=1, blksize=16, overlap=8, search=5, searchparam=4, dct=5)
fVec1 = MAnalyse(sup_a, isb=False, delta=1, blksize=16, overlap=8, search=5, searchparam=4, dct=5)
bVec2 = MAnalyse(sup_a, isb=True, delta=2, blksize=16, overlap=8, search=5, searchparam=4, dct=5)
fVec2 = MAnalyse(sup_a, isb=False, delta=2, blksize=16, overlap=8, search=5, searchparam=4, dct=5)
bVec3 = MAnalyse(sup_a, isb=True, delta=3, blksize=16, overlap=8, search=5, searchparam=4, dct=5)
fVec3 = MAnalyse(sup_a, isb=False, delta=3, blksize=16, overlap=8, search=5, searchparam=4, dct=5)

bc1 = MCompensate(c_in, sup, bVec1, thsad=400)
fc1 = MCompensate(c_in, sup, fVec1, thsad=400)
bc2 = MCompensate(c_in, sup, bVec2, thsad=400)
fc2 = MCompensate(c_in, sup, fVec2, thsad=400)
bc3 = MCompensate(c_in, sup, bVec3, thsad=400)
fc3 = MCompensate(c_in, sup, fVec3, thsad=400)

cmp = Interleave(bc3, bc2, bc1, c_in, fc1, fc2, fc3)
#cmp = Interleave(fc3, fc2, fc1, c_in, bc1, bc2, bc3)

The results are very different from those of the AviSynth version, and I'm not sure which one is correct.
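
One way to compare the two outputs frame by frame is to work out which clip each interleaved frame comes from. The sketch below is plain Python (no VapourSynth needed) and relies only on how Interleave orders frames: with k input clips, output frame n is frame n // k of input clip n % k. The helper name `interleave_source` and the label list are illustrative and not part of either plugin.

```python
# Labels in the same order as the Interleave call in the scripts above.
ORDER = ["bc3", "bc2", "bc1", "c_in", "fc1", "fc2", "fc3"]

def interleave_source(n, order=ORDER):
    """Return (clip label, frame number in that clip) for interleaved output frame n."""
    k = len(order)
    return order[n % k], n // k

if __name__ == "__main__":
    # Show where the second group of seven output frames comes from.
    for n in range(7, 14):
        label, frame = interleave_source(n)
        print(f"output frame {n:2d} -> frame {frame} of {label}")
```

With this mapping, the same output frame number can be inspected in both the VapourSynth and the AviSynth result, provided both scripts interleave the clips in the same order.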

jackoneill

13th March 2019, 16:52

v21 (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v21) fixes three bugs:

* BlockFPS, Flow, FlowFPS, FlowInter: Fix crash with certain blksize/overlapv
ratios, like 8/2. Thanks to pinterf for finding the cause and the solution.
* Flow, FlowBlur, FlowFPS, FlowInter: Fix crash due to motion vectors pointing
outside the frame. Thanks to pinterf for finding the cause and the solution.
* Analyse: Fix use of an uninitialised variable. Only dct modes 2, 6, and 9 were
affected. The result was probably just nondeterministic output. This
uninitialised variable was inherited from the Avisynth plugin, version 2.5.11.3.

ChaosKing

13th March 2019, 17:03

Awesome!

ChaosKing

2nd November 2019, 10:02

Isn't it time for a v22 - AVX2 booster edition? :D
Or at least a test build so we can test it?

jackoneill

2nd November 2019, 17:00

Isn't it time for a v22 - AVX2 booster edition? :D
Or at least a test build so we can test it?

I should figure out issue #38 first.

tormento

6th April 2020, 10:57

I should figure out issue #38 first.
Would you please add MDegrainN (up to 6, at least)?

I need it to run G41Fun.py (https://github.com/Selur/VapoursynthScriptsInHybrid/blob/master/G41Fun.py) on noisy material.

I have tried MVTools single precision but it's simply too slow to have any use of it, at least the one without AVX2 requirement.

Boulder

4th May 2020, 18:40

Thank you for the new release (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v22) :thanks:

Lemmy said that speed is good for you.

bin.n2f

4th May 2020, 21:44

Boulder

5th May 2020, 05:22

I'm getting quite strange results considering the possible speed increases on my Ryzen 3900X. I would have expected the new Zen generation to be able to utilize the AVX2 optimizations really well, because they did help the first gen one when I tested Stephen R. Savage's early builds.

The first 2500 frames using vspipe:
v21 : 57.25 fps
v22 : 55.14 fps

Test script, normal Blu-ray source.
import vapoursynth as vs
core = vs.core

clp = core.dgdecodenv.DGSource(r"test.dgi")

degrain16 = core.fmtc.bitdepth(clp, bits=16)
superanalyse8 = core.mv.Super(clp, pel=2, chroma=True, rfilter=4, sharp=1)
supermdg16 = core.mv.Super(degrain16, pel=2, chroma=True, rfilter=4, levels=1, sharp=1)

analyze_args = dict(blksize=16, overlap=8, search=5, searchparam=8, pelsearch=8, truemotion=False)
degrain_args_16 = dict(thsad=200, thsadc=100, limit=1*256, limitc=2*256, thscd1=300, thscd2=80)

bv1_8 = core.mv.Analyse(superanalyse8, isb=True, delta=1, **analyze_args)
fv1_8 = core.mv.Analyse(superanalyse8, isb=False, delta=1, **analyze_args)

finalclip8 = core.mv.Degrain1(degrain16, supermdg16, bv1_8, fv1_8, **degrain_args_16)
finalclip8.set_output()

Boulder

13th May 2020, 15:59

What does MFinest actually do? There's no real documentation anywhere to be found.

jackoneill

14th May 2020, 08:15

What does MFinest actually do? There's no real documentation anywhere to be found.

If you look at the output of Super with pel=2 you'll see at the bottom of the frame 4 identically-sized images, and with pel=4 you'll see 16 identically-sized images. Those 4 or 16 images are actually one large image. I imagine they are stored like that so that Super doesn't waste lots of RAM in the top part of the frame.

Finest takes those 4 or 16 images and reassembles them into one (separate) image for the filters that need it. I think it's mostly the Flow* filters. They automatically invoke Finest when they need it.

I think this hack could be eliminated in the VapourSynth version because the various images produced by Super could be attached to the source frame as separate frame properties. If someone had the motivation to look into that.
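
A rough model of that reassembly, as a sketch only: it assumes each of the pel*pel images holds one sub-pixel phase and that the large image is built by interleaving them, so that big[y*pel + j][x*pel + i] = plane[j*pel + i][y][x]. The plane ordering and the function name `merge_planes` are assumptions made for illustration; the real filter works on the Super clip's internal layout.

```python
def merge_planes(planes, pel):
    """Interleave pel*pel equally sized planes into one (h*pel) x (w*pel) image."""
    if len(planes) != pel * pel:
        raise ValueError("expected pel*pel planes")
    h, w = len(planes[0]), len(planes[0][0])
    big = [[0] * (w * pel) for _ in range(h * pel)]
    for j in range(pel):          # vertical sub-pixel phase (assumed ordering)
        for i in range(pel):      # horizontal sub-pixel phase (assumed ordering)
            plane = planes[j * pel + i]
            for y in range(h):
                for x in range(w):
                    big[y * pel + j][x * pel + i] = plane[y][x]
    return big

if __name__ == "__main__":
    # Four 2x2 planes (pel=2), each filled with its own index.
    demo = [[[p, p], [p, p]] for p in range(4)]
    for row in merge_planes(demo, 2):
        print(row)
```

In the demo, every 2x2 cell of the large image contains one sample from each of the four planes, which is the "4 images are actually one large image" idea described above.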

Pat357

15th May 2020, 23:02

I'm getting quite strange results considering the possible speed increases on my Ryzen 3900X. I would have expected the new Zen generation to be able to utilize the AVX2 optimizations really well, because they did help the first gen one when I tested Stephen R. Savage's early builds.

The first 2500 frames using vspipe:
v21 : 57.25 fps
v22 : 55.14 fps

What is so strange about these results? I would expect better speed from a Ryzen 3900... Did you limit the number of threads?
Which was the optimized build?
Tested your script with a 1920x1080 YUV420P8 clip on my i9-7940X (14c/28t):

Mvtools R22 (optimized by Stephen R. Savage)
>vspipe -e 1999 boulder.vpy .
Output 2000 frames in 14.63 seconds (136.66 fps)
Mvtools R23 (current version)
vspipe -e 1999 boulder.vpy .
Output 2000 frames in 17.34 seconds (115.37 fps)
I hope Stephen finds the time to optimize this new version :)
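
For reference, the relative difference between two builds can be worked out from the reported fps figures. This is just arithmetic on the numbers quoted in this thread; the helper name `relative_change` is made up for illustration.

```python
def relative_change(baseline_fps, new_fps):
    """Percentage change of new_fps relative to baseline_fps (negative = slower)."""
    return (new_fps - baseline_fps) / baseline_fps * 100.0

if __name__ == "__main__":
    # Figures quoted above: the Ryzen 3900X run and the 14c/28t Intel run.
    print(f"Ryzen 3900X: {relative_change(57.25, 55.14):+.1f}%")
    print(f"Intel 14c/28t: {relative_change(136.66, 115.37):+.1f}%")
```

That works out to roughly 3.7% slower in the first comparison and roughly 15.6% slower in the second.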

Are_

15th May 2020, 23:30

From a bluray?

Boulder

16th May 2020, 12:37

What is so strange about these results? I would expect better speed from a Ryzen 3900... Did you limit the number of threads?
Which was the optimized build?
Tested your script with a 1920x1080 YUV420P8 clip on my i9-7940X (14c/28t):

Mvtools R22 (optimized by Stephen R. Savage)
>vspipe -e 1999 boulder.vpy .
Output 2000 frames in 14.63 seconds (136.66 fps)
Mvtools R23 (current version)
vspipe -e 1999 boulder.vpy .
Output 2000 frames in 17.34 seconds (115.37 fps)
I hope Stephen finds the time to optimize this new version :)

R22 was the first official build with AVX2 optimizations. The thing looks like a compiler issue, the libraries built with MSVC are faster.

https://github.com/dubhater/vapoursynth-mvtools/issues/47

feisty2

17th May 2020, 12:10

Would you please add MDegrainN (up to 6, at least)?

I need it to run G41Fun.py (https://github.com/Selur/VapoursynthScriptsInHybrid/blob/master/G41Fun.py) on noisy material.

I have tried MVTools single precision but it's simply too slow to have any use of it, at least the one without AVX2 requirement.

I might be able to add mvmulti functionalities (MAnalyze, MRecalculate, MCompensate, MFlow with radius, and MDegrainN) and cosine annealing ("thsad2") to this branch as well if the branch owner allows C++20 snippets in the code base.

tormento

17th May 2020, 12:33

I might be able to add mvmulti functionalities (MAnalyze, MRecalculate, MCompensate, MFlow with radius, and MDegrainN) and cosine annealing ("thsad2") to this branch as well if the branch owner allows C++20 snippets in the code base.

Great. I hope the SMDegrain script will be modified too.

Boulder

17th May 2020, 12:39

I might be able to add mvmulti functionalities (MAnalyze, MRecalculate, MCompensate, MFlow with radius, and MDegrainN) and cosine annealing ("thsad2") to this branch as well if the branch owner allows C++20 snippets in the code base.

Would you mind taking a stab at the scaling of vectors between different bitdepths (in case jackoneill doesn't have the interest to do it)?

feisty2

17th May 2020, 13:04

the thing is I'm not sure if jackoneill would accept C++20 code for 2 reasons
1) it breaks compatibility with tons of older compilers, in fact, GCC 10.1 is the only compiler that supports most of C++20 features currently.
2) the inserted snippets would have a very different coding style, it would look much more similar to dynamically typed languages than typical statically typed languages with a nominal type system (C, Java, C++98 (excluding template metaprogramming), etc.) some people find such code much easier to read and write and others find it hard to decipher. people have different coding mindsets, I personally think in structural typing and find type declarations useless, but lots of people rely on nominal typing and find code without type declarations hard to understand. jackoneill might also reject my code because he/she thinks the code is hard to maintain.

and I don't wanna create yet another mvtools branch if jackoneill decides not to merge my code.

Pat357

17th May 2020, 20:34

R22 was the first official build with AVX2 optimizations. The thing looks like a compiler issue, the libraries built with MSVC are faster.

https://github.com/dubhater/vapoursynth-mvtools/issues/47

Is there an MSVC-compiled version of the new R23 available somewhere?
I'm not smart enough to do it myself without a readily available .sln and other settings...

ChaosKing

18th May 2020, 08:11

Thx HolyWu
Tested with SMDegrain and the clang version is the fastest, up to 3 fps faster compared to R23. MSVC is the slowest, even slower than R23. ICL is a tiny bit slower than clang.
Tested with Ryzen 2600 on 1080p source.

Ranking
clang
icl
R23 release
msvc

tormento

18th May 2020, 11:50

I am losing the thread a bit. Has MDegrain up to 6 and N been implemented or not?

feisty2

18th May 2020, 11:55

no, judging from jackoneill's lack of response, I don't think he/she will merge C++20 code, and I don't wanna create another mvtools branch.
your best shot is with the floating point branch if you do need arbitrary radius MDegrain and cosine annealing

tormento

18th May 2020, 12:48

feisty2

19th August 2020, 07:33

is this a typo? https://github.com/dubhater/vapoursynth-mvtools/blob/master/src/MVFlowBlur.c#L392
shouldn't it be d.blur * 256. / 100?

jackoneill

19th August 2020, 12:56

is this a typo? https://github.com/dubhater/vapoursynth-mvtools/blob/master/src/MVFlowBlur.c#L392
shouldn't it be d.blur * 256. / 100?

It looks deliberate. It's 200 again just a few lines above.
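
To see what the two divisors would mean in practice, here is a small sketch of the scaling being discussed. It assumes the line in question converts the `blur` percentage to a fixed-point factor as `blur * 256 / 200`, as the posts above suggest; the function name is hypothetical and the actual source may differ in detail.

```python
def blur_to_fixed(blur_percent, divisor=200.0):
    """Scale a blur percentage to a fixed-point factor where 256 represents 1.0."""
    return int(blur_percent * 256.0 / divisor)

if __name__ == "__main__":
    for blur in (50.0, 100.0):
        with_200 = blur_to_fixed(blur)           # divisor discussed as deliberate
        with_100 = blur_to_fixed(blur, 100.0)    # divisor proposed in the question
        print(f"blur={blur}: /200 -> {with_200}, /100 -> {with_100}")
```

With a divisor of 200 the factor is half of what 100 would give, so blur=100 maps to 128 (0.5 in this fixed-point scale) rather than 256 (1.0).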

ChaosKing

2nd January 2023, 11:34

Happy new year
https://github.com/dubhater/vapoursynth-mvtools/commit/b5d58cb7ca1cfe27bdcb30fbcff67254580b7ab9

Adub

5th January 2023, 22:15

Haha you’re welcome. ;)

Thanks for posting this, I’d neglected to. A few more cleanups have been made since then as well. May minorly increase speed, but only minorly.

I’m still working on more updates in my free time. I’m working on AVX2 degrain cleanups and then hopefully incorporating more x264 code for higher bit depths, but we’ll see about that.

Longer term I'd like to add some basic DegrainN support (but limit it to radius 6 for now), add high bit depth degrain AVX and SSE2 code, support block size 24 (with a simple C implementation for now), and add high bit depth support to Mask. I say all that, but it's not a trivial amount of work, and I'm doing this in my random free time, so we'll see how it goes.

amayra

22nd June 2024, 21:13

Is this an abandoned project?

The AVS version is still under development, unlike the VapourSynth plug-in.

Adub

5th July 2024, 18:51

I wouldn't say it's abandoned. The author accepts pull requests, so we can still work as a community to improve it.

I have one or two changes I've been meaning to create PRs for, I just haven't found the time to do it yet.

So not abandoned, but in maintenance mode for now.

Adub

6th August 2024, 17:15

We've just released Version 24 of MVTools: https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v24

This is the last version that supports VapourSynth API version 3. We've already done the work to port to API version 4, which will be released in Version 25 shortly.

Please enjoy this not abandoned project. :sly:

Myrsloik

11th May 2026, 14:52

An update. There is now API4 support and mvtools can be installed from PyPI. If you have v25 or v26 YOU REALLY SHOULD UPDATE TO v27 TO AVOID MEMORY LEAKS.

DTL

16th May 2026, 19:27

Selur is asking for the DX12 motion estimation to be ported to VS MAnalyse. Can any VS programmer help?

The AVS+ version is here: https://github.com/DTL2020/mvtools/blob/9eedb9d0850f638fc43212fb515cf048f9b9a58f/Sources/MVAnalyse.h#L34 (look for the DX12_ME define and the #ifdef blocks in the .h and .cpp files). The feature can be excluded completely, so non-DX12 builds remain possible (they do not load d3d12.dll and do not require DX12 to be installed).

Myrsloik

21st June 2026, 13:26

I'm currently working on cleaning up and optimizing mvtools and I'm curious about which pel values you actually use so I can spend more time optimizing them. For example I suspect few people ever use pel=1.

Does anyone ever use pel=1?

Do you all leave it at the default pel=2?

Is the only option you'd ever consider pel=4?

Selur

21st June 2026, 14:37

Does anyone ever use pel=1?
=> yes
QTGMC's presets ("Draft", "Ultra Fast", "Super Fast", "Very Fast", "Faster", "Fast", "Medium") use pel=1.
So, I suspect quite a few people use pel=1.

Adub

21st June 2026, 15:48

It’s common enough to use pel 1 with 4k content. I believe some versions of SMDegrain are tuned to use pel 4 for SD content, pel 2 for HD, and pel 1 for UHD.
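
As a sketch of that rule of thumb, assuming hypothetical resolution cut-offs (the width thresholds and the function name `pick_pel` below are illustrative and not taken from SMDegrain or any other script):

```python
def pick_pel(width):
    """Pick a pel value from frame width: finer sub-pixel precision for smaller frames."""
    if width <= 1024:      # assumed SD cut-off
        return 4
    if width <= 1920:      # assumed HD cut-off
        return 2
    return 1               # anything wider is treated as UHD

if __name__ == "__main__":
    for w in (720, 1920, 3840):
        print(f"width {w}: pel={pick_pel(w)}")
```

The chosen value would then be passed as the `pel` argument of Super, as in the scripts earlier in the thread.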

Selur

21st June 2026, 16:07

@myrsloik: mvtools v29 is not available via pip atm.

Myrsloik

21st June 2026, 16:43

@myrsloik: mvtools v29 is not available via pip atm.

Fixed

Selur

21st June 2026, 18:01

Thanks :) (works now)