MVTools, Depan, DepanEstimate for VapourSynth [Archive] - Page 9

VS_Fan

23rd August 2016, 20:19

Let me rephrase my thought: In the same search at videolan, I thought I saw 8, 10 (maybe 12) bit-depth SSE2 and/or AVX optimizations for SATD calculation.

My question is then: No matter what the original video clip bit_depth is, could you use a “simplified” 8, 10 or 12 bit temporary version of the clip for the motion estimation? (reusing the x265 SATD asm optimizations)

Edit: By the way, thanks for the explanation :)

Nevilne

23rd August 2016, 20:22

You will actually get more useful motion estimation from even blurring the search clip.
Doesn't stop people here from doing stuff like nnedi for subpixel clips.

jackoneill

23rd August 2016, 20:33

Let me rephrase my thought: In the same search at videolan, I thought I saw 8, 10 (maybe 12) bit-depth SSE2 and/or AVX optimizations for SATD calculation.

My question is then: No matter what the original video clip bit_depth is, could you use a “simplified” 8, 10 or 12 bit temporary version of the clip for the motion estimation? (reusing the x265 SATD asm optimizations)

I considered it, but: http://forum.doom9.org/showthread.php?p=1695761#post1695761

feisty2

23rd August 2016, 20:43

Let me rephrase my thought: In the same search at videolan, I thought I saw 8, 10 (maybe 12) bit-depth SSE2 and/or AVX optimizations for SATD calculation.

My question is then: No matter what the original video clip bit_depth is, could you use a “simplified” 8, 10 or 12 bit temporary version of the clip for the motion estimation? (reusing the x265 SATD asm optimizations)

Edit: By the way, thanks for the explanation :)

SATD is a pretty simple transform and not hard to program at all...
Why not just reimplement it in C/C++ like I did? Then it would be generic to all sample types.
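
For readers unfamiliar with the transform, here is a minimal plain-Python sketch of a 4x4 SATD (sum of absolute transformed differences). It is not the code from either MVTools branch or from x264/x265; the function names are made up for illustration, and the result is left unnormalised (real implementations typically scale it). Because it only uses addition and subtraction, the same code works for integer and float samples, which is the "generic to all sample types" point.

```python
def hadamard4(v):
    """1-D 4-point Hadamard transform (unnormalised butterfly)."""
    a, b, c, d = v
    s0, s1 = a + b, c + d
    d0, d1 = a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd4x4(block_a, block_b):
    """SATD of two 4x4 blocks given as lists of rows (hypothetical helper)."""
    # Difference block.
    diff = [[pa - pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(block_a, block_b)]
    # Transform every row, then every column of the result.
    rows = [hadamard4(r) for r in diff]
    cols = [hadamard4(c) for c in zip(*rows)]
    # Sum of absolute transform coefficients.
    return sum(abs(x) for col in cols for x in col)
```

A constant difference only excites the DC coefficient, while a single differing pixel spreads over all 16 coefficients, which is why SATD penalises "textured" errors more than plain SAD does.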

VS_Fan

23rd August 2016, 21:24

Don’t waste your time on this; moreover, it would be misleading. We can explicitly convert the high-bitdepth clips to 8 bits for analysis. Anyway, I think there is a definite benefit to running the analysis on 10–12 bits. I often remap the luma channel to increase the contrast in some specific ranges (generally the dark parts), and keeping 8 bits crunches other ranges, reducing the accuracy of the analysis on fine textures.
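
The "crunching" effect can be illustrated with a small plain-Python sketch. It assumes a simple gamma curve as the remap (the actual remapping used is not given in the thread) and counts how many of the 256 original 8-bit levels are still distinguishable afterwards. The function names are hypothetical.

```python
def remap(value, bits, gamma=0.5):
    """Apply a gamma curve (gamma < 1 stretches the darks) at a given bit depth."""
    peak = (1 << bits) - 1
    return round(peak * (value / peak) ** gamma)


def distinct_levels_after_remap(work_bits, gamma=0.5):
    """Count how many of the 256 8-bit input levels remain distinct."""
    scale = 1 << (work_bits - 8)  # 8-bit code -> working-depth code
    return len({remap(v * scale, work_bits, gamma) for v in range(256)})


if __name__ == "__main__":
    for bits in (8, 10, 12):
        print(bits, distinct_levels_after_remap(bits))
```

With an 8-bit working depth, neighbouring bright levels collapse into the same output code; at 10 or 12 bits all 256 input levels stay distinct under this curve.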

I hope you can come up with a neat strategy to deal with SATD.

I congratulate and thank you all for the VapourSynth's 5th birthday!!! 2 days in advance :D

kolak

25th August 2016, 16:45

Would anyone consider making mvtools faster and better as a paid project?

feisty2

25th August 2016, 16:57

Would anyone consider making mvtools faster and better as a paid project?

I assume you're that "anyone" since you're the only one that has ever asked about this..

jackoneill's wishlist
(https://gist.github.com/dubhater/12a6af383dd006999ba3)

edit:typo

kolak

25th August 2016, 17:26

Are these ebooks?
I will buy all of them+ many more :)

jackoneill

25th August 2016, 17:33

Would anyone consider making mvtools faster and better as a paid project?

Which parts faster? Which parts better? Better in what way?

Yes, ebooks.

kolak

26th August 2016, 00:31

I'm mainly interested in frame rate interpolation part.
There would be quite specific problems to "solve" as well as speeding up whole conversion process, ideally by about 2x.

I will send you pm.

jackoneill

17th September 2016, 09:43

good news and bad news here

I fixed the SATD functions. Please test: http://savedonthe.net/download/914/vapoursynth-mvtools-satd-win64.html

feisty2

17th September 2016, 14:02

it works, but the result looks slightly different from my implementation
mine looks closer to dct=1, and yours closer to dct=0 (set thSAD to 10000 in MDeGrain and you will see the difference)
not sure why..

jackoneill

17th September 2016, 16:00

it works, but the result looks slightly different from my implementation
mine looks closer to dct=1, and yours closer to dct=0 (set thSAD to 10000 in MDeGrain and you will see the difference)
not sure why..

It's the same code used by MVTools for Avisynth and x264.

jackoneill

23rd October 2016, 14:47

v17 brings more speed for certain configurations, larger blocks, and a bug fix or two: https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v17

* Analyse, Recalculate: Fix bug that broke 16 bit processing (patches by feisty2).
* Analyse, Recalculate: Support block sizes of 64x32, 64x64, 128x64, and 128x128.
* Analyse, Recalculate: Make dct=1..4 a bit faster on x86.
* FlowFPS, FlowInter: Add AVX2 code.
* Analyse, Recalculate: Fix SATD functions used when dct=5..10 and the input is 16 bit.
* Analyse, Recalculate: Allow dct=5..10 with blocks larger than 16x16.

feisty2

23rd October 2016, 15:27

...
time to update my branch of mvtools as well...
and oyster and plum
dammit

Mystery Keeper

23rd October 2016, 16:19

Thanks for your awesome work! Why don't you two consolidate your efforts and make one version that supports all formats?

hydra3333

24th October 2016, 11:14

Thank you indeed.

luigizaninoni

24th October 2016, 11:30

Trying to build on Linux Mint, gives error:

make: *** No rule to make target 'src/CopyCode.c', needed by 'src/CopyCode.lo'. Stop.

jackoneill

24th October 2016, 16:14

Trying to build on Linux Mint, gives error:

make: *** No rule to make target 'src/CopyCode.c', needed by 'src/CopyCode.lo'. Stop.

That's because I changed some file names. To fix it:

make distclean
./configure
make

luigizaninoni

24th October 2016, 16:58

That's because I changed some file names. To fix it:

make distclean
./configure
make

Works fine. Thanks

feisty2

18th November 2016, 21:02

satd is yet again broken..
broken like not even activated

import vapoursynth as vs
core = vs.get_core()

clp = rule6
clp = core.fmtc.bitdepth(clp,bits=16,fulls=False,fulld=True)
clp = core.std.ShufflePlanes(clp,0,vs.GRAY)

sup = core.mv.Super(clp)
bv1a = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=5,isb=True)
bv2a = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=5,isb=True)
bv3a = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=5,isb=True)
fv1a = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=5,isb=False)
fv2a = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=5,isb=False)
fv3a = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=5,isb=False)

bv1b = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=0,isb=True)
bv2b = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=0,isb=True)
bv3b = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=0,isb=True)
fv1b = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=0,isb=False)
fv2b = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=0,isb=False)
fv3b = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=0,isb=False)

clpa = core.mv.Degrain3(clp, sup, bv1a, fv1a, bv2a, fv2a, bv3a, fv3a,thscd1=16320,thsad=2000)
clpb = core.mv.Degrain3(clp, sup, bv1b, fv1b, bv2b, fv2b, bv3b, fv3b,thscd1=16320,thsad=2000)

clp = core.std.Expr([clpa,clpb],"x y - abs 10000 *")

clp.set_output()

got a blank black clip, which means dct=5 is doing SAD actually...
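
The reasoning behind the test: the last expression amplifies any per-sample difference between the two results, so a fully black output means the two Degrain3 outputs are identical. A plain-Python sketch of what `x y - abs 10000 *` does to each sample, assuming the result is clamped to the 16-bit range (the function name is made up for illustration):

```python
def amplified_diff(a, b, gain=10000, peak=65535):
    """Per-sample |a - b| * gain, clamped to the 16-bit peak value."""
    return [min(abs(x - y) * gain, peak) for x, y in zip(a, b)]


if __name__ == "__main__":
    print(amplified_diff([100, 200], [100, 200]))  # identical inputs
    print(amplified_diff([100, 200], [100, 201]))  # one sample off by one
```

Even a difference of a single code value becomes clearly visible, so if dct=5 really used a different metric than dct=0, some non-black pixels would be expected.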

feisty2

9th February 2017, 10:01

I added floating point support to MMask, you can back-port it to your branch and make your version of MMask work on higher precision if you want to.

https://github.com/IFeelBloated/vapoursynth-mvtools-sf/blob/master/src/MVMask.cpp

Boulder

15th February 2017, 17:21

I've been trying to track a weird problem in which VapourSynth Editor will slowly use up all the memory after refreshing the script enough times. I use a custom denoising function to process the videos.

I was able to find out that feeding an external clip to mv.Super causes a memory leak:

Core freed but 12 filter instances still exist
Core freed but 12 filter instances still exist
Core freed but 458496000 bytes still allocated in framebuffers
Core freed but 458496000 bytes still allocated in framebuffers

This is the part where it happens:

prefilt = core.dfttest.DFTTest(feed, tbsize=1, sigma=5, sigma2=5, sbsize=16, sosize=8)

pelmdg = core.fmtc.resample(clip=clp, scale=2, kernel='spline64', center=False)
pelprefilt = core.fmtc.resample(clip=prefilt, scale=2, kernel='spline64', center=False)

superanalyse = core.mv.Super(clp, pel=2, chroma=True, rfilter=4, pelclip=pelprefilt)
supermdg = core.mv.Super(clp, pel=2, chroma=True, rfilter=4, levels=1, pelclip=pelmdg)

One question: is it even sensible to use a denoised external super clip upsized with a sharp method? Or is it generally better to let the internal functions do things?

feisty2

15th February 2017, 17:55

I've been trying to track a weird problem in which VapourSynth Editor will slowly use up all the memory after refreshing the script enough times. I use a custom denoising function to process the videos.

I was able to find out that feeding an external clip to mv.Super causes a memory leak:

Core freed but 12 filter instances still exist
Core freed but 12 filter instances still exist
Core freed but 458496000 bytes still allocated in framebuffers
Core freed but 458496000 bytes still allocated in framebuffers

This is the part where it happens:

prefilt = core.dfttest.DFTTest(feed, tbsize=1, sigma=5, sigma2=5, sbsize=16, sosize=8)

pelmdg = core.fmtc.resample(clip=clp, scale=2, kernel='spline64', center=False)
pelprefilt = core.fmtc.resample(clip=prefilt, scale=2, kernel='spline64', center=False)

superanalyse = core.mv.Super(clp, pel=2, chroma=True, rfilter=4, pelclip=pelprefilt)
supermdg = core.mv.Super(clp, pel=2, chroma=True, rfilter=4, levels=1, pelclip=pelmdg)

no, it should be

superanalyse = core.mv.Super(prefilt, pel=2, chroma=True, rfilter=4, pelclip=pelprefilt)

One question: is it even sensible to use a denoised external super clip upsized with a sharp method? Or is it generally better to let the internal functions do things?

the difference is very small in general, but sensible, NNEDI is (kind of) noticeably better than internal functions.

Boulder

15th February 2017, 18:40

no, it should be

superanalyse = core.mv.Super(prefilt, pel=2, chroma=True, rfilter=4, pelclip=pelprefilt)

Sorry, got that the wrong way around when investigating. The reference clip is 'prefilt' in the function itself :)

the difference is very small in general, but sensible, NNEDI is (kind of) noticeably better than internal functions.

OK, I think I'll keep things intact for now.

jackoneill

15th February 2017, 20:41

I've been trying to track a weird problem in which VapourSynth Editor will slowly use up all the memory after refreshing the script enough times. I use a custom denoising function to process the videos.

I was able to find out that feeding an external clip to mv.Super causes a memory leak:

Yes, there is a memory leak in Super. Does this DLL (http://savedonthe.net/download/1046/vapoursynth-mvtools-leaky-win64.html) work better?

Boulder

16th February 2017, 16:48

Works great, no more leaks :) Thanks a lot!

Pat357

12th April 2017, 22:57

jackoneill : the fixed libmvtools.dll (fixed mem-leak) has some unwanted dependencies like libgcc_s_seh1.dll and libstdc++6.dll.
Because I don't know what the fix is for the mem-leak and on github ( https://github.com/dubhater/vapoursynth-mvtools ) I do not see any
fix for the mem-leak either, recompiling from the unfixed same source would not make much sense.

Could you please make a fixed (no-mem-leak) version for it ?

jackoneill

14th April 2017, 14:41

jackoneill : the fixed libmvtools.dll (fixed mem-leak) has some unwanted dependencies like libgcc_s_seh1.dll and libstdc++6.dll.
Because I don't know what the fix is for the mem-leak and on github ( https://github.com/dubhater/vapoursynth-mvtools ) I do not see any
fix for the mem-leak either, recompiling from the unfixed same source would not make much sense.

Could you please make a fixed (no-mem-leak) version for it ?

Sorry about that. I totally forgot.

Here is v18. (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v18)

* Super: Fix memory leak when pelclip is used.

Pat357

24th April 2017, 22:21

In the MVtools doc is:
"Block sizes of 64x32, 64x64, 128x64, and 128x128 are supported."

Are smaller specified blksizes (like blksize=16) just ignored ? ..as I get no error.
Why only supporting the bigger blocks ?
Would you consider supporting also blksizes=16x16, 32x16, 32x32 or even 8x8, 16x8 ?

MonoS

24th April 2017, 22:30

In the MVtools doc is:
"Block sizes of 64x32, 64x64, 128x64, and 128x128 are supported."

Are smaller specified blksizes (like blksize=16) just ignored ? ..as I get no error.
Why only supporting the bigger blocks ?
Would you consider supporting also blksizes=16x16, 32x16, 32x32 or even 8x8, 16x8 ?

They are supported, the documentation only states the differences between the vapoursynth version and the original avisynth one.

jackoneill

5th June 2017, 21:16

satd is yet again broken..
broken like not even activated

import vapoursynth as vs
core = vs.get_core()

clp = rule6
clp = core.fmtc.bitdepth(clp,bits=16,fulls=False,fulld=True)
clp = core.std.ShufflePlanes(clp,0,vs.GRAY)

sup = core.mv.Super(clp)
bv1a = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=5,isb=True)
bv2a = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=5,isb=True)
bv3a = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=5,isb=True)
fv1a = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=5,isb=False)
fv2a = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=5,isb=False)
fv3a = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=5,isb=False)

bv1b = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=0,isb=True)
bv2b = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=0,isb=True)
bv3b = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=0,isb=True)
fv1b = core.mv.Analyse(sup,delta=1,blksize=32,overlap=16,search=3,dct=0,isb=False)
fv2b = core.mv.Analyse(sup,delta=2,blksize=32,overlap=16,search=3,dct=0,isb=False)
fv3b = core.mv.Analyse(sup,delta=3,blksize=32,overlap=16,search=3,dct=0,isb=False)

clpa = core.mv.Degrain3(clp, sup, bv1a, fv1a, bv2a, fv2a, bv3a, fv3a,thscd1=16320,thsad=2000)
clpb = core.mv.Degrain3(clp, sup, bv1b, fv1b, bv2b, fv2b, bv3b, fv3b,thscd1=16320,thsad=2000)

clp = core.std.Expr([clpa,clpb],"x y - abs 10000 *")

clp.set_output()

got a blank black clip, which means dct=5 is doing SAD actually...

It is indeed doing SAD. I accidentally made dct=5..10 behave like dct=0 in v17. I think v19 will happen soon.

jackoneill

7th June 2017, 20:37

v19 is here. (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v19)

* Super: Fix small bug in SSE2 code used with rfilter=3 and 8 bit input.
* Super: Fix bug in SSE2 code used with sharp=0 and 8 bit input.
* Analyse, Recalculate: Fix bug that made dct=5..10 behave like dct=0 (bug introduced in v17).
* Store SAD in 64 bit integers instead of 32 bit integers. This is required
because YUV444P16 video with 128x128 blocks could produce SADs
too large for 32 bit integers. Motion vectors produced by v18 or older
will not work with this version. Only users who stored the motion vectors
from Analyse/Recalculate on disk have to worry about this.
* Degrains: Put an upper limit on the legal values of thsad/thsadc to avoid
an overflow. The exact value of the limit depends on bit depth,
subsampling, and block size. It's probably fairly high.
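
The overflow mentioned in the 64 bit SAD item can be checked with a bit of arithmetic. The sketch below assumes the block SAD is a plain sum over the luma block and both chroma blocks, with every sample differing by the full range; the helper name is hypothetical, and the subsampling arguments are log2 values as in VapourSynth's format descriptions.

```python
INT32_MAX = 2**31 - 1


def max_block_sad(blk_w, blk_h, bits, subsampling_w, subsampling_h):
    """Worst-case SAD of one block: luma plus two chroma planes."""
    peak = (1 << bits) - 1
    luma = blk_w * blk_h * peak
    chroma = 2 * (blk_w >> subsampling_w) * (blk_h >> subsampling_h) * peak
    return luma + chroma


if __name__ == "__main__":
    worst_444 = max_block_sad(128, 128, 16, 0, 0)  # YUV444P16
    worst_420 = max_block_sad(128, 128, 16, 1, 1)  # YUV420P16
    print(worst_444, worst_444 > INT32_MAX)
    print(worst_420, worst_420 > INT32_MAX)
```

Under these assumptions only the 4:4:4, 16 bit, 128x128 case exceeds the 32 bit signed range, which matches the case named in the changelog.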

MonoS

14th June 2017, 21:07

Hi, using a script like that

import vapoursynth as vs
import nnedi3_resample as edi
import havsfunc as has

core = vs.get_core()

def Denoise2(src, denoise, blksize, fast, truemotion):
    overlap = int(blksize / 2)
    pad = blksize + overlap

    src = core.fmtc.resample(src, src.width+pad, src.height+pad, sw=src.width+pad, sh=src.height+pad, kernel="point")

    super = core.mv.Super(src)

    rep = has.DitherLumaRebuild(src, s0=1)
    superRep = core.mv.Super(rep)

    bvec2 = core.mv.Analyse(superRep, isb = True, delta = 2, blksize=blksize, overlap=overlap, truemotion=truemotion)
    bvec1 = core.mv.Analyse(superRep, isb = True, delta = 1, blksize=blksize, overlap=overlap, truemotion=truemotion)
    fvec1 = core.mv.Analyse(superRep, isb = False, delta = 1, blksize=blksize, overlap=overlap, truemotion=truemotion)
    fvec2 = core.mv.Analyse(superRep, isb = False, delta = 2, blksize=blksize, overlap=overlap, truemotion=truemotion)

    fin = core.mv.Degrain2(src, super, bvec1,fvec1,bvec2,fvec2, denoise)

    fin = core.std.CropRel(fin, 0, pad, 0, pad)

    return fin

src = core.lsmas.LWLibavSource("").fmtc.bitdepth(bits=16)

den = Denoise2(src, 200, blksize=16, fast=False, truemotion=False)

den.set_output()

I get poor denoising in bright areas of the image. This doesn't happen when I first upscale the clip to 444 using
res = edi.nnedi3_resample(src, src.width ,src.height, sigmoid=True, invks=True, csp=vs.YUV444P16, curves="709")
I've checked the luma plane and there are no differences from the original (tried with a makediff and an fmtc.histluma() call).

Is this a "known issue" or am i doing something wrong?

VS_Fan

15th June 2017, 01:26

Is this a "known issue" or am i doing something wrong?
It is most probably related to the luma:chroma SAD ratio weighting.

Pinterf explained it in this mvtools for avisynth forum post (https://forum.doom9.org/showthread.php?p=1806028#post1806028). He also recently released a version (2.7.18.22 – 2017-05-12) of mvtools for avisynth with a new parameter "scaleCSAD" to fine tune this ratio. See this forum post (https://forum.doom9.org/showthread.php?p=1806836#post1806836)

Both in avisynth and vapoursynth, instead of converting to 444, you could filter (mdegrain, etc) planes separately, and then combine them together again with ShufflePlanes.

MonoS

15th June 2017, 18:40

It is most probably related to the luma:chroma SAD ratio weighting.

Pinterf explained it in this mvtools for avisynth forum post (https://forum.doom9.org/showthread.php?p=1806028#post1806028). He also recently released a version (2.7.18.22 – 2017-05-12) of mvtools for avisynth with a new parameter "scaleCSAD" to fine tune this ratio. See this forum post (https://forum.doom9.org/showthread.php?p=1806836#post1806836)

I took a look at those two posts and i can't understand how the thsad of the chroma planes should influence the luma plane.

Both in avisynth and vapoursynth, instead of converting to 444, you could filter (mdegrain, etc) planes separately, and then combine them together again with ShufflePlanes.
Whether I denoise by setting the planes parameter to 0 or by using ShufflePlanes to denoise only the luma, I get the same poor performance in bright spots.

I should also mention that i'm talking about the luma plane, chroma planes, afaik are ok.

EDIT: i'll send a sample asap

VS_Fan

16th June 2017, 03:31

I took a look at those two posts and i can't understand how the thsad of the chroma planes should influence the luma plane.
It’s not the thSAD parameter (well, not only). It’s related to the mvtools internal SAD calculations made during ‘analyze’ or ‘recalculate’ to find the motion vectors. The luma:chroma weighting is:

4:2 for YV12 (4:2:0 subsampling)
4:4 for YV16 (4:2:2 subsampling)
4:8 for YV24 (4:4:4 subsampling)
That means: with 444 subsampling mvtools will base its calculations on twice as much chroma data as luma data. That’s why you get “cleaner” results. The chroma planes typically have less noise than the luma plane. So, with such a low value for thSAD (denoise=200), mvtools picks the right vectors more easily for the chroma planes than it can with luma data.
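
The ratios listed above follow directly from counting samples. A small sketch, assuming the ratio is simply the number of luma samples in a 2x2 group versus the number of chroma samples (both planes together) covering the same area; the helper name is made up, and the arguments are log2 subsampling factors.

```python
def luma_chroma_ratio(subsampling_w, subsampling_h):
    """Return (luma samples, chroma samples) for a 2x2 group of luma pixels."""
    luma = 4
    chroma_per_plane = 4 >> (subsampling_w + subsampling_h)
    return luma, 2 * chroma_per_plane  # two chroma planes (U and V)


if __name__ == "__main__":
    for name, ssw, ssh in (("4:2:0", 1, 1), ("4:2:2", 1, 0), ("4:4:4", 0, 0)):
        print(name, luma_chroma_ratio(ssw, ssh))
```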

I saw three ways to improve your script:

The ‘denoise’ value (thSAD parameter) is half of the default value. Leave it at default (400)
You are resizing the clip prior to processing with mvtools, with the very basic ‘point’ kernel, and at the end you are cropping the borders. You should avoid that. Use the hpad & vpad parameters for super instead. And if you really want to crop the borders, you can resize after mdegrain with a better kernel and then crop.
DitherLumaRebuild “allows tweaking for pumping up the darks” (comment by the author). This may be leading you to oversaturate the bright areas. Try without it. You could use some other prefilter.

Like this:

def Denoise2(src, denoise, blksize, fast, truemotion):
    overlap = int(blksize / 2)
    pad = blksize #+ overlap

    #src = core.fmtc.resample(src, src.width+pad, src.height+pad, sw=src.width+pad, sh=src.height+pad, kernel="point")

    super = core.mv.Super(src, hpad=pad, vpad=pad)

    #rep = has.DitherLumaRebuild(src, s0=1)
    # Optional - Some other prefilter:
    rep = core.dfttest.DFTTest(clip=src, tbsize=1, sigma=2.0)
    superRep = core.mv.Super(rep, hpad=pad, vpad=pad)

    bvec2 = core.mv.Analyse(superRep, isb = True, delta = 2, blksize=blksize, overlap=overlap, truemotion=truemotion)
    bvec1 = core.mv.Analyse(superRep, isb = True, delta = 1, blksize=blksize, overlap=overlap, truemotion=truemotion)
    fvec1 = core.mv.Analyse(superRep, isb = False, delta = 1, blksize=blksize, overlap=overlap, truemotion=truemotion)
    fvec2 = core.mv.Analyse(superRep, isb = False, delta = 2, blksize=blksize, overlap=overlap, truemotion=truemotion)

    fin = core.mv.Degrain2(src, super, bvec1,fvec1,bvec2,fvec2, denoise)

    #fin = core.std.CropRel(fin, 0, pad, 0, pad)

    return fin

src = core.lsmas.LWLibavSource("").fmtc.bitdepth(bits=16)

den = Denoise2(src, 400, blksize=16, fast=False, truemotion=False)

den.set_output()

MonoS

16th June 2017, 22:39

So, if I understand correctly, those weightings are used by the analyse function to search for the proper motion vector, and doing the analysis in 444 changes some values and "improves" the denoising on the luma plane. Am I correct?

Regarding your suggestion:
I usually use thsad values from around 150 up to 500 to obtain different levels of denoising; 200, for me, is low-to-mid denoising.
I'm not resizing the clip, I'm simply padding it. When I did extensive tests some years ago, I noticed bad denoising on the bottom and right edges, even with padding, so I started to pad my clips myself. This method achieved very nice results.
AFAIK DitherLumaRebuild is commonly used for prefiltering the clip before doing motion analysis. I found this trick in one of cretindesalpes' posts and in the VS QTGMC port.

Anyway, I think you are right to suggest stronger denoising: using 400 on the 420 clip, I obtain similar results in those areas with weak denoising. But I would prefer to avoid using such a strong (in my opinion) thsad.

VS_Fan

17th June 2017, 18:33

So, if I understand correctly, those weightings are used by the analyse function to search for the proper motion vector, and doing the analysis in 444 changes some values and "improves" the denoising on the luma plane. Am I correct?
Right, but resampling to 444 doesn’t necessarily “improve” denoising. It just gives different results: For YUV colorspaces the amount of data used to represent luma and chroma for any pixel in each frame depends on the chroma subsampling (https://en.wikipedia.org/wiki/Chroma_subsampling). Mvtools’ analyze filter uses all data for each pixel to construct the blocks, unless you specify chroma=False.

From the mvtools doc at avisynth’s site (http://avisynth.nl/index.php/MVTools#About_MVTools): At analysis stage plugin divides frames by small blocks and try to find for every block in current frame the most similar (matching) block in second frame (previous or next). The relative shift of these blocks is motion vector. The main measure of block similarity is sum of absolute differences (SAD) of all pixels of these two blocks compared. SAD is a value which says how good the motion estimation was.
I usually use thsad values from around 150 up to 500 to obtain different levels of denoising; 200, for me, is low-to-mid denoising.
This is my personal preference: For my 4:2:2 video sources I process each plane separately. I consider luma and chroma very different animals, so I tweak the corresponding thSAD and even thscd1 & thscd2 to lower values for chroma.
I'm not resizing the clip, I'm simply padding it. When I did extensive tests some years ago, I noticed bad denoising on the bottom and right edges, even with padding, so I started to pad my clips myself. This method achieved very nice results.
I can see now. There could have been a bug in earlier versions of the plugin, but you don’t need to do that any more.
Anyway, I think you are right to suggest stronger denoising: using 400 on the 420 clip, I obtain similar results in those areas with weak denoising. But I would prefer to avoid using such a strong (in my opinion) thsad.
You could try mdegrain1, which will risk a lot less detail destruction while you use larger values for thsad.
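
To make the SAD description quoted above concrete, here is a toy block-matching sketch in plain Python. It is an exhaustive search over a small window on a single plane, which is far simpler than what MVTools actually does (no chroma, no hierarchy, no predictors, no penalties); all names are made up for illustration.

```python
def block_sad(cur, ref, bx, by, dx, dy, size):
    """SAD between the block at (bx, by) in cur and the block shifted by (dx, dy) in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_vector(cur, ref, bx, by, size, radius):
    """Try every shift within +/- radius and return (dx, dy, sad) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that would read outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            sad = block_sad(cur, ref, bx, by, dx, dy, size)
            if best is None or sad < best[2]:
                best = (dx, dy, sad)
    return best
```

The SAD stored with the winning vector is what thSAD is later compared against: blocks whose best match still has a high SAD contribute less (or not at all) to the degrained result.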

MonoS

19th June 2017, 22:16

So what may be happening is this: after upscaling the chroma planes, mvtools thinks that less of the image has changed, because we have 2*2 chroma pixels that are very similar, so the same thsad results in more similar blocks and therefore stronger denoising. Am I right?

feisty2

20th June 2017, 16:18

I understand now why you rewrote this thing like, entirely...
the code was full of weird bullshit and insanely fucking stupid stuff..

I managed to upgrade FakeBlockData and FakePlaneOfBlocks to normal C++14 but got stuck at FakeGroupOfPlanes
there's this bloody "update()" function throughout MVTools code,

it's like

void FakeBlockData::Update(const int *array) {
    vector.x = array[0];
    vector.y = array[1];
    vector.sad = array[2];
}

in FakeBlockData and I recoded it to

auto Update(const VectorStructure *NewVectorPointer) {
    Vector = *NewVectorPointer;
}

and in FakePlaneOfBlocks

void FakePlaneOfBlocks::Update(const int *array) {
    array += 0;
    for (int i = 0; i < nBlkCount; i++) {
        blocks[i].Update(array);
        array += N_PER_BLOCK;
    }
}

I, again recoded it like

auto Update(const void *VectorStream) {
    auto StreamCursor = reinterpret_cast<const VectorStructure *>(VectorStream);
    for (auto i = 0; i < nBlkCount; ++i) {
        blocks[i].Update(StreamCursor);
        ++StreamCursor;
    }
}

.....

and there's one in FakeGroupOfPlanes like

void FakeGroupOfPlanes::Update(const int *array) {
    const int *pA = array;
    validity = GetValidity(array);
    pA += 2;
    for (int i = nLvCount_ - 1; i >= 0; i--)
        pA += pA[0];
    pA++;
    pA = array;
    pA += 2;
    for (int i = nLvCount_ - 1; i >= 0; i--) {
        planes[i]->Update(pA + 1);
        pA += pA[0];
    }
}

I mean like, dude, what the fuck??? this one is 11 out of 10 kinda wicked fucked up, all that weird abnormal pointer arithmetics with "pA" makes it impossible to recode...
I guess you should know all about that wicked update() function cuz you once converted MVTools to C
could you help me with this and explain that "FakeGroupOfPlanes::Update()", please?

jackoneill

20th June 2017, 16:52

and there's one in FakeGroupOfPlanes like

void FakeGroupOfPlanes::Update(const int *array) {
    const int *pA = array;
    validity = GetValidity(array);
    pA += 2;
    for (int i = nLvCount_ - 1; i >= 0; i--)
        pA += pA[0];
    pA++;
    pA = array;
    pA += 2;
    for (int i = nLvCount_ - 1; i >= 0; i--) {
        planes[i]->Update(pA + 1);
        pA += pA[0];
    }
}

I mean like, dude, what the fuck??? this one is 11 out of 10 kinda wicked fucked up, all that weird abnormal pointer arithmetics with "pA" makes it impossible to recode...
I guess you should know all about that wicked update() function cuz you once converted MVTools to C
could you help me with this and explain that "FakeGroupOfPlanes::Update()", please?

Well, see, half that function is redundant: https://github.com/dubhater/vapoursynth-mvtools/commit/9c89c7f919a90c16195ca55f9e963a13e02d2912#diff-6195dbc5362ec3b3fc298d79f802d212L110

feisty2

20th June 2017, 17:10

something I failed to understand

const int *pA = array + 2;

what's that "+2"? some kind of offset value? will it be affected if I change the structure of vector stream? like if I change "sad" in the vector to double?

and this,

pA += pA[0];

I suppose it should be something like

constexpr auto StreamHeaderOffset = 2;
auto pA = reinterpret_cast<const VectorStructure *>(array + StreamHeaderOffset);
auto MoveOnToTheNextVector = [](auto &VectorPointer) {
    constexpr auto AbsoluteVectorSize = sizeof(std::decay_t<decltype(*VectorPointer)>);
    constexpr auto RelativeVectorSize = AbsoluteVectorSize / sizeof(int);
    auto ForwardDistance = VectorPointer->x / RelativeVectorSize;
    VectorPointer += ForwardDistance;
};
MoveOnToTheNextVector(pA);

?

and that's why I hate C and old C++ so much cuz it's like fucking deciphering assembly code, what's so hard about defining weird constants with constexpr variables with proper names and writing some nested closure functions to tell others what the hell you're doing exactly?

jackoneill

20th June 2017, 18:06

something I failed to understand

const int *pA = array + 2;

what's that "+2"? some kind of offset value? will it be affected if I change the structure of vector stream? like if I change "sad" in the vector to double?

and this,

pA += pA[0];

This is what Analyse and Recalculate attach to each frame they return, and what the "array" parameter points to:

int total_size; // Size of the entire thing, i.e. the last int you may access is array[total_size - 1]
int validity; // 0 if the frame is too close to the beginning or end of the clip, otherwise 1
int first_level_size;
VECTOR first_level_vectors[first_level_size];
int second_level_size;
VECTOR second_level_vectors[second_level_size];
...
int last_level_size;
VECTOR last_level_vectors[last_level_size];
int divided_extra_level_size; // may not exist
VECTOR divided_extra_level_vectors[divided_extra_level_size]; //may not exist

So that +2 skips over the total size and validity. And then pA += pA[0] skips over the "current" level.

You may have noticed that all those size fields store numbers of ints, rather than numbers of bytes. This is probably because everything in there used to be an int (the sizes, validity, VECTOR's members, other things that used to be stored there). This is weird already, and it would have become weirder when I made the type of VECTOR::sad int64_t, so since v19 all these sizes store numbers of bytes.
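
The layout described above can be walked in a few lines of plain Python. This sketch models the older int-based layout (sizes counted in ints, three ints per vector: x, y, sad, as in the FakeBlockData::Update code quoted earlier). It assumes each level's size field counts itself, which is what `planes[i]->Update(pA + 1); pA += pA[0];` implies, and it ignores the optional divided extra level. The function name is made up for illustration.

```python
def parse_vector_stream(array, level_count, ints_per_vector=3):
    """Split an int-based vector stream into (validity, list of levels)."""
    validity = array[1]          # array[0] is the total size in ints
    cursor = 2                   # skip total_size and validity
    levels = []
    for _ in range(level_count):
        level_size = array[cursor]                      # includes the size field itself
        payload = array[cursor + 1:cursor + level_size]  # the vectors of this level
        vectors = [tuple(payload[i:i + ints_per_vector])
                   for i in range(0, len(payload), ints_per_vector)]
        levels.append(vectors)
        cursor += level_size                            # same as pA += pA[0]
    return bool(validity), levels


if __name__ == "__main__":
    # Two levels: the first with two vectors, the second with one.
    stream = [13, 1,
              7, 1, 2, 30, 3, 4, 40,
              4, 5, 6, 50]
    print(parse_vector_stream(stream, level_count=2))
```

The levels come out in stream order; in the C++ code the first level in the stream is assigned to planes[nLvCount_ - 1] and the last one to planes[0].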

feisty2

20th June 2017, 20:00

thanks for the detailed explanation, I can now finally reshape that weird piece of shit into something readable...

auto Update(const std::int32_t *VectorStream) {
    constexpr auto StreamHeaderOffset = 2;
    auto StreamCursor = VectorStream + StreamHeaderOffset;
    auto GetValidity = [&]() {
        return VectorStream[1] == 1;
    };
    auto UpdateVectorsForEachLevel = [&](auto Level) {
        constexpr auto LevelHeaderOffset = 1;
        auto LevelLength = StreamCursor[0];
        auto CalibratedStreamCursor = reinterpret_cast<const VectorStructure *>(StreamCursor + LevelHeaderOffset);
        planes[Level]->Update(CalibratedStreamCursor);
        StreamCursor += LevelLength;
    };
    validity = GetValidity();
    for (auto Level = nLvCount_ - 1; Level >= 0; --Level)
        UpdateVectorsForEachLevel(Level);
}

guess Imma stick to the int-based size for now cuz I don't want no extra trouble...

just for comparison, this was the original version

void FakeGroupOfPlanes::Update(const int *array) {
    const int *pA = array;
    validity = GetValidity(array);
    pA += 2;
    for (int i = nLvCount_ - 1; i >= 0; i--)
        pA += pA[0];
    pA++;
    pA = array;
    pA += 2;
    for (int i = nLvCount_ - 1; i >= 0; i--) {
        planes[i]->Update(pA + 1);
        pA += pA[0];
    }
}

now you see why I said modern C++ is python with pointers :p

Boulder

9th July 2018, 12:35

Would it be possible to have the 'star' motion search method from x265 included in MVTools?

jackoneill

9th July 2018, 16:02

Would it be possible to have the 'star' motion search method from x265 included in MVTools?

It probably is.

But instead here is v20 with a small bug fix. (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v20)

* Fix green edges in the output of FlowBlur, FlowFPS, BlockFPS when pelclip is used and pel=2 (bug introduced in v12).

Boulder

9th July 2018, 16:25

It probably is.

But instead here is v20 with a small bug fix. (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v20)

Thank you for even considering it, and thanks for the fix :) It's always nice to see plugins being maintained.

edcrfv94

31st July 2018, 05:36

Does VapourSynth MVTools not support DegrainN? (Some scripts need tr=6.)

Wolfberry

31st July 2018, 06:32

1. Binary Part: Extended Degrain to Degrain24 (24, it's my lucky number!)
2. Resurrected vmulti features from MVTools 2.6.0.5, implemented via a python module, "tr" works up to 24, guess no one will ever use a time radius > 24.... maybe?
3. Resurrected StoreVect and RestoreVect from MVTools 2.6.0.5, implemented via a python module

vmulti demos:
1. DegrainN

import vapoursynth as vs
import mvmulti
core = vs.core
sup = core.mvsf.Super(clp)
vec = mvmulti.Analyze(sup,tr=6,blksize=8,overlap=4)
vec = mvmulti.Recalculate(sup,vec,tr=6,blksize=4,overlap=2)
clp = mvmulti.DegrainN(clp, sup, vec, tr=6)
clp.set_output()

DegrainN is available in mvsf (mvtools single precision), the python module can be obtained here (https://github.com/IFeelBloated/vapoursynth-mvtools-sf/blob/master/src/mvmulti.py)
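
For context on why a wrapper is convenient: a temporal radius of tr means one backward and one forward vector clip for every delta from 1 to tr, just like the six Analyse calls in the Degrain3 script earlier in the thread. A plain-Python sketch of that bookkeeping follows; it is not taken from mvmulti.py, and the function name and ordering are only illustrative.

```python
def vectors_needed(tr):
    """List the (isb, delta) combinations a degrain with temporal radius tr needs."""
    return [(isb, delta)
            for delta in range(1, tr + 1)
            for isb in (True, False)]  # backward first, then forward


if __name__ == "__main__":
    combos = vectors_needed(6)
    print(len(combos))
    print(combos[:4])
```

For tr=6 that is twelve separate analyses, which is tedious to write out by hand.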