View Full Version : MVTools, Depan, DepanEstimate for VapourSynth


Tarutaru
5th April 2016, 18:56
It's fixed now.

1952 = 1920 + 16 + 16 (padding). The strange heights are due to the downscaled copies made for motion estimation. Look at the output of mv.Super if you're curious.

Thanks, works perfectly.

jackoneill
8th April 2016, 20:24
v13 (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v13) fixes the two bugs reported since v12.

Boulder
9th April 2016, 09:23
Thanks for the new build :)

Elegant
15th April 2016, 01:57
Was just porting a script and got an error about mv.Mask not supporting 16-bit. GitHub mentions that all functions support up to 16-bit. Bug?


MOMask1 = core.mv.Mask(clip=Input, vectors=FV1, kind=1, ml=2)
File "src\cython\vapoursynth.pyx", line 1383, in vapoursynth.Function.__call__ (src\cython\vapoursynth.c:25204)
vapoursynth.Error: Mask: input clip must be GRAY8, YUV420P8, YUV422P8, YUV440P8, or YUV444P8, with constant dimensions.

jackoneill
15th April 2016, 09:29
No, it's true. I never got around to it.

groucho86
26th April 2016, 18:16
Hi everyone,
I'm running VapourSynth R32 and MVTools V13 on a Mac (10.9.5).

I'm trying to frame-convert archival from 29.97i to 23.976p.

About two-thirds of the way through (regardless of the actual length of the video input), the video-output freezes and stays frozen until the end of the length of the video.

I'm dealing primarily with the ProRes material, but I tried an h.264 file and had the same problem. This happens with both lsmas and ffms2.

I did a quick test on a Windows 7 machine and had the same problem.

My vpy script looks like this:
import vapoursynth as vs
import havsfunc as haf
import mvsfunc as mvf
import mvmulti

core = vs.get_core()

core.std.LoadPlugin("/usr/local/lib/libmvtoolssf.dylib")
core.std.LoadPlugin("/usr/local/lib/libfmtconv.dylib")
core.std.LoadPlugin("/usr/local/lib/libscenechange.dylib")
core.std.LoadPlugin("/usr/local/lib/libtemporalsoften2.dylib")
core.std.LoadPlugin("/usr/local/lib/libffms2.4.dylib")

clip = core.ffms2.Source(source="/Volumes/NA_3_TEMP/2997-V1-0061_DWD2014FL-CAM1_8.mov")

#clip = core.lsmas.LibavSMASHSource(source="/Volumes/NA_3_TEMP/2997-V1-0061_DWD2014FL-CAM1_8.mov")
#clip = core.std.AssumeFPS(clip, fpsnum=30000, fpsden=1001)
#clip = core.std.Trim(clip, first=0, last=100)


super = core.mv.Super(clip)
backward_vec = core.mv.Analyse(super, isb = True)
forward_vec = core.mv.Analyse(super, isb = False)
clip = core.mv.FlowFPS(clip, super, backward_vec, forward_vec, num=24000, den=1001, ml=100)

clip.set_output()

I'm using ffmpeg (3.0.1) to output to ProRes:
vspipe /Users/jr/test2.vpy --y4m - | ffmpeg -f yuv4mpegpipe -i - -c:v prores -y -profile:v 2 -an "/Users/jr/export/2398export.mov"

Piping to MPV presented the same issue.

I tried QTGMC and it successfully created a 59.94p video. What am I doing incorrectly with MVTools?

Thank you for your help!

jackoneill
26th April 2016, 20:10
What am I doing incorrectly with MVTools?


Nothing. It was a bug in this fork of MVTools. I pushed the fix. Thanks for reporting it.

groucho86
26th April 2016, 22:39
Confirming that that fixed the issue. Thank you!

jackoneill
3rd June 2016, 18:18
A preview of v14, for anyone interested in testing the ports of MDepan, DepanEstimate, Depan, and DepanStabilize:
https://ulozto.net/x2Waehra/vapoursynth-mvtools-v13-win32-7z
https://ulozto.net/xtMUJakd/vapoursynth-mvtools-v13-win64-7z


mv.DepanAnalyse(clip clip, clip vectors[, clip mask, bint zoom=True, bint rot=True, float pixaspect=1.0, float error=15.0, bint info=False, float wrong=10.0, float zerow=0.05, int thscd1=400, int thscd2=130, bint fields=False, bint tff])

mv.DepanEstimate(clip clip[, float trust=4.0, int winx=0, int winy=0, int wleft=-1, int wtop=-1, int dxmax=-1, int dymax=-1, float zoommax=1.0, float stab=1.0, float pixaspect=1.0, bint info=False, bint show=False, bint fields=False, bint tff])

mv.DepanCompensate(clip clip, clip data[, float offset=0.0, int subpixel=2, float pixaspect=1.0, bint matchfields=True, int mirror=0, int blur=0, bint info=False, bint fields=False, bint tff])

mv.DepanStabilise(clip clip, clip data[, float cutoff=1.0, float damping=0.9, float initzoom=1.0, bint addzoom=False, int prev=0, int next=0, int mirror=0, int blur=0, float dxmax=60.0, float dymax=30.0, float zoommax=1.05, float rotmax=1.0, int subpixel=2, float pixaspect=1.0, int fitlast=0, float tzoom=3.0, bint info=False, int method=0, bint fields=False])


The latter three don't exactly belong in MVTools, but they're sort of related to MDepan, and DepanEstimate uses FFTW as well, and this is convenient, so here they are.

I haven't yet figured out how to make DepanCompensate and DepanStabilise work with anything other than YUV420P8.

All four filters are multithreaded. I'd be very grateful for some speed comparisons with the corresponding Avisynth plugins, both with a single thread and with many, e.g. 4+.

VS_Fan
4th June 2016, 07:04
A preview of v14, for anyone interested in testing the ports of MDepan, DepanEstimate, Depan, and DepanStabilize:
Thank you, this is great! The mjpeg videos from my old still camera will surely be good experimental subjects :D. I will happily do some tests over the week.

I haven't yet figured out how to make DepanCompensate and DepanStabilise work with anything other than YUV420P8.
I hope you can. These videos are originally YUV422P8 @ 30fps, and I would like to encode them with x264 as YUV422P10.

Boulder
4th June 2016, 08:30
I hope you can, since these videos are originally YUV422P8 @30fps, I would like to encode them with x264 @ YUV422P10
You can encode them in 10-bit depth even though the source you feed to x264 is in 8 bits.

feisty2
5th June 2016, 07:47
v2.5.11.21 (22.04.2016 by Fizick)
•MflowXXX: remove limit of motion vectors length (was 127/pel).


any plan to merge this?

jackoneill
5th June 2016, 10:07
any plan to merge this?

Yes, eventually.

feisty2
5th June 2016, 12:32
Yes, eventually.

Okay... And "eventually" sounds kinda low priority, and I think it should be sort of high priority cuz it's like a nasty bug

MonoS
5th June 2016, 17:09
Okay... And "eventually" sounds kinda low priority, and I think it should be sort of high priority cuz it's like a nasty bug

Can you post the commit? I'll code this for both my repo and dubhater's.

amayra
5th June 2016, 23:57
can you do something about SVPFlow ?

feisty2
6th June 2016, 00:39
can you do something about SVPFlow ?

I think the svp guys already did it

VS_Fan
6th June 2016, 06:36
The results of my tests: encoding a "DepanStabilized" 3600-frame (2 min) 640x480 clip @ 30fps, using x264 on my laptop (Core i5 3rd gen, Win10).

DepanEstimate                       fps
AVS ST                              27.77
AVS MT                              28.18
VPY 32b                             27.63
VPY 64b                             30.32

MDepan (AVS) / DepanAnalyse (VS)    fps
AVS ST                              17.68
AVS MT                              21.55
VPY 32b                             22.45
VPY 64b                             25.02


DepanEstimate:
• It is faster than MDepan / DepanAnalyse in both Avisynth and VapourSynth, but I don't get rotation adjustments.
• In these tests, the Avisynth filters were marginally faster than your 32-bit VS port; the 64-bit VS build was the fastest.
MDepan (AVS) / DepanAnalyse (VS):
• Your VS port is noticeably faster, and the results are practically identical.
• I prefer the DepanAnalyse results over those of DepanEstimate; the motion adjustments "feel" smoother to me.

My Scripts for reference:
AVS:
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\Probar\RgTools.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\Probar\DePan.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\Probar\DePanEstimate.dll")
#LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\Probar\mvtools-v2.5.11.22\mvtools2.dll")
LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\Probar\mvtools-2.5.11.9-svp\mvtools2.dll")
SetMTMode(3,4)
LWLibavVideoSource("F:\TEMPX\CANON\MVI_2038.AVI", format="YUV420P8", fpsnum=30, fpsden=1, seek_threshold=60)
#Preroll(60)
original = last
SetMTMode(2)
pf = original.RemoveGrain(mode=4)
div = 2
pf = pf.BilinearResize(Width(original)/div,Height(original)/div)
pf = pf.RemoveGrain(mode=4)
pf = pf.BilinearResize(Width(original) ,Height(original))
super = MSuper(pf)
vectors = MAnalyse(super, isb = false)
#return mshow(super, vectors, showsad=true)
globalmotion = MDepan(pf, vectors, pixaspect=1.0, thSCD1=800, error=30.0, range=0) #, info=true) #, thSCD2=171
#globalmotion = DePanEstimate()
#return globalmotion
DepanStabilize(original, data=globalmotion, cutoff=0.33, pixaspect=1.0, method=1, zoommax=1.0, rotmax=10.0) #, mirror=15

return last


VPY:
import vapoursynth as vs
core = vs.get_core()

ret = core.lsmas.LWLibavSource(source=r"F:\TEMPX\CANON\MVI_2038.AVI", format="YUV420P8", fpsnum=30, fpsden=1)
pf = core.rgvs.RemoveGrain(clip=ret, mode=4)
div = 2
pf = core.resize.Bilinear(clip=pf, width=int(pf.width/div), height=int(pf.height/div))
pf = core.rgvs.RemoveGrain(clip=pf, mode=4)
pf = core.resize.Bilinear(clip=pf, width=pf.width*div, height=pf.height*div)
#globalmotion = core.mv.DepanEstimate(clip=pf, pixaspect=1.0)
super = core.mv.Super(clip=pf)
vectors = core.mv.Analyse(super=super, isb=False)
globalmotion = core.mv.DepanAnalyse(clip=pf, vectors=vectors, pixaspect=1.0, error=30.0, thscd1=800)
ret = core.mv.DepanStabilise(clip=ret, data=globalmotion, cutoff=0.33, zoommax=1.0, rotmax=10.0, pixaspect=1.0, method=1)
ret.set_output()

jackoneill
6th June 2016, 10:01
can you do something about SVPFlow ?

What would you have me do?

VS_Fan: Thanks for testing.

jackoneill
22nd June 2016, 11:50
https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v14


* BlockFPS, FlowFPS: Fix repeated frames in the last 33% of the clip when
reducing the frame rate (bug introduced at the beginning of this fork, probably).
* All filters that have a "fields" parameter: Fix handling of field-based clips
(was broken since the beginning of this fork).
* Add ports of MDepan, DepanEstimate, and Depan.

Changes from upstream, versions 2.5.11.20 and 2.5.11.21:

* BlockFPS: Remove parameter "thres", add parameter "ml".
* Compensate, Mask: Add parameter "time".
* Reject vector clips with negative delta where necessary.
* FlowXYZ: No longer limit motion vector length to -127..127.

The Depan filters now handle Gray, 4:2:0, 4:2:2, 4:4:4, and up to 16 bits.

feisty2
24th June 2016, 04:47
blockfps mode 6-8 are corrupted at bitdepth>8

import vapoursynth as vs
core = vs.get_core()

clp = rule6
clp = core.fmtc.bitdepth(clp,bits=8,fulls=False,fulld=True)
clp = core.std.ShufflePlanes(clp,0,vs.GRAY)

sup = core.mv.Super(clp)
bv = core.mv.Analyse(sup,8,overlap=4,delta=1,isb=True)
fv = core.mv.Analyse(sup,8,overlap=4,delta=1,isb=False)
clp = core.mv.BlockFPS(clp, sup, bv, fv, 50, 1, 8)

clp.set_output()

http://i.imgur.com/lNucJnc.png


import vapoursynth as vs
core = vs.get_core()

clp = rule6
clp = core.fmtc.bitdepth(clp,bits=16,fulls=False,fulld=True)
clp = core.std.ShufflePlanes(clp,0,vs.GRAY)

sup = core.mv.Super(clp)
bv = core.mv.Analyse(sup,8,overlap=4,delta=1,isb=True)
fv = core.mv.Analyse(sup,8,overlap=4,delta=1,isb=False)
clp = core.mv.BlockFPS(clp, sup, bv, fv, 50, 1, 8)

clp.set_output()

http://i.imgur.com/nZOpfvk.png
looks like a blank black frame, cuz

pDst_[w] = pOcc[w];
should be
pDst_[w] = static_cast<PixelType>(pOcc[w]) << (bitspersample - 8); //mask stuff are always 8bits, compensating scaling required at higher bitdepth


or do it manually

import vapoursynth as vs
core = vs.get_core()

clp = rule6
clp = core.fmtc.bitdepth(clp,bits=16,fulls=False,fulld=True)
clp = core.std.ShufflePlanes(clp,0,vs.GRAY)

sup = core.mv.Super(clp)
bv = core.mv.Analyse(sup,8,overlap=4,delta=1,isb=True)
fv = core.mv.Analyse(sup,8,overlap=4,delta=1,isb=False)
clp = core.mv.BlockFPS(clp, sup, bv, fv, 50, 1, 8)
clp = core.std.Expr(clp,"x 256 *")

clp.set_output()

http://i.imgur.com/AFZfo8Z.png
garbage, still, cuz

double l = 255 * pow(sad * dSADNormFactor, fGamma);

the expression is NOT linear, the result will be corrupted if the SAD remains unscaled, as ∂l/∂sad is not constant
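
To see why multiplying the output by 256 cannot repair the mask, here is a minimal sketch of the quoted expression. The normalisation factor and gamma are made-up values for illustration, and the 16-bit SAD is assumed to be 256 times the 8-bit SAD for the same content; none of these numbers come from the MVTools source.

```python
def mask_value(sad, norm_factor, gamma):
    """The quoted expression: l = 255 * pow(sad * dSADNormFactor, fGamma)."""
    return 255.0 * (sad * norm_factor) ** gamma

# Hypothetical numbers, chosen only for illustration.
NORM = 1.0 / 1000.0
GAMMA = 2.0

sad_8bit = 300.0
sad_16bit = sad_8bit * 256.0  # assumption: same content, samples scaled by 256

l_8bit = mask_value(sad_8bit, NORM, GAMMA)
l_unscaled = mask_value(sad_16bit, NORM, GAMMA)          # SAD left at 16-bit scale
l_rescaled = mask_value(sad_16bit / 256.0, NORM, GAMMA)  # SAD scaled back first

# With gamma != 1 the unscaled result is off by 256 ** gamma, not by 256,
# so a fixed "x 256 *" applied to the output does not correspond to it.
ratio = l_unscaled / l_8bit
print(l_8bit, l_rescaled, ratio)
```

Scaling the SAD before the power function reproduces the 8-bit value, which is the point being made above. If the surrounding code also clamps l to 255 (an assumption, not shown in the quoted line), the oversized values would saturate before any later rescaling could help.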

jackoneill
24th June 2016, 12:53
blockfps mode 6-8 are corrupted at bitdepth>8


Maybe fixed. Thanks for testing.

jackoneill
1st July 2016, 19:31
https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v15

After years of waiting, one more MVTools filter is ported. From the original MVTools only MShow is left now.


* BlockFPS: Fix bug that prevented the use of the "ml" parameter (bug introduced in v14).
* BlockFPS: Maybe fix bad output with 16 bit input (bugs introduced in v6 and v14).
* DepanCompensate, DepanStabilise: Fix integer overflows with 16 bit input and subpixel=2 (bug introduced in v14).
* Rename all "isse" parameters to "opt". They work the same.
* Add filter Flow.

jackoneill
9th July 2016, 18:19
If QTGMC crashes, try https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v16.


* Compensate: Fix crash when overlap is used (bug introduced in v15).

kolak
27th July 2016, 12:38
Which parameters are mostly responsible for the speed when using FlowFPS?
I'm mainly interested in: search, searchparam, overlap, dct, trymany, mask and opt.

Boulder
27th July 2016, 12:47
I'd say dct is one that will surely affect speed.

jackoneill
27th July 2016, 13:21
Which parameters are mostly responsible for the speed when using FlowFPS?
I'm mainly interested in: search, searchparam, overlap, dct, trymany, mask and opt.

opt exists only for debugging purposes. As far as I know, it doesn't change the results.

dct > 0 definitely makes it much slower.
overlap > 0 makes it slower.
The search method matters too. Exhaustive will be slower than all the others, I guess.

If you want numbers, run some tests.
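
For anyone who does want numbers, a rough timing helper could look like the sketch below. It only assumes an object with `num_frames` and `get_frame(n)`, which is the interface a VapourSynth clip exposes in Python; `measure_fps` and `FakeClip` are hypothetical names, and the fake clip is there only so the sketch runs without VapourSynth installed.

```python
import time


def measure_fps(clip, max_frames=None):
    """Request frames one after another and return frames per second."""
    total = clip.num_frames if max_frames is None else min(max_frames, clip.num_frames)
    start = time.perf_counter()
    for n in range(total):
        clip.get_frame(n)  # blocks until frame n has been produced
    elapsed = time.perf_counter() - start
    return total / elapsed if elapsed > 0 else float("inf")


class FakeClip:
    """Stand-in for a real clip; pretends each frame costs some work."""
    num_frames = 50

    def get_frame(self, n):
        return sum(range(1000))


print(f"{measure_fps(FakeClip()):.1f} fps")
```

In a real script you would build the same FlowFPS chain twice, changing one parameter at a time (for example dct or overlap in mv.Analyse), and pass each resulting clip to the helper. Requesting frames strictly one at a time may not reflect the throughput you get from vspipe, so treat the results as relative rather than absolute.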

kolak
27th July 2016, 14:55
Well, I'm not after numbers, just an idea of which parameters affect speed the most.

I also have a different question.

I've noticed that search=6 helps a lot with artefacts when doing fps conversion on all up-down movements (sounds strange). The opposite also works in the same way, so search=7 works for left-right movements.
Is there a way of switching the search method based on the dominant movement type? Sorry if this is a silly question :) The problem is that this would have to work on "future" frames.

jackoneill
27th July 2016, 17:09
That's a question best directed at the original developers. I don't have the math skills to answer.

feisty2
27th July 2016, 18:35
read the doc


search, searchparam, pelsearch

search decides the type of search at every level, searchparam is an additional parameter (step, radius) for this search, and pelsearch is the radius parameter at finest (pel) level. Below are the possible values for the search type:

0 'OneTimeSearch'. searchparam is the step between each vectors tried (if searchparam is superior to 1, step will be progressively refined).
1 'NStepSearch'. N is set by searchparam. It's the most well known of the MV search algorithm.
2 Logarithmic search, also named Diamond Search. searchparam is the initial step search, there again, it is refined progressively.
3 Exhaustive search, searchparam is the radius (square side is 2*radius+1). It is slow, but it gives the best results, SAD-wise.
4 Hexagon search, searchparam is the range. (similar to x264).
5 Uneven Multi Hexagon (UMH) search, searchparam is the range. (similar to x264).
6 pure Horizontal exhaustive search, searchparam is the radius (width is 2*radius+1).
7 pure Vertical exhaustive search, searchparam is the radius (height is 2*radius+1).
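
The three exhaustive variants listed above (3, 6 and 7) can be compared by enumerating the offsets each one would test for a given radius. This is only a sketch built from the descriptions in the doc; `candidate_offsets` is a hypothetical helper, and the real plugin may order, prune or refine candidates differently.

```python
def candidate_offsets(search, radius):
    """Offsets (dx, dy) tested around a block, per the descriptions above."""
    span = range(-radius, radius + 1)
    if search == 3:   # exhaustive: square with side 2*radius+1
        return {(dx, dy) for dx in span for dy in span}
    if search == 6:   # pure horizontal: width 2*radius+1
        return {(dx, 0) for dx in span}
    if search == 7:   # pure vertical: height 2*radius+1
        return {(0, dy) for dy in span}
    raise ValueError("only search types 3, 6 and 7 are sketched here")


radius = 2
square = candidate_offsets(3, radius)
cross = candidate_offsets(6, radius) | candidate_offsets(7, radius)
print(len(square), len(cross))  # 25 9
```

Taken together, 6 and 7 cover a cross of 4*radius+1 positions, all of which are also inside the square that search=3 tests. Search 3 additionally tries every diagonal offset, which is one reason it is slower and can settle on different vectors than either 6 or 7 alone.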

kolak
27th July 2016, 19:55
I've read it and found that 6,7 works well depending on the movement type. That's why I asked if this can be dynamic with some "ahead" detection.

feisty2
27th July 2016, 20:01
The combo of 6,7 is 3

kolak
27th July 2016, 22:17
Ok, but it's a combo, not a dynamic switch between 6,7 so results are very different.

~SimpleX~
30th July 2016, 11:38
jackoneill, do you have any plans on 32bit clips support?

feisty2
30th July 2016, 11:44
jackoneill, do you have any plans on 32bit clips support?

http://forum.doom9.org/showthread.php?t=172525

~SimpleX~
30th July 2016, 11:55
feisty2, yup, I know you already made a 32-bit mvtools (thank you btw). Will this ever be merged with jackoneill's plugin? Or should I choose a plugin based on bitdepth in my scripts?

feisty2
30th July 2016, 11:59
feisty2, yup, I know you already made a 32-bit mvtools (thank you btw). Will this ever be merged with jackoneill's plugin? Or should I choose a plugin based on bitdepth in my scripts?

http://forum.doom9.org/showthread.php?p=1737735#post1737735

I would have discontinued my branch if the answer was yeah

jackoneill
30th July 2016, 12:08
jackoneill, do you have any plans on 32bit clips support?

I'd rather not do that.

feisty2
1st August 2016, 18:41
made 2 PRs on git, truemotion is broken at any bitdepth > 8

jackoneill
1st August 2016, 19:42
made 2 PRs on git, truemotion is broken at any bitdepth > 8

Thanks. I merged them.

feisty2
2nd August 2016, 17:41
good news and bad news here

bad news:
the SATD deduction I made last year was totally wrong (forgive me, my major is not signal processing...)
and you merged that incorrect stuff into your branch as well...

good news:
I made the double checked correct SATD implementation this time with a recursive Hadamard Ordered Walsh-Hadamard Transform
References:
https://en.wikipedia.org/wiki/Hadamard_transform
http://fourier.eng.hmc.edu/e161/lectures/wht/node2.html

#include <cmath>    // std::sqrt, std::abs
#include <cstdint>  // uint8_t, intptr_t
#include <cstring>  // std::memcpy

static auto uninitialized = true;
static constexpr auto init_val = 1.;
static decltype(init_val + 0) hadamard_matrix_2x2[2][2];
static decltype(init_val + 0) hadamard_matrix_4x4[4][4];
static decltype(init_val + 0) hadamard_matrix_8x8[8][8];
static decltype(init_val + 0) hadamard_matrix_16x16[16][16];
static decltype(init_val + 0) hadamard_matrix_32x32[32][32];

// Build the length x length matrix from the (length/2) x (length/2) one.
template<int length = 2, typename T = double>
auto create_Hadamard_matrix(void *dst, const void *src) {
    constexpr auto src_length = length >> 1;
    const auto coeff = std::sqrt(2.);
    auto actual_dst = reinterpret_cast<T(*)[length]>(dst);
    auto actual_src = reinterpret_cast<const T(*)[src_length]>(src);
    for (auto i = 0; i < src_length; ++i)
        std::memcpy(actual_dst[i], actual_src[i], sizeof(actual_src[0]));
    for (auto i = 0; i < src_length; ++i)
        std::memcpy(actual_dst[src_length + i], actual_src[i], sizeof(actual_src[0]));
    auto ptr = reinterpret_cast<T(*)[2][src_length]>(dst);
    for (auto i = 0; i < length; ++i)
        std::memcpy(ptr[i][1], ptr[i][0], sizeof(actual_src[0]));
    ptr += src_length;
    for (auto i = 0; i < src_length; ++i)
        for (auto &x : ptr[i][1])
            x = -x;
    for (auto i = 0; i < length; ++i)
        for (auto &x : actual_dst[i])
            x /= coeff;
}

static auto SATD_init() {
    create_Hadamard_matrix(hadamard_matrix_2x2, &init_val);
    create_Hadamard_matrix<4>(hadamard_matrix_4x4, hadamard_matrix_2x2);
    create_Hadamard_matrix<8>(hadamard_matrix_8x8, hadamard_matrix_4x4);
    create_Hadamard_matrix<16>(hadamard_matrix_16x16, hadamard_matrix_8x8);
    create_Hadamard_matrix<32>(hadamard_matrix_32x32, hadamard_matrix_16x16);
    uninitialized = false;
}

// dst = hadamard * src (plain matrix product).
template<int length = 2, typename T = double>
auto product_calc(const void *src, const void *hadamard, void *dst) {
    auto actual_hadamard = reinterpret_cast<const T(*)[length]>(hadamard);
    auto actual_src = reinterpret_cast<const T(*)[length]>(src);
    auto actual_dst = reinterpret_cast<T(*)[length]>(dst);
    auto dot_p = [&](auto row, auto column) {
        T sum = 0;
        for (auto i = 0; i < length; ++i)
            sum += actual_hadamard[row][i] * actual_src[i][column];
        return sum;
    };
    for (auto i = 0; i < length; ++i)
        for (auto j = 0; j < length; ++j)
            actual_dst[i][j] = dot_p(i, j);
}

template<int nBlkWidth, int nBlkHeight, typename PixelType>
auto Satd_C(const uint8_t *pSrc8, intptr_t nSrcPitch, const uint8_t *pRef8,
            intptr_t nRefPitch) {
    if (uninitialized)
        SATD_init();
    void *hadamard;
    if (nBlkWidth == 32 && nBlkHeight == 32)
        hadamard = hadamard_matrix_32x32;
    else if (nBlkWidth == 16 && nBlkHeight == 16)
        hadamard = hadamard_matrix_16x16;
    else if (nBlkWidth == 8 && nBlkHeight == 8)
        hadamard = hadamard_matrix_8x8;
    else if (nBlkWidth == 4 && nBlkHeight == 4)
        hadamard = hadamard_matrix_4x4;
    else
        hadamard = nullptr;  // no matrix exists for other block sizes
    auto sum = 0.;
    decltype(sum) _dif_block[nBlkHeight][nBlkWidth];
    decltype(sum) _transformed_block[nBlkHeight][nBlkWidth];
    // Difference block between source and reference.
    for (auto y = 0; y < nBlkHeight; ++y) {
        for (auto x = 0; x < nBlkWidth; ++x) {
            auto pSrc = reinterpret_cast<const PixelType *>(pSrc8);
            auto pRef = reinterpret_cast<const PixelType *>(pRef8);
            _dif_block[y][x] = static_cast<decltype(sum)>(pSrc[x]) - pRef[x];
        }
        pSrc8 += nSrcPitch;
        pRef8 += nRefPitch;
    }
    product_calc<nBlkWidth>(_dif_block, hadamard, _transformed_block);
    // Sum of absolute transformed differences.
    for (auto &x : _transformed_block)
        for (auto y : x)
            sum += std::abs(y);
    return sum;
}

bad news again:
my implementation is C++14 inside out, not sure if I could translate it to C properly so didn't make any PR this time

edit:typo

hydra3333
3rd August 2016, 08:32
vapoursynth newbie seeking clarification :- I want a script to mimic my old avisynth script and process all planes but now using the vapoursynth mvtools ... is the below correct ? If not, what is it doing, and what should it be instead ?

vsource = icore.mv.Degrain1(vsource, super, backward_vec1, forward_vec1, thsad=400, plane=4)
avisynth:
MDegrain1(super, backward_vec1,forward_vec1,thSAD=400,plane=4)

jackoneill
3rd August 2016, 10:24
The "plane" parameter works the same.

jackoneill
21st August 2016, 15:27
good news and bad news here

bad news:
the SATD deduction I made last year was totally wrong (forgive me, my major is not signal processing...)
and you merged that incorrect stuff into your branch as well...

good news:
I made the double checked correct SATD implementation this time with a recursive Hadamard Ordered Walsh-Hadamard Transform
References:
https://en.wikipedia.org/wiki/Hadamard_transform
http://fourier.eng.hmc.edu/e161/lectures/wht/node2.html


Any reason why I shouldn't just remove the SATD functions entirely?

feisty2
21st August 2016, 15:41
Any reason why I shouldn't just remove the SATD functions entirely?

because they are the components from avisynth mvtools..?

well, SATD transforms the difference to the frequency domain (the Hadamard transform is a generalized Fourier transform, applied here to a 2^n * 2^n square block) and it sometimes helps with shimmering and fading like dct=1 does, but it is much faster than dct=1

Myrsloik
21st August 2016, 15:48
I have a simple question. Has anyone ever tried SAD but with the average value of the input blocks adjusted? I mean the main point of this exercise is to reduce the whole block average into a single coefficient to reduce its influence a lot. There should be some faster trick that is still good 90% of the time.

And saying it's in avisynth is a bad excuse. Temporalsoften is in avisynth and that's nothing to be proud of.
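
One possible reading of the "average adjusted" SAD idea above is to subtract each block's mean before comparing. The sketch below is only that reading, in plain Python with hypothetical helper names; it is not something MVTools implements as far as this thread shows.

```python
def block_mean(block):
    """Average of all samples in a block given as a list of rows."""
    values = [v for row in block for v in row]
    return sum(values) / len(values)


def sad(src, ref):
    """Plain sum of absolute differences."""
    return sum(abs(s - r)
               for row_s, row_r in zip(src, ref)
               for s, r in zip(row_s, row_r))


def mean_adjusted_sad(src, ref):
    """SAD after removing the difference between the two block averages."""
    offset = block_mean(src) - block_mean(ref)
    return sum(abs(s - r - offset)
               for row_s, row_r in zip(src, ref)
               for s, r in zip(row_s, row_r))


ref = [[16 * y + x for x in range(4)] for y in range(4)]  # arbitrary 4x4 block
brighter = [[v + 8 for v in row] for row in ref]          # same block, 8 levels brighter

print(sad(brighter, ref), mean_adjusted_sad(brighter, ref))
```

For a pure brightness change the plain SAD reports a large difference while the mean-adjusted version reports none, which is the effect being asked about. The cost is two extra passes to compute the averages, which is far cheaper than a full transform.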

feisty2
21st August 2016, 15:57
more specific? adjusted how, and mode 6-10 are hybrid SAD/SATD modes, and faster than mode 5(100% SATD)

VS_Fan
23rd August 2016, 11:54
I was trying to remember what's the purpose of using SATD in mvtools, so I googled a little, and found:

SATD allows mvtools to better track motion when there are rapidly changing levels of brightness (http://forum.doom9.org/showthread.php?p=1472189#post1472189)
https://www.svp-team.com/wiki/Plugins:_SVPflow
Use SATD function instead of SAD on finest level. Extremely slow, do not use it!
Use SATD function instead of SAD on every coarse level, improves motion vector estimation at luma flicker and fades.

I couldn't find much more.

And then I thought: the people at VideoLAN / MulticoreWare implemented asm functions for SATD for x265. Could they be used? As I'm not a developer, I can't answer that question. Is this a possibility?

In the search (https://www.google.com/webhp?ie=utf-8&oe=utf-8#q=x265+satd+asm+site:videolan.org&start=30) I saw things like:

[x265] [PATCH] primitives: asm: update: implementation of satd(sse2) (https://mailman.videolan.org/pipermail/x265-devel/2013-June/000010.html)
[x265] [PATCH] primitives: asm: satd: fix for 32 bit issue (https://mailman.videolan.org/pipermail/x265-devel/2013-June/000024.html)

Could those be used for mvtools-VapourSynth ???

feisty2
23rd August 2016, 12:30
No, because stupid asm fixed SATD to a certain bitdepth(say, 8bits)
That's exactly why asm SATD functions from avisynth mvtools are useless

feisty2
23rd August 2016, 14:16
I was trying to remember what's the purpose of using SATD in mvtools
there are several approaches available to calculate the difference between 2 macroblocks.
1. SAD, sum of the absolute difference between each pair of samples
2. dct=1, transform both current block and the reference block to frequency domain, and calculate the sum of the absolute difference between each pair of transformed samples
3. SATD, get the difference block between 2 macroblocks, and transform that difference block to frequency domain and calculate the sum of the absolute value of each sample in that transformed difference block

basically, dct=5(SATD) is the compromise between dct=0(SAD) and dct=1(frequency domain SAD)
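
Approaches 1 and 3 above can be put side by side in a small sketch. It assumes the common 2D form of SATD (H * D * H^T with an orthonormally scaled Hadamard matrix H and difference block D); that choice, the 4x4 block size and all helper names are assumptions for illustration, not the code used by any MVTools build.

```python
def hadamard(n):
    """n x n Hadamard matrix (n a power of two), scaled to be orthonormal."""
    h = [[1.0]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = n ** -0.5
    return [[v * scale for v in row] for row in h]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sad(src, ref):
    """Approach 1: sum of absolute sample differences."""
    return sum(abs(s - r)
               for row_s, row_r in zip(src, ref)
               for s, r in zip(row_s, row_r))


def satd(src, ref):
    """Approach 3: transform the difference block, then sum absolute values."""
    diff = [[s - r for s, r in zip(row_s, row_r)]
            for row_s, row_r in zip(src, ref)]
    h = hadamard(len(diff))
    h_t = [list(col) for col in zip(*h)]
    transformed = matmul(matmul(h, diff), h_t)
    return sum(abs(v) for row in transformed for v in row)


ref = [[16 * y + x for x in range(4)] for y in range(4)]  # arbitrary 4x4 block
flat = [[v + 8 for v in row] for row in ref]              # uniform brightness shift
spot = [row[:] for row in ref]
spot[1][2] += 8                                           # one sample changed

print(sad(flat, ref), satd(flat, ref))  # 128 32.0
print(sad(spot, ref), satd(spot, ref))  # 8 32.0
```

Under these assumptions a flat brightness shift collapses into a single transformed coefficient, so SATD rates it no worse than a one-sample change, whereas SAD rates it sixteen times worse. That matches the remarks earlier in the thread about SATD coping better with fades and luma flicker.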