single precision MVTools plugin (stable) [Archive] - Page 2

~SimpleX~

4th August 2016, 07:50

You should check sample_type, not the number of bits

Missed that part, thanks! Updated.

feisty2

4th August 2016, 08:41

theoretically you cannot expand DegrainN to the partial sum of a series like ∑[x=1, y](DegrainRadius[x] * P[x]) (∑[x=1, y]Radius[x] = N, ∑[x=1, y]P[x]=1), it works but the result is very different from DegrainN

mv123 = core.mv.Degrain3(clip, super, *v[:6], **kwargs)
mv456 = core.mv.Degrain3(clip, super, *v[6:], **kwargs)
sm = core.std.Merge(mv123, mv456, 0.4615)

is poles apart from mvsf.Degrain6, and you should not obfuscate them in your "degrain" function.

edit: made the statement more rigorous
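
To see why the two approaches diverge, here is a minimal single-pixel sketch. It assumes an MVTools-style weighting in which every reference gets a weight derived from its block SAD and thsad, the source gets a fixed weight, and all weights are normalized together. The helpers `degrain_weight`, `degrain` and `merge` are illustrative names, not plugin functions, and the weight formula itself is an assumption.

```python
def degrain_weight(thsad, sad):
    """Weight of one reference block: 1 at SAD 0, falling to 0 at thsad (assumed formula)."""
    if sad >= thsad:
        return 0.0
    a, b = thsad * thsad, sad * sad
    return (a - b) / (a + b)


def degrain(src, refs, sads, thsad=400.0):
    """Blend the source pixel (weight 1) with all references, normalized jointly."""
    weights = [degrain_weight(thsad, s) for s in sads]
    total = 1.0 + sum(weights)
    return (src + sum(w * r for w, r in zip(weights, refs))) / total


def merge(a, b, weight):
    """Same arithmetic as std.Merge for one pixel: a * (1 - weight) + b * weight."""
    return a * (1.0 - weight) + b * weight


src = 100.0
near_refs, near_sads = [110.0] * 6, [0.0] * 6    # radius 1..3: perfect matches
far_refs, far_sads = [200.0] * 6, [1000.0] * 6   # radius 4..6: rejected (SAD >= thsad)

# Degrain6-style: one joint normalization over all twelve references
joint = degrain(src, near_refs + far_refs, near_sads + far_sads)

# Two Degrain3-style results blended with a fixed weight
split = merge(degrain(src, near_refs, near_sads),
              degrain(src, far_refs, far_sads), 0.4615)

print(joint, split)
```

In the joint version the rejected far references simply drop out, while the fixed-weight merge still pulls the result back toward the undenoised source, so the two outputs differ.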

~SimpleX~

4th August 2016, 11:23

I took that code from MCTemporalDenoise. I'd love to have Degrain at least up to 6 in jackoneill's plugin, but for now... Maybe I should remove that part completely and stick to Degrain1..Degrain3 for integral samples?

feisty2

4th August 2016, 11:29

remove it, definitely, nothing wrong with this approach, it's just DIFFERENT from how DegrainN works in mvtools and you should never obfuscate different stuff together like they are the same stuff..

feisty2

24th October 2016, 15:06

r6:
new block sizes: 2x2, 64x64, 64x32, 128x128, 128x64, 256x256, 256x128
switched to fftw3.3.5

groucho86

7th November 2016, 22:07

Hi feisty2,
I'm using BlockFPS to get to a lower FPS (29.97i to 59.94p via QTGMC then to 23.976p) using this code :
clp = core.fmtc.bitdepth(clip, bits=32, fulls=False, fulld=True)
super = core.mvsf.Super(clp, pel=4, hpad=16, vpad=16, rfilter=4)
bw_1 = core.mvsf.Analyze(super, isb=True, blksize=16, overlap=0, search=3, badrange=-24)
fw_1 = core.mvsf.Analyze(super, isb=False, blksize=16, overlap=0, search=3, badrange=-24)
clp = core.mvsf.BlockFPS(clp, super, bw_1, fw_1, num=24000, den=1001, blend=False)

clp.set_output()

The video freezes about halfway through. I wonder if it's similar to the problem that was fixed in jackoneill's version (https://github.com/dubhater/vapoursynth-mvtools/commit/435be3fa3f1f21d8434c69aa3a2801f4765fbe85).
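
As a sanity check for this kind of report, the expected output length of a frame-rate conversion can be estimated with exact fractions. This is generic arithmetic, not the plugin's code, and `converted_length` is a hypothetical helper name.

```python
from fractions import Fraction


def converted_length(num_frames, src_fps, dst_fps):
    """Frames needed to cover the same duration at dst_fps (rounded down)."""
    duration = Fraction(num_frames) / src_fps   # clip duration in seconds
    return int(duration * dst_fps)


src_fps = Fraction(60000, 1001)   # 59.94p after QTGMC
dst_fps = Fraction(24000, 1001)   # 23.976p target

print(converted_length(1000, src_fps, dst_fps))
```

For 1000 input frames this gives 400, so an output clip whose reported length is far from 2/5 of the input would point at a length calculation problem rather than at the motion search.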

feisty2

19th December 2016, 11:30

r7
Bug Fix
fixed a clip length calculation bug in BlockFPS, reported by groucho86
New Feature
extended SATD to 64x64 128x128 and 256x256 blocks
Uncategorized
replaced Hadamard ordered SATD with the Sequency ordered variant, levels faster..
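
A small sketch of why the ordering can be swapped freely: reordering the rows of the Hadamard matrix only permutes the transform coefficients, so the sum of their absolute values stays the same. The code below is an illustration in plain Python (unnormalized 4x4 transform, helper names are made up here), not the plugin's implementation, and it says nothing about where the speed gain comes from.

```python
def hadamard(n):
    """Sylvester construction: n x n Hadamard matrix in natural order (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def sign_changes(row):
    """Number of sign changes along a row (its sequency)."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def sequency_ordered(h):
    """Rows sorted by increasing number of sign changes."""
    return sorted(h, key=sign_changes)


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def satd(block, h):
    """Sum of absolute values of H * block * H^T."""
    ht = [list(col) for col in zip(*h)]
    coeffs = matmul(matmul(h, block), ht)
    return sum(abs(c) for row in coeffs for c in row)


# Example 4x4 block of pixel differences (arbitrary values)
diff = [[3, -1, 0, 2], [1, 4, -2, 0], [0, -3, 5, 1], [2, 0, -1, -4]]
h_nat = hadamard(4)
h_seq = sequency_ordered(h_nat)

print(satd(diff, h_nat), satd(diff, h_seq))
```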

Mystery Keeper

3rd January 2017, 20:34

Could you please do Mask too? I found it can be used to limit the destructive effect of temporal denoising.

feisty2

4th January 2017, 08:43

will do that for the next update

feisty2

9th February 2017, 09:35

r8
New Feature
mvsf.Mask
Uncategorized
converted some ugly C89 style code to C++14 style

feisty2

9th February 2017, 09:36

Could you please do Mask too? I found it can be used to limit the destructive effect of temporal denoising.

done!

feisty2

9th February 2017, 17:21

anyone needs AVX/FMA optimizations for this thing?
I'll do it if I got 2 or more replies for "yes"

Mystery Keeper

9th February 2017, 19:04

Yes! And thank you very much for the Mask!

MonoS

11th February 2017, 16:39

anyone needs AVX/FMA optimizations for this thing?
I'll do it if I got 2 or more replies for "yes"

You can find an implementation in my repo, but be aware that it is a bit slower than compiling with -O3 on GCC

feisty2

11th February 2017, 16:43

but that's not my floating point version...

Mystery Keeper

11th February 2017, 20:05

You can find an implementation in my repo, but be aware that it is a bit slower than compiling with -O3 on GCC

-O3 is not to be trusted. It is experimental. Any compiler features that prove safe and sensible quickly get moved into -O2.

Pat357

11th February 2017, 22:40

I'd like to compile this myself, but it didn't work out yet.
Is it possible that you did some renaming of the files ?
The current generated Makefile by Configure-script gives an error :

checking dynamic linker characteristics... Win32 ld.exe
checking how to hardcode library paths into programs... immediate
checking pkg-config is at least version 0.9.0... yes
checking for VapourSynth... yes
checking for FFTW3... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: executing depfiles commands
config.status: executing libtool commands
make: *** No rule to make target 'src/DCT.cpp', needed by 'src/DCT.lo'. Stop.

Seems DCT.cpp is no longer in /src

Can you have a look at this ?

Thanks !

feisty2

12th February 2017, 06:24

done!

Pat357

12th February 2017, 21:06

Thanks for the fast response !
The compile seems to work, but I noticed a lot of "unused" warnings....
When linking starts, things go wrong :
CXXLD libmvtoolssf.la
src/.libs/MVAnalyze.o:MVAnalyze.cpp:(.text+0x6a): undefined reference to `__imp_pthread_mutex_destroy'
src/.libs/MVAnalyze.o:MVAnalyze.cpp:(.text+0x940): undefined reference to `__imp_pthread_mutex_lock'
src/.libs/MVAnalyze.o:MVAnalyze.cpp:(.text+0x987): undefined reference to `__imp_pthread_mutex_unlock'
src/.libs/MVAnalyze.o:MVAnalyze.cpp:(.text.startup+0xf): undefined reference to `__imp_pthread_mutex_init'
src/.libs/MVRecalculate.o:MVRecalculate.cpp:(.text+0x7a): undefined reference to `__imp_pthread_mutex_destroy'
src/.libs/MVRecalculate.o:MVRecalculate.cpp:(.text+0x1e32): undefined reference to `__imp_pthread_mutex_lock'
src/.libs/MVRecalculate.o:MVRecalculate.cpp:(.text+0x1e77): undefined reference to `__imp_pthread_mutex_unlock'
src/.libs/MVRecalculate.o:MVRecalculate.cpp:(.text.startup+0xf): undefined reference to `__imp_pthread_mutex_init'
collect2.exe: error: ld returned 1 exit status
make: *** [Makefile:502: libmvtoolssf.la] Error 1

Any idea how to make this work ?
Do you have an idea why just these two MVAnalyze.cpp and MVRecalculate.cpp have this problem with linking threads-lib ?

My build-system : Mingw64 and GCC 6.3.0 / Msys2 / Win7 SP1 / Haswell
FFTW3 installed (from Mingw64)
Vapoursynth installed

Tested adding -mthreads or -pthread to CXXFLAGS , but had no effect
adding LIBS=-lpthread to environment didn't help.

Open for ideas !

feisty2

12th February 2017, 21:29

I guess the warnings are probably related to "g_fftw_plans_mutex" (a static mutex variable in DCTFFTW.hpp)
I did all my developing under Visual Studio 2017 and there was no warning or error building the plugin (Visual Studio simplified "compile + link" to one click of "build", and then generates binary automatically)
I know pretty much nothing about GNU compilers, I think you might have to ask someone else who's more familiar with GNU about those warnings and errors..

jackoneill

13th February 2017, 14:10

You're using Mingw's winpthreads, right? If you compiled it static only you need to modify pthread.h a little:

--- mingw-w64-libraries/winpthreads/include/pthread.h 2014-10-26 04:11:33.000000000 +0200
+++ mingw-w64-libraries/winpthreads/include/pthread.h 2014-12-10 14:17:45.542746397 +0200
@@ -82,7 +82,9 @@
/* MSB 8-bit major version, 8-bit minor version, 16-bit patch level. */
#define __WINPTHREADS_VERSION 0x00050000

-#if defined DLL_EXPORT
+/* No __declspec crap needed when winpthreads is static.
+ * In fact, it produces linker errors (can't find __imp_blah). */
+#if defined DLL_EXPORTzzzzzzzzzzz
#ifdef IN_WINPTHREAD
#define WINPTHREAD_API __declspec(dllexport)
#else

Maybe you could report this bug and make them fix it.

Pat357

14th February 2017, 17:42

You're using Mingw's winpthreads, right?
Thanks !

Even when I tried with -mthreads , it gives the same error.

I'll try the patch and let you know.

PS : Compiled again with patch applied and tested : working 100 %

MonoS

14th February 2017, 22:03

but that's not my floating point version...

The code will be almost the same i suppose

MonoS

14th February 2017, 22:05

-O3 is not to be trusted. It is experimental. Any compiler features that prove safe and sensible quickly get moved into -O2.

-O3, for my purposes, enables autovectorization, and this feature, as I read on Twitter from trustworthy people, may produce awful SIMD code. That's probably why it's still in -O3, as you suggest (I didn't know about that; where did you read it?)

feisty2

20th June 2017, 15:07

r9
fixed an ancient memory leak in mvsf.Super
converted some weird C++98 code to C++14

Efenstor

7th July 2017, 10:44

Can't compile either master or the latest release:
make: *** No rule to make target 'src/FakeGroupOfPlanes.cpp', needed by 'src/FakeGroupOfPlanes.lo'. Stop.
make: *** Waiting for unfinished jobs....

Are_

7th July 2017, 11:25

Master will work if you go one commit back or if you delete the src/FakeGroupOfPlanes.cpp line in Makefile.am, probably.

feisty2

7th July 2017, 11:27

fixed

cwk

4th September 2017, 22:38

Getting the following error when attempting to compile master:

src/MVClip.hpp:21:56: error: invalid conversion from 'const char*' to 'char*' [-fpermissive]
auto evil = vsapi->getFrame(0, vectors, errorMsg.data(), MaxErrorLength);
^
...

Makefile:537: recipe for target 'src/GroupOfPlanes.lo' failed
make: *** [src/GroupOfPlanes.lo] Error 1

My sequence is:

autogen.sh
configure
make

Any ideas?

feisty2

4th September 2017, 23:51

C++17 support is required, the non-const version of "data()" is new in C++17

cwk

5th September 2017, 01:28

Thanks for the tip. With that in mind, I am able to compile r7 on Ubuntu 16.04 without too much difficulty.

Any ideas when gnu will offer support for that version of data()? It looks like the makefile is already geared up with that flag:

AM_CXXFLAGS = -O2 -std=c++1z -msse2 -mfpmath=sse $(STACKREALIGN) -Wall -Wextra -Wno-unused-parameter -Wshadow

I modified the flag to: "-std=c++17" specifically, which didn't appear to have any impact.

Are_

5th September 2017, 03:09

Gcc 7

shader

5th September 2017, 16:17

does your CPU feature the AVX2 extension?

I run VS on two different laptops. One is a little older and therefore does not have the AVX2 extension.

Could anybody compile the latest release for 64-bit without AVX2 and share it with me?

Thanks!

chawl

1st November 2017, 02:13

I had the same problem so I compiled the latest (R9) source on MinGW for my good old Phenom II X6 and it may work for you too, so here it is:

https://drive.google.com/file/d/0BxnWzTDZM0v5M2hfTmtvTXI0Njg/view?usp=sharing

You have to put all DLLs under the plugins64 directory, as my build is not as static as feisty2's and has two dependencies. I also added a libfftw3-3.dll.bak file in case your version of FFTW does not work; just rename it back to .dll to use it.

ChaosKing

12th February 2018, 13:00

Does your mvtools version have any other dependency besides libfftw3-3.dll? It works on my desktop PC but not on a Windows Server 2012 machine. I installed every C/C++ runtime I could think of.

This is the script I'm using https://forum.doom9.org/showthread.php?p=1833133#post1833133

I get this error using VS R43 x64:
Problem Event Name: APPCRASH
Application Name: vspipe.exe
Application Version: 0.0.0.0
Application Timestamp: 5a721c1c
Fault Module Name: libmvtools_sf_em64t.dll
Fault Module Version: 0.0.0.0
Fault Module time stamp: 59492968
Exception code: c000001d
Except Offset: 00000000000ea5fd
OS Version: 6.2.9200.2.0.0.272.7
Locale ID: 1031
Additional information 1: e868
Additional Information 2: e8684f6eaacb2ee3a8b23fd72a9ac907
Additional information 3: b387
Additional Information 4: b38712ff53fac18241870cc7b35aa182

feisty2

12th February 2018, 20:11

could be hardware related, does the cpu of ur server support avx2 instruction set?
if not, u should probably compile a binary for ur own hardware
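
For context, exception code c000001d is the Windows status for an illegal instruction, which fits an AVX2 build running on a CPU without AVX2. On Linux, a quick check of the CPU feature flags could look like the sketch below; it only parses `/proc/cpuinfo`-style text, so it does not apply to Windows directly, and the helper names are made up for this example.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx2(cpuinfo_text):
    """True if the 'avx2' flag is listed for at least one core."""
    return "avx2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX2 supported:", supports_avx2(f.read()))
    except OSError:
        print("/proc/cpuinfo not available on this system")
```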

ChaosKing

12th February 2018, 20:51

It's a i7 3770, only AVX :( https://ark.intel.com/de/products/65719/Intel-Core-i7-3770-Processor-8M-Cache-up-to-3_90-GHz

//Edit: The compiled dll by chawl works, thx!

ChaosKing

6th May 2018, 21:42

Could someone compile a new x64 binary (non avx2) of this filter? chawl's version produces some green frames with R43. Thank you!

ChaosKing

10th May 2018, 09:13

Thank you HolyWu.
But something is still not right. Both compiled versions (SSE and AVX) also produce broken images. The SSE version shows a green overlay, I think. The strange part is that all DLLs work fine with my Ryzen 1700 (supports AVX2), but on the i7 3770 the output is garbage.

https://i.imgur.com/Ffwxmim.jpg

Myrsloik

12th July 2018, 12:51

So this plugin requires the double precision fftw3 dll? Or am I mistaken?

tormento

6th April 2020, 09:27

http://www.mediafire.com/file/a92cznircpr34r2/libmvtools_sf_em64t.7z
Link expired. Could you please give a mirror?

Unfortunately, no AVX2 here either.

feisty2

6th April 2020, 09:43

this plugin is a gigantic mess, I might have to rewrite it from scratch someday...

feisty2

14th May 2020, 21:50

major update:
- the mvmulti python module is now deprecated, all mvmulti stuff has been embedded into the C++ plugin.
- mvsf.Degrain can now handle arbitrary radius (not limited to 24)
- mvsf.Degrain1/Degrain2/.../Degrain24 are removed, the only MDegrain function is now mvsf.Degrain, which works for any radius.
- mvsf.Analyse is removed, type "Analyze" instead
- new parameter "radius" for mvsf.Analyze, when specified, mvsf.Analyze generates a compound vector clip that works for mvsf.Degrain/Compensate/Flow/Recalculate
- mvsf.Compensate/Flow/Recalculate automatically output compound results when provided a compound vector clip
- mvsf.Degrain automatically deduces the radius from the compound vector clip, you don't need to specify the radius
- when "radius" is specified for mvsf.Analyze, "isb" and "delta" are ignored.
- new parameter "cclip" for mvsf.Compensate/Flow, same as in the mvmulti python module, only takes effect for compound outputs.

#MDegrainN
sup = core.mvsf.Super(clip)
vec = core.mvsf.Analyze(sup, radius=6, overlap=4)
vec = core.mvsf.Recalculate(sup, vec, blksize=4, overlap=2)
clip = core.mvsf.Degrain(clip, sup, vec, thsad=400)

#motion compensated dfttest
sup = core.mvsf.Super(clip)
vec = core.mvsf.Analyze(sup, radius=6, overlap=4)
vec = core.mvsf.Recalculate(sup, vec, blksize=4, overlap=2)
clip = core.mvsf.Compensate(clip, sup, vec)
clip = core.dfttest.DFTTest(clip, tbsize=2*6+1, tmode=0)
clip = core.std.SelectEvery(clip, 2*6+1, 6)
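
The last line of the second snippet keeps one frame out of every 2*6+1. A tiny sketch of which indices survive, assuming (as the snippet implies, not confirmed from plugin documentation) that the compound Compensate output interleaves 2*radius+1 frames per source frame with the source frame in the middle; `kept_frames` is a hypothetical helper:

```python
def kept_frames(num_source_frames, radius):
    """Indices that SelectEvery(clip, 2*radius+1, radius) keeps from the interleaved clip."""
    cycle = 2 * radius + 1
    return [n * cycle + radius for n in range(num_source_frames)]


print(kept_frames(4, 6))
```

With radius 6 and four source frames this selects indices 6, 19, 32 and 45, i.e. the center of each 13-frame group after the temporal filter has run.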

you need a C++20 compatible compiler and vsFilterScript (https://forum.doom9.org/showthread.php?t=181027) to build the binary.
the windows binary is currently unavailable because msvc does not support tons of C++20 core language features.

ChaosKing

15th May 2020, 07:02

Wow that's awesome :eek:
How is the speed compared to your previous implementation?
Does this version also support 8..16 bits per sample?

feisty2

15th May 2020, 10:23

there's no noticeable speed difference.
no, it still only works for fp32 inputs

feisty2

15th May 2020, 10:31

can someone please build a binary for windows?
the binary I built relies on a bunch of Cygwin dlls.

to build the binary, you need to:
1) get "Include" folder from vsFilterScript
2) place "VapourSynth.h", "VSHelper.h", "fftw3.h" in the "Include" folder
3) compile, then link against "vapoursynth.lib" and "libfftw3-3.lib"

Are_

15th May 2020, 10:43

It doesn't build on Linux :(

g++ -std=c++20 -IvsFilterScript -I/usr/include/vapoursynth -shared -fPIC -march=native -Ofast -o libmvtoolssf.so src/EntryPoint.cxx -lvapoursynth -lfftw3
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/10.1.0/include/g++-v10/cmath:45,
from vsFilterScript/Include/Infrastructure.hxx:15,
from vsFilterScript/Include/Range.hxx:2,
from vsFilterScript/Include/Plane.hxx:2,
from vsFilterScript/Include/Frame.hxx:3,
from vsFilterScript/Include/Clip.hxx:2,
from vsFilterScript/Include/Map.hxx:2,
from vsFilterScript/Include/Plugin.hxx:2,
from vsFilterScript/Include/Core.hxx:2,
from vsFilterScript/Include/Interface.hxx:2,
from src/EntryPoint.cxx:1:
src/Overlap.h:5:16: error: expected unqualified-id before numeric constant
5 | constexpr auto M_PI = 3.1415926535897932384626433832795;
|
$ LANG=C gcc --version
gcc (Gentoo 10.1.0 p1) 10.1.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Also, could you consider using the normal include paths for vapoursynth and fftw3? The way things are now breaks packaging.

feisty2

15th May 2020, 10:54

I'm sure the code is correct, did you turn on some weird floating point flags (such that floating point literals are not assumed to be constants)?

pinterf

15th May 2020, 11:11

Nice. I only had a quick look at the code.
The "auto" count is shocking :)
I had to look up the C++ documentation frequently for the syntax elements used.
I can see that, probably because of the large (but hugely simplified) code base, the coding conventions are mixed (std::int32_t, int32_t, auto; reinterpret_cast vs. traditional cast). Or is there an intentional difference in their usage?
Overlaps is still not 100% floating point; it still inherits the original code's two-phase integer scaling by 32 and 64 (32*64 gives an integer scale factor of 2048).
(I did not eliminate it completely in avs mvtools either; I'm still using an integer base (https://github.com/pinterf/mvtools/blob/mvtools-pfmod/Sources/overlap.h#L152) for lack of time, but no other intermediate scaling happens when dealing with float clips.)

Are_

15th May 2020, 11:11

Nope, and shouldn't that be done in the source itself?