27th August 2015, 10:01
feisty2
single precision MVTools plugin (stable)

binary (x64 for winnt): https://github.com/IFeelBloated/vapo...es/tag/r10_pre

source code: https://github.com/IFeelBloated/MVTools_SF/tree/master

I'm feeling super duper awesometastic cuz, yeah, I'm insane enough to learn C++ by hacking this big fat monster plugin

namespace: mvsf.xxx

anyways, a few things I gotta say here:
1. currently available functions: Super, Analyze, Recalculate, Compensate and Degrain1/2/3, didn't add flow functions yet cuz I got some doubts about truemotion
2. SAD, SCD stuff are floats now (default: thSAD=400.0 (200.0 for Recalculate and 10000.0 for Compensate), thSCD1=400.0, thSCD2=130.0)
3. "limit" in Degrain is float now, and with a range of 0.0 - 1.0, 0.0=no filtering, 1.0=no limit
4. "isse" is removed cuz, well, I mean over 80% of the sse code won't even work on uint16_t clips...
5. I ain't figured out how the hell that dct stuff actually works, so please don't use it (keep dct=0) for now
6. "Analyse" got a new name now! and it's called "Analyze" , I'm American and "analyse" always gets autocorrected and that's not nice, now it's "Analyze" so good news to Americans and Canadians, but the old "Analyse" still works for compatibility reasons and British blokes , so you got the freedom to choose "mvsf.Analyze" or "mvsf.Analyse" and both will work

EDIT:
test2
1. Added SATD support (dct=5 works now)
2. Added mvsf.Finest (flow functions will be ready soon)

EDIT2:
test3
1. Fixed possible overflow in SATD
2. Removed SATD for 8x4 16x8 8x16 blocks, no one uses them anyways
3. Added SATD support to 32x32 blocks
4. SATD for 16x16 blocks is corrupted in the original VapourSynth port, fixed now
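
for anyone wondering what SATD actually computes: below is a rough pure-Python sketch, not the plugin's C++ code. It assumes the textbook definition (Hadamard-transform the difference block, then sum the absolute coefficients) with no normalization; the helper names `hadamard`, `matmul` and `satd` are made up for this example, and real implementations usually scale the result.

```python
# sketch only: textbook SATD on a small square block, no normalization

def hadamard(n):
    """Build an n x n Hadamard matrix in natural order (n must be a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def satd(cur, ref):
    """Sum of absolute Hadamard-transformed differences between two blocks."""
    h = hadamard(len(cur))
    diff = [[float(c) - float(r) for c, r in zip(crow, rrow)]
            for crow, rrow in zip(cur, ref)]
    # this Hadamard matrix is symmetric, so H * D * H transforms rows and columns
    coeffs = matmul(matmul(h, diff), h)
    return sum(abs(v) for row in coeffs for v in row)

cur = [[0.5] * 4 for _ in range(4)]
ref = [[0.25] * 4 for _ in range(4)]
print(satd(cur, ref))  # a flat difference only leaves the DC coefficient
```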

EDIT3:
test4
1. Added mvsf.FlowBlur (you got a full floating point QTGMC now, if you want to)
2. better SATD precision

EDIT4:
test5
1. Added mvsf.BlockFPS (someone, maybe, will ever use this thing?)

EDIT5:
test6
1. Fixed the crash of BlockFPS on GRAY clips

EDIT6:
test7
1. all 10 modes of dct are working now, "libfftw3f-3.dll" needs to be placed in the same folder as mvtools

EDIT7:
test8
1. Added mvsf.FlowFPS, wanna do some really fancy floating point precision slo-mo stuff? try it!
2. Added mvsf.FlowInter

EDIT8:
test9
1. Added mvsf.SCDetection

EDIT9:
test10
1. Added mvsf.Degrain4/5/6, these are the strict straight extensions of Degrain1/2/3, not like the approximate copycat python script

EDIT10:
test11
1. Binary Part: Extended Degrain to Degrain24 (24, it's my lucky number!)
2. Resurrected vmulti features from MVTools 2.6.0.5, implemented via a python module, "tr" works up to 24, guess no one will ever use a time radius > 24.... maybe?
3. Resurrected StoreVect and RestoreVect from MVTools 2.6.0.5, implemented via a python module

vmulti demos:
1. DegrainN
Code:
import vapoursynth as vs
import mvmulti
core = vs.get_core()
clp = xxx
sup = core.mvsf.Super(clp)
vec = mvmulti.Analyze(sup,tr=6,blksize=8,overlap=4)
vec = mvmulti.Recalculate(sup,vec,tr=6,blksize=4,overlap=2)
clp = mvmulti.DegrainN(clp, sup, vec, tr=6)
clp.set_output()
2. Compensate/Flow
Code:
import vapoursynth as vs
import mvmulti
core = vs.get_core()
clp = xxx
sup = core.mvsf.Super(clp)
vec = mvmulti.Analyze(sup,tr=6,blksize=8,overlap=4)
vec = mvmulti.Recalculate(sup,vec,tr=6,blksize=4,overlap=2)
clp = mvmulti.Compensate(clp, sup, vec, tr=6)  # or mvmulti.Flow with the same arguments
clp.set_output()
3. StoreVect (Return a vector clip that could be encoded by vspipe)
vec.vpy
Code:
import vapoursynth as vs
import mvmulti
core = vs.get_core()
clp = xxx
sup = core.mvsf.Super(clp)
vec = mvmulti.Analyze(sup,tr=6,blksize=8,overlap=4)
vec = mvmulti.Recalculate(sup,vec,tr=6,blksize=4,overlap=2)
vec = mvmulti.StoreVect(vec,"D:/vec.txt")
vec.set_output()

command line:
vspipe.exe vec.vpy D:\vec.rgb

4. RestoreVect (Restore the encoded vector clip back to a standard vector clip)
Code:
import vapoursynth as vs
import mvmulti
core = vs.get_core()
vec = mvmulti.RestoreVect("D:/vec.rgb","D:/vec.txt")

EDIT11:
test12
this one is, like, kinda free from runtime problems, the binary works without msvcr dlls, and silenced a warning in Overlap.cpp

EDIT12:
test13
precision boost
1. SAD (float -> double)
2. SATD (int16_t -> double)
3. DCT (uint8_t -> even more precise than float)
and also features some cosmetic changes from @jackoneill
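
quick illustration of what "SAD in double" means, as a sketch and not the plugin's code: a Python float is already a double, so the sum of absolute differences over a block can be written like this. The function name `sad` and the tiny 2x2 blocks are just for the example.

```python
import math

def sad(cur, ref):
    """Sum of absolute differences between two equally sized blocks (double precision)."""
    return math.fsum(abs(c - r)
                     for crow, rrow in zip(cur, ref)
                     for c, r in zip(crow, rrow))

cur = [[0.5, 0.25], [0.0, 1.0]]
ref = [[0.25, 0.25], [0.5, 0.5]]
print(sad(cur, ref))
```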

EDIT13:
test14
A. Full Precision Boost
1. DCT (float -> double)
2. Motion Analysis (float -> double)
3. Super (float -> double)
4. Overlap (float -> double)
5. Variance (float -> double)
6. Degrain (float -> double)
and more...
basically everything works at double precision, rounded to single precision only at the final output stage
binary compiled with strict floating point model settings (calculations behave exactly the way IEEE defines them)
B. Bug Fixes
fixed a bug inherited from the avisynth plugin (bit shift operation on negative values, reported by @Are_ via runtime debugging)

libfftw3-3.dll (not libfftw3f-3.dll) needs to be placed in the same folder as the plugin!!!
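
why bother doing everything in double and rounding to single only at the end? a small sketch, assuming nothing about the plugin's internals: it emulates single precision with `struct` and compares rounding after every addition against accumulating in double and rounding once. The helper names are made up for this example.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate_single(values):
    """Emulate a float accumulator: round to single precision after every step."""
    acc = 0.0
    for v in values:
        acc = to_f32(acc + to_f32(v))
    return acc

def accumulate_double_then_round(values):
    """Accumulate in double, round to single precision once at the very end."""
    acc = 0.0
    for v in values:
        acc += v
    return to_f32(acc)

values = [0.1] * 10000
single = accumulate_single(values)
double = accumulate_double_then_round(values)
print(single, double)  # compare both against the ideal result of 1000
```

the per-step rounding errors pile up in the first version, while the second one only pays for a single rounding at the output stage.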

EDIT14:
test15
A. Colorspace
all floating point colorspaces are supported now, GrayS, RGBS and YUV4xxPS (note that dct 1-4 on YUV clips might be kind of buggy, as chroma features a different range from luma, will be fixed in the next release)
B. Degrain
the stupid "thsadc" and "limitc" parameters got their asses canceled, "thsad" and "limit" have been made arrays
"plane" parameter won't do nothing on RGB and GRAY clips, all planes will be processed
C. stuff, here and there..
bug fixes shamelessly copied from @jackoneill

EDIT15:
r1
first stable release!!!
A. sanity check.
will raise an error if the input is not single precision fp or features varying dimensions
B. DCT
fixed dct stuff on YUV input

EDIT16:
r2
merged bug fixes from jackoneill's branch since his last release
currently no plan to add depanning stuff

EDIT17:
r3
A. BlockFPS
1. added support for overlap (merged from Fizick's master branch)
2. new modes, mode 6-8, occlusion mask weighted on SAD (merged from Fizick's master branch)
B. Compensate
1. new parameter "time", use it to do partial time compensation (merged from Fizick's master branch)
C. Bug Fixes
1. vector length was clamped to 127/pel on motion flow functions, now it's 2147483647/pel, practically unlimited (Fizick relaxed it to 32767/pel, I decided to do it more thoroughly)
D. Precision Boost
1. internal masking for motion flow (uint8_t -> double)
2. simpleresize for masks (uint8_t -> double)
floating point precision MMask should be super easy to implement now (all internal stuff are double already), but I didn't do it anyways like, yeah, I'm all fucked up lazy

EDIT18:
r4
New Filter
binary: added mvsf.Flow
mvmulti: added mvmulti.Flow

EDIT19:
r5
Bug Fixes
1. truemotion was corrupted(bug inherited from jackoneill's branch), fixed.
2. the SATD implementation was completely incorrect, did some research and rewrote that from the beginning, SATD works correctly now

EDIT20:
r6
new block sizes: 2x2, 64x64, 64x32, 128x128, 128x64, 256x256, 256x128
switched to fftw3.3.5

EDIT21:
r7
Bug Fix
fixed a clip length calculation bug in BlockFPS, reported by groucho86
New Feature
extended SATD to 64x64 128x128 and 256x256 blocks
Uncategorized
replaced Hadamard ordered SATD with the Sequency ordered variant, levels faster..
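
side note on why swapping the ordering is safe: a sequency ordered matrix has the same rows as the Hadamard ordered one, just sorted by the number of sign changes, so the transform spits out the same coefficients in a different order and the sum of absolute values stays the same. The sketch below only checks that equality on a 1D example; it says nothing about speed, and the helper names are made up.

```python
import math

def hadamard(n):
    """Build an n x n Hadamard matrix in natural (Hadamard) order."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def sign_changes(row):
    """Sequency of a row: how many times the sign flips along it."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

def sequency_ordered(h):
    """Same rows as h, sorted by sequency."""
    return sorted(h, key=sign_changes)

def transform(matrix, vec):
    """Multiply the matrix with a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

h = hadamard(8)
w = sequency_ordered(h)
diff = [0.5, -0.25, 0.125, 0.0, 1.0, -1.0, 0.75, 0.25]
nat = transform(h, diff)
seq = transform(w, diff)
print(math.fsum(abs(v) for v in nat), math.fsum(abs(v) for v in seq))
```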

EDIT22:
r8
New Feature
mvsf.Mask
Uncategorized
converted some ugly C89 style code to C++14 style

EDIT22:
r9
fixed an ancient memory leak in mvsf.Super
converted some weird C++98 code to C++14

EDIT23:
major update:
- the mvmulti python module is now deprecated, all mvmulti stuff has been embedded into the C++ plugin.
- mvsf.Degrain can now handle arbitrary radius (not limited to 24)
- mvsf.Degrain1/Degrain2/.../Degrain24 are removed, the only MDegrain function is now mvsf.Degrain, which works for any radius.
- mvsf.Analyse is removed, type "Analyze" instead
- new parameter "radius" for mvsf.Analyze, when specified, mvsf.Analyze generates a compound vector clip that works for mvsf.Degrain/Compensate/Flow/Recalculate
- mvsf.Compensate/Flow/Recalculate automatically output compound results when provided a compound vector clip
- mvsf.Degrain automatically deduces the radius from the compound vector clip, you don't need to specify the radius
- when "radius" is specified for mvsf.Analyze, "isb" and "delta" are ignored.
- new parameter "cclip" for mvsf.Compensate/Flow, same as in the mvmulti python module, only takes effect for compound outputs.

Code:
#MDegrainN
sup = core.mvsf.Super(clip)
vec = core.mvsf.Analyze(sup, radius=6, overlap=4)
vec = core.mvsf.Recalculate(sup, vec, blksize=4, overlap=2)
clip = core.mvsf.Degrain(clip, sup, vec, thsad=400)

#motion compensated dfftest
sup = core.mvsf.Super(clip)
vec = core.mvsf.Analyze(sup, radius=6, overlap=4)
vec = core.mvsf.Recalculate(sup, vec, blksize=4, overlap=2)
clip = core.mvsf.Compensate(clip, sup, vec)
clip = core.dfttest.DFTTest(clip, tbsize=2*6+1, tmode=0)
clip = core.std.SelectEvery(clip, 2*6+1, 6)

you need a C++20 compatible compiler and vsFilterScript to build the binary.
the windows binary is currently unavailable because msvc does not support tons of C++20 core language features.
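
about the index math in the dfttest example above: here is a sketch with plain lists standing in for clips, nothing touches VapourSynth. It ASSUMES a compound Compensate output holds 2*radius+1 frames per source frame, ordered in time with the source frame in the middle (the middle position is what SelectEvery(clip, 2*6+1, 6) picks; the ordering of the neighbors and the edge clamping are guesses). `compound_groups` and `select_every` are hypothetical helpers.

```python
# sketch only: tuples (target frame, reference frame) stand in for real frames

def compound_groups(num_frames, radius):
    """ASSUMPTION: 2*radius+1 entries per source frame, source frame in the middle."""
    out = []
    for n in range(num_frames):
        for offset in range(-radius, radius + 1):
            ref = min(max(n + offset, 0), num_frames - 1)  # clamp at clip edges (assumption)
            out.append((n, ref))
    return out

def select_every(frames, cycle, offset):
    """List version of std.SelectEvery with a single offset."""
    return frames[offset::cycle]

radius = 6
cycle = 2 * radius + 1
compound = compound_groups(num_frames=5, radius=radius)

# a temporal window of size `cycle` centered on index n*cycle + radius
# covers exactly the frames that belong to source frame n
n = 3
center = n * cycle + radius
window = compound[center - radius:center + radius + 1]

centers = select_every(compound, cycle, radius)
print(centers)
```

that is why tbsize=2*6+1 goes together with SelectEvery(clip, 2*6+1, 6): only the frame in the middle of each group was filtered over its own neighbors and nothing else.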

EDIT24:
feature update:
- new parameter "thsad2" for mvsf.Degrain
- new parameter "thsad2" for mvsf.Compensate, only takes effect for compound output

"thsad2" enables cosine annealing along time dimension for MDegrain and MCompensate, I'm sure many of you have been longing for this feature from the avs MVTools, there you have it!
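
a sketch of what cosine annealing along time could look like. The formula is an ASSUMPTION (the standard cosine annealing schedule), not lifted from the plugin source: the threshold starts at thsad for the nearest reference frame and eases down to thsad2 at the furthest one. `annealed_thsad`, `distance` and `radius` are names made up for this example.

```python
import math

def annealed_thsad(thsad, thsad2, distance, radius):
    """Blend from thsad (distance 1) to thsad2 (distance == radius) along a cosine curve.

    ASSUMED schedule: weight = 0.5 * (1 + cos(pi * t)), with t going from 0 to 1.
    """
    if radius <= 1:
        return float(thsad)
    t = (distance - 1) / (radius - 1)
    weight = 0.5 * (1.0 + math.cos(math.pi * t))
    return thsad2 + (thsad - thsad2) * weight

for d in range(1, 7):
    print(d, round(annealed_thsad(400.0, 200.0, d, 6), 2))
```

with thsad2 lower than thsad, frames further away in time have to match better before they get blended in.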

EDIT25:
cumulative update:
- I merged every single bug fix from jackoneill's branch for the last 4 years.
- VectorStructure::sad has been promoted to double.

EDIT26:
trivial update:
- the "limit" parameter of mvsf.Degrain now defaults to infinity. it still follows the [0.0, 1.0] range, however out-of-range samples are allowed for floating point clips, which makes infinity the only true "unlimited" bound.
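
a sketch of why infinity is the only truly unlimited bound. It ASSUMES "limit" caps how far a filtered sample may move away from the source sample, which is how the parameter is usually described for MDegrain; `apply_limit` is a made-up helper, not the plugin's code.

```python
import math

def apply_limit(src, filtered, limit=math.inf):
    """Clamp the change made by the filter to at most `limit` per sample."""
    delta = filtered - src
    delta = max(-limit, min(limit, delta))
    return src + delta

print(apply_limit(0.25, 0.75, limit=0.0))  # limit 0.0: the source sample comes back untouched
print(apply_limit(-0.5, 1.5, limit=1.0))   # out-of-range samples: 1.0 still clips the change
print(apply_limit(-0.5, 1.5))              # default infinity: the filtered sample passes through
```

for samples inside [0.0, 1.0] the difference can never exceed 1.0, so limit=1.0 behaves as unlimited there, but once samples go out of range only infinity guarantees no clipping.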

Last edited by feisty2; 16th May 2020 at 19:33.