MVTools, Depan, DepanEstimate for VapourSynth [Archive] - Page 7

jackoneill

18th January 2016, 22:51

The v10 DLLs have been recompiled to work with VapourSynth older than r30.

that's the exact problem...
it should not but still looks pretty much untouched at thsad=2000

That was kind of related to the dct parameter. The code in the DCTFFTW class is fine, it was just given incorrect information in Recalculate. Analyse was not affected.

Boulder

19th January 2016, 05:01

Thanks, I'll run some tests today :)

Boulder

20th January 2016, 12:42

It seems that the performance of dct=5 is now at the expected level compared to the Avisynth equivalent and I've not been able to crash anything yet.

jackoneill

31st January 2016, 15:11

With help from the compiler, I found a few more of these integer overflow bugs, but there are just too many filters and parameters to test all the combinations myself. Therefore, please compile MVTools with GCC 4.9 or clang 3.3 or newer, like so:

./configure CXXFLAGS='-fsanitize=undefined'
make

and test your favourite scripts with a 16 bit video. If such a bug is encountered, you'll get a message in the console.

Boulder

1st February 2016, 05:12

Would you mind compiling such a build? It also seems that there are some other bugs corrected that affect the output (the Recalculate one). I don't have MinGW/MSYS set up for Vapoursynth and I'm not sure how to get things done with that.

jackoneill

1st February 2016, 12:37

Would you mind compiling such a build? It also seems that there are some other bugs corrected that affect the output (the Recalculate one). I don't have MinGW/MSYS set up for Vapoursynth and I'm not sure how to get things done with that.

Unfortunately I can't. UndefinedBehaviorSanitizer is not available for Windows yet.

Boulder

1st February 2016, 13:52

What about those corrections? If I understand correctly, Recalculate does nothing in the current release.

jackoneill

1st February 2016, 14:36

What about those corrections? If I understand correctly, Recalculate does nothing in the current release.

Okay, here is v11 (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v11), now with a Linux binary included.

* Fix a few more signed integer overflow bugs with 16 bit input in Analyse/Recalculate.
* Fix bug in Recalculate with 16 bit input and dct=1..4, which turned client filters into very slow no-ops.

Boulder

1st February 2016, 14:44

Thanks a lot, it's much appreciated :)

Are_

3rd February 2016, 14:12

With help from the compiler, I found a few more of these integer overflow bugs, but there are just too many filters and parameters to test all the combinations myself. Therefore, please compile MVTools with GCC 4.9 or clang 3.3 or newer, like so:

./configure CXXFLAGS='-fsanitize=undefined'
make

and test your favourite scripts with a 16 bit video. If such a bug is encountered, you'll get a message in the console.

Did that with a script I was having problems in the early days:

import vapoursynth as vs
core = vs.get_core()

clip = core.lsmas.LWLibavSource(rule6)
clip = core.fmtc.bitdepth(clip, bits=16)

sup = core.mv.Super(clip, pel=2)
bvec = core.mv.Analyse(sup, blksize=32, isb=True , chroma=True, search=3, searchparam=1)
fvec = core.mv.Analyse(sup, blksize=32, isb=False, chroma=True, search=3, searchparam=1)
bvec = core.mv.Recalculate(sup, bvec, blksize=8, search=3, searchparam=1)
fvec = core.mv.Recalculate(sup, fvec, blksize=8, search=3, searchparam=1)
clip = core.mv.BlockFPS(clip, sup, bvec, fvec, mode=3, thscd2=12, num=60000, den=1001)
clip[4678:10000].set_output()

And got:

/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.cpp:731:40: runtime error: left shift of negative value -1
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.cpp:730:40: runtime error: left shift of negative value -1
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.cpp:931:65: runtime error: left shift of negative value -1
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.cpp:930:65: runtime error: left shift of negative value -1
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.cpp:1107:35: runtime error: signed integer overflow: 50 * 48436919 cannot be represented in type 'int [8]'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 44943708 cannot be represented in type 'int'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 45038739 cannot be represented in type 'int'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 44992873 cannot be represented in type 'int'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 44990206 cannot be represented in type 'int'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 44944694 cannot be represented in type 'int'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 44992873 cannot be represented in type 'int [8][2]'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 44990206 cannot be represented in type 'int [8][2]'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 45038739 cannot be represented in type 'int [8][2]'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 45038739 cannot be represented in type 'int [8][2]'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:375:35: runtime error: signed integer overflow: 50 * 44910136 cannot be represented in type 'int [8][2]'
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.h:436:35: runtime error: signed integer overflow: 50 * 43379226 cannot be represented in type 'int'

Fortunately, the visual glitches I had in the past are now gone.

Then I tested QTGMC with this script:

import vapoursynth as vs
import havsfunc as haf

core = vs.get_core()

clip = core.fmtc.bitdepth(rule6, bits=16)
clip = haf.QTGMC(src, TFF=True, Preset="slower", Denoiser="KNLMeansCL", EZDenoise=2.0, Tuning="dv-hd", ChromaNoise=True)
clip = core.fmtc.bitdepth(clip, bits=10)
clip.set_output()

Results:

/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/include/emmintrin.h:704:24: runtime error: load of misaligned address 0x7f6b28be1b77 for type '__m64', which requires 8 byte alignment
0x7f6b28be1b77: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/include/emmintrin.h:704:24: runtime error: load of misaligned address 0x7f6b294080c4 for type '__m64', which requires 8 byte alignment
0x7f6b294080c4: note: pointer points here
00 00 00 00 00 00 00 01 01 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/include/emmintrin.h:704:24: runtime error: load of misaligned address 0x7f93ef33b297 for type '__m64', which requires 8 byte alignment
0x7f93ef33b297: note: pointer points here
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
^
/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/include/emmintrin.h:704:24: runtime error: load of misaligned address 0x7f93efb617e4 for type '__m64', which requires 8 byte alignment
0x7f93efb617e4: note: pointer points here
80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80
^
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.cpp:931:65: runtime error: left shift of negative value -1
/var/tmp/portage/media-plugins/vapoursynth-mvtools-9999/work/vapoursynth-mvtools-9999/src/PlaneOfBlocks.cpp:930:65: runtime error: left shift of negative value -4
/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/include/emmintrin.h:704:24: runtime error: load of misaligned address 0x7f93f14506a4 for type '__m64', which requires 8 byte alignment
0x7f93f14506a4: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
^
/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/include/emmintrin.h:704:24: runtime error: load of misaligned address 0x7f93f1c756a2 for type '__m64', which requires 8 byte alignment
0x7f93f1c756a2: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^

Now this is strange, because it crashes after 6 frames, yet I finished this same encode a few weeks ago on a full 25-minute video. Maybe it's something in my hardware or setup :(

All of these tests were run with the latest git as of 01/02/16 16:34:36.

I hope it helps.

jackoneill

3rd February 2016, 18:27

Thanks for testing!

I'll sort out the overflows. Those misaligned load errors are false positives, though. The x86 instruction in question must be movq, which doesn't require any alignment.

If it crashes, you should see why. :)
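
To put numbers on the overflow reports, here is a small sketch that takes one pair of operands from the sanitizer log and checks the product against the range of a 32-bit signed int. The helper name wrap_int32 is made up for illustration. The wrapped value is only what two's-complement hardware typically produces; in C++ signed overflow is undefined behaviour, so that result is not guaranteed.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Reduce a Python int to the value a 32-bit two's-complement int would hold."""
    return (value + 2**31) % 2**32 - 2**31


# Operands copied from the "50 * 48436919" line in the log.
product = 50 * 48436919

print(product)              # exact product, as Python computes it
print(product > INT32_MAX)  # True means it does not fit in a 32-bit int
print(wrap_int32(product))  # typical (not guaranteed) wrapped result
```

Doing the multiplication in a 64-bit type would be one way to avoid this class of bug, though the thread does not say which approach the actual fixes took.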

feisty2

4th February 2016, 17:34

overflows should be fixed by now

jackoneill

4th February 2016, 18:10

overflows should be fixed by now

What do you mean?

feisty2

4th February 2016, 18:13

What do you mean?

I submitted two pull requests on GitHub that fixed the mentioned overflows.

jackoneill

14th March 2016, 13:24

Are_: Any news about that crash? A backtrace, maybe? I can't test your script because KNLMeansCL requires a fancier video card than what I have.

Are_

14th March 2016, 16:06

OK, now this is a little embarrassing: the first script was OK and now runs without errors or warnings, but the second script was run with 8-bit input. :(
Now I'm going to make sure everything is proper.

import vapoursynth as vs
import havsfunc as haf

core = vs.get_core()

src = core.lsmas.LWLibavSource(r'rule6')

clip = src
clip = core.fmtc.bitdepth(clip, bits=16)
clip = haf.QTGMC(clip, TFF=True, Preset="slower", ShowSettings=False)

clip.set_output()
bt 8bit (https://paste.kde.org/pwikxqvoq/iv2asp)
bt 16bit (https://paste.kde.org/pp1ie8jgc/qu97uc)
stderr (https://paste.kde.org/pktkpdzqj/groqoi)

mvtools and vapoursynth are current git.

jackoneill

14th March 2016, 20:34

Oops. That's a bug I introduced more recently. It's fixed now (plus another two).

Are_

14th March 2016, 20:46

Hurray! No errors, no warnings, nothing wrong to be seen. :)

jackoneill

14th March 2016, 21:12

Hurray! No errors, no warnings, nothing wrong to be seen. :)

Even with the script that uses KNLMeansCL?

Are_

14th March 2016, 21:33

Even with the script that uses KNLMeansCL?

Screw me, I was going too fast. :/
It works with 16bit video, but 8bit video produces this stderr (https://paste.kde.org/p614k64gv/oovkd7), I guess it's the same as before.
This is with the script I posted before, also, it becomes terribly slow (and the process eventually dies after a few frames).

jackoneill

14th March 2016, 22:19

Screw me, I was going too fast. :/
It works with 16bit video, but 8bit video produces this stderr (https://paste.kde.org/p614k64gv/oovkd7), I guess it's the same as before.
This is with the script I posted before, also, it becomes terribly slow (and the process eventually dies after a few frames).

Can you run it in gdb to see why it dies?

Are_

14th March 2016, 22:26

OK, not a segfault, it just runs out of memory, I have 12GB on this machine.

jackoneill

14th March 2016, 22:39

OK, not a segfault, it just runs out of memory, I have 12GB on this machine.

It doesn't happen with v11, does it?

Are_

14th March 2016, 22:47

Mmmh... yes, it does. I don't usually do much processing in 8bit, so I don't know if this is something only happening to me, if it's new or not. :/

Are_

14th March 2016, 23:21

I did try valgrind, but it refuses to run saying:

==28010== Memcheck, a memory error detector
==28010== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==28010== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==28010== Command: vspipe -p mvtools-test.vpy /dev/null
==28010==
vex amd64->IR: unhandled instruction bytes: 0x8F 0xE8 0x58 0xA3 0x51 0xD0 0x0 0x8F
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==28010== valgrind: Unrecognised instruction at address 0x5a62eb3.
==28010== at 0x5A62EB3: PyUnicode_FromUnicode (in /usr/lib64/libpython3.5m.so.1.0)
==28010== by 0x5AFBC32: _PySys_Init (in /usr/lib64/libpython3.5m.so.1.0)
==28010== by 0x5AEEE46: _Py_InitializeEx_Private (in /usr/lib64/libpython3.5m.so.1.0)
==28010== by 0x4E37515: real_init() (vsscript.cpp:82)
==28010== by 0x4E37B46: _M_invoke<> (functional:1531)
==28010== by 0x4E37B46: operator() (functional:1520)
==28010== by 0x4E37B46: void std::__once_call_impl<std::_Bind_simple<void (*())()> >() (mutex:697)
==28010== by 0x61832F0: __pthread_once_slow (pthread_once.c:116)
==28010== by 0x4E3784D: __gthread_once (gthr-default.h:699)
==28010== by 0x4E3784D: call_once<void (&)()> (mutex:729)
==28010== by 0x4E3784D: vsscript_init (vsscript.cpp:93)
==28010== by 0x406AFE: main (vspipe.cpp:580)
==28010== Your program just tried to execute an instruction that Valgrind
==28010== did not recognise. There are two possible reasons for this.
==28010== 1. Your program has a bug and erroneously jumped to a non-code
==28010== location. If you are running Memcheck and you just saw a
==28010== warning about a bad jump, it's probably your program's fault.
==28010== 2. The instruction is legitimate but Valgrind doesn't handle it,
==28010== i.e. it's Valgrind's fault. If you think this is the case or
==28010== you are not sure, please let us know and we'll try to fix it.
==28010== Either way, Valgrind will now raise a SIGILL signal which will
==28010== probably kill your program.
==28010==
==28010== Process terminating with default action of signal 4 (SIGILL)
==28010== Illegal opcode at address 0x5A62EB3
==28010== at 0x5A62EB3: PyUnicode_FromUnicode (in /usr/lib64/libpython3.5m.so.1.0)
==28010== by 0x5AFBC32: _PySys_Init (in /usr/lib64/libpython3.5m.so.1.0)
==28010== by 0x5AEEE46: _Py_InitializeEx_Private (in /usr/lib64/libpython3.5m.so.1.0)
==28010== by 0x4E37515: real_init() (vsscript.cpp:82)
==28010== by 0x4E37B46: _M_invoke<> (functional:1531)
==28010== by 0x4E37B46: operator() (functional:1520)
==28010== by 0x4E37B46: void std::__once_call_impl<std::_Bind_simple<void (*())()> >() (mutex:697)
==28010== by 0x61832F0: __pthread_once_slow (pthread_once.c:116)
==28010== by 0x4E3784D: __gthread_once (gthr-default.h:699)
==28010== by 0x4E3784D: call_once<void (&)()> (mutex:729)
==28010== by 0x4E3784D: vsscript_init (vsscript.cpp:93)
==28010== by 0x406AFE: main (vspipe.cpp:580)
==28010==
==28010== HEAP SUMMARY:
==28010== in use at exit: 214,921 bytes in 177 blocks
==28010== total heap usage: 1,183 allocs, 1,006 frees, 337,862 bytes allocated
==28010==
==28010== LEAK SUMMARY:
==28010== definitely lost: 0 bytes in 0 blocks
==28010== indirectly lost: 0 bytes in 0 blocks
==28010== possibly lost: 0 bytes in 0 blocks
==28010== still reachable: 214,921 bytes in 177 blocks
==28010== suppressed: 0 bytes in 0 blocks
==28010== Reachable blocks (those to which a pointer was found) are not shown.
==28010== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==28010==
==28010== For counts of detected and suppressed errors, rerun with: -v
==28010== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Any idea why?

Are_

14th March 2016, 23:58

OK, I somehow managed; I had to recompile Python without -march=native :(
Some valgrind gibberish (https://www.dropbox.com/s/62xro2q770bf4hj/mvtools-valgrind.tar.bz2?dl=0)

feisty2

15th March 2016, 05:02

Random question here, why did you convert all cpp code to c?
Is it like cpp sucks as a programming language?

jackoneill

15th March 2016, 18:05

feisty2: C is simpler. Most of MVTools doesn't really need to be C++. That is, the code is not much more complicated when converted to C. It seemed like a good idea at the time, but I've come to regret it because of all the bugs I introduced.

Are_: Unfortunately those logs are not much use. Valgrind is still finding some instructions it doesn't like. I am unable to reproduce the memory leak, so you'll just have to keep recompiling things without -march=native until Valgrind stops talking about illegal instructions.

Edit: I forgot, you could also try -fsanitize=leak.

Are_

17th March 2016, 14:54

I think its output is still useless, but I may need a little assistance discovering which package needs to be rebuilt without optimizations next. :(
more valgrind logs (https://www.dropbox.com/s/0twajbp5hawfcn7/valgrind-logs.tar.bz2?dl=0)

EDIT: I did try -fsanitize=leak but I think it does not output anything significant, maybe the memory leak is not in mvtools at all.

jackoneill

17th March 2016, 17:33

I think its output is still useless, but I may need a little assistance discovering which package needs to be rebuilt without optimizations next. :(
more valgrind logs (https://www.dropbox.com/s/0twajbp5hawfcn7/valgrind-logs.tar.bz2?dl=0)

EDIT: I did try -fsanitize=leak but I think it does not output anything significant, maybe the memory leak is not in mvtools at all.

That looks like your libc.

You compiled vspipe with -fsanitize=leak, like the gcc manual says, right?

Just out of curiosity, does it help to have core.max_cache_size=300 in the script? It shouldn't, because the cache is limited to 1 GiB by default.

jackoneill

17th March 2016, 20:32

Maybe a different approach is in order.

In my fork (https://github.com/dubhater/vapoursynth/commits/master) you will find a commit which makes VapourSynth print something like this before vspipe outputs any frames:

Requesting 20 frames from com.vapoursynth.ffms2/Source.
Process memory usage before: 46 MiB, after: 54 MiB, difference: 8.
Requesting 20 frames from com.vapoursynth.std/SeparateFields.
Process memory usage before: 54 MiB, after: 58 MiB, difference: 4.
Requesting 20 frames from fmtconv/resample.
Process memory usage before: 58 MiB, after: 65 MiB, difference: 7.
Requesting 20 frames from fmtconv/bitdepth.
Process memory usage before: 65 MiB, after: 83 MiB, difference: 18.
Requesting 20 frames from chikuzen.does.not.have.his.own.domain.scd/Detect.

The last filter mentioned should be the leaky one, if it runs out of memory after only six frames.
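
For long logs, a short script can pick out the last filter that was requested and the memory growth reported for each one. This is only a sketch that assumes the exact line format shown in the sample above; the function names are hypothetical and not part of VapourSynth.

```python
import re

REQUEST_RE = re.compile(r"Requesting (\d+) frames from (\S+)\.$")
USAGE_RE = re.compile(r"Process memory usage before: (\d+) MiB, after: (\d+) MiB")

SAMPLE_LOG = """\
Requesting 20 frames from com.vapoursynth.ffms2/Source.
Process memory usage before: 46 MiB, after: 54 MiB, difference: 8.
Requesting 20 frames from com.vapoursynth.std/SeparateFields.
Process memory usage before: 54 MiB, after: 58 MiB, difference: 4.
Requesting 20 frames from fmtconv/resample.
Process memory usage before: 58 MiB, after: 65 MiB, difference: 7.
Requesting 20 frames from fmtconv/bitdepth.
Process memory usage before: 65 MiB, after: 83 MiB, difference: 18.
Requesting 20 frames from chikuzen.does.not.have.his.own.domain.scd/Detect.
"""


def parse_memory_log(text):
    """Return a list of (filter_name, growth_in_MiB or None) in log order."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        request = REQUEST_RE.match(line)
        if request:
            # Growth stays None until the matching usage line is seen.
            entries.append([request.group(2), None])
            continue
        usage = USAGE_RE.match(line)
        if usage and entries:
            before, after = int(usage.group(1)), int(usage.group(2))
            entries[-1][1] = after - before
    return [tuple(entry) for entry in entries]


def last_requested_filter(text):
    """Name of the last filter requested, i.e. the suspect if the process died."""
    entries = parse_memory_log(text)
    return entries[-1][0] if entries else None


entries = parse_memory_log(SAMPLE_LOG)
print(last_requested_filter(SAMPLE_LOG))
```

A filter whose growth is still None never reported back, which is the case the post describes as the likely leaky one.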

Are_

18th March 2016, 00:45

It looks like it's Degrain1: log (https://paste.kde.org/pfloc7efp/noz9w0)
core.max_cache_size=300 didn't make any difference.

jackoneill

18th March 2016, 08:53

It looks like it's Degrain1: log (https://paste.kde.org/pfloc7efp/noz9w0)
core.max_cache_size=300 didn't make any difference.

... Oh. This whole time I was testing with a normal MVTools. With -fsanitize=undefined I get the memory leak too. It must be something about those misaligned loads, because everything is fine if I pass isse=0 to Degrain1.
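
For reference, the addresses the sanitizer complained about earlier in the thread really are off an 8-byte boundary, which is easy to check with a remainder. This sketch only illustrates what "misaligned" means for those reports; it says nothing about why the sanitized build leaks, and the helper name is made up.

```python
# Addresses copied from the "load of misaligned address" lines in the log.
ADDRESSES = [
    0x7f6b28be1b77,
    0x7f6b294080c4,
    0x7f93ef33b297,
    0x7f93efb617e4,
    0x7f93f14506a4,
    0x7f93f1c756a2,
]


def misalignment(address, alignment=8):
    """Bytes past the previous multiple of `alignment` (0 means aligned)."""
    return address % alignment


offsets = [misalignment(address) for address in ADDRESSES]
print(offsets)  # a non-zero entry means the address is not 8-byte aligned
```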

jackoneill

18th March 2016, 22:52

Right then. v12 is here (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v12), with a change that some people requested a long time ago.

* FlowFPS: enable multi-threading.
* Analyse, Recalculate: pass the motion vectors in frame properties instead of the frames themselves.
* Support systems other than x86.
* Analyse: add sanity checks for pnew, pzero, pglobal, plevel parameters.
* Recalculate: add sanity check for pnew parameter.
* FlowFPS, FlowInter, Mask: fix use of uninitialised memory, which made the output slightly different every time (bug inherited from the Avisynth plugin).
* Analyse, Recalculate: prevent possible crash with 4x4 blocks and 8 bit input.
* Analyse, Recalculate: fix block size sanity check when divide=True (bug inherited from the Avisynth plugin).
* Analyse, Recalculate: fix some more signed integer overflows with 16 bit input.

Oh yeah. If you have a CPU with many threads, let me know how fast FlowFPS is, compared to v11.

MonoS

26th March 2016, 22:03

Is it still possible to save the vectors data to a file??

jackoneill

26th March 2016, 23:13

Is it still possible to save the vectors data to a file??

Looks like it's not possible. Do you need to do that?

MonoS

26th March 2016, 23:22

Looks like it's not possible. Do you need to do that?

I used the write vector function in my helper script quite a lot, it helped me save quite a lot of encoding days.

jackoneill

28th March 2016, 16:14

I used the write vector function in my helper script quite a lot, it helped me save quite a lot of encoding days.

You'll be able to save the vectors once you update to VapourSynth R32.

Instead of saving the contents of the frames returned by Analyse, you'll need to save two frame properties: "MVTools_MVAnalysisData" and "MVTools_vectors". The first one has the same contents in every frame.

Fizick

28th March 2016, 17:47

Hi! Thanks for the hard work!
Sorry, I am not ready for VapourSynth yet, but I intend to update some functions of MVTools for Avisynth (add overlapping to MVBlockFps, improve masking, maybe planar formats).
I was out of the game for some time. Questions:
1. Which version was used as the base?
2. Were any important bugs found?
What could be backported?

MonoS

28th March 2016, 19:03

You'll be able to save the vectors once you update to VapourSynth R32.

Instead of saving the contents of the frames returned by Analyse, you'll need to save two frame properties: "MVTools_MVAnalysisData" and "MVTools_vectors". The first one has the same contents in every frame.

Thank you, waiting for the final R32 release and some documentation about that functionality :)

jackoneill

28th March 2016, 19:08

Hi! Thanks for the hard work!
Sorry, I am not ready for VapourSynth yet, but I intend to update some functions of MVTools for Avisynth (add overlapping to MVBlockFps, improve masking, maybe planar formats).
I was out of the game for some time. Questions:
1. Which version was used as the base?
2. Were any important bugs found?
What could be backported?

1. The base is your latest, version 2.5.11.3.
2. Depends what you consider important. There are these (https://github.com/dubhater/vapoursynth-mvtools/commit/4dcaa2cbf344a70e91031f0bf9b9a753e41faa7f) two bugs (https://github.com/dubhater/vapoursynth-mvtools/commit/6e78a6e74a31b867348b0046613f41df78a4d692). There are also several parameters that didn't have sanity checks.

I invite you to peruse readme.rst (https://github.com/dubhater/vapoursynth-mvtools#differences), the release notes (https://github.com/dubhater/vapoursynth-mvtools/releases), and the list of commits (https://github.com/dubhater/vapoursynth-mvtools/commits/master). You can also see the changes between two particular versions (https://github.com/dubhater/vapoursynth-mvtools/compare/v4...v5).

jackoneill

28th March 2016, 19:20

Thank you, waiting for the final R32 release and some documentation about that functionality :)

vector_clip = core.mv.Analyse(...)
frame = vector_clip.get_frame(0)
file.write(frame.props.MVTools_MVAnalysisData)
for i in range(vector_clip.num_frames):
    frame = vector_clip.get_frame(i)
    file.write(frame.props.MVTools_vectors)

Something like that (completely untested code). The point is that accessing those frame properties will give you Python "bytes" objects whose contents you can write to a file or whatever. VapourSynth R31 and older will only give you the bytes up to the first null byte. This has been fixed in git.
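
One practical detail if the file is to be read back later: the "bytes" objects can contain null bytes and may differ in size from frame to frame, so the file needs some way to mark record boundaries. Below is a minimal sketch that prefixes each record with its length. The file layout and the function names are my own assumptions, not an MVTools format, and the demo uses placeholder bytes instead of real frame properties so it runs without VapourSynth.

```python
import io
import itertools
import struct


def write_records(fileobj, analysis_data, vectors_per_frame):
    """Write the analysis data, then one record per frame, each length-prefixed."""
    count = 0
    for blob in itertools.chain([analysis_data], vectors_per_frame):
        fileobj.write(struct.pack("<I", len(blob)))  # 4-byte little-endian size
        fileobj.write(blob)
        count += 1
    return count


def read_records(fileobj):
    """Return (analysis_data, [vectors for frame 0, frame 1, ...])."""
    records = []
    while True:
        header = fileobj.read(4)
        if not header:
            break
        if len(header) != 4:
            raise ValueError("truncated length header")
        (size,) = struct.unpack("<I", header)
        blob = fileobj.read(size)
        if len(blob) != size:
            raise ValueError("truncated record")
        records.append(blob)
    if not records:
        raise ValueError("file contains no records")
    return records[0], records[1:]


# Placeholder data standing in for the two frame properties.
fake_analysis = b"\x01\x00\x00\x00meta"
fake_vectors = [b"\x00\x00vec0", b"vec1\x00", b""]

buffer = io.BytesIO()
written = write_records(buffer, fake_analysis, fake_vectors)
buffer.seek(0)
analysis_back, vectors_back = read_records(buffer)
```

With a real clip, the arguments would be the contents of MVTools_MVAnalysisData from one frame and MVTools_vectors from every frame, as described above.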

Boulder

3rd April 2016, 12:53

This part of a script crashes vspipe, pointing to libmvtools.dll:

import vapoursynth as vs
core = vs.get_core()

clp = core.dgdecodenv.DGSource(r'dictator.dgi')

superanalyse = core.mv.Super(clp, pel=2, chroma=False, rfilter=4)

bv1 = core.mv.Analyse(superanalyse, blksize=16, overlap=8, chroma=False, isb=True, delta=1)
fv1 = core.mv.Analyse(superanalyse, blksize=16, overlap=8, chroma=False, isb=False, delta=1)

Apparently it's the superclip creation part. If I just try to output 'superanalyse', the crash occurs.

jackoneill

3rd April 2016, 14:15

Here is a fixed DLL: http://ulozto.net/xEsRa8NM/vapoursynth-mvtools-v12-win64-7z

Boulder

3rd April 2016, 14:22

Thanks, works perfectly :)

edcrfv94

4th April 2016, 07:10

mvtools v12
haf.QTGMC(src, Preset='Slow', TFF=True)
VapourSynth Editor crashes, but v11 is fine.

jackoneill

4th April 2016, 11:07

mvtools v12
haf.QTGMC(src, Preset='Slow', TFF=True)
VapourSynth Editor crashes, but v11 is fine.

Did you try this DLL from a few posts up? http://ulozto.net/xEsRa8NM/vapoursynth-mvtools-v12-win64-7z

edcrfv94

4th April 2016, 22:39

Did you try this DLL from a few posts up? http://ulozto.net/xEsRa8NM/vapoursynth-mvtools-v12-win64-7z

Works fine. Thanks.

Tarutaru

5th April 2016, 17:41

Hi,

After upgrading to the latest version from Git, using RefineMotion=True in SMDegrain will throw this error:
# vspipe foo.vpy -
Filter Recalculate declared the size 1952x2486, but it returned a frame with the size 1952x1112.
Aborted (core dumped)
My input source is a 1920x1080 clip; I don't know why it returns these strange resolutions.

vpy code:
import vapoursynth as vs
import havsfunc as haf
core = vs.get_core()
ret = core.lsmas.LWLibavSource(source='/tmp/foo.bar')
ret = haf.SMDegrain(ret, RefineMotion=True)
ret.set_output()

Thanks!

jackoneill

5th April 2016, 18:28

It's fixed now.

1952 = 1920 + 16 + 16 (padding). The strange heights are due to the downscaled copies made for motion estimation. Look at the output of mv.Super if you're curious.
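
The padding arithmetic can be written out as a quick check. This sketch assumes the vertical padding is also 16 pixels on each side, which is not stated here but matches the 1112 in the error message. It does not try to reproduce the declared height of 2486, which comes from the extra downscaled copies mentioned above. The function name is hypothetical.

```python
def padded_size(width, height, hpad=16, vpad=16):
    """Size of one padded level: padding is added on both sides of each axis."""
    return width + 2 * hpad, height + 2 * vpad


print(padded_size(1920, 1080))  # (1952, 1112)
```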