View Full Version : MVTools, Depan, DepanEstimate for VapourSynth


Mystery Keeper
5th October 2014, 22:51
jackoneill

I really don't want to suggest anything, but the code in SVPflow is really cleaned up compared to the original MVTools ;)
Just compare a few numbers - as you already know all the magic is in "PlaneOfBlocks" cpp/h, and they're ~80 KB of code(*) in MVTools (and in your build too) BUT only 42 KB in SVPflow.

(*) huge commented blocks are also included


Also, the original MVTools is losing >= 20% of performance for nothing...

So would you kindly port it to VapourSynth?

chainik_svp
5th October 2014, 23:03
So would you kindly port it to VapourSynth?

Since we (SVP) need ffdshow support I'm thinking only of AVS+ 64bit right now.

But if "porting" is just a few interface functions then why not...

===

In fact it's not great to have so many branches of MVTools:
- original version
- the one with MT built-in
- SVP's build of original version plus (external) GPU rendering
- completely refactored SVPflow
- this VS version

and all the versions but SVPflow share the very same code for the MV search algorithm

Mystery Keeper
5th October 2014, 23:18
chainik_svp, put algorithm into static library and link it to the different projects?

chainik_svp
5th October 2014, 23:26
chainik_svp, put algorithm into static library and link it to the different projects?

this is how SVPflow works :)

rendering part with GPU support (and some more features) is "hidden"
"MAnalyse" part is in a separate GPL library

since SVPflow is the only branch changing something in the MVTools' math I really think it should be the base for any other versions

Mystery Keeper
5th October 2014, 23:35
chainik_svp, keep in mind that VapourSynth is multithreaded (GPU use becomes complicated), cross-platform, and used in both 32-bit and 64-bit versions. Though the 32-bit version is only needed when a script uses AviSynth plugins.

jackoneill
9th October 2014, 12:58
v4 is out (https://github.com/dubhater/vapoursynth-mvtools/releases), with more barely tested changes.

* Fix the use of an uninitialised variable in Recalculate (kind of important).
* Add some more SAD functions. Block sizes of 8x4, 16x2, 32x16, and 32x32 should now be just as fast as in the original Avisynth plugin.
* Allow YUV422P8 input. The filters only needed to accept such clips, because the required code was already there.
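
For readers unfamiliar with the term: SAD is the sum of absolute differences between two blocks of pixels, the cost that the motion search tries to minimise. Below is a minimal pure-Python sketch of the idea, not the plugin's optimised code. The function name `block_sad` and the list-of-rows frame layout are assumptions made for illustration only.

```python
def block_sad(cur, ref, x, y, rx, ry, width=8, height=4):
    """Sum of absolute differences between the block at (x, y) in `cur`
    and the block of the same size at (rx, ry) in `ref`.

    Frames are plain lists of rows of integer samples (an assumption made
    for this sketch; a real plugin works on raw plane pointers)."""
    total = 0
    for row in range(height):
        cur_row = cur[y + row]
        ref_row = ref[ry + row]
        for col in range(width):
            total += abs(cur_row[x + col] - ref_row[rx + col])
    return total


# Two tiny 8x4 "frames": the reference is the current frame plus 2 everywhere.
cur = [[10 * r + c for c in range(8)] for r in range(4)]
ref = [[10 * r + c + 2 for c in range(8)] for r in range(4)]

# 32 samples, each differing by 2.
print(block_sad(cur, ref, 0, 0, 0, 0))
```

A dedicated function per block size lets the width and height be compile-time constants, which is presumably why each size listed above needed its own fast version.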

Are_
10th October 2014, 16:56
Wow, I'm seeing a ~20% speed improvement on my Linux box; either I'm doing something wrong or you did a really good job there.

jackoneill
10th October 2014, 18:51
Wow, I'm seeing a ~20% speed improvement on my Linux box; either I'm doing something wrong or you did a really good job there.

If you compiled the latest from git it's probably that change the SVP folks made. I finally copied it. v5 will have it.

Aurelio
7th November 2014, 14:38
Great work!

Any plans to port also MFlowFps and MBlockFps?

jackoneill
7th November 2014, 21:29
Great work!

Any plans to port also MFlowFps and MBlockFps?

Yes, there are plans.

Mystery Keeper
10th December 2014, 11:04
An unofficial build of MVTools test version with "dct" parameter back and working. (https://www.mediafire.com/?7vxaaf01dnqspqx)

Reel.Deel
23rd December 2014, 22:21
Since Mystery Keeper's build is already a few commits behind, any plans to release a current binary?

Mystery Keeper
24th December 2014, 00:45
I just thought I shouldn't build on every commit and waited to be notified about release. Can build anytime.

jackoneill
24th December 2014, 11:09
Since Mystery Keeper's build is already a few commits behind, any plans to release a current binary?

I don't think the newer commits change much for the user.

MonoS
24th December 2014, 14:53
Any information about 16bit??

jackoneill
24th December 2014, 16:47
Any information about 16bit??

It will happen in a few years.

jackoneill
2nd January 2015, 14:03
Here is v5 (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v5).



* Import speedup from the SVP fork. Affects every filter except Super.
* Add "search_coarse" parameter to Analyse, also from SVP.
* Add FlowFPS and BlockFPS.
* Bring back the "dct" parameter to Analyse and Recalculate (thanks to Mystery Keeper). I hope it sticks this time.
* Fix bug with infinite clips and isb=False in Analyse and Recalculate.


I'm told that FlowFPS is too slow for realtime frame rate conversion (24 to 60 fps). BlockFPS is probably fast enough, but its output is even uglier. :)

mark0077
5th January 2015, 20:33
I currently use SVP dlls with a customised InterFrame script in our HTPC. Is v5 above with the speedups and some of the params from SVP close to being everything that has changed from mvtools to SVP, or are there any plans for the full SVP code to be workable with vapoursynth.

Is it worth taking at this early stage for a live system over avisynth 2.6 MT with SVP dlls in terms of quality and speed?

jackoneill
6th January 2015, 11:35
I currently use SVP dlls with a customised InterFrame script in our HTPC. Is v5 above with the speedups and some of the params from SVP close to being everything that has changed from mvtools to SVP, or are there any plans for the full SVP code to be workable with vapoursynth.

Is it worth taking at this early stage for a live system over avisynth 2.6 MT with SVP dlls in terms of quality and speed?

I believe the speedup and the "search_coarse" parameter were the only changes between 2.5.11.3 and the SVP fork's 2.5.11.9. The SVP's additional filters are not open source, as far as I know.

I have heard that Avisynth MT is unstable and not very efficient in the way it does multithreading. :) The quality is the same. For the speed, see the comparisons posted earlier in this thread.

Pat357
25th January 2015, 19:00
I can't get Degrain (from libmvtools.dll v5) to work. Sometimes it works when I limit the threads to 1 and use only Degrain1.
With Avisynth MVTools version, I have no problems at all and can use threads=8 and MDegrain3.

What happens is that vspipe.exe always crashes when (trying) to produce an output clip. I get no error code or whatever.
When I use VSEDIT-32b, and do "check script" it gives all OK and provides me the correct properties from the output clip, but when I try to preview, the editor crashes the same way as vspipe.exe

Also using the VSFS plugin, it always creates a large .avi file, but as soon as I try to read it using a player, the file disappears and the player gives a black screen.


Faulting application name: vspipe.exe, version: 0.0.0.0, time stamp: 0x547394ac
Faulting module name: libmvtools.dll, version: 0.0.0.0, time stamp: 0x00000000
Exception code: 0xc0000005
Fault offset: 0x0023126e
Faulting process id: 0x112c
Faulting application start time: 0x01d038ba87e6c605
Faulting application path: c:\Program Files (x86)\VapourSynth\core32\vspipe.exe
Faulting module path: c:\Program Files (x86)\VapourSynth\filters\vapoursynth-mvtools-v5-win32\libmvtools.dll
Report Id: c5d8c719-a4ad-11e4-b6fb-005056c00008

Same for VSedit-32 :

Faulting application name: vsedit-32bit.exe, version: 3.0.0.0, time stamp: 0x548596cb
Faulting module name: libmvtools.dll, version: 0.0.0.0, time stamp: 0x00000000
Exception code: 0xc0000005
Fault offset: 0x0023126e
Faulting process id: 0x9c0
Faulting application start time: 0x01d038d06ee303c6
Faulting application path: K:\programs\VapourSynthEditor-32bit\vsedit-32bit.exe
Faulting module path: c:\Program Files (x86)\VapourSynth\filters\vapoursynth-mvtools-v5-win32\libmvtools.dll
Report Id: f88ead17-a4c3-11e4-b6fb-005056c00008

Fault bucket 813441851, type 17
Event Name: APPCRASH
Response: Not available
Cab Id: 0

Problem signature:
P1: vsedit-32bit.exe
P2: 3.0.0.0
P3: 548596cb
P4: libmvtools.dll
P5: 0.0.0.0
P6: 00000000
P7: c0000005
P8: 0023126e
P9:
P10:

Attached files:
F:\TEMP\WERD819.tmp.WERInternalMetadata.xml

These files may be available here:
C:\Users\patrick\AppData\Local\Microsoft\Windows\WER\ReportArchive\AppCrash_vsedit-32bit.exe_86177ff6bffc9c9619a498727ee95caed3faf8e0_172cf356

Analysis symbol:
Rechecking for solution: 0
Report Id: f88ead17-a4c3-11e4-b6fb-005056c00008
Report Status: 0


Here is my "problem" script :


import vapoursynth as vs
core = vs.get_core(threads=4, accept_lowercase = True)
core.std.LoadPlugin(path=r"c:\Program Files (x86)\VapourSynth\filters\vapoursynth-mvtools-v5-win32\libmvtools.dll")
core.avs.loadplugin(path=r"k:\programs\Neuron\dgmpgdec158\DGDecode.dll")
ret=core.avs.MPEG2Source(r"k:\film\Ghost Rider - Spirit of Vengeance (2012) HDTVRip\good files demuxed\ghost - 1 - MPEG2, 576p25.d2v")
ret=core.std.Trim(ret, 2000 , 5000)
ret = core.resize.Lanczos(clip=ret, width=720, height=480, format=vs.YUV420P8)
src=ret
c = core.mv.Super(src,pel=2, sharp=1)
bv3 = core.mv.Analyse(c, isb = 1, delta = 3, overlap=4)
bv2 = core.mv.Analyse(c, isb = 1, delta = 2, overlap=4)
bv1 = core.mv.Analyse(c, isb = 1, delta = 1, overlap=4)
fv1 = core.mv.Analyse(c, isb = 0, delta = 1, overlap=4)
fv2 = core.mv.Analyse(c, isb = 0, delta = 2, overlap=4)
fv3 = core.mv.Analyse(c, isb = 0, delta = 3, overlap=4)
# ret = core.mv.Degrain1(clip=src, super=c, mvbw=bv1, mvfw=fv1)
ret = core.mv.Degrain3(clip=src, super=c, mvbw=bv1, mvfw=fv1, mvbw2=bv2, mvfw2=fv2 ,mvbw3=bv3, mvfw3=fv3)
ret.set_output()


Any idea why vspipe and vsedit-32 both crash without any error ?
Also if I do "vspipe --info script.vpy - " , it does not crash and gives me the correct properties from output video, no crashes.

i7-970 3.2Ghz processor (6+6 cores), 24 GB RAM, WIN7pro-x64 , full up to date.
Vapoursynth r25 and r26rc2 both tested (32-bit versions), libmvtools.dll is latest v5, Python v3.4.2 (32-bit)
System is not over-clocked.

Other .vpy files that do not use the libmvtools.dll plugin run great, so the installation is not the problem, I think.

Also, to exclude other factors as much as possible, this script based on "original MVTools" runs just fine :

import vapoursynth as vs
# import sys
core = vs.get_core(threads=8, accept_lowercase = True)
# core.std.LoadPlugin(path=r"k:\programs\ffms2-r936c59-avs_vsp\ffms2.dll")
# core.std.LoadPlugin(path=r"c:\Program Files (x86)\VapourSynth\filters\vapoursynth-mvtools-v5-win32\libmvtools.dll")
core.avs.loadplugin(path=r"k:\programs\Neuron\dgmpgdec158\DGDecode.dll")
core.avs.LoadPlugin(path=r"k:\programs\AviSynth 2.5\special filters\mvtools-v2.5.11.3\mvtools2.dll")
# core.std.LoadPlugin(path=r"k:\programs\AviSynth 2.5\special filters\Vapoursynth avisynth filters\vsrawsource.dll")
core.avs.LoadPlugin(path=r"k:\programs\AviSynth 2.5\special filters\masktools-v2.0a48\mt_masktools-26.dll")
core.avs.LoadPlugin(path=r"k:\programs\AviSynth 2.5\special filters\MaskTools-v1.5.8\MaskTools.dll")
# ret=core.ffms2.Source(r"k:\film\Battlefield_3_Fault_Line_Full_Trailer_mov_remux.mkv", threads=1)
ret=core.avs.MPEG2Source(r"k:\film\Ghost Rider - Spirit of Vengeance (2012) HDTVRip\good files demuxed\ghost - 1 - MPEG2, 576p25.d2v")
ret=core.std.Trim(ret, 2000 , 7000)
ret = core.resize.Lanczos(clip=ret, width=720, height=408, format=vs.YUV420P8)
src=ret
c = core.avs.MSuper(src,pel=2, sharp=1)
bv3 = core.avs.MAnalyse(c, isb = 1, delta = 3, overlap=4)
bv2 = core.avs.MAnalyse(c, isb = 1, delta = 2, overlap=4)
bv1 = core.avs.MAnalyse(c, isb = 1, delta = 1, overlap=4)
fv1 = core.avs.MAnalyse(c, isb = 0, delta = 1, overlap=4)
fv2 = core.avs.MAnalyse(c, isb = 0, delta = 2, overlap=4)
fv3 = core.avs.MAnalyse(c, isb = 0, delta = 3, overlap=4)
ret = core.avs.MDegrain3(src,c,bv1,fv1,bv2,fv2,bv3,fv3,thSAD=400)
# ret = core.avs.Degrain3(clip=src, super=c, mvbw=bv1, mvfw=fv1, mvbw2=bv2, mvfw2=fv2 ,mvbw3=bv3, mvfw3=fv3 , thsad=400)
# ret = core.avs.MDegrain3(c1=src,c2=c,c3=bv1,c4=fv1,c5=bv2,c6=fv2,thSAD=400)
# diff = core.avs.mt_makediff(src,ret1)
# ret2 = core.avs.mt_adddiff(src,diff)
# ret = core.resize.Lanczos(clip=ret, width=1440, height=720, format=vs.YUV420P8)
# last=ret
# ret = core.std.Trim(ret , 1 , 5001 )
# ret.output(sys.stdout, y4m=True)
# print(core.raws.FormatList())
# print(core.list_functions())
ret.set_output()



I have already spent 2 full days trying to find the issue, and honestly I'm still clueless:
- Testing other files and various formats and containers
- Testing with other plugins to read the files: AVISource, FFMS v2.20, DGdecNV 2048, DG MPEG2Source 1.58, ...
- Memory consumption was never more than 400 MB.

How can I further debug this ? What tools can I use for this ?

Are_
25th January 2015, 20:18
This is strange, maybe related to the 32bit version on Windows.
Did you test 64bit version?

I managed a ~5h long encode making use of d2v+qtgmc+mvtools with no problems (also your snippet does not give me any error).

Mystery Keeper
25th January 2015, 21:05
24GB RAM? Why in the world are you using 32-bit versions? Use 64-bit. Crash with no error might very well be memory allocation error.

jackoneill
25th January 2015, 23:17
What if you pass "isse=False" to all the MVTools filters?

Also you can replace DGDecode with d2vsource (https://github.com/dwbuiten/d2vsource/releases).

foxyshadis
25th January 2015, 23:39
Hmm, an access violation doesn't sound like an out of memory, but a stray pointer somewhere. jackoneill, have you tried running the Clang Static Analyzer to check for issues? It's in Fedora with yum install llvm-clang-analyzer

jackoneill
26th January 2015, 13:03
I wouldn't know what to do with the static analyzer.

Does this happen to work better? http://ulozto.net/xZ99sXrQ/vapoursynth-mvtools-1ec0868f3bd9-win32-7z

If the DLL above still crashes, retest Degrain1 and Degrain2. In this new DLL either they all crash or they all work.

Pat357
26th January 2015, 13:16
What if you pass "isse=False" to all the MVTools filters?

Also you can replace DGDecode with d2vsource (https://github.com/dwbuiten/d2vsource/releases).

Thanks, but the "isse=False" didn't help.
The script below gave the expected output once, but every run after that crashed vspipe.exe.

Adapted script :

import vapoursynth as vs
# import sys
core = vs.get_core(threads=2, accept_lowercase = True)
core.std.LoadPlugin(path=r"c:\Program Files (x86)\VapourSynth\filters\vapoursynth-mvtools-v5-win32\libmvtools.dll")
core.std.LoadPlugin(path=r"c:\Program Files (x86)\VapourSynth\filters\d2vsource_beta7\32bit\d2vsource.dll")
ret=core.d2v.Source(input=r"k:\film\Ghost Rider - Spirit of Vengeance (2012) HDTVRip\good files demuxed\ghost - 1 - MPEG2, 576p25.d2v")
# ret=core.std.Trim(ret, 2000 , 5000)
ret = core.resize.Lanczos(clip=ret, width=720, height=480, format=vs.YUV420P8)
src=ret
super = core.mv.Super(src ,pel=2, sharp=1 , isse=False)
# bv3 = core.mv.Analyse(super , blksize=8, isb=True, delta=3, overlap=4)
bv2 = core.mv.Analyse(super , blksize=8, isb=True, delta=2, overlap=4)
bv1 = core.mv.Analyse(super , blksize=8, isb=True, delta=1, overlap=4)
fv1 = core.mv.Analyse(super , blksize=8, isb= False, delta=1, overlap=4)
fv2 = core.mv.Analyse(super , blksize=8, isb=False, delta=2 , overlap=4)
# fv3 = core.mv.Analyse(super , blksize=8, isb=False, delta=3, overlap=4)
# ret = core.mv.Degrain1(clip=src, super=super, mvbw=bv1, mvfw=fv1 , thsad=400)
ret = core.mv.Degrain2(clip=src, super=super, mvbw=bv1, mvfw=fv1, mvbw2=bv2, mvfw2=fv2 , thsad=400 )
# ret = core.mv.Degrain3(clip=src, super=super, mvbw=bv1, mvfw=fv1, mvbw2=bv2, mvfw2=fv2 , mvbw3=bv3, mvfw3=fv3 , thsad=400)
ret.set_output()

Pat357
26th January 2015, 13:51
I wouldn't know what to do with the static analyzer.

Does this happen to work better? http://ulozto.net/xZ99sXrQ/vapoursynth-mvtools-1ec0868f3bd9-win32-7z

If the DLL above still crashes, retest Degrain1 and Degrain2. In this new DLL either they all crash or they all work.

File has been deleted. Please re-upload it.
Thanks !

Pat357
26th January 2015, 14:10
I wouldn't know what to do with the static analyzer.

Does this happen to work better? http://ulozto.net/xZ99sXrQ/vapoursynth-mvtools-1ec0868f3bd9-win32-7z

If the DLL above still crashes, retest Degrain1 and Degrain2. In this new DLL either they all crash or they all work.

Got the file. Thanks for that.

Now Degrain2 sometimes works with "threads=1/2", Degrain3 only with threads=1. Still not stable though: 4 out of 5 attempts still crash vspipe.exe.

I really doubt that memory is an issue here : using the original MVTools, I can use MDegrain3 with 8 threads on the same input video....

Do you need more crash reports ?

jackoneill
26th January 2015, 14:16
I was able to get it to crash in a 32 bit Windows XP, but passing isse=False to Degrain2 and Degrain3 makes them work. Are you sure this doesn't help you?

Pat357
26th January 2015, 14:32
I was able to get it to crash in a 32 bit Windows XP, but passing isse=False to Degrain2 and Degrain3 makes them work. Are you sure this doesn't help you?

Thank you very much !
The first time you mentioned isse=False, I've added it only to Super().
Now I've added it to Degrain2 and Degrain3 and with success !

I'm now running Degrain3 with "threads=4" and still no crashes.

BTW I'm running Win7Pro-64 bit to be able to access my 24GB RAM. Non-Pro Win7-64 can only address 16 GB RAM IIRC.

Does this "isse=False" do what I think it does? Not using SSE instructions?
Would this make the script slower?

jackoneill
26th January 2015, 14:45
isse=False makes Degrain3 use C++ functions instead of some SSE2 functions. It probably makes things slower (you can compare Degrain1 with and without isse).

Pat357
26th January 2015, 14:56
isse=False makes Degrain3 use C++ functions instead of some SSE2 functions. It probably makes things slower (you can compare Degrain1 with and without isse).

Do you have an idea why this fixes Degrain2 and Degrain3 ?

My system is an i7-970 at 3.2 GHz, which should support SSE2 and a lot more.

Do you think switching to vapoursynth 64-bit would give the same problems ?
Is it somewhere intrinsic to VS 32 bit versions ?

jackoneill
26th January 2015, 17:01
Do you have an idea why this fixes Degrain2 and Degrain3 ?

My system is an i7-970 at 3.2 GHz, which should support SSE2 and a lot more.

Do you think switching to vapoursynth 64-bit would give the same problems ?
Is it somewhere intrinsic to VS 32 bit versions ?

Found the problem. My compiler needed an additional parameter. Nothing to do with your computer or VapourSynth. This one should just work: http://ulozto.net/xKFH4DUb/vapoursynth-mvtools-v5-realigned-stack-win32-7z
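
The post does not name the parameter; the download name ("realigned-stack") suggests it concerns stack alignment. As background: SSE2 instructions that assume aligned memory require addresses that are multiples of 16 bytes, while the 32-bit Windows calling conventions only guarantee a 4-byte aligned stack. A local buffer can then land on an address that such an instruction rejects. The sketch below only illustrates the alignment arithmetic; the helper names are made up and the address is an arbitrary example.

```python
def is_aligned(address, alignment=16):
    """True if `address` is a multiple of `alignment` (a power of two)."""
    return address & (alignment - 1) == 0


def align_down(address, alignment=16):
    """Round `address` down to the nearest multiple of `alignment`,
    which is what realigning a downward-growing stack amounts to."""
    return address & ~(alignment - 1)


# An arbitrary 4-byte-aligned stack address, as a 32-bit caller might supply.
stack_pointer = 0x0012FF74

print(is_aligned(stack_pointer, 4))    # True
print(is_aligned(stack_pointer, 16))   # False
print(hex(align_down(stack_pointer)))  # 0x12ff70
```

Whether a given call crashes depends on where the stack pointer happens to sit at that moment, which would fit the "4 out of 5 attempts" pattern reported earlier.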

Pat357
26th January 2015, 19:00
Found the problem. My compiler needed an additional parameter. Nothing to do with your computer or VapourSynth. This one should just work: http://ulozto.net/xKFH4DUb/vapoursynth-mvtools-v5-realigned-stack-win32-7z

Thanks !
This one works without the "isse=False" option ! .:thanks:

I still wonder why I was apparently the first to stumble on this problem.

Another system (Haswell 4770, 16 GB, Win7Pro x64) had the very same problem. I will test this version again on that system.

Myrsloik
26th January 2015, 19:05
Thanks !
This one works without the "isse=False" option ! .:thanks:

I still wonder why I was apparently the first to stumble on this problem.

Another system (Haswell 4770, 16 GB, Win7Pro x64) had the very same problem. I will test this version again on that system.

Most likely because everyone else is using the 64bit version, which doesn't have the problem and is faster. And you probably should do the same unless you really need a certain Avisynth filter.

jackoneill
30th January 2015, 21:29
v6 is here (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v6), and it didn't even take as many years as advertised!

* Add support for grayscale, 4:4:0, and 4:4:4 video.
* Add support for up to 16 bits per sample.
* Add SCDetection filter.
* Reject overlap greater than half the block size.
* Fix crash in BlockFPS when the input clip's frame rate is unknown (introduced in v5).
* Fix colourful bottom border in Degrain3 when overlap is greater than 0 (introduced in v5).
* Fix possible bug with infinite clips in Compensate.
* Fix frequent crash in Degrain2 and Degrain3 due to stack misalignment, specific to the win32 builds. Probably all previous versions are affected.


Obviously everything is slower with 16 bit input, and more memory is used.

DarkSpace
30th January 2015, 21:46
v6 is here (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v6), and it didn't even take as many years as advertised!
Nice!


* Add support for grayscale, 4:4:0, and 4:4:4 video.

What is 4:4:0 supposed to be? Grey-and-Red video? Do you have any examples? I'm rather confused right now...

jackoneill
30th January 2015, 22:26
Nice!


What is 4:4:0 supposed to be? Grey-and-Red video? Do you have any examples? I'm rather confused right now...

In 4:4:0 the chroma has the same width as the luma, and half the height.
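
To make that concrete, here is a small sketch that computes the chroma plane size for a few common layouts. The divisors are the usual ones for these names; the function and table names are only for illustration.

```python
# (horizontal divisor, vertical divisor) applied to the luma size
# to obtain the size of each chroma plane.
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:4:0": (1, 2),
}


def chroma_size(luma_width, luma_height, layout):
    """Return (width, height) of one chroma plane for the given layout."""
    div_w, div_h = SUBSAMPLING[layout]
    return luma_width // div_w, luma_height // div_h


for layout in SUBSAMPLING:
    print(layout, chroma_size(720, 480, layout))
```

For a 720x480 clip this gives 720x240 chroma planes in 4:4:0 and 360x480 in 4:2:2.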

DarkSpace
30th January 2015, 22:39
In 4:4:0 the chroma has the same width as the luma, and half the height.
Ah, okay, thanks. I just noticed (once again) that I assign the wrong meaning to these numbers. :mad:
Now that you explain it, I even remember at least reading once that rotated 4:2:2 is 4:4:0 ... :stupid:

buchanan
2nd February 2015, 23:31
Hi,
I get a crash when feeding QTGMC with a 1920x1080 16bit per sample clip: vspipe crashes, telling me the faulty module is libmvtools.dll
When I open the script using VDub, I get a "division by zero" error in libmvtools. Here is the crash report:

VirtualDub crash report -- build 35491 (release-AMD64)
--------------------------------------

Disassembly:
6ef94c00: 784a js 16ef94c4c
6ef94c02: 8914c7 mov [rdi+rax*8], edx
6ef94c05: 42890c80 mov [rax+r8*4], ecx
6ef94c09: 488b8424c00000 mov rax, [rsp+c0]
00
6ef94c11: 49634b10 movsxd rcx, [r11+10h]
6ef94c15: 4c630c28 movsxd r9, [rax+rbp]
6ef94c19: 31c0 xor eax, eax
6ef94c1b: 4c39c9 cmp ecx, ecx
6ef94c1e: 7d22 jge 16ef94c42
6ef94c20: 4a8d0409 lea rax, [rcx+r9]
6ef94c24: 4c89ca mov edx, r9
6ef94c27: 4829ca sub edx, ecx
6ef94c2a: 480fafc2 imul eax, edx
6ef94c2e: 4d0fafc9 imul ecx, ecx
6ef94c32: 480fafc9 imul ecx, ecx
6ef94c36: 48c1e008 shl rax, 08h
6ef94c3a: 4899 cdq
6ef94c3c: 4c01c9 add ecx, ecx
6ef94c3f: 48f7f9 idiv eax, ecx
6ef94c42: 4983f801 cmp rax, 01h
6ef94c46: 43890487 mov [r15+r8*4], eax
6ef94c4a: 0f859b010000 jnz 16ef94deb
6ef94c50: 8b8c24f4010000 mov ecx, [rsp+1f4]
6ef94c57: 8b8424f0010000 mov eax, [rsp+1f0]
6ef94c5e: 41b900010000 mov ecx, 00000100
6ef94c64: 4c8bb424a00000 mov r14, [rsp+a0]
00
6ef94c6c: 4c897c2438 mov [rsp+38h], r15
6ef94c71: 4883c614 add rsi, 14h
6ef94c75: 48897c2420 mov [rsp+20h], rdi
6ef94c7a: 448d8408010100 lea r8d, [rax+rcx+101]
00
6ef94c82: c1e008 shl eax, 08h
6ef94c85: 99 cdq
6ef94c86: 41f7f8 idiv eax, eax <-- FAULT
6ef94c89: 4129c1 sub ecx, eax
6ef94c8c: 898424f0010000 mov [rsp+1f0], eax
6ef94c93: 89c8 mov eax, ecx
6ef94c95: c1e008 shl eax, 08h
6ef94c98: 488b8c24880000 mov rcx, [rsp+88]
00
6ef94ca0: 99 cdq
6ef94ca1: 41f7f8 idiv eax, eax
6ef94ca4: 8b942408010000 mov edx, [rsp+108]
6ef94cab: 4129c1 sub ecx, eax
6ef94cae: 898424f4010000 mov [rsp+1f4], eax
6ef94cb5: 488b442478 mov rax, [rsp+78h]
6ef94cba: 44894c2430 mov [rsp+30h], r9d
6ef94cbf: 448b8c24d80000 mov r9d, [rsp+d8]
00
6ef94cc7: 4889442428 mov [rsp+28h], rax
6ef94ccc: 488b8424c80000 mov rax, [rsp+c8]
00
6ef94cd4: 4e8d0410 lea r8, [rax+r10]
6ef94cd8: 41ff16 call dword ptr [r14]
6ef94cdb: 4c8ba424b80000 mov r12, [rsp+b8]
00
6ef94ce3: 8d0c1b lea ecx, [rbx+rbx]
6ef94ce6: 488b9424f80000 mov rdx, [rsp+f8]
00
6ef94cee: 4c8b8c24f00000 mov r9, [rsp+f0]
00
6ef94cf6: 4c8b8424880000 mov r8, [rsp+88]
00
6ef94cfe: 4863 db 63h

Built on Althena on Sun Oct 27 16:00:02 2013 using compiler version 1400

Windows 6.1 (Windows 7 x64 build 7601) [Service Pack 1]
Memory status: virtual free 8386984M/8388608M, commit limit 49032M, physical total 24517M

RAX = fffffd00
RBX = 3a0
RCX = ffffff02
RDX = ffffffff
RSI = 6dfc4
RDI = 1540f9d0
RBP = 0
R8 = 0
R9 = 100
R10 = 3a0
R11 = 1873eb88
R12 = 9319ee0
R13 = 0
R14 = 118f7148
R15 = 1540f960
RSP = 1540f770
RIP = 6ef94c86
EFLAGS = 00010287


Crash reason: Integer Divide-by-Zero

Crash context:
An integer division by zero occurred in module 'libmvtools'.

Pointer dumps:

RDI 1540f9d0: 286bdce0 00000000 30b5e610 00000000 28fec020 00000000 293e2020 00000000
RSP 1540f770: 00000002 00000000 00000007 000007fe 30868020 00000000 00000f20 00000000
1540f790: 1540f9d0 00000000 00000010 00000000 00000106 00000000 1540f960 00000000
1540f7b0: 00000000 00000000 00000002 00000000 00000002 00000000 00000010 00000000
1540f7d0: 1540fad0 00000000 1540f940 00000000 1540f9a0 00000000 1540f950 00000000
R11 1873eb88: 000001d0 000002f0 00000000 0000002a ffbc1e25 000001d8 000002f0 00000000
R12 09319ee0: 30868020 00000000 3372270c 88000050 08cc6d30 00000000 3372270d 8800ff50
R14 118f7148: 6f1d0690 00000000 6f1d2190 00000000 6f1d2190 00000000 6ef91df0 00000000
R15 1540f960: fffffffd ffffff02 038f4700 00000000 00000f00 00000780 00000780 00000000

Thread call stack:
6ef94c86: libmvtools!VapourSynthPluginInit [6ef80000+1b80+13106]
7fedb122bee: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+1792e]
7fedb120424: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+15164]
7fedb183c48: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+78988]
7fedb1aca30: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+a1770]
7fedb110439: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+5179]
7fedb184477: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+791b7]
7fedb1846c1: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+79401]
7fedb11f19b: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+13edb]
7fedb11bd93: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+10ad3]
7fedb11d4c5: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+12205]
7fedb192739: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+87479]
7fedb183e15: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+78b55]
7fedb157afb: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+4c83b]
7fedb11fae8: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+14828]
7fedb15b530: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+50270]
7fedb1890f6: VapourSynth!getVapourSynthAPI [7fedb0e0000+2b2c0+7de36]
776959ed: kernel32!BaseThreadInitThunk [77680000+159e0+d]
778cc541: ntdll!RtlUserThreadStart [778a0000+2c520+21]

-- End of report


Do you need any other info I could provide ?

jackoneill
2nd February 2015, 23:50
It's already fixed in git. Until v7, you can avoid the crash by passing "isse=False" to Analyse when feeding it 16 bit video. Only 16 bit video is affected. 15 and lower is fine.

buchanan
2nd February 2015, 23:58
Ok thanks ! :)

Are_
3rd February 2015, 14:39
Welp, more tests then:

CPU is an AMD FX-8150, 3600 MHz, Turbo CORE/Cool n' Quiet/C6 disabled. Windows 7 Ultimate 64bit / Gentoo Linux 64bit.

Input is 720×480 YUV420P8, mpeg2, 2000 frames, decoded with lsmash-works.

VapourSynth command used with an additional "--requests 1" for the 1 thread tests:
vspipe test.py /dev/null --start 9501 --end 11500

AviSynth command used:
AVSMeter.exe "tests.avs" -range=9501,11500

Software versions:
VapourSynth version = r26 (https://github.com/vapoursynth/vapoursynth/releases/tag/R26)
vapoursynth-mvtools version = v6 (https://github.com/dubhater/vapoursynth-mvtools/releases/tag/v6)
avisynth vanilla mvtools version = v2.5.11.3 (http://avisynth.org.ru/mvtools/mvtools2.html#download)
avisynth svp mvtools version = 2.5.11.9-svp (http://www.svp-team.com/wiki/Download)
avisynth firesledge mvtools version = 2.6.0.5 (http://forum.doom9.org/showthread.php?p=1386559#post1386559)
lsmash-works version = r775 (https://www.dropbox.com/sh/3i81ttxf028m1eh/AAABkQn4Y5w1k-toVhYLasmwa?dl=0)
AviSynth version = 2.6.0 RC1 (http://forum.doom9.org/showthread.php?t=171668)
AVSMeter version = v1.9.4 (http://forum.doom9.org/showthread.php?t=165528)

Only the 64bit version of VapourSynth was tested, as 32bit is going to be deprecated and no one should be using it anyway.
No MT version of AviSynth was tested because it has proven unstable and somewhat useless nowadays: 4 GB of RAM divided by the number of threads is not enough for HD content.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Results for Degrain test:

8 threads:
    VapourSynth Windows = 32.35 fps (100% cpu)
    VapourSynth Linux = 37.50 fps (100% cpu)
    AviSynth firesledge = 8.32 fps (35% cpu)
1 thread:
    VapourSynth Windows = 5.32 fps (12% cpu)
    VapourSynth Linux = 5.50 fps (12% cpu)
    AviSynth Vanilla = 4.42 fps (12% cpu)
    AviSynth SVP = 6.05 fps (12% cpu)
VapourSynth script:
import vapoursynth as vs
core = vs.get_core() # threads=1
src = core.lsmas.LWLibavSource(r'720x480 YUV420P8 mpeg2.mkv')
super = core.mv.Super(src)
mvbw3 = core.mv.Analyse(super, isb=True, delta=3, overlap=4)
mvbw2 = core.mv.Analyse(super, isb=True, delta=2, overlap=4)
mvbw = core.mv.Analyse(super, isb=True, delta=1, overlap=4)
mvfw = core.mv.Analyse(super, isb=False, delta=1, overlap=4)
mvfw2 = core.mv.Analyse(super, isb=False, delta=2, overlap=4)
mvfw3 = core.mv.Analyse(super, isb=False, delta=3, overlap=4)
v = core.mv.Degrain3(clip=src, super=super, mvbw=mvbw, mvfw=mvfw, mvbw2=mvbw2, mvfw2=mvfw2, mvbw3=mvbw3, mvfw3=mvfw3)
v.set_output()
AviSynth script:
LWLibavVideoSource("720x480 YUV420P8 mpeg2.mkv")
super = MSuper(last)
mvbw3 = MAnalyse(super, isb=True, delta=3, overlap=4)
mvbw2 = MAnalyse(super, isb=True, delta=2, overlap=4)
mvbw = MAnalyse(super, isb=True, delta=1, overlap=4)
mvfw = MAnalyse(super, isb=False, delta=1, overlap=4)
mvfw2 = MAnalyse(super, isb=False, delta=2, overlap=4)
mvfw3 = MAnalyse(super, isb=False, delta=3, overlap=4)
MDeGrain3(last, super, mvbw, mvfw, mvbw2, mvfw2, mvbw3, mvfw3)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Results for BlockFPS test (change frame rate: 23.97->25):
8 threads:
    VapourSynth Windows = 289.26 fps (99% cpu)
    VapourSynth Linux = 312.38 fps (99% cpu)
    AviSynth firesledge = 51.66 fps (30% cpu)
1 thread:
    VapourSynth Windows = 55.98 fps (12% cpu)
    VapourSynth Linux = 83.92 fps (12% cpu)
    AviSynth Vanilla = 45.41 fps (12% cpu)
    AviSynth SVP = 60.09 fps (12% cpu)
VapourSynth script:
import vapoursynth as vs
core = vs.get_core() # threads=1
v = core.lsmas.LWLibavSource(r'720x480 YUV420P8 mpeg2.mkv')
super = core.mv.Super(v)
mvbw = core.mv.Analyse(super, isb=True, delta=1, overlap=0)
mvfw = core.mv.Analyse(super, isb=False, delta=1, overlap=0)
v = core.mv.BlockFPS(clip=v, super=super, mvbw=mvbw, mvfw=mvfw)
v.set_output()
AviSynth script:
LWLibavVideoSource("720x480 YUV420P8 mpeg2.mkv")
super = MSuper()
mvbw = MAnalyse(super, isb=True, delta=1, overlap=0)
mvfw = MAnalyse(super, isb=False, delta=1, overlap=0)
MBlockFps(super, mvbw, mvfw)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Results for FlowFPS test (change frame rate: 23.97->25):
8 threads:
    VapourSynth Windows = 29.83 fps (26% cpu)
    VapourSynth Linux = 39.08 fps (26% cpu)
    AviSynth firesledge = 19.38 fps (29% cpu)
1 thread:
    VapourSynth Windows = 11.76 fps (12% cpu)
    VapourSynth Linux = 13.81 fps (12% cpu)
    AviSynth Vanilla = 11.75 fps (12% cpu)
    AviSynth SVP = 15.73 fps (12% cpu)
VapourSynth script:
import vapoursynth as vs
core = vs.get_core() # threads=1
v = core.lsmas.LWLibavSource(r'720x480 YUV420P8 mpeg2.mkv')
super = core.mv.Super(v)
mvbw = core.mv.Analyse(super, isb=True, delta=1, overlap=4)
mvfw = core.mv.Analyse(super, isb=False, delta=1, overlap=4)
v = core.mv.FlowFPS(clip=v, super=super, mvbw=mvbw, mvfw=mvfw)
v.set_output()
AviSynth script:
LWLibavVideoSource("720x480 YUV420P8 mpeg2.mkv")
super = MSuper()
mvbw = MAnalyse(super, isb=True, delta=1, overlap=4)
mvfw = MAnalyse(super, isb=False, delta=1, overlap=4)
MFlowFps(super, mvbw, mvfw)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FlowFPS results were strange: not only was it unable to beat the AviSynth version, it was also unable to max out the cores when multithreading was used (only one thread is maxed out).

jackoneill
3rd February 2015, 15:20
Thanks for the comparison!

FlowFPS is the only filter that still runs on a single thread. It's due to the way it's written. The input frames it needs can be generated in parallel, which is why you see some speed-up with 8 threads.
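
As a rough sanity check on that explanation, one can plug the FlowFPS numbers from the Windows test (11.76 fps with 1 thread, 29.83 fps with 8) into Amdahl's law and see what fraction of the work would have to be parallel. This is only a back-of-the-envelope model, assuming Amdahl's law applies and that all 8 threads were usable; it is not a measurement of the plugin.

```python
def parallel_fraction(speedup, threads):
    """Invert Amdahl's law, S = 1 / ((1 - p) + p / n), for p."""
    return (1 - 1 / speedup) / (1 - 1 / threads)


def amdahl_speedup(p, threads):
    """Amdahl's law: speedup with `threads` workers when a fraction `p`
    of the work can run in parallel."""
    return 1 / ((1 - p) + p / threads)


fps_1_thread = 11.76   # VapourSynth Windows, FlowFPS, 1 thread
fps_8_threads = 29.83  # VapourSynth Windows, FlowFPS, 8 threads

speedup = fps_8_threads / fps_1_thread
p = parallel_fraction(speedup, 8)

print(f"observed speedup: {speedup:.2f}x")
print(f"implied parallel fraction: {p:.2f}")
print(f"upper bound with unlimited threads: {1 / (1 - p):.2f}x")
```

Under this model roughly two thirds of the work (generating the input frames) runs in parallel and the remaining serial part caps the gain, which is consistent with only one core being maxed out.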

feisty2
3rd February 2015, 15:31
is core.mv.degrain3 equal to Expr ([core.mv.degrain1 (clip, super, mvbw1, mvfw1).std.Lut ("x / 3"), core.mv.degrain1 (clip, super, mvbw2, mvfw2).std.Lut ("x / 3"), core.mv.degrain1 (clip, super, mvbw3, mvfw3).std.Lut ("x / 3")], ["x y + z +"]) ?
if so, I think I can extend the time radius to any int by script
edit: typo

Are_
3rd February 2015, 15:52
Thanks for the comparison!

FlowFPS is the only filter that still runs on a single thread. It's due to the way it's written. The input frames it needs can be generated in parallel, which is why you see some speed-up with 8 threads.

Thanks to you jackoneill. I see, now it makes sense.

Btw, I updated it with linux results, for some obscure reason, linux is still faster than windows, the power of -march=native?

jackoneill
4th February 2015, 09:54
Oh, I forgot: comparing to 2.5.11.3 isn't exactly fair anymore. I imported a change from the SVP fork which makes it a bit faster, so that's what should be used in comparisons.

Groucho2004
4th February 2015, 10:59
avisynth mvtools version = v2.5.11.3 (http://avisynth.org.ru/mvtools/mvtools2.html#download)
You should try cretindesalpes's 2.6.0.5 that is internally multithreaded. See here (http://forum.doom9.org/showthread.php?p=1386559#post1386559).

Are_
4th February 2015, 13:49
Ok, updated the post with ST results for svp fork and MT results for firesledge fork.

zerowalker
4th February 2015, 22:10
Ok, updated the post with ST results for svp fork and MT results for firesledge fork.

Oh nice.

Something seems off though, was sure it used more CPU.
But then again in my fast tests the FPS difference is like yours.

It's very impressive, only downside is that i am so used to Avisynth that it's hard to get things going, luckily though it's still script which makes it fairly easy to understand:)