Old 26th May 2005, 13:20   #1  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
MT 0.7 (+ custom AviSynth): a filter to run filters multithreaded. Useful for SMP or HT

I made this small filter now that dual-core processors are beginning to show up. I don't know how much faster filters will be on a Pentium 4 with HT, but on my old dual Celeron 400 MHz I got a 40% speed increase. It should also be useful for all the 8-way dual-core Opteron computers.
Also included is a custom build of AviSynth 2.5.7 that provides the two functions SetMTMode and GetMTMode, plus some changes to internal filters to support multithreading.

Please post if there are filters that don't work, or to report what speed increase you got.

Get version 0.7 here (contains avisynth 2.57 MT version 5[src])
avisynth 2.57 MT version 4 [src]
Get version 0.6 here
Get version 0.5 here
or version 0.41 here

You can also get further help with MT at the MediaWiki page here

from the readme

MT is a filter that enables other filters to run multithreaded. This should hopefully speed up processing on hyperthreaded or multicore processors and on multiprocessor systems.
Technical info

Important: Always remember to judge the result by the speed improvement, not by the CPU utilization.

MT is a filter that splits a frame up into smaller fragments that are processed in individual threads, allowing full utilization of multiprocessor or hyperthreading-enabled computers. I tested it on my old Abit BP6 with 2x Celeron 400 MHz and it increased the speed by 40%. Note that if you are already getting 100% CPU utilization when processing AVS scripts (i.e. if you're encoding to DivX/XviD) you don't need to use this filter.

The filter works like this avs function:

function PseudoMT(clip c,string filter)
{
# left half of the frame, filtered on its own
a=eval("c.crop(0,0,c.width/2,c.height)."+filter)
# right half of the frame, filtered on its own
b=eval("c.crop(c.width/2,0,c.width/2,c.height)."+filter)
stackhorizontal(a,b)
}


The only differences are that a and b are executed in parallel and that it is possible to split the frame into more than 2 pieces. If a filter works with the above script, it should work with MT provided the filter code is thread-safe. Dust does not work with the above script, so if you want to use iip, use another denoiser or get Steady to fix the bug.
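
As a rough illustration of the split/process/stack idea, here is a small Python model (not AviSynth, and not MT's actual code). A frame is just a list of pixel rows, and the names split_rows, pseudo_mt and invert are made up for this sketch. It cuts the frame into row strips, which is only one of the two possible split directions.

```python
from concurrent.futures import ThreadPoolExecutor

def split_rows(frame, pieces):
    """Cut a frame (a list of pixel rows) into `pieces` strips of rows."""
    step = len(frame) // pieces
    # The last strip takes any leftover rows.
    bounds = [i * step for i in range(pieces)] + [len(frame)]
    return [frame[bounds[i]:bounds[i + 1]] for i in range(pieces)]

def pseudo_mt(frame, filter_func, threads=2):
    """Filter each strip in its own thread, then stack the results again."""
    strips = split_rows(frame, threads)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        filtered = list(pool.map(filter_func, strips))  # map keeps strip order
    return [row for strip in filtered for row in strip]

def invert(strip):
    """A purely per-pixel filter, so it is safe to run on any fragment."""
    return [[255 - p for p in row] for row in strip]

frame = [[(x + y) % 256 for x in range(8)] for y in range(6)]
# Splitting must not change the result for a per-pixel filter.
print(pseudo_mt(frame, invert, threads=3) == invert(frame))
```

A per-pixel filter like invert gives the same output whether or not the frame is split, which is the property a filter needs in order to be a good candidate for MT.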
Limitations

The filter to be run must accept only one input clip, and that clip is last. The filter should also not rely on the content of the whole frame (as smart deinterlacers do), otherwise there is a risk that only part of the frame will be processed. Finally, the filter should be thread-safe. Most filters are, but some will produce a wrong result or crash.
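
To see why whole-frame filters are a problem, consider this toy Python example (an illustration only; stretch_contrast is a made-up filter, not a real plugin). It scales pixels so that the brightest pixel it can see becomes 255, so each fragment ends up scaled differently.

```python
def stretch_contrast(strip):
    """Scale so the brightest pixel in the data it is given becomes 255."""
    peak = max(max(row) for row in strip)
    return [[p * 255 // peak for p in row] for row in strip]

frame = [[10, 20], [30, 40], [50, 100], [150, 200]]

whole = stretch_contrast(frame)            # sees the real peak (200)
top, bottom = frame[:2], frame[2:]
split = stretch_contrast(top) + stretch_contrast(bottom)

print(whole[0])   # top row scaled against the whole-frame peak
print(split[0])   # same row scaled against the top fragment's own peak (40)
```

The bottom fragment happens to contain the real peak and matches the unsplit result, but the top fragment is stretched far too much, leaving a visible seam.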
Installation

Copy mt.dll into the AviSynth plugin directory. If you don't have version 2.6 installed, also copy the included avisynth.dll into your windows\system32 directory (or wherever avisynth.dll is located), and remember to back up the old avisynth.dll first (rename it or something).

From version 0.7, two other filters are included too:

MTi() creates two threads and lets each thread process one field before combining them, like this avs function:
function PseudoMTi(clip c,string filter)
{
a=eval("c.AssumeFrameBased().SeparateFields.selecteven()."+filter)
b=eval("c.AssumeFrameBased().SeparateFields.selectodd()."+filter)
interleave(a,b).weave()
}
As with the other pseudo-script, a and b are executed in parallel. Note that only two threads are created, so it will only use two (virtual) cores.
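
The field-based variant can be modeled the same way. This is a Python sketch under the same assumptions as before (a frame is a list of rows, the height is even, and pseudo_mti and half_width are names invented here). Even and odd rows stand in for the two fields.

```python
from concurrent.futures import ThreadPoolExecutor

def pseudo_mti(frame, filter_func):
    """Filter the two fields in two threads, then weave them back together."""
    even, odd = frame[0::2], frame[1::2]   # like SeparateFields + SelectEven/SelectOdd
    with ThreadPoolExecutor(max_workers=2) as pool:
        f_even, f_odd = pool.map(filter_func, (even, odd))
    woven = []
    for a, b in zip(f_even, f_odd):        # like Interleave + Weave
        woven.extend([a, b])
    return woven

def half_width(field):
    """Drops every second pixel; a row-wise filter that changes the width."""
    return [row[::2] for row in field]

frame = [[x + 10 * y for x in range(8)] for y in range(4)]
print(pseudo_mti(frame, half_width) == half_width(frame))
```

Because each thread gets a complete field, no seam runs through the middle of the picture, but the filter still has to cope with seeing only every second line.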

MTsource() is used to run source filters multithreaded. It works like this:
function PseudoMTsource(string filter)
{
SetMTmode(2)
eval(filter)
SetMtmode(0)
}
So unlike the two other filters, it is a temporal filter that fetches frames ahead of time and stores them in the cache for fast retrieval.

Syntax
MT(clip clip,string filter,int threads,int overlap,bool splitvertical)

All parameters are named. Function parameters:

clip clip = last
input clip

filter string = No default
filter to run multithreaded. Note that the filter must not change both the frame height and width (but changing colorspace is okay) and only 1 input clip is allowed. It can be any built-in filter, avs-defined filter or external plugin filter as long as the restrictions are observed.

threads int = 2
number of threads to run. Set this to the number of threads your computer is able to run concurrently.

overlap int = 0
- number of pixels to add at the top and bottom borders or the left and right borders. Increase this if you see artifacts where the frame is split.

splitvertical bool = false
- if true the frame is cut vertically (and the filter is allowed to change the height), else it is cut horizontally (and the filter is allowed to change the width).
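
How overlap hides the seam can be shown with a small Python model (again an illustration, not MT's implementation; vblur and split_with_overlap are hypothetical names, the strips are processed one after another for simplicity, and the filter is assumed to keep the height unchanged). Each strip is padded with extra rows from its neighbours, filtered, and then cropped back.

```python
def vblur(strip):
    """3-tap vertical box blur; rows beyond the strip edge are clamped."""
    last = len(strip) - 1
    out = []
    for y in range(len(strip)):
        up, down = strip[max(y - 1, 0)], strip[min(y + 1, last)]
        out.append([(a + b + c) // 3 for a, b, c in zip(up, strip[y], down)])
    return out

def split_with_overlap(frame, pieces, overlap, filter_func):
    """Filter row strips that are padded by `overlap` rows, then crop the padding."""
    step = len(frame) // pieces
    out = []
    for i in range(pieces):
        start = i * step
        end = len(frame) if i == pieces - 1 else start + step
        lo, hi = max(start - overlap, 0), min(end + overlap, len(frame))
        filtered = filter_func(frame[lo:hi])
        # Keep only the rows that belong to this strip.
        out.extend(filtered[start - lo:start - lo + (end - start)])
    return out

frame = [[y * y] for y in range(8)]
reference = vblur(frame)
print(split_with_overlap(frame, 2, 0, vblur) == reference)  # seam rows differ
print(split_with_overlap(frame, 2, 1, vblur) == reference)  # overlap hides the seam
```

A filter with a larger spatial radius needs a correspondingly larger overlap, at the cost of filtering the padded rows twice.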

MTi
MTi(clip clip,string filter)

All parameters are named. Function parameters:

clip clip = last
input clip. Must be mod2 height for RGB and YUY2 colorspaces and mod4 height for the YV12 colorspace.

filter string = No default
filter to run multithreaded. Note that the filter is allowed to change both width and height at the same time, but only 1 input clip is allowed. It can be any built-in filter, avs-defined filter or external plugin filter as long as the restrictions are observed.


MTsource
MTSource(string filter,int delta,int threads,int max_fetch)

All parameters are named. Function parameters:

filter string = No default
source filter to run multithreaded. Currently only internal and external source filters are supported (like DirectShowSource, AviSource, MPEG2Source). You can use an avs-defined filter or a non-source filter, but it might crash or produce frame corruption.

delta int = 1
this is how many frames there are between each frame request, so if you are only going to read every second frame set it to 2, or if you are reading the frames backwards set it to -1. More complex frame access patterns like SelectEvery(10,3,6,7) are not supported (but might work anyway as long as the requested frames are in the cache; there will just be some wasted memory from non-requested frames in the cache).

threads int = 2
number of threads to run. Set this to the number of threads your computer is able to run concurrently.

max_fetch int = 30
This is the maximum number of frames ahead of the currently requested frame that MTsource will fetch. Setting it too low will leave the threads idle most of the time, and setting it too high will waste too much memory.
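
The interplay of delta and max_fetch can be sketched as a read-ahead cache. The Python class below is purely hypothetical (PrefetchSource, read_frame and frame_count are invented names) and assumes that max_fetch bounds the distance, in frames, between the requested frame and the furthest prefetched one. That is one reading of the description above, not something checked against MTsource itself.

```python
from concurrent.futures import ThreadPoolExecutor

class PrefetchSource:
    """Hypothetical wrapper that decodes frames ahead of the request."""

    def __init__(self, read_frame, frame_count, delta=1, threads=2, max_fetch=30):
        if delta == 0:
            raise ValueError("delta must not be 0")
        self.read_frame = read_frame
        self.frame_count = frame_count
        self.delta = delta
        self.max_fetch = max_fetch
        self.pool = ThreadPoolExecutor(max_workers=threads)
        self.cache = {}  # frame number -> Future

    def _schedule(self, n):
        """Start decoding frame n in the background unless already queued."""
        if 0 <= n < self.frame_count and n not in self.cache:
            self.cache[n] = self.pool.submit(self.read_frame, n)

    def get_frame(self, n):
        if not 0 <= n < self.frame_count:
            raise IndexError(n)
        self._schedule(n)
        # Queue the frames the caller is expected to ask for next.
        step = 1
        while abs(step * self.delta) <= self.max_fetch:
            self._schedule(n + step * self.delta)
            step += 1
        # Frames that are never requested stay in the cache (wasted memory).
        return self.cache.pop(n).result()

    def close(self):
        self.pool.shutdown()

src = PrefetchSource(lambda n: n * 10, frame_count=100, delta=2, max_fetch=6)
print(src.get_frame(0), sorted(src.cache))
src.close()
```

With delta=2 and max_fetch=6, requesting frame 0 queues frames 2, 4 and 6, so a reader that really does step by two finds its next frames already decoded or in progress.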

Examples:

MT("blur(1)",2,2)

It also works with a user-defined function (this one uses VariableBlur):

MT("unsharpen(2,0.7)",2,2)
function unsharpen(clip c,float variance,float k)
{
blr=binomialBlur(c,vary=variance,varc=2,Y=3,U=2,V=2)
return yv12lutxy(blr,c,"y x - "+string(k)+" * y +",y=3,u=2,v=2)
}

This one will not produce the intended result but shows how to use the triple quotes:

MT(""" subtitle("Doh") """,4,0)

License

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as published by
the Free Software Foundation.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

Please consider making a donation.
Version changes:

* 0.1 first release.
* 0.2 Should be more thread safe.
* 0.21 forgot to comment out a Sleep(0)
* 0.25 Added the splitvertical option
* 0.3 More stable(and slower)
* 0.4 Includes a custom version of avisynth 2.56 beta that should speed things up
* 0.41 Minor speed increase
* 0.5 Requires the included modified avisynth 2.5.6 or avisynth 2.6
* 0.6 Bugfix: height can be changed with splitvertical=true without crashing. Also includes modified avisynth 2.5.7
* 0.7 two new filters: MTi(), MTsource() and Avisynth MT 2.5.7.5


modified avisynth 2.5.7

It contains the two new functions SetMTMode() and GetMTMode() and is needed by MT.dll. Install it by overwriting avisynth.dll in your c:\windows\system32 (and remember to back up the old file first).
Technical info

These functions enable avisynth to use more than one thread when processing filters. This is useful if you have more than one cpu/core or hyperthreading. This feature is still experimental.

Syntax:

GetMTMode(bool threads)
threads - if true, GetMTMode returns the number of threads used; otherwise the current mode is returned (see below). Default value false.

SetMTmode(int "mode",int "threads")

Place this on the first line of the avs file to enable temporal multithreading (that is, more than one frame is processed at the same time). Use it later in the script to change the mode for the filters below it.
mode - there are 6 modes, 1 to 6. Default value 2.

* Mode 1 is the fastest but only works with a few filters
* Mode 2 should work with most filters but uses more memory
* Mode 3 should work with some of the filters that don't work with mode 2 but is slower
* Mode 4 is a combination of modes 2 and 3 and should work with even more filters, but is both slower and uses more memory
* Mode 5 is the slowest (slower than not using SetMTMode) but should work with all filters that don't require linear frameserving (that is, the frames arriving in order: frame 0, 1, 2 ... last)
* Mode 6 is a modified mode 5 that might be slightly faster (but still slower than not using SetMTMode)

threads - number of threads to use. Set to 0 to use the number of processors available. The number of threads can only be set in the first SetMTMode call. Default value 0.
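
Temporal multithreading, where several whole frames are in flight at once, can be pictured with this Python sketch. It is not AviSynth's scheduler; render is a made-up name, and the threads=0 convention is simply mirrored from the description above. Pure-Python filters would not actually run faster this way because of Python's global interpreter lock, so read it for the structure only.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def render(frames, filter_func, threads=0):
    """Process several whole frames at once and return them in frame order."""
    workers = threads or os.cpu_count() or 1   # 0 means one thread per processor
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(filter_func, frames))  # map preserves input order

def brighten(frame):
    """Per-frame filter that needs no other frame, so frame order is irrelevant."""
    return [[min(p + 16, 255) for p in row] for row in frame]

clip = [[[n, n + 1], [n + 2, n + 3]] for n in range(0, 250, 50)]
print(render(clip, brighten)[0])
```

A filter that keeps state from one frame to the next, or that expects frames strictly in order, breaks this model, which is why the slower modes exist.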

Example:

SetMTMode(2,0) #enables multithreading in mode 2 with as many threads as there are available processors
LoadPlugin("...\LoadPluginEX.dll") #needed to load avisynth 2.0 plugins
LoadPlugin("...\DustV5.dll") #Loads Pixiedust
import("limitedsharpen.avs")
src=AVIsource("test.avi")
SetMTMode(5) #change the mode to 5 for the lines below
src=src.converttoyuy2().PixieDust()#Pixiedust needs mode 5 to function.
SetMTMode(2) #change the mode back to 2
src.LimitedSharpen() #because LimitedSharpen works well with mode 2
subtitle("Number of threads used: "+string(GetMTMode(true))+" Current MT Mode: "+string(GetMTMode())) #display mode and number of threads in use

Last edited by tsp; 23rd November 2007 at 18:03.
Old 27th May 2005, 11:44   #2  |  Link
MacAddict
XviD User
 
Join Date: Oct 2004
Location: Ky
Posts: 190
Wow Too bad I no longer have the AMD dualie system to test this with. Perhaps by year end most of us will be able to afford a dual core though. Nice work once again tsp.
__________________
DFI NF4 SLI Expert | Opteron 165 CCBBE 0616 XPMW (9x325HTT=2.9Ghz) | 2x1GB G.Skill HZ (3-4-4-8-12-16-2T) | LG 62L DVD/CD | Geforce 7300GT | All SATA | Antec 650 Trio PSU | XP SP2
Old 28th May 2005, 00:50   #3  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
Thanks. The principle behind this filter is really simple, something like this:
Code:
function MT(clip c,int threads,string function)
{
clip r[threads]
for(int i=0;i<threads;i++)
{
c.crop(0,height/threads*i,0,height/threads) # crop(left,top,width,height): full-width strip number i
r[i]=last.eval(function)
}
stackvertical(r[0],r[1],...,r[threads-1])
}
The difficult part is making the code thread-safe so that two threads aren't writing to the same memory at the same time. So I'm working on making the IScriptEnvironment thread-safe.
Old 28th May 2005, 02:58   #4  |  Link
hartford
Registered User
 
Join Date: Nov 2003
Posts: 324
Does "function" refer to only a single filter?

For example, can function refer to an imported function using several filters, whether those filters are Avisynth filters or plugins?

I'd appreciate a more thorough explanation as to what "function" means.

Thanks.

[edit]

I ran this script on a 3 minute analog capture:

loadplugin("d:\plugins\DeComb521.dll")
loadplugin("d:\plugins\mt.dll")
loadplugin("d:\plugins\FFT3dGPU9b.dll")

Avisource("d:\test-mt.avi").ConvertToYV12

Telecide(order=1,Post=0,Guide=1)
Decimate(Cycle=5,Mode=0,Quality=3)

mt(fft3dgpu)

fft3dgpu(sigma=4.0,bw=32,bh=32,bt=3,plane=0,mode=2)


This gave an exception when opened in VirtualDub 1.5.10 and pointed to
the line

mt(fft3dgpu)





Last edited by hartford; 28th May 2005 at 03:17.
Old 28th May 2005, 08:54   #5  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
hartford: function works like the function argument of the built-in filter ScriptClip. So it can be an internal (built-in) or external (plugin) filter, or a function defined in an AVS file. Just remember the quotes (or triple quotes) because it's a string. So your example would look like this:
Code:
Avisource("d:\test-mt.avi").ConvertToYV12

Telecide(order=1,Post=0,Guide=1)
Decimate(Cycle=5,Mode=0,Quality=3)
mt("fft3dgpu(sigma=4.0,bw=32,bh=32,bt=3,plane=0,mode=2)")
But fft3dgpu doesn't work well with MT because only one thread at a time can access fft3dgpu (DirectX really hates multithreading).

you can also use it like this:
Code:
Telecide(order=1,Post=0,Guide=1)
Decimate(Cycle=5,Mode=0,Quality=3)
MT("fft3d()")

function fft3d(clip c)
{
c
fft3dfilter(plane=0)
fft3dfilter(plane=1)
fft3dfilter(plane=2)
}

Last edited by tsp; 28th May 2005 at 09:56.
Old 2nd June 2005, 02:34   #6  |  Link
hartford
Registered User
 
Join Date: Nov 2003
Posts: 324
Quote:
Originally Posted by tsp
mt("fft3dgpu(sigma=4.0,bw=32,bh=32,bt=3,plane=0,mode=2)")
OK, the "quote" stuff fixed it. (I used fft3dgpu only because you authored it and I thought that it would be a "torture" test)

I did a test with the above script using TBilateral and I got no errors:

mt("TBilateral(diameterL=3,diameterC=3,sDevL=2.0,sDevC=2.0,iDevL=6.0,iDevC=7.0,d2=true,gui=false)")


I'll try MT version 0.25 soon and will report.

Thanks.



[added edit]

More testing.

Difference in speed noticed between versions with
VirtualDub 1.5.10:

v 0.1 about 7:22 minutes
v 0.25 about 6:55 minutes

However, VirtualDub v.1.66 crashes when using MT version 0.25. It exits
with no error messages.

Last edited by hartford; 2nd June 2005 at 03:54.
Old 3rd June 2005, 00:02   #7  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
Hasn't it been over a year since Steady showed up last?

hartford: How much faster is MT compared to running without it, and did vdub 1.6.6 crash at startup or after a couple of frames? There is a new version here that might be a little faster and/or more stable.
Old 4th June 2005, 10:27   #8  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,277
nice filter !

"VirtualDub v.1.66 crashes when using MT version 0.25. It exits with no error messages."
=> if you use a filter with mt that requires mod16 make sure your input is mod32, so that if mt splits the frame it's still mod16. (for more than 2 threads one probably has to go to mod64 and so on)

At least for me that stopped VD from closing.
=> correction, this prevented normal VD from closing directly though it closes after some frames
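
The mod rule of thumb above can be written down as a quick check. This is a Python sketch that assumes MT cuts the split dimension into equal pieces; required_mod and fragments_ok are names made up here.

```python
def required_mod(filter_mod, pieces):
    """Alignment the split dimension needs so every fragment stays mod `filter_mod`."""
    return filter_mod * pieces

def fragments_ok(size, filter_mod, pieces):
    """True if `size` splits into `pieces` equal fragments that are all mod `filter_mod`."""
    return size % required_mod(filter_mod, pieces) == 0

print(required_mod(16, 2))        # mod32 for two threads
print(required_mod(16, 4))        # mod64 for four threads
print(fragments_ok(480, 16, 4))   # 480 / 4 = 120 rows, which is not mod16
```

Only the dimension that is actually being cut has to satisfy this.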

over here it seems like 0.1 is more stable than 0.25.
(seems to work with VD 1.66, though it crashes with VDMod1.5.10 after some frames)

0.1 + mt("undot()") works under VD1.6.6
0.25 + mt("undot()") didn't work under VD1.6.6 / VDM
0.1 + mt("mergechroma(blur(1.3))") didn't work under VD1.6.6 / VDM


Cu Selur

Last edited by Selur; 4th June 2005 at 11:15.
Old 4th June 2005, 18:28   #9  |  Link
hartford
Registered User
 
Join Date: Nov 2003
Posts: 324
@tsp

Let me fix an error in previous report and change times to seconds (results were swapped) :

No MT = 568 sec

v0.10 = 415 sec
v0.25 = 442 sec
v0.29 = 596 sec

Yes, v0.29 is slower than not using it
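
For anyone who wants the numbers above as speedup factors, a few lines of Python do the arithmetic (the timings are the ones reported in this post; speedup is just a helper name).

```python
def speedup(baseline_sec, test_sec):
    """How many times faster the test run is than the baseline (>1 is faster)."""
    return baseline_sec / test_sec

NO_MT = 568  # seconds without MT
timings = {"v0.10": 415, "v0.25": 442, "v0.29": 596}

for version, secs in timings.items():
    print(f"{version}: {speedup(NO_MT, secs):.2f}x")
```

That works out to roughly 1.37x for v0.10, 1.29x for v0.25 and 0.95x for v0.29.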

VDub 1.66 crashed when loading the script.


@Selur
Input is 640x480.
Old 28th May 2005, 20:25   #10  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
New version ready. Should be more thread-safe.

Also found a bug(?) in PixieDust. When using dynamic_cast, PixieDust throws a __non_rtti_object exception. And it always complains about "First-chance exception at 0x015327e2 in virtualdubmod.exe: 0xC000001D: Illegal Instruction" in the debug view.
Old 1st June 2005, 12:28   #11  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
Another update. Now the filter can change the height or width but not both.

By the way, is anyone using this filter, and does it in fact increase the speed? Also, does anyone have the source code for Dust and is willing to do some bug fixing? This filter really doesn't like Dust.
Old 1st June 2005, 13:02   #12  |  Link
Manao
Registered User
 
Join Date: Jan 2002
Location: France
Posts: 2,856
Dust has always been closed source. Ask Steady to modify his code, if he still maintains it.
Old 10th June 2005, 06:56   #13  |  Link
mg262
Clouded
 
Join Date: Jul 2003
Location: Cambridge, UK
Posts: 1,148
Quote:
Originally Posted by tsp
Also found a bug? in pixiedust. When using dynamic_cast Pixiedust throws __non_rtti_object exception. And it always complaines about "First-chance exception at 0x015327e2 in virtualdubmod.exe: 0xC000001D: Illegal Instruction" in the debugview.
I believe you can't call the dust filters more than once in a script, which your filter presumably implicitly does... one workaround is to load the DLL twice and use the DLLName_FilterName syntax.
Old 10th June 2005, 17:18   #14  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
Quote:
Originally Posted by mg262
I believe you can't call the dust filters more than once in a script, which your filter presumably implicitly does... one workaround is to load the DLL twice and use the DLLName_FilterName syntax.
Yes, that explains why it produces garbage, but it doesn't explain why it crashes when dynamic_cast is used as the DLL is loaded the first time, because at that point the constructor has not been called yet, so there is no way Dust can know it is being called many times. It's really a shame that such a good filter is programmed so "badly" (and that the source code is not available).

Quote:
Originally Posted by Leak
Ummm... I really wouldn't use MT with plugins that make important decisions by analyzing the whole image - if you're unlucky, you get one half of the image deinterlaced and the other weaved, or one half is matched forward and one matched backward, or one half is considered video and the other film etc...
You're right about that. If/when I succeed in making avisynth.dll thread-safe, I will try to add an option to process two or more frames in parallel, so that such filters can be used.
Old 14th June 2005, 00:01   #15  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
new version ready that includes a custom build of avisynth 2.56 that should speed this filter up.
Old 9th June 2005, 10:28   #16  |  Link
psme
Registered User
 
Join Date: Mar 2005
Posts: 68
Will it work with Decomb/TIVTC? I use Avisynth inside FFDshow for realtime playback processing.

Using DScaler 5.006 decoder playing DVD, using TIVTC's tdm in FFDShow for 3/2 pulldown recovery, on my P4 3G Northwood, CPU usage is around 60-90%.

If this filter can reduce the CPU loading by half on 2 CPUs setup then I'm thinking the new dual core system!

Thanks in advance.

regards,

Li On

Edit: sorry, just saw others are already running Decomb with MT but seems not much performance gain...

Last edited by psme; 9th June 2005 at 10:36.
Old 9th June 2005, 11:33   #17  |  Link
Leak
ffdshow/AviSynth wrangler
 
Join Date: Feb 2003
Location: Austria
Posts: 2,441
Quote:
Originally Posted by psme
Edit: sorry, just saw others are already running Decomb with MT but seems not much performance gain...
Ummm... I really wouldn't use MT with plugins that make important decisions by analyzing the whole image - if you're unlucky, you get one half of the image deinterlaced and the other weaved, or one half is matched forward and one matched backward, or one half is considered video and the other film etc...
Old 14th June 2005, 02:17   #18  |  Link
psme
Registered User
 
Join Date: Mar 2005
Posts: 68
Thanks for the great filter. Will try it tonight.

Will it speed up Didee's LimitedSharpen? I use it for realtime DVD playback using AviSynth in ffdshow. LimitedSharpen uses up most of my P4 3G Northwood, and I can only handle well-flagged 24fps 480p NTSC DVDs. 25fps 576p PAL DVDs stutter at 100% CPU. I'll get a dual-core CPU if the filter helps.

Thanks in advance.

regards,

Li On
Old 14th June 2005, 03:24   #19  |  Link
hartford
Registered User
 
Join Date: Nov 2003
Posts: 324
Sorry for the late update; had problems.

Test was done reading from one drive and writing to another.
System is a Gigabyte GA-7DPXDWP, modified xp2400+ processors,
Tekram U160 SCSI Adaptor, Maxtor 73GB and 147GB u360 drives.

Test avi is an analog capture at 640x480.


--
Script:

loadplugin("d:\plugins\DeComb521.dll")
#loadplugin("d:\plugins\MT_0.10.dll")
#loadplugin("d:\plugins\MT_0.29.1.dll")
loadplugin("d:\plugins\MT_0.30.dll")
loadplugin("d:\plugins\TBilateral-096.dll")

Avisource("d:\test-mt.avi").ConvertToYV12

Telecide(order=1,Post=0,Guide=1)
Decimate(Cycle=5,Mode=0,Quality=3)


#TBilateral(diameterL=3,diameterC=3,sDevL=2.0,sDevC=2.0,iDevL=6.0,iDevC=7.0,d2=true,gui=false)

mt("TBilateral(diameterL=3,diameterC=3,sDevL=2.0,sDevC=2.0,iDevL=6.0,iDevC=7.0,d2=true,gui=false)")

--



No MT = 522 seconds

MT 0.10 =
An out-of-bounds memory access (access violation) occurred in module 'TBilateral-096'.

MT 0.29.1 = 410 seconds

MT 0.30 = 534 seconds
Old 21st June 2005, 02:31   #20  |  Link
hartford
Registered User
 
Join Date: Nov 2003
Posts: 324
Script and test avi the same as before.
Huffy Compression; read one drive, write to another
Audio included
Tests run back-to-back.


No MT: 539 seconds CPU time: 54%

MT 0.10 431 seconds CPU time: 65% to 74%

MT 0.41: 540 seconds CPU time: 54%


MT 0.41-1 with special build avisynth.dll

541 seconds CPU time: 50% to 58%



MT v0.10 isn't stable, but worked this time. It will error on loading
or it will work for the test.

If I have time, I will test on a longer capture (28000-30000 frames, or about 20 min)

Slight differences probably due to read drive being a bit full (slower read on inner tracks).

[Edit:] Clarification: Differences from June 13 test.

Last edited by hartford; 21st June 2005 at 02:38.