Old 26th May 2005, 13:20   #1  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
MT 0.7 (+ custom Avisynth): a filter to run filters multithreaded. Useful for SMP or HT

I made this small filter now that dual-core processors are beginning to show up. I don't know how much faster filters will be on a Pentium 4 with HT, but on my old dual Celeron 400 MHz I got a 40% speed increase. It should also be useful for all the 8-way dual-core Opteron computers.
Also included is a custom build of Avisynth 2.5.7 that provides the two functions SetMTMode and GetMTMode and some changes to internal filters to support multithreading.

Please post if there are filters that don't work, or what speed increase you got.

Get version 0.7 here (contains avisynth 2.57 MT version 5[src])
avisynth 2.57 MT version 4 [src]
Get version 0.6 here
Get version 0.5 here
or version 0.41 here

You can also get further help with MT at the MediaWiki page here

From the readme:

MT is a filter that enables other filters to run multithreaded. This should hopefully speed up processing on hyperthreaded or multicore processors and on multiprocessor systems.
Technical info

Important: Always remember to judge the result by the speed improvement, not the CPU utilization.

MT is a filter that splits a frame into smaller fragments that are processed in individual threads, allowing full utilization of multiprocessor or hyperthreading-enabled computers. I tested it on my old Abit BP6 with 2x Celeron 400 MHz and it increased the speed by 40%. Note that if you are already getting 100% CPU utilization when processing avs scripts (i.e. if you're encoding to DivX/XviD), you don't need this filter.

The filter works like this avs function:

function PseudoMT(clip c,string filter)
{
a=eval("c.crop(0,0,c.width/2,c.height)."+filter)
b=eval("c.crop(c.width/2,0,c.width/2,c.height)."+filter)
stackhorizontal(a,b)
}


The only difference is that a and b are executed in parallel and that the frame can be split into more than 2 pieces. If a filter works with the above script, it should work with MT, provided the filter code is thread-safe. Dust does not work with the above script, so if you want to use iip, use another denoiser or get Steady to fix the bug.
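
To make the split-and-stack idea concrete outside Avisynth, here is a minimal Python sketch. It is only an illustration of the data flow, not MT's actual implementation: the names pseudo_mt, brighten and normalize are made up for this example, and a frame is modelled as a plain list of pixel rows. It also shows why a filter that depends on the whole frame gives a different result once the frame is split.

```python
from concurrent.futures import ThreadPoolExecutor

def pseudo_mt(frame, filter_fn, pieces=2):
    """Split `frame` into `pieces` column blocks, filter the blocks in
    parallel, then stack the results side by side."""
    width = len(frame[0])
    step = width // pieces
    # Column ranges; the last block absorbs any remainder.
    bounds = [(i * step, width if i == pieces - 1 else (i + 1) * step)
              for i in range(pieces)]
    blocks = [[row[lo:hi] for row in frame] for lo, hi in bounds]
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        filtered = list(pool.map(filter_fn, blocks))
    # Equivalent of StackHorizontal: join matching rows of every block.
    return [sum((blk[y] for blk in filtered), []) for y in range(len(frame))]

def brighten(block):
    # Per-pixel filter: the result never depends on other pixels.
    return [[min(255, p + 10) for p in row] for row in block]

def normalize(block):
    # Whole-frame filter: scales by the maximum of whatever it is given.
    peak = max(max(row) for row in block)
    return [[p * 255 // peak for p in row] for row in block]

frame = [[10, 20, 100, 200],
         [30, 40, 150, 250]]

print(pseudo_mt(frame, brighten) == brighten(frame))    # per-pixel filter survives the split
print(pseudo_mt(frame, normalize) == normalize(frame))  # whole-frame filter does not
```

The per-pixel filter gives the same output split or unsplit, while the normalizing filter sees a different maximum in each half, which is the kind of filter the Limitations section warns about.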
Limitations

The filter to be run must accept only one input clip, and that clip is last. The filter should also not rely on the content of the whole frame (as smart deinterlacers do); otherwise there is a risk that only part of the frame will be processed. The filter should also be thread-safe. Most filters are, but some will produce a wrong result or crash.
Installation

Copy mt.dll into the Avisynth plugin directory and copy the included avisynth.dll into your windows\system32 directory, or wherever avisynth.dll is located. Remember to back up the old avisynth.dll first (rename it or something) if you don't have version 2.6 installed.

From version 0.7, two other filters are included too:

MTi() creates two threads and lets each thread process one field before combining them, like this avs function:
function PseudoMTi(clip c,string filter)
{
a=eval("c.AssumeFieldBased().SeparateFields.selecteven()."+filter)
b=eval("c.AssumeFieldBased().SeparateFields.selectodd()."+filter)
interleave(a,b).weave()
}
As with the other pseudo-script, a and b are executed in parallel. Note that only two threads are created, so it will use only two (virtual) cores.
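
The field split can be sketched the same way in Python. This is an assumption-laden illustration rather than MTi's real code: pseudo_mti and hflip are hypothetical names, a frame is a list of rows, and the frame is assumed to have an even number of rows.

```python
from concurrent.futures import ThreadPoolExecutor

def pseudo_mti(frame, filter_fn):
    """Filter the even and odd rows (the two fields) of `frame` in two
    threads, then weave the filtered fields back together."""
    even = frame[0::2]   # rows 0, 2, 4, ...
    odd = frame[1::2]    # rows 1, 3, 5, ...
    with ThreadPoolExecutor(max_workers=2) as pool:
        a, b = pool.map(filter_fn, (even, odd))
    woven = []
    for top, bottom in zip(a, b):   # equivalent of Interleave + Weave
        woven.append(top)
        woven.append(bottom)
    return woven

def hflip(field):
    # Mirrors every row; works on a single field as well as on a frame.
    return [row[::-1] for row in field]

frame = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(pseudo_mti(frame, hflip))
```

Because each field is a complete (half-height) picture, a filter here may change both width and height, as long as it does the same to both fields.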

MTsource() is used to run source filters multithreaded. It works like this:
function PseudoMTsource(string filter)
{
SetMTmode(2)
eval(filter)
SetMtmode(0)
}
So, unlike the two other filters, it is a temporal filter that fetches frames ahead of time and stores them in the cache for fast retrieval.

Syntax
MT(clip clip,string filter,int threads,int overlap,bool splitvertical)

All parameters are named. Function parameters:

clip clip = last
input clip

filter string = No default
filter to run multithreaded. Note that the filter must not change both the frame height and width (but changing the colorspace is okay) and only 1 input clip is allowed. It can be any built-in filter, avs-defined filter or external plugin filter as long as the restrictions are observed.

threads int = 2
number of threads to run. Set this to the number of threads your computer is able to run concurrently.

overlap int = 0
- number of pixels to add at the top and bottom borders or at the left and right borders. Increase this if you see artifacts where the frame is split.

splitvertical bool = false
- if true, the frame is cut vertically (and the filter is allowed to change the height); otherwise it is cut horizontally (and the filter is allowed to change the width).
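
As a rough picture of what threads and overlap do together, the following Python sketch computes which rows each thread could read and which rows of its output would be kept. How MT really lays out the overlap is not stated in the readme, so strip_ranges and its return format are assumptions made for illustration.

```python
def strip_ranges(height, threads, overlap):
    """Return one (read_start, read_end, keep_start, keep_end) tuple per
    strip. A strip reads `overlap` extra rows on each inner edge, but only
    the rows in [keep_start, keep_end) end up in the output."""
    base = height // threads
    ranges = []
    for i in range(threads):
        keep_start = i * base
        # The last strip absorbs any remainder rows.
        keep_end = height if i == threads - 1 else keep_start + base
        read_start = max(0, keep_start - overlap)
        read_end = min(height, keep_end + overlap)
        ranges.append((read_start, read_end, keep_start, keep_end))
    return ranges

for strip in strip_ranges(480, 2, 4):
    print(strip)
```

With a spatial filter of radius r, an overlap of at least r rows gives every kept row the same neighbours it would have had in the unsplit frame, which is why raising overlap removes seams.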

MTi
MTi(clip clip,string filter)

All parameters are named. Function parameters:

clip clip = last
input clip. Must be mod2 height for RGB and YUY2 colorspace and mod4 height for YV12 colorspace

filter string = No default
filter to run multithreaded. Note that the filter is allowed to change both width and height at the same time, but only 1 input clip is allowed. It can be any built-in filter, avs-defined filter or external plugin filter as long as the restrictions are observed.


MTsource
MTSource(string filter,int delta,int threads,int max_fetch)

All parameters are named. Function parameters:

filter string = No default
source filter to run multithreaded. Currently only internal and external source filters are supported (like DirectShowSource, Avisource, MPEG2Source). You can use an avs-defined filter or a non-source filter, but it might crash or produce frame corruption.

delta int = 1
this is how many frames there are between each frame request, so if you are only going to read every second frame set it to 2, or if you are reading the frames backwards set it to -1. More complex frame access patterns like SelectEvery(10,3,6,7) are not supported (but might work anyway as the requested frames are in the cache; there will just be some wasted memory from non-requested frames in the cache).

threads int = 2
number of threads to run. Set this to the number of threads your computer is able to run concurrently.

max_fetch int = 30
This is the maximum number of frames ahead of the currently requested frame that MTsource will fetch. Setting it too low will leave the threads idle most of the time, and setting it too high will waste too much memory.
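
One way to read delta and max_fetch together is sketched below in Python. This is a guess at the access pattern based only on the parameter descriptions above, not on MTsource's source code; frames_to_prefetch and frame_count are hypothetical names.

```python
def frames_to_prefetch(current, delta, max_fetch, frame_count):
    """Frame numbers a read-ahead cache might request after `current`,
    stepping by `delta` and staying within `max_fetch` frames of it."""
    if delta == 0:
        return []
    wanted = []
    n = current + delta
    while 0 <= n < frame_count and abs(n - current) <= max_fetch:
        wanted.append(n)
        n += delta
    return wanted

print(frames_to_prefetch(10, 1, 3, 100))    # forward, every frame
print(frames_to_prefetch(10, 2, 6, 100))    # forward, every second frame
print(frames_to_prefetch(2, -1, 30, 100))   # backwards, stops at frame 0
```

An irregular pattern such as SelectEvery(10,3,6,7) does not fit a single delta, so under this reading some prefetched frames would sit in the cache without ever being requested.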

Examples:

MT("blur(1)",2,2)

A user-defined function works too (this one uses VariableBlur):

MT("unsharpen(2,0.7)",2,2)
function unsharpen(clip c,float variance,float k)
{
blr=binomialBlur(c,vary=variance,varc=2,Y=3,U=2,V=2)
return yv12lutxy(blr,c,"y x - "+string(k)+" * y +",y=3,u=2,v=2)
}

This one will not produce the intended result but shows how to use the triple quotes:

MT(""" subtitle("Doh") """,4,0)

License

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as published by
the Free Software Foundation.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

Please consider making a donation.
Version changes:

* 0.1 first release.
* 0.2 Should be more thread safe.
* 0.21 forgot to comment out a Sleep(0)
* 0.25 Added the splitvertical option
* 0.3 More stable(and slower)
* 0.4 Includes a custom version of avisynth 2.56 beta that should speed things up
* 0.41 Minor speed increase
* 0.5 Requires the included modified avisynth 2.5.6 or avisynth 2.6
* 0.6 Bugfix: height can be changed with splitvertical=true without crashing. Also includes modified avisynth 2.5.7
* 0.7 two new filters: MTi(), MTsource() and Avisynth MT 2.5.7.5


modified avisynth 2.5.7

It contains the two new functions SetMTMode() and GetMTMode() and is needed by MT.dll. Install it by overwriting avisynth.dll in your c:\windows\system32 (and remember to take a backup of the old file first)
Technical info

These functions enable avisynth to use more than one thread when processing filters. This is useful if you have more than one cpu/core or hyperthreading. This feature is still experimental.

Syntax:

GetMTMode(bool threads)
threads - if true, GetMTMode returns the number of threads used; otherwise the current mode is returned (see below). Default value false.

SetMTmode(int "mode",int "threads")

Place this at the first line in the avs file to enable temporal (that is more than one frame is processed at the same time) multithreading. Use it later in the script to change the mode for the filters below it.
mode - there are 6 modes, 1 to 6. Default value 2.

* Mode 1 is the fastest but only works with a few filters
* Mode 2 should work with most filters but uses more memory
* Mode 3 should work with some of the filters that don't work with mode 2 but is slower
* Mode 4 is a combination of modes 2 and 3 and should work with even more filters but is both slower and uses more memory
* Mode 5 is the slowest (slower than not using SetMTMode) but should work with all filters that don't require linear frameserving (that is, the frames coming in order: frame 0, 1, 2 ... last)
* Mode 6 is a modified mode 5 that might be slightly faster (but still slower than not using SetMTMode)

threads - number of threads to use. Set to 0 to set it to the number of processors available. It is not possible to change the number of threads other than in the first SetMTMode. Default value 0.

Example:

SetMTMode(2,0) #enables multithreading in mode 2, with the number of threads equal to the number of available processors
LoadPlugin("...\LoadPluginEX.dll") #needed to load avisynth 2.0 plugins
LoadPlugin("...\DustV5.dll") #Loads Pixiedust
import("limitedsharpen.avs")
src=AVIsource("test.avi")
SetMTMode(5) #change the mode to 5 for the lines below
src=src.converttoyuy2().PixieDust()#Pixiedust needs mode 5 to function.
SetMTMode(2) #change the mode back to 2
src.LimitedSharpen() #because LimitedSharpen works well with mode 2
subtitle("Number of threads used: "+string(GetMTMode(true))+" Current MT Mode: "+string(GetMTMode())) #display mode and number of threads in use

Last edited by tsp; 23rd November 2007 at 18:03.
Old 27th May 2005, 11:44   #2  |  Link
MacAddict
XviD User
Join Date: Oct 2004
Location: Ky
Posts: 190
Wow. Too bad I no longer have the AMD dualie system to test this with. Perhaps by year end most of us will be able to afford a dual core, though. Nice work once again, tsp.
Old 28th May 2005, 00:50   #3  |  Link
tsp
Thanks. The principle behind this filter is really simple, something like this:
Code:
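# pseudocode, not a runnable avs script
function MT(clip c,int threads,string function)
{
clip r[threads]
for(int i=0;i<threads;i++)
{
# crop(left,top,width,height): take the i-th horizontal strip at full width
c.crop(0,c.height/threads*i,0,c.height/threads)
r[i]=last.eval(function)
}
stackvertical(r[0],r[1],...,r[threads-1])
}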
The difficult part is making the code thread-safe, so that two threads aren't writing to the same memory at the same time. So I'm working on making the IScriptEnvironment thread-safe.
Old 28th May 2005, 02:58   #4  |  Link
hartford
Registered User
Join Date: Nov 2003
Posts: 324
Does "function" refer to only a single filter?

For example, can function refer to an imported function using several filters, whether those filters are Avisynth filters or plugins?

I'd appreciate a more thorough explanation as to what "function" means.

Thanks.

[edit]

I ran this script on a 3 minute analog capture:

loadplugin("d:\plugins\DeComb521.dll")
loadplugin("d:\plugins\mt.dll")
loadplugin("d:\plugins\FFT3dGPU9b.dll")

Avisource("d:\test-mt.avi").ConvertToYV12

Telecide(order=1,Post=0,Guide=1)
Decimate(Cycle=5,Mode=0,Quality=3)

mt(fft3dgpu)

fft3dgpu(sigma=4.0,bw=32,bh=32,bt=3,plane=0,mode=2)


This gave an exception when opened in VirtualDub 1.5.10 and pointed to
the line

mt(fft3dgpu)





Last edited by hartford; 28th May 2005 at 03:17.
Old 28th May 2005, 08:54   #5  |  Link
tsp
hartford: function works like the function argument of the built-in filter ScriptClip. So it can be an internal (built-in) or external (plugin) filter, or a function defined in an AVS file. Just remember the quotes (or triple quotes), because it's a string. So your example would look like this:
Code:
Avisource("d:\test-mt.avi").ConvertToYV12

Telecide(order=1,Post=0,Guide=1)
Decimate(Cycle=5,Mode=0,Quality=3)
mt("fft3dgpu(sigma=4.0,bw=32,bh=32,bt=3,plane=0,mode=2)")
But fft3dgpu doesn't work well with MT, because only one thread at a time can access fft3dgpu (DirectX really hates multithreading).

You can also use it like this:
Code:
Telecide(order=1,Post=0,Guide=1)
Decimate(Cycle=5,Mode=0,Quality=3)
MT("fft3d()")

function fft3d(clip c)
{
c
fft3dfilter(plane=0)
fft3dfilter(plane=1)
fft3dfilter(plane=2)
}

Last edited by tsp; 28th May 2005 at 09:56.
Old 28th May 2005, 20:25   #6  |  Link
tsp
New version ready. Should be more thread-safe.

Also found a bug(?) in PixieDust: when using dynamic_cast, PixieDust throws a __non_rtti_object exception. And it always complains about "First-chance exception at 0x015327e2 in virtualdubmod.exe: 0xC000001D: Illegal Instruction" in the debug view.
Old 1st June 2005, 12:28   #7  |  Link
tsp
Another update. Now the filter can change the height or the width, but not both.

By the way, is anyone using this filter, and does it in fact increase the speed? Also, does anyone have the source code for Dust and is willing to do some bug fixing? This filter really doesn't like Dust.
Old 1st June 2005, 13:02   #8  |  Link
Manao
Registered User
Join Date: Jan 2002
Location: France
Posts: 2,856
Dust has always been closed source. Ask Steady to modify his code, if he still maintains it.
Old 2nd June 2005, 02:34   #9  |  Link
hartford
Quote:
Originally Posted by tsp
mt("fft3dgpu(sigma=4.0,bw=32,bh=32,bt=3,plane=0,mode=2)")
OK, the "quote" stuff fixed it. (I used fft3dgpu only because you authored it and I thought that it would be a "torture" test)

I did a test with the above script using TBilateral and I got no errors:

mt("TBilateral(diameterL=3,diameterC=3,sDevL=2.0,sDevC=2.0,iDevL=6.0,iDevC=7.0,d2=true,gui=false)")


I'll try MT version 0.25 soon and will report.

Thanks.



[added edit]

More testing.

Difference in speed noticed between versions with
VirtualDub 1.5.10:

v0.1: about 7:22 minutes
v0.25: about 6:55 minutes

However, VirtualDub v.1.66 crashes when using MT version 0.25. It exits
with no error messages.

Last edited by hartford; 2nd June 2005 at 03:54.
Old 3rd June 2005, 00:02   #10  |  Link
tsp
Hasn't it been over a year since Steady showed up last?

hartford: How much faster is MT compared to running without it, and did VDub 1.6.6 crash at startup or after a couple of frames? There is a new version here that might be a little faster and/or more stable.
Old 4th June 2005, 10:27   #11  |  Link
Selur
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 7,255
nice filter !

"VirtualDub v.1.66 crashes when using MT version 0.25. It exits with no error messages."
=> if you use a filter with mt that requires mod16 make sure your input is mod32, so that if mt splits the frame it's still mod16. (for more than 2 threads one probably has to go to mod64 and so on)

At least for me that stopped VD from closing.
=> correction, this prevented normal VD from closing directly though it closes after some frames
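
The mod arithmetic behind this advice can be checked with a few lines of Python. The helper names are made up for this sketch, and it assumes the frame is split into equal pieces with no overlap.

```python
def required_input_mod(filter_mod, threads):
    """An input size that is a multiple of filter_mod * threads always
    splits into `threads` equal pieces that are multiples of filter_mod."""
    return filter_mod * threads

def pieces_are_aligned(total, threads, filter_mod):
    """True if `total` splits into `threads` equal pieces whose size is a
    multiple of `filter_mod`."""
    return total % threads == 0 and (total // threads) % filter_mod == 0

print(required_input_mod(16, 2))         # 2 threads, mod16 filter
print(required_input_mod(16, 4))         # 4 threads, mod16 filter
print(pieces_are_aligned(480, 2, 16))    # 480 rows cut in two
print(pieces_are_aligned(480, 4, 16))    # 480 rows cut in four
```

For 2 threads the required input is mod32 and for 4 threads mod64, matching the rule of thumb above. A 480-row frame cut in two gives 240-row pieces, which are still mod16, while cutting it in four gives 120-row pieces, which are not.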

over here it seems like 0.1 is more stable than 0.25.
(seems to work with VD 1.66, though it crashes with VDMod1.5.10 after some frames)

0.1 + mt("undot()") works under VD1.6.6
0.25 + mt("undot()") didn't work under VD1.6.6 / VDM
0.1 + mt("mergechroma(blur(1.3))") didn't work under VD1.6.6 / VDM


Cu Selur

Last edited by Selur; 4th June 2005 at 11:15.
Old 4th June 2005, 18:28   #12  |  Link
hartford
@tsp

Let me fix an error in previous report and change times to seconds (results were swapped) :

No MT = 568 sec

v0.10 = 415 sec
v0.25 = 442 sec
v0.29 = 596 sec

Yes, v0.29 is slower than not using it
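
For reference, the timings above can be turned into relative speed changes with a short Python calculation. The numbers are the ones reported in this post; the function name is just for this sketch.

```python
baseline = 568                                   # seconds without MT
timings = {"v0.10": 415, "v0.25": 442, "v0.29": 596}

def speed_change_percent(baseline_s, measured_s):
    """Positive: faster than the baseline. Negative: slower."""
    return (baseline_s / measured_s - 1.0) * 100.0

for version, seconds in timings.items():
    print(f"{version}: {speed_change_percent(baseline, seconds):+.1f}%")
```

That works out to roughly 37% faster for v0.10, about 28.5% faster for v0.25, and close to 5% slower for v0.29.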

VDub 1.66 crashed when loading the script.


@Selur
Input is 640x480.
Old 5th June 2005, 21:29   #13  |  Link
tsp
The timing makes more sense now. I figured out what was causing the filter to crash after I got the remote debugger working (clever little thing). It was crop that caused it. It just returned a pointer to the video data, so each thread was in fact working in the same memory, and Avisynth doesn't like that. So expect a new version shortly.
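
The difference between a crop that only returns a pointer and a crop that copies can be shown with a small Python analogy. This is not Avisynth code: a bytearray stands in for the frame buffer, memoryview slices play the role of pointer-style crops, and the two pieces overlap by design so that the shared bytes are visible.

```python
frame = bytearray(8)            # stand-in for one frame buffer
view = memoryview(frame)

# Pointer-style crop: both pieces alias `frame`; bytes 3 and 4 are shared.
piece_a = view[0:5]
piece_b = view[3:8]
piece_a[3] = 255                # "thread A" writes into its piece
print(piece_b[0])               # "thread B" sees the write in its piece

# Copying crop: each piece owns its memory.
copy_a = bytearray(frame[0:5])
copy_b = bytearray(frame[3:8])
copy_a[4] = 7                   # stays private to copy_a
print(copy_b[1], frame[4])      # neither the other piece nor the frame changes
```

With aliasing pieces, two threads can end up writing to the same bytes without knowing it; copying costs memory and time but keeps each thread's writes to itself.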
Old 6th June 2005, 01:14   #14  |  Link
tsp
New version ready. It's not totally stable but should run longer. Get version 0.29.1 here
Old 7th June 2005, 02:40   #15  |  Link
hartford
@tsp

Take this result with a "grain of salt."

My video processing disk is nearly full since I began testing.
I would estimate that this test might score 10% better if done
7 days ago. I'm sorry, but I have a number of backlogged videos
that are due for processing and I'm pressed for time. I wish
that I could give you a more consistent result.

v0.29.1 = 523 seconds
Old 7th June 2005, 12:28   #16  |  Link
tsp
hartford: An approximate time is fine; it gives me an idea of how much faster/slower my experimental releases are (I didn't expect 0.29.0 to be so much slower, because it was faster on my dual 400 MHz Celeron).

I discovered that the internal cache that is inserted after each filter isn't thread-safe. Once I figure out how to insert my own cache in place of the built-in one when using more complex filters such as iip, I will release a new version that should be stable. (I have a version running without the cache that executes temporalsoften stably. Usually VirtualDub crashes after a few frames using version 0.29.1.)
Old 7th June 2005, 21:34   #17  |  Link
tsp
ok version 0.30 is up. It should be more stable(I could render 13000 frames without a crash).
Old 8th June 2005, 02:59   #18  |  Link
hartford
I downloaded v0.30.

I should have time to try it this Friday or Saturday.

I'll do the test by reading the Test.avi from one drive and putting the result
to another drive that has room and to which I seldom put much data. I'll test
versions 0.10, 0.29.1, and 0.30. Writing the output to another drive should make the results more accurate.
Old 9th June 2005, 01:49   #19  |  Link
tsp
hartford: Thanks. I have now had version 0.30 process about 100,000 frames without crashing, compared to about 1-20 frames with versions 0.10-0.29.1. The downside is that the speed has decreased a lot; I fear it might be slower than not using MT. I will try to incorporate the necessary changes directly in Avisynth. That should speed things up.
Old 9th June 2005, 10:28   #20  |  Link
psme
Registered User
Join Date: Mar 2005
Posts: 68
Will it work with Decomb/TIVTC? I use Avisynth inside FFDshow for realtime playback processing.

Using DScaler 5.006 decoder playing DVD, using TIVTC's tdm in FFDShow for 3/2 pulldown recovery, on my P4 3G Northwood, CPU usage is around 60-90%.

If this filter can cut the CPU load in half on a 2-CPU setup, then I'm considering a new dual-core system!

Thanks in advance.

regards,

Li On

Edit: sorry, just saw others are already running Decomb with MT but seems not much performance gain...

Last edited by psme; 9th June 2005 at 10:36.