Old 1st June 2010, 21:49   #1  |  Link
Chikuzen
typo lover
 
Join Date: May 2009
Posts: 595
ThreadRequest : yet another plugin for multithread processing

ThreadRequest is an AviSynth plugin for multithread processing written by LANTIS.
It is another implementation of the PipeLine approach, taking its idea from QuaddiMM, and it was quietly released to the public.

I found this plugin on the web this March, but no license was stated there and I couldn't contact the author (no e-mail address is published on his page).
I called out to LANTIS on an AviSynth forum in Japan and waited for his reaction.
He noticed my call around the middle of last month and decided to license it under the GPL.
Now that this issue is cleared up, I decided to introduce it here.

Author's page : http://lantis.homeunix.org/avisynth.shtml
DownLoad : http://lantis.homeunix.org/archive/T...equest102a.zip

Quote:
Syntax:

ThreadRequest(clip, int "cache", int "queue", int "warmingup", string "name")

Syntax of ThreadRequest() is almost the same as PipeLine().
 cache : Number of cached frames. Note that this plugin accesses frames sequentially as far as possible when a request jumps within the range of the cache. (default 3)
 queue : Size of the queue used to deliver frames from the thread. About 3 should be enough unless the filter's processing speed varies extremely. (default 3)
 warmingup : Number of times GetFrame is called from the main thread when processing begins. This avoids the problem of the thread starting to run before construction of the filter has completed. (default 5)
 name : Identifier used to report the thread's execution time to a text file when the filter is unloaded. If this is empty, nothing is reported. (default "")

Synchronize(int "cache", string "name")

Synchronizes the threads.
 cache : Number of cached frames. (default 3)
 name : Identifier used to build a single critical section shared by two or more Synchronize calls. If this is empty, the critical section is private to that Synchronize. (default "")
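
The `name` rule for Synchronize can be pictured with ordinary locks: calls that share a name share one lock, while an empty name gets a private one. The Python sketch below assumes exactly that reading of the description; `get_section` and `synchronized_fetch` are hypothetical helpers made up for illustration, and the plugin's real internals may differ.

```python
import threading

_registry_guard = threading.Lock()
_named_sections = {}  # name -> lock shared by every user of that name


def get_section(name=""):
    """Return the lock for `name` (hypothetical helper).

    An empty name yields a fresh private lock, mirroring the rule that the
    critical section is then private to that one Synchronize call.
    """
    if name == "":
        return threading.Lock()
    with _registry_guard:
        return _named_sections.setdefault(name, threading.Lock())


def synchronized_fetch(section, fetch, n):
    """Call fetch(n) while holding `section`, so only one thread at a time
    runs the (possibly non-thread-safe) upstream code."""
    with section:
        return fetch(n)


if __name__ == "__main__":
    a = get_section("source")
    b = get_section("source")
    c = get_section()
    print(a is b, a is c)
    print(synchronized_fetch(a, lambda n: n + 1, 41))
```

Two calls with the same name end up guarding the same section, which is what would let several Synchronize calls protect one shared upstream filter.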

Usage:
Code:
#example_1
Filter1()
ThreadRequest()
Filter2()
ThreadRequest()
Filter3()

#example_2
Synchronize(20)
Interleave(\
    Filter().SelectEvery(4, 0).ThreadRequest(),\
    Filter().SelectEvery(4, 1).ThreadRequest(),\
    Filter().SelectEvery(4, 2).ThreadRequest(),\
    Filter().SelectEvery(4, 3).ThreadRequest()\
)
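
As a rough mental model of what each ThreadRequest() call in the examples above does, the Python sketch below runs an upstream filter on a worker thread and hands frames over through a bounded queue, which is the role the `queue` parameter appears to play. This is an assumption-laden illustration: `PrefetchStage`, `slow_filter` and the sentinel-based shutdown are made up for the sketch and are not taken from the plugin's source.

```python
import queue
import threading


class PrefetchStage:
    """Toy pipeline stage: a worker thread requests frames 0..n-1 from
    `upstream` in order and delivers them through a bounded queue."""

    _DONE = object()  # sentinel marking the end of the clip

    def __init__(self, upstream, num_frames, queue_size=3):
        self._upstream = upstream  # callable: frame number -> frame
        self._num_frames = num_frames
        self._queue = queue.Queue(maxsize=queue_size)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        for n in range(self._num_frames):
            # put() blocks while the queue is full, so at most `queue_size`
            # finished frames are buffered ahead of the consumer.
            self._queue.put(self._upstream(n))
        self._queue.put(self._DONE)

    def frames(self):
        """Yield frames in order as the worker produces them."""
        while True:
            item = self._queue.get()
            if item is self._DONE:
                return
            yield item


def slow_filter(n):
    # Stand-in for an expensive upstream filter.
    return n * n


if __name__ == "__main__":
    stage = PrefetchStage(slow_filter, num_frames=5, queue_size=3)
    print(list(stage.frames()))
```

Chaining several such stages gives one thread per stage, so the stages can overlap in time as long as the downstream consumer keeps draining the queue.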
Message from author:
Quote:
I built this program in a hurry, on a sudden impulse, so various checks are lax and passing odd parameters can easily crash it (lol).
I hope this program serves as a hint for even better plugins to appear.
NOTE: I'm not a programmer, so I (Chikuzen) won't be able to answer any questions about this filter.

Last edited by Chikuzen; 3rd June 2010 at 05:30.
Old 2nd June 2010, 00:20   #2  |  Link
a451guy451
Xbox Live: o 4lif o
 
Join Date: Jun 2009
Location: Monrovia, CA
Posts: 64
I'm fiddling with this right now, but not really getting a speed increase at all (I'm on an 8-core machine). Can you post some scripts you use this with on a regular basis, and perhaps provide some insight into what kind of speed returns you get out of it? I've been using SEt's 2.6 MT build for quite some time with very few major issues, but I'm always looking for ways to speed things up further. Thanks for the post though, I'll post again if I get some significant test results.
Old 2nd June 2010, 08:45   #3  |  Link
Chikuzen
...ok.

#environment
   CPU : intel Core2Quad Q9450
   Mem: 8GB (DDR2-800 2GBx4)
   GPU : Radeon HD5870
   OS : Windows7 Ultimate (64bit)
AviSynth: 2.5.8MT by SEt (32bit)

#Source
Anime OP (MPEG2 TS 1440x1080ix2700frames)

#script
SetMemoryMax(1536)
MPEG2Source("source.d2v")
#ThreadRequest(30)
TFM()
#ThreadRequest(30)
TDecimate() #2700frames -> 2160frames
#ThreadRequest(30)
FFT3DGPU(sigma=3.0)
#ThreadRequest(30)
LanczosResize(1280,720)
Return last

#Benchmark
  avs2avi.exe script.avs -c null -o n

#Result
       without ThreadRequest()
* Pass 1/1: Finished in 00:01:52.698 (19.17 FPS)
* Pass 1/1: Finished in 00:01:52.703 (19.17 FPS)
* Pass 1/1: Finished in 00:01:52.964 (19.12 FPS)
  average : Finished in 00:01:52.788 (19.15 FPS)

       with ThreadRequest()
 * Pass 1/1: Finished in 00:00:57.009 (37.89 FPS)
 * Pass 1/1: Finished in 00:00:56.210 (38.43 FPS)
 * Pass 1/1: Finished in 00:00:55.946 (38.61 FPS)
  average : Finished in 00:00:56.388 (38.31 FPS)

It seems that the speed has roughly doubled (to about 200%)...
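
The averages and the "about 200%" figure can be re-derived from the six avs2avi result lines above. The Python sketch below parses the elapsed times, averages them and divides; the helper names `seconds` and `mean_seconds` are made up for this check, and the frame count of 2160 is taken from the comment next to TDecimate() in the script.

```python
import re

LINE_RE = re.compile(r"Finished in (\d+):(\d+):(\d+\.\d+)")


def seconds(line):
    """Extract the elapsed time in seconds from an avs2avi result line."""
    h, m, s = LINE_RE.search(line).groups()
    return int(h) * 3600 + int(m) * 60 + float(s)


def mean_seconds(lines):
    """Average elapsed time over several runs."""
    times = [seconds(line) for line in lines]
    return sum(times) / len(times)


WITHOUT = [
    "* Pass 1/1: Finished in 00:01:52.698 (19.17 FPS)",
    "* Pass 1/1: Finished in 00:01:52.703 (19.17 FPS)",
    "* Pass 1/1: Finished in 00:01:52.964 (19.12 FPS)",
]
WITH = [
    "* Pass 1/1: Finished in 00:00:57.009 (37.89 FPS)",
    "* Pass 1/1: Finished in 00:00:56.210 (38.43 FPS)",
    "* Pass 1/1: Finished in 00:00:55.946 (38.61 FPS)",
]
FRAMES = 2160  # output frames after TDecimate, per the script comment

if __name__ == "__main__":
    base = mean_seconds(WITHOUT)
    threaded = mean_seconds(WITH)
    print(f"without: {base:.3f} s, {FRAMES / base:.2f} fps")
    print(f"with:    {threaded:.3f} s, {FRAMES / threaded:.2f} fps")
    print(f"speedup: {base / threaded:.2f}x")
```

The computed means agree with the posted averages of roughly 112.788 s and 56.388 s, which corresponds to a speedup of almost exactly 2x.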

Last edited by Chikuzen; 2nd June 2010 at 19:06.
Old 2nd June 2010, 16:13   #4  |  Link
a451guy451
Wow. Awesome thank you for the examples.
Old 2nd June 2010, 18:06   #5  |  Link
a451guy451
...dokey. I just wasn't using a complicated enough script apparently.

#environment
   CPU : intel X5560
   Mem: 3GB (DDR3-10700 1GBx3)
   OS : WindowsXP
AviSynth: 2.6 SET (32bit)

#Source
Elephant's Dream (Cineform 1920x1080px2879frames)

#script
avisource("elephantsdream1080.avi")
threadrequest(10,5,2).Splin36resize(2048,1152)
threadrequest(10,5,2).assumeframebased()
threadrequest(10,5,2).seperatefields()
threadrequest(10,5,2).selectevery(8,0,1,2,3,2,5,4,7,6,7)
threadrequest(10,5,2).weave()
edeinted = threadrequest(10,5,2).tdeint()
threadrequest(10,5,2).tfm(edeint=edeinted)
threadrequest(10,5,2).tdecimate()
threadrequest(10,5,2).blur(0,1)
threadrequest(10,5,2).sharpen(0,1)
threadrequest(10,5,2).spline36resize(640,360)
trim(0,2878)

#Benchmark
  Virtualdub 1.9.9 (Cineform)

#Results
-no Threading
Finished in 00:04:59

- with ThreadRequest()
Finished in 00:01:58

That certainly made a huge difference, and it seems more stable than SetMTMode with overly complicated scripts like the one above.
Old 3rd June 2010, 08:59   #6  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
In fact, using this can be interesting for scripts with several functions. Pipelining is interesting when you have several tasks, as a451guy451 stated.
But if you are running, for example, a script with only one call, MT/SetMTMode are more interesting, and pipelining will do nothing.
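
The argument above can be put into a deliberately idealised cost model: pipelining is limited by the slowest stage, while frame-level parallelism divides the whole chain across threads. The Python sketch below assumes zero threading overhead, one core per stage, perfectly thread-safe filters and perfect scaling; the function names are hypothetical and real filters need not behave this way.

```python
def serial_time(stage_costs, frames):
    """One thread: every frame passes through every stage in turn."""
    return frames * sum(stage_costs)


def pipelined_time(stage_costs, frames):
    """Each stage on its own thread (pipeline-style threading).

    After the pipeline has filled, one frame completes per period of the
    slowest stage.
    """
    return sum(stage_costs) + (frames - 1) * max(stage_costs)


def frame_parallel_time(stage_costs, frames, threads):
    """Whole chain replicated per thread (frame-level threading), assuming
    perfect scaling."""
    batches = -(-frames // threads)  # ceiling division
    return batches * sum(stage_costs)


if __name__ == "__main__":
    for costs in ([10], [10, 10, 10, 10]):
        print(costs,
              serial_time(costs, 1000),
              pipelined_time(costs, 1000),
              frame_parallel_time(costs, 1000, threads=4))
```

In this model a single-stage script gains nothing from pipelining, while a script with four equally expensive stages gains about as much from pipelining as from four frame-level threads. Measured results can differ because source filters, caches and thread safety are not modelled.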
Old 3rd June 2010, 22:20   #7  |  Link
Chikuzen
Quote:
Originally Posted by jpsdr View Post
In fact, using this can be interesting for scripts with several functions. Pipelining is interesting when you have several tasks, as a451guy451 stated.
But if you are running, for example, a script with only one call, MT/SetMTMode are more interesting, and pipelining will do nothing.
I don't think so...

Quote:
Source: MPEG2 TS 1440x1080ix2700frames

MPEG2Source("Source.d2v")
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:27.578 (97.90 FPS) CPU:25%

SetMTMode(1,0)
MPEG2Source("Source.d2v")
Trim(0,2699)
  ***** CRASH! ****

SetMTMode(2,0)
MPEG2Source("Source.d2v")
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:33.278 (81.13 FPS) CPU:91%-97%

SetMTMode(3,0)
MPEG2Source("Source.d2v")
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:25.713 (105.01 FPS) CPU:28%

SetMTMode(4,0)
MPEG2Source("Source.d2v")
Trim(0,2699)
  * Pass 1/1: Finished in 00:01:39.181 (27.22 FPS) CPU:24%-28%

SetMTMode(5,0)
MPEG2Source("Source.d2v")
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:27.836 (97.00 FPS)

SetMTMode(6,0)
MPEG2Source("Source.d2v")
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:27.845 (96.97 FPS) CPU:25%

MPEG2Source("Source.d2v").ThreadRequest()
trim(0,2699)
  * Pass 1/1: Finished in 00:00:24.653 (109.52 FPS) CPU:26.5%

MPEG2Source("Source.d2v")
ConvertToYUY2(interlaced=true)
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:31.555 (85.56 FPS)

SetMTMode(3,0)
LoadPlugin(PDir+"DGDecode.dll")
MPEG2Source("Source.d2v")
ConvertToYUY2(interlaced=true)
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:27.035 (99.87 FPS) CPU:30%-33%

SetMTMode(3,0)
MPEG2Source("Source.d2v")
SetMTMode(1,0)
ConvertToYUY2(interlaced=true)
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:27.049 (99.82 FPS) CPU:30%-33%

MPEG2Source("Source.d2v").ThreadRequest()
ConvertToYUY2(interlaced=true).ThreadRequest()
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:25.182 (107.22 FPS) CPU:30%-33%

MPEG2Source("Source.d2v")
ConvertToYUY2(interlaced=true)
ConvertToYV12(interlaced=true)
ConvertToYUY2(interlaced=true)
ConvertToYV12(interlaced=true)
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:40.565 (66.56 FPS)

SetMTMode(3,0)
MPEG2Source("Source.d2v")
ConvertToYUY2(interlaced=true)
ConvertToYV12(interlaced=true)
ConvertToYUY2(interlaced=true)
ConvertToYV12(interlaced=true)
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:28.367 (92.18 FPS) CPU:38%-41%

SetMTMode(3,0)
MPEG2Source("Source.d2v")
SetMTMode(1,0)
ConvertToYUY2(interlaced=true)
ConvertToYV12(interlaced=true)
ConvertToYUY2(interlaced=true)
ConvertToYV12(interlaced=true)
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:28.891 (93.45 FPS) CPU:37%-40%

MPEG2Source("Source.d2v").ThreadRequest()
ConvertToYUY2(interlaced=true).ThreadRequest()
ConvertToYV12(interlaced=true).ThreadRequest()
ConvertToYUY2(interlaced=true).ThreadRequest()
ConvertToYV12(interlaced=true).ThreadRequest()
Trim(0,2699)
  * Pass 1/1: Finished in 00:00:28.109 (96.05 FPS) CPU:38%-41%

MPEG2Source("Source.d2v")
TDeint()
Trim(0,2699)
  * Pass 1/1: Finished in 00:02:09.933 (20.78 FPS) CPU:25%

SetMTMode(3,0)
MPEG2Source("Source.d2v")
TDeint()
  ***** CRASH! ****

SetMTMode(3,0)
MPEG2Source("Source.d2v")
SetMTMode(2,0)
TDeint()
  * Pass 1/1: Finished in 00:00:56.364 (47.90 FPS) CPU:91%-95%

MPEG2Source("Source.d2v").ThreadRequest()
TDeint().ThreadRequest()
trim(0,2699)
  * Pass 1/1: Finished in 00:01:54.827 (23.51 FPS) CPU:27%-30%

SetMTMode(5,0)
MPEG2Source("Source.d2v").Threadrequest()
SetMTMode(2,0)
TDeint()
  * Pass 1/1: Finished in 00:00:56.290 (47.97 FPS) CPU:92%-98%
Old 4th June 2010, 00:26   #8  |  Link
a451guy451
I haven't been cataloging my tests quite as well as Chikuzen, but I tend to agree with his analysis. There have only been a few instances where a transcode would go (slightly) faster for me with setmtmode (It might have something to do with the actual CPU specs though). Those cases were either just fps adjustments, or scripts with no real filtering happening. The major determining factor for me though is stability, and at that threadrequest() wins hands down. I had wanted to bench the two fairly with the script I referenced in my above post, but setmtmode couldn't run it with anything but modes 5 & 6, which wasn't really a fair comparison IMHO.

Also, I just realized my placement of threadrequest() in my first script tests is wrong. It seems like it should follow a filter, not precede it, correct?
Old 4th June 2010, 02:23   #9  |  Link
Didée
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 5,389
Quote:
Originally Posted by a451guy451 View Post
I haven't been cataloging my tests quite as well as Chikuzen, but I tend to agree with his analysis.
Well, I don't agree (yet). I tested this case here, too ... and the results are "interesting", however I don't quite understand them. (Note that I pretty much missed years of experience with Avisynth multithreading, I hopped on the train just these days.)

First, your quoted script above won't run at all. There's splin36resize instead of spline36resize, also seperatefields instead of separatefields, and tfm(edeint=edeinted) probably should read tfm(clip2=edeinted), I guess?

After correcting those, I got the following results on 1080p:

1 Thread:.. 1:46 (28.3 fps) (100%)
8 Thread:.. 1:28 (34.1 fps) (120%)
ThreadReq: 0:52 (57.7 fps) (204%)

By these numbers, threadrequest wins hands down. But it made me wonder, since in my tests with TGMC - which is a fairly complicated & complex script - I got much better results with setmtmode.

What was apparent is that, in all three cases, the actual CPU usage was fairly low. Singlethreaded it was around 22%, 8-threaded was about 25%, and threadrequest'ed was about 32%. That's very little.

Then I suspected it could be related to TFM and tdecimate. So I went ahead and simply deleted them both. Instead I put tdeint into place. (Yes the script then makes little sense processing-wise, but it's just for testing.)
Okay ... without TFM+tdecimate, but with tdeint, I then got the following numbers:

1 Thread:.. 3:08 (15.9 fps) (100%)
8 Thread:.. 0:57 (52.6 fps) (331%)
ThreadReq: 2:22 (21.1 fps) (132%)

Now look at that! Obviously, it is TFM+TDecimate in particular that don't play well with setmtmode. If these two are not involved, then the picture changes completely.

I don't claim to understand *why* it is like this, but at least there's a sign where the problem is located.

Hence, I have to comment:
Quote:
There have only been a few instances where a transcode would go (slightly) faster for me with setmtmode [...]
[...] Those cases were either just fps adjustments, or scripts with no real filtering happening.
(Interesting. When the script is doing basically nothing, then there is not that awfully much that could be accelerated.)

As indicated here, and more obviously in the other thread linked above, SetMTMode seems to work very well even with computationally intensive scripts. 4-fold speed on a processing monster like TGMC is quite remarkable, I'd say.

The true question is why SetMTmode works so badly together with TFM+TDecimate!
(By another spontaneous idea, I tried to "buffer" these filters with RequestLinear, but that had only very little effect.)

And lastly,
Quote:
The major determining factor for me though is stability, and at that threadrequest() wins hands down. I had wanted to bench the two fairly with the script I referenced in my above post, but setmtmode couldn't run it with anything but modes 5 & 6, which wasn't really a fair comparison IMHO.
so far I have not had any stability problems with SetMTmode. The script from this thread here works without any problems, with MTMode(2). Same for MT'ing the TGMC script, which I got running out-of-the-box, in the very first try.

On the other hand, I already made a (weak) try to put ThreadRequest *into* the TGMC function. But it is not running, I get nothing but crashes, crashes, crashes. Could be I'm using it not appropriately, quite possible. TGMC is complex, and the knowledge base for ThreadRequest is little.


Once more, I'm quite a newbie in Avisynth multithreading. I just try to learn, and report what I'm experiencing on the way.
__________________
- We´re at the beginning of the end of mankind´s childhood -

My little flickr gallery. (Yes indeed, I do have hobbies other than digital video!)
Old 4th June 2010, 03:00   #10  |  Link
Chikuzen
Quote:
Originally Posted by a451guy451 View Post
Also, I just realized my placement of threadrequest() in my first script tests is wrong. It seems like it should follow a filter, not precede it, correct?
I'm not yet sure about the correct usage of this plugin myself.
However, judging from the examples given by LANTIS (the ones in my first post), I think you are correct.
Old 4th June 2010, 03:40   #11  |  Link
tritical
Registered User
 
Join Date: Dec 2003
Location: MO, US
Posts: 999
Didée, what MT mode did you use? 2? I don't have any actual experience with it, but from my understanding mode 2 creates as many instances of each filter as there are threads. It also replaces the normal cache with a special cache that manages the threads so that a filter instance is only accessed by one thread at a time. In this way instance 1 can produce frame 1, instance 2 can produce frame 2, etc., and it doesn't matter whether the filter is thread-safe or not (access to class variables etc.). It does begin to matter, though, whether the filter produces the same results when seeking as when requesting frames linearly.

This will work fine for tfm/tdecimate, i.e. it won't produce errors and the output should be the same as with linear requesting (as long as the frame requests to each instance of tdecimate are not more than two cycles apart; for tfm, seeking doesn't matter except for one of the special matching modes for blending, which works based on previous matches). However, it won't save any time with tdecimate, because for its decisions each instance of tdecimate needs the statistics about every frame in each cycle. So all of the instances are doing all of the calculations, and therefore it doesn't save any time vs. using a single instance.
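
The point about redundant work can be illustrated by simply counting metric calculations. The Python sketch below is a toy model, not TDecimate's real logic: it assumes frames are handed to the instances round-robin, that each instance keeps its own private metric cache, and that deciding any frame needs metrics for that frame's whole cycle. `metric_computations` is a made-up name.

```python
def metric_computations(num_frames, cycle, instances):
    """Return how many per-frame metrics each instance has to compute.

    Toy model: frame f is served by instance f % instances, and serving it
    requires metrics for every frame in the cycle that contains f.
    """
    seen = [set() for _ in range(instances)]
    for frame in range(num_frames):
        inst = frame % instances
        start = (frame // cycle) * cycle
        seen[inst].update(range(start, min(start + cycle, num_frames)))
    return [len(s) for s in seen]


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, metric_computations(100, cycle=5, instances=k))
```

With 100 frames and a cycle of 5, a single instance computes 100 metrics, and with two or four instances every instance still computes 100. Under these assumptions the total work grows with the instance count while the work per thread stays the same, so nothing is gained.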

Last edited by tritical; 4th June 2010 at 03:48.
Old 4th June 2010, 08:20   #12  |  Link
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,431
Quote:
Originally Posted by Didée View Post
By these numbers, threadrequest wins hands down. But it made me wonder, since in my tests with TGMC - which is a fairly complicated & complex script - I got much better results with setmtmode.
There is an important difference between threadrequest and setmtmode when applied to a complex function like TGMC. Because SetMTMode is built-in to (a modified) Avisynth, each line is multithreaded separately. With threadrequest, it is the function as a whole (ie fetching of its output frames) that is multithreaded.

Edit: In effect, if I understand correctly, applying threadrequest to TGMC is the same as applying it to just the last filter used inside TGMC.

Last edited by Gavino; 4th June 2010 at 08:56.
Old 4th June 2010, 08:41   #13  |  Link
jpsdr
I think a complex script like TGMC, for example, needs to be rewritten internally with threadrequest inside it to really see the benefit.
Otherwise, on the global script, SetMTMode will do better; your results don't surprise me. It's each function inside the script which needs to be pipelined, not the whole script.
Rewrite TGMC with threadrequest inside "on each line" (something I intended to try one day), and test afterwards.
Old 4th June 2010, 12:50   #14  |  Link
Didée
@ tritical - thanks, I see. (Hope so, at least).

Let me ask the other way round: IF a script involves decimation via tdecimate, what would be promising approaches to make use of multi-threading? Any? Or not at all, necessarily?


Gavino, jpsdr -
I might be a novice with MT, but I'm not silly. Of course I did not do TGMC().threadrequest(). Surely it was fiddled into the script, but ... problems.

Practical example and problem: Consider this quite-familiar piece of code:
Code:
o = bob(0,0.5)                                 #.threadrequest(10,5,2)
                                               #
o.temporalsoften(1,255,255,28,2).merge(o,0.25) #.threadrequest(10,5,2)
                                               #
sup = msuper()                                 #.synchronize(80)
bv3 = sup.manalyse(isb=true,delta=3,search=4)  #.threadrequest(10,5,2)
bv2 = sup.manalyse(isb=true,delta=2,search=4)  #.threadrequest(10,5,2)
bv1 = sup.manalyse(isb=true,delta=1,search=4)  #.threadrequest(10,5,2)
fv1 = sup.manalyse(isb=false,delta=1,search=4) #.threadrequest(10,5,2)
fv2 = sup.manalyse(isb=false,delta=2,search=4) #.threadrequest(10,5,2)
fv3 = sup.manalyse(isb=false,delta=3,search=4) #.threadrequest(10,5,2)
                                               #
o.mdegrain3(sup,bv1,fv1,bv2,fv2,bv3,fv3)       #.threadrequest(10,5,2)
Challenge: Use ThreadRequest within this little script, so that it actually works.

I added TR to some lines, I added it to all lines, with default parameters, with larger-than-default parameters, also with synchronize in various places .... my clueless try is shown right of the # column.

All I get is crashes after a few frames. It seems to get better when adding Synchronize(xx) to the sup=msuper() line: with that, it won't crash after just a few frames, but it will crash after a few hundred frames.

Seems it's quite some hassle to get it right ...

(... while SetMTMode runs out-of-the-box on this example. And for the time while the ThreadRequest'ed script IS running, it is slower than SetMTMode: 66fps vs. 72fps.)

Last edited by Didée; 4th June 2010 at 12:53.
Old 4th June 2010, 14:36   #15  |  Link
Gavino
Quote:
Originally Posted by Didée View Post
I might be a novice with MT, but I'm not silly. Of course I did not do TGMC().threadrequest().
Doing xxx().threadrequest(), where xxx is a function, is not necessarily silly, and might even give better performance than applying threadrequest separately to every line of xxx.

The most effective way to multithread any given script is not at all obvious and requires careful analysis of frame access patterns over the entire filter graph, also taking into account the action of the Avisynth cache. Add to that the unknown 'thread-safeness' of the individual filters and the whole thing is a minefield.
Old 4th June 2010, 14:56   #16  |  Link
Didée
Quote:
Originally Posted by Gavino View Post
the whole thing is a minefield
So much I figured already.

Quote:
The most effective way to multithread any given script is not at all obvious and requires careful analysis of frame access patterns over the entire filter graph, also taking into account the action of the Avisynth cache.
And how would a mere-mortal Avisynth user perform such an analysis? (Heck, there's not even an obvious way of figuring a suited SetMemoryMax value, except for just trying and be surprised whether-or-not it crashes in the long run ...)

Patiently waiting for enlightenment on how to successfully use ThreadRequest on a "primitive" MDegrain sequence. If it works so well with SetMTmode, it should be possible with ThreadRequest, too?
Old 4th June 2010, 16:07   #17  |  Link
jpsdr
In fact, on my i7@980, I must say that on 480p with TGMC I went from around 5 fps to around 15 fps with SetMTMode(2,12).
So, as I'm lazy and already have something which speeds things up, I may not try to use threadrequest on TGMC...
Old 4th June 2010, 16:39   #18  |  Link
tritical
Atm, there isn't a good way to multithread tdecimate at the script level via setmtmode/MT etc... It would have to be done internally. Actually, it could probably be done using openmp with just a few lines, but I'd have to check the source code.
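
To show what "done internally" could look like in structure, here is a toy Python sketch: the per-frame metrics are computed in parallel by a thread pool, and the per-cycle drop decision is then made once from the shared results. Everything here is an assumption for illustration. The difference metric, the "drop the lowest metric per cycle" rule and the names `frame_metric` and `decimate` are invented and are not TDecimate's algorithm; in a C/C++ plugin a parallel loop would take the place of the pool.

```python
from concurrent.futures import ThreadPoolExecutor


def frame_metric(frames, n):
    """Toy metric: sum of absolute differences to the previous frame.

    Frame 0 has no predecessor, so it is never treated as a duplicate.
    """
    if n == 0:
        return float("inf")
    return sum(abs(a - b) for a, b in zip(frames[n], frames[n - 1]))


def decimate(frames, cycle=5, workers=4):
    """Return the indices kept after dropping, from each full cycle, the
    frame that looks most like its predecessor."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each metric is computed exactly once and shared by all decisions.
        metrics = list(pool.map(lambda n: frame_metric(frames, n),
                                range(len(frames))))
    kept = []
    for start in range(0, len(frames), cycle):
        indices = list(range(start, min(start + cycle, len(frames))))
        if len(indices) == cycle:
            indices.remove(min(indices, key=lambda n: metrics[n]))
        kept.extend(indices)
    return kept


if __name__ == "__main__":
    clip = [[0], [10], [10], [20], [30], [40], [50], [60], [60], [70]]
    print(decimate(clip))
```

The difference from running several independent filter instances is that the statistics live in one place, so no calculation is repeated.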

Personally, I think that for Avisynth 2.6 a function should be added to the plugin interface that requires a filter to report which mt modes it is compatible with, and for MT type multithreading automatically report how much overlap it needs for the current settings (still allow that to be overridden by the user though). That would eliminate a lot of problems for users trying to figure this out (in reality the only way to know which mode will work is to look at the source code). It would also get authors to think more about their implementation with respect to threading. Anyways, getting OT for this thread.

Last edited by tritical; 4th June 2010 at 16:41.
Old 4th June 2010, 19:49   #19  |  Link
Gavino
Quote:
Originally Posted by Didée View Post
And how would a mere-mortal Avisynth user perform such an analysis?
I don't know. As I said, it's not obvious.
However, there's a difference between 'the most effective way' and 'a way that doesn't crash' - here a simpler analysis might suffice.
Quote:
Patiently waiting for enlightenment on how to successfully use ThreadRequest on a "primitive" MDegrain sequence. If it works so well with SetMTmode, it should be possible with ThreadRequest, too?
One thing that springs to mind is that any node in the filter graph that feeds into two or more different places needs to have its access serialised unless it is known to be thread-safe. (See this post).

The result of bob() is fed into three different places (temporalsoften, merge and mdegrain3), while that of msuper() is fed into the six manalyse filters. Bob is probably(?) thread-safe, but likely the upstream source filter is not. So the first thing I would try is putting Synchronize() on bob() (or before it) and on Msuper().

Once the script works without crashing, it can then be tuned further by adjusting the threadrequest placement and parameters.
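
The "simpler analysis" of finding nodes that feed two or more places can be done mechanically once the filter graph is written down. The Python sketch below does this for the MDegrain3 script quoted earlier in the thread; the edge list was read off that script by hand, the node names are informal labels rather than AviSynth identifiers, and `shared_nodes` is a hypothetical helper.

```python
from collections import defaultdict

MANALYSE = ["bv3", "bv2", "bv1", "fv1", "fv2", "fv3"]

# (producer, consumer) pairs, read off the script by hand.
EDGES = [
    ("source", "bob"),
    ("bob", "temporalsoften"),
    ("temporalsoften", "merge"),
    ("bob", "merge"),
    ("merge", "msuper"),
    ("bob", "mdegrain3"),
    ("msuper", "mdegrain3"),
]
for name in MANALYSE:
    EDGES.append(("msuper", name))
    EDGES.append((name, "mdegrain3"))


def shared_nodes(edges):
    """Return {node: consumer count} for nodes with two or more consumers."""
    consumers = defaultdict(set)
    for producer, consumer in edges:
        consumers[producer].add(consumer)
    return {node: len(c) for node, c in consumers.items() if len(c) >= 2}


if __name__ == "__main__":
    print(shared_nodes(EDGES))
```

For this graph only bob and msuper are reported. msuper shows seven consumers rather than six because, in the script as written, mdegrain3 also takes `sup` directly in addition to the six manalyse calls.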

This is all in theory - I haven't actually used any multithreading stuff as my processor is an ancient single-core.
Old 9th June 2010, 05:35   #20  |  Link
CruNcher
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 4,926
Interesting: combining SetMTMode(2) with .ThreadRequest() on only the heaviest parts (tried with filters like yadif and unblock) seems to be faster than using .ThreadRequest() alone on every line.

For example, this was more efficient

SetMTMode(2)
directshowsource("E:\test.ts", audio=false)
Yadif(mode=0,order=1).ThreadRequest()

than this

directshowsource("E:\test.ts", audio=false)
Yadif(mode=0,order=1).ThreadRequest()

or that

directshowsource("E:\test.ts", audio=false).ThreadRequest()
Yadif(mode=0,order=1).ThreadRequest()

Interestingly, it immediately gave me a very constant 95% utilization on Avisynth SET 2.5.8 MT and a nice speedup.
__________________
all my compares are riddles so please try to decipher them yourselves :)

It is about Time

Join the Revolution NOW before it is to Late !

http://forum.doom9.org/showthread.php?t=168004

Last edited by CruNcher; 9th June 2010 at 05:39.