Old 11th March 2012, 15:52   #1  |  Link
cretindesalpes
͡҉҉ ̵̡̢̛̗̘̙̜̝̞̟̠͇̊̋̌̍̎̏̿̿
 
 
Join Date: Feb 2009
Location: No support in PM
Posts: 712
AVSTP — A library for multithreaded plug-in development

AVSTP is a programming library for Avisynth plug-in developers. It helps support native multithreading in plug-ins. It works by sharing a thread pool between multiple plug-ins, so the number of threads stays low however many plug-ins are instantiated. This helps save resources, especially when working in an Avisynth MT environment.

There is a low-level API using a C interface to access the shared functions of the library. There is also a set of C++ classes on top of that, for convenient use.


Download

>>>> avstp-1.0.4.zip <<<<

Documentation, library and source code are included in the archive.


Short user manual

From the Avisynth user's point of view, an AVSTP-enabled plug-in requires the avstp.dll file to be installed. Put it in the usual AviSynth 2.5\plugins\ directory, or load it manually with LoadPlugin("path\avstp.dll") in your script. The dll is shared between all plug-ins using AVSTP, so keep only one avstp.dll file in your plug-in set. If you're updating from a previous version, make sure that Avisynth can access only the latest one.

If a plug-in requiring AVSTP cannot find the dll file, it could crash, emit an error, or fall back gracefully on a single-threaded mode, depending on its design and implementation. There is no mandatory or pre-defined behaviour for such a case.

The number of threads is automatically set to the number of available logical processors. The thread count can also be controlled via an Avisynth function, so multi-threading can be disabled globally if not desired.
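
For plug-in authors who want a similar default in their own single-library fallback code, here is a minimal sketch in standard C++. How avstp itself queries the processor count is not stated here; choose_thread_count is a hypothetical helper name, and std::thread::hardware_concurrency() is simply the portable call for the number of logical processors.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper, not part of avstp.
// A positive `requested` value wins; 0 means "pick automatically".
inline int choose_thread_count (int requested = 0)
{
    assert (requested >= 0);
    if (requested > 0)
    {
        return requested;
    }

    // hardware_concurrency() may return 0 when the count is unknown,
    // so fall back to a single thread in that case.
    const unsigned int hw = std::thread::hardware_concurrency ();
    return (hw > 0) ? int (hw) : 1;
}
```

Calling choose_thread_count (1) would then be the way to disable multithreading in such a fallback path.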
__________________
dither 1.28.1 for AviSynth | avstp 1.0.4 for AviSynth development | fmtconv r30 for Vapoursynth & Avs+ | trimx264opt segmented encoding

Last edited by cretindesalpes; 7th October 2020 at 10:56. Reason: v1.0.4
Old 13th March 2012, 14:29   #2  |  Link
mastrboy
Registered User
 
Join Date: Sep 2008
Posts: 365
Good news, just hope more developers start using it so we can get a unified queue for multithreaded plugins.

Personally, I hope someone will implement it in TTempsmooth.
Old 29th March 2012, 02:49   #3  |  Link
VIIII
Registered User
 
Join Date: Mar 2012
Posts: 1
I'd be incredibly pleased if this was used in more filters.

Even with just the ones in the dither package, I've already seen noticeable fps increases.

It's nice to be able to utilise my cpu without having to rely on the finicky beast that SetMTMode() and MT() are.
Old 30th March 2012, 16:49   #4  |  Link
CruNcher
Registered User
 
 
Join Date: Apr 2002
Location: Germany
Posts: 4,926
We need benchmark evaluations (Intel, AMD) and a final solution in the end. Am I right that we already have 5 different implementations of multithreading now?

Sets = MT
LANTIS = ThreadRequest
SAPikachu = MP_Pipeline
Leimings2006 = SoraThread

First unified Plugin Approach:

cretindesalpes = Avstp

Anyone I missed?
__________________
all my compares are riddles so please try to decipher them yourselves :)

It is about Time

Join the Revolution NOW before it is to Late !

http://forum.doom9.org/showthread.php?t=168004

Last edited by CruNcher; 30th March 2012 at 17:09.
Old 27th May 2012, 11:38   #5  |  Link
cretindesalpes
͡҉҉ ̵̡̢̛̗̘̙̜̝̞̟̠͇̊̋̌̍̎̏̿̿
 
 
Join Date: Feb 2009
Location: No support in PM
Posts: 712
avstp 1.0.1 (see first post):
  • Removed any floating point code from the implementation, so avstp doesn't get confused when the client code doesn't flush the FP register state after MMX operations.
    It was occasionally causing deadlocks.
Old 30th May 2012, 08:51   #6  |  Link
tormento
Acid fr0g
 
 
Join Date: May 2002
Location: Italy
Posts: 2,542
Quote:
Originally Posted by cretindesalpes View Post
It was occasionally causing deadlocks.
Perhaps I have found the cause of locked encodings...
__________________
@turment on Telegram
Old 3rd June 2012, 20:03   #7  |  Link
Revgen
Registered User
 
Join Date: Sep 2004
Location: Near LA, California, USA
Posts: 1,545
Quote:
Originally Posted by cretindesalpes View Post
Link isn't working for me. Dither link isn't working either.

EDIT

Link works now. Thank You.
__________________
Pirate: Now how would you like to die? Would you like to have your head chopped off or be burned at the stake?

Curly: Burned at the stake!

Moe: Why?

Curly: A hot steak is always better than a cold chop.

Last edited by Revgen; 4th June 2012 at 13:16.
Old 21st June 2012, 12:22   #8  |  Link
Pat357
Registered User
 
Join Date: Jun 2006
Posts: 452
Thanks for your nice piece of work!
I got it working with MVTools v2.6.0.4 and just completed an almost 2-hour film with it: zero problems! Nice.
Just one small question: how exactly can we manually set the number of threads being used?

From the documentation :
avstp_set_threads (var c , number of threads)

What should I put for "var c" ? A name of a variable, a value, ...?

To get for example 4 threads being used, I've tried the following (without success) :

I added one of the following lines to the end of my script (before the final "last" is returned):
avstp_set_threads (var c , 4)
avstp_set_threads (klm , 4)
avstp_set_threads (c , 4)
avstp_set_threads (4 , 4)
avstp_set_threads (4)

Because none of these work, it seems I have to put something there.... but what ??

Thanks in advance !

Last edited by Pat357; 21st June 2012 at 12:37.
Old 21st June 2012, 19:15   #9  |  Link
cretindesalpes
͡҉҉ ̵̡̢̛̗̘̙̜̝̞̟̠͇̊̋̌̍̎̏̿̿
 
 
Join Date: Feb 2009
Location: No support in PM
Posts: 712
Quote:
Originally Posted by Pat357 View Post
avstp_set_threads (4)
This one should work. If it doesn't, what is the error message? On which Avisynth version? The "var c" is mostly a workaround to make the function work even if last isn't defined.
Old 21st June 2012, 21:26   #10  |  Link
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,431
Quote:
Originally Posted by cretindesalpes View Post
The "var c" is mostly a workaround to make the function work even if last isn't defined.
I don't see the point of that.
If c only exists so that last can be preserved, then it is unnecessary, since a function that returns an int leaves last unchanged. (last is only set when a function returns a clip.)
__________________
GScript and GRunT - complex Avisynth scripting made easier
Old 4th July 2012, 19:34   #11  |  Link
tormento
Acid fr0g
 
 
Join Date: May 2002
Location: Italy
Posts: 2,542
Is it possible to get the previous version? I think this one causes some strange glitches.
Old 3rd October 2012, 21:12   #12  |  Link
Pat357
Registered User
 
Join Date: Jun 2006
Posts: 452
It has been quite a while since the last post.... has development been stopped on AVSTP ?
Old 4th October 2012, 07:15   #13  |  Link
cretindesalpes
͡҉҉ ̵̡̢̛̗̘̙̜̝̞̟̠͇̊̋̌̍̎̏̿̿
 
 
Join Date: Feb 2009
Location: No support in PM
Posts: 712
Not really, but what would you expect to be added? It’s a simple library with a simple goal, and I think it doesn’t really need more. Maybe at the wrapping code level, to handle complex dependencies or different ways to split a frame.
Old 22nd July 2013, 12:49   #14  |  Link
TurboPascal7
Registered User
 
 
Join Date: Jan 2010
Posts: 270
I tried avstp lately and while I definitely like the idea, current implementation has some issues.

First - library interface is a bit dated, imho. While moving from C++11 thread pool with lambda support to avstp, I had to replace code like
Code:
for (int i = 0; i < threadPool.numberOfThreads(); ++i) {
    threadPool.enqueue([=]{
        prepareBuffers_op(pSrc + (offset_+i*heightPerThread)*srcPitch, buffers_, width, heightPerThread+2, srcPitch, bufferPitch_, heightPerThread / 2 * i * bufferPitch_);
    });
}
with
Code:
struct RunData {
    const BYTE* pSrc;
    BYTE** buffers;
    int width;
    int height;
    int srcPitch;
    int bufferPitch;
    int bufferOffset;
    decltype(prepareBuffers_op) op;
};

std::vector<RunData> datas;

for (int i = 0; i < pool.numberOfThreads(); ++i) {
    RunData d;
    d.pSrc = pSrc + (offset_+i*heightPerThread)*srcPitch;
    d.buffers = buffers_;
    d.width = width;
    d.height = heightPerThread + 2;
    d.srcPitch = srcPitch;
    d.bufferPitch = bufferPitch_;
    d.bufferOffset = heightPerThread / 2 * i * bufferPitch_;
    d.op = prepareBuffers_op;
    datas.push_back(d);
}

for (int i = 0; i < pool.numberOfThreads(); ++i) {
    pool.enqueue(dispatcher_, [](avstp_TaskDispatcher *dispatcher, void *userData){
        auto data = reinterpret_cast<RunData*>(userData);
        data->op(data->pSrc, data->buffers, data->width, data->height, data->srcPitch, data->bufferPitch, data->bufferOffset);
    }, &datas[i]);
}
While lambda support isn't something you can't live without, it would definitely be nice to have.
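
A thin adapter can bridge the two styles without touching the library. The sketch below is not part of avstp and does not use its real prototypes: TaskFnc and enqueue_c are hypothetical stand-ins for a C-style "function pointer plus void *" enqueue call, and LambdaTasks is an invented helper name. It only shows the trampoline technique.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical C-style task signature: a plain function pointer plus an
// opaque user-data pointer. This is NOT the real avstp prototype.
typedef void (*TaskFnc) (void *user_data);

// Hypothetical stand-in for a C-style enqueue call. It runs the task
// immediately so the sketch stays single-threaded and easy to follow.
inline void enqueue_c (TaskFnc fnc, void *user_data)
{
    assert (fnc != nullptr);
    fnc (user_data);
}

// Owns the callables so that the void* handed to the C API stays valid
// for as long as this object lives.
class LambdaTasks
{
public:
    explicit LambdaTasks (std::size_t max_tasks)
    {
        // Reserve up front: push_back must never reallocate, otherwise
        // pointers already handed out would dangle.
        _tasks.reserve (max_tasks);
    }

    void enqueue (std::function <void ()> task)
    {
        assert (_tasks.size () < _tasks.capacity ());
        _tasks.push_back (std::move (task));
        enqueue_c (&LambdaTasks::trampoline, &_tasks.back ());
    }

private:
    // Static member function, usable as a plain C callback. It casts the
    // user data back to the stored callable and invokes it.
    static void trampoline (void *user_data)
    {
        (*static_cast <std::function <void ()> *> (user_data)) ();
    }

    std::vector <std::function <void ()> > _tasks;
};
```

With a helper of this kind, the loop could go back to the tasks.enqueue([=]{ ... }) form; the price is one std::function per task, and the helper must outlive the tasks it has queued.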

Second - avstp is slow. I get about 10-15% fps drop compared to a simple implementation like https://github.com/progschj/ThreadPool. I'm not sure what the reason for this slowdown is, but it's there and it's really annoying. Maybe I'm doing something wrong in the code above, but I don't think that's the case.

And a personal question - why are you reimplementing a lot of things that have been available in different compilers/boost for ages? I do appreciate extremely easy compilation process of your plugins, but most of avstp code looks like a rewrite of boost to me.

-------
Okay, nevermind the performance part, but your API is just way too confusing.
When avstp_set_threads(4) is called, the thread pool will actually have 3 threads. This is good because we also have the main thread, but I didn't find any mention of this in the documentation. At the same time, avstp_get_nbr_threads (also, please rename it to "number_of_threads" or "thread_count", nbr is just bad) returns 4, leading to the wrong assumption that there are actually 4 threads in the pool and hence to incorrect workload distribution. You should either make this part correct at the API level (return the actual number of threads in the pool) or note it in the documentation.

Last edited by TurboPascal7; 24th July 2013 at 14:54. Reason: important update
Old 28th July 2013, 18:42   #15  |  Link
cretindesalpes
͡҉҉ ̵̡̢̛̗̘̙̜̝̞̟̠͇̊̋̌̍̎̏̿̿
 
 
Join Date: Feb 2009
Location: No support in PM
Posts: 712
Quote:
Originally Posted by TurboPascal7 View Post
While lambda support isn't something you can't live without, it would definitely be nice to have.
I agree. I don’t have a C++11-compatible compiler installed at the moment, so I cannot develop it, but it would indeed simplify the code.

Quote:
Second - avstp is slow. I get about 10-15% fps drop compared to a simple implementation
I don’t know why. The thread synchronization is done by calling WaitForSingleObjectEx() and ReleaseSemaphore(). Do faster synchronization primitives exist in the modern Windows APIs? I don’t know what std::condition_variable uses on your compiler. But this shouldn’t make any noticeable difference with coarse-grained tasks.
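
For comparison, a counting semaphore in portable C++11 is usually built from a mutex, a counter and a std::condition_variable. The sketch below is only an illustration of that pattern: the class name Semaphore is invented here, and this is neither avstp's code nor a claim about what any particular standard library does internally.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Illustrative counting semaphore, not taken from avstp.
class Semaphore
{
public:
    explicit Semaphore (int initial_count = 0)
    :   _count (initial_count)
    {
        assert (initial_count >= 0);
    }

    // Rough counterpart of ReleaseSemaphore(): add one unit, wake a waiter.
    void release ()
    {
        {
            std::lock_guard <std::mutex> lock (_mutex);
            ++_count;
        }
        _cond.notify_one ();
    }

    // Rough counterpart of a blocking wait: sleep until a unit is free.
    // The predicate protects against spurious wake-ups.
    void acquire ()
    {
        std::unique_lock <std::mutex> lock (_mutex);
        _cond.wait (lock, [this] { return _count > 0; });
        --_count;
    }

    // Non-blocking variant: returns false instead of sleeping.
    bool try_acquire ()
    {
        std::lock_guard <std::mutex> lock (_mutex);
        if (_count == 0)
        {
            return false;
        }
        --_count;
        return true;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    int _count;
};
```

Either way, one wait and one release per task is cheap next to processing a slice of a frame, which is why the choice of primitive should matter little for coarse-grained tasks.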

Quote:
And a personal question - why are you reimplementing a lot of things that have been available in different compilers/boost for ages? I do appreciate extremely easy compilation process of your plugins, but most of avstp code looks like a rewrite of boost to me.
Some of these primitives are actually quite old and I have been using them for a long time. I’m comfortable with them and I don’t see any reason to switch to a complicated external library.

Quote:
Okay, nevermind the performance part, but your API is just way too confusing.
When avstp_set_threads(4) is called, the thread pool will actually have 3 threads. This is good because we also have the main thread, but I didn't find any mention of this in the documentation. At the same time, avstp_get_nbr_threads (also, please rename it to "number_of_threads" or "thread_count", nbr is just bad) returns 4, leading to the wrong assumption that there are actually 4 threads in the pool and hence to incorrect workload distribution. You should either make this part correct at the API level (return the actual number of threads in the pool) or note it in the documentation.
When avstp_set_threads() is called with 4, there are exactly 4 working threads, because the main thread is also a working thread. Check the individual thread load with Process Explorer to make sure. I’ll mention it in the documentation. Maybe this misunderstanding is the cause of the slowdown you noticed previously?
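
In practice this means the reported thread count can be used directly as the number of slices, with no "+ 1" or "- 1" correction. A minimal sketch of such a split follows; RowSlice and split_rows are hypothetical names, and the nbr_threads argument is assumed to be the value reported by the library.

```cpp
#include <cassert>
#include <vector>

// Half-open range of rows: [y_beg, y_end).
struct RowSlice
{
    int y_beg; // first row of the slice
    int y_end; // one past the last row
};

// Split `height` rows into one slice per working thread. `nbr_threads`
// already counts the calling thread, so no "+ 1" is applied here.
inline std::vector <RowSlice> split_rows (int height, int nbr_threads)
{
    assert (height >= 0);
    assert (nbr_threads > 0);

    std::vector <RowSlice> slices;
    slices.reserve (nbr_threads);
    for (int i = 0; i < nbr_threads; ++i)
    {
        // Integer rounding spreads the remainder over the slices; the
        // 64-bit intermediate avoids overflow in the multiplication.
        const int y_beg = int ((long long) height *  i      / nbr_threads);
        const int y_end = int ((long long) height * (i + 1) / nbr_threads);
        slices.push_back (RowSlice { y_beg, y_end });
    }
    return slices;
}
```

Queuing one task more than this, as described in the slowdown report, leaves a single slice to run after all the others have finished.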

But why is “nbr” bad?

Last edited by cretindesalpes; 28th July 2013 at 18:46.
Old 28th July 2013, 22:13   #16  |  Link
TurboPascal7
Registered User
 
 
Join Date: Jan 2010
Posts: 270
Quote:
Originally Posted by cretindesalpes View Post
When avstp_set_threads() is called with 4, there are exactly 4 working threads, because the main thread is also a working thread. Check the individual thread load with Process Explorer to make sure. I’ll mention it in the documentation. Maybe this misunderstanding is the cause of the slowdown you noticed previously?
Yes, this was the cause of the slowdown, that's why I said "nevermind the performance part". Basically I allocated one more task to avstp threads running on the assumption that number of threads returned by avstp == number of threads in the pool.

I still think it's an API problem. Avstp is a thread pool - nothing more. Whatever a user is doing with his main thread should not concern it. When a user calls get_nbr_threads in avstp, he expects to get the number of threads in avstp, not the number of threads + 1 because it assumes that you're doing some work in the main thread. I think the fact that I spent some time using this library in the wrong way speaks for itself.

Or maybe it's just me and having more opinions would help.

Quote:
Originally Posted by cretindesalpes View Post
But why is “nbr” bad?
It's just a bad naming practice. Calling it "number" won't make your program slower, but will make reading it a bit easier. Of course, in my opinion.
Old 29th July 2013, 07:30   #17  |  Link
cretindesalpes
͡҉҉ ̵̡̢̛̗̘̙̜̝̞̟̠͇̊̋̌̍̎̏̿̿
 
 
Join Date: Feb 2009
Location: No support in PM
Posts: 712
Quote:
Originally Posted by TurboPascal7 View Post
I still think it's an API problem. Avstp is a thread pool - nothing more. Whatever a user is doing with his main thread should not concern it. When a user calls get_nbr_threads in avstp, he expects to get the number of threads in avstp, not the number of threads + 1 because it assumes that you're doing some work in the main thread.
No, it doesn’t work exactly like a regular thread pool as you assumed. What happens in the main thread when avstp_wait_completion() is called is not up to the user; the main thread becomes the last worker thread, whereas it would just wait for the other threads if implemented as a naive thread pool.
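
To make the difference concrete, here is a rough sketch of a pool whose wait call turns the caller into a worker. It is written with C++11 standard threads purely for illustration: HelpingPool and its members are invented names, and none of this is avstp's actual implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Illustrative pool, not avstp code.
class HelpingPool
{
public:
    // nbr_threads counts the caller, so only nbr_threads - 1 threads
    // are spawned.
    explicit HelpingPool (int nbr_threads)
    {
        assert (nbr_threads >= 1);
        for (int i = 1; i < nbr_threads; ++i)
        {
            _workers.emplace_back ([this] { work_loop (); });
        }
    }

    ~HelpingPool ()
    {
        {
            std::lock_guard <std::mutex> lock (_mutex);
            _quit = true;
        }
        _cond.notify_all ();
        for (auto &t : _workers) { t.join (); }
    }

    void enqueue (std::function <void ()> task)
    {
        {
            std::lock_guard <std::mutex> lock (_mutex);
            _queue.push_back (std::move (task));
            ++_pending;
        }
        _cond.notify_one ();
    }

    // The caller drains the queue like any worker, and sleeps only while
    // other threads finish tasks they have already picked up.
    void wait_completion ()
    {
        std::unique_lock <std::mutex> lock (_mutex);
        while (_pending > 0)
        {
            if (_queue.empty ()) { _cond.wait (lock); }
            else                 { run_one (lock); }
        }
    }

private:
    // Pops one task and runs it with the mutex released.
    void run_one (std::unique_lock <std::mutex> &lock)
    {
        std::function <void ()> task = std::move (_queue.front ());
        _queue.pop_front ();
        lock.unlock ();
        task ();
        lock.lock ();
        --_pending;
        _cond.notify_all ();
    }

    void work_loop ()
    {
        std::unique_lock <std::mutex> lock (_mutex);
        while (! _quit)
        {
            if (_queue.empty ()) { _cond.wait (lock); }
            else                 { run_one (lock); }
        }
    }

    std::mutex _mutex;
    std::condition_variable _cond;
    std::deque <std::function <void ()> > _queue;
    std::vector <std::thread> _workers;
    int _pending = 0;
    bool _quit = false;
};
```

A HelpingPool built with 4 spawns 3 threads, yet up to 4 tasks run at once during wait_completion(), which is the counting convention described above. Built with 1, it spawns nothing and the caller does all the work.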

But these are implementation details. For a client application, the important thing to consider is the number of simultaneous working threads. I would find it much more confusing if get_nbr_threads() returned something different from the actual parallel processing capability.

Last edited by cretindesalpes; 29th July 2013 at 07:33.
Old 3rd November 2015, 09:05   #18  |  Link
kotuwa
Registered User
 
Join Date: May 2012
Posts: 66
Quote:
Originally Posted by cretindesalpes View Post
Will this work for x64 dither.dll?
Or is there another 64-bit version of avstp somewhere else?
Or is there no support for x64?
Old 30th December 2015, 23:49   #19  |  Link
cretindesalpes
͡҉҉ ̵̡̢̛̗̘̙̜̝̞̟̠͇̊̋̌̍̎̏̿̿
 
cretindesalpes's Avatar
 
Join Date: Feb 2009
Location: No support in PM
Posts: 712
avstp 1.0.3:
  • Recompiled with MSVC 2013 with various internal updates. API not changed at the moment.
  • 64-bit support.
Old 31st December 2015, 10:32   #20  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quick test with your mvtools2 2.6.0.5 and avstp 1.0.3 vs. 1.0.1 (i5 2500K @ 4GHz, 4 cores):

Script:
Code:
colorbars(width = 1920, height = 1080, pixel_type = "yv12").killaudio().assumefps(25, 1).trim(0, 199)
super = MSuper()
multi_vec = MAnalyse (super, multi = true, delta = 9)
MDegrainN(super, multi_vec, 9, thSAD = 400, thSAD2 = 150)
Result with 1.0.1:
Code:
Frames processed:               200 (0 - 199)
FPS (min | max | average):      3.196 | 8.265 | 4.213
Memory usage (phys | virt):     539 | 538 MB
Thread count:                   4
CPU usage (average):            86%
Result with 1.0.3:
Code:
Frames processed:               200 (0 - 199)
FPS (min | max | average):      3.203 | 8.198 | 4.143
Memory usage (phys | virt):     538 | 537 MB
Thread count:                   8
CPU usage (average):            87%
Avstp 1.0.3 creates 4 additional threads compared to 1.0.1. Using "avstp_set_threads(16)" for example would create 20 threads as opposed to 16 with 1.0.1. I suppose this is by design?

Also, the speed/efficiency is slightly worse with the new version.

Last edited by Groucho2004; 31st December 2015 at 10:35.