Old 1st December 2011, 04:20   #1  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
MP_Pipeline 0.18 - run parts of avisynth script in external processes [2014-04-06]

This plugin was originally written for a friend to work around the [del]2GB[/del] 4GB address-space limit of 32-bit processes. (Well, also for fun.) I don't know whether it is useful for others, but I decided to post it here anyway.

As of 0.11, the plugin's overhead is much smaller, so it may be possible to use it to speed up more scripts.

Change log:
Code:
0.18
* Fix deadlock when exported clip is consumed by multiple script blocks

0.17
* Properly terminate slave processes when initialization fails
* Fix "Not a clip" error when using ### inherit and the last block is empty

0.16
* Try to silence all error dialogs on exit of slave process
* Slave process shouldn't get stuck on exit anymore; it will terminate itself if it doesn't exit cleanly within 15 seconds
* Fix ### branch statement, previously it incorrectly rejected some input

0.15
* Properly clean script environment up on exit
* Allow using different avisynth dll to run script block (### dll)

0.14
* Fixed another crashing bug

0.13
* Fixed a bug that causes occasional crashing

0.12
* Fixed a problem that makes scripts unable to be loaded in some programs

0.11
* Greatly improved performance, up to 80% overhead reduction
* New feature: Ability to lock threads to cores, may improve performance in some cases
* (0.10 is skipped to avoid confusion)

0.9
* New feature: Frame prefetching
* New feature: Exporting multiple clip variables in a single process
* New feature: Code block can be shared between processes

0.3
* Binaries in the x86 folder are the correct version now (in 0.2 the win64 slave was actually win32...)
* Integrated a patched TCPDeliver, no longer depend on the external one
* Fixed random crash when filter chain is destroyed
* Thunked branching

0.2
* x64 support (please copy TCPDeliver.dll in the package to respective plugin folder)
* x86/x64 mixed slave process (requires both x86/x64 version of AviSynth to be installed)
* Add a script variable in the branch slave process to make it distinguishable in script

Limitations:
* Since each process has its own script environment, script variables and loaded plugins are not inherited; they must be re-initialized where needed
* Due to the limitation above, manually loaded plugins and imported scripts need to be reloaded/re-imported before they can be used in a new process (or use an inherited script snippet; please see MP_Pipeline_readme.avs for details)
* Clips before MP_Pipeline will be ignored
* Audio is not supported
* Every script block must return a clip (i.e. "last" must be a clip), otherwise MPP will raise this error: Invalid arguments to function "MPP_PrepareDownstreamClip"

Binary: http://nmm.me/z6
Source code: https://github.com/SAPikachu/MP_Pipeline/tree/0.18

Some examples:

1. Basic usage:
Code:
MP_Pipeline("""
FFVideoSource("SomeVideo")
QTGMC()
### prefetch: 16, 0
### ###
""")
MCTD()

# MCTD and QTGMC will run in parallel in 2 separate processes
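
For readers wondering why splitting the script helps: the two blocks behave roughly like the stages of a producer/consumer pipeline joined by a bounded buffer. The sketch below is a toy model in Python, not MP_Pipeline's actual code. Threads stand in for the separate processes, and the buffer size of 16 mirrors the first number of `### prefetch: 16, 0` only under the assumption that it is the number of frames buffered ahead. `stage_one`, `stage_two` and `run_pipeline` are made-up names.

```python
import queue
import threading

SENTINEL = None  # marks the end of the frame stream


def stage_one(num_frames, out_q):
    """Upstream block (stand-in for the source + QTGMC process)."""
    for n in range(num_frames):
        # put() blocks while the buffer is full, so the upstream stage can
        # run at most `prefetch` frames ahead of the downstream stage.
        out_q.put(("deinterlaced", n))
    out_q.put(SENTINEL)


def stage_two(in_q, results):
    """Downstream block (stand-in for MCTD running in the host process)."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        tag, n = item
        results.append((tag + "+denoised", n))


def run_pipeline(num_frames, prefetch=16):
    """Run both stages concurrently and return the processed frames in order."""
    buf = queue.Queue(maxsize=prefetch)
    results = []
    producer = threading.Thread(target=stage_one, args=(num_frames, buf))
    consumer = threading.Thread(target=stage_two, args=(buf, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

While the downstream stage works on frame n, the upstream stage is already preparing later frames, which is where the speed-up comes from when both stages are roughly equally expensive.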
2. Speed up MCTD at the cost of memory
Code:
# Must be 64bit system with at least 8GB memory to run this script
MP_Pipeline("""

# This may be smaller, but I only tested this number
SetMemoryMax(3072)

FFVideoSource("SomeVideo")
MCTD(settings="high")
### prefetch: 16, 0
### ###
""")

# Some time ago I used a script similar to this one for encoding; it was about 20-30% faster than plain MCTD.
3. Branching
Code:
MP_Pipeline("""
FFVideoSource("SomeVideo")
TNLMeans()
### prefetch: 16, 0
### branch: 4
### ###
""")

# TNLMeans will be run in 4 processes with branching (please see example script in the package for details)
4. Frame caching
Code:
MP_Pipeline("""
FFVideoSource("SomeUnseekableVideo", seekmode=-1)
TNLMeans()
### prefetch: 32, 24
# It is important to use a big backward cache since we can't seek

### ###

MCTD()

""")
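
Why the backward cache matters for an unseekable source can be shown with a small Python model. This is only a sketch, assuming that the second number in `### prefetch: 32, 24` is the count of already-delivered frames kept around (inferred from the comment in the script above, not from the plugin source). `ForwardOnlySource` and `BackwardCache` are hypothetical names.

```python
from collections import OrderedDict


class ForwardOnlySource:
    """A source that can only be read front to back (no seeking)."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.position = 0  # index of the next frame read_next() will return

    def read_next(self):
        if self.position >= self.num_frames:
            raise EOFError("no more frames")
        frame = f"frame-{self.position}"
        self.position += 1
        return frame


class BackwardCache:
    """Keeps the newest frame plus `backward` older frames from the source."""

    def __init__(self, source, backward=24):
        self.source = source
        self.backward = backward
        self.cache = OrderedDict()

    def get_frame(self, n):
        # Read forward until frame n has been fetched, caching on the way.
        while self.source.position <= n:
            idx = self.source.position
            self.cache[idx] = self.source.read_next()
            while len(self.cache) > self.backward + 1:
                self.cache.popitem(last=False)  # drop the oldest frame
        if n not in self.cache:
            # The frame was read earlier but has already been evicted, and
            # the source cannot go back to fetch it again.
            raise LookupError(f"frame {n} is outside the backward cache")
        return self.cache[n]
```

A downstream temporal filter that asks for frame n - 3 after frame n is served from the cache; with a backward size of 0 the same request would fail, because the source cannot rewind.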
Please see the example script in the binary package for other usage examples and explanations of the settings.
__________________
f3kdb 1.5.1 / MP_Pipeline 0.18

ffms2 builds with 10bit output hack:
libav-9a60b1f / ffmpeg-1e4d049 / FFmbc-0.7.1
Built from ffms2 6e0d654 (hack a9fe004)

Mirrors: http://bit.ly/19TwDD3

Last edited by SAPikachu; 6th April 2014 at 10:57.
Old 1st December 2011, 08:10   #2  |  Link
TheRyuu
warpsharpened
 
Join Date: Feb 2007
Posts: 787
Quote:
Originally Posted by SAPikachu View Post
work-around the 4GB problem of 32-bit process.
ftfy.
Old 1st December 2011, 08:16   #3  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by TheRyuu View Post
ftfy.
User processes can only use 2GB of the full address space, can't they? (Well... actually 3GB under some conditions, but that's a special case.)
Old 1st December 2011, 11:55   #4  |  Link
SEt
Registered User
 
Join Date: Aug 2007
Posts: 374
Actually 4GB on 64 bit OS.
Old 1st December 2011, 12:30   #5  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by SEt View Post
Actually 4GB on 64 bit OS.
Didn't know that until I read this. Learned something today, thanks.
Old 1st December 2011, 14:35   #6  |  Link
kemuri-_9
Compiling Encoder
 
Join Date: Jan 2007
Posts: 1,348
Quote:
Originally Posted by SEt View Post
Actually 4GB on 64 bit OS.
Generally, everything has to be compiled with large address awareness for the 32-bit binaries (executable and DLLs) to really allow addressing more than 2GB of memory.

As this is usually not a default build option, iirc, most things don't have it enabled, which prevents addressing more than 2GB of memory.
__________________
custom x264 builds & patches | F@H | My Specs
Old 1st December 2011, 14:58   #7  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
So this basically is an AVS2YUV clone, but not as a stand-alone application, but as an Avisynth plug-in?

I think this would be particularly useful to load 32-Bit plugins (that don't have 64-Bit equivalents) into a 64-Bit Avisynth environment. Or vice versa.

Is that supported/intended?
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊

Last edited by LoRd_MuldeR; 1st December 2011 at 15:01.
Old 1st December 2011, 14:58   #8  |  Link
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,431
Quote:
Originally Posted by kemuri-_9 View Post
generally everything has to be compiled with large address awareness for the 32bit binaries (executable and dlls) to really allow addressing over 2GB of memory.
I thought it was just executables (not dlls). Thus Avisynth will benefit from increased memory if used by a client that has been built as 'large address aware'.
__________________
GScript and GRunT - complex Avisynth scripting made easier
Old 1st December 2011, 15:09   #9  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
Quote:
Originally Posted by Gavino View Post
I thought it was just executables (not dlls). Thus Avisynth will benefit from increased memory if used by a client that has been built as 'large address aware'.
Right:
http://blogs.msdn.com/b/oldnewthing/.../10065933.aspx

But then loading a DLL into some "LARGEADDRESSAWARE" process might break it, if the code in that DLL isn't prepared to deal with addresses beyond 2 GB.
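
For anyone who wants to check whether a given 32-bit executable carries the flag being discussed, here is a small Python sketch. It relies on the documented PE/COFF layout: the 4-byte offset of the PE signature is stored at 0x3C, and `IMAGE_FILE_LARGE_ADDRESS_AWARE` is bit 0x0020 of the Characteristics field in the COFF header. The function name `is_large_address_aware` is made up for this example, and the sketch does no validation beyond the two signatures.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020  # bit in the COFF Characteristics field


def is_large_address_aware(data: bytes) -> bool:
    """Return True if the PE image in `data` has the large-address-aware flag."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # e_lfanew: offset of the "PE\0\0" signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 20-byte COFF header follows the signature; Characteristics is its
    # last field, 18 bytes in.
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)
```

Typical use would be `is_large_address_aware(open(path, "rb").read())` on the host executable, since, as noted above, it is the flag on the executable that decides the address space of the process.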
Old 1st December 2011, 15:30   #10  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by LoRd_MuldeR View Post
So this basically is an AVS2YUV clone, but not as a stand-alone application, but as an Avisynth plug-in?

I think this would be particularly useful to load 32-Bit plugins (that don't have 64-Bit equivalents) into a 64-Bit Avisynth environment. Or vice versa.

Is that supported/intended?
It is functionally similar to avs2yuv, but with some additional features like multiple levels of pipeline.

That was not my original intention, but it is an interesting idea. It only supports x86 for now; I will add x64 and mixed script environment support later when I have time.
Old 1st December 2011, 16:30   #11  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,843
Can we use this to divide a file (using Trim) into a few parts, run QTGMC on each one (in a separate process), and put them together at the end?


Andrew
Old 1st December 2011, 16:33   #12  |  Link
06_taro
soy sauce buyer
 
Join Date: Mar 2010
Location: United Kingdom
Posts: 164
Now add large memory aware flag to exceed 2GB limit in avs4x264mod.
Old 1st December 2011, 21:25   #13  |  Link
-Vit-
Registered User
 
Join Date: Jul 2010
Posts: 448
Quote:
Originally Posted by kolak View Post
Can we use this to divide file (using trim) to few parts and run on each one (in seperate process) QTGMC and put them together at the end?
It's an interesting plugin, but doesn't seem to help for that unless I'm missing something. I tried this on some SD footage:
Code:
MP_Pipeline("""

WhateverSource("Some\Source")

### ###

QTGMC("Placebo")

### branch: 4

### ###

""")
Worked OK, ran five slave processes and produced the correct result. However, it was slower than single-threaded (single-threaded is 6fps, this script was 5fps). Used about 2.4GB of memory. Increasing branch slowed it down further; reducing branch to 2 sped it up to just over 6fps.

By comparison, splitting the video and running many separate single threaded encoding processes, or just using SetMTMode gives 20-25fps. SetMTMode uses a lot less memory.
Old 1st December 2011, 21:44   #14  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,843
Hmmm, shame.

I'm forced to run a few instances for HD. Not a big deal, but if it could be automated then it would be easier.
Old 2nd December 2011, 02:08   #15  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by -Vit- View Post
It's an interesting plugin, but doesn't seem to help for that unless I'm missing something. I tried this on some SD footage:
Code:
MP_Pipeline("""

WhateverSource("Some\Source")

### ###

QTGMC("Placebo")

### branch: 4

### ###

""")
Worked OK, ran five slave processes and produced the correct result. However, it was slower than single threaded (single threaded is 6fps, this script was 5fps). Used about 2.4Gb memory. Increasing branch slowed it down further, reducing branch to 2 speeded it up to just over 6fps.

By comparison, splitting the video and running many separate single threaded encoding processes, or just using SetMTMode gives 20-25fps. SetMTMode uses a lot less memory.
The branch statement is actually not very useful. It is only suitable for spatial, single-threaded plugins like TNLMeans. For temporal scripts/filters (especially complex scripts like QTGMC), the same frame will be processed repeatedly by multiple processes, so CPU time is wasted and speed decreases. That's why I didn't mention it in the OP.
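
The duplicated work can be estimated with a simple model. The Python sketch below assumes that branch k of B produces output frames k, k+B, k+2B, ... and that a temporal filter of radius r needs source frames n-r to n+r for output frame n. It counts how many distinct source frames each branch has to process. It is a back-of-the-envelope model, not a description of MP_Pipeline's internals, and the function names are invented.

```python
def frames_needed(outputs, radius, num_frames):
    """Distinct source frames needed to produce the given output frames."""
    needed = set()
    for n in outputs:
        lo = max(0, n - radius)
        hi = min(num_frames - 1, n + radius)
        needed.update(range(lo, hi + 1))
    return needed


def interleaved_branch_cost(num_frames, branches, radius):
    """Total source frames processed across all branches (interleaved split)."""
    total = 0
    for k in range(branches):
        outputs = range(k, num_frames, branches)  # frames k, k+B, k+2B, ...
        total += len(frames_needed(outputs, radius, num_frames))
    return total
```

With a radius of 0 (a purely spatial filter) the total equals the clip length, so the work is divided cleanly. With 8 frames, 2 branches and a radius of 1, each branch already touches all 8 frames, 16 in total, so every branch ends up redoing the whole upstream chain.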
Old 2nd December 2011, 04:26   #16  |  Link
-Vit-
Registered User
 
Join Date: Jul 2010
Posts: 448
Quote:
Originally Posted by SAPikachu View Post
The branch statement is actually not very useful, it is only suitable for spatial single-threaded plugins like TNLMeans, for temporal scripts/filters (especially complex script like QTGMC), the same frame will be repeatedly processed by multiple processes and cpu time will be wasted, decreasing speed. That's why I didn't mention it in OP.
Ah yes, because it splits into interleaved sequences... Would it be difficult to have it split into several contiguous chunks instead? Or is there some other reason not to do that?
Old 2nd December 2011, 07:27   #17  |  Link
06_taro
soy sauce buyer
 
Join Date: Mar 2010
Location: United Kingdom
Posts: 164
Quote:
Originally Posted by -Vit- View Post
Ah yes, because it splits into interleaved sequences... Would it be difficult to have it split into several contiguous chunks instead? Or is there some other reason not to do that?
Because it is much easier to use SelectEvery and Interleave. You don't need to care exactly how many frames there are in total.
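
The point about not needing the frame count can be seen in a list-based Python model of the two filters. This is a sketch only: `select_every` and `interleave` imitate the basic behaviour of AviSynth's SelectEvery(step, offset) and Interleave on plain lists. For clips of unequal length the model simply skips the missing frames, which may differ from how the real Interleave pads.

```python
from itertools import zip_longest

_MISSING = object()  # placeholder for "this clip has run out of frames"


def select_every(frames, step, offset):
    """Keep frames offset, offset+step, offset+2*step, ..."""
    return frames[offset::step]


def interleave(*clips):
    """Take one frame from each clip in turn, skipping exhausted clips."""
    out = []
    for group in zip_longest(*clips, fillvalue=_MISSING):
        out.extend(f for f in group if f is not _MISSING)
    return out
```

Splitting with `select_every(frames, 4, k)` for k = 0..3 and joining with `interleave` restores the original order for any clip length, whereas a split into contiguous pieces with Trim needs the total frame count to compute the cut points.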
Old 2nd December 2011, 07:39   #18  |  Link
SAPikachu
Registered User
 
Join Date: Aug 2007
Posts: 218
Quote:
Originally Posted by -Vit- View Post
Ah yes, because it splits into interleaved sequences... Would it be difficult to have it split into several contiguous chunks instead? Or is there some other reason not to do that?
Splitting the whole clip into big chunks doesn't make sense, as we can't get parallelism within the avs filter chain that way. But I think we can split the clip into small chunks (32 frames per chunk, for example) and use them with ThreadRequest. This may reduce duplicated processing and increase speed (though I'm afraid this method will never be faster than SetMTMode, since the overhead is much bigger). I need to make some new filters for that. Again, when I have time...
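
A rough estimate of what small chunks would save, as a Python sketch. It assumes chunks of `chunk` frames are handed to the branches round-robin and that a temporal filter of radius r needs r extra source frames on each side of a chunk. This models the idea proposed above, not any existing MP_Pipeline feature, and `chunked_branch_cost` is an invented name.

```python
def chunked_branch_cost(num_frames, branches, radius, chunk=32):
    """Total source frames processed across all branches (chunked split)."""
    total = 0
    for k in range(branches):
        needed = set()
        # Branch k handles chunks k, k+B, k+2B, ... (round-robin assignment).
        for start in range(k * chunk, num_frames, branches * chunk):
            lo = max(0, start - radius)
            hi = min(num_frames, start + chunk + radius)  # exclusive end
            needed.update(range(lo, hi))
        total += len(needed)
    return total
```

For 1024 frames, 4 branches, a radius of 2 and 32-frame chunks, the model gives 1148 processed source frames against 1024 for a single process, so only the frames at chunk borders are duplicated. The per-chunk hand-over cost is not modelled, which is the overhead concern mentioned above.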
Old 6th December 2011, 11:22   #19  |  Link
pbristow
Registered User
 
Join Date: Jun 2009
Location: UK
Posts: 263
Ooh!

So does this mean I can at last do:

Code:
LoadPlugin("MP_Pipeline.dll")
LoadPlugin("PB_3D_tools.dll")

AVISource("some_anaglyph_3D_thing.avi")

MP_Pipeline("""
global LeftEye = ExtractOneSide(Eye="Left")
### ###
global RightEye = ExtractOneSide(Eye="Right")
""")

StackHorizontal(RightEye,LeftEye)
...to create my cross-eye 3d versions from anaglyph 3D stuff in half the time?

I'm assuming "last" is passed through in the normal way. Have I got the right idea about getting data back from the processes? Do LeftEye and RightEye need to be global variables, and if so, where should they be defined: inside MP_Pipeline, outside, or both?
Old 6th December 2011, 12:14   #20  |  Link
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,431
Quote:
Originally Posted by pbristow View Post
Code:
...
MP_Pipeline("""
global LeftEye = ExtractOneSide(Eye="Left")
### ###
global RightEye = ExtractOneSide(Eye="Right")
""")

StackHorizontal(RightEye,LeftEye)
I don't think that will work, since:
Quote:
Originally Posted by SAPikachu View Post
* Since each process have its own script environment, all script variables and loaded plugins won't be inherited, they must be re-initialized if needed
I assume this also applies the other way, so the outer script does not inherit any variables set by the processes.

I expect also that the script in quotes must return a clip, and yours doesn't.

Tags
avisynth, multi-process, pipeline
