pbristow
18th June 2011, 11:09
Since the topic of "yes, but why can't we have multi-threading *too*?" keeps cropping up in other development threads, I thought it might be useful to create a general roundup of the situation. If it's useful enough, it might be a candidate for a sticky...?
Current multithreading options, in Avisynth versions up to 2.5.8:
---------------------------------------------------------------
1. SetMTMode - Works by creating a shared cache so that several instances of each filter(-chain) can be created on separate threads, without duplicating the processing needed to fetch the frames they request. Essentially, this is "temporal" or "frame-wise" multi-threading.
- Requirements: an MT-enabled build of Avisynth.
- Pros: Can speed up the operation of most filters; avoids "split screen" artefacts.
- Cons: Interferes with operation of temporal filters such as TemporalSoften, MVTools, etc.; Requires several different modes to cope with different cases; Not always easy to understand what it's doing when debugging.
- Known bugs/issues: (to be added)
2. The MT() function - Works by splitting each frame of video either vertically or horizontally into strips, each strip being processed by one thread. This is "spatial" multi-threading.
- Requirements: The MT plugin plus an MT-enabled build of Avisynth.
- Pros: Easy to visualise and understand what's happening
- Cons: Interferes with operation of spatial filters (large blurs, de-blocking etc.); Interferes with block-matching across the strip boundaries (MVTools etc.).
- Known bugs/issues: (to be added)
3. The MTi() function - Works by creating two threads and passing "upper" fields to one, "lower" fields to the other. This is a specialised case of temporal multithreading.
- Requirements: The MT plugin plus an MT-enabled build of Avisynth.
- Pros: For truly interlaced material that will *remain* interlaced, neatly exploits the need to process the fields independently of each other.
- Cons: Does not allow processing of one set of fields to exploit data from the other (e.g. high quality de-interlacing);
Limited applicability.
- Known bugs/issues: (to be added)
4. Custom-implemented multithreading, inside specific functions.
- Requirements: An MT-enabled version of the relevant plug-in (which may not exist (yet)).
- Pros: The developer of the function has full control of what gets multithreaded and how, can diagnose the problems that relate to their specific case, and can create efficient, targeted solutions for those problems. (E.g. if spatial multi-threading is used, the optimum overlap can be determined more reliably, or the overlap merging scheme can be tailored to ignore pixels that will not be significant to the output, etc.);
Entirely new or case-specific methods of multithreading can be applied that may be more efficient than the obvious ones, (e.g. One thread for each colour plane; Only multi-threading computation of output pixels/blocks that match a computed mask; etc.)
Does not require any change to Avisynth itself, so can be used immediately with new/alpha/beta versions which do not yet have any built-in multithreading.
- Cons: More work for plugin developers; Have to wait for developer to create an mt version of the filter; Filter developer has to learn & think about appropriate mt techniques; Filters become more complex and harder to debug.
- Known bugs/issues: (These will be specific to individual filters.)
5. The ThreadRequest() function (See: http://forum.doom9.org/showthread.php?t=154886)
This enables a chain of filters to be split "mid-way", causing everything before ThreadRequest() to execute in a separate thread from everything after it.
- Requirements: ThreadRequest plugin; (does it also require an MT build of Avisynth?)
- Pros: Well suited (in theory) to long, but simple, linear filter chains;
Doesn't require spatial splitting of video data, so avoids joining/overlap artefacts;
Doesn't require temporal splitting of data, so avoids duplicated GetFrames/Caching issues/need for SetMTMode;
- Cons: Poorly documented; Requires careful analysis of script to determine best usage (which is hard to do without good documentation!);
- Known bugs/issues: Reported as either crashing or slowing to a crawl after a certain point in processing.
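To make the overlap issue from options 2 and 4 concrete, here is a small Python sketch of spatial multi-threading. It is only an illustration of the idea, not how the MT plugin is implemented: the frame is a plain list of rows, and vertical_blur and blur_in_strips are made-up names. Each strip is blurred on its own thread, with a configurable number of extra rows borrowed from its neighbours and discarded again afterwards.

```python
from concurrent.futures import ThreadPoolExecutor

def vertical_blur(rows, radius):
    """Average each pixel with its vertical neighbours, clamping at the edges."""
    height = len(rows)
    out = []
    for y in range(height):
        lo, hi = max(0, y - radius), min(height - 1, y + radius)
        window = rows[lo:hi + 1]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

def blur_in_strips(rows, radius, strips, overlap):
    """Blur horizontal strips on separate threads, then rejoin them."""
    height = len(rows)
    bounds = [(i * height // strips, (i + 1) * height // strips)
              for i in range(strips)]

    def work(bound):
        top, bottom = bound
        # Borrow `overlap` extra rows from the neighbouring strips.
        padded_top = max(0, top - overlap)
        padded_bottom = min(height, bottom + overlap)
        blurred = vertical_blur(rows[padded_top:padded_bottom], radius)
        # Discard the borrowed rows so the strip keeps its original size.
        start = top - padded_top
        return blurred[start:start + (bottom - top)]

    with ThreadPoolExecutor(max_workers=strips) as pool:
        parts = list(pool.map(work, bounds))
    return [row for part in parts for row in part]

# A synthetic 8x16 "frame" whose brightness rises steadily from top to bottom.
frame = [[x * 7 + y * 13 for x in range(8)] for y in range(16)]

reference = vertical_blur(frame, radius=2)
no_overlap = blur_in_strips(frame, radius=2, strips=4, overlap=0)
with_overlap = blur_in_strips(frame, radius=2, strips=4, overlap=2)

print(no_overlap == reference)    # strips blurred in isolation
print(with_overlap == reference)  # strips blurred with borrowed rows
```

With no overlap, the rows next to each strip boundary are computed from a truncated neighbourhood, which is the "split screen" artefact. With an overlap at least as large as the filter's reach, the joined result matches the single-threaded one. Note that Python threads will not actually speed up this pure-Python loop; the threads are here only to show the structure.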
Suggested alternative/future methods:
-------------------------------------
1. Exploiting natural parallelism within scripted tasks, e.g.:
Movie3d = AVISource("whatever.avi")
LeftEye = Movie3d.SomeComplexFunction(Left=True)
RightEye = Movie3d.SomeComplexFunction(Left=False)
StackHorizontal(RightEye,LeftEye) # Cross your eyes to view!
Becomes:
Movie3d = AVISource("whatever.avi")
ParallelProcess(threads=2, \
"LeftEye = Movie3d.SomeComplexFunction(Left=True)", \
"RightEye = Movie3d.SomeComplexFunction(Left=False)" \
)
StackHorizontal(RightEye,LeftEye) # Cross your eyes to view!
- Pros: Should be simple to implement;
Interestingly versatile: Using this method it would be possible for the *user* to effectively re-implement the MT() function, simply by cropping their source video into sections before passing them to ParallelProcess(). They would also then have more control over where the boundaries between strips/blocks fell.
Similarly, MTi() becomes something like:
Function MTi(clip c, string filter) {
    # ParallelProcess() is the proposed function from above; it does not exist yet.
    c = SeparateFields(c)
    Upper = c.SelectEven()
    Lower = c.SelectOdd()
    ParallelProcess(2, "Up2 = Upper." + filter, "Low2 = Lower." + filter)
    Interleave(Up2, Low2).Weave()
}
- Cons: Limited *direct* applicability to more linear processing tasks (but see above);
Problem of how to return multiple clips as output. The syntax suggested here breaks with Avisynth norms, in that it has to return multiple results via, effectively, auto-generated global variables. (What happens if a script containing this method is called by another script that also uses this method? A clash of global variables?) Possible alternative: multiple outputs are returned as a single clip by stacking them vertically (computationally efficient) or horizontally (may be necessary if results have different widths).
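For comparison, here is a rough Python sketch of the ParallelProcess() idea in which the results come back as return values instead of auto-generated globals. Everything in it is hypothetical: parallel_process and some_complex_function are stand-ins, "clips" are just lists of strings, and Avisynth script has no tuple type, so this shows the shape of the idea rather than something that could be dropped into a script as-is.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_process(*tasks):
    """Run each zero-argument task on its own thread; return results in order."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        return tuple(future.result() for future in futures)

def some_complex_function(movie, left):
    # Stand-in for the real per-eye processing: tag every frame with its eye.
    eye = "L" if left else "R"
    return [f"{eye}:{frame}" for frame in movie]

movie3d = ["frame0", "frame1", "frame2"]

# Each result is handed back to the caller, so nothing is written to globals.
left_eye, right_eye = parallel_process(
    lambda: some_complex_function(movie3d, left=True),
    lambda: some_complex_function(movie3d, left=False),
)

# Equivalent of StackHorizontal(RightEye, LeftEye) for these toy "clips".
stacked = [f"{r}|{l}" for r, l in zip(right_eye, left_eye)]
print(stacked[0])
```

Because the results only exist in the caller's own variables, a script using this method could be called from another script using it without the two treading on each other's names.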
... OK, what have I missed/got wrong? :)