23rd January 2017, 10:29
jpsdr
JPSDR Avisynth's plugins pack

Merge of Avisynth's plugins

Finally, the All Might, One for All plugin is out, merging all the Avisynth plugins I've made.

Its main purpose is to reduce the number of threads created when you use more than one plugin: with a single DLL file, only one threadpool is created, whereas several DLLs each create their own threadpool and therefore more threads.
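To put rough numbers on this, here is a minimal Python sketch of the thread count. It assumes (this is an illustration, not something measured) that each threadpool holds one thread per logical CPU, 8 in the example machine used later in this post; `pool_threads` is a hypothetical helper, not part of the plugin.

```python
def pool_threads(dll_count: int, threads_per_pool: int = 8) -> int:
    """Threads created when every DLL owns its own threadpool.

    Assumption: each pool holds `threads_per_pool` threads
    (one per logical CPU on the example 4-core/8-thread machine).
    """
    return dll_count * threads_per_pool


# Three separate plugin DLLs versus the single merged DLL.
separate = pool_threads(3)
merged = pool_threads(1)
print(f"separate DLLs: {separate} threads, merged DLL: {merged} threads")
```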

Current version: 3.3.5

Merge of :
AutoYUY2: 4.1.8
NNEDI3: 0.9.4.63
ResampleMT/Desample: 2.3.9
aWarpSharpMT: 2.1.9
HDRTools: 1.0.5

Sources are here.
Binaries are here.

Version history
3.3.5 : Update to new AVS+ headers and fix awarp colorspace issue.
3.3.4 : Update of resample with DTL pull-request.
3.3.3 : Update to new AVS+ headers.
3.3.2 : Fix the value of a BT.1886 parameter in HDRTools.
3.3.1 : Update on threadpool, no user limit (except memory).
3.3.0 : Update of HDRTools with ACES tonemap.
3.2.8 : Fix on threadpool, using prefetch parameter caused a hang. Add negative prefetch for trimming, read Multithreading.txt or Multithreading chapter here. Fix on resample.
3.2.7 : Fix on awarpsharp, fix on threadpool.
3.2.6 : Fix on resampler, fix on awarpsharp, new resamplers added in nnedi3_rpow2, update avs headers.
3.2.5 : Small fix on resampler and new kernel function.
3.2.4 : New resamplers and also added in nnedi3_rpow2 resizers.
3.2.3 : Fix for nnedi3 and new function in resample.
3.2.2 : Fix issue introduced in nnedi3.
3.2.1 : Fix on aWarpSharpMT, update to new avisynth headers.
3.2.0 : Update to HDRTools 0.6.0 (add of BT2446 A & C methods).
3.1.3 : Update HDRTools (add Crosstalk parameter and EOTF for SDR).
3.1.2 : Minor code change after threadpool update, fix in the number of threads, fix in resampler to perfectly match avs+ output.
3.1.0 : Update in threadpool, add ThreadLevel parameter.
3.0.0 : Add of HDRTools.
2.2.0 : Update Matrix Class, fix bug in nnedi3 for YUY2, add 16 bits support on AutoYUY2.
2.1.1 : Optimized CPU placement if SetAffinity=true for prefetch>1, SetAffinity back to default false.
2.1.0 : 16 bits aWarpSharp, merge of new resample code, some fixes.
2.0.5 : Fix aWarp/aWarp4 default settings.
2.0.4 : Fix (for good this time) crash on aBlur x64.
2.0.3 : Fix crash on aBlur and clarify some aWarp4 modes (aWarpSharpMT part).
2.0.2 : Fix on aWarpSharp and Resample.
2.0.1 : Fix on AutoYUY2.
2.0.0 : Add of aWarpSharpMT, small update on the others.
1.2.1 : Update to ResampleMT v2.0.1, fix in Desample functions.
1.2.0 : Update to ResampleMT v2.0.0 with Desample functions.
1.1.10 : Fix resample issue of doing nothing and fix possible deadlock in threadpool.
1.1.9 : Forgot to add AVX path on planarframe of NNEDI3.
1.1.8 : Fix threadpool, add AVX path in AutoYUY2 and NNEDI3.
1.1.7 : Same minor change on all filters.
1.1.6 : Minor update on threadpool and minor fix on NNEDI3.
1.1.5 : Minor update on threadpool.
1.1.4 : Fix YUYV planarframe NNEDI3 crash.
1.1.3 : Fix NNEDI3 x64 crash.
1.1.2 : Update of NNEDI3, and small fix.
1.1.1 : Update of all plugins (most significant is fix of range issue in ResampleMT).
1.1.0 : "Big" update of NNEDI3, and also update of AutoYUY2.
1.0.1 : Update of NNEDI3 and ResampleMT.
1.0.0 : First release.

For more information on the changes, check each filter's thread.

==================================================================

Multi-threading information

CPU example case : 4 cores with hyper-threading.

If you leave all the multi-threading parameters at their default values, the settings are meant to be "optimal" when you're not using prefetch or when you're running standard Avisynth: all the logical CPUs will be used.
If you set SetAffinity to true, the threads are allocated to the CPUs contiguously: physical CPU 1 gets threads (0,1), ... physical CPU 4 gets threads (6,7), allowing optimal cache use. Run tests to see what works best for you.
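The contiguous placement described above can be sketched in a few lines of Python. This is only a model of the mapping stated in the text (two threads per physical core, cores numbered from 1), not the plugin's actual affinity code; `contiguous_affinity` is a hypothetical name.

```python
def contiguous_affinity(physical_cores: int,
                        threads_per_core: int = 2) -> dict[int, tuple[int, ...]]:
    """Map each physical CPU (1-based, as in the text) to the thread indices it hosts."""
    layout = {}
    for core in range(physical_cores):
        first = core * threads_per_core
        layout[core + 1] = tuple(range(first, first + threads_per_core))
    return layout


# Example case: 4 cores with hyper-threading.
for core, threads in contiguous_affinity(4).items():
    print(f"physical CPU {core}: threads {threads}")
```

For the example machine this gives threads (0, 1) on physical CPU 1 through threads (6, 7) on physical CPU 4.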

Now, if you are using prefetch in your script, things are different!
If you're using it with the maximum number of CPUs (8 in our example case), you can still run tests, but I would strongly advise disabling the internal multi-threading by using threads=1. In this case, no threadpool is created, and all the other multi-threading-related filter parameters have no effect, not even prefetch.
If you're using prefetch in your script with a value lower than your CPU count, you may want to try mixing external and internal multi-threading: set the internal multi-threading to a lower number of threads, and set the filter's prefetch parameter. This parameter sets the number of internal threadpools created; it is best to match the script's prefetch value. If you don't set it (leaving it at 1), or set it lower than the prefetch value in your script, several instances (or GetFrame calls) will be created, but they won't run efficiently: each instance (or GetFrame call) will spend time waiting for a threadpool to become available, because not enough were created.
Unfortunately, as things stand, I have no way of knowing the prefetch value used in the Avisynth script at the time I need the information; this is why you have to set the prefetch parameter on the filter.
In our example CPU case, you can have things like:
Code:
filter(...,threads=1)
prefetch(8)
or
Code:
filter(...,threads=2,prefetch=4)
prefetch(4)
or
Code:
filter(...,threads=4,prefetch=2)
prefetch(2)
or even
Code:
filter(...,threads=3,prefetch=4)
prefetch(4)
if you want to boost and go a little over your total CPU number.
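As a quick check of the arithmetic behind these four combinations, the Python sketch below multiplies the internal thread count by the script's prefetch value and compares the result with the 8 logical CPUs of the example machine. It is a simplified model: it assumes the filter's prefetch matches the script's, and `thread_budget` is a hypothetical helper.

```python
def thread_budget(threads: int, prefetch: int,
                  logical_cpus: int = 8) -> tuple[int, float]:
    """Return (busy threads, load relative to the logical CPU count).

    Assumption: each of the `prefetch` concurrent frames is processed
    by `threads` threads (with threads=1 there is no internal pool, and
    the frame is handled by the Avisynth prefetch thread itself).
    """
    total = threads * prefetch
    return total, total / logical_cpus


for threads, prefetch in [(1, 8), (2, 4), (4, 2), (3, 4)]:
    total, load = thread_budget(threads, prefetch)
    print(f"threads={threads}, prefetch={prefetch}: "
          f"{total} threads, {load:.0%} of logical CPUs")
```

The first three combinations land exactly on 8 threads; the last one reaches 12, which is the "little over" case.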

Also, if your prefetch is not higher than your number of physical cores, you can try setting SetAffinity to true, but in that case you have to set MaxPhysCore to false. The threads of each pool will then be assigned to the CPUs in steps.
For example, in our case:
Code:
filter(...,threads=2,prefetch=4,SetAffinity=true,MaxPhysCore=false)
prefetch(4)
This will create 4 pools of 2 threads, placed as follows:
pool[0] : threads(0 -> 1) on CPU 1.
pool[1] : threads(0 -> 1) on CPU 2.
pool[2] : threads(0 -> 1) on CPU 3.
pool[3] : threads(0 -> 1) on CPU 4.
Code:
filter(...,threads=4,prefetch=2,SetAffinity=true,MaxPhysCore=false)
prefetch(2)
This will create 2 pools of 4 threads, placed as follows:
pool[0] : threads(0 -> 1) on CPU 1.
pool[0] : threads(2 -> 3) on CPU 2.
pool[1] : threads(0 -> 1) on CPU 3.
pool[1] : threads(2 -> 3) on CPU 4.
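The two layouts above can be reproduced with a small Python model. It assumes the pools are laid out one after another and that every physical CPU hosts two consecutive threads; this matches both listings but is not taken from the plugin's source, and `pool_layout` is a hypothetical name.

```python
def pool_layout(threads: int, prefetch: int,
                physical_cores: int = 4,
                threads_per_core: int = 2) -> list[tuple[int, int, int]]:
    """Return (pool, thread index in pool, physical CPU) for every thread.

    Model assumption: threads are numbered globally pool after pool,
    and consecutive global numbers fill one physical CPU (1-based) at a time.
    """
    if threads * prefetch > physical_cores * threads_per_core:
        raise ValueError("more threads than logical CPUs; this model does not cover that case")
    layout = []
    for pool in range(prefetch):
        for t in range(threads):
            global_index = pool * threads + t
            layout.append((pool, t, global_index // threads_per_core + 1))
    return layout


for threads, prefetch in [(2, 4), (4, 2)]:
    print(f"threads={threads}, prefetch={prefetch}")
    for pool, t, cpu in pool_layout(threads, prefetch):
        print(f"  pool[{pool}] : thread {t} on CPU {cpu}")
```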

Negative prefetch
The option to use a negative prefetch value has been added to help tune the prefetch parameter to its optimal value. The filter will throw an error if the number is not high enough to avoid waiting when requesting an internal threadpool. For this to work properly, you have to put a negative prefetch on ALL the filters in your script, including ALL instances of the same filter.

Example:
Code:
filter(...,threads=2,prefetch=-2)
prefetch(2)
You'll see an error.

But with :
Code:
filter(...,threads=2,prefetch=-3)
prefetch(2)
You'll see no error, so the optimal is :
Code:
filter(...,threads=2,prefetch=3)
prefetch(2)
Once you've tuned it, put back a positive value.
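The tuning procedure amounts to a simple upward search, sketched here in Python. In practice you edit the script by hand and reload it; `probe_ok` is a hypothetical stand-in for "load the script with prefetch=-n and see whether the filter throws an error", and `example_probe` just hard-codes the outcome of the example above.

```python
from typing import Callable


def tune_prefetch(probe_ok: Callable[[int], bool],
                  start: int = 1, limit: int = 64) -> int:
    """Return the smallest n for which a run with prefetch=-n raised no error."""
    for n in range(start, limit + 1):
        if probe_ok(n):
            return n
    raise RuntimeError(f"no value up to {limit} avoided the error")


def example_probe(n: int) -> bool:
    # Stand-in for real runs: in the example, -2 errors and -3 does not.
    return n >= 3


best = tune_prefetch(example_probe, start=2)
# Final setting uses the positive value found by the search.
print(f"filter(...,threads=2,prefetch={best})")
```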

Last edited by jpsdr; 20th November 2023 at 21:58.