Old 24th November 2017, 12:54   #1  |  Link
jpsdr
Updated aWarpSharp version, with some new features

I've updated the aWarpSharp plugin, adding:
- New parameters for better tuning.
- Internal multi-threading.

This version works on all AVS+ versions and on all AviSynth 2.6.x versions.

Current version : 2.1.9

Sources are here.
Binaries are here.

Version history
2.1.9 : Update to new AVS+ headers and fix awarp colorspace issue.
2.1.8 : Update to new AVS+ headers.
2.1.7 : Update on threadpool, no user limit (except memory).
2.1.6 : Fix on threadpool, using the prefetch parameter created a hang. Add negative prefetch for trimming, read Multithreading.txt or the Multithreading chapter here.
2.1.5 : Fix aSobel crash, fix on threadpool.
2.1.4 : Fix aWarp4 issues, update to new avisynth headers.
2.1.3 : Fix aWarp issue on 16 bits, update to new avisynth headers.
2.1.2 : Minor code change after threadpool update, fix in the number of threads.
2.1.0 : Update of the threadpool, add ThreadLevel parameter.
2.0.1 : Optimized CPU placement if SetAffinity=true for prefetch>1, SetAffinity back to default false.
2.0.0 : Add 16 bits support (thanks for pinterf's help), filter is MT_NICE.
1.0.4 : Fix aWarp/aWarp4 default settings.
1.0.3 : Fix (for good... ) bug in aBlur x64 asm.
1.0.2 : Fix bug in aBlur x64 asm code, aWarp4 doesn't copy planes on some chroma modes (doc updated).
1.0.1 : Fix depthVC default value, fix depthC on aWarpSharp (thanks real.finder).
1.0.0 : First release.

==================================================================

The functions inside this plugin are :

aWarpSharp2(int thresh,int blur,int type,int depth,int chroma,int depthC,string "cplace",
int blurV,int depthV,int depthVC,int blurC,int blurVC,int threshC,
int threads,bool logicalCores,bool MaxPhysCore,bool SetAffinity,bool sleep,int prefetch,int ThreadLevel)

aSobel(int thresh,int chroma,int threshC,
int threads,bool logicalCores,bool MaxPhysCore,bool SetAffinity,bool sleep,int prefetch,int ThreadLevel)

aBlur(int blur,int type,int chroma,int blurV,int blurC,int blurVC,
int threads,bool logicalCores,bool MaxPhysCore,bool SetAffinity,bool sleep,int prefetch,int ThreadLevel)

aWarp(edge_mask_clip,int depth,int chroma,int depthC,string "cplace",int depthV,int depthVC,
int threads,bool logicalCores,bool MaxPhysCore,bool SetAffinity,bool sleep,int prefetch,int ThreadLevel)

aWarp4(edge_mask_clip,int depth,int chroma,int depthC,string "cplace",int depthV,int depthVC,
int threads,bool logicalCores,bool MaxPhysCore,bool SetAffinity,bool sleep,int prefetch,int ThreadLevel)


The parameters are exactly the same as those of the original aWarpSharp functions, and in the same order, so they are totally backward compatible.

Check the TXT file for a more complete description.
The new parameters are added after all the original parameters. They allow finer tuning of the processing, with separate values for horizontal and vertical, and different values for Y/C where this was not originally available. The parameters are the following:

threshC: Default thresh. Sets the limit for edge detection on the chroma planes.

blurV: Default blur. If blurV is different from blur, the horizontal process will be
done only on blur passes, and the vertical process will be done only on blurV passes.

blurC: Number of blur passes for the chroma, default (blur+1)/2 passes.

blurVC: Default blurC. If blurVC is different from blurC, the horizontal process will be
done only on blurC passes, and the vertical process will be done only on blurVC passes.

depthV: Default depth. depth sets the warping strength for horizontal, and depthV
sets the warping strength for vertical.

depthVC: Default depthC. depthC sets the chroma warping strength for horizontal, and depthVC
sets the chroma warping strength for vertical.

threads -

Controls how many threads will be used for processing. If set to 0, threads will
be set equal to the number of detected logical or physical cores, according to the logicalCores parameter.

Default: 0 (int)

logicalCores -

If threads is set to 0, it specifies whether the number of threads will be the number
of logical CPUs (true) or the number of physical cores (false). If your processor doesn't
have hyper-threading or threads<>0, this parameter has no effect.

Default: true (bool)
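
As a rough illustration of the rule described for threads and logicalCores, here is a minimal Python sketch. It is not taken from the plugin source: the function name resolve_threads and the n_logical / n_physical arguments are assumptions made for the example.

```python
def resolve_threads(threads: int, logical_cores: bool,
                    n_logical: int, n_physical: int) -> int:
    """Return the number of worker threads the description above implies.

    Hypothetical helper: the real plugin detects the CPU counts itself.
    """
    if threads != 0:
        # An explicit value is used as-is; logicalCores is ignored.
        return threads
    # threads=0: pick the logical or the physical count.
    return n_logical if logical_cores else n_physical


# Example CPU: 4 physical cores with hyper-threading (8 logical CPUs).
print(resolve_threads(0, True, 8, 4))
print(resolve_threads(0, False, 8, 4))
print(resolve_threads(3, True, 8, 4))
```

On a CPU without hyper-threading both counts are equal, which is why logicalCores has no effect there.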

MaxPhysCore -

If true, the thread distribution will use as many physical cores as possible. If your
processor doesn't have hyper-threading or the SetAffinity parameter is set to false,
this parameter has no effect.

Default: true (bool)

SetAffinity -

If this parameter is set to true, the pool of threads will pin each thread to a specific core,
according to the MaxPhysCore parameter. If set to false, it's left to the OS.

Default: false (bool)

sleep -
If this parameter is set to true, once the filter has finished one frame, the threads of the
threadpool will be suspended (instead of still running but waiting for an event), and resumed when
the next frame is processed. If set to false, the threads of the threadpool are always
running and waiting for a start event, even between frames.

Default: false (bool)

prefetch -
This parameter allows creating more than one threadpool, to avoid mutual resource access
lock/wait if "prefetch" is used in the avs script.
0 : Will be set automatically to the prefetch value used in the script. Well... that's what I wanted
to do, but for now it's not possible for me to get this information when I need it, so, for
now, 0 will result in 1. For now, if you're using "prefetch" in your script, put the same
value in this parameter.

ThreadLevel -
This parameter will set the priority level of the threads created for the processing (internal
multithreading). No effect if threads=1.
1 : Idle level.
2 : Lowest level.
3 : Below level.
4 : Normal level.
5 : Above level.
6 : Highest level.
7 : Time critical level (WARNING !!! use this level at your own risk)

Default : 6

The logicalCores, MaxPhysCore, SetAffinity and sleep parameters specify how the pool of threads will be created and handled, allowing everyone to tune it, if necessary, according to their configuration.

==================================================================

Multi-threading information

CPU example case : 4 cores with hyper-threading.

If you leave all the multi-threading parameters at their default values, the filter is set to be "optimal" when you're not using prefetch or when you are under standard AviSynth: all the logical CPUs will be used.
If you set SetAffinity to true, it will allocate the threads on the CPUs contiguously. Physical CPU 1 will have threads (0,1), ... physical CPU 4 will have threads (6,7), allowing optimal cache use. Run tests to see what's best for you.

Now, if you are using prefetch in your script, things are different!
If you're using it with the maximum number of CPUs (8 in our example case), you can still run tests, but I would strongly advise disabling the internal multi-threading by using threads=1. In this case, no threadpool is created, and all the other multi-threading related filter parameters have no effect, even prefetch.
If you're using prefetch in your script with a value lower than your number of CPUs, you may want to try mixing the external and internal multi-threading, setting the internal multi-threading to a lower number of threads and setting the prefetch parameter of the filter. This parameter sets the number of internal threadpools created; it is best to match the prefetch value of the script. If you don't set it (leaving it at 1) or set a value lower than the prefetch of your script, several instances (or GetFrame calls) will be created, but they will not run efficiently, because each instance (or GetFrame) will spend time waiting for a threadpool to become available if not enough were created.
Unfortunately, as things are now, I have no way of knowing the prefetch value used in the AviSynth script at the time I need the information; this is why you have to use the prefetch parameter of the filter.
In our CPU example case, you can have things like:
Code:
filter(...,threads=1)
prefetch(8)
or
Code:
filter(...,threads=2,prefetch=4)
prefetch(4)
or
Code:
filter(...,threads=4,prefetch=2)
prefetch(2)
or even
Code:
filter(...,threads=3,prefetch=4)
prefetch(4)
if you want to boost and go a little over your total CPU number.
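
The four combinations above come down to simple arithmetic: each of the prefetch instances runs its own pool of threads workers. The Python sketch below only illustrates that bookkeeping; thread_budget is a hypothetical helper, and the default of 8 logical CPUs is the example CPU from this post, not something the plugin exposes.

```python
def thread_budget(threads: int, prefetch: int,
                  logical_cpus: int = 8) -> tuple[int, bool]:
    """Return (total worker threads, True if above the logical CPU count)."""
    total = threads * prefetch
    return total, total > logical_cpus


# The four combinations listed above, on the 4 cores + hyper-threading example.
for threads, prefetch in [(1, 8), (2, 4), (4, 2), (3, 4)]:
    total, over = thread_budget(threads, prefetch)
    note = "over the CPU count" if over else "within the CPU count"
    print(f"threads={threads}, prefetch={prefetch}: {total} threads, {note}")
```

Only the last combination (3 x 4 = 12) goes over the 8 logical CPUs, which is the "boost" case mentioned above.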

Also, if your prefetch is not higher than your number of physical cores, you can try setting SetAffinity to true, but in that case you have to set MaxPhysCore to false. The threads of each pool will then be assigned to CPUs in steps.
For example, in our case:
Code:
filter(...,threads=2,prefetch=4,SetAffinity=true,MaxPhysCore=false)
prefetch(4)
Will create 4 pools of 2 threads, with the following:
pool[0] : threads(0 -> 1) on CPU 1.
pool[1] : threads(0 -> 1) on CPU 2.
pool[2] : threads(0 -> 1) on CPU 3.
pool[3] : threads(0 -> 1) on CPU 4.
Code:
filter(...,threads=4,prefetch=2,SetAffinity=true,MaxPhysCore=false)
prefetch(2)
Will create 2 pools of 4 threads, with the following:
pool[0] : threads(0 -> 1) on CPU 1.
pool[0] : threads(2 -> 3) on CPU 2.
pool[1] : threads(0 -> 1) on CPU 3.
pool[1] : threads(2 -> 3) on CPU 4.
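
The two layouts above are consistent with a simple contiguous mapping: number all pool threads in order and place two of them per physical core. The Python sketch below reproduces both tables under that assumption. It is inferred from the two examples only, not from the plugin source, and pool_placement is a hypothetical name.

```python
def pool_placement(threads: int, prefetch: int,
                   logical_per_core: int = 2) -> list[tuple[int, int, int]]:
    """Return (pool, thread, physical CPU) triples, CPUs numbered from 1.

    Assumes SetAffinity=true, MaxPhysCore=false and contiguous placement.
    """
    placement = []
    for pool in range(prefetch):
        for thread in range(threads):
            logical_index = pool * threads + thread
            cpu = logical_index // logical_per_core + 1
            placement.append((pool, thread, cpu))
    return placement


for pool, thread, cpu in pool_placement(threads=4, prefetch=2):
    print(f"pool[{pool}] thread {thread} on CPU {cpu}")
```

With threads=2 and prefetch=4 the same function puts one whole pool on each physical CPU, as in the first table.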

Negative prefetch
The possibility of setting a negative prefetch has been added, to help tune the prefetch parameter to its optimal value. With a negative value, the filter will throw an error if the number is not high enough to avoid waiting when requesting an internal threadpool. For this to work properly, you have to put a negative prefetch on ALL the filters of your script, and also on ALL instances of the same filter.

Example:
Code:
filter(...,threads=2,prefetch=-2)
prefetch(2)
You'll see an error.

But with :
Code:
filter(...,threads=2,prefetch=-3)
prefetch(2)
You'll see no error, so the optimal is :
Code:
filter(...,threads=2,prefetch=3)
prefetch(2)
Once you've tuned it, put back a positive value.
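
The tuning procedure is a small search: raise the magnitude of the negative value until the error disappears, then use that magnitude as the positive setting. A minimal Python sketch of the loop follows. In practice you edit and reload the script by hand; script_errors is a hypothetical stand-in for "load the script with this filter prefetch and see whether it throws", and example_script_errors simply mimics the example above (-2 fails, -3 passes).

```python
from typing import Callable


def find_filter_prefetch(script_errors: Callable[[int], bool],
                         start: int = 1, limit: int = 64) -> int:
    """Return the smallest n for which prefetch=-n no longer errors."""
    for n in range(start, limit + 1):
        if not script_errors(-n):
            return n  # use prefetch=n (positive) in the final script
    raise ValueError("no working value found up to the limit")


def example_script_errors(filter_prefetch: int) -> bool:
    """Stand-in for the example above: anything below 3 pools errors."""
    return -filter_prefetch < 3


print(find_filter_prefetch(example_script_errors, start=2))
```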

Last edited by jpsdr; 20th November 2023 at 21:50.

Old 24th November 2017, 17:43   #2  |  Link
real.finder
finally

now this filter is very useful; even aBlur can be used alone for dot crawl removal, I used it in DDComb

and if you can add the other blur types that were in the original aWarpSharp by Marc FD it will be more useful, aside from HBD in avs+
__________________
See My Avisynth Stuff

Old 24th November 2017, 18:33   #3  |  Link
jpsdr
I don't know what they are; I first need the original source code. Where can I find it?

Old 24th November 2017, 18:46   #4  |  Link
real.finder
Quote:
Originally Posted by jpsdr
I don't know what they are; I first need the original source code. Where can I find it?

you can ask prunedtree on IRC (Rizon server), the same person you asked about Debilinear

Old 25th November 2017, 09:24   #5  |  Link
jpsdr
Yes, and you've already told me this... I just don't remember on which channel I can find him...

Old 25th November 2017, 15:59   #6  |  Link
real.finder
Quote:
Originally Posted by jpsdr
Yes, and you've already told me this... I just don't remember on which channel I can find him...

in #darkhold and maybe others

Old 26th November 2017, 12:05   #7  |  Link
jpsdr
Yes... Right, now I remember indeed... I'll try to get in touch with him.

Old 26th November 2017, 17:34   #8  |  Link
Motenai Yoda
can you fix the awarp4 chroma bug?
__________________
powered by Google Translator

Old 26th November 2017, 18:01   #9  |  Link
real.finder
Quote:
Originally Posted by Motenai Yoda
can you fix the awarp4 chroma bug?

what bug?

Old 27th November 2017, 10:23   #10  |  Link
ryrynz
Does this now supersede pinterf's version? ...Never mind, the readme clarified it.

"Addition by pinterf on 20160623:
- AviSynth 2.6 interface, Avisynth+ header
- working x64 version
- minor cleanup"

Last edited by ryrynz; 27th November 2017 at 10:27.

Old 27th November 2017, 10:31   #11  |  Link
jpsdr
Quote:
Originally Posted by real.finder
now this filter is very useful; even aBlur can be used alone for dot crawl removal, I used it in DDComb

Isn't a threshold median filter more appropriate for this? (If "dot crawl" is what I think it is; sometimes I'm not sure of the exact English technical term.)

Old 27th November 2017, 14:44   #12  |  Link
real.finder
Quote:
Originally Posted by jpsdr
Isn't a threshold median filter more appropriate for this? (If "dot crawl" is what I think it is; sometimes I'm not sure of the exact English technical term.)

median doesn't work well with dot crawl, and it's slower

Old 27th November 2017, 15:12   #13  |  Link
jpsdr
OK, it wasn't what I thought; "dot" misled me, so it's no surprise that a median is not suited. It is hard to remove, as it should have been handled properly "before", either when demodulating the chroma or, even better, when creating the CVBS signal...

Old 27th November 2017, 22:52   #14  |  Link
Motenai Yoda
Quote:
Originally Posted by real.finder
what bug?

chroma values 2, 5 and 6 are bugged, as it doesn't scale back the chroma/luma channels
maybe it can take them from last

Last edited by Motenai Yoda; 27th November 2017 at 23:04.

Old 28th November 2017, 04:38   #15  |  Link
real.finder
Quote:
Originally Posted by Motenai Yoda
chroma values 2, 5 and 6 are bugged, as it doesn't scale back the chroma/luma channels
maybe it can take them from last

yes, there is a bug with chroma=2, 5 or 6

Code:
ColorBars(pixel_type="yv12")
aWarp4(nnedi3_rpow2(rfactor=2).nnedi3_rpow2(rfactor=2), aSobel().aBlur(), depth=2,chroma=6)

Old 28th November 2017, 09:37   #16  |  Link
jpsdr
Is it a bug I've introduced, or was it also in the original?

Old 28th November 2017, 09:55   #17  |  Link
jpsdr
About the original blur modes, I don't really see the point, as they are very similar from what I understood taking a quick look. The hq mode is the same as R2, with just a slightly more accurate result and less propagation of rounding errors. I'll explain. The current version (both R2 and R6) uses the pavgb instruction, which computes the average of 8-bit data with the following formula: pavgb(A,B) = (A+B+1)>>1.
If you want to compute (A+2*B+C)/4 you can do the following:
A=pavgb(A,C)
result=pavgb(A,B)
But you accumulate 2 successive pavgb operations, and so their rounding errors.
You can also do the following:
Expand A, B and C to 16 bits, compute (on 16 bits) result=(A+2*B+C+2)>>2 and convert back to 8 bits.
This way, you get a slightly more accurate result.
Honestly, I don't see a real interest in putting back the old hq blur mode, and as for the others, they are already implemented with the current R2 and R6.
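
To make the rounding difference concrete, here is a small Python sketch of the two formulas quoted above. It is plain integer arithmetic written for illustration, not the plugin's SIMD code; the function names are assumptions, and Python integers stand in for the 8-bit and 16-bit registers.

```python
def pavgb(a: int, b: int) -> int:
    """Average of two 8-bit values, rounded up: (a + b + 1) >> 1."""
    return (a + b + 1) >> 1


def blur_two_pavgb(a: int, b: int, c: int) -> int:
    """(a + 2*b + c) / 4 approximated with two chained pavgb steps."""
    return pavgb(pavgb(a, c), b)


def blur_16bit(a: int, b: int, c: int) -> int:
    """Same kernel with a single rounding step on widened intermediates."""
    return (a + 2 * b + c + 2) >> 2


def max_difference(limit: int = 32) -> int:
    """Largest gap between the two variants for inputs in [0, limit)."""
    return max(
        abs(blur_two_pavgb(a, b, c) - blur_16bit(a, b, c))
        for a in range(limit)
        for b in range(limit)
        for c in range(limit)
    )


# A=0, B=0, C=1: the exact value is 0.25.
print(blur_two_pavgb(0, 0, 1), blur_16bit(0, 0, 1))  # prints "1 0"
print(max_difference())
```

The chained version rounds up twice, so it can land one code value above the single-rounding result, which is the small accuracy gain described for the hq mode.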

Old 28th November 2017, 18:20   #18  |  Link
real.finder
Quote:
Originally Posted by jpsdr
Is it a bug I've introduced, or was it also in the original?

in the original aWarpSharp2 by SEt

Quote:
Originally Posted by jpsdr
About the original blur modes, I don't really see the point, as they are very similar from what I understood taking a quick look. The hq mode is the same as R2, with just a slightly more accurate result and less propagation of rounding errors. I'll explain. The current version (both R2 and R6) uses the pavgb instruction, which computes the average of 8-bit data with the following formula: pavgb(A,B) = (A+B+1)>>1.
If you want to compute (A+2*B+C)/4 you can do the following:
A=pavgb(A,C)
result=pavgb(A,B)
But you accumulate 2 successive pavgb operations, and so their rounding errors.
You can also do the following:
Expand A, B and C to 16 bits, compute (on 16 bits) result=(A+2*B+C+2)>>2 and convert back to 8 bits.
This way, you get a slightly more accurate result.
Honestly, I don't see a real interest in putting back the old hq blur mode, and as for the others, they are already implemented with the current R2 and R6.

there is a big difference, and the original aWarpSharp is superior to the new aWarpSharp2

see here and here

and hq is not the only one; there are others (fast 1-pass is the default)

Old 28th November 2017, 20:18   #19  |  Link
jpsdr
When I look at the code, the "fast 1-pass" is the current R6 with one pass...

Old 28th November 2017, 21:15   #20  |  Link
real.finder
Quote:
Originally Posted by jpsdr
When I look at the code, the "fast 1-pass" is the current R6 with one pass...

why are there some notable differences then? a bug in the compatibility mapping of the aWarpSharp parameters? and there is already one here

Last edited by real.finder; 28th November 2017 at 21:18.