Slow speed in recursive filter operation


vcmohan
27th October 2004, 04:44
I am trying out a plugin that is applied recursively: each call takes the result of the previous call as its input clip (as when chaining StackHorizontal), plus a new clip2 each time and a bunch of other parameters. Each instance of the filter operates on its own range of frames, i.e. filter1 operates on frames n1 to n2 and the corresponding audio; outside this range it just returns the input frame. The ranges do not overlap.
Even within its own range, except for a few frames, it just returns the frame of the child2 clip.
I find that with six child2 clips it takes an intolerably long time to process. My frame size is 720x480. My PC is an AMD XP 2000+, 256 MB RAM, 40 GB HDD, Windows XP Pro. Is memory the problem? Since for any given frame only child and one child2 will be opened, why should memory be a problem?
Is there some other issue that can slow down processing?
My child clip is a blank clip and child2 is a DirectShowSource clip.

vion11
27th October 2004, 19:20
Recursive functions have an impact on memory and CPU.
It would help us optimize your script if you posted it.

krieger2005
23rd December 2004, 13:17
I have the same problem. I use the following functions:

function ExpandToThin(clip c, int thin){
x=thin%3==0 ? c.expand() : c
x=x.inflate.inflate
q=Logic(c,x,"max")
thin <= 1 ? q.Blur(1.58) : ExpandToThin(q,thin-1)
return last
}

function getSharp(clip c, int sharp_cycles){
sharp_cycles <= 1 ? c : getSharp(c,sharp_cycles-1)
sharpen(0.6)
return last
}

function getBlur(clip c, int blur_cycles){
blur_cycles <= 1 ? c : getBlur(c,blur_cycles-1)
Blur(0.6)
return last
}


"getSharp" calls itself 2 times in my script
"getBlur" calls itself 3 times
"ExpandToThin" calls itself 7 times
I also call "hqdn3d" 2 times in my script. The slowest function should be "hqdn3d", but what I get here is too slow.

For a 50-minute movie I get an estimated time of '1 day, 23 hours, 45 minutes'. This is far too long. Since I used "hqdn3d" before without the recursive functions, I know that it results in a 2-3 fps script, so the 50 minutes should be finished after 8-9 hours, not after almost 2 days...

Maybe someone can help. Maybe there are alternative functions which I can use to produce the same result...
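
As a rough check of the figures in this post, the Python sketch below converts processing speed into encoding time and back. The source frame rate is not stated in the thread, so the 25 fps used here is purely an assumption, and the function names are made up for illustration.

```python
def hours_to_encode(minutes, source_fps, script_fps):
    """Hours needed to process `minutes` of video at `script_fps` frames/second."""
    frames = minutes * 60 * source_fps
    return frames / script_fps / 3600.0


def implied_script_fps(minutes, source_fps, total_seconds):
    """Processing speed implied by a total encoding time in seconds."""
    frames = minutes * 60 * source_fps
    return frames / total_seconds


SOURCE_FPS = 25.0  # ASSUMPTION: the movie's frame rate is not given in the post

# Expected range for a script running at 2-3 fps
slow = hours_to_encode(50, SOURCE_FPS, 2.0)
fast = hours_to_encode(50, SOURCE_FPS, 3.0)

# Speed implied by the reported estimate of 1 day, 23 hours, 45 minutes
estimate_seconds = ((1 * 24 + 23) * 60 + 45) * 60
implied = implied_script_fps(50, SOURCE_FPS, estimate_seconds)

print("2-3 fps script: %.1f to %.1f hours" % (fast, slow))
print("reported estimate implies about %.2f fps" % implied)
```

Under that assumption, the reported estimate corresponds to well under 1 fps, several times slower than the 2-3 fps seen without the recursive functions.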

akupenguin
24th December 2004, 01:03
2-3 fps? What's your CPU? What's the rest of the script? hqdn3d by itself should be much faster than that.

krieger2005
24th December 2004, 15:20
I have an Athlon 1400.

To get more speed I think I will change the functions to something like this (for example, for the blur function):

function getBlur(clip c, int blur_cycles){
b1=c.Blur(0.6)
b2=b1.Blur(0.6)
b3=b2.Blur(0.6)
b4=b3.Blur(0.6)
b5=b4.Blur(0.6)

blur_cycles <= 5 ? Select(blur_cycles-1,b1,b2,b3,b4,b5)
\: getBlur(b5,blur_cycles-5)

return last
}


This is probably not the real solution and maybe someone has a better one, but this way you get 5 times less recursion. What do you think about it?
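
To see what the unrolled version changes, here is a small Python model of the two getBlur variants. It only counts things; it is not AviSynth code. It assumes that AviSynth evaluates script functions once while loading the script and afterwards only runs the resulting filter chain for each frame, so the number of Blur filters in the chain is what matters for per-frame speed. The function names are hypothetical.

```python
def chain_recursive(cycles):
    """Model of the original getBlur: one Blur per recursion level.

    Returns (Blur filters in the final chain, number of function calls).
    Like the script, it applies one Blur even when cycles <= 1.
    """
    if cycles <= 1:
        return 1, 1
    blurs, calls = chain_recursive(cycles - 1)
    return blurs + 1, calls + 1


def chain_unrolled(cycles):
    """Model of the unrolled getBlur: up to five Blurs per recursion level.

    Select(blur_cycles-1, b1..b5) needs an index >= 0, so cycles must be >= 1.
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    if cycles <= 5:
        # Only b1..b<cycles> end up in the chain that is returned
        return cycles, 1
    blurs, calls = chain_unrolled(cycles - 5)
    return blurs + 5, calls + 1


for n in (3, 7, 12):
    print(n, chain_recursive(n), chain_unrolled(n))
```

In this model the unrolled version needs fewer function calls, but the final chain still contains the same number of Blur filters, so under the stated assumption the per-frame work would not change.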

akupenguin
25th December 2004, 10:15
My normal solution to slow filters is to rewrite them in C. That applies especially to your getBlur and getSharp, which could be algorithmically faster in addition to avoiding the inefficiency of recursion in AviSynth.
(A large gaussian blur is more efficiently implemented as a single filter than as chained small blurs.)

And I still want to see your whole script that was going to take 2 days.
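
The remark about chained small blurs can be illustrated in one dimension with the Python sketch below. It assumes that Blur(amount) behaves like a 3-tap kernel [(1-c)/2, c, (1-c)/2] with c = 1/2^amount; treat that kernel as an assumption here, not as a statement about the actual implementation. Repeating such a blur is equivalent to one wider kernel, which a single filter could apply in one pass.

```python
def blur_kernel(amount):
    """ASSUMED 3-tap kernel of one Blur(amount) call."""
    center = 1.0 / 2.0 ** amount
    side = (1.0 - center) / 2.0
    return [side, center, side]


def convolve(a, b):
    """Full discrete convolution of two kernels."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def combined_kernel(amount, cycles):
    """Single kernel equivalent to `cycles` chained blurs."""
    kernel = [1.0]
    for _ in range(cycles):
        kernel = convolve(kernel, blur_kernel(amount))
    return kernel


def apply_kernel(signal, kernel):
    """Filter a 1-D signal, treating samples outside it as zero."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


# Test row with zero margins so edge handling does not affect the comparison
row = [0.0] * 5 + [10.0, 50.0, 20.0] + [0.0] * 5

chained = row
for _ in range(3):
    chained = apply_kernel(chained, blur_kernel(0.6))

single = apply_kernel(row, combined_kernel(0.6, 3))

print(len(combined_kernel(0.6, 3)))  # 7 taps instead of 3 passes of 3 taps
print(max(abs(a - b) for a, b in zip(chained, single)))
```

The combined kernel has 2n+1 taps for n cycles, against 3n taps spread over n separate passes through the frame for the chained version.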

krieger2005
27th December 2004, 11:48
Sorry for the late answer... you know, the whole holidays ;)

The script calls this function only once:
function CleanEdges(clip c, int "dering_luma", int "dering_floor", bool "use_old_chroma", int "sharpen_cycles", int "blur_cycles", int "thin", bool "mask", int "edgeinfluence"){c
EdgeBias = 16
dering_floor= default(dering_floor,13)
dering_luma = default(dering_luma, 78)
use_old_chroma = default(use_old_chroma,true)
mask=default(mask,false)
diff=default(edgeinfluence,20)

faktor=float(c.height)*float(c.width)/400000.0
sharpen_cycles=default(sharpen_cycles,round(faktor*3))
blur_cycles=default(blur_cycles,round(faktor*4))
thin = default(thin,round(faktor*2))

blurr = hqdn3d(50,50,0,0)
clean = deen(thrY=60, thrUV=60)
\.hqdn3d(10,10,1,1)
\.DeGrainMedian(limitY=5,limitUV=5,mode=0)
\.UnSharpMask(20,1,0)
\.RemoveGrain(1)

tmpp = string(dering_floor)

sharpy=YToUV(blurr.UToY,blurr.UToY,blurr.BlankClip(pixel_type="YV12"))
blurry=DeRing_getBlur(blurr,blur_cycles)
shar=DeRing_getSharp(blurr,sharpen_cycles)

edge0a = YV12subtract( shar, blurry).YV12LUT(YExpr="x 128 - abs")
u=edge0a.BlankClip(pixel_type="YV12").UToY
edge1a = YV12subtract( sharpy, blurry)
edge1b = yv12subtract( edge1a,
\ edge1a.xsharpen(155,155),tol=1,wideRange=true )
\ .yv12lut(yexpr="x 128 - abs "+tmpp+" *")
\ .inflate()
\ .FineEdge(EdgeBias).blur(1)
\ .DeRing_ExpandToThin(thin)
\ .greyscale.Ylevels(19,1.6,dering_luma,0,255)

tmpp=string(diff)
a=YV12LUTxy(edge0a,edge1b,YExpr="x "+tmpp+" >= 0 y ?")
z=YV12LUTxy(edge1a,a,YExpr="x 150 >= 0 y ?")

use_old_chroma ? mask ? z :
\ maskedmerge( c, clean, z, Y=3,U=1,V=1, useMMX=true ) :
\ mask ? z :
\ maskedmerge( c, clean, z, Y=3,U=3,V=3, useMMX=true )
}


It is a slightly modified version of the deringing algorithm which "Didée" used in his "IIP". For me there are 3 possible places where the function slows down:
1. blurr : But you said before, and I know it, that this does not make the script sooo slow
2. clean : This could be it!
3. DeRing_getBlur/DeRing_getSharp/DeRing_ExpandToThin (the functions above).

Before, I used "one-pass" getBlur/getSharp... (that is, without recursion) and the function ran at 1-2 fps. After I added the recursion it slowed down.
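
Since the default cycle counts in CleanEdges depend on the frame size, the following Python sketch reproduces that arithmetic for one example size. The 720x480 frame is only an example (it is the size mentioned in the first post, not necessarily the one used with this script), and the rounding helper assumes that AviSynth's round() rounds to the nearest integer.

```python
import math


def avs_round(x):
    """Round a non-negative value to the nearest integer (ASSUMED to match round())."""
    return int(math.floor(x + 0.5))


def default_cycles(width, height):
    """Defaults computed the same way as in CleanEdges when no values are passed."""
    faktor = float(height) * float(width) / 400000.0
    return {
        "sharpen_cycles": avs_round(faktor * 3),
        "blur_cycles": avs_round(faktor * 4),
        "thin": avs_round(faktor * 2),
    }


# Example frame size only; the real source size is not stated in this post
print(default_cycles(720, 480))
```

With these defaults, larger frames get more sharpen, blur and expand cycles, so the filter chain grows with the frame area in addition to each filter having more pixels to process.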