Thread: Avisynth+
14th November 2017, 17:14   #3741
pinterf
Quote:
Originally Posted by Myrsloik
About how much faster is avx2 vs sse2 on a modern cpu in your expr version?
I had to write it blind: I have no AVX2 hardware and could only run it through the SDE emulator. I was only able to test it two days ago on a two-year-old i5 notebook, and the results show that it was worth implementing.
Other speed tests are welcome; that's why the optXXX parameters exist.

Code:
results in fps
avx2: set only in Expr, through the optAvx2 parameter
bits   i5 sse2 32/64 bit   i5 avx2 32/64 bit
8      17.00 19.30         24.63 28.98
16     15.69 17.59         20.38 23.26
32     12.70 13.59         16.03 17.14
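
To put a number on the question: dividing the AVX2 fps by the SSE2 fps from the table gives the speedup per bit depth. A minimal Python sketch, assuming the two figures under each heading are the 32-bit and 64-bit builds (that reading of the "32/64 bit" header is an assumption; the names FPS and avx2_speedup are made up for illustration):

```python
# fps figures copied from the results table above.
# Assumption: each pair is (32-bit build, 64-bit build).
FPS = {
    8:  {"sse2": (17.00, 19.30), "avx2": (24.63, 28.98)},
    16: {"sse2": (15.69, 17.59), "avx2": (20.38, 23.26)},
    32: {"sse2": (12.70, 13.59), "avx2": (16.03, 17.14)},
}


def avx2_speedup(fps):
    """Return {bits: (speedup_32bit_build, speedup_64bit_build)}."""
    return {
        bits: tuple(a / s for a, s in zip(row["avx2"], row["sse2"]))
        for bits, row in fps.items()
    }


if __name__ == "__main__":
    for bits, (x86, x64) in avx2_speedup(FPS).items():
        print(f"{bits:>2}-bit: {x86:.2f}x (32-bit build), {x64:.2f}x (64-bit build)")
```

On these figures the AVX2 path is roughly 1.26x to 1.5x faster than SSE2, with the largest gain at 8 bits. This is one notebook only, so treat it as a rough indication.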
The script was something like this (I removed my commented-out debug and experimental lines):
Code:
lsmashvideosource("13HoursCUT.mp4", format="YUV444P8")
Spline64Resize(486,240) # resize: the final result is a multi-stacked image
src=last
# expr
c8 = CalcTest(src,8, False) 
c16 = CalcTest(src,16, False)
c32 = CalcTest(src,32, False)

# lutxy
c8e  = CalcTest(src,8, True)
c16e  = CalcTest(src,16, True)
c32e = CalcTest(src,32, True)

res8=Diff(c8,src)
res16=Diff(c16,src)
res32=Diff(c32,src)

res8e=Diff(c8e,src)
res16e=Diff(c16e,src)
res32e=Diff(c32e,src)

col1=StackVertical(c8,c16.convertbits(8),c32.convertbits(8))
col2=StackVertical(res8, res16, res32)
col3=StackVertical(c8e,c16e.convertbits(8),c32e.convertbits(8))
col4=StackVertical(res8e, res16e, res32e)
StackHorizontal(col1, col2, col3, col4)

# For the speed test, only one of c8, c16 or c32 from the clips above was used as output.
# Change the Expr parameters in CalcTest, e.g. optSSE2=true, optSingleMode=false, optAvx2=false
c8

Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)).Levels(120, 1, 255-120, 0, 255, coring=false)
}

Function CalcTest(clip src, int bits, bool lut)
{
src
convertbits(bits)
tmp=last
method=Blur(1)

szrp=16
spwr=4
str=100/100.0
sdmplo=4
sdmphi=48
expr_pow = "x y == x x x y - abs "+string(Szrp) +" scaleb / 1 "+string(Spwr)+" / ^ "+string(Szrp) +" scaleb * "+string(str)+" * x y - 2 ^ x y - 2 ^ "
\+string(SdmpLo)+" scaleb scaleb + / * x y - x y - abs / * 1 "+string(SdmpHi)+" scaleb 0 == 0 x y - abs "+string(SdmpHi)+" scaleb / 4 ^ ? + / + ?"

ret=lut ? mt_lutxy(tmp,method, yexpr=expr_pow, U=1,V=1 ) : Expr(tmp,method,expr_pow,"","", optSSE2=true, optSingleMode=false, optAvx2=false) 
return ret
}

Last edited by pinterf; 14th November 2017 at 17:20. Reason: test env clarification