OK, I've run some tests.
The tests were made using a YV12 640x480 AVI file.
The script used is the following:
Code:
SetMemoryMax(64)
AVISource("Test.avi",False,"YV12")
SetPlanarLegacyAlignment(True)
#ConvertToYUY2()
#ConvertToYV16()
#ConvertToYV24()
#ConvertToY8()
nnedi3_rpow2(rfactor=2,cshift="Spline36Resize",fwidth=960,fheight=720,nsize=0,nns=3,qual=2)
It lets me quickly change the output format for testing.
Results with v0.9.4.3 :
Y8,YV12,YUY2,YV24 : fast.
YV16 : Very slow !
So I've modified my code to test several cases: doing an internal YV16->YUY2(process)->YV16 conversion either in nnedi3 or in nnedi3_rpow2.
If YV16->YUY2(process)->YV16 is done in nnedi3, and nnedi3_rpow2 processes YV16 unchanged, the result is slow. First hint: the slowness is not caused by the nnedi3 code.
If YV16->YUY2(process)->YV16 is done in nnedi3_rpow2, and nnedi3 processes YV16 unchanged, the result is fast.
So the issue is within the YV16 processing code of nnedi3_rpow2. The code is:
Code:
for (int i=0; i<ct; ++i)
{
    v = new nnedi3(v.AsClip(),i==0?1:0,true,true,true,true,nsize,nns,qual,etype,pscrn,threads,opt,fapprox,env);
    v = env->Invoke(turnRightFunction,v).AsClip();
    // always use field=1 to keep chroma/luma horizontal alignment
    v = new nnedi3(v.AsClip(),1,true,true,true,true,nsize,nns,qual,etype,pscrn,threads,opt,fapprox,env);
    v = env->Invoke(turnLeftFunction,v).AsClip();
}
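To make the geometry of that loop concrete, here is a small standalone C++ sketch. It is not the plugin's code and does not use the AviSynth API: all the names (Plane, Frame422, make422, doubleHeight, onePass, ...) are made up for illustration, and it assumes that nnedi3 called with dh=true doubles the height and leaves the width alone. It only tracks plane sizes of a 4:2:2 planar frame through one pass, and checks whether a 90-degree turn could reuse the chroma plane as a plain transpose. It says nothing about where the time is actually spent.

```cpp
#include <cassert>

// Width/height of one plane, in samples.
struct Plane { int w; int h; };

// Luma and chroma plane sizes of a 4:2:2 planar (YV16-style) frame.
struct Frame422 { Plane luma; Plane chroma; };

// 4:2:2 layout: chroma is halved horizontally only, so width must be even.
inline Frame422 make422(int w, int h) {
    assert(w > 0 && h > 0 && w % 2 == 0);
    return Frame422{ Plane{w, h}, Plane{w / 2, h} };
}

// ASSUMPTION: models nnedi3 with dh=true as "height x2, width unchanged".
inline Frame422 doubleHeight(const Frame422& f) {
    return make422(f.luma.w, f.luma.h * 2);
}

// A pure 90-degree turn of a single plane just swaps its dimensions.
inline Plane turned(const Plane& p) { return Plane{p.h, p.w}; }

// Chroma size that a 4:2:2 frame must have once its luma has been turned.
inline Plane chromaRequiredAfterTurn(const Frame422& f) {
    return make422(f.luma.h, f.luma.w).chroma;
}

// True when the turned chroma plane already has the required size,
// i.e. when no chroma resampling would be needed to stay in 4:2:2.
inline bool turnIsPureTranspose(const Frame422& f) {
    const Plane got  = turned(f.chroma);
    const Plane need = chromaRequiredAfterTurn(f);
    return got.w == need.w && got.h == need.h;
}

// One pass of the loop: double height, turn, double height, turn back.
inline Frame422 onePass(const Frame422& in) {
    const Frame422 a = doubleHeight(in);            // e.g. 640x480 -> 640x960
    const Frame422 b = make422(a.luma.h, a.luma.w); // turned   -> 960x640
    const Frame422 c = doubleHeight(b);             //          -> 960x1280
    return make422(c.luma.h, c.luma.w);             // turned back -> 1280x960
}
```

With the 640x480 test clip, the frame is 640x960 after the first nnedi3 call, with a 320x960 chroma plane. Transposing that plane gives 960x320, but a 4:2:2 frame of 960x640 needs 480x640 chroma, so turnIsPureTranspose returns false. A turn therefore cannot keep YV16 without touching the chroma samples in some way, and the loop does this twice per pass. With Y8, YV12 and YV24 the subsampling is the same in both directions, so each plane can simply be transposed. YUY2 is a packed format and is not modelled here. Whether this is really the cause of the slowdown is only a guess that would have to be checked against the turn code.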
I first removed both turns: the result was fast.
If I put back either one of the turns, the result is still fast, though it seems a little slower (not really obvious). If I put back both, the result is very slow.
So I'm now at a point where I don't know what to do to solve this issue, I'm totally stuck!!
I need help from someone who has mastered AviSynth a lot better than me to solve this case in a better way than what I've done for now (doing YV16->YUY2(process)->YV16).