Doom9's Forum - View Single Post

jpsdr · 15th January 2014, 20:38

OK, I've run some tests.
The tests were made using a YV12 640x480 AVI file.

The script used is the following:

Code:

SetMemoryMax(64)
AVISource("Test.avi",False,"YV12")
SetPlanarLegacyAlignment(True)
#ConvertToYUY2()
#ConvertToYV16()
#ConvertToYV24()
#ConvertToY8()
nnedi3_rpow2(rfactor=2,cshift="Spline36Resize",fwidth=960,fheight=720,nsize=0,nns=3,qual=2)

It allows me to quickly change the output format for testing.

Results with v0.9.4.3:
Y8, YV12, YUY2, YV24: fast.
YV16: very slow!

So I modified my code to test several cases:
converting internally YV16->YUY2(process)->YV16, either in nnedi3 or in nnedi3_rpow2.

If YV16->YUY2(process)->YV16 is done in nnedi3, and nnedi3_rpow2 processes YV16 unchanged, the result is slow. First hint: the slowness does not come from the nnedi3 code.

If YV16->YUY2(process)->YV16 is done in nnedi3_rpow2, and nnedi3 processes YV16 unchanged, the result is fast.

The issue is within the YV16 processing code of nnedi3_rpow2. The code is:

Code:

					for (int i=0; i<ct; ++i)
					{
						v = new nnedi3(v.AsClip(),i==0?1:0,true,true,true,true,nsize,nns,qual,etype,pscrn,threads,opt,fapprox,env);
						v = env->Invoke(turnRightFunction,v).AsClip();
						// always use field=1 to keep chroma/luma horizontal alignment
						v = new nnedi3(v.AsClip(),1,true,true,true,true,nsize,nns,qual,etype,pscrn,threads,opt,fapprox,env);
						v = env->Invoke(turnLeftFunction,v).AsClip();
					}

I first removed both turns: the result was fast.
If I put back either one of the turns, the result is still fast, but... it seems a little slower, not really obvious. If I put both back, the result is very slow.
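For readers following along, here is a small sketch of the plane-size arithmetic behind a 90° turn for the planar formats tested above. It only shows the geometry: with 4:2:2 subsampling, a rotated chroma plane no longer has the size the same format needs for the rotated luma, while with YV12 and YV24 it does. All type and function names are hypothetical, nothing here is taken from AviSynth or from the plugin, and the post does not show how the turn filters actually deal with this. Note also that YUY2 is 4:2:2 too and was fast in the tests, so this arithmetic alone does not explain the slowdown.

```cpp
#include <cassert>

// Hypothetical helper types for illustration only (not AviSynth structures).
struct PlaneSize { int width; int height; };

// Chroma subsampling divisors relative to luma.
struct Subsampling { int horizontal; int vertical; };

constexpr Subsampling kYV12{2, 2};  // 4:2:0
constexpr Subsampling kYV16{2, 1};  // 4:2:2
constexpr Subsampling kYV24{1, 1};  // 4:4:4

// Size of one chroma plane for a given luma size and subsampling.
constexpr PlaneSize chromaSize(PlaneSize luma, Subsampling s) {
    return PlaneSize{luma.width / s.horizontal, luma.height / s.vertical};
}

// A 90 degree turn swaps width and height of a plane.
constexpr PlaneSize turned(PlaneSize p) {
    return PlaneSize{p.height, p.width};
}

// True when simply rotating the chroma plane already yields the chroma size
// that the same format requires for the rotated luma plane.
constexpr bool chromaSurvivesTurn(PlaneSize luma, Subsampling s) {
    return turned(chromaSize(luma, s)).width == chromaSize(turned(luma), s).width &&
           turned(chromaSize(luma, s)).height == chromaSize(turned(luma), s).height;
}

// Checked at compile time for the 640x480 test clip.
static_assert(chromaSurvivesTurn(PlaneSize{640, 480}, kYV12), "YV12 chroma fits after a turn");
static_assert(!chromaSurvivesTurn(PlaneSize{640, 480}, kYV16), "YV16 chroma does not fit after a turn");
static_assert(chromaSurvivesTurn(PlaneSize{640, 480}, kYV24), "YV24 chroma fits after a turn");
```

For the 640x480 clip, the YV16 chroma plane is 320x480; rotated it becomes 480x320, whereas a 480x640 YV16 frame needs 240x640 chroma.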

So I've reached a point where I don't know how to solve this issue!!
I need help from someone who masters AviSynth a looot better than I do to solve this case in a better way than what I've done for now (doing YV16->YUY2(process)->YV16).
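To make the YV16->YUY2(process)->YV16 workaround concrete: YV16 is planar 4:2:2 and YUY2 is packed 4:2:2, so the round trip only rearranges bytes and loses nothing. Below is a minimal sketch of that repacking in plain C++. The struct and function names are hypothetical, and the buffers are tightly packed for simplicity (real AviSynth frames have a pitch and alignment); this is not the code used in the plugin.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar 4:2:2 frame (YV16-like): full-size Y plane, half-width U and V planes.
struct PlanarFrame422 {
    int width;   // luma width, assumed even
    int height;
    std::vector<std::uint8_t> y;  // width * height samples
    std::vector<std::uint8_t> u;  // (width / 2) * height samples
    std::vector<std::uint8_t> v;  // (width / 2) * height samples
};

// Packed 4:2:2 frame (YUY2-like): each pixel pair is stored as Y0 U Y1 V.
struct PackedFrame422 {
    int width;
    int height;
    std::vector<std::uint8_t> data;  // width * 2 * height bytes
};

PackedFrame422 planarToPacked(const PlanarFrame422& src) {
    const std::size_t cw = static_cast<std::size_t>(src.width) / 2;
    PackedFrame422 dst{src.width, src.height,
                       std::vector<std::uint8_t>(cw * 4 * src.height)};
    for (std::size_t row = 0; row < static_cast<std::size_t>(src.height); ++row) {
        for (std::size_t pair = 0; pair < cw; ++pair) {
            std::uint8_t* out = &dst.data[(row * cw + pair) * 4];
            out[0] = src.y[row * src.width + 2 * pair];      // Y0
            out[1] = src.u[row * cw + pair];                 // U shared by the pair
            out[2] = src.y[row * src.width + 2 * pair + 1];  // Y1
            out[3] = src.v[row * cw + pair];                 // V shared by the pair
        }
    }
    return dst;
}

PlanarFrame422 packedToPlanar(const PackedFrame422& src) {
    const std::size_t cw = static_cast<std::size_t>(src.width) / 2;
    const std::size_t h = static_cast<std::size_t>(src.height);
    PlanarFrame422 dst{src.width, src.height,
                       std::vector<std::uint8_t>(cw * 2 * h),
                       std::vector<std::uint8_t>(cw * h),
                       std::vector<std::uint8_t>(cw * h)};
    for (std::size_t row = 0; row < h; ++row) {
        for (std::size_t pair = 0; pair < cw; ++pair) {
            const std::uint8_t* in = &src.data[(row * cw + pair) * 4];
            dst.y[row * src.width + 2 * pair] = in[0];
            dst.u[row * cw + pair] = in[1];
            dst.y[row * src.width + 2 * pair + 1] = in[2];
            dst.v[row * cw + pair] = in[3];
        }
    }
    return dst;
}
```

Converting a frame with planarToPacked and back with packedToPlanar returns the original Y, U and V samples, which is why the workaround costs some copying but no quality.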
