Old 7th December 2016, 23:34   #321  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
Quote:
Originally Posted by Groucho2004 View Post
By the way, the DLL from "Release_Intel_XP_Core2_SSE4.2" does not work on XP. I installed the latest Intel redist package, but Dependency Walker reveals that LIBIOMP5MD.DLL is looking for a function in kernel32.dll that does not exist on XP.
It works here if you use this trick: http://www.mediafire.com/file/5rp8jt...u/icl+-+xp.rar

Tested on Windows XP SP3 32-bit in VirtualBox.

The latest Intel redist package is the problem; use this one instead: https://software.intel.com/sites/def...2016.1.146.zip
__________________
See My Avisynth Stuff

Last edited by real.finder; 7th December 2016 at 23:37.
Old 8th December 2016, 01:03   #322  |  Link
FranceBB
Broadcast Encoder
 
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,905
Thanks for the "trick".
With that, it works just fine.
Tested a few minutes ago with:

nnedi3_resize16(target_width=1280, target_height=720, mixed=true, thr=1.0, elast=1.5, nns=4, qual=2, etype=0, pscrn=4, threads=0, kernel_d="Spline", kernel_u="Spline", taps=12, f_d=1.0, f_u=2.0, sharp=0)

to downscale from 1080p to 720p.
Thanks for the update!
Old 19th December 2016, 14:34   #323  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
I found something odd in the nnedi3 code, and I think there is an error:
Code:
	for (int y=0; y<ydia; ++y)
	{
		const uint8_t *srcpT = srcp+y_stride;

		for (int x=0; x<xdia; ++x, ++input)
		{
			sum += srcpT[x];
			sumsq += srcpT[x]*srcpT[x];
			input[0] = srcpT[x];
		}
		y_stride+=stride2;
	}
	const float scale = 1.0f/(float)(xdia*ydia);
	mstd[0] = sum*scale;
	mstd[1] = sumsq*scale-mstd[0]*mstd[0];
I think we should have this instead:
Code:
mstd[1] = sumsq*scale*scale-mstd[0]*mstd[0];
Anyone is welcome to comment.

Last edited by jpsdr; 19th December 2016 at 14:37.
Old 19th December 2016, 15:12   #324  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: void
Posts: 2,633
Well, you could give it a try and see if it still works.
Only tritical will ever know the exact answer.
Old 19th December 2016, 19:43   #325  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
It's too bad he's not on doom9 anymore...
Old 19th December 2016, 22:30   #326  |  Link
ajp_anton
Registered User
 
 
Join Date: Aug 2006
Location: Stockholm/Helsinki
Posts: 805
I don't know exactly what that part is for, but yeah, it sure looks odd.

Breaking it down, if
- sum is just the sum of srcpT's, whatever those are.
- sumsq is the sum of the squared srcpT's.
- "sqsum" is the square of the sum (introducing my own variable).
then
- mstd[1] = (sumsq - sqsum*scale)*scale
which looks weirdly unbalanced. Either
- mstd[1] = (sumsq - sqsum)*scale
or
- mstd[1] = (sumsq - sqsum)*scale*scale
would look better. I guess it's the latter (same as your suggested edit) because mstd[0] already has one scale, so squaring that has two.

Edit:
Then again, squaring scale is also weird, because it's basically (the number of elements in the sum)^-1, so it's a normalization factor. Maybe it's supposed to be (sumsq - sqsum)*scale?
Like feisty said, try and see the results.
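
One quick way to "try and see", as a minimal sketch rather than the actual nnedi3 code: compute both candidate expressions for a small window and compare them against a plain two-pass population variance. The struct and function names here are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical container for the three values being compared.
struct VarianceCandidates {
    float reference;      // two-pass population variance
    float original;       // sumsq*scale - mean*mean (as in the posted code)
    float scale_squared;  // sumsq*scale*scale - mean*mean (the suggested edit)
};

// Accumulates sum and sumsq the same way the posted loop does, then
// evaluates both candidate formulas next to a reference variance.
VarianceCandidates compare_variance_formulas(const std::vector<uint8_t>& px)
{
    assert(!px.empty());

    float sum = 0.0f, sumsq = 0.0f;
    for (uint8_t v : px) {
        sum += v;
        sumsq += float(v) * float(v);
    }
    const float scale = 1.0f / float(px.size());
    const float mean = sum * scale;

    // Reference: mean of the squared deviations from the mean.
    float ref = 0.0f;
    for (uint8_t v : px) ref += (float(v) - mean) * (float(v) - mean);
    ref *= scale;

    return { ref,
             sumsq * scale - mean * mean,
             sumsq * scale * scale - mean * mean };
}
```

For the window {10, 20, 30, 40} this gives a reference of 125, 125 for the original expression, and -437.5 for the version with scale squared. A negative value cannot be a variance, and in the posted code it would fall into the branch that zeroes mstd[1].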

Last edited by ajp_anton; 19th December 2016 at 22:39.
Old 20th December 2016, 00:33   #327  |  Link
Wilbert
Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,364
There's definitely something wrong, but you should look at the entire source to figure out how to correct it.

Sadly, the source has no documentation at all. Here is my take. Disclaimer: I understand nothing of the source.

Most of the fun seems to be happening in the function evalFunc_2 in nnedi3.cpp. The code:

Code:
void evalFunc_2(void *ps)
{
	...
	const int qual = pss->qual;
	const float scale = 1.0f/(float)qual;
	void (*extract)(const uint8_t*,const int,const int,const int,float*,float*);
	void (*wae5)(const float*,const int,float*);

	if (opt==1) wae5=weightedAvgElliottMul5_m16_C;
	else wae5=weightedAvgElliottMul5_m16_SSE2;
	...
	if (fapprox&2) // use int16 dot products
	{
		if (opt==1) extract=extract_m8_i16_C;
		else extract=extract_m8_i16_SSE2;
		...
	}
	else // use float dot products
	{
		if (opt==1) extract=extract_m8_C;
		else extract=extract_m8_SSE2;
		...
	}
	...
	extract(srcpp+x,src_pitch,xdia,ydia,mstd,input);
	...
	wae5(temp,nns,mstd);
	...
	if (opt>1) castScale_SSE(mstd,&scale,dstp+x);
	else dstp[x]=min(max((int)(mstd[3]*scale+0.5f),0),255);
	...
}
Looking at the last line, mstd[3] and the destination pixels differ by a factor of scale (dstp[x]=mstd[3]*scale, ignoring the rounding and clamping).
castScale_SSE is defined in nnedi3_asm.asm, but I don't know how to read asm.

The function weightedAvgElliottMul5_m16_C which is called in evalFunc_2 (and is set to wae5) gives another clue:
Code:
void weightedAvgElliottMul5_m16_C(const float *w,const int n,float *mstd)
{
	...
	if (wsum>min_weight_sum[0]) mstd[3]+=((5.0f*vsum)/wsum)*mstd[1]+mstd[0];
	else mstd[3]+=mstd[0];
}
This implies that mstd[3], mstd[1] and mstd[0] should be of the same scale.

Later on in the code, extract is set to one of extract_m8_i16_C, extract_m8_i16_SSE2, extract_m8_C or extract_m8_SSE2. The function extract is called as
Code:
extract(srcpp+x,src_pitch,xdia,ydia,mstd,input);
Here mstd is defined. jpsdr pasted some code from the function extract_m8_C, but the issue is present in all four of these functions. In extract_m8_C we see
Code:
void extract_m8_C(const uint8_t *srcp,const int stride,const int xdia,const int ydia,float *mstd,float *input)
{
	...
	const float scale = 1.0f/(float)(xdia*ydia);

	mstd[0] = sum*scale;
	mstd[1] = sumsq*scale-mstd[0]*mstd[0];
	mstd[3] = 0.0f;
	if (mstd[1]<=FLT_EPSILON) mstd[1]=mstd[2]=0.0f;
	else
	{
		mstd[1]=sqrtf(mstd[1]);
		mstd[2]=1.0f/mstd[1];
	}
	...
}
mstd[0] and sum (the source pixels) differ by a factor of scale, which is consistent with the above. That is, if the value of scale in extract_m8_C is the same as scale in evalFunc_2. I have no idea if that's the case.
If we change 'mstd[1] = sumsq*scale-mstd[0]*mstd[0];' to 'mstd[1] = sumsq*scale*scale-mstd[0]*mstd[0];', it implies that mstd[1] and mstd[0] differ by a factor of scale, but mstd[1] is overwritten by its square root later on ('mstd[1]=sqrtf(mstd[1]);'). So now mstd[1] and mstd[0] have the same scale, which is consistent with the above.
So you need to change that in all four functions.

What I don't understand is what mstd[2] is supposed to do. It has scale^(-1) compared to mstd[1]. I don't see where mstd[2] is used, and thus whether its scaling is correct.
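
To make the roles of the four mstd slots easier to follow, here is a stand-alone sketch of the statistics part of extract_m8_C, pieced together from the two excerpts quoted in this thread. It is not the real function: the name extract_stats_sketch is made up, the integer accumulators are an assumption (the declarations are elided in the excerpts), and the row step is simplified to a plain stride instead of the y_stride/stride2 bookkeeping of the original.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstdint>

// Hypothetical stand-alone variant of the statistics in extract_m8_C.
// mstd must have room for 4 floats, input for xdia*ydia floats.
void extract_stats_sketch(const uint8_t* srcp, int stride, int xdia, int ydia,
                          float* mstd, float* input)
{
    assert(xdia > 0 && ydia > 0);

    int sum = 0, sumsq = 0;  // assumed types; large enough for 48x6 8-bit pixels
    for (int y = 0; y < ydia; ++y) {
        const uint8_t* row = srcp + y * stride;  // simplified row addressing
        for (int x = 0; x < xdia; ++x, ++input) {
            sum += row[x];
            sumsq += row[x] * row[x];
            input[0] = row[x];  // copy the pixel into the float input vector
        }
    }

    const float scale = 1.0f / float(xdia * ydia);
    mstd[0] = sum * scale;                        // sum divided by pixel count
    mstd[1] = sumsq * scale - mstd[0] * mstd[0];  // expression under discussion
    mstd[3] = 0.0f;                               // later incremented by wae5
    if (mstd[1] <= FLT_EPSILON) {
        mstd[1] = mstd[2] = 0.0f;
    } else {
        mstd[1] = std::sqrt(mstd[1]);  // square root of the value above
        mstd[2] = 1.0f / mstd[1];      // its reciprocal (use not shown in the thread)
    }
}
```

With a 2x2 window holding 10, 20, 30, 40 this yields mstd[0] = 25 and mstd[1] = sqrt(125), and a flat window yields mstd[1] = mstd[2] = 0.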

Last edited by Wilbert; 20th December 2016 at 00:54.
Old 20th December 2016, 00:45   #328  |  Link
Wilbert
Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,364
Hmm, scale in evalFunc_2 is set to '1.0f/(float)qual;', with qual being an input parameter (1 or 2), while scale in extract_m8_C is equal to '1.0f/(float)(xdia*ydia);'.

qual doesn't seem equal to xdia*ydia to me. xdia and ydia are set by
Code:
pssInfo[i].xdia = xdiaTable[nsize];
pssInfo[i].ydia = ydiaTable[nsize];
and these tables are defined by (see the header file):
Code:
const int xdiaTable[NUM_NSIZE] = {8,16,32,48,8,16,32};
const int ydiaTable[NUM_NSIZE] = {6,6,6,6,4,4,4};
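
To put numbers on the two scale values, the sketch below evaluates 1/(xdia*ydia) for every table entry and 1/qual for the two allowed qual values. The tables are copied from the excerpt above; NUM_NSIZE = 7 is inferred from their length, and the helper names are hypothetical.

```cpp
#include <cassert>

constexpr int NUM_NSIZE = 7;  // inferred from the number of table entries
const int xdiaTable[NUM_NSIZE] = {8, 16, 32, 48, 8, 16, 32};
const int ydiaTable[NUM_NSIZE] = {6, 6, 6, 6, 4, 4, 4};

// Number of source pixels in the window selected by nsize.
int window_size(int nsize)
{
    assert(nsize >= 0 && nsize < NUM_NSIZE);
    return xdiaTable[nsize] * ydiaTable[nsize];
}

// scale as computed inside the extract functions: 1/(xdia*ydia).
float extract_scale(int nsize)
{
    return 1.0f / float(window_size(nsize));
}

// scale as computed inside evalFunc_2: 1/qual, with qual being 1 or 2.
float eval_scale(int qual)
{
    assert(qual == 1 || qual == 2);
    return 1.0f / float(qual);
}
```

The window sizes come out as 48, 96, 192, 288, 32, 64 and 128 pixels, so the extract scale is never 1 or 0.5. The two variables share a name but sit in different functions and normalize different things: one divides a sum over the window by the pixel count, the other divides mstd[3] by qual. Since the quoted wae5 code increments mstd[3] rather than assigning it, one reading is that mstd[3] collects one result per pass and is averaged at the end, but that is an interpretation, not something confirmed in this thread.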

Last edited by Wilbert; 20th December 2016 at 00:49.
Old 20th December 2016, 10:09   #329  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Finally, after looking at things from a statistical point of view, it's fine. mstd probably stands for Mean STandard Deviation.
mstd[0] is the mean, mstd[1] is the mean standard deviation, which is the square root of: the mean of the sum of the squares, less the square of the mean.

Sorry, my mistake.

Last edited by jpsdr; 20th December 2016 at 10:12.
Old 20th December 2016, 12:43   #330  |  Link
Wilbert
Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,364
Quote:
Originally Posted by jpsdr View Post
Finally, after looking at things from a statistical point of view, it's fine. mstd probably stands for Mean STandard Deviation.
mstd[0] is the mean, mstd[1] is the mean standard deviation, which is the square root of: the mean of the sum of the squares, less the square of the mean.
Yes indeed.

Your post is a bit cryptic. I think you are right that it should be
Code:
mstd[1] = sumsq*scale*scale-mstd[0]*mstd[0];
But I also think that the scale variables in evalFunc_2 and in the extract functions should be the same. I don't understand why they are different.
Old 20th December 2016, 14:42   #331  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Another error on my side: after checking, the mean standard deviation is not what I said (my memory was not exactly right). We are not far off, but it's not exactly what is calculated here.
But what is done here is the mean of the squares less the square of the mean, and viewed like this, it can somehow make sense. So maybe the formula is correct.
Old 20th December 2016, 17:27   #332  |  Link
Wilbert
Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,364
I give up. Leave the bugs in.
Quote:
But, what is done here is the mean of the squares less the square of the mean
This is called the variance, and if you take the square root of it you will get the standard deviation. Thus

VAR[X] = E[(X-E[X])^2] = E[x^2]-E[X^2], SD[X] = sqrt(VAR[X])
Old 20th December 2016, 17:51   #333  |  Link
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: void
Posts: 2,633
Quote:
Originally Posted by Wilbert View Post
I give up. Leave the bugs in.

This is called the variance, and if you take the square root of it you will get the standard deviation. Thus

VAR[X] = E[(X-E[X])^2] = E[x^2]-E[X^2], SD[X] = sqrt(VAR[X])
should be E(x^2) - E(x)^2

EDIT: Var(x) = E((x-E(x))^2) = E(x^2 - 2xE(x) + E(x)^2) = E(x^2) - 2E(x)E(x) + E(x)^2 = E(x^2) - E(x)^2

Last edited by feisty2; 20th December 2016 at 17:58.
Old 20th December 2016, 19:37   #334  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
So, finally, there is probably no bug: sumsq*scale-mstd[0]*mstd[0] produces the variance.
E(x^2)=sumsq*scale
E(x)^2=mstd[0]*mstd[0]
No...?
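
Spelled out for a window of N = xdia*ydia pixels x_i (this notation is introduced here for illustration and is not taken from the source), the identification above would read:

```latex
\begin{aligned}
\text{scale} &= \frac{1}{N}, \qquad N = \text{xdia}\cdot\text{ydia},\\
\text{mstd}[0] = \text{sum}\cdot\text{scale} &= \frac{1}{N}\sum_{i=1}^{N} x_i = E(x),\\
\text{sumsq}\cdot\text{scale} &= \frac{1}{N}\sum_{i=1}^{N} x_i^2 = E(x^2),\\
\text{sumsq}\cdot\text{scale} - \text{mstd}[0]\cdot\text{mstd}[0] &= E(x^2) - E(x)^2 = \operatorname{Var}(x).
\end{aligned}
```

By the same bookkeeping, sumsq*scale*scale would equal E(x^2)/N, which is no longer the mean of the squares.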

I still haven't been able to get 16-bit working, and I can't figure out where it's going wrong...

Last edited by jpsdr; 20th December 2016 at 19:43.
Old 20th December 2016, 22:45   #335  |  Link
Wilbert
Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,364
Quote:
Originally Posted by jpsdr View Post
So, finally, there is probably no bug: sumsq*scale-mstd[0]*mstd[0] produces the variance.
E(x^2)=sumsq*scale
E(x)^2=mstd[0]*mstd[0]
No...?

I still haven't been able to get 16-bit working, and I can't figure out where it's going wrong...
E(x^2)=sumsq*scale^2 as I see it, but I guess I can't convince anyone.

Anyway. This scale factor is 1 by default (= qual input parameter). Could you make some screenshots voor qual=1 and qual=2 and compare them?
Old 20th December 2016, 23:59   #336  |  Link
StainlessS
HeartlessS Usurer
 
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
Quote:
Originally Posted by Wilbert View Post
Could you make some screenshots voor qual=1 and qual=2 and compare them?
This is an English-only forum, please don't post in a foreign language here, I don't want to have to draw an administrator's attention to this. Thank you for your compliance.

Merry Xmas Wilbert et al. [Latin dont count as a foreign language as only dead Romans speak it + a few Swiss Romansch nearly Roman speakers [bout 10,000 I believe]]
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???

Last edited by StainlessS; 21st December 2016 at 06:34.
Old 21st December 2016, 18:22   #337  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
After a bloody and painful struggle, I've been able to get 16-bit working.
Can someone explain to me why this works:
Code:
const uint8_t *srcp = pss->srcp[b];
const uint8_t *srcpp = srcp-(ydia-1)*src_pitch-xdiad2m1;
and why this does not (at least with VS2015 Community):
Code:
const uint8_t *srcp = pss->srcp[b];
const uint8_t *srcpp = srcp-((ydia-1)*src_pitch-xdiad2m1);
???????????

Thanks again to feisty2 for the code; it was very useful, especially for the init part and the weight calculation adjustment.
Thanks in advance, too, for the part I'm about to start working on: the ASM! The code will be helpful.
Old 21st December 2016, 18:44   #338  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by jpsdr View Post
Can someone explain to me why this works:
Code:
const uint8_t *srcp = pss->srcp[b];
const uint8_t *srcpp = srcp-(ydia-1)*src_pitch-xdiad2m1;
and why this does not (at least with VS2015 Community):
Code:
const uint8_t *srcp = pss->srcp[b];
const uint8_t *srcpp = srcp-((ydia-1)*src_pitch-xdiad2m1);
???????????
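Because the additional parentheses in the second statement change how the expression is grouped. The first statement computes srcp - A - B, while the second computes srcp - (A - B), which is srcp - A + B (with A = (ydia-1)*src_pitch and B = xdiad2m1).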
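
As a small illustration (the two helper functions are hypothetical and only compute the offset that each statement applies to srcp), the two groupings differ by exactly 2*xdiad2m1:

```cpp
#include <cassert>
#include <cstddef>

// Offset applied by the working statement: srcp - A - B.
std::ptrdiff_t offset_ungrouped(int ydia, int src_pitch, int xdiad2m1)
{
    return -std::ptrdiff_t(ydia - 1) * src_pitch - xdiad2m1;
}

// Offset applied by the failing statement: srcp - (A - B) == srcp - A + B.
std::ptrdiff_t offset_grouped(int ydia, int src_pitch, int xdiad2m1)
{
    return -(std::ptrdiff_t(ydia - 1) * src_pitch - xdiad2m1);
}
```

For example, with ydia = 6, src_pitch = 1920 and xdiad2m1 = 3 (values picked only for the example), the first form gives an offset of -9603 and the second -9597, so the second pointer starts 6 bytes further to the right in the row.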
__________________
Groucho's Avisynth Stuff

Last edited by Groucho2004; 21st December 2016 at 18:50.
Old 21st December 2016, 19:34   #339  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Argh... I got back home too late to delete my stupid question after I realised it...
Old 21st December 2016, 19:42   #340  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
Great news. I suppose the hard part was handling uint16_t instead of a byte. Does it automatically work for e.g. 10-bit videos? (Ideally, all filters that work for 16 bits should also support 10-, 12- and 14-bit videos.)
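
One place where the bit depth matters is the final rounding and clamping. The 8-bit line quoted earlier in this thread clamps to 255, and a 10-bit sample stored in a uint16_t would need a peak of 1023 instead of 65535. A minimal sketch of a depth-aware clamp follows; round_clamp and bits_per_sample are hypothetical names, and whether the 16-bit build discussed here handles this is not stated in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: round to nearest and clamp to the valid range of the
// given bit depth (8 to 16 bits stored in a uint16_t).
uint16_t round_clamp(float value, int bits_per_sample)
{
    assert(bits_per_sample >= 8 && bits_per_sample <= 16);
    const int peak = (1 << bits_per_sample) - 1;  // 255, 1023, 4095, ...
    const int rounded = int(value + 0.5f);        // same rounding as the 8-bit code
    return uint16_t(std::min(std::max(rounded, 0), peak));
}
```

With bits_per_sample = 8 this reproduces the min(max(...,0),255) behaviour of the quoted 8-bit line; with 10 it caps at 1023.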