 Doom9's Forum Dogway's Filters Packs

27th September 2021, 08:59   #421  |  Link
tormento
Acid fr0g

Join Date: May 2002
Location: Italy
Posts: 2,503
Quote:
 Originally Posted by Dogway I tried though to implement to ex_contrast() without success
Look at this and this too.
__________________
@turment on Telegram

27th September 2021, 09:23   #422  |  Link
pinterf
Registered User

Join Date: Jan 2014
Posts: 2,308
'exp' is a valid Expr function and uses SIMD, so it is probably worth using. In your replacement code above, luckily, powers with small integer exponents like 1, 2, 3 and 4 are optimized internally into mul (and dup), but for larger exponent values the result is calculated using a^b = exp(b*ln(a)), which needs much more computation.
__________________
AviSynth+ on github, Other repos: RgTools, Masktools2, MvTools2, TIVTC, Average
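The identity pinterf mentions can be checked numerically outside AviSynth. The Python sketch below is only an illustration and does not reproduce how Expr is implemented internally; the function names `pow_by_mul` and `pow_by_exp_log` are hypothetical. It contrasts a power built from plain multiplications with the general route a^b = exp(b*ln(a)).

```python
import math

def pow_by_mul(a: float, n: int) -> float:
    """Raise a to a small non-negative integer power using only multiplications."""
    result = 1.0
    for _ in range(n):
        result *= a
    return result

def pow_by_exp_log(a: float, b: float) -> float:
    """General power for a > 0 via the identity a^b = exp(b * ln(a))."""
    return math.exp(b * math.log(a))

if __name__ == "__main__":
    a = 0.73
    for n in (1, 2, 3, 4):
        # Both routes agree up to floating-point rounding.
        print(n, pow_by_mul(a, n), pow_by_exp_log(a, n))
    # A non-integer exponent has no multiplication-only shortcut.
    print(2.4, pow_by_exp_log(a, 2.4))
```

The multiplication route costs a handful of multiplies, while the general route needs one logarithm and one exponential per pixel, which is the extra work pinterf refers to.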
27th September 2021, 13:55   #423  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
I didn't want to make the snippet more complex than it is, so I used "n ^"; in reality I'm using vars to reuse operations. I tested with a Taylor series and it gave a great performance improvement over 'exp', but it might not be valid for high steepness. I crafted a graph to see what was happening; I still might be doing something wrong.
https://www.desmos.com/calculator/guetsfy9ww
Guess I can join another polynomial but it makes things more complex.

EDIT: just tested and yes, 'exp' is as fast. I had the notion it was not, since when coding ex_bilateral() removing 'exp' gave a huge speed boost.

@tormento: thanks. I think cos, sin and tan are the easiest; I already adapted 'interpolation' mode in ex_blend(). The problem comes when 'x' uses derivatives and other complex functions like atan. If I'm not wrong atan(x) = tan(y) = sin(y) / cos(y)
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread

Last edited by Dogway; 27th September 2021 at 14:17.
27th September 2021, 15:16   #424  |  Link
tormento
Acid fr0g

Join Date: May 2002
Location: Italy
Posts: 2,503
Quote:
 Originally Posted by Dogway If I'm not wrong atan(x) = tan(y) = sin(y) / cos(y)
Nope, the arctan function is the inverse of the tangent function: it returns the angle whose tangent is a given number.

There is a Taylor series & here for it too. Look here also.

Explicit algorithm is present too.
__________________
@turment on Telegram

Last edited by tormento; 27th September 2021 at 15:32.
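tormento's correction can be verified with a few lines of Python. This is a minimal check using only the standard `math` module; the sample angle 0.6 is an arbitrary choice.

```python
import math

def tan_from_sin_cos(y: float) -> float:
    """tan(y) written as sin(y) / cos(y)."""
    return math.sin(y) / math.cos(y)

y = 0.6                    # an angle in radians, inside (-pi/2, pi/2)
x = tan_from_sin_cos(y)    # x is the tangent of y
recovered = math.atan(x)   # atan maps the tangent back to the angle

print("tan(y)       =", x)
print("atan(tan(y)) =", recovered)
```

So atan(x) returns y where x = sin(y) / cos(y); it is not itself equal to sin(y) / cos(y).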

27th September 2021, 17:21   #425  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
Thanks tormento, I managed to build a piecewise function for atan(x) since the Taylor series didn't converge between 0.8 and 1.65, so I built a polynomial in that section.
https://www.desmos.com/calculator/bb392gsvnu

I tested on avisynth and it works fine; now I will try to optimize it and benchmark, and see if I can reduce it on a case by case basis.

Here's the code and bench (400% speed increase):
Code:
`Expr(last,Format(" x 255 / atan 255 *"),"") # 90`
Code:
```
e8 = "X dup dup * X2@ X * X3@ 0.333333 * - X2 X3 * X5@ 0.200001 * + X5 X2 * X7@ 0.142857143 * - X7 X2 * 0.111111111 * + X7 X3 * 0.0909090909 * -" # up to 0.8
e16 = " X2 -0.245982 * X 1.00976 * + 0.021622 +" # up to 1.65
els = "pi 0.5 * 1 X / - 1 X3 3 * / + 1 X5 5 * / - 1 X7 7 * / - " # from 1.65 onwards
atan = "X@ 0.8 <= "+e8+" X 1.65 >= "+els+" "+e16+" ? ?"
# atan = "X@ 0.8 <= "+e8+" "+e16+" ?" # for atan([0-1])
Expr(last,Format("x range_max / "+atan+" range_max *"),"") # 413
```
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread

Last edited by Dogway; 27th September 2021 at 17:51.
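The three-segment idea in this post can be sanity-checked in Python. The sketch below is an assumption-laden illustration, not a transcription of the Expr strings: it uses the textbook series terms (Maclaurin terms through x^11 below 0.8, and the large-argument form pi/2 - 1/x + 1/(3x^3) - 1/(5x^5) + 1/(7x^7) from 1.65 onwards) together with the quadratic coefficients quoted in the post. The names `atan_piecewise` and `max_abs_error` are hypothetical.

```python
import math

def atan_piecewise(x: float) -> float:
    """Piecewise approximation of atan(x) for x >= 0 (illustrative only)."""
    if x <= 0.8:
        # Maclaurin series: x - x^3/3 + x^5/5 - ... through the x^11 term.
        return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(6))
    if x >= 1.65:
        # Large-argument form, from atan(x) = pi/2 - atan(1/x) for x > 0.
        inv = 1.0 / x
        tail = sum((-1) ** k * inv ** (2 * k + 1) / (2 * k + 1) for k in range(4))
        return math.pi / 2 - tail
    # Quadratic bridge between 0.8 and 1.65, coefficients as quoted in the post.
    return -0.245982 * x * x + 1.00976 * x + 0.021622

def max_abs_error(lo: float, hi: float, steps: int = 2000) -> float:
    """Largest absolute deviation from math.atan on an evenly sampled grid."""
    xs = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(abs(atan_piecewise(x) - math.atan(x)) for x in xs)

if __name__ == "__main__":
    print("max abs error on [0, 1]:", max_abs_error(0.0, 1.0))
    print("max abs error on [0, 4]:", max_abs_error(0.0, 4.0))
```

Checking the error per interval like this makes it easier to decide whether the simpler two-segment variant (the commented-out `atan` line for inputs in [0-1]) is enough for a given filter.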
27th September 2021, 18:59   #426  |  Link
tormento
Acid fr0g

Join Date: May 2002
Location: Italy
Posts: 2,503
Quote:
 Originally Posted by Dogway Thanks tormento, I managed to build a piecewise function for atan(x) since the Taylor series didn't converge between 0.8 and 1.65
You are most welcome. AFAIK you used a Maclaurin series (expanded around x=0) and not a Taylor series (expanded around an arbitrary point); that's why the fit gets worse as |x| moves away from 0.
__________________
@turment on Telegram

Last edited by tormento; 27th September 2021 at 19:05.
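The behaviour tormento describes can be observed directly. The Python sketch below (a hypothetical helper `atan_maclaurin`, standard library only) evaluates partial sums of the series x - x^3/3 + x^5/5 - ... at one point close to 0 and one point further away.

```python
import math

def atan_maclaurin(x: float, terms: int) -> float:
    """Partial sum of x - x^3/3 + x^5/5 - ... with the given number of terms."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

if __name__ == "__main__":
    for x in (0.5, 1.5):
        for terms in (5, 10, 20):
            err = abs(atan_maclaurin(x, terms) - math.atan(x))
            print(f"x={x} terms={terms:2d} abs error={err:.3e}")
```

At x = 0.5 the error shrinks as terms are added; at x = 1.5 adding terms makes it grow, so a different expansion or a fitted polynomial is needed there.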

27th September 2021, 22:18   #427  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
Yes I know, I'm currently working on cos(x) where x is pi/2 since some functions need cosines as high as pi.
Here's the Taylor series of cos(x) when x=pi/2, converges between 0 and pi.
Code:
```
cosTP = " pi 0.500001 * - X@ 0.00000367321 swap - X dup * X2@ 0.0000018366 * + X2 X * X3@ 0.166666666 * + X2 dup * 0.00000015305 * - X2 X3 * 0.008333333 * -"
Expr(last,Format("x range_max / pi * "+cosTP+" range_max *"),"") # 390
#Expr(last,Format("x range_max / pi * cos range_max *"),"") # 70
```
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread
27th September 2021, 23:25   #428  |  Link
wonkey_monkey
Formerly davidh*****

Join Date: Jan 2004
Posts: 2,485
If pi/2 < x < pi, can't you just subtract x from pi and then take the negative of the result?

Or have I misunderstood... you say you're working on cos(x) when x = pi/2, but that's just zero every time...

Possibly helpful reading: http://gruntthepeon.free.fr/ssemath/sse_mathfun.h
__________________
My AviSynth filters / I'm the Doctor

Last edited by wonkey_monkey; 27th September 2021 at 23:31.
27th September 2021, 23:36   #429  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
It's not the cosine of pi/2 but a cosine function approximation around pi/2, so at cos(pi) it gives more accurate results than if I design the Taylor series around x=0.
Here is the desmos graph (check around x=pi)
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread
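Dogway's point about the expansion centre can be illustrated numerically. The Python sketch below assumes exact Taylor coefficients (the posted Expr string uses slightly offset constants) and hypothetical function names; it compares a degree-5 expansion of cos around pi/2 with a degree-6 expansion around 0, both evaluated at x = pi.

```python
import math

HALF_PI = math.pi / 2

def cos_taylor_half_pi(x: float) -> float:
    """cos(x) expanded around pi/2: -u + u^3/6 - u^5/120 with u = x - pi/2."""
    u = x - HALF_PI
    return -u + u ** 3 / 6 - u ** 5 / 120

def cos_maclaurin6(x: float) -> float:
    """cos(x) expanded around 0: 1 - x^2/2 + x^4/24 - x^6/720."""
    x2 = x * x
    return 1 - x2 / 2 + x2 * x2 / 24 - x2 ** 3 / 720

if __name__ == "__main__":
    x = math.pi
    print("expansion around pi/2:", cos_taylor_half_pi(x))
    print("expansion around 0   :", cos_maclaurin6(x))
    print("math.cos             :", math.cos(x))
```

Centring the series at pi/2 halves the largest distance to the expansion point on [0, pi], which is why it stays close to -1 at x = pi while the series around 0 drifts.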
27th September 2021, 23:42   #430  |  Link
wonkey_monkey
Formerly davidh*****

Join Date: Jan 2004
Posts: 2,485
I see, so as per my previous comment: you could design your calculation around 0 < x' < pi/2, reducing x to this range appropriately first. It might be faster for the same accuracy (or more accurate for the same speed). The purple one needs 5 powers of x, the green one only needs 3.

You would basically be taking the first part of the green line (up to pi/2) and rotating it around its endpoint to extend it to pi.

https://www.desmos.com/calculator/bktakxsm7u

Edit: there is a slight discontinuity at pi/2 but you can remove that by nudging the coefficients.
__________________
My AviSynth filters / I'm the Doctor

Last edited by wonkey_monkey; 27th September 2021 at 23:56.
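The range reduction wonkey_monkey describes relies on the identity cos(x) = -cos(pi - x). A minimal Python sketch follows, assuming the plain truncated Maclaurin polynomial for the first half (the thread later tunes the coefficients); `cos_poly` and `cos_mirrored` are hypothetical names.

```python
import math

HALF_PI = math.pi / 2

def cos_poly(x: float) -> float:
    """Truncated Maclaurin series, intended for 0 <= x <= pi/2 only."""
    x2 = x * x
    return 1 - x2 / 2 + x2 * x2 / 24 - x2 ** 3 / 720

def cos_mirrored(x: float) -> float:
    """cos(x) on [0, pi]: reuse the first-half polynomial via cos(x) = -cos(pi - x)."""
    if x <= HALF_PI:
        return cos_poly(x)
    return -cos_poly(math.pi - x)

if __name__ == "__main__":
    steps = 1000
    worst = max(abs(cos_mirrored(math.pi * i / steps) - math.cos(math.pi * i / steps))
                for i in range(steps + 1))
    # The polynomial is slightly off zero at pi/2, so the mirrored halves do not meet exactly.
    jump = abs(2 * cos_poly(HALF_PI))
    print("max abs error on [0, pi]:", worst)
    print("jump at pi/2            :", jump)
```

The jump printed at the end is the slight discontinuity mentioned in the edit; it disappears when the polynomial is adjusted so that it evaluates to zero at pi/2.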
28th September 2021, 00:04   #431  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
Yes, makes total sense, inverting the function and making it piecewise. I will bench speed and quality in case the discontinuity is visible.
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread
28th September 2021, 00:05   #432  |  Link
tormento
Acid fr0g

Join Date: May 2002
Location: Italy
Posts: 2,503
Quote:
 Originally Posted by Dogway Here's the Taylor series of cos(x) when x=pi/2, converges between 0 and pi.
Please share the desmos, it was really interesting.
__________________
@turment on Telegram

28th September 2021, 00:10   #433  |  Link
wonkey_monkey
Formerly davidh*****

Join Date: Jan 2004
Posts: 2,485
Using 0.0013934 (this is just a rough approximation, not a calculated value) as the x^6 coefficient should all but remove the discontinuity. The maximum error is about 0.0000924.

Adding an x^8 term can make the max error almost 100x smaller. I'll try and work on best coefficients tomorrow, if I have time.
__________________
My AviSynth filters / I'm the Doctor

Last edited by wonkey_monkey; 28th September 2021 at 00:15.
28th September 2021, 00:27   #434  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
I tested and speed is 2% faster; it was already pretty fast, from 420 to 430fps. Quality-wise I think it's better because it doesn't touch the range extremes, which are always sensitive, and the discontinuity is not appreciable. But if you can find a better coefficient that would be great.

There's another approximation noted by tormento, the Bhaskara I approx., but it's not as good as the sixth-degree polynomial.
Code:
`(pi^2 - 4x^2) / (pi^2 + x^2)`

BTW, if I compute the Taylor series for x=pi/4 it might fix the discontinuity -> graph

EDIT: yep, coefficient 0.001329 is almost a match, much better.

tormento: check wonkey_monkey's link above. He includes all three approximations; the one from my post is the purple one.
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread

Last edited by Dogway; 28th September 2021 at 01:16.
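The comparison Dogway makes can be reproduced in Python. This is a sketch under assumptions: the sixth-degree polynomial is taken to be the Taylor terms 1 - x^2/2 + x^4/24 with the x^6 coefficient replaced by the 0.001329 quoted in the edit, and both approximations are evaluated on [0, pi/2]. Function names are hypothetical.

```python
import math

def cos_bhaskara(x: float) -> float:
    """Bhaskara I style rational approximation of cos(x) for |x| <= pi/2."""
    pi2 = math.pi ** 2
    return (pi2 - 4 * x * x) / (pi2 + x * x)

def cos_poly6(x: float, c6: float = 0.001329) -> float:
    """Sixth-degree polynomial with a nudged x^6 coefficient (assumed form)."""
    x2 = x * x
    return 1 - x2 / 2 + x2 * x2 / 24 - c6 * x2 ** 3

def max_abs_error(approx, steps: int = 2000) -> float:
    """Largest absolute deviation from math.cos on [0, pi/2]."""
    xs = (math.pi / 2 * i / steps for i in range(steps + 1))
    return max(abs(approx(x) - math.cos(x)) for x in xs)

if __name__ == "__main__":
    print("Bhaskara max abs error  :", max_abs_error(cos_bhaskara))
    print("poly (c6=0.001329) error:", max_abs_error(cos_poly6))
```

The rational form needs a division per pixel in addition to being less accurate, which is consistent with the preference for the polynomial stated above.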
28th September 2021, 02:39   #435  |  Link
Pauly Dunne
Grumpy Old Man.

Join Date: Jul 2019
Location: Out There....
Posts: 692
"Real" testing,,

Hi Dogway (and the rest),

I asked a question back here :-

Quote:
 Do any of you ppl that are constantly improving & testing these filters & scripts, actually do any encoding with them ?? Like I mean a full length, 4K movie with your filters, just to see how they REALLY perform !!!
If you're only "benching" them, that's probably NO indication on how they will work in an encoding job.
__________________
Not poorly done, just doin' it my way !!!
Live every day like it's your last, because one day, it will be !! (M$B)
PD Builds, etc

28th September 2021, 03:51   #436  |  Link
kedautinh12
Registered User

Join Date: Jan 2018
Posts: 2,133
Quote:
 Originally Posted by Pauly Dunne Hi Dogway (and the rest), I asked a question back here :- https://forum.doom9.org/showthread.p...24#post1953024 That didn't get answered. If you're only "benching" them, that's probably NO indication on how they will work in an encoding job.

28th September 2021, 07:22   #437  |  Link
Pauly Dunne
Grumpy Old Man.

Join Date: Jul 2019
Location: Out There....
Posts: 692
Quote:
Hi,

I don't consider that an answer...it's a very confusing comment, and I still can't get the latest builds to work.

And I do remember encoding DVDs at a very slow pace; my fave tool was DVD2SVCD, and I had a very powerful dual Athlon MP2600 system to churn thru it.
__________________
Not poorly done, just doin' it my way !!!
Live every day like it's your last, because one day, it will be !! (M$B)
PD Builds, etc

28th September 2021, 15:16   #438  |  Link
wonkey_monkey
Formerly davidh*****

Join Date: Jan 2004
Posts: 2,485
Best coefficients I've found so far:

Up to x^6:
Code:
```
1 - 0.5x^2 + 0.041574811029363x^4 - 0.001292112506266x^6
Max error: 0.000019976279586 (43x better than truncated Taylor series, 4.6x better than modifying only last term to avoid discontinuity)
```
Up to x^8:
Code:
```
1 - 0.5x^2 + 0.041666666666667x^4 - 0.001387723061268x^6 + 0.000023661684925x^8
Max error: 0.000000330438621 (72x better than truncated Taylor series, 6x better than modifying only last term to avoid discontinuity)
```
__________________
My AviSynth filters / I'm the Doctor

Last edited by wonkey_monkey; 28th September 2021 at 15:25.
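These coefficients can be evaluated against `math.cos` with a short Python script. The sketch assumes the error is measured on [0, pi/2] (the interval used after range reduction) on an evenly spaced grid, so it only approximates the true maximum; the function names are hypothetical.

```python
import math

def cos_poly6(x: float) -> float:
    """Tuned polynomial up to x^6, coefficients as posted."""
    x2 = x * x
    return 1 - 0.5 * x2 + 0.041574811029363 * x2 ** 2 - 0.001292112506266 * x2 ** 3

def cos_poly8(x: float) -> float:
    """Tuned polynomial up to x^8, coefficients as posted."""
    x2 = x * x
    return (1 - 0.5 * x2 + 0.041666666666667 * x2 ** 2
            - 0.001387723061268 * x2 ** 3 + 0.000023661684925 * x2 ** 4)

def cos_taylor6(x: float) -> float:
    """Plain truncated Taylor series up to x^6, for comparison."""
    x2 = x * x
    return 1 - x2 / 2 + x2 ** 2 / 24 - x2 ** 3 / 720

def max_abs_error(approx, steps: int = 4000) -> float:
    """Largest absolute deviation from math.cos on [0, pi/2]."""
    xs = (math.pi / 2 * i / steps for i in range(steps + 1))
    return max(abs(approx(x) - math.cos(x)) for x in xs)

if __name__ == "__main__":
    for name, fn in (("taylor6", cos_taylor6), ("poly6", cos_poly6), ("poly8", cos_poly8)):
        print(f"{name}: max abs error = {max_abs_error(fn):.3e}")
```

Working in powers of x^2, as above, mirrors how the Expr strings in this thread store X2 and X4 once and reuse them.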
28th September 2021, 15:38   #439  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
Thanks a lot. Will keep the first one for performance reasons.

EDIT: By the way, I also managed to create a Taylor (Maclaurin) series for exp(x). I know 'exp' is accelerated in Expr, but it was the main cause of the ex_bilateral() drop in performance ('exp' called many times), so I decided to give it a go. It works very well with as low as a 5th-degree polynomial, even in PC levels; it increased from 115fps for ex_bilateral(1) to 167fps, so it's faster than vsTBilateral(). In ex_contrast() it isn't worth it as it's only called once, and the range of action is larger (from -8 to +8 in x).
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread

Last edited by Dogway; 28th September 2021 at 16:07.
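A fifth-degree Maclaurin polynomial for exp(x) is easy to check in Python. The sketch below is an assumption-based illustration: the thread does not state which input range ex_bilateral() feeds into 'exp', so the intervals [-1, 0] and [-8, 8] are chosen only to show how accuracy depends on the range. `exp_maclaurin5` is a hypothetical name.

```python
import math

def exp_maclaurin5(x: float) -> float:
    """1 + x + x^2/2 + x^3/6 + x^4/24 + x^5/120, evaluated in Horner form."""
    return 1 + x * (1 + x * (1 / 2 + x * (1 / 6 + x * (1 / 24 + x / 120))))

def max_abs_error(lo: float, hi: float, steps: int = 2000) -> float:
    """Largest absolute deviation from math.exp on an evenly sampled grid."""
    xs = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(abs(exp_maclaurin5(x) - math.exp(x)) for x in xs)

if __name__ == "__main__":
    print("max abs error on [-1, 0]:", max_abs_error(-1.0, 0.0))
    print("max abs error on [-8, 8]:", max_abs_error(-8.0, 8.0))
```

The error stays small near 0 and grows quickly on a wide interval, which fits the remark that the polynomial is not worth it for the -8 to +8 range of ex_contrast().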
28th September 2021, 22:15   #440  |  Link
Dogway
Registered User

Join Date: Nov 2009
Posts: 2,351
Access violation, not sure if my fault or a bug in avs+:
Code:
```
a=FlipHorizontal()
#~ ex_blend(a,"interpolation",1,0.7)
cosTS = "X dup * X2@ 0.500001 * 1 swap - X2 dup * X4@ 0.041574811029363 * + X4 X2 * 0.001292112506266 * -" # 0.00129 to fix discontinuity at pi/2
cosT = "X@ pi 0.500001 * <= "+cosTS+" dup pi swap - -1 * ? "
Expr(last,a,"x ymin - ymax ymin - / pi * "+cosT+" 0.250001 * 0.500001 swap - y ymin - ymax ymin - / pi * "+cosT+" 0.250001 * - ymax ymin - *" ,"")
```
I also tried with "dup pi swap - 1 neg * ? " but the neg operator seems to not be working.

This works though:
Code:
```
cosTS = "X dup * X2@ 0.500001 * 1 swap - X2 dup * X4@ 0.041574811029363 * + X4 X2 * 0.001292112506266 * - dup"
cosT = "X@ pi 0.500001 * <= "+cosTS+" pi swap - -1 * ? "
```
Looks like I cannot dup a referenced string(?)
__________________
i7-4790K@Stock::GTX 1070]
AviSynth+ filters and mods on GitHub + Discussion thread

Last edited by Dogway; 28th September 2021 at 22:21.
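To see what value the first `cosT` string leaves on the stack, it can be traced with a toy RPN evaluator in Python. This is a sketch under stated assumptions, not a model of the real Expr engine: it supports only numbers, `pi`, `x`, the four arithmetic operators, `<=`, `dup`, `swap`, the ternary `?` (condition true when greater than 0) and `NAME@` as store-and-keep. It says nothing about the access violation itself.

```python
import math

def eval_rpn(expr: str, x: float) -> list:
    """Evaluate a tiny Expr-like RPN subset (assumed semantics); return the final stack."""
    stack, variables = [], {}
    binary = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
        "<=": lambda a, b: 1.0 if a <= b else 0.0,
    }
    for tok in expr.split():
        if tok in binary:
            b, a = stack.pop(), stack.pop()
            stack.append(binary[tok](a, b))
        elif tok == "dup":
            stack.append(stack[-1])
        elif tok == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif tok == "?":
            if_false, if_true, cond = stack.pop(), stack.pop(), stack.pop()
            stack.append(if_true if cond > 0 else if_false)
        elif tok == "pi":
            stack.append(math.pi)
        elif tok == "x":
            stack.append(x)
        elif tok.endswith("@"):
            variables[tok[:-1]] = stack[-1]   # store, keep value on the stack
        elif tok in variables:
            stack.append(variables[tok])
        else:
            stack.append(float(tok))          # numeric literal such as -1
    return stack

COS_TS = ("X dup * X2@ 0.500001 * 1 swap - X2 dup * X4@ 0.041574811029363 * + "
          "X4 X2 * 0.001292112506266 * -")
COS_T = "X@ pi 0.500001 * <= " + COS_TS + " dup pi swap - -1 * ?"

if __name__ == "__main__":
    for value in (1.0, 2.5):
        result = eval_rpn("x " + COS_T, value)
        print(f"x={value}: stack={result}, math.cos={math.cos(value)}")
```

In this simplified model the string is stack-balanced, matches cos(x) below pi/2, and above pi/2 evaluates to polynomial(X) - pi rather than -polynomial(pi - X), so the mirrored branch may need the polynomial recomputed on pi - X.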
