Old 3rd July 2021, 11:16   #81  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
I checked it yesterday and didn't find an issue. I tested null resize with both default b and c, with 0 and 0.5, and raw source and got no differences.
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread
Old 3rd July 2021, 12:46   #82  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
Quote:
I checked it yesterday and didn't find an issue. I tested null resize with both default b and c, with 0 and 0.5, and raw source and got no differences.
BicubicResize optimises out (incorrectly, in my opinion) null resizes. Non-null resizes will reveal the difference. It's very slight, but the defaults do blur.
__________________
My AviSynth filters / I'm the Doctor

Last edited by wonkey_monkey; 3rd July 2021 at 13:09.
Old 4th July 2021, 17:24   #83  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
I still don't understand. Yes, adding a minimal shift shows a difference; it's blurrier:
bicubicresize(1920, 1036, src_left=0.00001, src_top = 0.00001)

If I add Catmull-Rom (catrom) coefficients as you suggest, it's no longer blurrier:
bicubicresize(1920, 1036, 0, 0.5, src_left=0.00001, src_top = 0.00001)

But that matches a simple null resize with default values:
bicubicresize(1920, 1036)
Old 4th July 2021, 17:50   #84  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
Wonkey,
Didn't you post an example that clearly showed the problem and stood out like "a sore thumb"?
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???
Old 4th July 2021, 18:57   #85  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
What about z_BicubicResize?
__________________
See My Avisynth Stuff
Old 4th July 2021, 19:01   #86  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
Quote:
But that matches a simple null resize with default values:
bicubicresize(1920, 1036)
Because Avisynth deliberately does nothing when it recognises a null resize. What it should do is produce a slightly blurred clip, for consistency. This code makes the problem obvious by using ridiculous b and c parameters:

Code:
version
animate(0,16, "bicubicresize", width,height,4,4,-16.0,-16.0, width,height,4,4,16.0,16.0)
You'll see that the middle frame is not consistent with the rest.

Anyway I really just wanted to make sure you were aware that by using BicubicResize with the default parameters you will (unless your resize happens to be a null one) induce a slight, possibly unwanted, blur.
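
To put a number on that blur, here is a minimal Python sketch (not AviSynth code) that evaluates the textbook Mitchell-Netravali bicubic kernel at the integer tap positions a null resize would land on. It assumes BicubicResize's defaults are b = 1/3, c = 1/3 and that the resizer follows this standard kernel; `bicubic_weight` and `null_resize_taps` are names invented for the illustration.

```python
from fractions import Fraction

def bicubic_weight(x, b, c):
    """Mitchell-Netravali cubic kernel value at distance x from the sample."""
    x = abs(x)
    if x < 1:
        return ((12 - 9*b - 6*c) * x**3 + (-18 + 12*b + 6*c) * x**2 + (6 - 2*b)) / 6
    if x < 2:
        return ((-b - 6*c) * x**3 + (6*b + 30*c) * x**2
                + (-12*b - 48*c) * x + (8*b + 24*c)) / 6
    return 0

def null_resize_taps(b, c):
    """Weights for the pixels at offsets -1, 0, +1 when source and target grids coincide."""
    return [bicubic_weight(Fraction(d), b, c) for d in (-1, 0, 1)]

third = Fraction(1, 3)
print(null_resize_taps(third, third))                 # assumed defaults b = c = 1/3
print(null_resize_taps(Fraction(0), Fraction(1, 2)))  # Catmull-Rom, b = 0, c = 0.5
```

Under these assumptions the default kernel weights the centre pixel 8/9 and each neighbour 1/18, which is a very mild low-pass, while Catmull-Rom gives 0, 1, 0 and leaves an aligned pixel untouched. That would be consistent with the defaults blurring slightly on any resize that is actually performed.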

Last edited by wonkey_monkey; 4th July 2021 at 19:08.
Old 8th July 2021, 13:00   #87  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
Continuing from this post:

@wonkey_monkey: This is my last attempt at the ternary method. What is wrong with it? Each ternary chooses one or the other, A or C, based on the earlier comparison, which I stored in Q, so I still have 3 elements for the second (rightmost) ternary.

Code:
 +"A C      < Q@ A   AA@ C   CC@ ? Q C   CC@ A   AA@ ? + Z^ "
 +"B D      < Q@ B   BB@ D   DD@ ? Q D   DD@ B   BB@ ? + Z^ "
 +"E G      < Q@ E   EE@ G   GG@ ? Q G   GG@ E   EE@ ? + Z^ "
 +"F H      < Q@ F   FF@ H   HH@ ? Q H   HH@ F   FF@ ? + Z^ "
 +"AA EE    < Q@ AA   A@ EE   E@ ? Q EE   E@ AA   A@ ? + Z^ "
 +"BB FF    < Q@ BB   B@ FF   F@ ? Q FF   F@ BB   B@ ? + Z^ "
 +"CC GG    < Q@ CC   C@ GG   G@ ? Q GG   G@ CC   C@ ? + Z^ "
 +"DD HH    < Q@ DD   D@ HH   H@ ? Q HH   H@ DD   D@ ? + Z^ "
 +"A B      < Q@ A   AA@ B   BB@ ? Q B   BB@ A   AA@ ? + Z^ "
 +"C D      < Q@ C   CC@ D   DD@ ? Q D   DD@ C   CC@ ? + Z^ "
 +"E F      < Q@ E   EE@ F   FF@ ? Q F   FF@ E   EE@ ? + Z^ "
 +"G H      < Q@ G   GG@ H   HH@ ? Q H   HH@ G   GG@ ? + Z^ "
 +"CC EE    < Q@ CC  CC@ EE   E@ ? Q EE   E@ CC   C@ ? + Z^ "
 +"DD FF    < Q@ DD   D@ FF   F@ ? Q FF   F@ DD   D@ ? + Z^ "
 +"BB E     < Q@ BB   B@ E   EE@ ? Q E   EE@ BB   B@ ? + Z^ "
 +"D GG     < Q@ D   DD@ GG   G@ ? Q GG   G@ D   DD@ ? + Z^ "
 +"B C      < Q@ B   BB@ C   CC@ ? Q C   CC@ B   BB@ ? + Z^ "
 +"DD EE    < Q@ DD   D@ EE   E@ ? Q EE   E@ DD   D@ ? + Z^ "
 +"F G      < Q@ F   FF@ G   GG@ ? Q G   GG@ F   FF@ ? + Z^ "
Old 8th July 2021, 13:15   #88  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
Code:
A C      < Q@ A   AA@ C   CC@ ? Q C   CC@ A   AA@ ? + Z^
The problem is that A writes to AA and C writes to CC (twice) regardless of the result of the ternary, because everything is executed. You're throwing away the actual result of both ternaries by adding them together and dumping them in Z.

The stack goes:

Code:
A
A, C
(A<C) (followed by a store to Q which does nothing to the stack)
(A<C), A (followed by store to AA which does nothing to the stack)
(A<C), A, C (followed by store to CC which does nothing to the stack)
(A or C) <- result of first "?"
(A or C), Q
(A or C), Q, C (followed by store to CC, does nothing)
(A or C), Q, C, A (followed by store to AA, does nothing)
(A or C), (C or A)
Then you add them together and throw them away into Z.

So that code simplifies to

Code:
A AA@ C CC@
and that's it.
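
The trace above can be replayed with a tiny RPN interpreter. This is a Python model, not Expr itself, and it assumes the semantics used in this thread: `X@` stores the top of the stack and leaves it there, `X^` stores and pops, `<` pushes 1.0 or 0.0, and `c t f ?` picks `t` when `c` is greater than zero. The function name `run` is made up for the sketch.

```python
def run(expr, env):
    """Evaluate a small RPN subset modelled on Expr; returns (stack, variables)."""
    stack, env = [], dict(env)
    for tok in expr.split():
        if tok == "<":
            b, a = stack.pop(), stack.pop()
            stack.append(1.0 if a < b else 0.0)
        elif tok == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "?":
            f, t, cond = stack.pop(), stack.pop(), stack.pop()
            stack.append(t if cond > 0 else f)
        elif tok.endswith("@"):   # store, value stays on the stack
            env[tok[:-1]] = stack[-1]
        elif tok.endswith("^"):   # store and pop
            env[tok[:-1]] = stack.pop()
        else:                     # anything else is a variable load
            stack.append(env[tok])
    return stack, env

line = "A C < Q@ A AA@ C CC@ ? Q C CC@ A AA@ ? + Z^"
for a, c in [(3.0, 5.0), (5.0, 3.0)]:
    stack, env = run(line, {"A": a, "C": c})
    print(a, c, "->", env["AA"], env["CC"], "stack:", stack)
```

In this model AA always ends up equal to A and CC equal to C, whichever way the comparison goes, because the stores run before either `?` is evaluated.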
Old 8th July 2021, 13:34   #89  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
As I see it, it was a matter of swap:
Code:
A C      < Q@ A   AA@ C   CC@ ? Q C   AA@ A   CC@ ? +
I just want to tell you that your suggestions in this post are flawed, because you can't concatenate two ternaries with no relation between them. You either pop them or do a dumb sum, whichever suits you.

Test this and you will get an error:
Code:
"A C < A C ? A C < C A ?"
The issue is that popping the stack flushes the previously set variables AA and CC...

EDIT: I saw it.

Code:
A C      < Q@ A   C    ? AA@ Q C   A    ? CC@

Last edited by Dogway; 8th July 2021 at 13:39.
Old 8th July 2021, 13:41   #90  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
Quote:
Originally Posted by Dogway View Post
test this and you will get an error:
Code:
"A C < A C ? A C < C A ?"
Because there are two items left on the stack, not one. It wasn't intended to be a complete expression, but an answer to "Is a sort operator possible?"

All suggestions leave the values of both A and C on the stack in order.
Old 8th July 2021, 13:55   #91  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
That only answers an ideal test scenario of 2 inputs (which doesn't actually need a sorting algorithm), not what I asked about, which was a very well-defined sorting network of 8 inputs.

Each output needs to feed the next for it to work. In your example you are not popping anything; if you did, the assumed benefit of Q@ (or R@ in your case) wouldn't work.

This works (very slow):

Code:
+"A C   < A  C  ? AA^ A C   < C  A  ? CC^ "
+"B D   < B  D  ? BB^ B D   < D  B  ? DD^ "
+"E G   < E  G  ? EE^ E G   < G  E  ? GG^ "
+"F H   < F  H  ? FF^ F H   < H  F  ? HH^ "
+"AA EE < AA EE ?  A^ AA EE < EE AA ?  E^ "
+"BB FF < BB FF ?  B^ BB FF < FF BB ?  F^ "
+"CC GG < CC GG ?  C^ CC GG < GG CC ?  G^ "
+"DD HH < DD HH ?  D^ DD HH < HH DD ?  H^ "
+"A B   < A  B  ? AA^ A B   < B  A  ? BB^ "
+"C D   < C  D  ? CC^ C D   < D  C  ? DD^ "
+"E F   < E  F  ? EE^ E F   < F  E  ? FF^ "
+"G H   < G  H  ? GG^ G H   < H  G  ? HH^ "
+"CC EE < CC EE ? CC^ CC EE < EE CC ?  E^ "
+"DD FF < DD FF ?  D^ DD FF < FF DD ?  F^ "
+"BB E  < BB E  ?  B^ BB E  < E  BB ? EE^ "
+"D GG  < D  GG ? DD^ D GG  < GG D  ?  G^ "
+"B C   < B  C  ? BB^ B C   < C  B  ? CC^ "
+"DD EE < DD EE ?  D^ DD EE < EE DD ?  E^ "
+"F G   < F  G  ? FF^ F G   < G  F  ? GG^ "

There's no way to reuse the comparison output of R@ or your other optimization suggestions.
Old 8th July 2021, 20:38   #92  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
Quote:
Originally Posted by Dogway
There's no way to reuse the comparison output of R@ or your other optimization suggestions.
Isn't there? This seems to work:

Code:
A C   < R@ A  C  ?  AA^    R C  A  ? CC^
as does this, which duplicates the comparison result without removing it from the stack and is slightly faster:

Code:
A C   < dup A  C  ? AA^    C  A  ? CC^
But anyway, this (which I think was the last of my suggestions, although I may have accidentally deleted my TED Talk) is much faster:

Code:
A C min AA^ A C max CC^
And you can squeeze a few more % out with this:

Code:
A C dup1 dup1 min AA^ max CC^
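
For anyone who wants to check that the four variants really are interchangeable, here is a rough Python model of the stack machine (an illustration, not Expr itself). It assumes `X@` stores without popping, `X^` stores and pops, `dupN` copies the item N places below the top, and `c t f ?` picks `t` when `c` is greater than zero; `run` and `compare_exchange` are invented names.

```python
BINARY = {
    "<": lambda a, b: 1.0 if a < b else 0.0,
    "min": min,
    "max": max,
}

def run(expr, env):
    """Evaluate a small RPN subset modelled on Expr; returns (stack, variables)."""
    stack, env = [], dict(env)
    for tok in expr.split():
        if tok in BINARY:
            b, a = stack.pop(), stack.pop()
            stack.append(BINARY[tok](a, b))
        elif tok == "?":
            f, t, cond = stack.pop(), stack.pop(), stack.pop()
            stack.append(t if cond > 0 else f)
        elif tok.startswith("dup"):   # "dup" alone means dup0
            stack.append(stack[-1 - int(tok[3:] or 0)])
        elif tok.endswith("@"):       # store, keep on stack
            env[tok[:-1]] = stack[-1]
        elif tok.endswith("^"):       # store and pop
            env[tok[:-1]] = stack.pop()
        else:                         # variable load
            stack.append(env[tok])
    return stack, env

FORMS = [
    "A C < R@ A C ? AA^ R C A ? CC^",
    "A C < dup A C ? AA^ C A ? CC^",
    "A C min AA^ A C max CC^",
    "A C dup1 dup1 min AA^ max CC^",
]

def compare_exchange(form, a, c):
    """Return (AA, CC, leftover stack depth) for one form and one input pair."""
    stack, env = run(form, {"A": a, "C": c})
    return env["AA"], env["CC"], len(stack)

for form in FORMS:
    print(form, "->", compare_exchange(form, 5, 3))
```

Every form leaves the smaller input in AA, the larger in CC and nothing on the stack, so under these assumptions they differ only in how much work Expr has to do. The model says nothing about speed.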
Old 11th July 2021, 00:17   #93  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
Yes, I don't remember how I tested it, but it wasn't working at that point. Another option would be to pop the comparison with "A C < R^", but I benchmarked it and it barely improved performance, so I went with your min/max suggestion. I use dup occasionally but have never used dupN; it's not explained in the docs. My guess is that it duplicates the Nth previous element on the stack. I'm not sure why, but it improves performance by 5%, so very nice.
Old 11th July 2021, 13:47   #94  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
dup or dup0 copies the top item; dup1 copies the item just below it (the index is zero-based, counting down from the top). It saves Expr from loading the variable from memory again: the value is already in a register because it's on the stack, so it just copies it to another register. "dup0 dup2" would be equally valid in place of "dup1 dup1" because it doesn't matter which order you copy A and C in; I just thought "dup1 dup1" looked nicer and could be a tiny bit faster (edit: actually it's surprisingly a lot quicker).
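
As a quick illustration of that indexing, the following Python sketch models the stack as a list whose last element is the top. The helper `dup` is a made-up stand-in for dupN, written on the assumption that it behaves exactly as described above.

```python
def dup(stack, n=0):
    """dupN: push a copy of the item n places below the top (dup0 is plain dup)."""
    stack.append(stack[-1 - n])

s = ["A", "C"]          # A was pushed first, C is on top
dup(s, 1)               # copies A
dup(s, 1)               # copies C, which is now one below the top
print("dup1 dup1:", s)

t = ["A", "C"]
dup(t, 0)               # copies C
dup(t, 2)               # copies A, now two below the top
print("dup0 dup2:", t)
```

Both sequences leave one extra copy of A and one of C on top of the originals, just in opposite order, which makes no difference to a following min or max.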

By using swap (which is a free operation) and carefully tracking what's on the stack, you can eliminate the use of variables entirely and get another 4% speedup* (this assumes you're using pinterf's latest test which allows newlines within Expr):

Code:
Expr("
x[-1,1]  x[1,1]  dup1 dup1 min swap2 max
x[0,1]   x[-1,0] dup1 dup1 min swap2 max
x[1,0]   x[0,-1] dup1 dup1 min swap2 max
x[-1,-1] x[1,-1] dup1 dup1 min swap2 max

swap7 swap1 swap3 dup1 dup1 min swap2 max
swap5 swap1 swap3 dup1 dup1 min swap2 max
swap6 swap1 swap2 dup1 dup1 min swap2 max
swap4 swap1 swap7 dup1 dup1 min swap2 max
swap3 swap1 swap2 dup1 dup1 min swap2 max
swap7 swap1 swap2 dup1 dup1 min swap2 max
swap5 swap1 swap6 dup1 dup1 min swap2 max
swap4 swap1 swap3 dup1 dup1 min swap2 max
swap6 swap1 swap3 dup1 dup1 min swap2 max
swap5 swap1 swap4 dup1 dup1 min swap2 max
swap7 swap1 swap5 dup1 dup1 min swap2 max
swap5 swap1 swap3 dup1 dup1 min swap2 max
swap3 swap1 swap4 dup1 dup1 min swap2 max
swap4 swap1 swap5 dup1 dup1 min swap2 max
swap7 swap1 swap3 dup1 dup1 min swap2 max



swap7 Z^ Z^ Z^ Z^ Z^
swap1
Z^
x swap2
clip
")
Before the line "swap7 Z^ Z^ Z^ Z^ Z^" (where I've left a gap) the stack order is "E HH BB CC D AA FF GG". You can then use swaps and pops to pick out the values you're interested in. In this case, where the final section emulates that median clip thingy, it's BB and GG.

No doubt this will give some people conniptions but there's nothing I can do about that.

A further optimisation would be to delete some of the calculations that don't end up getting used at all, depending on the particular algorithm you're implementing at the time (edit: I originally wrote 7% speedup above because that was the result of deleting one of the superfluous - for this algorithm - calculations).
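
The stack bookkeeping is easy to get wrong, so here is a small Python model for replaying the expression outside AviSynth. It is only a sketch: it assumes `swapN` exchanges the top item with the one N places below it, `dupN` copies the item N places below the top, `X^` pops, and `v lo hi clip` clamps v to the range lo..hi. The names `run`, `undot_like`, `SORT` and `PICK` are invented for the example.

```python
SORT = """
x[-1,1]  x[1,1]  dup1 dup1 min swap2 max
x[0,1]   x[-1,0] dup1 dup1 min swap2 max
x[1,0]   x[0,-1] dup1 dup1 min swap2 max
x[-1,-1] x[1,-1] dup1 dup1 min swap2 max
swap7 swap1 swap3 dup1 dup1 min swap2 max
swap5 swap1 swap3 dup1 dup1 min swap2 max
swap6 swap1 swap2 dup1 dup1 min swap2 max
swap4 swap1 swap7 dup1 dup1 min swap2 max
swap3 swap1 swap2 dup1 dup1 min swap2 max
swap7 swap1 swap2 dup1 dup1 min swap2 max
swap5 swap1 swap6 dup1 dup1 min swap2 max
swap4 swap1 swap3 dup1 dup1 min swap2 max
swap6 swap1 swap3 dup1 dup1 min swap2 max
swap5 swap1 swap4 dup1 dup1 min swap2 max
swap7 swap1 swap5 dup1 dup1 min swap2 max
swap5 swap1 swap3 dup1 dup1 min swap2 max
swap3 swap1 swap4 dup1 dup1 min swap2 max
swap4 swap1 swap5 dup1 dup1 min swap2 max
swap7 swap1 swap3 dup1 dup1 min swap2 max
"""
PICK = "swap7 Z^ Z^ Z^ Z^ Z^ swap1 Z^ x swap2 clip"

NEIGHBOURS = ["x[-1,1]", "x[1,1]", "x[0,1]", "x[-1,0]",
              "x[1,0]", "x[0,-1]", "x[-1,-1]", "x[1,-1]"]

def run(expr, env, stack=None):
    """Evaluate the RPN subset used above, starting from an optional stack."""
    stack = [] if stack is None else list(stack)
    for tok in expr.split():
        if tok in ("min", "max"):
            b, a = stack.pop(), stack.pop()
            stack.append(min(a, b) if tok == "min" else max(a, b))
        elif tok == "clip":
            hi, lo, v = stack.pop(), stack.pop(), stack.pop()
            stack.append(max(lo, min(v, hi)))
        elif tok.startswith("dup"):
            stack.append(stack[-1 - int(tok[3:] or 0)])
        elif tok.startswith("swap"):
            n = int(tok[4:] or 1)
            stack[-1], stack[-1 - n] = stack[-1 - n], stack[-1]
        elif tok.endswith("^"):
            stack.pop()               # the stored value is never read back here
        else:
            stack.append(env[tok])    # pixel load
    return stack

def undot_like(centre, neighbours):
    """Return (stack after the sorting section, final clipped value)."""
    env = dict(zip(NEIGHBOURS, neighbours), x=centre)
    ranked = run(SORT, env)
    return ranked, run(PICK, env, ranked)[0]

ranked, out = undot_like(100, [8, 3, 5, 1, 7, 2, 6, 4])
print(ranked, out)
```

With the neighbours set to the numbers 1 to 8 in any order, the model reproduces the stack order quoted above in rank terms (5th, 8th, 2nd, 3rd, 4th, 1st, 6th, 7th smallest) and clamps the centre pixel between the second-smallest and second-largest neighbour.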

Last edited by wonkey_monkey; 11th July 2021 at 15:07.
Old 11th July 2021, 14:17   #95  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
PS This is about 13x faster (on an AVX2 processor) than you'd get from a non-SIMD C++ plugin - and that's with a fixed kernel size.

Last edited by wonkey_monkey; 11th July 2021 at 14:29.
Old 11th July 2021, 14:26   #96  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,156
Wow, thanks
Old 11th July 2021, 15:30   #97  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
Wow, I need time to wrap my head around that. Eager to test it out. I had no idea that variables had an overhead, or that dup was any different. I will try to implement this first on the easier modes and check performance. ex_median() is already 3.7.1+ only so formatting is fine.

About your last suggestion, I'm already doing that: removing the last unneeded operations (I didn't find much of a benefit though) or, for example, removing the final division in variances, since it doesn't change the comparison output.

I just updated ExTools with new modes. I fixed bokeh, which is now a real ring kernel. I tried to create a new mode that samples a bokeh image's values, but this would require absolute coordinates. I tried resizing to kernel size and then stacking (the StackVert/Hor limit seems to be 506), but the result wasn't what I was expecting. I will think about it in the future.

Added removegrain 17 and 18 to ex_median(), weighted median and weighted percentile, median 5x5, and kuwahara mode. I'm also trying to implement Retinex and a corner detection algo.
Old 11th July 2021, 15:45   #98  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
Quote:
About your last suggestion, I'm already doing that, removing the last unneeded operations (didn't find much of a benefit though)
I was going by the code here which has a few redundancies (e.g. the final values of E, D, and FF are not needed). Eliminating just one (I got a bit lost trying to eliminate more) did make a small but noticeable impact on speed.

Last edited by wonkey_monkey; 11th July 2021 at 15:58.
Old 11th July 2021, 16:12   #99  |  Link
Dogway
Registered User
 
Join Date: Nov 2009
Posts: 2,361
Yes, I eliminated those 2 days ago. I might be able to remove more but I have been more involved in adding new modes and bugfixing. Next versions will be mainly optimizations I guess.
Old 11th July 2021, 17:51   #100  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,497
By the way, how did you come up with that order of min/max swaps in the first place? I can't work out how (for undot2) it determines BB and GG in fewer than 22 min/max swaps.