18th June 2021, 17:10 | #61 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,367
|
A little bit slower, you mean. I asked last week, so now Str is 2.0 by default, which means it does a slight shadow lift; set it to 1.0 to disable it (previous behaviour). Also, for comparisons it's better to limit delta to 6 frames; 7 uses MDegrainN, which is slower.
The other edit I made was reverting ex_logic to mt_logic (in the Contrasharp function). I found that the "min" and "max" modes run slower on Expr(), but that might not be the case, so test this if the above doesn't solve it.
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
18th June 2021, 19:40 | #62 | Link | |
Acid fr0g
Join Date: May 2002
Location: Italy
Posts: 2,707
|
Quote:
I did not use Contrasharpening at all. So, the difference is because of Str=2.0?
__________________
@turment on Telegram |
|
18th June 2021, 20:05 | #63 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,367
|
What I meant is that when comparing with a high delta you are mostly comparing MAnalyse and little else; to easily check the impact of other things (prefilter, sharpening, etc.), a lower delta is preferable. But this is a debugging tip; I also often run real-case scripts to compare.
Yes, I can only think of Str. Another thing to compare is quality. The source seems to be 8-bit; aside from the quality gain from lifting the shadows, in my version prefiltering is performed in 16-bit (knlmeans + luma_rebuild). I would try comparing dark or low-contrast scenes.
18th June 2021, 23:35 | #64 | Link | |
Acid fr0g
Join Date: May 2002
Location: Italy
Posts: 2,707
|
Quote:
|
22nd June 2021, 20:01 | #65 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,367
|
Quick heads up. Big overhaul day with the latest ExTools changes. Most of the filters, except for Transforms Pack, have been updated, so give them a look if you wish (and report bugs).
Benchmarks for LSFmod (5 fps performance increase) and GrainFactory3mod (+3 fps) have been updated. Cheers.
1st July 2021, 18:36 | #67 | Link |
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,522
|
Hi Dogway,
Just copying/expanding on this from the Avisynth+ thread. You can get a speed increase, at least 20%, from ex_boxblur (should work with the rest as well) with the following changes:
Code:
for (px = -krnlrd, krnlrd, 1) {
    krnh = px == -krnlrd ? Format("x[{px},0] ") : Format(krnh + "x[{px},0] + ")
}
krnh = krnh + Format(" {wgh} * ")
...
for (py = -krnlrv, krnlrv, 1) {
    krnv = py == -krnlrv ? Format("x[0,{py}] ") : Format(krnv + "x[0,{py}] + ")
}
krnv = krnv + Format(" {wgh} * ")
...
str = "x[-1,1] x[0,1] + x[1,1] + x[-1,0] + x[0,0] + " \
     +"x[1,0] + x[-1,-1] + x[0,-1] + x[1,-1] + 0.111111 * "
It should also minimise rounding error. Personally I'd add a few more digits to 0.111111 as well (9 ones gives you the minimum rounding error when storing 1/9 as a float).
Last edited by wonkey_monkey; 1st July 2021 at 18:46. |
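To make the suggestion in post #67 easier to follow outside AviSynth, here is a minimal Python sketch of the same string-building idea: sum all the taps first and apply the weight once at the end, instead of weighting every tap. The function name `box_kernel_expr` and the 9-digit weight formatting are illustrative assumptions, not part of ExTools.

```python
def box_kernel_expr(radius, axis="h"):
    """Build a postfix (RPN) box-blur expression with one trailing multiply.

    Hypothetical helper mirroring the AviSynth loop above: taps are summed
    as they are pushed, and the 1/width weight is applied once at the end.
    """
    offsets = range(-radius, radius + 1)
    if axis == "h":
        taps = [f"x[{p},0]" for p in offsets]
    else:
        taps = [f"x[0,{p}]" for p in offsets]

    expr = taps[0]
    for tap in taps[1:]:
        expr += f" {tap} +"          # running sum: push tap, then add

    weight = 1.0 / (2 * radius + 1)  # 1 / kernel width
    return f"{expr} {weight:.9f} *"  # single global multiply


if __name__ == "__main__":
    print(box_kernel_expr(1))
    print(box_kernel_expr(4, axis="v"))
```

For a radius of 4 this produces nine taps, eight additions and a single multiplication, which is the "trailed sum, global mult" shape benchmarked in the following post.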
1st July 2021, 18:56 | #68 | Link | |
Registered User
Join Date: Nov 2009
Posts: 2,367
|
Yes, I did the edits and got about a 4% performance increase for ex_boxblur(1). Below are the results with a radius of 4.
Code:
# 294fps 286fps (pp sum, pp mult)
#ex_boxblur(4)
# 298fps 303fps (trailed sum, global div)
#ex_boxblur(4)
# 265fps 268fps (trailed sum, pp mult)
#ex_boxblur(4)
# 314fps 313fps (trailed sum, global mult)
ex_boxblur(4)
Quote:
The edited function:
Code:
function ex_boxblur(clip a, int "radius", int "radiusV", int "UV", bool "fulls") {

    rgb = isRGB(a)
    isy = isy(a)
    bi  = BitsPerComponent(a)

    rd  = Default(radius, 1)         # from 0 to inf
    rv  = Default(radiusV, rd)       # from 0 to inf
    UV  = Default(UV, rgb ? 3 : 1)
    fs  = Default(fulls, false)

    rd  = max(rd, 1)
    rv  = max(rv, 1)

    # Horizontal kernel: push every tap first, collect the "+" operators separately
    krnlsz = 2 * rd + 1
    krnlrd = krnlsz/2
    krnh = ""
    for (px = -krnlrd, krnlrd, 1) {
        krnh = Format(krnh + "x[{px},0] ")
        plsh = px == -krnlrd ? "" : plsh + "+ " }

    # Vertical kernel, built the same way
    krnlsz = 2 * rv + 1
    krnlrv = krnlsz/2
    krnv = ""
    for (py = -krnlrv, krnlrv, 1) {
        krnv = Format(krnv + "x[0,{py}] ")
        plsv = py == -krnlrv ? "" : plsv + "+ " }

    # Full 3x3 box in a single pass when both radii are 1
    str = "x[-1,1] x[0,1] x[1,1] x[-1,0] x[0,0] x[1,0] x[-1,-1] x[0,-1] x[1,-1] + + + + + + + + 0.111111111 *"

    fbox = rd == 1 && rv == 1
    # Weight is 1 / kernel width; the width is parenthesised so the division covers all of it
    strv = fbox ? str : krnv + plsv + string(1. / (krnlrv*2+1)) + " *"
    strh =              krnh + plsh + string(1. / (krnlrd*2+1)) + " *"

    rv == 0 ? a : \
    isy     ? Expr(a, strv     ) : \
    UV == 1 ? Expr(a, strv, "" ) : \
              Expr(a, strv, ex_UVexpr(strv, UV, bi, rgb, fs), scale_inputs="none")

    rd == 0 || fbox ? last : \
    isy     ? Expr(last, strh     ) : \
    UV == 1 ? Expr(last, strh, "" ) : \
              Expr(last, strh, ex_UVexpr(strh, UV, bi, rgb, fs), scale_inputs="none") }
Last edited by Dogway; 1st July 2021 at 19:05. |
|
1st July 2021, 19:38 | #69 | Link | |
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,522
|
Quote:
With 6 ones, the value stored is (approximately): 0.11111100018024444580078125
With 9 or more ones, it's: 0.11111111193895339965820312
Last edited by wonkey_monkey; 1st July 2021 at 21:18. |
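The two stored values quoted in post #69 can be checked with a short Python sketch, under the assumption (taken from that post) that the constant ends up as a 32-bit IEEE 754 float. `to_float32` is a helper written for this illustration; it round-trips a Python double through `struct` to get the nearest single-precision value.

```python
import struct


def to_float32(value):
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


six_ones = to_float32(0.111111)      # constant written with 6 ones
nine_ones = to_float32(0.111111111)  # constant written with 9 ones
exact = 1 / 9

if __name__ == "__main__":
    # Print enough digits to compare against the values quoted in the thread.
    print(f"6 ones: {six_ones:.26f}  error vs 1/9: {abs(six_ones - exact):.3e}")
    print(f"9 ones: {nine_ones:.26f}  error vs 1/9: {abs(nine_ones - exact):.3e}")
    # With 9 ones the literal already rounds to the same float32 as 1/9 itself.
    print(nine_ones == to_float32(exact))
```

Once the literal rounds to the same single-precision value as 1/9, adding further digits cannot change what is stored.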
|
1st July 2021, 19:49 | #70 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,367
|
Thanks for the help, I will update and rerun benchmarks to see what I get and update OP if necessary.
1st July 2021, 19:50 | #71 | Link |
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,522
|
It occurs to me that a box blur is one case where a non-SIMD variation of Expr could, in theory, work better. The ideal algorithm reads every pixel exactly twice - once when adding it to an accumulator, once when subtracting it. That will give you a near constant-time box blur. Expr has to read the full box of pixels every time.
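The accumulator idea can be sketched in Python for a single row. This is only an illustration of the running-sum technique, not how Expr or any AviSynth filter is implemented; the function names and the clamped (replicated) edge handling are assumptions made for the example.

```python
def box_blur_running(row, radius):
    """1-D box blur using a running sum: each step adds one pixel and drops one."""
    n = len(row)
    width = 2 * radius + 1

    def px(i):
        # Clamp out-of-range indices to the nearest edge pixel (assumed edge policy).
        return row[min(max(i, 0), n - 1)]

    # Prime the accumulator with the window centred on the first pixel.
    acc = sum(px(i) for i in range(-radius, radius + 1))

    out = []
    for x in range(n):
        out.append(acc / width)
        # Slide the window right: add the entering pixel, subtract the leaving one.
        acc += px(x + radius + 1) - px(x - radius)
    return out


def box_blur_naive(row, radius):
    """Reference version that re-reads the whole window for every output pixel."""
    n = len(row)
    width = 2 * radius + 1
    return [
        sum(row[min(max(i, 0), n - 1)] for i in range(x - radius, x + radius + 1)) / width
        for x in range(n)
    ]


if __name__ == "__main__":
    sample = [10, 0, 0, 10, 20, 5, 7]
    print(box_blur_running(sample, 2))
    print(box_blur_naive(sample, 2))
```

The running version does the same amount of work per pixel regardless of radius, whereas the naive version scales with the kernel width.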
On another topic, doesn't RemoveGrain(19) only average the eight neighbours, leaving out the central pixel? You should give yourself an extra 11.1% for that |
1st July 2021, 23:32 | #72 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,367
|
I did some optimizations to the kernels aside from the above tips, ran some benchmarks and wow, ex_blur(1) now runs at 96% of the speed of removegrain(12)! ex_boxblur(1) is at 91% (read below).
I also shaved 4% off ex_xxflate() and optimized some of the convolutions in ex_edge(), so they should run faster. For some reason "hprewitt" is bugged; I would swear it was working fine before. I also added Overlay_MTools() to the list.
@wonkey_monkey: I don't remember exactly, but I think I didn't like what it was doing with bigger radii. In any case, I removed the central pixel to match removegrain(19) (for parity) and kept the previous behaviour for bigger radii. Now the speed is 97% that of removegrain!
2nd July 2021, 00:15 | #73 | Link |
Registered User
Join Date: Jan 2018
Posts: 2,163
|
I think the above code can be applied here:
https://github.com/Dogway/Avisynth-S...rter.avsi#L246 |
2nd July 2021, 00:29 | #74 | Link |
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,522
|
I saw BicubicResize on that line of code. Did you know that, with default parameters, BicubicResize is actually a little bit blurry, even if you do a "null" resize with just an infinitesimal shift of pixels? Setting b = 0, c = 0.5 fixes it and is more mathematically correct.
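A small Python sketch can show why. It assumes BicubicResize follows the standard Mitchell-Netravali cubic and that its defaults are b = 1/3, c = 1/3; it evaluates the kernel at the integer offsets a null resize would sample. The function names are made up for the illustration.

```python
def bicubic_weight(x, b, c):
    """Mitchell-Netravali cubic kernel k(x) for parameters (b, c)."""
    x = abs(x)
    if x < 1:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6
    if x < 2:
        return ((-b - 6 * c) * x**3
                + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6
    return 0.0


def null_resize_taps(b, c):
    """Weights applied to the left neighbour, the pixel itself, and the right neighbour."""
    return [bicubic_weight(d, b, c) for d in (-1, 0, 1)]


if __name__ == "__main__":
    print("b=1/3, c=1/3:", null_resize_taps(1 / 3, 1 / 3))
    print("b=0,   c=0.5:", null_resize_taps(0.0, 0.5))
```

With b = 1/3 the centre weight is (6 - 2b)/6, which is below 1, and each neighbour contributes b/6, so even an unshifted resample mixes in adjacent pixels. With b = 0 the kernel passes through 1 at the centre and 0 at the neighbours, so samples are reproduced unchanged.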
Last edited by wonkey_monkey; 2nd July 2021 at 00:38. |
2nd July 2021, 23:50 | #76 | Link |
Registered User
Join Date: Jan 2018
Posts: 2,163
|
About blur(.6)
https://forum.doom9.org/showthread.p...77#post1946677 |
3rd July 2021, 09:48 | #77 | Link |
Registered User
Join Date: Feb 2021
Posts: 125
|
Concerning the ex_merge function:
The formula (x*(range_max-z)+y*z)/range_max can be simplified to x-(x-y)*(z/range_max). This will increase the speed. Change
Code:
str = "x range_max z - * y z * + range_max /"
to
Code:
str = "x x y - z range_max / * -"
Last edited by Arx1meD; 3rd July 2021 at 10:00. |
3rd July 2021, 10:54 | #78 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,367
|
Wow, thank you. I was giving it some thought yesterday but couldn't come up with any idea, except for the usual reciprocal.
Code:
"x x y - z 1 range_max / * * -"
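As a sanity check on the algebra, the three postfix strings from these posts can be run through a tiny evaluator. This is a minimal Python sketch; `eval_rpn` is a toy written for this illustration and handles only the four arithmetic operators, not the real Expr syntax. The value 255 for `range_max` is an assumed 8-bit example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def eval_rpn(expr, **variables):
    """Evaluate a space-separated postfix expression with named variables."""
    stack = []
    for token in expr.split():
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))   # numeric literal
    (result,) = stack                     # exactly one value must remain
    return result


ORIGINAL = "x range_max z - * y z * + range_max /"
SIMPLIFIED = "x x y - z range_max / * -"
RECIPROCAL = "x x y - z 1 range_max / * * -"

if __name__ == "__main__":
    pixel = dict(x=200, y=50, z=64, range_max=255)
    for name, expr in (("original", ORIGINAL), ("simplified", SIMPLIFIED), ("reciprocal", RECIPROCAL)):
        print(name, eval_rpn(expr, **pixel))
```

All three compute x*(1 - z/range_max) + y*(z/range_max); the rewrites only reduce the number of multiplications and replace the per-pixel division by a multiplication with a constant.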
Tags |
avisynth, dogway, filters, hbd, packs |