
View Full Version : Avisynth+ plugin modernization efforts



jpsdr
5th September 2015, 10:59
Ported version of AutoYUY2, based on neuron2's plugin, updated to API v2.6 and with an x64 build, so it does not work with AviSynth 2.5.8.
Sources are here (https://github.com/jpsdr/AutoYUY2).
Binaries are here (https://github.com/jpsdr/AutoYUY2/releases/download/20150905/AutoYUY2_20150905.7z).

New version.

jpsdr
7th June 2016, 20:01
Ported version of AutoYUY2, based on neuron2's plugin, updated to API v2.6 and with an x64 build, so it does not work with AviSynth 2.5.8.
Sources are here (https://github.com/jpsdr/AutoYUY2).
Binaries are here (https://github.com/jpsdr/AutoYUY2/releases/download/20160607/AutoYUY2_20160607.7z).

New version, adds multi-threading.

jpsdr
8th June 2016, 23:13
Ported version of AutoYUY2, based on neuron2's plugin, updated to API v2.6 and with an x64 build, so it does not work with AviSynth 2.5.8.
Sources are here (https://github.com/jpsdr/AutoYUY2).
Binaries are here (https://github.com/jpsdr/AutoYUY2/releases/download/20160608/AutoYUY2_20160608.7z).

New version, some fixes.

pinterf
27th November 2016, 10:30
RgTools v0.93 released (x86/x64)

https://github.com/pinterf/RgTools/releases/tag/0.93


Same functionality as 0.92.1
New bit depths 10, 12, 14, 16 bits and 32 bit float
New formats: Planar RGB, Planar RGBA and YUVA support
Use aligned loads if possible, internally
10+ bit formats require SSE4 for fast processing
XP Support
Built with VS2015, requires Visual Studio Redistributable Update 3 or newer
Of course it works with classic Avisynth for 8 bit videos.


todo: AVX2

Big thanks to tp7 for the nice code.

tormento
27th November 2016, 10:51
RgTools v0.93 released (x86/x64)
:thanks:
MaskTools is the next I hope :)

real.finder
24th January 2017, 08:51
RgTools v0.93 released (x86/x64)


Just tried it with a complex script and I get a crash:

Fault Module Name: RgTools.dll
Fault Module Version: 0.93.0.0
Fault Module Timestamp: 583a9af0
Exception Code: c0000005
Exception Offset: 000300d3
Locale ID: 1033
Additional Information 1: 0a9e
Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
Additional Information 3: 0a9e
Additional Information 4: 0a9e372d3b4ad19135b953a78882e789


0.92.1 works fine.

And about the todo: there are other RemoveGrain modes (beyond 24) that some scripts use, and there are RemoveGrainHD and RemoveGrainT too. Or are they hard to add, as tp7 said?

burfadel
24th January 2017, 08:58
What CPU and OS are you using?

pinterf
24th January 2017, 09:02
Do you know which filter, mode and frame dimensions you were using?

real.finder
24th January 2017, 09:05
What CPU and OS are you using?

i7 X 980 with Windows Server 2008 R2

real.finder
24th January 2017, 09:07
Do you know which filter, mode and frame dimensions you were using?

I will try to make a simple sample then, but not now, maybe a few hours later.

pinterf
24th January 2017, 09:13
I will try to make a simple sample then, but not now, maybe a few hours later.
And I wonder whether you are using an unaligned crop, or something else that can result in unaligned (non-mod16) frames somewhere?

real.finder
24th January 2017, 17:46
And I wonder whether you are using an unaligned crop

Yes, that's it: with align=true in Crop it works fine. But why was the old RgTools fine then?

And you didn't say anything about this:

And about the todo: there are other RemoveGrain modes (beyond 24) that some scripts use, and there are RemoveGrainHD and RemoveGrainT too. Or are they hard to add, as tp7 said?

pinterf
25th January 2017, 12:35
RgTools 0.92 used only unaligned loads in RemoveGrain, so it did not need to check alignment. My version uses aligned loads where possible (six unaligned and three aligned loads per 3x3 pixel block). I wanted to get closer to the speed of tp7's version with my VS2015 build, but could not reach it: the 32-bit DLL is a bit slower, while the 64-bit DLL is faster on my machine (i7-3770).

#mode      x86 0.93   x86 0.92   x64 0.93   x64 0.92
#mode=2    32.7       37.1       40.3       39.2
#mode=4    32.7       36.6       43.3       41.0
#mode=12   74.0       73.7       79.2       78.1
#mode=19   79.5       82.7       84.6       85.1
#mode=20   33.7       34.6       37.8       37.1
#mode=21   35.5       43.1       46.8       43.4


In Clense, for example, there is an explicit check against this:
if (!is_16byte_aligned(pSrc) || !is_16byte_aligned(pRef1) || !is_16byte_aligned(pRef2)) {
    env->ThrowError("Invalid memory alignment. Used unaligned crop?"); //omg I feel so dumb
}
But I forgot to apply this check in RemoveGrain.
(One note: an unaligned crop may seem fast, but it causes a speed loss in the next filter, because that filter is forced onto the slow C path.)
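
To illustrate the alignment issue, here is a minimal sketch (not the RgTools source). is_16byte_aligned is modeled on the call quoted above, but its body is an assumption; count_aligned_neighbours is a hypothetical helper introduced only for illustration. It assumes the row pitch is a multiple of 16 bytes. With an aligned centre pointer, only the three addresses in the centre column of a 3x3 block are 16-byte aligned, which matches the three aligned and six unaligned loads mentioned above.

```cpp
#include <cstddef>
#include <cstdint>

// Modeled on the is_16byte_aligned() call quoted above; the body is an assumption.
inline bool is_16byte_aligned(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical helper: counts how many of the nine addresses in the 3x3
// neighbourhood around `center` are 16-byte aligned.
inline int count_aligned_neighbours(const std::uint8_t* center, std::ptrdiff_t pitch) {
    int aligned = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            if (is_16byte_aligned(center + dy * pitch + dx)) {
                ++aligned;
            }
        }
    }
    return aligned;
}
```

A crop that shifts the row start by a single byte makes is_16byte_aligned(center) return false, which is the situation the ThrowError check guards against.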

I had a look at the source code of modes 25-30, well, it would take time to reverse engineer the assembly code, and I'm not surprised that tp7 did not update rgtools.

Reel.Deel
25th January 2017, 15:46
I had a look at the source code of modes 25-30, well, it would take time to reverse engineer the assembly code, and I'm not surprised that tp7 did not update rgtools.

Considering the original author never documented these modes, I doubt anyone uses them.

On a side note, RgTools is based on RemoveGrain v1.0 which means that Clense is missing the reduceflicker parameter which breaks compatibility with some scripts that used the more popular RemoveGrain v0.9. Colours provided me with a script workaround but it's slower than the plugin and can only be used once per script. If it's not too difficult can you please add the reduceflicker parameter to Clense?

Edit: RgTools' Clense is also missing parameters that were added in RemoveGrain V1.0, see here: http://videoprocessing.fr.yuku.com/reply/638/Can-use-quantile-like-vertical-median-filter#reply-638

LoadPlugin("D:\RemoveGrainSSE2.dll")
LoadPlugin("D:\RemoveGrainTSSE2.dll")

blankclip( pixel_type = "YV12" )

clense( reduceflicker = false )

==> " clense does not have a named argument 'reduceflicker' "


Per default, clense always uses *not* the original previous frame, but the already clense'd previous frame (recursive operation). This kind of operation is

a) not a "true" temporal median

b) perfectly un-suited for e.g. MVTools-scripts, Interleave/SelectEvery solutions, etc.

With all the older RemoveGrain versions, the recursive operation could be simply switched off with "reduceflicker=false".

In the actual v1.0, this parameter does not exist anymore. Instead, you're faced with "previous/next" clip parameters (easy), with "recursion slots" (harder), and whatnotelse.



RemoveGrain-1.0.rar SSE3 works fine with the spline36 version of LSF here :)

Yeah, but it breaks compatibility with most (if not all) scripts that make use of the temporal filters (clense, temporalrepair). For these you need the new package RemoveGrainT. But then, it still breaks those scripts, because the framework of those filters was changed majorly.

That's the reason why I don't blindly recommend the actual RemoveGrain v1.0 package. If you tell "that's the version to use", then lots of older scripts will not work anymore.

pinterf
25th January 2017, 16:37
Clense works 'recursively' when it can: one of the inputs of the Nth Clense call is the previously clensed (N-1)th frame. That frame (named lframe) and its frame number (named lnr) are saved inside the class after each Clense call. This can only work properly if Clense receives its frame requests strictly sequentially.
If that condition is not fulfilled, it falls back to fetching the (N-1)th frame from the child clip, which is the same behaviour as with reduceflicker=false.

Clense checks whether lnr==n-1 (where lnr is the number of the last frame the class clensed, and n is the frame number currently requested in GetFrame()). Thus when Clense gets its requests out of order, reduceflicker is ineffective.

So I suppose Clense can hardly work properly multithreaded when reduceflicker==true. And even if the filter is set to "serialized" in MT mode, nothing ensures that it gets the requests in linear order.

So I can probably add the parameter but cannot ensure that it works. Or I can add 'reduceflicker' just for compatibility (like tp7 added 'planar') and not use it.
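
The sequential-request check described above can be sketched roughly as follows. This is a hypothetical stand-in, not the RgTools source: only the name lnr comes from the post, while the struct, its method names and the -2 sentinel are assumptions made for illustration.

```cpp
// Hypothetical stand-in for the bookkeeping described above; not the RgTools source.
struct ClenseRecursionState {
    int lnr = -2;  // number of the last frame this instance clensed; -2 means "none yet"

    // True when frame n directly follows the last clensed frame, so the cached
    // clensed frame (lframe in the post) can serve as the previous input.
    bool can_use_clensed_previous(int n) const { return lnr == n - 1; }

    // Record that frame n has just been produced.
    void remember(int n) { lnr = n; }
};
```

When can_use_clensed_previous returns false (a seek, or out-of-order requests under multithreading), the filter would fall back to the plain previous frame of the child clip, which is the reduceflicker=false behaviour.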

real.finder
25th January 2017, 18:56
Considering the original author never documented these modes, I doubt anyone uses them.


softsharp (web.archive.org/web/20160608111758/http://leon1789.perso.sfr.fr/avisynth/SoftSharpen-8.8.zip), for instance, uses mode 27.

Modes 26 and 27 are described here: http://videoprocessing.fr.yuku.com/sreply/143/RemoveGrain-10-prerelease#.WIj5HGX-vIU

real.finder
25th January 2017, 19:12
tp7 was using VS2012. Does that mean the newer C++ compiler is slower? Back then he said on IRC that x64 can be slower, so what about building x86 with VS2012 or VS2010 for speed and x64 with VS2015?

DJATOM
25th January 2017, 19:24
I guess I found a bug in Average. Look at this comparison: http://screenshotcomparison.com/comparison/198383. It seems Average produces lower values than RAverageW.
After some investigation I found that the problem is here: http://i.imgur.com/VhmTOVN.png
Can someone fix it?

real.finder
25th January 2017, 19:58
about reduceflicker, it's in 0.9 or 1.0pre? http://videoprocessing.fr.yuku.com/reply/40/RemoveGrain-10-prerelease#reply-40

I think it's the "Beta" release | Recommended here http://avisynth.nl/index.php/RemoveGrain

real.finder
25th January 2017, 20:28
Guess I found a bug in Average. Look at this comparison: http://screenshotcomparison.com/comparison/198383. It seems Average produce lower values than RAverageW.
After some investigation I found that problem is here: http://i.imgur.com/VhmTOVN.png
Can someone fix it?

it's same in old Average too? http://avisynth.nl/index.php/Average#Average_for_AviSynth_2.5

DJATOM
25th January 2017, 20:44
it's same in old Average too? http://avisynth.nl/index.php/Average#Average_for_AviSynth_2.5
http://diff.pics/RL4g7UYPkxYl/1

pinterf
25th January 2017, 22:50
tp7 was using VS2012. Does that mean the newer C++ compiler is slower? Back then he said on IRC that x64 can be slower, so what about building x86 with VS2012 or VS2010 for speed and x64 with VS2015?
I can try it.
But I suspect the Intel compiler: in the vcxproj (https://github.com/tp7/RgTools/blob/master/RgTools/RgTools.vcxproj) file this setting was active: <PlatformToolset> Intel C++ Compiler XE 14.0

I spent a whole day optimizing and trying to understand why mode 21 is that much slower. I tried VS2017 RC but it is no better. The Intel compiler must handle XMM register allocation and instruction reordering brilliantly. RemoveGrain mode 21, which shows the largest difference on x86, wants to use just one more XMM register than the 8 that are available on x86; I think Intel can solve this without extra memory accesses. I really wonder what code it generated. On x64 there are twice as many XMM registers, and the code is indeed much faster.

pinterf
25th January 2017, 23:15
Guess I found a bug in Average. Look at this comparison: http://screenshotcomparison.com/comparison/198383. It seems Average produce lower values than RAverageW.
After some investigation I found that problem is here: http://i.imgur.com/VhmTOVN.png
Can someone fix it?
What's wrong with that section? (I was planning porting Average for high bitdepth, it is not a difficult filter)

DJATOM
26th January 2017, 00:40
What's wrong with that section? (I was planning porting Average for high bitdepth, it is not a difficult filter)

First I tried to modify plugin's code to use "weighted_average_c". It works fine.
Next I tried to use "weighted_average_sse2" for processing. Result is identical to RAverageW, fine.
Next I commented out
for (int i = 0; i < frames_count-1; i+=2) {
    __m128i src = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(src_pointers[i]+x));
    __m128i src2 = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(src_pointers[i+1]+x));
    __m128i weight = _mm_set1_epi32(*reinterpret_cast<int*>(int_weights + i));

    src = _mm_unpacklo_epi8(src, zero);
    src2 = _mm_unpacklo_epi8(src2, zero);
    __m128i src_lo = _mm_unpacklo_epi16(src, src2);
    __m128i src_hi = _mm_unpackhi_epi16(src, src2);

    __m128i weighted_lo = _mm_madd_epi16(src_lo, weight);
    __m128i weighted_hi = _mm_madd_epi16(src_hi, weight);

    acc_lo = _mm_add_epi32(acc_lo, weighted_lo);
    acc_hi = _mm_add_epi32(acc_hi, weighted_hi);
}
in "weighted_average_int_sse2" and got green frames. So I assume the section I pointed to is the culprit, since it is what actually does the processing in my case. If it isn't, okay, but the problem still remains.
I decided to live with "weighted_average_sse2" for now, since it works for me, but wouldn't it be better to have that bug fixed? ;)

pinterf
26th January 2017, 06:51
What weights were you using, and how many frames? Maybe there is a difference in rounding.

DJATOM
26th January 2017, 08:38
Average(resize1(x,y), 0.3, resize2(x,y), 0.7)
Filter works as expected if weights are 0.5 for both clips.

pinterf
26th January 2017, 09:32
If one of the weights is over 1.0, the internal calculation uses float instead of integer arithmetic. The C path also uses float. But the default averaging method uses integer arithmetic, with weights scaled to 14 bits of precision, and that is where the difference comes from. The internal integer weights are calculated as (1 << 14) * weights[i], where weights[i] are 0.3 and 0.7 in your case. In the Merge filter in the Avisynth core, the integer arithmetic rounds after the multiplication; here that rounding is obviously missing.

You can try replacing this
__m128i weighted_lo = _mm_madd_epi16(src_lo, weight);
__m128i weighted_hi = _mm_madd_epi16(src_hi, weight);

acc_lo = _mm_add_epi32(acc_lo, weighted_lo);
acc_hi = _mm_add_epi32(acc_hi, weighted_hi);


with this (not tried)

__m128i weighted_lo = _mm_madd_epi16(src_lo, weight);
__m128i weighted_hi = _mm_madd_epi16(src_hi, weight);

__m128i round_mask = _mm_set1_epi32(0x2000);
weighted_lo = _mm_add_epi32(weighted_lo, round_mask);
weighted_hi = _mm_add_epi32(weighted_hi, round_mask);

acc_lo = _mm_add_epi32(acc_lo, weighted_lo);
acc_hi = _mm_add_epi32(acc_hi, weighted_hi);
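
A scalar sketch of the same arithmetic shows why the missing rounding gives lower values. This is an illustration, not the plugin's code: the function names are hypothetical, and converting the weights to 14-bit integers by truncation is an assumption. With weights 0.3 and 0.7 the two integer weights sum to 16383 rather than 16384, so two pixels of value 255 average to 254 unless half of the scale (0x2000) is added before the shift.

```cpp
#include <cstdint>

// Assumption: weights are converted to 14-bit fixed point by truncation.
inline int to_fixed14(double w) {
    return static_cast<int>((1 << 14) * w);
}

// Two-clip weighted average without rounding (the behaviour reported as the bug).
inline int average_truncated(int a, int b, double wa, double wb) {
    const std::int32_t acc = a * to_fixed14(wa) + b * to_fixed14(wb);
    return acc >> 14;
}

// Same calculation, with half of the scale (0x2000) added once before the shift.
inline int average_rounded(int a, int b, double wa, double wb) {
    const std::int32_t acc = a * to_fixed14(wa) + b * to_fixed14(wb) + 0x2000;
    return acc >> 14;
}
```

With weights of 0.5 and 0.5 the integer weights sum to exactly 16384, so both variants agree, which fits the observation that equal weights worked as expected.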

DJATOM
26th January 2017, 11:38
Yeah, output now is identical to RAverageW. Thank you.

pinterf
26th January 2017, 12:00
Yeah, output now is identical to RAverageW. Thank you.
O.K., I forked the project and will fix it properly (an even number of frames needs another addition, plus high bit depth support).

Reel.Deel
26th January 2017, 15:27
I searched around to see if people actually use reduceflicker=true and I only found 2 instances so I guess it's safe just to add a dummy parameter to keep compatibility. To be truly backward compatible, it needs to be before the previous and next parameters though. Clense is also missing the planar parameter.

Here's the script workaround for Clense(reduceflicker=true) (courtesy of Colours). As it was mentioned earlier, this script can only be used once per script and probably does not work with MT.

function clense3(clip c)
{
# I have never had a need to use global variables before so idk if this is correct
global clense_reference = c.trim(0,-1)
c
scriptclip("""
i = current_frame
a = i > 0 ? c.trim(0,length=i-1) + clense_reference + c.trim(i,end="""+string(c.framecount()-1)+""") : c
a.clense()
global clense_reference = last.trim(i,length=1)
return last
""")
}

On another topic, the default for the original VerticalCleaner is 2, in RgTools the default is 1.

Can you make RgTools self-register the MT mode?

softsharp (web.archive.org/web/20160608111758/http://leon1789.perso.sfr.fr/avisynth/SoftSharpen-8.8.zip) for instance use 27

here 26 and 27 http://videoprocessing.fr.yuku.com/sreply/143/RemoveGrain-10-prerelease#.WIj5HGX-vIU

So a total of one script uses mode 27. :rolleyes:
Even though I added all those links to the wiki, I completely forgot he documented some of those modes. Regardless, I don't think any of them are really useful, which is why neither DitherTools nor RgTools included them.

about reduceflicker, it's in 0.9 or 1.0pre? http://videoprocessing.fr.yuku.com/reply/40/RemoveGrain-10-prerelease#reply-40

I think it's the "Beta" release | Recommended here http://avisynth.nl/index.php/RemoveGrain

The reduceflicker parameter is included in RemoveGrain v0.9 and v1.0beta. v1.0pre is where Kassandro split the spatial and temporal filters into 2 different packages and broke backward compatibility with some of the filters (mainly the temporal ones). This is why the most popular and most compatible version is RemoveGrain v1.0beta.

Earlier I mentioned RgTools is based on RemoveGrain v1.0; I should have been clearer. RgTools is based on RemoveGrain v1.0pre, and the Clense family is based on RemoveGrainT v1.0pre.

pinterf
26th January 2017, 16:05
New build.

Average v0.93 (https://github.com/pinterf/Average/releases/)
v0.93 (20170126 - pinterf)
Fix: rounding of intermediate results in fast integer average of 8 bit clips
Mod: faster results for two or three clips (8 bit)
New: Support for Avisynth+ color spaces: 10-16 bit and float YUV(A)/Planar RGB(A), RGB48 and RGB64
10+ bits are calculated in float precision internally.
New: auto register as NICE_FILTER for Avisynth+
New: add version resource
Info: built with VS2015 Update 3, may require Visual Studio 2015 Redistributable update 3


please test it

pinterf
26th January 2017, 16:17
@Reel.Deel:
This is already done, will appear in next release: "Can you make RgTools self-register the MT mode?"
Todo: add dummy reduceflicker and planar parameters for Clense for compatibility and easy transition of old scripts
Do you need unaligned crop support? (Done, but it makes the DLL much larger, since each version of each mode of each filter is force-inlined for speed reasons.)

real.finder
26th January 2017, 18:16
Todo: add dummy reduceflicker and planar parameters for Clense for compatibility and easy transition of old scripts
Do you need unaligned crop support? (Done, but makes the DLL more huge - each version of each mode of each filter is forceinlined for speed reasons)

I think a real reduceflicker would be better, but with reduceflicker=false by default.

Clense has cache parameters too, but not in RgTools, so a dummy cache parameter is needed as well.

Unaligned crop support is not that important; it's better to check against it with ThrowError("Invalid memory alignment. Used unaligned crop?"). Anyway, DLL size does not matter :) so do what you think is best.

And it would be great if you added RemoveGrain modes 25-27 (28-30 are not needed at all because they are not documented anywhere).

pinterf
26th January 2017, 22:18
Until I gain strength for modes 25-27 (which could be days or weeks), here is a
new RgTools 0.94 (https://github.com/pinterf/RgTools/releases) build.

RgTools 0.94
Clense: new parameter (from v0.9): bool reduceflicker (default false)
Clense: dummy compatibility parameters: bool planar, int cache
Autoregister filter MT modes as NICE_FILTER for Avisynth+
(except for Clense: when reduceflicker is true, MULTI_INSTANCE MT mode is reported)
Alignment check in Repair and RemoveGrain (anti-unaligned crop measures)

Play with it, especially with that Clense reduceflicker thingy. You have to test it, anything can happen. Try it in multithreading mode and report.

real.finder
27th January 2017, 00:18
RgTools 0.94 (https://github.com/pinterf/RgTools/releases) build.


:thanks:

seems you forget this little thing


On another topic, the default for the original VerticalCleaner is 2, in RgTools the default is 1.


not so important, so you can include it in the next release

pinterf
27th January 2017, 16:13
And here is a fresh
Average v0.94 (https://github.com/pinterf/Average/releases)

v0.94 (20170127)
Fix: fix the fix: rounding of intermediate results was ok for two clips
New: AVX for 10-16bit (+20-30%) and float (+50-60%) compared to v0.93
AVX for 8 bit non-integer path (+20% gain), e.g. when one of the weights is over 1.0
Note 1: AVX needs 32 byte frame alignment (Avisynth+ default)
Note 2: AVX CPU flag is reported by recent Avisynth+ version
Note 3: AVX is reported only on an appropriate OS (from Windows 7 SP1 on)

v0.93 (20170126 - pinterf)
Fix: rounding of intermediate results in fast integer average of 8 bit clips
Mod: faster results for two or three clips
New: Support for Avisynth+ color spaces: 10-16 bit and float YUV(A)/Planar RGB(A), RGB48 and RGB64
10+ bits are calculated in float precision internally.
New: auto register as NICE_FILTER for Avisynth+
New: add version resource
Info: built with VS2015 Update 3, may require Visual Studio 2015 Redistributable update 3

fAy01
27th January 2017, 22:30
Build v0.93 worked fine but the latest build doesn't work with edgecleaner.
http://avisynth.nl/index.php/EdgeCleaner

real.finder
28th January 2017, 01:47
Build v0.93 worked fine but the latest build doesn't work with edgecleaner.
http://avisynth.nl/index.php/EdgeCleaner

What error do you get? I just tested it and it seems to work.

fAy01
28th January 2017, 05:09
What error do you get? I just tested it and it seems to work.

It seems there was a glitch with my pc. My bad.

yup
31st January 2017, 16:09
I am trying to shift from YUY2 to YV16, but it is not easy.
For example, masktools does not support
mt_merge(,,,luma=true)
for YV16.
Also, not all plugins that support YUY2 work with YV16.
yup.

Dreamland
26th February 2017, 22:28
Doesn't work with EdgeCleaner (with some crop parameters)

RGTools v 0.94 (with error)


https://t8.pixhost.org/thumbs/1485/37682966_appunti01.jpg (https://pixhost.org/show/1485/37682966_appunti01.jpg)


(Identical crop parameters) works with RGTools v0.92.1

:thanks:

pinterf
27th February 2017, 22:49
Doesn't work with EdgeCleaner (with some crop parameters)

RGTools v 0.94 (with error)

(Identical crop parameters) works with RGTools v0.92.1

:thanks:
RgTools 0.93 and 0.94 do not support unaligned frames, which result from an unaligned crop, for example. In 0.93 Repair would raise an exception, but in 0.94 it should present an error message instead; at least that was my intention.

Unfortunately, in Repair only one of the two clips is checked for alignment errors, so you got an access violation instead of the error message. (Alignment is required in this build.)

Please modify your script to use Crop with align=true.

(I would recommend that everyone stop using unaligned crops, because the speed gain is negligible, and the next filter after the crop would either fail or fall back to an unoptimized, ultra-slow C path instead of optimized SIMD code.)

Thanks for the report, anyway.

ajp_anton
28th February 2017, 14:43
So shouldn't align=true be the default in Avisynth then?

pinterf
28th February 2017, 15:02
Avisynth+ has only aligned crop. If the crop does not break the alignment rules (32 bytes in AVS+), the faster SubFrame mode is used automatically; otherwise the cropped area is copied to a new frame.
The align parameter in Crop exists only for backward compatibility and has no effect.
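
As a rough illustration of that rule (this is not Avisynth+ code), the decision could be sketched as below. The helper name is hypothetical, and the sketch assumes that the uncropped row start and the pitch are already 32-byte aligned; it also ignores chroma subsampling.

```cpp
#include <cstddef>

// Frame alignment stated in the post for Avisynth+ (bytes).
constexpr std::size_t kFrameAlignment = 32;

// Hypothetical helper: true if cropping `left_pixels` pixels of
// `bytes_per_pixel` bytes each keeps every row start on a 32-byte boundary.
// Assumes the uncropped row start and the pitch are already 32-byte aligned.
constexpr bool crop_keeps_alignment(std::size_t left_pixels, std::size_t bytes_per_pixel) {
    return (left_pixels * bytes_per_pixel) % kFrameAlignment == 0;
}
```

Under these assumptions a left crop of 32 pixels at 8 bits, or 16 pixels at 16 bits, could take the SubFrame path, while a left crop of 2 pixels would require a copy.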

pinterf
14th March 2017, 18:08
New RgTools 0.95 (https://github.com/pinterf/RgTools/releases)

RgTools 0.95 (20170314)
Fix: RemoveGrain Mode 20: overflow at 14 and 16 bit depths in SSE4 (stripes)
"Repair": error on unaligned frames (unaligned crop) instead of access violation error

tormento
10th May 2017, 06:50
MedianBlur is used for MinBlur, that is used inside SMDegrainMOD, now available in 16 bit flavour.

I made some tests, and here (https://forum.doom9.org/showthread.php?p=1806431#post1806431) MysteryX suggested to me that the problems that arose could be caused by MedianBlur not being high-bit-depth aware.

@pinterf (others too): can we get a 16 bit build to play with? With MinBlur functionality too perhaps? ;)

real.finder
10th May 2017, 12:26
MedianBlur is not always used, if you don't use prefilter mode 0 or 1 or 2 in SMDegrain and don't use contrasharp then no MinBlur is used

tormento
10th May 2017, 12:47
MedianBlur is not always used, if you don't use prefilter mode 0 or 1 or 2 in SMDegrain and don't use contrasharp then no MinBlur is used



OK, I had my suspicions. So I have no idea why the results for speed and size are so different [emoji24]

pinterf
10th May 2017, 13:33
For speed: if a script is using mt_lutxy, it cannot always use fast lookup tables for memory reasons.
Above 12 bits mt_lutxy calculates the expression realtime, pixel-by-pixel, which is slooooow (unlike Expr in VapourSynth).
For the specific bit depths at which realtime expression evaluation kicks in, see masktools2 readme ("feature matrix" section) or the wiki.
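
A back-of-the-envelope sketch of the memory argument: a full two-input table needs one entry per (x, y) pair, i.e. 2^(2*bits) entries. The function names are hypothetical and the entry size (1 byte at 8 bits, 2 bytes above) is an assumption, not taken from the masktools2 source.

```cpp
#include <cstdint>

// Number of entries in a full two-input lookup table: one per (x, y) pair.
constexpr std::uint64_t lutxy_entries(unsigned bits) {
    return std::uint64_t{1} << (2 * bits);
}

// Table size in bytes. Assumption: 1 byte per entry at 8 bits, 2 bytes above.
constexpr std::uint64_t lutxy_bytes(unsigned bits) {
    return lutxy_entries(bits) * (bits > 8 ? 2u : 1u);
}
```

Under these assumptions the table takes 64 KiB at 8 bits, 32 MiB at 12 bits and 8 GiB at 16 bits, which makes clear why a precomputed table stops being practical at the higher bit depths.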

MysteryX
10th May 2017, 15:03
If we call SMDegrain with default parameters (no prefilter, no contrasharp), performance is fine.

8-bit

FPS (min | max | average): 2.367 | 137556 | 56.36
Memory usage (phys | virt): 115 | 113 MiB
Thread count: 29
CPU usage (average): 62%


16-bit

FPS (min | max | average): 1.745 | 155894 | 41.05
Memory usage (phys | virt): 120 | 117 MiB
Thread count: 29
CPU usage (average): 62%


In practice, we generally want to use a prefilter and contrasharp. MedianBlur causes problems.