30th January 2019, 20:57 | #41 | Link |
Registered User
Join Date: Dec 2006
Location: Germany
Posts: 91
|
Single-Threaded-Script:
-----------------------
DGSourceIM("clip.dgi", engine=1)
Trim(515, 4291)
Crop(0, 0, -0, -8)
FFT3DFilter(sigma=1.0, beta=1.0, bw=32, bh=32, sharpen=0.16, scutoff=0.27, plane=0, bt=3)
return last

Multi-Threaded-Script:
----------------------
DGSourceIM("clip.dgi", engine=1)
RequestLinear(rlim=50, clim=50) # tested with and without; faster with it
Trim(515, 4291)
Crop(0, 0, -0, -8)
FFT3DFilter(sigma=1.0, beta=1.0, bw=32, bh=32, sharpen=0.16, scutoff=0.27, plane=0, bt=3)
Prefetch(2) # tested with 1,2,3,4,6
return last

And I am using the "mtmodes.avsi" from here: http://publishwith.me/ep/pad/view/ro.rDkwcdWn4k9/latest

Results: AVSMeter 2.8.9 (x64):
------------------------------
1) ST: 44.18 fps (CPU usage: 25%)
2) MT(2): 58.98 fps (CPU usage: 50%) # with RequestLinear(50,50)
3) MT(2): 57.73 fps (CPU usage: 50%) # with RequestLinear(100,100)
4) MT(3): 55.90 fps (CPU usage: 74%) # with RequestLinear(50,50)
5) MT(3): 60.83 fps (CPU usage: 74%) # with RequestLinear(100,100)
6) MT(4): 57.42 fps (CPU usage: 95%) # with RequestLinear(100,100)

Results: Simple x264/x265 Launcher (64-Bit) 2.89.1138:
------------------------------------------------------
1) 22.49 fps
2) 21.51 fps
5) 20.53 fps

The corresponding x264.exe line:
--------------------------------
--output-depth 8 --crf 18.0 --preset medium --tune film --trellis 2 --direct auto --me umh --partitions all --vbv-maxrate 24000 --vbv-bufsize 30000 --b-adapt 2 --bframes 3 --merange 16 --ref 3 --keyint 240 --subme 10 --aq-mode 1 --sar 1:1 --rc-lookahead 40 --output "clip.mkv" --frames 3777 --demuxer y4m --stdin y4m -

And sometimes the MT job crashes within Simple Launcher. To be up to date, I updated Simple Launcher a few hours ago, and with this new version the fps dropped too (from 22.67 fps to 22.49 fps).
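To put the AVSMeter numbers above in perspective, here is a minimal sketch in C that computes speedup and parallel efficiency relative to the single-threaded run. The function names are made up for illustration, and "efficiency" is defined here simply as speedup divided by the Prefetch() thread count, which is an assumption and not something AVSMeter reports.

```c
#include <stdio.h>

/* Speedup of a multi-threaded run over the single-threaded baseline. */
double speedup(double mt_fps, double st_fps)
{
    return mt_fps / st_fps;
}

/* Parallel efficiency (assumed definition): speedup divided by the
   Prefetch() thread count. 1.0 would mean perfect scaling. */
double efficiency(double mt_fps, double st_fps, int threads)
{
    return speedup(mt_fps, st_fps) / (double)threads;
}

/* Print one summary row for a measured run. */
void print_run(int threads, double mt_fps, double st_fps)
{
    printf("Prefetch(%d): %.2f fps, speedup %.2fx, efficiency %.0f%%\n",
           threads, mt_fps, speedup(mt_fps, st_fps),
           100.0 * efficiency(mt_fps, st_fps, threads));
}
```

For result 2), print_run(2, 58.98, 44.18) reports a speedup of roughly 1.33x. For result 6), four threads still only reach about 1.30x, so each additional thread adds very little with this script.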
I let AVS+ autoload all plugins and scripts; I put everything I need into the corresponding "plugins64" folder:
addgrain.avs
AddGrainC.dll
avstp.dll
CheckTopFirst.avsi
colormatrix.dll
CompTest.avsi
DGDecodeIM.dll
DGDecodeNV.dll
dither.avsi
dither.dll
fft3dfilter.dll
FFT3dGPU.dll
fft3dgpu.hlsl
libmfxsw64.dll
masktools2.dll
mt_xxpand_multi.avsi
mtmodes-rev.850.avsi
RgTools.dll
TIVTC.dll

These are the installed filter and script versions:
AddGrainC 1.7.1 (25-11-2013)
ColorMatrix 2.5 (20-03-2010)
DGDecNV 2052 (30-07-2016)
DGDecodeIM beta50 (10-10-2015)
Dither tools 1.27.2 (30-12-2015)
FFT3DFilter 2.5 (02-07-2018)
FFT3dGPU 0.8.4 (21-11-2018)
FFTW 3.3.8 (28-05-2018)
MaskTools2 2.2.18 (05-09-2018)
RgTools 0.97 (02-07-2018)
TIVTC 1.0.11 (23-03-2018)

Last edited by almosely; 30th January 2019 at 21:07. |
30th January 2019, 21:22 | #42 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
I ran a few tests with your script; it seems that FFT3DFilter doesn't scale well in MT setups. I vaguely remember that it registers its MT mode adaptively, depending on parameter values. The bottleneck could also be the FFTW library.
__________________
Groucho's Avisynth Stuff |
30th January 2019, 22:27 | #43 | Link | |
Registered User
Join Date: Dec 2006
Location: Germany
Posts: 91
|
Quote:
But I think I have to use AVS+ in 32-bit mode or migrate back to AVS 2.6.0 MT (SEt), because FFT3DFilter 2.5 (and 2.4 and 2.3) is messing with the luma, even when the filter is merely present in the filter chain without any adjustments. It looks like FFT3DFilter dithers, brightens and darkens the image simply by being included in the filter chain; to me it seems to be an issue with colorspace or bit-depth conversion. Maybe the old 2.1.1 version (2007) from Fizick works correctly and I can use that one (but I did not find any 64-bit version of it and don't know whether it works with AVS+). FFT3DGPU works fine in that regard. And AVS 2.6.0 did not crash with this script, neither in ST nor in MT mode. Last edited by almosely; 30th January 2019 at 22:33. |
|
30th January 2019, 23:56 | #44 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Quote:
__________________
Groucho's Avisynth Stuff |
|
31st January 2019, 00:19 | #45 | Link |
Registered User
Join Date: Dec 2006
Location: Germany
Posts: 91
|
Cool, thank you! :-) ... but I just tried it: still not working right. Must be a problem of AVS+ (x64) :-(
-edit- Grml ... Obviously it's a general difference between FFT3DFilter and FFT3DGPU :-( At that point I hadn't installed AVS+ yet and was still testing with AVS 2.6.0. But how could it be that nobody noticed that problem before? Perhaps I should check every parameter; maybe a default value is set wrong? Last edited by almosely; 31st January 2019 at 00:47. |
10th February 2019, 20:28 | #46 | Link |
Registered User
Join Date: Dec 2006
Location: Germany
Posts: 91
|
So, after a long period of testing AVS 2.6.0 MT (SEt) (x86) vs. AviSynth+ 0.1.0 r2772 MT (x64), I came to the conclusion that AVS+ is faster in general and, at least with my filter collection, equally or more stable.
With the newest available versions of my filters, VC Redist 2017 and AVS+, the encoding frame rate went up from 16.76 fps to 16.99 fps and the AVSMeter 2.8.9 result from 40.37 to 43.49 fps (with one common test clip). But I discovered something more, which I will post in the corresponding AVS+ thread in a few minutes: https://forum.doom9.org/showthread.p...68856&page=225 |
26th August 2019, 23:16 | #48 | Link | ||
SuperVirus
Join Date: Jun 2012
Location: Antarctic Japan
Posts: 1,351
|
Quote:
https://forum.videohelp.com/threads/...ftw-3-3-7-DLLs and v3.3.8 is here: http://www.mediafire.com/file/ag82s5...tw-338.7z/file Quote:
|
||
26th August 2019, 23:44 | #49 | Link | |
Registered User
Join Date: Jul 2018
Posts: 2
|
Quote:
|
|
24th October 2019, 09:57 | #52 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,904
|
Just use this FFTW 3.3.8
|
2nd November 2019, 02:18 | #53 | Link |
Herr
Join Date: Apr 2009
Location: North Europe
Posts: 556
|
Just as a reminder:
On Windows 10 64-bit with AviSynthPlus, putting libfftw3f-3.dll (64-bit version) into the directory
C:\Program Files\AviSynthPlus\plugins64
didn't work. But putting it into the directory
C:\Program Files\AviSynthPlus\plugins64+
worked! Last edited by Forteen88; 2nd November 2019 at 11:33. |
10th April 2020, 12:35 | #54 | Link |
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 7,277
|
I'm getting:
Code:
'fftw: alloc.c:29: assertion failed: p'
with this script:
Code:
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\LoadDll.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\AddGrainC.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\dfttest.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\EEDI2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\eedi3.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\FFT3DFilter.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\masktools2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\mvtools2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\TDeint.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\RgTools.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\PlanarTools.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\MedianBlur2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\nnedi3.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\hqdn3d.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\FFT3dGPU.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\dither.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\KNLMeansCL.dll")
LoadCPlugin("I:\Hybrid\32bit\AVISYN~1\yadif.dll")
LoadDLL("I:\Hybrid\32bit\AVISYN~1\libfftw3f-3.dll")
LoadDLL("I:\Hybrid\32bit\AVISYN~1\d3dx9_30.dll")
Import("I:\Hybrid\32bit\avisynthPlugins\QTGMC.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\SMDegrain.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\AnimeIVTC.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\dither.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\TemporalDegrain-v2.avsi")
LoadCPlugin("I:\Hybrid\32bit\AVISYN~1\ffms2.dll")
SetFilterMTMode("DEFAULT_MT_MODE", MT_MULTI_INSTANCE)
# loading source: F:\TestClips&Co\files\interlaceAndTelecineSamples\interlaced\proRes_interlaced_1080_pcm.mov
# input color sampling YUY2
# input luminance scale tv
FFVideoSource("F:\TESTCL~1\files\INTERL~1\INTERL~1\PRORES~1.MOV",cachefile="E:\Temp\mov_8679e2db13be0eb3ee9aa34b1fc35571_853323747_1_0.ffindex",colorspace="YUY2")
# current resolution: 1920x1080
# deinterlacing
ConvertToYUY2(interlaced=true)
AssumeTFF()
QTGMC(Preset="Fast", ediThreads=2)
SelectEven()
# cropping
Crop(22,0,-14,-8)# 1884x1072
# filtering
# grain handling
# callConvertTo with:
ConvertToYV12(interlaced=false)
TemporalDegrain2() # <- uses fftw3
# letterboxing
AddBorders(18,4,18,4)# resolution: 1884x1072 -> 1920x1080
ConvertToRGB32(matrix="Rec709")
PreFetch(8)
return last

I tried the 3.3.8 build from filler56789, the 3.3.5 build from http://fftw.org/install/windows.html and the 3.3.7 build from https://forum.videohelp.com/threads/...ftw-3-3-7-DLLs (using Avisynth+ 32bit on Windows 10pro) -> is there another build I could try? Cu Selur |
10th April 2020, 13:45 | #55 | Link |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Can anybody make any sense of this? From the 3.3.8 source:
Code:
///////////////////////
// # fftw-3.3.8      //
///////////////////////

// # alloc.c

#include "kernel/ifftw.h"

void *X(malloc_plain)(size_t n)
{
     void *p;
     if (n == 0)
          n = 1;
     p = X(kernel_malloc)(n);
     CK(p);                       // # alloc.c line 29

#ifdef MIN_ALIGNMENT
     A((((uintptr_t)p) % MIN_ALIGNMENT) == 0);
#endif

     return p;
}

void X(ifree)(void *p)
{
     X(kernel_free)(p);
}

void X(ifree0)(void *p)
{
     /* common pattern */
     if (p) X(ifree)(p);
}

########################################################################

// # assert.c

#include "kernel/ifftw.h"
#include <stdio.h>
#include <stdlib.h>

void X(assertion_failed)(const char *s, int line, const char *file)
{
     fflush(stdout);
     fprintf(stderr, "fftw: %s:%d: assertion failed: %s\n", file, line, s);
#ifdef HAVE_ABORT
     abort();
#else
     exit(EXIT_FAILURE);
#endif
}

########################################################################

// # kernel\ifftw.h

/* determine precision and name-mangling scheme */
#define CONCAT(prefix, name) prefix ## name
#if defined(FFTW_SINGLE)
  typedef float R;
# define X(name) CONCAT(fftwf_, name)
#elif defined(FFTW_LDOUBLE)
  typedef long double R;
# define X(name) CONCAT(fftwl_, name)
# define TRIGREAL_IS_LONG_DOUBLE
#elif defined(FFTW_QUAD)
  typedef __float128 R;
# define X(name) CONCAT(fftwq_, name)
# define TRIGREAL_IS_QUAD
#else
  typedef double R;
# define X(name) CONCAT(fftw_, name)
#endif

// # ...
// #

/* define HAVE_SIMD if any simd extensions are supported */
#if defined(HAVE_SSE) || defined(HAVE_SSE2) || \
      defined(HAVE_AVX) || defined(HAVE_AVX_128_FMA) || \
      defined(HAVE_AVX2) || defined(HAVE_AVX512) || \
      defined(HAVE_KCVI) || \
      defined(HAVE_ALTIVEC) || defined(HAVE_VSX) || \
      defined(HAVE_MIPS_PS) || \
      defined(HAVE_GENERIC_SIMD128) || defined(HAVE_GENERIC_SIMD256)
#define HAVE_SIMD 1
#else
#define HAVE_SIMD 0
#endif

extern int X(have_simd_sse2)(void);
extern int X(have_simd_avx)(void);
extern int X(have_simd_avx_128_fma)(void);
extern int X(have_simd_avx2)(void);
extern int X(have_simd_avx2_128)(void);
extern int X(have_simd_avx512)(void);
extern int X(have_simd_altivec)(void);
extern int X(have_simd_vsx)(void);
extern int X(have_simd_neon)(void);

// # ...
// #

/*-----------------------------------------------------------------------*/
/* alloca: */
#if HAVE_SIMD
#  if defined(HAVE_KCVI) || defined(HAVE_AVX512)
#    define MIN_ALIGNMENT 64
#  elif defined(HAVE_AVX) || defined(HAVE_AVX2) || defined(HAVE_GENERIC_SIMD256)
#    define MIN_ALIGNMENT 32  /* best alignment for AVX, conservative for
                               * everything else */
#  else
     /* Note that we cannot use 32-byte alignment for all SIMD.  For
        example, MacOS X malloc is 16-byte aligned, but there was no
        posix_memalign in MacOS X until version 10.6. */
#    define MIN_ALIGNMENT 16
#  endif
#endif

// # ...
// #

/* assert.c: */
IFFTW_EXTERN void X(assertion_failed)(const char *s,
                                      int line, const char *file);

/* always check */
#define CK(ex) \
      (void)((ex) || (X(assertion_failed)(#ex, __LINE__, __FILE__), 0))

#ifdef FFTW_DEBUG
/* check only if debug enabled */
#define A(ex) \
      (void)((ex) || (X(assertion_failed)(#ex, __LINE__, __FILE__), 0))
#else
#define A(ex) /* nothing */
#endif

extern void X(debug)(const char *format, ...);
#define D X(debug)
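The X(...) wrapper in the excerpt above is only a name-mangling macro: it pastes a precision prefix in front of the given name. A minimal sketch of that expansion, assuming the FFTW_SINGLE branch (which is what libfftw3f-3.dll corresponds to); the STR helpers and the function names answer and mangled_malloc_plain are invented for this illustration and are not part of FFTW:

```c
#include <stdio.h>
#include <string.h>

/* The same two macros as in kernel/ifftw.h, with FFTW_SINGLE assumed. */
#define CONCAT(prefix, name) prefix ## name
#define X(name) CONCAT(fftwf_, name)

/* Illustration-only helpers that turn the expanded identifier into a string. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* X(answer) expands to fftwf_answer, so this line defines fftwf_answer(). */
int X(answer)(void)
{
    return 42;
}

/* Returns the identifier that X(malloc_plain) expands to, as text. */
const char *mangled_malloc_plain(void)
{
    return STR(X(malloc_plain));
}
```

So "void *X(malloc_plain)(size_t n)" is just the definition of fftwf_malloc_plain() in the single-precision build, and of fftw_malloc_plain() in the default double-precision build.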
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? |
10th April 2020, 16:33 | #56 | Link |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
Well, there is an assertion checking that the pointer 'p' is not NULL. That is what the 'CK' macro does. And this assertion obviously failed, which means that 'p' was NULL.
Since 'p' was assigned the result of kernel_malloc(), it looks like the memory allocation failed, i.e. it returned NULL. We don't see the details of kernel_malloc() here, but a typical cause for malloc operations to return NULL is an "out of memory" situation. This certainly applies to a standard malloc().
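The behaviour described above can be reproduced with a small sketch. It keeps the shape of the CK macro and of malloc_plain() from the quoted source, but two things are assumptions made for the demonstration: the failure handler only counts and prints instead of calling abort() or exit() as the real one does, and fake_kernel_malloc() is a hypothetical allocator that can be told to return NULL.

```c
#include <stdio.h>
#include <stdlib.h>

/* Counts failed checks; the real FFTW handler terminates the process instead. */
int failed_checks = 0;

/* Stand-in for fftwf_assertion_failed(): same message format, but it returns. */
void assertion_failed(const char *s, int line, const char *file)
{
    fprintf(stderr, "fftw: %s:%d: assertion failed: %s\n", file, line, s);
    failed_checks++;
}

/* Same shape as the CK macro in kernel/ifftw.h: if (ex) is false or NULL,
   the right-hand side of || runs and reports the stringified expression. */
#define CK(ex) \
    (void)((ex) || (assertion_failed(#ex, __LINE__, __FILE__), 0))

/* Hypothetical allocator standing in for kernel_malloc(); it can simulate
   an out-of-memory condition by returning NULL. */
void *fake_kernel_malloc(size_t n, int simulate_out_of_memory)
{
    return simulate_out_of_memory ? NULL : malloc(n);
}

/* Mirrors malloc_plain(): allocate, then check the pointer with CK. */
void *malloc_plain_sketch(size_t n, int simulate_out_of_memory)
{
    void *p;
    if (n == 0)
        n = 1;
    p = fake_kernel_malloc(n, simulate_out_of_memory);
    CK(p);   /* reports "assertion failed: p" when p is NULL */
    return p;
}
```

Because the macro stringifies its argument with #ex, the message names the variable 'p' itself, which is why the reported error ends in "assertion failed: p".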
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 10th April 2020 at 16:41. |
11th April 2020, 00:20 | #57 | Link |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Thank you, my lord,
that's what I thought, but all them there void X(...) whotsits left me a little perturbed. Details of kernel_malloc() are below, from kalloc.c:
Code:
#include "kernel/ifftw.h"

#if defined(HAVE_MALLOC_H)
#  include <malloc.h>
#endif

/* ``kernel'' malloc(), with proper memory alignment */

#if defined(HAVE_DECL_MEMALIGN) && !HAVE_DECL_MEMALIGN
extern void *memalign(size_t, size_t);
#endif

#if defined(HAVE_DECL_POSIX_MEMALIGN) && !HAVE_DECL_POSIX_MEMALIGN
extern int posix_memalign(void **, size_t, size_t);
#endif

#if defined(macintosh) /* MacOS 9 */
#  include <Multiprocessing.h>
#endif

#define real_free free /* memalign and malloc use ordinary free */

#define IS_POWER_OF_TWO(n) (((n) > 0) && (((n) & ((n) - 1)) == 0))
#if defined(WITH_OUR_MALLOC) && (MIN_ALIGNMENT >= 8) && IS_POWER_OF_TWO(MIN_ALIGNMENT)
/* Our own MIN_ALIGNMENT-aligned malloc/free.  Assumes sizeof(void*) is a
   power of two <= 8 and that malloc is at least sizeof(void*)-aligned.

   The main reason for this routine is that, as of this writing,
   Windows does not include any aligned allocation routines in its
   system libraries, and instead provides an implementation with a
   Visual C++ "Processor Pack" that you have to statically link into
   your program.  We do not want to require users to have VC++
   (e.g. gcc/MinGW should be fine).  Our code should be at least as good
   as the MS _aligned_malloc, in any case, according to second-hand
   reports of the algorithm it employs (also based on plain malloc). */
static void *our_malloc(size_t n)
{
     void *p0, *p;
     if (!(p0 = malloc(n + MIN_ALIGNMENT))) return (void *) 0;
     p = (void *) (((uintptr_t) p0 + MIN_ALIGNMENT) & (~((uintptr_t) (MIN_ALIGNMENT - 1))));
     *((void **) p - 1) = p0;
     return p;
}
static void our_free(void *p)
{
     if (p) free(*((void **) p - 1));
}
#endif

void *X(kernel_malloc)(size_t n)
{
     void *p;

#if defined(MIN_ALIGNMENT)

#  if defined(WITH_OUR_MALLOC)
     p = our_malloc(n);
#    undef real_free
#    define real_free our_free

#  elif defined(__FreeBSD__) && (MIN_ALIGNMENT <= 16)
     /* FreeBSD does not have memalign, but its malloc is 16-byte aligned. */
     p = malloc(n);

#  elif (defined(__MACOSX__) || defined(__APPLE__)) && (MIN_ALIGNMENT <= 16)
     /* MacOS X malloc is already 16-byte aligned */
     p = malloc(n);

#  elif defined(HAVE_MEMALIGN)
     p = memalign(MIN_ALIGNMENT, n);

#  elif defined(HAVE_POSIX_MEMALIGN)
     /* note: posix_memalign is broken in glibc 2.2.5: it constrains
        the size, not the alignment, to be (power of two) * sizeof(void*).
        The bug seems to have been fixed as of glibc 2.3.1. */
     if (posix_memalign(&p, MIN_ALIGNMENT, n))
          p = (void*) 0;

#  elif defined(__ICC) || defined(__INTEL_COMPILER) || defined(HAVE__MM_MALLOC)
     /* Intel's C compiler defines _mm_malloc and _mm_free intrinsics */
     p = (void *) _mm_malloc(n, MIN_ALIGNMENT);
#    undef real_free
#    define real_free _mm_free

#  elif defined(_MSC_VER)
     /* MS Visual C++ 6.0 with a "Processor Pack" supports SIMD
        and _aligned_malloc/free (uses malloc.h) */
     p = (void *) _aligned_malloc(n, MIN_ALIGNMENT);
#    undef real_free
#    define real_free _aligned_free

#  elif defined(macintosh) /* MacOS 9 */
     p = (void *) MPAllocateAligned(n,
#    if MIN_ALIGNMENT == 8
                                    kMPAllocate8ByteAligned,
#    elif MIN_ALIGNMENT == 16
                                    kMPAllocate16ByteAligned,
#    elif MIN_ALIGNMENT == 32
                                    kMPAllocate32ByteAligned,
#    else
#      error "Unknown alignment for MPAllocateAligned"
#    endif
                                    0);
#    undef real_free
#    define real_free MPFree

#  else
     /* Add your machine here and send a patch to fftw@fftw.org
        or (e.g. for Windows) configure --with-our-malloc */
#    error "Don't know how to malloc() aligned memory ... try configuring --with-our-malloc"
#  endif

#else /* !defined(MIN_ALIGNMENT) */
     p = malloc(n);
#endif

     return p;
}

void X(kernel_free)(void *p)
{
     real_free(p);
}
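The our_malloc()/our_free() pair above uses a common trick: over-allocate with plain malloc(), round the address up to the next aligned boundary, and store the original pointer just below the aligned block so it can be freed later. A standalone sketch of the same idea follows; the fixed alignment of 32 and the names ending in _sketch are assumptions for this illustration (FFTW picks 16, 32 or 64 at build time), and like the original it assumes malloc() results are at least sizeof(void*)-aligned.

```c
#include <stdint.h>
#include <stdlib.h>

/* Alignment assumed for this sketch; must be a power of two >= 8. */
#define SKETCH_ALIGNMENT 32

/* Over-allocate, round up to the next aligned address, and stash the
   pointer returned by malloc() in the slot just below the aligned block. */
void *aligned_malloc_sketch(size_t n)
{
    void *p0 = malloc(n + SKETCH_ALIGNMENT);
    void *p;

    if (!p0)
        return NULL;   /* out of memory: this NULL is what CK(p) reports */
    p = (void *)(((uintptr_t)p0 + SKETCH_ALIGNMENT) &
                 ~((uintptr_t)(SKETCH_ALIGNMENT - 1)));
    *((void **)p - 1) = p0;
    return p;
}

/* Recover the original malloc() pointer and release it. */
void aligned_free_sketch(void *p)
{
    if (p)
        free(*((void **)p - 1));
}

/* Returns 1 if p sits on a SKETCH_ALIGNMENT boundary. */
int is_aligned(const void *p)
{
    return ((uintptr_t)p % SKETCH_ALIGNMENT) == 0;
}
```

Whichever branch of kernel_malloc() a given build uses, the failure path is the same: the underlying allocator returns NULL, kernel_malloc() passes it on, and CK(p) in malloc_plain() fires.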
14th April 2020, 19:26 | #58 | Link |
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 7,277
|
Hmm... now I just get an access violation (when using Prefetch > 3), but that is probably due to the memory usage hitting the 32-bit limit (with Prefetch(3) the script already uses ~2378 MB according to AVSMeter).
Thanks for that build. Cu Selur |
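A rough back-of-the-envelope check of the 32-bit explanation, as a sketch in C. It assumes memory use grows in proportion to the Prefetch() count, which is a crude model (real scripts also have a fixed baseline), and it compares against 4096 MB, the most a 32-bit process can address at all. The function names are made up for this illustration; the 2378 MB at Prefetch(3) figure is the one reported above.

```c
#include <stdio.h>

/* Upper bound on the address space of a 32-bit process, in MB (4 GiB). */
#define LIMIT_32BIT_MB 4096.0

/* Crude linear model (an assumption): scale the measured memory use
   by the ratio of thread counts. */
double estimate_mb(double measured_mb, int measured_threads, int threads)
{
    return measured_mb / (double)measured_threads * (double)threads;
}

/* Returns 1 if the estimate is below the 32-bit address-space bound. */
int fits_in_32bit(double estimated_mb)
{
    return estimated_mb < LIMIT_32BIT_MB;
}

/* Print the estimate for a given Prefetch() count, based on the
   ~2378 MB measured at Prefetch(3). */
void print_estimate(int threads)
{
    double mb = estimate_mb(2378.0, 3, threads);
    printf("Prefetch(%d): ~%.0f MB, %s\n", threads, mb,
           fits_in_32bit(mb) ? "below 4096 MB" : "above 4096 MB");
}
```

Under this model Prefetch(4) lands at roughly 3170 MB, which leaves little headroom once fragmentation and the host application are taken into account, and the Prefetch(8) from the original script would need well over 6000 MB, which no 32-bit process can provide.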
17th April 2020, 15:36 | #60 | Link | |
Registered User
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
|
Quote:
__________________
See My Avisynth Stuff |
|
Tags |
fftw, fftw3.dll |