Old 30th January 2019, 20:57   #41  |  Link
almosely
Registered User
 
Join Date: Dec 2006
Location: Germany
Posts: 91
Single-Threaded-Script:
-----------------------

DGSourceIM("clip.dgi", engine=1)

Trim(515, 4291)

Crop(0, 0, -0, -8)

FFT3DFilter(sigma=1.0, beta=1.0, bw=32, bh=32, sharpen=0.16, scutoff=0.27, plane=0, bt=3)

return last


Multi-Threaded-Script:
----------------------

DGSourceIM("clip.dgi", engine=1)

RequestLinear(rlim=50, clim=50) # tested with and without; it was faster with RequestLinear

Trim(515, 4291)

Crop(0, 0, -0, -8)

FFT3DFilter(sigma=1.0, beta=1.0, bw=32, bh=32, sharpen=0.16, scutoff=0.27, plane=0, bt=3)

Prefetch(2) # tested with 1,2,3,4,6

return last


And I am using the "mtmodes.avsi" from here: http://publishwith.me/ep/pad/view/ro.rDkwcdWn4k9/latest


Results: AVSMeter 2.8.9 (x64):
------------------------------

1) ST: 44.18 fps (CPU usage: 25%)

2) MT(2): 58.98 fps (CPU usage: 50%) # with RequestLinear(50,50)

3) MT(2): 57.73 fps (CPU usage: 50%) # with RequestLinear(100,100)

4) MT(3): 55.90 fps (CPU usage: 74%) # with RequestLinear(50,50)

5) MT(3): 60.83 fps (CPU usage: 74%) # with RequestLinear(100,100)

6) MT(4): 57.42 fps (CPU usage: 95%) # with RequestLinear(100,100)
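
As a rough way to read the AVSMeter numbers above, this sketch computes the speedup and parallel efficiency of an MT run against the ST run. The function names are made up for illustration, and the Amdahl's-law figure is only a model estimate, not something measured in this thread.

```c
/* Speedup of a multi-threaded run over the single-threaded baseline. */
double speedup(double st_fps, double mt_fps)
{
    return mt_fps / st_fps;
}

/* Parallel efficiency: speedup divided by the number of Prefetch threads. */
double efficiency(double st_fps, double mt_fps, int threads)
{
    return speedup(st_fps, mt_fps) / threads;
}

/* Parallel share p implied by Amdahl's law, S = 1 / ((1 - p) + p / n),
   solved for p. This is a model assumption, not a measurement. */
double amdahl_parallel_fraction(double s, int threads)
{
    return (1.0 - 1.0 / s) / (1.0 - 1.0 / threads);
}
```

With the figures above, Prefetch(2) with RequestLinear(50,50) comes out at a speedup of about 1.33x and an efficiency of roughly 67%, and the runs with 3 and 4 threads gain little or nothing on top of that.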


Results: Simple x264/x265 Launcher (64-Bit) 2.89.1138:
------------------------------------------------------

1) 22.49 fps

2) 21.51 fps

5) 20.53 fps


The corresponding x264.exe line:
--------------------------------
--output-depth 8 --crf 18.0 --preset medium --tune film --trellis 2 --direct auto --me umh --partitions all --vbv-maxrate 24000 --vbv-bufsize 30000 --b-adapt 2 --bframes 3 --merange 16 --ref 3 --keyint 240 --subme 10 --aq-mode 1 --sar 1:1 --rc-lookahead 40 --output "clip.mkv" --frames 3777 --demuxer y4m --stdin y4m -


Sometimes the MT job also crashes within Simple Launcher. To be up to date, I updated Simple Launcher a few hours ago, and with the new version the frame rate dropped slightly as well (from 22.67 fps to 22.49 fps).

I let AVS+ autoload all plugins and scripts; I put everything I need into the corresponding "plugins64" folder:

addgrain.avs
AddGrainC.dll
avstp.dll
CheckTopFirst.avsi
colormatrix.dll
CompTest.avsi
DGDecodeIM.dll
DGDecodeNV.dll
dither.avsi
dither.dll
fft3dfilter.dll
FFT3dGPU.dll
fft3dgpu.hlsl
libmfxsw64.dll
masktools2.dll
mt_xxpand_multi.avsi
mtmodes-rev.850.avsi
RgTools.dll
TIVTC.dll

These are the installed filter and script versions:

AddGrainC 1.7.1 (25-11-2013)
ColorMatrix 2.5 (20-03-2010)
DGDecNV 2052 (30-07-2016)
DGDecodeIM beta50 (10-10-2015)
Dither tools 1.27.2 (30-12-2015)
FFT3DFilter 2.5 (02-07-2018)
FFT3dGPU 0.8.4 (21-11-2018)
FFTW 3.3.8 (28-05-2018)
MaskTools2 2.2.18 (05-09-2018)
RgTools 0.97 (02-07-2018)
TIVTC 1.0.11 (23-03-2018)

Last edited by almosely; 30th January 2019 at 21:07.
Old 30th January 2019, 21:22   #42  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
I ran a few tests with your script. It seems that fft3dfilter doesn't scale well in MT setups. I vaguely remember that it registers its MT mode adaptively, depending on parameter values. The bottleneck could also be the FFTW library.
__________________
Groucho's Avisynth Stuff
Old 30th January 2019, 22:27   #43  |  Link
almosely
Registered User
 
Join Date: Dec 2006
Location: Germany
Posts: 91
Quote:
Originally Posted by FFT3DFilter 2.5 (02-07-2018) (x64) Documentation
Version 2.3 - February 21, 2017
- apply current avs+ headers
- 10-16 bits and 32 bit float colorspace support in AVS+
- Planar RGB support
- look for libfftw3f-3.dll first, then fftw3.dll
- inline asm ignored on x64 builds
- pre-check: if plane to process for greyscale is U and/or V then returns original clip
- auto register MT mode for avs+: MT_SERIALIZED
- autoscale sigma and smin/smax parameter from 8 bit scale if colorspace is different
Version 2.4 - June 08, 2017
- some inline asm (not all) ported to SIMD intrinsics, which helps speed up x64 mode; some of them are also faster on x86.
- intrinsics bt=0
- intrinsics bt=2, degrid=0, pfactor=0
- intrinsics bt=3 sharpen=0/1 dehalo=0/1
- intrinsics bt=3
- Adaptive MT settings for Avisynth+: MT_SERIALIZED for bt==0 (temporal), MT_MULTI_INSTANCE for others
- Copy Alpha plane if exists
- reentrancy checks against bad multithreading usage
Note: for properly operating in MT_SERIALIZED mode in Avisynth MT, please use Avs+ r2504 or better.
Version 2.5 - July 02, 2018
- Change 32 bit float format: U/V chroma center to zero instead of 0.5 to match Avisynth+ r2728
Yes, it has. I use bt=3, so mt-mode 2 will be used.
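
To spell out what the 2.4 changelog entry means for my case, here is a small sketch of the rule as I read it. The enum values are the ones AviSynth+ uses for its MT modes; the helper function is hypothetical and is not taken from the FFT3DFilter source.

```c
/* MT mode values as defined by AviSynth+ (MtMode enum in avisynth.h). */
enum MtMode {
    MT_NICE_FILTER    = 1,
    MT_MULTI_INSTANCE = 2,
    MT_SERIALIZED     = 3
};

/* Hypothetical helper mirroring the changelog line
   "MT_SERIALIZED for bt==0 (temporal), MT_MULTI_INSTANCE for others". */
enum MtMode fft3d_mt_mode_for(int bt)
{
    return (bt == 0) ? MT_SERIALIZED : MT_MULTI_INSTANCE;
}
```

For bt=3 this returns MT_MULTI_INSTANCE, i.e. mode 2, so every Prefetch thread gets its own filter instance.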

But I think I have to use AVS+ in 32-bit mode or migrate back to AVS 2.6.0 MT (SEt), because FFT3DFilter 2.5 (and also 2.4 and 2.3) is messing with the luma, even when the filter is merely present in the filter chain without any adjustments. It looks as if FFT3DFilter dithers, brightens and darkens the image just by being included in the chain; to me it seems to be an issue with colorspace or bit-depth conversion. Maybe the old 2.1.1 version (2007) from Fizick works correctly and I can use that one (but I did not find a 64-bit build of it and don't know whether it works with AVS+). FFT3DGPU works fine in that regard. And AVS 2.6.0 did not crash with this script, neither in ST nor in MT mode.

Last edited by almosely; 30th January 2019 at 22:33.
Old 30th January 2019, 23:56   #44  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by almosely View Post
Maybe the old 2.1.1 version (2007) from Fizick is working right and I can use that one (but I did not find any 64-bit version of it and don't know if it's working with AVS+).
I made a 64 bit build of 2.1.1 some time ago. You're welcome to try it. It does work just fine with AVS+.
__________________
Groucho's Avisynth Stuff
Old 31st January 2019, 00:19   #45  |  Link
almosely
Registered User
 
Join Date: Dec 2006
Location: Germany
Posts: 91
Cool, thank you! :-) ... but I just tried it: still not working right. It must be a problem with AVS+ (x64) :-(

-edit-

Quote:
Originally Posted by almosely View Post
I observed a big difference when comparing fft3dfilter against fft3dgpu within AvsPmod at first sight (histogram "luma" activated).
Grml ... Obviously it's a general difference between FFT3DFilter and FFT3DGPU :-( At that point I hadn't installed AVS+ yet and was still testing with AVS 2.6.0. But how can it be that nobody noticed this problem before? Perhaps I should check every parameter; maybe a default value is set wrong?

Last edited by almosely; 31st January 2019 at 00:47.
Old 10th February 2019, 20:28   #46  |  Link
almosely
Registered User
 
Join Date: Dec 2006
Location: Germany
Posts: 91
So, after a long period of testing AVS 2.6.0 MT (SEt) (x86) vs. AviSynth+ 0.1.0 r2772 MT (x64), I came to the conclusion that AVS+ is faster in general and, at least with my filter collection, equally or more stable.

With the newest available versions of my filters, VC Redist 2017 and AVS+, the encoding frame rate went up from 16.76 fps to 16.99 fps and the AVSMeter 2.8.9 frame rate from 40.37 fps to 43.49 fps (with one common test clip).

But I discovered something else, which I will post in the corresponding AVS+ thread in a few minutes:

https://forum.doom9.org/showthread.p...68856&page=225
Old 26th August 2019, 06:15   #47  |  Link
ciko5
Registered User
 
Join Date: Jul 2018
Posts: 2
Hello,

I need fftw3.dll as 64-bit and 32-bit builds, FFTW 3.3.8 or 3.3.7 or whatever version you have. My OS is Windows 10 64-bit.

Sincerely.
Old 26th August 2019, 23:16   #48  |  Link
filler56789
SuperVirus
 
Join Date: Jun 2012
Location: Antarctic Japan
Posts: 1,351
Quote:
Originally Posted by ciko5 View Post
Hello,

I need fftw3.dll as 64-bit and 32-bit builds, FFTW 3.3.8 or 3.3.7 or whatever version you have.
v3.3.7 is here:

https://forum.videohelp.com/threads/...ftw-3-3-7-DLLs

and v3.3.8 is here:

http://www.mediafire.com/file/ag82s5...tw-338.7z/file

Quote:
My OS is Windows 10 64 bit.
Good luck with that
Old 26th August 2019, 23:44   #49  |  Link
ciko5
Registered User
 
Join Date: Jul 2018
Posts: 2
Quote:
Originally Posted by filler56789 View Post
Thank you so much. I have a question: if I rename libfftw3f-3.dll to fftw3.dll, can I use it? I ask because fft3dfilter.dll needs fftw3.dll.
Old 26th August 2019, 23:51   #50  |  Link
filler56789
SuperVirus
 
Join Date: Jun 2012
Location: Antarctic Japan
Posts: 1,351
Quote:
Originally Posted by ciko5 View Post
Thank you so much. I have a question: if I rename libfftw3f-3.dll to fftw3.dll, can I use it? I ask because fft3dfilter.dll needs fftw3.dll.
Yes, just rename it.
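
For context, the FFT3DFilter changelog quoted earlier in this thread says that version 2.3 and later look for libfftw3f-3.dll first and then fftw3.dll. A minimal sketch of that lookup order is below. The function name and the `present` list are made up for illustration: a real plugin would try to load each name rather than compare strings, and Windows file names are not case-sensitive the way strcmp is.

```c
#include <stddef.h>
#include <string.h>

/* Candidate file names, in the order given by the FFT3DFilter 2.3 changelog. */
static const char *const FFTW_DLL_CANDIDATES[] = {
    "libfftw3f-3.dll",
    "fftw3.dll"
};

/* Return the first candidate found in `present`, or NULL if none is there.
   `present` stands in for "files the loader can find" (an assumption). */
const char *pick_fftw_dll(const char *const present[], size_t count)
{
    size_t n = sizeof FFTW_DLL_CANDIDATES / sizeof FFTW_DLL_CANDIDATES[0];
    for (size_t c = 0; c < n; ++c)
        for (size_t i = 0; i < count; ++i)
            if (strcmp(FFTW_DLL_CANDIDATES[c], present[i]) == 0)
                return FFTW_DLL_CANDIDATES[c];
    return NULL;
}
```

Under this order a renamed fftw3.dll is picked up when it is the only one present, while an older plugin build that only knows the name fftw3.dll needs the renamed copy.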
Old 23rd October 2019, 19:48   #51  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,156
Quote:
Originally Posted by Wolfberry View Post
FFTW 3.3.8 built with ICC 19.0 now in my signature.

Have fun
I don't see it in your signature.
Old 24th October 2019, 09:57   #52  |  Link
FranceBB
Broadcast Encoder
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,904
Quote:
Originally Posted by kedautinh12 View Post
I don't see it in your signature.
Just use this FFTW 3.3.8
Old 2nd November 2019, 02:18   #53  |  Link
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 556
Just as a reminder:
In Windows10 64-bit, for AviSynthPlus, when I put libfftw3f-3.dll (64-bit version) in the directory,
C:\Program Files\AviSynthPlus\plugins64
it didn't work.
But when I put it in the directory,
C:\Program Files\AviSynthPlus\plugins64+
it worked!

Last edited by Forteen88; 2nd November 2019 at 11:33.
Old 10th April 2020, 12:35   #54  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,277
getting:
Code:
'fftw: alloc.c:29: assertion failed: p'
when using:
Code:
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\LoadDll.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\AddGrainC.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\dfttest.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\EEDI2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\eedi3.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\FFT3DFilter.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\masktools2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\mvtools2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\TDeint.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\RgTools.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\PlanarTools.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\MedianBlur2.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\nnedi3.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\hqdn3d.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\FFT3dGPU.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\dither.dll")
LoadPlugin("I:\Hybrid\32bit\AVISYN~1\KNLMeansCL.dll")
LoadCPlugin("I:\Hybrid\32bit\AVISYN~1\yadif.dll")
LoadDLL("I:\Hybrid\32bit\AVISYN~1\libfftw3f-3.dll")
LoadDLL("I:\Hybrid\32bit\AVISYN~1\d3dx9_30.dll")
Import("I:\Hybrid\32bit\avisynthPlugins\QTGMC.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\SMDegrain.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\AnimeIVTC.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\dither.avsi")
Import("I:\Hybrid\32bit\avisynthPlugins\TemporalDegrain-v2.avsi")
LoadCPlugin("I:\Hybrid\32bit\AVISYN~1\ffms2.dll")
SetFilterMTMode("DEFAULT_MT_MODE", MT_MULTI_INSTANCE)
# loading source: F:\TestClips&Co\files\interlaceAndTelecineSamples\interlaced\proRes_interlaced_1080_pcm.mov
#  input color sampling YUY2
#  input luminance scale tv
FFVideoSource("F:\TESTCL~1\files\INTERL~1\INTERL~1\PRORES~1.MOV",cachefile="E:\Temp\mov_8679e2db13be0eb3ee9aa34b1fc35571_853323747_1_0.ffindex",colorspace="YUY2")
# current resolution: 1920x1080
# deinterlacing
ConvertToYUY2(interlaced=true)
AssumeTFF()
QTGMC(Preset="Fast", ediThreads=2)
SelectEven()
# cropping
Crop(22,0,-14,-8)# 1884x1072
# filtering
# grain handling
# callConvertTo with: 
ConvertToYV12(interlaced=false)
TemporalDegrain2() # <- uses fftw3
# letterboxing
AddBorders(18,4,18,4)# resolution: 1884x1072 -> 1920x1080
ConvertToRGB32(matrix="Rec709")
PreFetch(8)
return last
The only thing I found regarding this was https://github.com/FFTW/fftw3/issues/17
I tried the 3.3.8 build from filler56789, the 3.3.5 build from http://fftw.org/install/windows.html and the 3.3.7 build from https://forum.videohelp.com/threads/...ftw-3-3-7-DLLs (using Avisynth+ 32-bit on Windows 10 Pro).
-> Is there another build I could try?

Cu Selur
__________________
Hybrid here in the forum, homepage
Old 10th April 2020, 13:45   #55  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
Quote:
Originally Posted by Selur View Post
getting:
Code:
'fftw: alloc.c:29: assertion failed: p'
Can anybody make sense of this? It is from the 3.3.8 source:
Code:
///////////////////////
// # fftw-3.3.8      //
///////////////////////


// # alloc.c

#include "kernel/ifftw.h"

void *X(malloc_plain)(size_t n)
{
     void *p;
     if (n == 0)
          n = 1;
     p = X(kernel_malloc)(n);
     CK(p);                                                        // # alloc.c line 29

#ifdef MIN_ALIGNMENT
     A((((uintptr_t)p) % MIN_ALIGNMENT) == 0);
#endif

     return p;
}

void X(ifree)(void *p)
{
     X(kernel_free)(p);
}

void X(ifree0)(void *p)
{
     /* common pattern */
     if (p) X(ifree)(p);
}


########################################################################

// # assert.c

#include "kernel/ifftw.h"
#include <stdio.h>
#include <stdlib.h>

void X(assertion_failed)(const char *s, int line, const char *file)
{
     fflush(stdout);
     fprintf(stderr, "fftw: %s:%d: assertion failed: %s\n", file, line, s);
#ifdef HAVE_ABORT
     abort();
#else
     exit(EXIT_FAILURE);
#endif
}

########################################################################

// # kernel\ifftw.h

/* determine precision and name-mangling scheme */
#define CONCAT(prefix, name) prefix ## name
#if defined(FFTW_SINGLE)
  typedef float R;
# define X(name) CONCAT(fftwf_, name)
#elif defined(FFTW_LDOUBLE)
  typedef long double R;
# define X(name) CONCAT(fftwl_, name)
# define TRIGREAL_IS_LONG_DOUBLE
#elif defined(FFTW_QUAD)
  typedef __float128 R;
# define X(name) CONCAT(fftwq_, name)
# define TRIGREAL_IS_QUAD
#else
  typedef double R;
# define X(name) CONCAT(fftw_, name)
#endif

// #

...

// #


/* define HAVE_SIMD if any simd extensions are supported */
#if defined(HAVE_SSE) || defined(HAVE_SSE2) || \
      defined(HAVE_AVX) || defined(HAVE_AVX_128_FMA) || \
      defined(HAVE_AVX2) || defined(HAVE_AVX512) || \
      defined(HAVE_KCVI) || \
      defined(HAVE_ALTIVEC) || defined(HAVE_VSX) || \
      defined(HAVE_MIPS_PS) || \
      defined(HAVE_GENERIC_SIMD128) || defined(HAVE_GENERIC_SIMD256)
#define HAVE_SIMD 1
#else
#define HAVE_SIMD 0
#endif

extern int X(have_simd_sse2)(void);
extern int X(have_simd_avx)(void);
extern int X(have_simd_avx_128_fma)(void);
extern int X(have_simd_avx2)(void);
extern int X(have_simd_avx2_128)(void);
extern int X(have_simd_avx512)(void);
extern int X(have_simd_altivec)(void);
extern int X(have_simd_vsx)(void);
extern int X(have_simd_neon)(void);

// #

...

// #

/*-----------------------------------------------------------------------*/
/* alloca: */
#if HAVE_SIMD
#  if defined(HAVE_KCVI) || defined(HAVE_AVX512)
#    define MIN_ALIGNMENT 64
#  elif defined(HAVE_AVX) || defined(HAVE_AVX2) || defined(HAVE_GENERIC_SIMD256)
#    define MIN_ALIGNMENT 32  /* best alignment for AVX, conservative for
                   * everything else */
#  else
     /* Note that we cannot use 32-byte alignment for all SIMD.  For
    example, MacOS X malloc is 16-byte aligned, but there was no
    posix_memalign in MacOS X until version 10.6. */
#    define MIN_ALIGNMENT 16
#  endif
#endif


// #

...

// #

/* assert.c: */
IFFTW_EXTERN void X(assertion_failed)(const char *s,
                      int line, const char *file);

/* always check */
#define CK(ex)                       \
      (void)((ex) || (X(assertion_failed)(#ex, __LINE__, __FILE__), 0))

#ifdef FFTW_DEBUG
/* check only if debug enabled */
#define A(ex)                        \
      (void)((ex) || (X(assertion_failed)(#ex, __LINE__, __FILE__), 0))
#else
#define A(ex) /* nothing */
#endif

extern void X(debug)(const char *format, ...);
#define D X(debug)
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???
Old 10th April 2020, 16:33   #56  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
Well, there is an assertion checking that pointer 'p' is not NULL. That is what the 'CK' macro does. And this assertion obviously failed, which means that 'p' was NULL.

Since 'p' was assigned to the result of kernel_malloc(), it looks like the memory allocation failed, i.e. it returned NULL.

We don't see the details of kernel_malloc() here, but a typical cause for malloc operations to return NULL is "out of memory" situation. This certainly applies to a standard malloc().
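
To make the mechanism concrete, here is a minimal sketch of a CK-style check. It has the same shape as the macro quoted above, but it records the message in a buffer instead of printing to stderr and aborting, so the effect can be inspected. The names `record_failure`, `CK_SKETCH` and `check_allocation` are invented for this example and are not part of FFTW.

```c
#include <stdio.h>
#include <string.h>

/* Last failure message; the real FFTW prints to stderr and aborts instead. */
static char last_failure[256];

static void record_failure(const char *expr, int line, const char *file)
{
    snprintf(last_failure, sizeof last_failure,
             "fftw: %s:%d: assertion failed: %s", file, line, expr);
}

/* Same shape as FFTW's CK macro: #ex turns the checked expression into text,
   which is why the message ends in the variable name "p". */
#define CK_SKETCH(ex) \
    (void)((ex) || (record_failure(#ex, __LINE__, __FILE__), 0))

/* Stand-in for the check in malloc_plain: returns 1 if p passed the check,
   0 if the check fired because p was NULL. */
int check_allocation(void *p)
{
    last_failure[0] = '\0';
    CK_SKETCH(p);
    return last_failure[0] == '\0';
}
```

Passing a NULL pointer leaves a message ending in "assertion failed: p" in the buffer, which matches the tail of the error Selur posted.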
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊

Last edited by LoRd_MuldeR; 10th April 2020 at 16:41.
Old 11th April 2020, 00:20   #57  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
Thank you, my lord,

that's what I thought, but all them there void X(...) whotsits left me a little perturbed.

Details of kernel_malloc() are below, from kalloc.c:

Code:
#include "kernel/ifftw.h"

#if defined(HAVE_MALLOC_H)
#  include <malloc.h>
#endif

/* ``kernel'' malloc(), with proper memory alignment */

#if defined(HAVE_DECL_MEMALIGN) && !HAVE_DECL_MEMALIGN
extern void *memalign(size_t, size_t);
#endif

#if defined(HAVE_DECL_POSIX_MEMALIGN) && !HAVE_DECL_POSIX_MEMALIGN
extern int posix_memalign(void **, size_t, size_t);
#endif

#if defined(macintosh) /* MacOS 9 */
#  include <Multiprocessing.h>
#endif

#define real_free free /* memalign and malloc use ordinary free */

#define IS_POWER_OF_TWO(n) (((n) > 0) && (((n) & ((n) - 1)) == 0))
#if defined(WITH_OUR_MALLOC) && (MIN_ALIGNMENT >= 8) && IS_POWER_OF_TWO(MIN_ALIGNMENT)
/* Our own MIN_ALIGNMENT-aligned malloc/free.  Assumes sizeof(void*) is a
   power of two <= 8 and that malloc is at least sizeof(void*)-aligned.

   The main reason for this routine is that, as of this writing,
   Windows does not include any aligned allocation routines in its
   system libraries, and instead provides an implementation with a
   Visual C++ "Processor Pack" that you have to statically link into
   your program.  We do not want to require users to have VC++
   (e.g. gcc/MinGW should be fine).  Our code should be at least as good
   as the MS _aligned_malloc, in any case, according to second-hand
   reports of the algorithm it employs (also based on plain malloc). */
static void *our_malloc(size_t n)
{
     void *p0, *p;
     if (!(p0 = malloc(n + MIN_ALIGNMENT))) return (void *) 0;
     p = (void *) (((uintptr_t) p0 + MIN_ALIGNMENT) & (~((uintptr_t) (MIN_ALIGNMENT - 1))));
     *((void **) p - 1) = p0;
     return p;
}
static void our_free(void *p)
{
     if (p) free(*((void **) p - 1));
}
#endif

void *X(kernel_malloc)(size_t n)
{
     void *p;

#if defined(MIN_ALIGNMENT)

#  if defined(WITH_OUR_MALLOC)
     p = our_malloc(n);
#    undef real_free
#    define real_free our_free

#  elif defined(__FreeBSD__) && (MIN_ALIGNMENT <= 16)
     /* FreeBSD does not have memalign, but its malloc is 16-byte aligned. */
     p = malloc(n);

#  elif (defined(__MACOSX__) || defined(__APPLE__)) && (MIN_ALIGNMENT <= 16)
     /* MacOS X malloc is already 16-byte aligned */
     p = malloc(n);

#  elif defined(HAVE_MEMALIGN)
     p = memalign(MIN_ALIGNMENT, n);

#  elif defined(HAVE_POSIX_MEMALIGN)
     /* note: posix_memalign is broken in glibc 2.2.5: it constrains
    the size, not the alignment, to be (power of two) * sizeof(void*).
        The bug seems to have been fixed as of glibc 2.3.1. */
     if (posix_memalign(&p, MIN_ALIGNMENT, n))
      p = (void*) 0;

#  elif defined(__ICC) || defined(__INTEL_COMPILER) || defined(HAVE__MM_MALLOC)
     /* Intel's C compiler defines _mm_malloc and _mm_free intrinsics */
     p = (void *) _mm_malloc(n, MIN_ALIGNMENT);
#    undef real_free
#    define real_free _mm_free

#  elif defined(_MSC_VER)
     /* MS Visual C++ 6.0 with a "Processor Pack" supports SIMD
    and _aligned_malloc/free (uses malloc.h) */
     p = (void *) _aligned_malloc(n, MIN_ALIGNMENT);
#    undef real_free
#    define real_free _aligned_free

#  elif defined(macintosh) /* MacOS 9 */
     p = (void *) MPAllocateAligned(n,
#    if MIN_ALIGNMENT == 8
                    kMPAllocate8ByteAligned,
#    elif MIN_ALIGNMENT == 16
                    kMPAllocate16ByteAligned,
#    elif MIN_ALIGNMENT == 32
                    kMPAllocate32ByteAligned,
#    else
#      error "Unknown alignment for MPAllocateAligned"
#    endif
                    0);
#    undef real_free
#    define real_free MPFree

#  else
     /* Add your machine here and send a patch to fftw@fftw.org
        or (e.g. for Windows) configure --with-our-malloc */
#    error "Don't know how to malloc() aligned memory ... try configuring --with-our-malloc"
#  endif

#else /* !defined(MIN_ALIGNMENT) */
     p = malloc(n);
#endif

     return p;
}

void X(kernel_free)(void *p)
{
     real_free(p);
}
Selur, did you take note of memory usage, e.g. in Task Manager, when your script is called, or use Groucho2004's tools (AVSMeter)?
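
For anyone puzzling over our_malloc() in the listing above, here is a stripped-down sketch of the same over-allocate-and-round trick, with the alignment passed as a parameter instead of the MIN_ALIGNMENT macro. The function names are mine, and the sketch assumes `align` is a power of two no smaller than sizeof(void*) and that plain malloc is at least pointer-aligned.

```c
#include <stdint.h>
#include <stdlib.h>

/* Over-allocate by `align` bytes, round up to the next multiple of `align`,
   and store the original malloc pointer just below the aligned block. */
void *aligned_malloc_sketch(size_t n, size_t align)
{
    void *p0 = malloc(n + align);
    if (!p0)
        return NULL;   /* out of memory: the NULL that CK(p) would catch */

    uintptr_t aligned = ((uintptr_t)p0 + align) & ~((uintptr_t)align - 1);
    void *p = (void *)aligned;
    ((void **)p)[-1] = p0;   /* remember what to hand back to free() */
    return p;
}

/* Free a block obtained from aligned_malloc_sketch; NULL is ignored. */
void aligned_free_sketch(void *p)
{
    if (p)
        free(((void **)p)[-1]);
}
```

Either way, when the underlying malloc fails the wrapper just passes the NULL upwards, and it is the CK(p) check in alloc.c that turns it into the assertion message.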
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???
Old 14th April 2020, 19:26   #58  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,277
Hmm... now I just get an access violation (when using Prefetch > 3), but that is probably due to the memory usage hitting the 32-bit limit (with Prefetch(3) the script already uses ~2378 MB according to AVSMeter).
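
A quick back-of-the-envelope check of that figure against the usual address-space limits of a 32-bit Windows process (2 GB by default, 4 GB for a large-address-aware process on 64-bit Windows). The constant and function names are only for illustration, and I treat AVSMeter's MB figure as directly comparable to those limits.

```c
/* User-mode address space of a 32-bit Windows process, in MB. */
#define LIMIT_DEFAULT_MB 2048.0   /* without the large-address-aware flag */
#define LIMIT_LAA_MB     4096.0   /* large-address-aware, on 64-bit Windows */

/* Remaining address space in MB; negative means the limit is exceeded. */
double headroom_mb(double used_mb, double limit_mb)
{
    return limit_mb - used_mb;
}
```

At ~2378 MB the script is already past the 2 GB default and has about 1718 MB left below 4 GB. Large allocations can fail well before that headroom is used up, because the free address space is usually fragmented.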
Thanks for that build.

Cu Selur
__________________
Hybrid here in the forum, homepage
Old 16th April 2020, 10:44   #59  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 2,580
Quote:
Originally Posted by HolyWu View Post
fftw3-20200404-ef15637: https://www.mediafire.com/file/ueof4d9d6y3y589/fftw3-20200404-ef15637.7z/file
Thanks! What compiler did you use?
__________________
@turment on Telegram
Old 17th April 2020, 15:36   #60  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
Quote:
Originally Posted by HolyWu View Post
Thanks, I will try them, since https://forum.doom9.org/showthread.p...64#post1907464 does not seem as fast as v3.3.8 from https://forum.doom9.org/showthread.p...13#post1883313
__________________
See My Avisynth Stuff

Tags
fftw, fftw3.dll
