Quote:
Originally Posted by pinterf
I left four of them there. Two had no SSE replacement, only C. They'll be kept only until I check what they are doing and move them to SIMD intrinsics.
|
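I converted one of the inline asm functions to SIMD intrinsics, and yes, it was worth doing: on x64 the intrinsics build is roughly 30% faster than the plain C builds.
Results for a simple FFT3DFilter(sigma=3,plane=4):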
Code:
v2.3 (x86): 10.4 fps (VS2015, inline asm, speed is same as simd) - for comparison
v2.2 (x64): 9.27 fps (ICL build, C)
v2.3 (x64): 9.15 fps (VS2015, C)
test (x64): 11.94 fps (VS2015, simd intrinsics)
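For anyone wondering what such a conversion looks like in general, here is a minimal sketch. It is not the function I actually converted and it is not taken from the FFT3DFilter source; `scale_c` and `scale_sse` are made-up names for a hypothetical kernel that multiplies a float buffer by a constant. The point is only that intrinsics, unlike MSVC inline asm, also compile for x64.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (_mm_set1_ps, _mm_loadu_ps, ...)
#include <cstddef>

// Hypothetical example kernel, NOT from FFT3DFilter: plain C version.
void scale_c(float* data, std::size_t n, float factor) {
    for (std::size_t i = 0; i < n; ++i)
        data[i] *= factor;
}

// Same kernel with SSE intrinsics: processes 4 floats per iteration.
void scale_sse(float* data, std::size_t n, float factor) {
    const __m128 f = _mm_set1_ps(factor);       // broadcast factor to 4 lanes
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(data + i);      // unaligned load of 4 floats
        _mm_storeu_ps(data + i, _mm_mul_ps(v, f));
    }
    for (; i < n; ++i)                          // scalar tail for leftovers
        data[i] *= factor;
}
```

The unaligned load/store is used here only to keep the sketch safe for any pointer; with buffers known to be 16-byte aligned, the aligned variants could be used instead.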