15th January 2017, 16:53   #28
feisty2
I'm Siri
 
 
Join Date: Oct 2012
Location: void
Posts: 2,633
Quote:
Originally Posted by Myrsloik
YOU ARE NOT USING X87!!! You compiled it as x64 code, and the ABI (more or less) requires the compiler to use SSE2 instructions for floating-point math, obviously at least the scalar float versions. It's even possible that it managed to auto-vectorize like half of this code, since most of it is just mindless read and sum. Look at the generated code instead of asking us about what you, YOURSELF, told the compiler to do.
The "look at the generated code" part is a bit too hard for me, though... Staring at thousands of lines of generated assembly is far beyond my programming skill, since I'm not a professionally trained programmer...
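One way the assembly could be cut down to something readable is to move the hot loop into its own small source file and compile only that. This is a minimal sketch, not code from the plugin discussed here: the file name `sum_kernel.cpp` and the function `sum_floats` are made up for illustration, and the loop is only a stand-in for the "mindless read and sum" part.

```cpp
// sum_kernel.cpp: hypothetical file name, not taken from the plugin in this thread.
#include <cassert>
#include <cstddef>

// A stand-in "read and sum" loop, isolated so that its generated
// assembly is a few dozen lines instead of thousands.
float sum_floats(const float* data, std::size_t count) {
    assert(data != nullptr || count == 0);  // precondition: valid pointer unless empty
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// To see what the compiler generated for just this function:
//   GCC/Clang: g++ -O2 -S sum_kernel.cpp -o sum_kernel.s
//   MSVC:      cl /O2 /c /FAs sum_kernel.cpp   (writes sum_kernel.asm)
// In an x64 build, look for addss (scalar SSE) or addps (packed SSE)
// working on xmm registers; x87 code would use fld/fadd instead.
```

With default floating-point settings a float reduction like this usually stays scalar (`addss`), because vectorizing it would change the order of the additions; compilers generally only do that when allowed to, for example with `-ffast-math` on GCC/Clang or `/fp:fast` on MSVC.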
Quote:
Originally Posted by Myrsloik
Your assumption still wouldn't be true about x87 vs AVX. For simple algorithms you run into memory bandwidth limitations long before you see the glory of SSE (AVX matters even more rarely). Modern CPUs are just too good.
So does that mean it's pretty much pointless to manually optimize simple plugins?