Quote:
Originally Posted by Myrsloik
YOU ARE NOT USING X87!!! You compiled it as x64 code and the ABI (more or less) requires it to use sse2 instructions to implement it. Obviously at least the scalar float versions. It's even possible that it managed to auto vectorize like half of this code since most of it is just mindless read and sum. Look at the generated code instead of asking us about what you, YOURSELF, told the compiler to do.
|
the "Look at the generated code" part is a bit too hard to me tho... Staring at thousands lines of generated assembly is far beyond my programming skill since I'm not a professionally trained programmer...
Quote:
Originally Posted by Myrsloik
Your assumption still wouldn't be true about x87 vs avx. For simple algorithms you run into memory bw limitations long before you see the glory of sse (avx is even more rare to matter). Modern cpus are just too good.
|
which means it's pretty much pointless to manually optimize simple plugins?