Another update: the MMX code should now work correctly. Get version 0.3
here.
The current MMX code preserves the alpha channel. The reason to kill the channel would be speed: IanB's code performed about 22% faster than the C code in my last test (RtoRGB), while the current MMX code is about 15% faster than the C code. I think I will just add an option to keep the alpha channel, so people can choose which they want. I will also convert the inline assembly to SoftWire, since there is too much repeated MMX code.
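Here is a rough sketch in plain C of what a keep-alpha option could look like. It is not the actual plugin code: the names `process_rgb`, `process_row` and `keep_alpha` are made up for illustration, the pixels are assumed to be packed 32-bit values with alpha in the top byte, and the color operation (a simple inversion) is only a stand-in for the real filter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real per-pixel work: invert the three
   color channels. The alpha byte of the result is always zero. */
static uint32_t process_rgb(uint32_t px)
{
    return (~px) & 0x00FFFFFFu;
}

/* Process n packed 32-bit pixels (alpha assumed in the top byte).
   keep_alpha != 0: the source alpha byte is copied to the output.
   keep_alpha == 0: the output alpha byte is left at zero. */
void process_row(uint32_t *dst, const uint32_t *src, size_t n, int keep_alpha)
{
    /* Choose the mask once per row so the inner loop has no branch. */
    const uint32_t alpha_mask = keep_alpha ? 0xFF000000u : 0u;

    for (size_t i = 0; i < n; ++i) {
        dst[i] = process_rgb(src[i]) | (src[i] & alpha_mask);
    }
}
```

Picking the mask outside the loop keeps both paths on the same inner loop, so this C version pays very little for the option. Whether that carries over to the MMX code would have to be measured.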