View Single Post
Old 9th November 2011, 01:45   #4  |  Link
redfordxx
Registered User
 
Join Date: Jan 2005
Location: Praha (not that one in Texas)
Posts: 863
1) From what you wrote, basically I should not bother too much with the speed issues, since every machine has it differently implemented, correct? Anyway would you recommend some document where is clock cycles or whatever speed specification of the instruction? Maybe to learn whether pxor is faster than movq on mmx...
2) MMX<->XMM still should be faster than from memory on every machine, or not?
3) So since the pitch is mod16, basically it is safe (and good idea???) to run one cycle from 0 to pitch*height, instead of two cycles inside each other for height and width?
4) Saturation means 250+10=255?
5) Should I be interested in MOVNTDQ? I don't know what is the benefit of nontemporal
redfordxx is offline   Reply With Quote