This! I would spend hours unrolling loops etc., optimizing register use, and then profile my amazing x86 version only to find the compiler had found a better route. But you're right -- the times when your code came out 2X, 5X, 10X quicker than the compiler's... that was what you lived for :)