I hatesss it, precious. Why you can't simply specify "put these parameters in these registers and the result will be here" and "stack everything I clobber" is a mystery to me. The constraints system is approaching a black art that requires virgin sacrifices to get something that works. And FFS, don't even consider letting gcc allocate registers for you. That way lies madness and despair.
Seriously, I have spent the evening trying to implement approximately 6 lines of assembler (repeated across three inline functions) and I'm still not done. Arrrrrgggghhh!
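
For anyone who hasn't met the thing being complained about, this is roughly what "put these parameters in these registers and the result will be here" looks like in gcc's extended asm. It's only a minimal sketch, assuming x86 and the default AT&T syntax; `add_u32` is a made-up helper for illustration, not the code from this post, and on any other target it just falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical example: add two 32-bit values with explicit registers. */
static inline uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t result;
    __asm__ (
        "movl %%edx, %%ecx\n\t"   /* copy b into a scratch register */
        "addl %%ecx, %%eax"       /* eax = a + b */
        : "=a" (result)           /* output: result is read from eax */
        : "0" (a),                /* input: a shares operand 0's register (eax) */
          "d" (b)                 /* input: b is placed in edx */
        : "ecx", "cc"             /* clobbers: scratch register and flags */
    );
    return result;
#else
    /* Assumption: non-x86 targets get a plain C fallback. */
    return a + b;
#endif
}
```

The `"a"` and `"d"` constraints pin the operands to eax and edx, and the clobber list is the "everything I trash" part: it tells gcc not to keep anything live in ecx or the flags across the asm.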