The clts code didn't use set_cr0 properly, so emulating the clts
instruction never triggered our lazy FPU processing.
(This path isn't used on Intel, as the hardware does the decode for us.)
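
For illustration only, here is a toy model of this kind of bug. It is not
the kernel code: toy_vcpu, toy_set_cr0 and both toy_clts_* helpers are
hypothetical names, and the exact way the old code bypassed set_cr0 is an
assumption. The sketch assumes a set_cr0 hook that compares the new value
against the cached CR0 to notice TS being cleared:

```c
#include <stdbool.h>

#define CR0_TS (1UL << 3)	/* CR0.TS, the task-switched flag (bit 3) */

/* Hypothetical, simplified stand-in for the vcpu state. */
struct toy_vcpu {
	unsigned long cr0;	/* cached guest CR0 */
	bool fpu_active;	/* set once lazy FPU handling has run */
};

/*
 * Hypothetical set_cr0 hook: lazy FPU handling runs only when the hook
 * sees TS go from set (cached value) to clear (new value).
 */
void toy_set_cr0(struct toy_vcpu *vcpu, unsigned long cr0)
{
	if ((vcpu->cr0 & CR0_TS) && !(cr0 & CR0_TS))
		vcpu->fpu_active = true;
	vcpu->cr0 = cr0;
}

/*
 * Assumed shape of the bug: the cached CR0 is modified first, so the
 * hook sees no TS transition and the lazy FPU handling is skipped.
 */
void toy_clts_broken(struct toy_vcpu *vcpu)
{
	vcpu->cr0 &= ~CR0_TS;
	toy_set_cr0(vcpu, vcpu->cr0);
}

/* Pass the new value to the hook and let it update the cached CR0. */
void toy_clts_fixed(struct toy_vcpu *vcpu)
{
	toy_set_cr0(vcpu, vcpu->cr0 & ~CR0_TS);
}
```

In this model both helpers clear TS in the cached CR0, but only
toy_clts_fixed leaves fpu_active set.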

Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>