x86: also optimize KXOR{D,Q} and KANDN{D,Q}