x86: use D attribute also for SIMD templates
Various moves come in load and store forms, and just like on the GPR
and FPU sides there would better be only one pattern. In some cases this
is not feasible because the opcodes are too different, but quite a few
cases follow a similar standard scheme. Introduce Opcode_SIMD_FloatD and
Opcode_SIMD_IntD, generalize handling in operand_size_match() (reverse
operand handling there simply needs to match "straight" operand one),
and fix a long standing, but so far only latent bug with when to zap
found_reverse_match.
Also once again drop IgnoreSize where pointlessly applied to templates
touched anyway as well as *word when redundant with Reg*.