extra 3-in 2-out instructions, similar to Intel's
[shld](https://www.felixcloutier.com/x86/shld)
and [shrd](https://www.felixcloutier.com/x86/shrd),
-`mulx` and `idiv`, to be efficient.
+`mulx` and `divq`, to be efficient.
The same instruction (`maddedu`) is used in both
big-divide and big-multiply because `maddedu`'s primary
purpose is to perform a fused 64-bit scalar multiply with a large vector,
RB, the divisor, remains 64-bit. The instruction is therefore a 128/64
division, producing a pair of 64-bit results, in the same way that
-Intel [idiv](https://www.felixcloutier.com/x86/idiv) works.
+Intel [divq](https://www.felixcloutier.com/x86/div) works.
Overflow conditions
are detected in exactly the same fashion as `divdeu`, except that rather
than having `UNDEFINED` behaviour, RT is set to all ones and RS is set to all