extra 3-in 2-out instructions, similar to Intel's
[shld](https://www.felixcloutier.com/x86/shld)
and [shrd](https://www.felixcloutier.com/x86/shrd),
-`mulx` and `divq`, to be efficient.
+`mulx` and
+[divq](https://www.felixcloutier.com/x86/div),
+to be efficient.
The same instruction (`maddedu`) is used in both
big-divide and big-multiply because `maddedu`'s primary
purpose is to perform a fused 64-bit scalar multiply with a large vector,