prod[0:127] = (RA) * (RB)
sum[0:127] = EXTZ(RC) + prod
RT <- sum[64:127]
- RS <- sum[0:63] # RS is either RC or RT+MAXVL
+ RS <- sum[0:63] # RS implicit register, see below
+
+* In Scalar (non-SVP64) usage: `RS=RT+1`
+* For SVP64: RS may be either RC or RT+MAXVL
RC is zero-extended (not shifted) and the 128-bit product is added
to it; the lower half of that result is stored in RT, and `madded`
stores the upper half in RS. There is no
equivalent to `maddld` because `maddld` performs sign-extension on RC.
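
The pseudocode above can be modelled as a short Python sketch. This is an illustration only, not reference code from the specification: the function name `madded_model` and the `(rt, rs)` return convention are assumptions made for the example. Power ISA bit numbering is MSB0, so `sum[64:127]` is the low 64 bits and `sum[0:63]` the high 64 bits.

```python
MASK64 = (1 << 64) - 1  # all-ones 64-bit value


def madded_model(ra, rb, rc):
    """Hypothetical model: return (rt, rs) for unsigned ra * rb + rc.

    rt holds sum[64:127] (low half), rs holds sum[0:63] (high half).
    """
    prod = (ra & MASK64) * (rb & MASK64)  # 128-bit unsigned product
    total = prod + (rc & MASK64)          # rc is zero-extended, not shifted
    rt = total & MASK64                   # lower 64 bits
    rs = (total >> 64) & MASK64           # upper 64 bits
    return rt, rs


# Small values: 2 * 3 + 4 = 10, nothing spills into the upper half.
print(madded_model(2, 3, 4))
```

Even with all three inputs at their maximum, the sum is (2^64 - 1)^2 + (2^64 - 1) = 2^128 - 2^64, so it always fits in 128 bits and no carry-out is lost.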
-As a Scalar Power ISA operation, like `lq` and `stq` RS=RT+1.
-SVP64 overrides this behaviour.
+*Programmer's Note:
+As a Scalar Power ISA operation, like `lq` and `stq`, RS=RT+1.
+To achieve the same big-integer rolling-accumulation effect
+as SVP64, a sequence such as `madded r20,r4,r8,r20` followed by
+`madded r21,r5,r9,r21` (and so on) may be issued: the first
+`madded` stores the upper half of its 128-bit result into r21,
+where it is picked up as RC by the second `madded`.*
+
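The scalar chaining described in the note can be checked with a small register-file simulation. This is a sketch under stated assumptions: `madded_scalar` and `chain_demo` are hypothetical helper names, the register file is a plain Python list, and r8 and r9 are assumed to hold the same 64-bit multiplier so that the two instructions together compute a 128-bit by 64-bit product.

```python
MASK64 = (1 << 64) - 1  # all-ones 64-bit value


def madded_scalar(regs, rt, ra, rb, rc):
    """Hypothetical model of scalar `madded RT,RA,RB,RC` with RS=RT+1."""
    # Read all sources before writing, since RC may be the same register as RT.
    total = (regs[ra] & MASK64) * (regs[rb] & MASK64) + (regs[rc] & MASK64)
    regs[rt] = total & MASK64   # lower half into RT
    regs[rt + 1] = total >> 64  # upper half into implicit RS = RT+1


def chain_demo(a0, a1, b):
    """Multiply the two-limb value (a1:a0) by b using two chained madded."""
    regs = [0] * 32
    regs[4], regs[5] = a0, a1         # limbs of the big integer (assumed layout)
    regs[8] = regs[9] = b             # same multiplier in r8 and r9 (assumption)
    madded_scalar(regs, 20, 4, 8, 20)  # madded r20,r4,r8,r20: carry-out lands in r21
    madded_scalar(regs, 21, 5, 9, 21)  # madded r21,r5,r9,r21: consumes r21 as RC
    return regs[20] | (regs[21] << 64) | (regs[22] << 128)


a0, a1, b = MASK64, 0x0123456789ABCDEF, MASK64
assert chain_demo(a0, a1, b) == (a0 | (a1 << 64)) * b
```

The accumulator register doubles as both RC (carry-in) and RT (result limb), which is why each instruction in the chain names the same register in both positions.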
+SVP64 overrides the Scalar behaviour of what defines RS.
For SVP64 EXTRA register extension, the `RM-1P-3S-1D` format is
used with the additional bit set for determining RS.
offset* (see SVP64 [[svp64/appendix]] for full details).
When `EXTRA2_MODE` is set to one, the implicit RS register is identical
-to RC extended to SVP64 numbering, including whether RC is set Scalar or
+to RC extended with SVP64 using `Rsrc3_EXTRA2` in every respect, including whether RC is set Scalar or
Vector.
# divrem2du RT,RA,RB,RC