From: Luke Kenneth Casson Leighton
Date: Fri, 21 Oct 2022 14:23:58 +0000 (+0100)
Subject: rewrite of ls003 example, RS!=RT+1
X-Git-Tag: opf_rfc_ls005_v1~57
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=bc79d54eab2659d64b7600e810f851d7b8555aa9;p=libreriscv.git

rewrite of ls003 example, RS!=RT+1
---

diff --git a/openpower/sv/rfc/ls003.mdwn b/openpower/sv/rfc/ls003.mdwn
index 3717eaba4..a33ec4f5f 100644
--- a/openpower/sv/rfc/ls003.mdwn
+++ b/openpower/sv/rfc/ls003.mdwn
@@ -135,16 +135,18 @@ modulo 2^64 and sign/zero extension from 64 to 128 bits
 produces identical results modulo 2^64. This is why there is no maddldu instruction.
 
 *Programmer's Note:
-As a Scalar Power ISA operation, like `lq` and `stq`, RS=RT+1.
 To achieve a big-integer rolling-accumulation effect:
-assuming the scalar to multiply is in r0,
+assuming the scalar to multiply is in r0, and r3 is
+used (effectively) as a 64-bit carry,
 the vector to multiply by starts at r4 and the result vector
-in r20, instructions may be issued `maddedu r20,r4,r0,r20
-maddedu r21,r5,r0,r21` etc. where the first `maddedu` will have
-stored the upper half of the 128-bit multiply into r21, such
+in r20, instructions may be issued `maddedu r20,r4,r0,r3
+maddedu r21,r5,r0,r3` etc. where the first `maddedu` will have
+stored the upper half of the 128-bit multiply into r3, such
 that it may be picked up by the second `maddedu`. Repeat inline
 to construct a larger bigint scalar-vector multiply,
-as Scalar GPR register file space permits.*
+as Scalar GPR register file space permits. If register
+spill is required then r3, as the effective 64-bit carry,
+continues the chain.*
 
 Examples:
 
@@ -153,8 +155,10 @@ Examples:
 maddedu r4, r0, r1, r2
 
 # Chaining together for larger bigint (see Programmer's Note above)
-maddedu r20,r4,r0,r20
-maddedu r21,r5,r0,r21
+# r3 starts with zero (no carry-in)
+maddedu r20,r4,r0,r3
+maddedu r21,r5,r0,r3
+maddedu r22,r6,r0,r3
 ```
 
 ----------
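
The carry chain described in the revised Programmer's Note can be modelled in a few lines of Python. This is only an illustrative sketch, not the specification's pseudocode: it assumes, per the note, that `maddedu RT,RA,RB,RC` writes the low 64 bits of `RA*RB + RC` to RT and the high 64 bits back into the RC register (r3 in the example). The helper names `maddedu`, `scalar_vector_mul` and `limbs_to_int` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def maddedu(ra, rb, rc):
    """Assumed model of one maddedu step.

    Returns (low, high): the low and high 64-bit halves of ra*rb + rc.
    The sum always fits in 128 bits because
    (2^64 - 1)^2 + (2^64 - 1) = 2^128 - 2^64.
    """
    total = ra * rb + rc
    return total & MASK64, total >> 64


def scalar_vector_mul(vec, scalar):
    """Multiply little-endian 64-bit limbs by a 64-bit scalar.

    `carry` plays the role of r3: it starts at zero (no carry-in) and
    each step's high half feeds the next step.
    """
    carry = 0
    result = []
    for limb in vec:
        low, carry = maddedu(limb, scalar, carry)
        result.append(low)
    return result, carry


def limbs_to_int(limbs):
    """Reassemble little-endian 64-bit limbs into one Python integer."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))
```

With three input limbs this mirrors the three chained `maddedu` instructions in the example: the returned list corresponds to r20..r22, and the final `carry` is what r3 would hold if the chain continued, for instance after a register spill.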