(no commit message)

[libreriscv.git] / openpower / sv / biginteger.mdwn
diff --git a/openpower/sv/biginteger.mdwn b/openpower/sv/biginteger.mdwn

index b122c428a7701d225c64af6d7d58b0fcd975e8fa..25bc3ad8652ebf152b73d90779af323e0368b0d5 100644
--- a/openpower/sv/biginteger.mdwn
+++ b/openpower/sv/biginteger.mdwn
@@ -2,7 +2,7 @@
  
  # Big Integer Arithmetic
  
-**DRAFT STATUS** 19apr2021
+**DRAFT STATUS** 19apr2022, last edited 23may2022
  
  * [[discussion]] page for notes
  * <https://bugs.libre-soc.org/show_bug.cgi?id=817> bugreport
@@ -57,10 +57,7 @@ The pseudocode for `madded RT, RA, RB, RC` is:
      RT <- sum[64:127]
      RS <- sum[0:63] # RS implicit register, see below
  
-* In Scalar (non-SVP64) usage: `RS=RT+1`
-* For SVP64: RS may be either RC or RT+MAXVL
-
-RC is zero-extended (not shifted), the 128-bit product added
+RC is zero-extended (not shifted, not sign-extended), the 128-bit product added
  to it; the lower half of that result stored in RT and the upper half
  in RS.
  
@@ -71,8 +68,10 @@ equivalent to `maddld` because `maddld` performs sign-extension on RC.
  *Programmer's Note:
  As a Scalar Power ISA operation, like `lq` and `stq`, RS=RT+1.
  To achieve the same big-integer rolling-accumulation effect
-as SVP64, instructions may be issued `madded r20,r4,r8,r20
-madded r21,r5,r9,r21` etc. where the first `madded` will have
+as SVP64: assuming the scalar to multiply is in r0, 
+the vector to multiply by starts at r4 and the result vector
+in r20, instructions may be issued `madded r20,r4,r0,r20
+madded r21,r5,r0,r21` etc. where the first `madded` will have
  stored the upper half of the 128-bit multiply into r21, such
  that it may be picked up by the second `madded`. Repeat inline
  to construct a larger bigint scalar-vector multiply,
@@ -116,7 +115,13 @@ RB, the divisor, remains 64 bit.  The instruction is therefore a 128/64
  division, producing a (pair) of 64 bit result(s).  Overflow conditions
  are detected in exactly the same fashion as `divdeu`, except that rather
  than have `UNDEFINED` behaviour, RT is set to all ones and RS set to all
-zeros.
+zeros on overflow.
+
+*Programmer's note: there are no Rc variants of any of these VA-Form
+instructions. `cmpi` will need to be used to detect overflow conditions:
+the saving in instruction count is that both RT and RS will have already
+been set to useful values needed as part of implementing Knuth's
+Algorithm D*
  
  For SVP64, given that this instruction is also 3-in 2-out 64-bit registers,
  the exact same EXTRA format and setting of RS is used as for `sv.madded`.
@@ -131,13 +136,10 @@ Pseudo-code:
          modulo <- dividend % divisor
          RT <- result[XLEN:(XLEN*2)-1]
          RS <- modulo[XLEN:(XLEN*2)-1]
-        overflow <- 0
      else
-        overflow <- 1
          RT <- [1]*XLEN
          RS <- [0]*XLEN
  
-
  # [DRAFT] EXT04 Proposed Map
  
  For the Opcode map (XO Field)
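
Reviewer sketch (not part of the patch): the scalar `madded` chain described in the Programmer's Note can be modelled in a few lines of Python, which may help when checking the register numbering in the revised example. It follows the wording in the hunks above: RC is zero-extended, the 128-bit product is added to it, the lower half goes to RT and the upper half to the implicit RS=RT+1. The register-file list, the helper names `madded` and `scalar_vector_mul`, and the little-endian word order of the vector are assumptions made for illustration only.

```python
MASK64 = (1 << 64) - 1


def madded(regs, rt, ra, rb, rc):
    """Model of scalar `madded RT, RA, RB, RC` with implicit RS = RT + 1.

    `regs` is a hypothetical register file: a list of unsigned 64-bit ints.
    """
    # All three sources are read before any destination is written.
    # The sum is at most 2**128 - 2**64, so it always fits in 128 bits.
    total = regs[ra] * regs[rb] + regs[rc]
    regs[rt] = total & MASK64    # lower half of the 128-bit sum -> RT
    regs[rt + 1] = total >> 64   # upper half of the 128-bit sum -> RS


def scalar_vector_mul(scalar, words):
    """Multiply a bigint (least-significant 64-bit word first) by a scalar.

    Mirrors the note's layout: scalar in r0, vector from r4, result from r20.
    """
    assert len(words) <= 16, "vector at r4.. must not overlap the result at r20.."
    regs = [0] * 128
    regs[0] = scalar
    for i, word in enumerate(words):
        regs[4 + i] = word
    # madded r20,r4,r0,r20 ; madded r21,r5,r0,r21 ; ...
    # Each step leaves its upper half in the next result register,
    # where the following madded picks it up as RC.
    for i in range(len(words)):
        madded(regs, 20 + i, 4 + i, 0, 20 + i)
    return regs[20:20 + len(words) + 1]


def to_int(words):
    """Reassemble least-significant-first 64-bit words into a Python int."""
    return sum(word << (64 * i) for i, word in enumerate(words))
```

Under these assumptions, `to_int(scalar_vector_mul(s, v))` equals `s * to_int(v)` for any 64-bit `s` and any vector of up to 16 words, which is the rolling-accumulation effect the note describes.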