From 95693263675b148a63fd650decaac7b9002e3fa3 Mon Sep 17 00:00:00 2001
From: lkcl <lkcl@web>
Date: Wed, 20 Apr 2022 12:16:08 +0100
Subject: [PATCH]

---
 openpower/sv/biginteger.mdwn | 45 +++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/openpower/sv/biginteger.mdwn b/openpower/sv/biginteger.mdwn
index 27d6981e8..5220ea635 100644
--- a/openpower/sv/biginteger.mdwn
+++ b/openpower/sv/biginteger.mdwn
@@ -105,24 +105,33 @@ operations. Such a trick works equally as well in Scalar-only.
 
 **Application of SVP64**
 
-SVP64 has the means to re-target in-out registers that would normally
-be forced to be an overwrite.  Examples include `ldu` which, ordinarily,
-in Scalar v3.0B, has RA overwritten.  `sv.ldu` on the other hand permits
-limited range re-targetting, by applying one EXTRA bit to RA-as-a-source
-and a *separate* bit to RA-as-a-destination.
-
-If applied to this new 3-in 2-out mul-and-add operation it not only
-becomes possible to set RC as either scalar or vector, it becomes
-possible to stop RC from being overwritten.
-
-    product = RA*RB+RC      # RC sourced as Vector
-    RT = lowerhalf(product) # Vector destination
-    RC = upperhalf(product) # Vector destination
-
-Where previously this instruction had limited specialist applicability
-for big-integer multiply, because RC could only be utilised as a
-64-bit Carry, the possibility for RC to be a Vector greatly
-expands its potential.
+SVP64 has the means to mark registers as scalar or vector. However
+the available space in the prefix is extremely limited (9 bits).
+With effectively 5 operands (3 in, 2 out) some compromises are needed.
+A little thought gives a useful workaround: two modes,
+controlled by a single bit in `RM.EXTRA`, determine whether the 5th
+register is set to RC or to RT+VL. This then leaves only
+4 registers to qualify as scalar/vector, and these can use four
+EXTRA2 designators, which fit into the available space.
+
+RS=RT+VL Mode:
+
+    product = RA*RB+RC
+    RT = lowerhalf(product)
+    RS=RT+VL = upperhalf(product)
+
+and RS=RC Mode:
+
+    product = RA*RB+RC
+    RT = lowerhalf(product)
+    RS=RC = upperhalf(product)
+
+Now there is much more potential, including setting RC to a Scalar,
+which would be useful as a 64-bit Carry. RC as a Vector would produce
+a Vector of the HI halves of a Vector of multiplies.  RS=RT+VL Mode
+would allow that same Vector of HI halves to be written without
+overwriting RC. Any of RA, RB or RC may also be marked as scalar or
+vector, which makes the instruction very flexible.
 
 ## Divide
 
-- 
2.30.2