From: lkcl <lkcl@web>
Date: Sun, 2 Oct 2022 14:33:15 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~247
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=b4a697b389c7ec264e9be98bd14437824a31812b;p=libreriscv.git

---

diff --git a/openpower/sv/biginteger.mdwn b/openpower/sv/biginteger.mdwn
index 266924467..6a7392e9c 100644
--- a/openpower/sv/biginteger.mdwn
+++ b/openpower/sv/biginteger.mdwn
@@ -28,13 +28,19 @@ Dynamic SIMD ALUs for maximum performance and effectiveness.
 Covered in [[biginteger/analysis]] the summary is that standard `adde`
 is sufficient for SVP64 Vectorisation of big-integer addition (and `subfe`
 for subtraction) but that big-integer shift, multiply and divide require an
-extra 3-in 2-out instructions, similar to Intel's `shld`, `shrd`,
+extra 3-in 2-out instructions, similar to Intel's 
+[shld](https://www.felixcloutier.com/x86/shld)
+and [shrd](https://www.felixcloutier.com/x86/shrd),
 `mulx` and `idiv`, to be efficient.
-The same instruction (`maddedu`) is used for both because 'maddedu''s primary
+The same instruction (`maddedu`) is used in both
+big-divide and big-multiply because `maddedu`'s primary
 purpose is to perform a fused 64-bit scalar multiply with a large vector,
 where that result is Big-Added for Big-Multiply, but Big-Subtracted for
 Big-Divide.
 
+Chaining the operations together gives Scalar-by-Vector 
+operations, except for `sv.adde` and `sv.subfe` which are
+Vector-by-Vector Chainable (through the `CA` flag).
 Macro-op Fusion and back-end massively-wide SIMD ALUs may be deployed in a
 fashion that is hidden from the user, behind a consistent, stable ISA API.
 The same macro-op fusion may theoretically be deployed even on Scalar
@@ -44,7 +50,7 @@ operations.
 
 **DRAFT**
 
-`dsld` and `dsrd` are is similar to v3.0 `sld`, and
+`dsld` and `dsrd` are similar to v3.0 `sld`, and
 is Z23-Form in "overwrite" on RT.
 
 |0.....5|6..10|11..15|16..20|21.22|23..30|31|