From: lkcl <lkcl@web>
Date: Thu, 8 Sep 2022 15:29:05 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~616
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=790ecbcef0665c75a66f97bcb57bf4c97f09ee5d;p=libreriscv.git

---

diff --git a/openpower/sv/rfc/ls001.mdwn b/openpower/sv/rfc/ls001.mdwn
index 487774576..0940676ec 100644
--- a/openpower/sv/rfc/ls001.mdwn
+++ b/openpower/sv/rfc/ls001.mdwn
@@ -94,7 +94,7 @@ the next decade.
   with SVLR by SV-Branch-Conditional for exactly the same reason that NIA is swapped
   with LR
 
-* Vector Management Instructions
+**Vector Management Instructions**
 
 * **setvl** - Cray-style Scalar Vector Length instruction
 * **svstep** - used for Vertical-First Mode and for enquiring about internal state
@@ -108,14 +108,33 @@ the next decade.
 # SVP64 24-bit Prefix
 
 The SVP64 24-bit Prefix provides several options, too numerous to describe in this
-document. The primary options are:
+document but all fitting within the 24-bit space (and no other).
+The primary options are:
 
 * element-width overrides, which dynamically redefine each SFFS or SFS Scalar prefixed 
   instruction to be 8-bit, 16-bit, 32-bit or 64-bit operands **without requiring new
   8/16/32 instructions** [^pseudorewrite]
-
-* Due to a concept called "Element-width Overrides
-
+* Predication. This is an absolutely essential feature for a 3D GPU or VPU ISA.
+  CR Fields are available as Predicate Masks, hence their extension to 128.
+* Saturation. **All** LD/ST, Arithmetic and Logical operations may be saturated
+  (without adding explicit scalar saturated opcodes).
+* Reduction and Prefix-Sum (Fibonacci Series) Modes
+
+# REMAP subsystem
+
+REMAP is extremely advanced but brings features already present in other DSPs and
+Supercomputing ISAs.
+
+* DCT/FFT REMAP brings more capability than TI's MSP-Series DSPs and Qualcomm Hexagon DSPs
+* Matrix REMAP brings more capability than any other Matrix Extension (AMD GPUs,
+  Intel, ARM), not being restricted to power-of-two sizes. Nor is it limited in the type
+  of operation: it may perform Warshall Transitive Closure, Integer Matrix,
+  Bit-manipulation Matrix, Galois Field (carryless mul) Matrix, and with care potentially
+  Graph Maximum Flow as well. It is also suited to Convolutions, Matrix Transpose and Rotate.
+* General-purpose Indexed REMAP: this option is provided to implement an equivalent
+  of VSX `vperm`
+* Parallel Reduction REMAP performs an automatic map-reduce using *any suitable
+  scalar operation*.
 
 [^extend]: Prefix opcode space **must** be reserved in advance to do so, in order to avoid the catastrophic binary-incompatibility mistake made by RISC-V RVV and ARM SVE/2
 [^likeext001]: SVP64-Single is remarkably similar to the "bit 1" of EXT001 being set to indicate that the 64-bits is to be allocated in full to a new encoding, but in fact it still embeds v3.0 Scalar operations.