From: lkcl <lkcl@web>
Date: Sat, 4 Jun 2022 14:41:53 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~1987
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=142706fd85a6b7d1c8337df8f8755cf45d10a53b;p=libreriscv.git

---

diff --git a/openpower/sv/remap.mdwn b/openpower/sv/remap.mdwn
index af60be92a..bf4ef79f9 100644
--- a/openpower/sv/remap.mdwn
+++ b/openpower/sv/remap.mdwn
@@ -36,14 +36,23 @@ latency, so should realistically be used only where it is worthwhile.
 Commonly-used patterns such as Matrix Multiply, DCT and FFT have
 helper instruction options which make REMAP easier to use.
 
+There are three types of REMAP:
+
+* **Matrix**, also known as 2D and 3D reshaping
+* **FFT/DCT**, with full triple-loop in-place support: limited to
+  power-of-two radix
+* **Indexing**, for any general-purpose reordering. Currently
+  under development.
+
 # Principle
 
-* normal vector element read/write as operands would be sequential
+* normal vector element read/write of operands would be sequential
   (0 1 2 3 ....)
 * this is not appropriate for (e.g.) Matrix multiply which requires
   accessing elements in alternative sequences (0 3 6 1 4 7 ...)
 * normal Vector ISAs use either Indexed-MV or Indexed-LD/ST to "cope"
   with this.  both are expensive (copy large vectors, spill through memory)
+  and very few Packed SIMD ISAs cope with non-power-of-two sizes.
 * REMAP **redefines** the order of access according to set "Schedules".
 * The Schedules are not necessarily restricted to power-of-two boundaries
   making it unnecessary to have for example specialised 3x4 transpose