From: lkcl <lkcl@web>
Date: Thu, 7 Jul 2022 10:00:55 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~1304
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=7aebe6389a3a7e38c488b972b838f136bc4ae9a8;p=libreriscv.git

---

diff --git a/openpower/sv/remap.mdwn b/openpower/sv/remap.mdwn
index 57f13f556..0ea1cc15c 100644
--- a/openpower/sv/remap.mdwn
+++ b/openpower/sv/remap.mdwn
@@ -126,9 +126,26 @@ having to perform data Transpose by pushing out through Memory and back,
 or computing Transposition Indices (costly) then copying to another
 Vector (costly).
 
-Matrix REMAP was thus designed to solve these issues by providing
+Matrix REMAP was thus designed to solve these issues by providing
+Hardware-Assisted
 "Schedules" that can view what would otherwise be limited to a strictly
-linear Vector as instead being 2D (even 3D) in-place reordered.
+linear Vector as instead being 2D (even 3D) *in-place* reordered.
+With both Transposition and non-power-of-two sizes supported, the issues
+faced by other ISAs are mitigated.
+
+One limitation of Matrix REMAP is that the Vector Length (VL) is currently
+restricted to 127: up to 127 FMAs may be performed in total (potentially
+127 vec2/3/4 FMAs may be used, but this requires additional research).
+Also, given that it is in-registers only at present, some care has to be
+taken over regfile resource utilisation. However, it is perfectly possible
+to utilise Matrix REMAP to perform the three inner-most "kernel" loops of
+the usual 6-level large Matrix Multiply, without the usual difficulties
+associated with SIMD.
+
+Also, the `svshape` instruction only provides access to part of the
+Matrix REMAP capability. Rotation and mirroring need to be done by
+programming the SVSHAPE SPRs directly, which can take many more
+instructions.
 
 ## FFT/DCT Triple Loop