or computing Transposition Indices (costly) then copying to another
Vector (costly).
-Matrix REMAP was thus designed to solve these issues by providing
+Matrix REMAP was thus designed to solve these issues by providing
+Hardware-Assisted
"Schedules" that can view what would otherwise be limited to a strictly
-linear Vector as instead being 2D (even 3D) in-place reordered.
+linear Vector as instead being 2D (even 3D) *in-place* reordered.
+With both Transposition and non-power-of-two sizes supported, the issues
+faced by other ISAs are mitigated.
+
+Matrix REMAP has limitations. The Vector Length (VL) is currently
+restricted to 127, so at most 127 FMAs may be performed in total
+(potentially 127 vec2/3/4 FMAs, but this requires additional research).
+Also, because it is in-registers only at present, some care has to be
+taken over regfile resource utilisation. However, it is perfectly possible
+to utilise Matrix REMAP to perform the three inner-most "kernel" loops of
+the usual 6-level large Matrix Multiply, without the usual difficulties
+associated with SIMD.
+
+Also, the `svshape` instruction only provides access to part of the
+Matrix REMAP capability. Rotation and mirroring need to be done by
+programming the SVSHAPE SPRs directly, which can take many more
+instructions.
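
The idea of a "Schedule" that treats a linear Vector as a 2D matrix can be sketched in software. The Python below is an illustration only, not the hardware algorithm: the function names `matrix_schedule` and `matmul_flat`, the row-major layout and the loop order are all assumptions made for this example. It generates one triple of flat offsets per FMA, refuses any schedule that would exceed the 127-element limit mentioned above, and works unchanged for non-power-of-two sizes.

```python
def matrix_schedule(rows_a, cols_a, cols_b, max_vl=127):
    """Yield (a_idx, b_idx, c_idx) offsets into flat row-major vectors.

    Hypothetical model: one triple per FMA, capped at max_vl in total.
    """
    total = rows_a * cols_a * cols_b
    if total > max_vl:
        raise ValueError(f"{total} FMAs exceeds the limit of {max_vl}")
    for i in range(rows_a):          # row of A and of the result
        for j in range(cols_b):      # column of B and of the result
            for k in range(cols_a):  # inner (dot-product) dimension
                yield (i * cols_a + k, k * cols_b + j, i * cols_b + j)


def matmul_flat(a, b, rows_a, cols_a, cols_b):
    """Multiply two flat row-major matrices using the schedule above."""
    c = [0] * (rows_a * cols_b)
    for a_idx, b_idx, c_idx in matrix_schedule(rows_a, cols_a, cols_b):
        c[c_idx] += a[a_idx] * b[b_idx]  # one FMA per schedule entry
    return c


# A 2x3 matrix times a 3x2 matrix: 12 FMAs, no power-of-two padding.
result = matmul_flat([1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], 2, 3, 2)
```

Under this model a 5x5 by 5x5 multiply (125 FMAs) fits in a single schedule, while 6x6 by 6x6 (216 FMAs) does not and would have to be split by outer loops.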
## FFT/DCT Triple Loop