REMAP is an advanced form of Vector "Structure Packing" that
provides hardware-level support for commonly-used *nested* loop patterns.
-
+For more general reordering, an Indexed REMAP mode is available.
REMAP allows the usual vector loop `0..VL-1` to be "reshaped" (re-mapped)
from a linear form to a 2D or 3D transposed form, or "offset" to permit
arbitrary access to elements, independently on each Vector src or dest
register.
-Their primary use is for Matrix Multiplication, reordering of sequential
+The initial primary motivation for REMAP was Matrix Multiplication, reordering of sequential
data in-place. Four SPRs are provided so that a single FMAC may be
used in a single loop to perform 4x4 times 4x4 Matrix multiplication,
generating 64 FMACs. Additional uses include regular "Structure Packing"