100% Deterministic loop to perform 5x3 times 3x4 Matrix multiplication,
generating 60 FMACs *without needing explicit assembler unrolling*.
Additional uses include regular "Structure Packing" such as RGB pixel
-data extraction and reforming.
+data extraction and reforming (although less costly vec2/3/4 reshaping
+is achievable with `PACK/UNPACK`).
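
As a rough illustration of the matrix-multiply claim above, the sketch below uses plain Python rather than SV assembler. It walks a single flat loop of 60 steps and derives the three matrix indices from the step counter, so no nested or unrolled loops are needed. The names `matmul_flat`, `ROWS_A`, `INNER` and `COLS_B` are hypothetical, and the iteration order is an assumption made for readability; it is not taken from the REMAP specification and the order hardware actually uses may differ.

```python
# Illustrative sketch only: models "one flat loop, remapped indices".
# Names and iteration order are assumptions, not the REMAP specification.
ROWS_A, INNER, COLS_B = 5, 3, 4  # a 5x3 matrix times a 3x4 matrix


def matmul_flat(a, b):
    """Multiply flat row-major a (5x3) by flat row-major b (3x4).

    Returns the flat row-major 5x4 result and the number of
    multiply-accumulate steps performed.
    """
    out = [0] * (ROWS_A * COLS_B)
    fmacs = 0
    # One loop of 5 * 4 * 3 = 60 steps; each step is one multiply-accumulate.
    for step in range(ROWS_A * COLS_B * INNER):
        k = step % INNER                # position along the shared dimension
        j = (step // INNER) % COLS_B    # column of the result
        i = step // (INNER * COLS_B)    # row of the result
        out[i * COLS_B + j] += a[i * INNER + k] * b[k * COLS_B + j]
        fmacs += 1
    return out, fmacs
```

Calling `matmul_flat(list(range(15)), list(range(12)))` returns the 20-element result together with a step count of 60, matching the FMAC count quoted above.
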
REMAP, like all of SV, is abstracted out, meaning that unlike traditional
Vector ISAs which would typically only have a limited set of instructions
that can be structure-packed (LD/ST and Move operations
being the most common), REMAP may be applied to
-literally any instruction: CRs, Arithmetic, Logical, LD/ST, anything.
+literally any instruction: CRs, Arithmetic, Logical, LD/ST, even
+Vectorised Branch-Conditional.
When SUBVL is greater than 1 a given group of Subvector
elements are kept together: effectively the group becomes the