## Basic principle
The following illustrates why REMAP was added.

* normally, vector element reads and writes of operands are sequential
 (0 1 2 3 ....)
* this is not appropriate for (e.g.) Matrix multiply, which requires
 accessing elements in a non-sequential order
* Matrix Schedules are not restricted to power-of-two boundaries,
 making it unnecessary to have, for example, the specialised 3x4 transpose
 instructions found in other Vector ISAs.
* DCT and FFT REMAP are limited to RADIX-2, but this is also the case in
 existing Packed/Predicated SIMD ISAs (and Bluestein Convolution is
 typically deployed to work around that).

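
As a rough illustration of the points above, the following Python sketch shows how a single flat element loop can perform a matrix multiply once the element offsets come from a schedule rather than from the sequential 0 1 2 3 order. It is not the actual REMAP hardware schedule: `matrix_schedule` and `remapped_matmul` are hypothetical names, and the row-major layout and the loop nesting order are assumptions made for this example only. The 2x3 by 3x4 dimensions are chosen to show that nothing depends on power-of-two sizes.

```python
def matrix_schedule(rows, inner, cols):
    """Yield (a_idx, b_idx, c_idx) flat offsets for C += A * B.

    Hypothetical helper (not part of any specification): A is rows x inner,
    B is inner x cols, C is rows x cols, all stored row-major in flat lists.
    """
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                yield i * inner + k, k * cols + j, i * cols + j


def remapped_matmul(a, b, rows, inner, cols):
    """Multiply two flat row-major matrices using one flat element loop."""
    c = [0] * (rows * cols)
    # The schedule supplies the element offsets that would otherwise
    # simply be sequential (0 1 2 3 ...).
    for a_idx, b_idx, c_idx in matrix_schedule(rows, inner, cols):
        c[c_idx] += a[a_idx] * b[b_idx]
    return c


# Non-power-of-two shapes: A is 2x3, B is 3x4, so C is 2x4.
a = [1, 2, 3,
     4, 5, 6]
b = [1, 0, 2, 1,
     0, 1, 1, 0,
     3, 1, 0, 2]
c = remapped_matmul(a, b, 2, 3, 4)

# First few offsets used to read B: clearly not sequential.
b_offsets = [b_idx for _, b_idx, _ in matrix_schedule(2, 3, 4)][:6]
print(c)
print(b_offsets)
```

With these inputs the first six offsets into `b` are 0, 4, 8, 1, 5, 9: the multiply walks down the columns of B even though the data itself is never transposed or copied.
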
Only the most commonly-used algorithms in computer science have REMAP
support, due to the high cost in both the ISA and in hardware. For