extension to 128.
* Saturation. **all** LD/ST and Arithmetic and Logical operations may
be saturated (without adding explicit scalar saturated opcodes)
-* Reduction and Prefix-Sum (Fibonnacci Series) Modes
+* Reduction and Prefix-Sum (Fibonacci Series) Modes as well as vec2/3/4
+ "Packing" and "Unpacking".
The `SVP64-Single` 24-bit encoding focusses primarily on ensuring that
all 128 Scalar registers are fully accessible, provides element-width
is extremely advanced but brings features already present in other
DSPs and Supercomputing ISAs.
-* DCT/FFT REMAP brings more capability than TI's MSP-Series DSPs and
+* **DCT/FFT** REMAP brings more capability than TI's MSP-Series DSPs and
Qualcomm Hexagon DSPs, and is not restricted to Integer or FP.
(Galois Field is possible, implementing NTT). Operates *in-place*
significantly reducing register usage.
-* Matrix REMAP brings more capability than any other Matrix Extension
+* **Matrix** REMAP brings more capability than any other Matrix Extension
(AMD GPUs, Intel, ARM), not being restricted to Power-2 sizes. Also not
limited to the type of operation, it may perform Warshall Transitive
Closure, Integer Matrix, Bitmanipulation Matrix, Galois Field (carryless
mul) Matrix, and with care potentially Graph Maximum Flow as well. Also
suited to Convolutions, Matrix Transpose and rotate, *all* of which is
in-place.
-* General-purpose Indexed REMAP, this option is provided to implement
+* **General-purpose Indexed** REMAP, this option is provided to implement
an equivalent of VSX `vperm`
-* Parallel Reduction REMAP, performs an automatic map-reduce using
+* **Parallel Reduction** REMAP, performs an automatic map-reduce using
*any suitable scalar operation*.
# Scalar Operations