| ISA <br>name | Num <br>opcodes | Taxonomy / <br> Class | Predicate <br> Masks | Twin <br> Predication | Explicit <br> Vector regs | 128-bit | Bigint <br> capability | LDST <br> Fault-First | Data-dependent <br> Fail-first | Predicate-<br> Result | Matrix HW<br> support |
|----------------|-----------------|-----------------------|----------------------|-----------------------|----------------------------|---------|------------------------|-----------------------|--------------------------------|-----------------------|-----------------------|
-| SVP64 | 5 {1} | Scalable {2} | yes | yes {3} | no {4} | see {5} | yes {6} | yes {7} | yes {8} | yes {9} | yes {10} |
-| VSX | 700+ | Packed SIMD | no | no | yes {11} | yes | no | no | no | no | yes {12} |
-| NEON | ~250 {13} | Predicated SIMD | yes | no | yes | yes | no | no | no | no | no |
-| SVE2 | ~1000 {14} | Scalable HW {15} | yes | no | yes | yes | no | yes {7} | no | no | no |
-| AVX-512 {16} | ~1000s {17} | Predicated SIMD | yes | no | yes | yes | no | no | no | no | no |
-| RVV {18} | ~190 | Scalable {19} | yes | no | yes | yes {20}| no | yes | no | no | no |
-| Aurora SX {21} | ~200 {22} | Scalable {23} | yes | no | yes | no | no | no | no | no | no |
+| SVP64 | 5 (1) | Scalable (2) | yes | yes (3) | no (4) | see (5) | yes (6) | yes (7) | yes (8) | yes (9) | yes (10) |
+| VSX | 700+ | Packed SIMD | no | no | yes (11) | yes | no | no | no | no | yes (12) |
+| NEON | ~250 (13) | Predicated SIMD | yes | no | yes | yes | no | no | no | no | no |
+| SVE2 | ~1000 (14) | Scalable HW (15) | yes | no | yes | yes | no | yes (7) | no | no | no |
+| AVX-512 (16) | ~1000s (17) | Predicated SIMD | yes | no | yes | yes | no | no | no | no | no |
+| RVV (18) | ~190 | Scalable (19) | yes | no | yes | yes (20)| no | yes | no | no | no |
+| Aurora SX (21) | ~200 (22) | Scalable (23) | yes | no | yes | no | no | no | no | no | no |
-* {1}: plus EXT001 24-bit prefixing. See [[sv/svp64]]
-* {2}: A 2-Dimensional Scalable Vector ISA with both Horizontal-First and Vertical-First Modes. See [[sv/vector_isa_comparison]]
-* {3}: on specific operations. See [[opcode_regs_deduped]] for full list
-* {4}: SVP64 provides the Vector register concept on top of the **Scalar** GPR, FPR and CR Fields, extended to 128 entries.
-* {5}: SVP64 Vectorises Scalar instructions. It is up to the **implementor** to choose (**optionally**) whether to apply SVP64 to e.g. VSX Quad-Precision (128-bit) instructions.
-* {6}: big-integer add is just `sv.adde`. Mul and divide require addition of two scalar operations
-* {7} See [[sv/svp64/appendix]] and [ARM SVE Fault-First](https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf)
-* {8} Based on LD/ST Fail-first, extended to data. See [[sv/svp64/appendix]]
-* {9} Turns standard ops into a type of "cmp". See [[sv/svp64/appendix]]
-* {10} Any non-power-of-two Matrix up to 127 FMACs. Also DCT (Lee) and FFT Full (RADIX2) Triple-loops supported. See [[sv/svp64/remap]]
-* {11} VSX's Vector Registers are mis-named: they are 100% PackedSIMD. AVX-512 is not a Vector ISA either. See [Flynn's Taxonomy](https://en.wikipedia.org/wiki/Flynn%27s_taxonomy)
-* {12} Power ISA v3.1 contains "Matrix Multiply Assist" (MMA) which due to PackedSIMD is restricted to RADIX2 and requires inline assembler loop-unrolling for non-power-of-two Matrix dimensions
-* {13} difficult to ascertain, see [NEON/VFP](https://developer.arm.com/documentation/den0018/a/NEON-and-VFP-Instruction-Summary/List-of-all-NEON-and-VFP-instructions).
+* (1): plus EXT001 24-bit prefixing. See [[sv/svp64]]
+* (2): A 2-Dimensional Scalable Vector ISA with both Horizontal-First and Vertical-First Modes. See [[sv/vector_isa_comparison]]
+* (3): on specific operations. See [[opcode_regs_deduped]] for full list
+* (4): SVP64 provides the Vector register concept on top of the **Scalar** GPR, FPR and CR Fields, extended to 128 entries.
+* (5): SVP64 Vectorises Scalar instructions. It is up to the **implementor** to choose (**optionally**) whether to apply SVP64 to e.g. VSX Quad-Precision (128-bit) instructions.
+* (6): big-integer add is just `sv.adde`. Mul and divide require addition of two scalar operations
+* (7): See [[sv/svp64/appendix]] and [ARM SVE Fault-First](https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf)
+* (8): Based on LD/ST Fail-first, extended to data. See [[sv/svp64/appendix]]
+* (9): Turns standard ops into a type of "cmp". See [[sv/svp64/appendix]]
+* (10): Any non-power-of-two Matrix up to 127 FMACs. Also DCT (Lee) and FFT Full (RADIX2) Triple-loops supported. See [[sv/svp64/remap]]
+* (11): VSX's Vector Registers are mis-named: they are 100% PackedSIMD. AVX-512 is not a Vector ISA either. See [Flynn's Taxonomy](https://en.wikipedia.org/wiki/Flynn%27s_taxonomy)
+* (12): Power ISA v3.1 contains "Matrix Multiply Assist" (MMA) which, due to PackedSIMD, is restricted to RADIX2 and requires inline assembler loop-unrolling for non-power-of-two Matrix dimensions
+* (13): difficult to ascertain, see [NEON/VFP](https://developer.arm.com/documentation/den0018/a/NEON-and-VFP-Instruction-Summary/List-of-all-NEON-and-VFP-instructions).
Critically depends on ARM Scalar instructions
-* {14} difficult to exactly ascertain, see ARM Architecture Reference Manual Supplement, DDI 0584. Critically depends on ARM Scalar instructions.
-* {15}: ARM states that the Scalability is a [Silicon-partner choice](https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/102340_0001_00_en_introduction-to-sve2.pdf?revision=aae96dd2-5334-4ad3-9a47-393086a20fea).
+* (14): difficult to exactly ascertain, see ARM Architecture Reference Manual Supplement, DDI 0584. Critically depends on ARM Scalar instructions.
+* (15): ARM states that the Scalability is a [Silicon-partner choice](https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/102340_0001_00_en_introduction-to-sve2.pdf?revision=aae96dd2-5334-4ad3-9a47-393086a20fea).
This "Scalability independence" is not fully extended to the programmer: although ARM asks developers to treat it as such, in practice this does not happen.
-* {16}: [Wikipedia](https://en.wikipedia.org/wiki/AVX-512), [Lifecycle of an instruction set](https://media.handmade-seattle.com/tom-forsyth/) including full slides
-* {17}: difficult to exactly ascertain, contains subsets. Critically depends on ISA support from earlier x86 ISA subsets (several more thousand instructions). See [SIMD ISA listing](https://www.officedaytime.com/simd512e/)
-* {18}: [RVV Spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc)
-* {19}: Like the original Cray RVV is a truly scalable Vector ISA (Cray setvl instruction).
-* {20}: like SVP64 it is up to the hardware implementor to choose whether to support 128-bit elements.
-* {21}: [NEC SX Aurora](https://ftp.libre-soc.org/NEC_SX_Aurora_TSUBASA_VectorEngine-as-manual-v1.2.pdf) is based on the original Cray Vectors
-* {22}: [Aurora ISA guide)(https://sxauroratsubasa.sakura.ne.jp/documents/guide/pdfs/Aurora_ISA_guide.pdf) Appendix-3 11.1 p508
-* {23}: Like the original Cray Vectors, the ISA Vector Length is independent of the underlying hardware, however Generation 1 has 256 elements per Vector register (3.2.4 p24, Aurora ISA guide)
+* (16): [Wikipedia](https://en.wikipedia.org/wiki/AVX-512), [Lifecycle of an instruction set](https://media.handmade-seattle.com/tom-forsyth/) including full slides
+* (17): difficult to exactly ascertain, contains subsets. Critically depends on ISA support from earlier x86 ISA subsets (several thousand more instructions). See [SIMD ISA listing](https://www.officedaytime.com/simd512e/)
+* (18): [RVV Spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc)
+* (19): Like the original Cray, RVV is a truly scalable Vector ISA (Cray setvl instruction).
+* (20): like SVP64 it is up to the hardware implementor to choose whether to support 128-bit elements.
+* (21): [NEC SX Aurora](https://ftp.libre-soc.org/NEC_SX_Aurora_TSUBASA_VectorEngine-as-manual-v1.2.pdf) is based on the original Cray Vectors
+* (22): [Aurora ISA guide](https://sxauroratsubasa.sakura.ne.jp/documents/guide/pdfs/Aurora_ISA_guide.pdf) Appendix-3 11.1 p508
+* (23): Like the original Cray Vectors, the ISA Vector Length is independent of the underlying hardware, however Generation 1 has 256 elements per Vector register (3.2.4 p24, Aurora ISA guide)
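
The notes on Cray-style scalability (the `setvl` instruction, and an ISA Vector Length that is independent of the underlying hardware) can be illustrated with a small strip-mining model. This is only a sketch under stated assumptions: `setvl` and `strip_mined_add` are hypothetical Python helpers, not instructions or APIs from any of the ISAs listed, and `setvl` is simplified to "grant the smaller of the remaining element count and the hardware maximum", which real ISAs may refine.

```python
def setvl(remaining, maxvl):
    """Hypothetical, simplified model of a Cray-style setvl.

    The program asks for `remaining` elements; the hardware grants
    at most `maxvl` of them for this loop iteration.
    """
    return min(remaining, maxvl)


def strip_mined_add(a, b, maxvl):
    """Add two equal-length lists in chunks of at most `maxvl` elements.

    The loop body never hard-codes the hardware vector length: it only
    uses the `vl` value returned by setvl, so the same code works for
    any `maxvl` (assumption: maxvl >= 1).
    """
    assert len(a) == len(b)
    assert maxvl >= 1
    out = []
    i = 0
    while i < len(a):
        vl = setvl(len(a) - i, maxvl)      # elements granted this pass
        # Model of one vector add over `vl` elements.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl                            # advance by what was processed
    return out


if __name__ == "__main__":
    a = list(range(10))
    b = list(range(10))
    # Same program, two different "hardware" vector lengths.
    print(strip_mined_add(a, b, maxvl=4))
    print(strip_mined_add(a, b, maxvl=256))
```

Both calls produce the same result even though the first takes three passes and the second takes one, which is the property the "Scalable" taxonomy entries refer to: the program is written once, and the hardware decides how many elements each pass covers.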