openpower/sv/comparison_table.mdwn

|ISA <br>name  |Num <br>opcodes|Taxonomy / <br> Class|setvl <br> scalable|Predicate <br> Masks|Twin <br> Predication|Explicit <br> Vector regs|128-bit <br> operations|Bigint <br> capability|LDST <br> Fault-First|Data-dependent <br> Fail-first|Predicate-<br> Result|Matrix HW<br> support|
|--------------|---------------|---------------------|-------------------|--------------------|---------------------|-------------------------|-----------------------|----------------------|---------------------|------------------------------|---------------------|---------------------|
|Draft SVP64   |5 (1)          |Scalable (2)         |yes                |yes                 |yes (3)              |no (4)                   |see (5)                |yes (6)               |yes (7)              |yes (8)                       |yes (9)              |yes (10)             |
|VSX           |700+           |Packed SIMD          |no                 |no                  |no                   |yes (11)                 |yes                    |no                    |no                   |no                            |no                   |yes (12)             |
|NEON          |~250 (13)      |Predicated SIMD      |no                 |yes                 |no                   |yes                      |yes                    |no                    |no                   |no                            |no                   |no                   |
|SVE2          |~1000 (14)     |Predicated SIMD (15) |no (15)            |yes                 |no                   |yes                      |yes                    |no                    |yes (7)              |no                            |no                   |no                   |
|AVX-512 (16)  |~1000s (17)    |Predicated SIMD      |no                 |yes                 |no                   |yes                      |yes                    |no                    |no                   |no                            |no                   |no                   |
|RVV (18)      |~190           |Scalable (19)        |yes                |yes                 |no                   |yes                      |yes (20)               |no                    |yes                  |no                            |no                   |no                   |
|Aurora SX (21)|~200 (22)      |Scalable (23)        |yes                |yes                 |no                   |yes                      |no                     |no                    |no                   |no                            |no                   |no                   |

* (1): plus EXT001 24-bit prefixing. See [[sv/svp64]]
* (2): A 2-Dimensional Scalable Vector ISA with both Horizontal-First and Vertical-First Modes. See [[sv/vector_isa_comparison]]
* (3): on specific operations. See [[opcode_regs_deduped]] for the full list
* (4): SVP64 provides a Vector concept on top of the **Scalar** GPR, FPR and CR Fields, extended to 128 entries.
* (5): SVP64 Vectorises Scalar instructions. It is up to the **implementor** to choose (**optionally**) whether to apply SVP64 to e.g. VSX Quad-Precision (128-bit) instructions, to create 128-bit Vector operations.
* (6): big-integer add is just `sv.adde`. Big-integer multiply and divide require the addition of two scalar operations. See [[sv/biginteger]]
* (7): See [[sv/svp64/appendix]] and [ARM SVE Fault-First](https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf)
* (8): Based on LD/ST Fail-first, extended to data. See [[sv/svp64/appendix]]
* (9): Turns standard ops into a type of "cmp". See [[sv/svp64/appendix]]
* (10): Any non-power-of-two Matrix up to 127 FMACs. Full DCT (Lee) and FFT (RADIX2) Triple-loops are also supported. See [[sv/remap]]
* (11): VSX's Vector Registers are mis-named: they are 100% PackedSIMD. AVX-512 is not a Vector ISA either. See [Flynn's Taxonomy](https://en.wikipedia.org/wiki/Flynn%27s_taxonomy)
* (12): Power ISA v3.1 contains "Matrix Multiply Assist" (MMA), which due to PackedSIMD is restricted to RADIX2 and requires inline assembler loop-unrolling for non-power-of-two Matrix dimensions.
* (13): difficult to ascertain, see [NEON/VFP](https://developer.arm.com/documentation/den0018/a/NEON-and-VFP-Instruction-Summary/List-of-all-NEON-and-VFP-instructions).
  Critically depends on ARM Scalar instructions.
* (14): difficult to ascertain exactly, see ARM Architecture Reference Manual Supplement, DDI 0584. Critically depends on ARM Scalar instructions.
* (15): ARM states that the Scalability is a [Silicon-partner choice](https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/102340_0001_00_en_introduction-to-sve2.pdf?revision=aae96dd2-5334-4ad3-9a47-393086a20fea).
  Scalability in the ISA is **not available to the programmer**: there is no `setvl` instruction in SVE2, which is already causing difficulties for assembler programmers. Effectively this makes SVE2 Predicated SIMD where the SIMD width is chosen by the "Silicon partner".
* (16): [AVX512 Wikipedia](https://en.wikipedia.org/wiki/AVX-512), [Lifecycle of an instruction set](https://media.handmade-seattle.com/tom-forsyth/) including full slides
* (17): difficult to ascertain exactly, contains subsets. Critically depends on ISA support from earlier x86 ISA subsets (several thousand more instructions). See [SIMD ISA listing](https://www.officedaytime.com/simd512e/)
* (18): [RVV Spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc)
* (19): Like the original Cray, RVV is a truly scalable Vector ISA (Cray `setvl` instruction).
* (20): like SVP64, it is up to the hardware implementor to choose whether to support 128-bit elements.
* (21): [NEC SX Aurora](https://ftp.libre-soc.org/NEC_SX_Aurora_TSUBASA_VectorEngine-as-manual-v1.2.pdf) is based on the original Cray Vectors
* (22): [Aurora ISA guide](https://sxauroratsubasa.sakura.ne.jp/documents/guide/pdfs/Aurora_ISA_guide.pdf) Appendix-3 11.1 p508
* (23): Like the original Cray Vectors, the ISA Vector Length is independent of the underlying hardware; however, Generation 1 has 256 elements per Vector register (3.2.4 p24, Aurora ISA guide)
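
The "setvl scalable" and "Bigint capability" columns can be illustrated with a small software model. The Python sketch below is not taken from the SVP64 specification: `setvl`, `vector_adde`, `bigint_add` and the limb helpers are hypothetical names, and the 64-bit limb width and `maxvl` default are assumptions chosen for illustration. It shows the general idea of a strip-mined loop, where each pass asks for a Vector Length no larger than a hardware maximum, combined with an element-by-element add-with-carry in the style that note (6) attributes to `sv.adde`.

```python
MASK64 = (1 << 64) - 1  # assumed limb width: 64 bits


def setvl(remaining, maxvl):
    """Hypothetical model of a setvl-style instruction: VL = min(remaining, MAXVL)."""
    return min(remaining, maxvl)


def vector_adde(a, b, carry):
    """Model of a vectorised add-extended: each element adds with carry-in
    and hands its carry-out to the next element."""
    out = []
    for x, y in zip(a, b):
        total = x + y + carry
        out.append(total & MASK64)
        carry = total >> 64
    return out, carry


def bigint_add(a_limbs, b_limbs, maxvl=8):
    """Strip-mined big-integer add over little-endian limb lists of equal length.
    Returns (result_limbs, final_carry)."""
    assert len(a_limbs) == len(b_limbs)
    result, carry, i = [], 0, 0
    n = len(a_limbs)
    while i < n:
        vl = setvl(n - i, maxvl)  # the loop never assumes a fixed SIMD width
        chunk, carry = vector_adde(a_limbs[i:i + vl], b_limbs[i:i + vl], carry)
        result.extend(chunk)
        i += vl
    return result, carry


def to_limbs(value, count):
    """Split a non-negative integer into `count` little-endian 64-bit limbs."""
    return [(value >> (64 * k)) & MASK64 for k in range(count)]


def from_limbs(limbs):
    """Reassemble little-endian 64-bit limbs into an integer."""
    return sum(limb << (64 * k) for k, limb in enumerate(limbs))


if __name__ == "__main__":
    a, b = 2**200 + 12345, 2**199 + MASK64
    limbs, carry = bigint_add(to_limbs(a, 5), to_limbs(b, 5), maxvl=3)
    assert from_limbs(limbs) == a + b and carry == 0
```

Because the loop only ever uses the Vector Length it is given, the same code works unchanged for any `maxvl`. That is the property the table attributes to ISAs with a `setvl` instruction, and which note (15) describes as unavailable to the programmer in SVE2.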