From a0f46cc9f46dcd168bed29445f57807a779170ad Mon Sep 17 00:00:00 2001 From: Luke Kenneth Casson Leighton Date: Fri, 22 Jul 2022 16:06:49 +0100 Subject: [PATCH] add matrix hardware column --- openpower/sv/comparison_table.mdwn | 44 ++++++++++++++++-------------- 1 file changed, 23 insertions(+), 21 deletions(-) diff --git a/openpower/sv/comparison_table.mdwn b/openpower/sv/comparison_table.mdwn index d7ac3215f..c0bb36363 100644 --- a/openpower/sv/comparison_table.mdwn +++ b/openpower/sv/comparison_table.mdwn @@ -1,14 +1,14 @@ # ISA Comparison Table -| ISA
name | Num
opcodes | Taxonomy /
Class | Predicate
Masks | Twin
Predication | Explicit
Vector regs | 128-bit | Bigint
capability | LDST
Fault-First | Data-dependent
Fail-first | Predicate-
Result | -|----------------|-------------------|-----------------------|------------------------|-------------------------|------------------------------|---------|--------------------------|-------------------------|----------------------------------|-------------------------| -| SVP64 | 5 {1} | Scalable {2} | yes | yes {3} | no {4} | see {5} | yes {6} | yes {7} | yes {8} | yes {9} | -| VSX | 700+ | Packed SIMD | no | no | yes {10} | yes | no | no | no | no | -| NEON | ~250 {11} | Predicated SIMD | yes | no | yes | yes | no | no | no | no | -| SVE2 | ~1000 {12} | Scalable HW {13} | yes | no | yes | yes | no | yes {7} | no | no | -| AVX-512 {14} | ~1000s {15} | Predicated SIMD | yes | no | yes | yes | no | no | no | no | -| RVV {16} | ~190 | Scalable {17} | yes | no | yes | yes {18}| no | yes | no | no | -| Aurora SX {19} | ~200 {20} | Scalable {21} | yes | no | yes | no | no | no | no | no | +| ISA
name | Num
opcodes | Taxonomy /
Class | Predicate
Masks | Twin
Predication | Explicit
Vector regs | 128-bit | Bigint
capability | LDST
Fault-First | Data-dependent
Fail-first | Predicate-
Result | Matrix HW
support | +|----------------|-----------------|-----------------------|----------------------|-----------------------|----------------------------|---------|------------------------|-----------------------|--------------------------------|-----------------------|-----------------------| +| SVP64 | 5 {1} | Scalable {2} | yes | yes {3} | no {4} | see {5} | yes {6} | yes {7} | yes {8} | yes {9} | yes {10} | +| VSX | 700+ | Packed SIMD | no | no | yes {11} | yes | no | no | no | no | yes {12} | +| NEON | ~250 {13} | Predicated SIMD | yes | no | yes | yes | no | no | no | no | no | +| SVE2 | ~1000 {14} | Scalable HW {15} | yes | no | yes | yes | no | yes {7} | no | no | no | +| AVX-512 {16} | ~1000s {17} | Predicated SIMD | yes | no | yes | yes | no | no | no | no | no | +| RVV {18} | ~190 | Scalable {19} | yes | no | yes | yes {20}| no | yes | no | no | no | +| Aurora SX {21} | ~200 {22} | Scalable {23} | yes | no | yes | no | no | no | no | no | no | * {1}: plus EXT001 24-bit prefixing. See [[sv/svp64]] * {2}: A 2-Dimensional Scalable Vector ISA with both Horizontal-First and Vertical-First Modes. See [[sv/vector_isa_comparison]] @@ -19,17 +19,19 @@ * {7} See [[sv/svp64/appendix]] and [ARM SVE Fault-First](https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf) * {8} Based on LD/ST Fail-first, extended to data. See [[sv/svp64/appendix]] * {9} Turns standard ops into a type of "cmp". See [[sv/svp64/appendix]] -* {10} VSX's Vector Registers are mis-named: they are 100% PackedSIMD. AVX-512 is not a Vector ISA either. See [Flynn's Taxonomy](https://en.wikipedia.org/wiki/Flynn%27s_taxonomy) -* {11} difficult to ascertain, see [NEON/VFP](https://developer.arm.com/documentation/den0018/a/NEON-and-VFP-Instruction-Summary/List-of-all-NEON-and-VFP-instructions). +* {10} Any non-power-of-two Matrix up to 127 FMACs. Also DCT (Lee) and FFT Full (RADIX2) Triple-loops supported. See [[sv/svp64/remap]] +* {11} VSX's Vector Registers are mis-named: they are 100% PackedSIMD. AVX-512 is not a Vector ISA either. See [Flynn's Taxonomy](https://en.wikipedia.org/wiki/Flynn%27s_taxonomy) +* {12} Power ISA v3.1 contains "Matrix Multiply Assist" (MMA) which due to PackedSIMD is restricted to RADIX2 and requires inline assembler loop-unrolling for non-power-of-two Matrix dimensions +* {13} difficult to ascertain, see [NEON/VFP](https://developer.arm.com/documentation/den0018/a/NEON-and-VFP-Instruction-Summary/List-of-all-NEON-and-VFP-instructions). Critically depends on ARM Scalar instructions -* {12} difficult to exactly ascertain, see ARM Architecture Reference Manual Supplement, DDI 0584. Critically depends on ARM Scalar instructions. -* {13}: ARM states that the Scalability is a [Silicon-partner choice](https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/102340_0001_00_en_introduction-to-sve2.pdf?revision=aae96dd2-5334-4ad3-9a47-393086a20fea). +* {14} difficult to exactly ascertain, see ARM Architecture Reference Manual Supplement, DDI 0584. Critically depends on ARM Scalar instructions. +* {15}: ARM states that the Scalability is a [Silicon-partner choice](https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/102340_0001_00_en_introduction-to-sve2.pdf?revision=aae96dd2-5334-4ad3-9a47-393086a20fea). this "Scalability independence" is not entirely extended in full to the programmer although ARM requests developers to consider it so, in practice this does not happen. -* {14}: [Wikipedia](https://en.wikipedia.org/wiki/AVX-512), [Lifecycle of an instruction set](https://media.handmade-seattle.com/tom-forsyth/) including full slides -* {15}: difficult to exactly ascertain, contains subsets. Critically depends on ISA support from earlier x86 ISA subsets (several more thousand instructions). See [SIMD ISA listing](https://www.officedaytime.com/simd512e/) -* {16}: [RVV Spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc) -* {17}: Like the original Cray RVV is a truly scalable Vector ISA (Cray setvl instruction). -* {18}: like SVP64 it is up to the hardware implementor to choose whether to support 128-bit elements. -* {19}: [NEC SX Aurora](https://ftp.libre-soc.org/NEC_SX_Aurora_TSUBASA_VectorEngine-as-manual-v1.2.pdf) is based on the original Cray Vectors -* {20}: [Aurora ISA guide)(https://sxauroratsubasa.sakura.ne.jp/documents/guide/pdfs/Aurora_ISA_guide.pdf) Appendix-3 11.1 p508 -* {21}: Like the original Cray Vectors, the ISA Vector Length is independent of the underlying hardware, however Generation 1 has 256 elements per Vector register (3.2.4 p24, Aurora ISA guide) +* {16}: [Wikipedia](https://en.wikipedia.org/wiki/AVX-512), [Lifecycle of an instruction set](https://media.handmade-seattle.com/tom-forsyth/) including full slides +* {17}: difficult to exactly ascertain, contains subsets. Critically depends on ISA support from earlier x86 ISA subsets (several more thousand instructions). See [SIMD ISA listing](https://www.officedaytime.com/simd512e/) +* {18}: [RVV Spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc) +* {19}: Like the original Cray RVV is a truly scalable Vector ISA (Cray setvl instruction). +* {20}: like SVP64 it is up to the hardware implementor to choose whether to support 128-bit elements. +* {21}: [NEC SX Aurora](https://ftp.libre-soc.org/NEC_SX_Aurora_TSUBASA_VectorEngine-as-manual-v1.2.pdf) is based on the original Cray Vectors +* {22}: [Aurora ISA guide)(https://sxauroratsubasa.sakura.ne.jp/documents/guide/pdfs/Aurora_ISA_guide.pdf) Appendix-3 11.1 p508 +* {23}: Like the original Cray Vectors, the ISA Vector Length is independent of the underlying hardware, however Generation 1 has 256 elements per Vector register (3.2.4 p24, Aurora ISA guide) -- 2.30.2