From fadf8a82a57caa94f86e9783eeb98af471117c97 Mon Sep 17 00:00:00 2001 From: Luke Kenneth Casson Leighton Date: Sat, 30 Jul 2022 00:25:13 +0100 Subject: [PATCH] found answer on SVE2 SMA: it is power-2 boundaried --- openpower/sv/comparison_table.mdwn | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/openpower/sv/comparison_table.mdwn b/openpower/sv/comparison_table.mdwn index 285b1d403..db64e1960 100644 --- a/openpower/sv/comparison_table.mdwn +++ b/openpower/sv/comparison_table.mdwn @@ -37,7 +37,7 @@ [^20]: [RVV Spec](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc) [^21]: RISC-V Vectors are not stand-alone, i.e. like SVE2 and AVX-512 are critically dependent on the Scalar ISA (an additional ~96 instructions for the Scalar RV64GC set, needed for Linux). [^22]: Like the original Cray RVV is a truly scalable Vector ISA (Cray setvl instruction). However, like SVE2, the Maximum Vector length is a Silicon-partner choice, which creates similar limitations that SVP64 does not have. - The RISC-V Founders strongly discourage efforts by programmers to find out the Silicon's Maximum Vector Length, as an effort to steer programmers towards Silicon-independent assembler. This requires **all** algorithms to contain a loop construct. + The RISC-V Founders strongly discourage efforts by programmers to find out the Silicon's Maximum Vector Length, as an effort to steer programmers towards Silicon-independent assembler. **This requires all algorithms to contain a loop construct**. MAXVL in SVP64 is a Spec-hard-fixed quantity therefore loop constructs are not necessary 100% of the time. [^23]: like SVP64 it is up to the hardware implementor (Silicon partner) to choose whether to support 128-bit elements. [^24]: [NEC SX Aurora](https://ftp.libre-soc.org/NEC_SX_Aurora_TSUBASA_VectorEngine-as-manual-v1.2.pdf) is based on the original Cray Vectors @@ -50,9 +50,9 @@ [^31]: [RVV intrinsics listing](https://raw.githubusercontent.com/riscv-non-isa/rvv-intrinsic-doc/master/intrinsic_funcs.md) page is 25,000 lines long. [^32]: Unknown. estimated to be of the order of length of RVV due to also being a Cray-style Scalable ISA, NEC maintains an [LLVM hard fork](https://github.com/sx-aurora-dev) [^33]: [Scalable Matrix Optional Extension](https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/scalable-matrix-extension-armv9-a-architecture) - the key is an outer-product instruction + outer-product instructions [SMOPA](https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/SMOPA--Signed-integer-sum-of-outer-products-and-accumulate-?lang=en) - which is very hard to tell at a glance if it is power-2 or non-power-2 + which are power-2 based on Silicon-partner SIMD width. Non-power-2 not supported but [zero-input masking](https://www.realworldtech.com/forum/?threadid=202688&curpostid=207774) is. [^34]: [Advanced matrix Extensions](https://en.wikipedia.org/wiki/Advanced_Matrix_Extensions) supports BF16 and INT8 only. Separate regfile, power-of-two "tiles". Not general-purpose at all. [^35]: Although registers may be 128-bit in NEON, SVE2, and AVX, unlike VSX there are very few (or no) actual arithmetic 128-bit operations. Only RVV and SVP64 have the possibility of 128-bit ops [^36]: Mitch Alsup's MyISA 66000 is available on request. A powerful RISC ISA with a **Hardware-level auto-vectorisation** LOOP built-in as an extension named VVM. Classified as "Vertical-First". -- 2.30.2