From: lkcl
Date: Thu, 8 Sep 2022 15:29:05 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~616
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=790ecbcef0665c75a66f97bcb57bf4c97f09ee5d;p=libreriscv.git

---

diff --git a/openpower/sv/rfc/ls001.mdwn b/openpower/sv/rfc/ls001.mdwn
index 487774576..0940676ec 100644
--- a/openpower/sv/rfc/ls001.mdwn
+++ b/openpower/sv/rfc/ls001.mdwn
@@ -94,7 +94,7 @@ the next decade.
   with SVLR by SV-Branch-Conditional for exactly the same reason that NIA
   is swapped with LR
 
-* Vector Management Instructions
+**Vector Management Instructions**
 * **setvl** - Cray-style Scalar Vector Length instruction
 * **svstep** - used for Vertical-First Mode and for enquiring about internal
   state
@@ -108,14 +108,33 @@ the next decade.
 # SVP64 24-bit Prefix
 
 The SVP64 24-bit Prefix provides several options, too numerous to describe in this
-document. The primary options are:
+document but all fitting within the 24-bit space (and no other).
+The primary options are:
 
 * element-width overrides, which dynamically redefine each SFFS or SFS Scalar
   prefixed instruction to be 8-bit, 16-bit, 32-bit or 64-bit operands **without
   requiring new 8/16/32 instructions** [^pseudorewrite]
-
-* Due to a concept called "Element-width Overrides
-
+* predication. this is an absolutely essential feature for a 3D GPU VPU ISA.
+  CR Fields are available as Predicate Masks hence the reason for their extension to 128.
+* Saturation. **all** LD/ST and Arithmetic and Logical operations may be saturated
+  (without adding explicit scalar saturated opcodes)
+* Reduction and Prefix-Sum (Fibonnacci Series) Modes
+
+# REMAP subsystem
+
+REMAP is extremely advanced but brings features already present in other DSPs and
+Supercomputing ISAs.
+
+* DCT/FFT REMAP brings more capability than TI's MSP-Series DSPs and Qualcom Hexagon DSPs
+* Matrix REMAP brings more capability than any other Matrix Extension (AMD GPUs,
+  Intel, ARM), not being restricted to Power-2 sizes. Also not limited to the type
+  of operation, it may perform Warshall Transitive Closure, Integer Matrix,
+  Bitmanipulation Matrix, Galois Field (carryless mul) Matrix, and with care potentially
+  Graph Maximum Flow as well. Also suited to Convolutions, Matrix Transpose and rotate.
+* General-purpose Indexed REMAP, this option is provided to implement an equivalent
+  of VSX `vperm`
+* Parallel Reduction REMAP, performs an automatic map-reduce using *any suitable
+  scalar operation*.
 
 [^extend]: Prefix opcode space **must** be reserved in advance to do so, in order to avoid the catastrophic binary-incompatibility mistake made by RISC-V RVV and ARM SVE/2
 [^likeext001]: SVP64-Single is remarkably similar to the "bit 1" of EXT001 being set to indicate that the 64-bits is to be allocated in full to a new encoding, but in fact it still embeds v3.0 Scalar operations.
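
The "Reduction and Prefix-Sum" and "Parallel Reduction REMAP" bullets added in this hunk describe applying *any suitable scalar operation* across a vector. A rough way to picture that is the plain-Python sketch below. It is not SVP64 code: `tree_reduce` and `prefix_sum` are hypothetical helper names, and the textbook pairwise combining order shown here is an assumption that may well differ from the schedule the REMAP hardware uses.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def tree_reduce(values: Sequence[T], op: Callable[[T, T], T]) -> T:
    """Pairwise (tree) reduction with a caller-supplied binary operation.

    Neighbours are combined at strides 1, 2, 4, ... so each pass could in
    principle run its combines in parallel. Left-to-right operand order is
    preserved, so `op` only needs to be associative.
    """
    if not values:
        raise ValueError("cannot reduce an empty vector")
    work: List[T] = list(values)  # scratch copy, the input is left untouched
    step = 1
    while step < len(work):
        # Combine element i with its partner `step` places to the right.
        for i in range(0, len(work) - step, 2 * step):
            work[i] = op(work[i], work[i + step])
        step *= 2
    return work[0]


def prefix_sum(values: Sequence[T], op: Callable[[T, T], T]) -> List[T]:
    """Inclusive scan: out[i] = values[0] op values[1] op ... op values[i]."""
    out: List[T] = []
    for v in values:
        out.append(v if not out else op(out[-1], v))
    return out
```

Passing the operation in as a parameter mirrors the bullet's point that the reduction is not tied to one opcode: for example, `tree_reduce(vec, max)` and `tree_reduce(vec, lambda a, b: a + b)` reuse the same combining schedule.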