From 1a15190a24f1dd87ba7f6f0d0e84b009aa159fa0 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Wed, 14 Sep 2022 13:46:40 +0100
Subject: [PATCH]

---
 openpower/sv/rfc/ls001.mdwn | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/openpower/sv/rfc/ls001.mdwn b/openpower/sv/rfc/ls001.mdwn
index 3ea13b6e6..aa70c4319 100644
--- a/openpower/sv/rfc/ls001.mdwn
+++ b/openpower/sv/rfc/ls001.mdwn
@@ -182,20 +182,38 @@ the same space):
 
 # SVP64 and SVP64-Single 24-bit Prefixes
 
-The SVP64 24-bit Prefix provides several options, too numerous to describe
-in this document but all fitting within the 24-bit space (and no other).
+The SVP64 24-bit Prefix provides several options,
+all fitting within the 24-bit space (and no other). REMAP is separately
+outlined below.
 The primary options are:
 
 * element-width overrides, which dynamically redefine each SFFS or SFS
   Scalar prefixed instruction to be 8-bit, 16-bit, 32-bit or 64-bit
-  operands **without requiring new 8/16/32 instructions**[^pseudorewrite]
+  operands **without requiring new 8/16/32 instructions.**[^pseudorewrite]
+  This results in full BF16 and FP16 opcodes being added to the Power ISA
+  **without adding BF16 or FP16 opcodes** including full conversion
+  between all formats.
 * predication. this is an absolutely essential feature for a 3D GPU VPU ISA.
   CR Fields are available as Predicate Masks hence the reason for their
   extension to 128.
 * Saturation. **all** LD/ST and Arithmetic and Logical operations may
   be saturated (without adding explicit scalar saturated opcodes)
-* Reduction and Prefix-Sum (Fibonnacci Series) Modes as well as vec2/3/4
-  "Packing" and "Unpacking".
+* Reduction and Prefix-Sum (Fibonnacci Series) Modes
+* vec2/3/4 "Packing" and "Unpacking" (similar to VSX `vpack` and `vpkss`)
+  accessible in a way that is easier than REMAP, added for the same reasons
+  that drove `vpack` and `vpkss` etc. to be added: pixel, audio, and 3D
+  data manipulation.
+* Load/Store speculative "fault-first" behaviour, identical to ARM and RVV
+  Fault-first: provides auto-truncation of a speculative LD/ST helping
+  solve the "SIMD Considered Harmful" stripmining problem from a Memory
+  Access perspective.
+* Data-Dependent Fail-First: a 100% Deterministic extension of the LDST
+  ffirst concept: first `Rc=1 BO test` failure terminates looping and
+  truncates VL to that exact point. Useful for implementing algorithms
+  such as `strcpy` in around 14 high-performance Vector instructions, the
+  option exists to include or exclude the failing element.
+
+**SVP64Single**
 
 The `SVP64-Single` 24-bit encoding focusses primarily on ensuring that
 all 128 Scalar registers are fully accessible, provides element-width
-- 
2.30.2
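
Note on the Data-Dependent Fail-First bullet added by this patch: the truncation rule it describes can be modelled roughly in Python as below. This is only an illustrative sketch, not SVP64 semantics. The names `ddffirst_vl`, `strcpy_model` and `maxvl` are hypothetical, the `Rc=1 BO test` is abstracted as a plain Python predicate, and `include_failing` stands in for the option to include or exclude the failing element.

```python
def ddffirst_vl(values, vl, test, include_failing):
    """Return the truncated vector length after a fail-first pass.

    Elements are examined in order. The first element for which `test`
    is false ends the loop; `include_failing` selects whether that
    element is counted in the returned length. (Hypothetical model.)
    """
    for i in range(vl):
        if not test(values[i]):
            return i + 1 if include_failing else i
    return vl  # no element failed, VL is unchanged


def strcpy_model(src, maxvl=8):
    """Copy a NUL-terminated byte string in chunks of at most `maxvl`."""
    dst = bytearray()
    pos = 0
    while True:
        chunk = src[pos:pos + maxvl]
        vl = len(chunk)
        if vl == 0:
            raise ValueError("source is not NUL-terminated")
        # Test "element is non-zero"; include the failing NUL so that
        # the terminator is copied along with the string body.
        new_vl = ddffirst_vl(chunk, vl, lambda b: b != 0,
                             include_failing=True)
        dst += chunk[:new_vl]
        pos += new_vl
        if chunk[new_vl - 1] == 0:  # terminator was inside this chunk
            return bytes(dst)
```

With `include_failing=True` the terminating zero is copied in the same pass that detects it, which is why the outer loop needs no separate scalar clean-up step in this model.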