From: lkcl
Date: Sat, 14 May 2022 21:08:31 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~2237
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=eec1cff7710e60f29f61dc9eef3ab7e01c186296;p=libreriscv.git

---

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index 214af62f3..ea0a6e8ed 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -169,7 +169,8 @@ Software Ecosystem? Debian supports most of these including s390:
   Andes in Audio DSPs, WD in HDDs and SSDs. These are all
   astoundingly commercially successful multi-billion-unit
   mass volume markets that almost nobody
-  knows anything about. Included for completeness.
+  knows anything about, outside their specialised proprietary
+  niche. Included for completeness.
 
 In order of least controlled to most controlled, the viable
 candidates for further advancement are:
@@ -352,6 +353,10 @@
 Sum) Boolean Logic in a Vector context, on top of an already-powerful
 Scalar Branch-Conditional/Counter instruction
 
+All of these festures are added as "Augmentations", to create of
+the order of 1.5 *million* instructions, none of which decode the
+32-bit scalar suffix any differently.
+
 **What is missing from Power Scalar ISA that a Vector ISA needs?**
 
 Remarkably, very little: the devil is in the details though.
@@ -474,10 +479,11 @@ concept separated from the mathematical operation, there is no reason
 why Matrix Multiplication Schedules may not be applied to Integer
 Mul-and-Accumulate, Galois Field Mul-and-Accumulate, Logical
 AND-and-OR, or any other future instruction such as Complex-Number
-Multiply-and-Accumulate that a future version of the Power ISA might
+Multiply-and-Accumulate or Abs-Diff-and-Accumulate
+that a future version of the Power ISA might
 support. The flexibility is not only enormous, but the compactness
-unprecedented. RADIX2 in-place DCT Triple-loop Schedules may be created in
-around 11 instructions. The only other processors well-known to have
+unprecedented. RADIX2 in-place DCT may be created in
+around 11 instructions using the Triple-loop DCT Schedule. The only other processors well-known to have
 this type of compact capability are both VLIW DSPs: TI's TMS320 Series
 and Qualcom's Hexagon, and both are targetted at FFTs only.