From f73f194cfda86d265da5c60ea5c7ea4d58cb01ee Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 8 Apr 2023 11:13:57 +0100
Subject: [PATCH]

---
 openpower/sv/rfc/ls012.mdwn | 46 ++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/openpower/sv/rfc/ls012.mdwn b/openpower/sv/rfc/ls012.mdwn
index b467a89fb..85ebb09ae 100644
--- a/openpower/sv/rfc/ls012.mdwn
+++ b/openpower/sv/rfc/ls012.mdwn
@@ -13,17 +13,20 @@ themselves) which instructions should be submitted over the next 18 months.

 *It is expected that readers visit and interact with the Libre-SOC resources
-in order to do due-diligence on the prioritisation evaluation*.
+in order to do due-diligence on the prioritisation evaluation. Otherwise
+the ISA WG is overwhelmed by piecemeal RFCs that may turn out not
+to be useful, against a background of having no guiding overview*.

 Worth bearing in mind during evaluation that every "Defined Word" may
 or may not be Vectoriseable, but that every "Defined Word"
-should have merits on its own not just when Vectorised. An example
+should have merits on its own, not just when Vectorised. An example
 of a borderline Vectoriseable Defined Word is `mv.swizzle` which
-only really becomes high-priority for Vector GPU and HPC Workloads,
+only really becomes high-priority for Audio/Video, Vector GPU and HPC Workloads,
 but has less merit as a Scalar-only operation.

-Power ISA Scalar (SFFS) has not been significantly advanced in 12 years.
-With VSX bring 914 instructions and 128-bit it is far too much for any
+Power ISA Scalar (SFFS) has not been significantly advanced in 12 years:
+IBM's primary focus has understandably been on PackedSIMD VSX.
+Unfortunately, with VSX being 914 instructions and 128-bit it is far too much for any
 new team to consider (10 years development effort) and far outside of
 Embedded or Tablet/Desktop/Laptop power budgets.
 Thus bringing Power Scalar up-to-date to modern
 standards is a reasonable goal, and the advantage is
@@ -33,7 +36,7 @@ SVP64 Prefixing - also known by the terms "Zero-Overhead-Loop-Prefixing"
 as well as "True-Scalable-Vector Prefixing" - also literally brings new
 dimensions to the Power ISA. Thus when adding new Scalar "Defined Words"
 it has to unavoidably and simultaneously be taken into consideration their value when
-Vectorised.
+Vector-Prefixed, *as well as* SVP64Single-Prefixed.

 **Target areas**

@@ -120,6 +123,37 @@ was giiven instead to transferring several CR Field bits into GPRs, whereupon
 the full set of tandard Scalar GPR Logical Operations may be used.
 This strategy has the side-effect of keeping the CRweird group down to only five instructions.
+# Big-integer Math
+
+[[sv/biginteger]] has always been a high priority area for commercial applications, privacy,
+Banking, as well as HPC Numerical Accuracy: libgmp as well as cryptographic uses
+in Asymmetric Ciphers. poly1305 and ec25519 are finding their way into everyday
+use via OpenSSL.
+
+A very early variant of the Power ISA had a 32-bit Carry-in Carry-out SPR. Its
+removal from subsequent revisions is regrettable. An alternative concept is
+to add six explicit 3-in 2-out operations that, on close inspection, always
+turn out to be supersets of *existing Scalar operations* that discard upper
+or lower DWords, or parts thereof.
+
+*Thus it is critical to note that not one single one of these operations
+expands the bitwidth of any existing Scalar pipelines*.
+
+The `dsld` instruction for example merely places additional LSBs into the 64-bit
+shift (64-bit carry-in), and then places the (normally discarded) MSBs into the second
+output register (64-bit carry-out). It does **not** require a 128-bit shifter to
+replace the existing Scalar Power ISA 64-bit shifters.
+
+The reduction in instruction count these operations bring, in critical hotloops,
+is remarkably high, to the extent where a Scalar-to-Vector operation of
+*arbitrary length* becomes just the one Vector-Prefixed instruction.
+
+Whilst these are 5-6 bit XO their utility is considered high strategic value
+and as such are strongly advocated to be in EXT04. The alternative is to bring
+back a 64-bit Carry SPR but how it is retrospectively applicable to pre-existing Scalar
+Power ISA multiply, divide, and shift operations at this late stage of maturity of
+the Power ISA is an entire area of research on its own deemed unlikely to be
+achievable.

 [[!inline pages="openpower/sv/rfc/ls012/areas.mdwn" raw=yes ]]
-- 
2.30.2
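
Reviewer note on the `dsld` paragraph in the last hunk: the carry-in / carry-out shift it describes can be modelled in a few lines of Python. This is only a sketch of the idea as worded in the patch, not the normative `dsld` definition. The helper names `shl64_carry`, `bigint_shl` and `to_int` are hypothetical, and the operand order, the 0..63 shift range and the little-endian word layout are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def shl64_carry(value, carry_in, shift):
    """Illustrative 3-in 2-out left shift on 64-bit words (not the dsld spec).

    Inputs: the word to shift, a carry-in word, and a shift amount (0..63).
    Outputs: the shifted 64-bit word with the carry-in placed in the vacated
    LSBs, and the MSBs that a plain 64-bit shift would normally discard.
    """
    assert 0 <= shift < 64
    value &= MASK64
    # Ordinary 64-bit shift result, with the vacated LSBs filled from carry_in.
    low = ((value << shift) & MASK64) | (carry_in & ((1 << shift) - 1))
    # The bits shifted out of the top become the carry-out (zero if shift == 0).
    carry_out = value >> (64 - shift)
    return low, carry_out


def bigint_shl(words, shift):
    """Shift a little-endian list of 64-bit words left by `shift` bits.

    Each step feeds the previous carry-out in as the next carry-in. This is
    the chained loop that the patch suggests collapses into one
    Vector-Prefixed instruction.
    """
    carry = 0
    out = []
    for w in words:
        low, carry = shl64_carry(w, carry, shift)
        out.append(low)
    out.append(carry)  # final carry-out becomes the new most significant word
    return out


def to_int(words):
    """Reassemble a little-endian list of 64-bit words into a Python int."""
    return sum(w << (64 * i) for i, w in enumerate(words))
```

Neither helper ever needs more than 64 bits per output, which matches the patch's point that no 128-bit shifter is required. The result can be cross-checked against Python's arbitrary-precision integers, for example by comparing `to_int(bigint_shl(words, 4))` with `to_int(words) << 4`.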