From 899486b2651c49dfd7a87fa352316ade3136ed55 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Mon, 29 May 2023 16:43:35 +0100
Subject: [PATCH]

---
 openpower/sv/svp64.mdwn | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/openpower/sv/svp64.mdwn b/openpower/sv/svp64.mdwn
index 2a9c6f2c0..eca3eeaca 100644
--- a/openpower/sv/svp64.mdwn
+++ b/openpower/sv/svp64.mdwn
@@ -198,17 +198,17 @@ execution of instructions, Simple-V requires a corresponding guarantee for Eleme
 because in Simple-V Execution of Elements is synonymous with Execution of instructions.
 
+[^ieo]: Strict Instruction Execution Order is defined in Public v3.1 Book I Section 2.2
+
 ## Precise Interrupt Guarantees
 
-Strict Instruction Execution Order[^ieo] is defined as giving the appearance, as far
+Strict Instruction Execution Order is defined as giving the appearance, as far
 as programs are concerned, that instructions were executed strictly in the sequence that they occurred. A "Precise" out-of-order Micro-architecture goes to considerable lengths to ensure that this is the case.
 
-[^ieo]: Strict Instruction Execution Order is defined in Public v3.1 Book I Section 2.2
-
 Many Vector ISAs allow interrupts to occur in the middle of processing of large Vector operations, only under the condition that partial results are cleanly discarded, and continuation on return
@@ -221,11 +221,11 @@ accumulator than the registers.
 
 Simple-V operates on an entirely different paradigm from traditional Vector ISAs: as a "Sub-Execution Context", where "Elements" are synonymous
-with Scalar instructions. With this in mind it is critical for
-implementations to observe Strict **Element**-Level Execution Order[^svp64_eeo]
+with Scalar instructions. With this in mind
+implementations must observe Strict **Element**-Level Execution Order[^svp64_eeo]
 at all times.
 *Any* element is Interruptible and Architectural State may
-be fully preserved and restored regardless of that same State
+be fully preserved and restored regardless of that same State.
 
 *Engineering note: implementations are permitted have higher latency to perform context-switching (particularly if REMAP
@@ -236,17 +236,18 @@ but the full SVP64 Architectural State may be saved and
 restored through manual copying of `SVSTATE` (and the four REMAP SPRs if in use at the time)
 
-*Programmer's note: Trap Handlers (and function call stack save/restore)
-may avoid the
-use of SVP64 Prefixed instructions to perform the necessary
-save/restore of Simple-V Architectural State.
-This capability also allows nested function calls to be made from
-inside Vertical-First Vector loops, which is very rare for Vector ISAs.
+*Programmer's note: Trap Handlers (and any stack-based context save/restore)
+must avoid the use of SVP64 Prefixed instructions to perform the necessary
+save/restore of Simple-V Architectural State (SPR SVSTATE),
+just as use of FPRs and VSRs is presently avoided.
+However once saved, and set to known-good, SVP64 Prefixed instructions
+may be used to save/restore GPRs, SPRs, FPRs and other state.*
 
-Strict Program Order is also preserved by the Parallel Reduction
-REMAP Schedule, but only at the cost of requiring the destination
-Vector to be used (Deterministically) to store partial progress of the
-Parallel Reduction.
+*Programmer's note: SVSHAPE0-3 alters Element Execution Order, but only
+if activated in SVSHAPE. It is therefore technically possible in a Trap
+Handler to save SVSTATE (`mfspr t0, SVSTATE`), then clear bits 32-46.
+At this point it becomes safe to use SVP64 to save sequential batches
+of SPRs (`setvli MAXVL=VL=4; sv.mfspr *t0, *SVSHAPE0`)*
 
 The only major caveat for REMAP is that after an explicit change to
@@ -256,7 +257,7 @@ it easier to take longer to calculate where in a given Schedule the re-mapping
 Indices were.
 Obvious examples include Interrupts occuring in the middle of a non-RADIX2 Matrix Multiply Schedule (5x3 by 3x3 for example), which
-will force implementations to perform divide and modulo
+will force some implementations to perform divide and modulo
 calculations.
 
 An additional caveat involves Condition Register Fields
--
2.30.2