From c7bb4dcdce6b695c5334863a1e8ac2d44903cca2 Mon Sep 17 00:00:00 2001 From: Luke Kenneth Casson Leighton Date: Wed, 19 Apr 2023 12:41:16 +0100 Subject: [PATCH] move SPRs out of ls008 back to sprs.mdwn --- openpower/sv/rfc/ls008.mdwn | 174 +-------------------------------- openpower/sv/sprs.mdwn | 186 ++++++++++++++++++++++-------------- 2 files changed, 117 insertions(+), 243 deletions(-) diff --git a/openpower/sv/rfc/ls008.mdwn b/openpower/sv/rfc/ls008.mdwn index 252270c5d..7bee665e7 100644 --- a/openpower/sv/rfc/ls008.mdwn +++ b/openpower/sv/rfc/ls008.mdwn @@ -111,179 +111,9 @@ checked (`if RT = 0`) [[!inline pages="openpower/sv/svstep" raw=yes ]] [[!inline pages="openpower/sv/setvl" raw=yes ]] +[[!inline pages="openpower/sv/sprs" raw=yes ]] -# SVSTATE SPR - -The format of the SVSTATE SPR is as follows: - -| Field | Name | Description | -| ----- | -------- | --------------------- | -| 0:6 | maxvl | Max Vector Length | -| 7:13 | vl | Vector Length | -| 14:20 | srcstep | for srcstep = 0..VL-1 | -| 21:27 | dststep | for dststep = 0..VL-1 | -| 28:29 | dsubstep | for substep = 0..SUBVL-1 | -| 30:31 | ssubstep | for substep = 0..SUBVL-1 | -| 32:33 | mi0 | REMAP RA/FRA/BFA SVSHAPE0-3 | -| 34:35 | mi1 | REMAP RB/FRB/BFB SVSHAPE0-3 | -| 36:37 | mi2 | REMAP RC/FRT SVSHAPE0-3 | -| 38:39 | mo0 | REMAP RT/FRT/BF SVSHAPE0-3 | -| 40:41 | mo1 | REMAP EA/RS/FRS SVSHAPE0-3 | -| 42:46 | SVme | REMAP enable (RA-RT) | -| 47:52 | rsvd | reserved | -| 53 | pack | PACK (srcstrp reorder) | -| 54 | unpack | UNPACK (dststep order) | -| 55:61 | hphint | Horizontal Hint | -| 62 | RMpst | REMAP persistence | -| 63 | vfirst | Vertical First mode | - -Notes: - -* The entries are truncated to be within range. Attempts to set VL to - greater than MAXVL will truncate VL. -* Setting srcstep, dststep to 64 or greater, or VL or MVL to greater - than 64 is reserved and will cause an illegal instruction trap. 
- -**SVSTATE Fields** - -SVSTATE is a standard SPR that (if REMAP is not activated) contains sufficient -self-contaned information for a full context save/restore. -SVSTATE contains (and permits setting of): - -* MVL (the Maximum Vector Length) - declares (statically) how - much of a regfile is to be reserved for Vector elements -* VL - Vector Length -* dststep - the destination element offset of the current parallel - instruction being executed -* srcstep - for twin-predication, the source element offset as well. -* ssubstep - the source subvector element offset of the current - parallel instruction being executed -* dsubstep - the destination subvector element offset of the current - parallel instruction being executed -* vfirst - Vertical First mode. srcstep, dststep and substep - **do not advance** unless explicitly requested to do so with - pseudo-op svstep (a mode of setvl) -* RMpst - REMAP persistence. REMAP will apply only to the following - instruction unless this bit is set, in which case REMAP "persists". - Reset (cleared) on use of the `setvl` instruction if used to - alter VL or MVL. -* Pack - if set then srcstep/substep VL/SUBVL loop-ordering is inverted. -* UnPack - if set then dststep/substep VL/SUBVL loop-ordering is inverted. -* hphint - Horizontal Parallelism Hint. Indicates that - no Hazards exist between groups of elements in sequential multiples of this number - (before REMAP). By definition: elements for which `FLOOR(srcstep/hphint)` is - equal *before REMAP* are in the same parallelism "group". In Vertical First Mode - hardware **MUST ONLY** process elements in the same group, and must stop - Horizontal Issue at the last element of a given group. Set to zero to indicate "no hint". -* SVme - REMAP enable bits, indicating which register is to be - REMAPed: RA, RB, RC, RT and EA are the canonical (typical) register names - associated with each bit, with RA being the LSB and EA being the MSB. - See table below for ordering. 
When `SVme` is zero (0b00000) REMAP - is **fully disabled and inactive** regardless of the contents of - `SVSTATE`, `mi0-mi2/mo0-mo1`, or the four `SVSHAPEn` SPRs -* mi0-mi2/mo0-mo1 - when the corresponding SVme bit is enabled, these - indicate the SVSHAPE (0-3) that the corresponding register (RA etc) - should use, as long as the register's corresponding SVme bit is set - -Programmer's Note: the fact that REMAP is entirely dormant when `SVme` is zero -allows establishment of REMAP context well in advance, followed by utilising `svremap` -at a precise (or the very last) moment. Some implementations may exploit this -to cache (or take some time to prepare caches) in the background whilst other -(unrelated) instructions are being executed. This is particularly important to -bear in mind when using `svindex` which will require hardware to perform (and -cache) additional GPR reads. - -Programmer's Note: when REMAP is activated it becomes necessary on any -context-switch (Interrupt or Function call) to detect (or know in advance) -that REMAP is enabled and to additionally save/restore the four SVSHAPE -SPRs, SVHAPE0-3. Given that this is expected to be a rare occurrence it was -deemed unreasonable to burden every context-switch or function call with -mandatory save/restore of SVSHAPEs, and consequently it is a *callee* -(and Trap Handler) responsibility. Callees (and Trap Handlers) **MUST** -avoid using all and any SVP64 instructions during the period where state -could be adversely affected. SVP64 purely relies on Scalar instructions, -so Scalar instructions (except the SVP64 Management ones and mtspr and -mfspr) are 100% guaranteed to have zero impact on SVP64 state. - -**Max Vector Length (maxvl)** - -MAXVECTORLENGTH is the same concept as MVL in RISC-V RVV, except that it -is variable length and may be dynamically set (normally from an immediate -field only). 
MVL is limited to 7 bits -(in the first version of SVP64) and consequently the maximum number of -elements is limited to between 0 and 127. - -Programmer's Note: Except by directly using `mtspr` on SVSTATE, which may -result in performance penalties on some hardware implementations, SVSTATE's `maxvl` -field may only be set **statically** as an immediate, by the `setvl` instruction. -It may **NOT** be set dynamically from a register. Compiler writers and assembly -programmers are expected to perform static register file analysis, subdivision, -and allocation and only utilise `setvl`. Direct writing to SVSTATE in order to -"bypass" this Note could, in less-advanced implementations, potentially cause stalling, -particularly if SVP64 instructions are issued directly after the `mtspr` to SVSTATE. - -**Vector Length (vl)** - -The actual Vector length, the number of elements in a "Vector", `SVSTATE.vl` may be set -entirely dynamically at runtime from a number of sources. `setvl` is the primary -instruction for setting Vector Length. -`setvl` is conceptually similar but different from the Cray, SX Aurora, and RISC-V RVV -equivalent. Similar to RVV, VL is set to be within -the range 0 <= VL <= MVL. Unlike RVV, VL is set **exactly** according to the following: - - VL = (RT|0) = MIN(vlen, MVL) - -where 0 <= MVL <= 127 and vlen may come from an immediate, `RA`, or from the `CTR` SPR, -depending on options selected with the `setvl` instruction. - -Programmer's Note: conceptual understanding of Cray-style Vectors is far beyond the scope -of the Power ISA Technical Reference. Guidance on the 50-year-old Cray Vector paradigm is -best sought elsewhere: good studies include Academic Courses given on the 1970s -Cray Supercomputers over at least the past three decades. - -**SUBVL - Sub Vector Length** - -This is a "group by quantity" that effectively asks each iteration -of the hardware loop to load SUBVL elements of width elwidth at a -time. 
Effectively, SUBVL is like a SIMD multiplier: instead of just 1 -operation issued, SUBVL operations are issued. - -The main effect of SUBVL is that predication bits are applied per -**group**, rather than by individual element. Legal values are 0 to 3, -representing 1 operation (1 element) thru 4 operations (4 elements) respectively. -Elements are best though of in the context of 3D, Audio and Video: two Left and Right -Channel "elements" or four ARGB "elements", or three XYZ coordinate "elements". - -`subvl` is again primarily set by the `setvl` instruction. Not to be confused -with `hphint`. - -Directly related to `subvl` is the `pack` and `unpack` Mode bits of `SVSTATE`. -See `svstep` instruction for how to set Pack and Unpack Modes. - - -**Horizontal Parallelism** - -A problem exists for hardware where it may not be able to detect -that a programmer (or compiler) knows of opportunities for parallelism -and lack of overlap between loops. - -For hphint, the number chosen must be consistently -executed **every time**. Hardware is not permitted to execute five -computations for one instruction then three on the next. -hphint is a hint from the compiler to hardware that exactly this -many elements may be safely executed in parallel, without hazards -(including Memory accesses). -Interestingly, when hphint is set equal to VL, it is in effect -as if Vertical First mode were not set, because the hardware is -given the option to run through all elements in an instruction. -This is exactly what Horizontal-First is: a for-loop from 0 to VL-1 -except that the hardware may *choose* the number of elements. 
- -*Note to programmers: changing VL during the middle of such modes -should be done only with due care and respect for the fact that SVSTATE -has exactly the same peer-level status as a Program Counter.* - -------------- +---------------- \newpage{} diff --git a/openpower/sv/sprs.mdwn b/openpower/sv/sprs.mdwn index a7349d10d..c343e90ff 100644 --- a/openpower/sv/sprs.mdwn +++ b/openpower/sv/sprs.mdwn @@ -1,25 +1,43 @@ -[[!tag standards]] +# SPRs -TODO: this page must be kep up-to-date with ls008 +## SVSTATE SPR -# SPRs -Note Power ISA v3.1 p12: +The format of the SVSTATE SPR is as follows: - The designated SPR sandbox consists of non-privileged SPRs - 704-719 and privileged SPRs 720-735. +| Field | Name | Description | +| ----- | -------- | --------------------- | +| 0:6 | maxvl | Max Vector Length | +| 7:13 | vl | Vector Length | +| 14:20 | srcstep | for srcstep = 0..VL-1 | +| 21:27 | dststep | for dststep = 0..VL-1 | +| 28:29 | dsubstep | for substep = 0..SUBVL-1 | +| 30:31 | ssubstep | for substep = 0..SUBVL-1 | +| 32:33 | mi0 | REMAP RA/FRA/BFA SVSHAPE0-3 | +| 34:35 | mi1 | REMAP RB/FRB/BFB SVSHAPE0-3 | +| 36:37 | mi2 | REMAP RC/FRT SVSHAPE0-3 | +| 38:39 | mo0 | REMAP RT/FRT/BF SVSHAPE0-3 | +| 40:41 | mo1 | REMAP EA/RS/FRS SVSHAPE0-3 | +| 42:46 | SVme | REMAP enable (RA-RT) | +| 47:52 | rsvd | reserved | +| 53 | pack | PACK (srcstrp reorder) | +| 54 | unpack | UNPACK (dststep order) | +| 55:61 | hphint | Horizontal Hint | +| 62 | RMpst | REMAP persistence | +| 63 | vfirst | Vertical First mode | -There are eight SPRs, available in any privilege level: +Notes: -* SVSTATE (containing copies of MVL, VL and SUBVL as well as context information) -* SVLR, a mirror of LR, used by Vectorised Branch -* SVSHAPE0-3 for REMAP purposes, re-shaping Vector loops -* SVREMAP for applying specific shapes to specific registers +* The entries are truncated to be within range. Attempts to set VL to + greater than MAXVL will truncate VL. 
+* Setting srcstep, dststep to 64 or greater, or VL or MVL to greater + than 64 is reserved and will cause an illegal instruction trap. -# SVSTATE +**SVSTATE Fields** -This is a standard SPR that (REMAP aside) contains sufficient information for a -full context save/restore. It contains (and permits setting of): +SVSTATE is a standard SPR that (if REMAP is not activated) contains sufficient +self-contaned information for a full context save/restore. +SVSTATE contains (and permits setting of): * MVL (the Maximum Vector Length) - declares (statically) how much of a regfile is to be reserved for Vector elements @@ -41,30 +59,76 @@ full context save/restore. It contains (and permits setting of): * Pack - if set then srcstep/substep VL/SUBVL loop-ordering is inverted. * UnPack - if set then dststep/substep VL/SUBVL loop-ordering is inverted. * hphint - Horizontal Parallelism Hint. Indicates that - no Hazards exist between this number of sequentially-accessed - elements (including after REMAP). In Vertical First Mode - hardware **MUST** perform this many elements in parallel - per instruction. Set to zero to indicate "no hint". + no Hazards exist between groups of elements in sequential multiples of this number + (before REMAP). By definition: elements for which `FLOOR(srcstep/hphint)` is + equal *before REMAP* are in the same parallelism "group". In Vertical First Mode + hardware **MUST ONLY** process elements in the same group, and must stop + Horizontal Issue at the last element of a given group. Set to zero to indicate "no hint". * SVme - REMAP enable bits, indicating which register is to be - REMAPed. RA, RB, RC, RT or EA. -* mi0-mi4 - when the corresponding SVme bit is enabled, mi0-mi4 + REMAPed: RA, RB, RC, RT and EA are the canonical (typical) register names + associated with each bit, with RA being the LSB and EA being the MSB. + See table below for ordering. 
When `SVme` is zero (0b00000) REMAP + is **fully disabled and inactive** regardless of the contents of + `SVSTATE`, `mi0-mi2/mo0-mo1`, or the four `SVSHAPEn` SPRs +* mi0-mi2/mo0-mo1 - when the corresponding SVme bit is enabled, these indicate the SVSHAPE (0-3) that the corresponding register (RA etc) - should use. - -**MAXVECTORLENGTH (MVL)** - -MAXVECTORLENGTH is the same concept as MVL in RVV, except that it -is variable length and may be dynamically set. MVL is -however limited to the regfile bitwidth, 64. - -**Vector Length (VL)** - -VSETVL is slightly different from RVV. Similar to RVV, VL is set to be within -the range 0 <= VL <= MVL (where MVL in turn is limited to 1 <= MVL <= XLEN) - - VL = rd = MIN(vlen, MVL) - -where 1 <= MVL <= XLEN + should use, as long as the register's corresponding SVme bit is set + +Programmer's Note: the fact that REMAP is entirely dormant when `SVme` is zero +allows establishment of REMAP context well in advance, followed by utilising `svremap` +at a precise (or the very last) moment. Some implementations may exploit this +to cache (or take some time to prepare caches) in the background whilst other +(unrelated) instructions are being executed. This is particularly important to +bear in mind when using `svindex` which will require hardware to perform (and +cache) additional GPR reads. + +Programmer's Note: when REMAP is activated it becomes necessary on any +context-switch (Interrupt or Function call) to detect (or know in advance) +that REMAP is enabled and to additionally save/restore the four SVSHAPE +SPRs, SVHAPE0-3. Given that this is expected to be a rare occurrence it was +deemed unreasonable to burden every context-switch or function call with +mandatory save/restore of SVSHAPEs, and consequently it is a *callee* +(and Trap Handler) responsibility. Callees (and Trap Handlers) **MUST** +avoid using all and any SVP64 instructions during the period where state +could be adversely affected. 
SVP64 purely relies on Scalar instructions, +so Scalar instructions (except the SVP64 Management ones and mtspr and +mfspr) are 100% guaranteed to have zero impact on SVP64 state. + +**Max Vector Length (maxvl)** + +MAXVECTORLENGTH is the same concept as MVL in RISC-V RVV, except that it +is variable length and may be dynamically set (normally from an immediate +field only). MVL is limited to 7 bits +(in the first version of SVP64) and consequently the maximum number of +elements is limited to between 0 and 127. + +Programmer's Note: Except by directly using `mtspr` on SVSTATE, which may +result in performance penalties on some hardware implementations, SVSTATE's `maxvl` +field may only be set **statically** as an immediate, by the `setvl` instruction. +It may **NOT** be set dynamically from a register. Compiler writers and assembly +programmers are expected to perform static register file analysis, subdivision, +and allocation and only utilise `setvl`. Direct writing to SVSTATE in order to +"bypass" this Note could, in less-advanced implementations, potentially cause stalling, +particularly if SVP64 instructions are issued directly after the `mtspr` to SVSTATE. + +**Vector Length (vl)** + +The actual Vector length, the number of elements in a "Vector", `SVSTATE.vl` may be set +entirely dynamically at runtime from a number of sources. `setvl` is the primary +instruction for setting Vector Length. +`setvl` is conceptually similar but different from the Cray, SX Aurora, and RISC-V RVV +equivalent. Similar to RVV, VL is set to be within +the range 0 <= VL <= MVL. Unlike RVV, VL is set **exactly** according to the following: + + VL = (RT|0) = MIN(vlen, MVL) + +where 0 <= MVL <= 127 and vlen may come from an immediate, `RA`, or from the `CTR` SPR, +depending on options selected with the `setvl` instruction. + +Programmer's Note: conceptual understanding of Cray-style Vectors is far beyond the scope +of the Power ISA Technical Reference. 
Guidance on the 50-year-old Cray Vector paradigm is +best sought elsewhere: good studies include Academic Courses given on the 1970s +Cray Supercomputers over at least the past three decades. **SUBVL - Sub Vector Length** @@ -74,8 +138,17 @@ time. Effectively, SUBVL is like a SIMD multiplier: instead of just 1 operation issued, SUBVL operations are issued. The main effect of SUBVL is that predication bits are applied per -**group**, rather than by individual element. Legal values are 1 to 4. -Illegal values raise an exception. +**group**, rather than by individual element. Legal values are 0 to 3, +representing 1 operation (1 element) thru 4 operations (4 elements) respectively. +Elements are best though of in the context of 3D, Audio and Video: two Left and Right +Channel "elements" or four ARGB "elements", or three XYZ coordinate "elements". + +`subvl` is again primarily set by the `setvl` instruction. Not to be confused +with `hphint`. + +Directly related to `subvl` is the `pack` and `unpack` Mode bits of `SVSTATE`. +See `svstep` instruction for how to set Pack and Unpack Modes. + **Horizontal Parallelism** @@ -99,39 +172,7 @@ except that the hardware may *choose* the number of elements. 
should be done only with due care and respect for the fact that SVSTATE has exactly the same peer-level status as a Program Counter.* -**SVSTATE SPR** - -The format of the SVSTATE SPR is as follows: - -| Field | Name | Description | -| ----- | -------- | --------------------- | -| 0:6 | maxvl | Max Vector Length | -| 7:13 | vl | Vector Length | -| 14:20 | srcstep | for srcstep = 0..VL-1 | -| 21:27 | dststep | for dststep = 0..VL-1 | -| 28:29 | dsubstep | for substep = 0..SUBVL-1 | -| 30:31 | ssubstep | for substep = 0..SUBVL-1 | -| 32:33 | mi0 | REMAP RA SVSHAPE0-3 | -| 34:35 | mi1 | REMAP RB SVSHAPE0-3 | -| 36:37 | mi2 | REMAP RC SVSHAPE0-3 | -| 38:39 | mo0 | REMAP RT SVSHAPE0-3 | -| 40:41 | mo1 | REMAP EA SVSHAPE0-3 | -| 42:46 | SVme | REMAP enable (RA-RT) | -| 47:52 | rsvd | reserved | -| 53 | pack | PACK (srcstrp reorder) | -| 54 | unpack | UNPACK (dststep order) | -| 55:61 | hphint | Horizontal Hint | -| 62 | RMpst | REMAP persistence | -| 63 | vfirst | Vertical First mode | - -Notes: - -* The entries are truncated to be within range. Attempts to set VL to - greater than MAXVL will truncate VL. -* Setting srcstep, dststep to 64 or greater, or VL or MVL to greater - than 64 is reserved and will cause an illegal instruction trap. - -# SVLR +## SVLR SV Link Register, exactly analogous to LR (Link Register) may be used for temporary storage of SVSTATE, and, in particular, @@ -141,3 +182,6 @@ SVLR and SVSTATE whenever LR and NIA are. Note that there is no equivalent Link variant of SVREMAP or SVSHAPE0-3 (it would be too costly), so SVLR has limited applicability: REMAP SPRs must be saved and restored explicitly. + +[[!tag standards]] + -- 2.30.2
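Reviewer's sketch (not part of the patch): the SVSTATE table moved into `sprs.mdwn` can be sanity-checked with a small bit-field model. The Python below is a minimal illustration only, under two assumptions that the patch itself does not state: that the `Field` column uses Power ISA MSB0 numbering on a 64-bit SPR (bit 0 is the most significant bit, so `vfirst` at bit 63 is the least significant), and that over-wide values are simply masked to the field width. The helper names `get_field`, `set_field`, `model_setvl` and `hphint_group` are hypothetical and exist only for this example. The sketch covers the `VL = MIN(vlen, MVL)` rule and the `FLOOR(srcstep/hphint)` grouping described in the text, and nothing else about `setvl`.

```python
# Field layout copied from the SVSTATE table: name -> (first bit, last bit).
# ASSUMPTION: MSB0 numbering on a 64-bit register (bit 0 = most significant).
SVSTATE_FIELDS = {
    "maxvl": (0, 6), "vl": (7, 13), "srcstep": (14, 20), "dststep": (21, 27),
    "dsubstep": (28, 29), "ssubstep": (30, 31),
    "mi0": (32, 33), "mi1": (34, 35), "mi2": (36, 37),
    "mo0": (38, 39), "mo1": (40, 41),
    "SVme": (42, 46), "rsvd": (47, 52), "pack": (53, 53), "unpack": (54, 54),
    "hphint": (55, 61), "RMpst": (62, 62), "vfirst": (63, 63),
}

MASK64 = (1 << 64) - 1


def _shift_and_mask(name):
    """Return (shift, unshifted mask) for a named field."""
    first, last = SVSTATE_FIELDS[name]
    width = last - first + 1
    return 63 - last, (1 << width) - 1


def get_field(svstate, name):
    """Extract one named field from a 64-bit SVSTATE value."""
    shift, mask = _shift_and_mask(name)
    return (svstate >> shift) & mask


def set_field(svstate, name, value):
    """Return a copy of svstate with one field replaced.

    ASSUMPTION: bits of `value` beyond the field width are dropped.
    """
    shift, mask = _shift_and_mask(name)
    cleared = svstate & ~(mask << shift) & MASK64
    return cleared | ((value & mask) << shift)


def model_setvl(svstate, vlen):
    """Model only the rule VL = MIN(vlen, MVL); returns (new svstate, VL)."""
    vl = min(vlen, get_field(svstate, "maxvl"))
    return set_field(svstate, "vl", vl), vl


def hphint_group(srcstep, hphint):
    """Parallelism group of an element: FLOOR(srcstep / hphint).

    Returns None when hphint is zero, which the text defines as "no hint".
    """
    if hphint == 0:
        return None
    return srcstep // hphint


if __name__ == "__main__":
    state = set_field(0, "maxvl", 8)
    state, vl = model_setvl(state, 20)   # request 20 elements with MVL = 8
    print(vl, get_field(state, "vl"), get_field(state, "maxvl"))
    print([hphint_group(step, 4) for step in range(8)])
```

With `hphint` set to 4, elements 0 to 3 fall in group 0 and elements 4 to 7 in group 1, which matches the statement that elements with equal `FLOOR(srcstep/hphint)` belong to the same parallelism group.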