# SPRs <a name="sprs"></a>

## SVSTATE SPR

The format of the SVSTATE SPR is as follows:

| Field | Name     | Description                  |
| ----- | -------- | ---------------------------- |
| 0:6   | maxvl    | Max Vector Length            |
| 7:13  | vl       | Vector Length                |
| 14:20 | srcstep  | for srcstep = 0..VL-1        |
| 21:27 | dststep  | for dststep = 0..VL-1        |
| 28:29 | dsubstep | for substep = 0..SUBVL-1     |
| 30:31 | ssubstep | for substep = 0..SUBVL-1     |
| 32:33 | mi0      | REMAP RA/FRA/BFA SVSHAPE0-3  |
| 34:35 | mi1      | REMAP RB/FRB/BFB SVSHAPE0-3  |
| 36:37 | mi2      | REMAP RC/FRT SVSHAPE0-3      |
| 38:39 | mo0      | REMAP RT/FRT/BF SVSHAPE0-3   |
| 40:41 | mo1      | REMAP EA/RS/FRS SVSHAPE0-3   |
| 42:46 | SVme     | REMAP enable (RA-RT)         |
| 47:52 | rsvd     | reserved                     |
| 53    | pack     | PACK (srcstep reorder)       |
| 54    | unpack   | UNPACK (dststep order)       |
| 55:61 | hphint   | Horizontal Hint              |
| 62    | RMpst    | REMAP persistence            |
| 63    | vfirst   | Vertical First mode          |

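As an illustration of the layout in the table, the following Python sketch extracts and updates individual fields of a 64-bit SVSTATE value. It assumes MSB0 bit numbering (bit 0 is the most significant bit), as is conventional in the Power ISA. The helper names `get_field` and `set_field` are hypothetical and are not part of any specification or simulator API.

```python
# Field layout copied from the SVSTATE table: name -> (first_bit, last_bit).
# ASSUMPTION: bits are numbered MSB0 (bit 0 is the most significant bit of
# the 64-bit SPR). Helper names are illustrative only.
SVSTATE_FIELDS = {
    "maxvl": (0, 6), "vl": (7, 13), "srcstep": (14, 20), "dststep": (21, 27),
    "dsubstep": (28, 29), "ssubstep": (30, 31),
    "mi0": (32, 33), "mi1": (34, 35), "mi2": (36, 37),
    "mo0": (38, 39), "mo1": (40, 41), "SVme": (42, 46), "rsvd": (47, 52),
    "pack": (53, 53), "unpack": (54, 54), "hphint": (55, 61),
    "RMpst": (62, 62), "vfirst": (63, 63),
}


def _shift_and_mask(name):
    """Convert an MSB0 bit range into an LSB0 shift and a right-aligned mask."""
    first, last = SVSTATE_FIELDS[name]
    width = last - first + 1
    return 63 - last, (1 << width) - 1


def get_field(svstate, name):
    """Return the value of one named field of a 64-bit SVSTATE integer."""
    shift, mask = _shift_and_mask(name)
    return (svstate >> shift) & mask


def set_field(svstate, name, value):
    """Return a copy of svstate with one named field replaced (value is masked)."""
    shift, mask = _shift_and_mask(name)
    cleared = svstate & ~(mask << shift)
    return cleared | ((value & mask) << shift)
```

For example, `set_field(0, "vfirst", 1)` sets only the least significant bit of the 64-bit value under the MSB0 assumption.
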
Notes:

* The entries are truncated to be within range. Attempts to set VL to
  greater than MAXVL will truncate VL.
* Setting srcstep, dststep to 64 or greater, or VL or MVL to greater
  than 64 is reserved and will cause an illegal instruction trap.

**SVSTATE Fields**

SVSTATE is a standard SPR that (if REMAP is not activated) contains sufficient
self-contained information for a full context save/restore.
SVSTATE contains (and permits setting of):

* MVL (the Maximum Vector Length) - declares (statically) how
  much of a regfile is to be reserved for Vector elements
* VL - Vector Length
* dststep - the destination element offset of the current parallel
  instruction being executed
* srcstep - for twin-predication, the source element offset as well.
* ssubstep - the source subvector element offset of the current
  parallel instruction being executed
* dsubstep - the destination subvector element offset of the current
  parallel instruction being executed
* vfirst - Vertical First mode. srcstep, dststep and substep
  **do not advance** unless explicitly requested to do so with
  pseudo-op svstep (a mode of setvl)
* RMpst - REMAP persistence. REMAP will apply only to the following
  instruction unless this bit is set, in which case REMAP "persists".
  Reset (cleared) on use of the `setvl` instruction if used to
  alter VL or MVL.
* Pack - if set then srcstep/substep VL/SUBVL loop-ordering is inverted.
* UnPack - if set then dststep/substep VL/SUBVL loop-ordering is inverted.
* hphint - Horizontal Parallelism Hint. Indicates that
  no Hazards exist between groups of elements in sequential multiples of this number
  (before REMAP). By definition: elements for which `FLOOR(srcstep/hphint)` is
  equal *before REMAP* are in the same parallelism "group". In Vertical First Mode
  hardware **MUST ONLY** process elements in the same group, and must stop
  Horizontal Issue at the last element of a given group. Set to zero to indicate "no hint".
* SVme - REMAP enable bits, indicating which register is to be
  REMAPed: RA, RB, RC, RT and EA are the canonical (typical) register names
  associated with each bit, with RA being the LSB and EA being the MSB.
  See table below for ordering. When `SVme` is zero (0b00000) REMAP
  is **fully disabled and inactive** regardless of the contents of
  `SVSTATE`, `mi0-mi2/mo0-mo1`, or the four `SVSHAPEn` SPRs
* mi0-mi2/mo0-mo1 - these indicate the SVSHAPE (0-3) that the
  corresponding register (RA etc) should use, and take effect only
  when that register's corresponding SVme bit is set

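The relationship between `SVme` and the `mi0-mi2/mo0-mo1` selectors described in the list above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the pairing RA/mi0, RB/mi1, RC/mi2, RT/mo0, EA/mo1 suggested by the SVSTATE table, with RA as the least significant bit of `SVme`. The function name `active_remaps` is hypothetical.

```python
# ASSUMPTION: pairing of canonical register names with selector fields is
# taken from the SVSTATE table; RA is the LSB of SVme and EA is the MSB.
REMAP_SLOTS = [
    ("RA", "mi0"),
    ("RB", "mi1"),
    ("RC", "mi2"),
    ("RT", "mo0"),
    ("EA", "mo1"),
]


def active_remaps(svme, selectors):
    """Return {register: SVSHAPE index} for each register whose SVme bit is set.

    svme      -- the 5-bit SVme field as an integer
    selectors -- dict of the 2-bit mi0, mi1, mi2, mo0 and mo1 field values
    """
    active = {}
    for bit, (register, selector) in enumerate(REMAP_SLOTS):
        if (svme >> bit) & 1:
            active[register] = selectors[selector] & 0b11
    return active
```

With `SVme` equal to zero the function returns an empty dictionary whatever the selectors contain, mirroring the statement that REMAP is then fully disabled and inactive.
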
Programmer's Note: the fact that REMAP is entirely dormant when `SVme` is zero
allows establishment of REMAP context well in advance, followed by utilising `svremap`
at a precise (or the very last) moment. Some implementations may exploit this
to cache (or take some time to prepare caches) in the background whilst other
(unrelated) instructions are being executed. This is particularly important to
bear in mind when using `svindex` which will require hardware to perform (and
cache) additional GPR reads.

Programmer's Note: when REMAP is activated it becomes necessary on any
context-switch (Interrupt or Function call) to detect (or know in advance)
that REMAP is enabled and to additionally save/restore the four SVSHAPE
SPRs, SVSHAPE0-3. Given that this is expected to be a rare occurrence it was
deemed unreasonable to burden every context-switch or function call with
mandatory save/restore of SVSHAPEs, and consequently it is a *callee*
(and Trap Handler) responsibility. Callees (and Trap Handlers) **MUST**
avoid using any and all SVP64 instructions during the period where state
could be adversely affected. SVP64 purely relies on Scalar instructions,
so Scalar instructions (except the SVP64 Management ones and mtspr and
mfspr) are 100% guaranteed to have zero impact on SVP64 state.

**Max Vector Length (maxvl)** <a name="mvl" />

MAXVECTORLENGTH is the same concept as MVL in RISC-V RVV, except that it
is variable length and may be dynamically set (normally from an immediate
field only). MVL is limited to 7 bits
(in the first version of SVP64) and consequently the number of
elements is limited to the range 0 to 127.

Programmer's Note: Except by directly using `mtspr` on SVSTATE, which may
result in performance penalties on some hardware implementations, SVSTATE's `maxvl`
field may only be set **statically** as an immediate, by the `setvl` instruction.
It may **NOT** be set dynamically from a register. Compiler writers and assembly
programmers are expected to perform static register file analysis, subdivision,
and allocation and only utilise `setvl`. Direct writing to SVSTATE in order to
"bypass" this Note could, in less-advanced implementations, potentially cause stalling,
particularly if SVP64 instructions are issued directly after the `mtspr` to SVSTATE.

**Vector Length (vl)** <a name="vl" />

The actual Vector length, the number of elements in a "Vector", `SVSTATE.vl` may be set
entirely dynamically at runtime from a number of sources. `setvl` is the primary
instruction for setting Vector Length.
`setvl` is conceptually similar to, but different from, the Cray, SX Aurora, and RISC-V RVV
equivalents. Similar to RVV, VL is set to be within
the range 0 <= VL <= MVL. Unlike RVV, VL is set **exactly** according to the following:

    VL = (RT|0) = MIN(vlen, MVL)

where 0 <= MVL <= 127 and vlen may come from an immediate, `RA`, or from the `CTR` SPR,
depending on options selected with the `setvl` instruction.

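The `MIN(vlen, MVL)` rule above can be sketched in Python as follows. This is an illustrative model only: it covers just the arithmetic, not the choice of source for vlen or any other `setvl` behaviour, and the names `compute_vl` and `strip_mine` are hypothetical. The second function shows the familiar strip-mining pattern, in which a long array is processed in chunks of at most MVL elements.

```python
def compute_vl(vlen, mvl):
    """Model VL = MIN(vlen, MVL) for a requested length and a static MVL."""
    if not 0 <= mvl <= 127:
        raise ValueError("MVL must fit in 7 bits (0..127)")
    if vlen < 0:
        raise ValueError("requested vector length must be non-negative")
    return min(vlen, mvl)


def strip_mine(total_elements, mvl):
    """Return the VL chosen on each pass when processing total_elements.

    ASSUMPTION: illustrative loop only; each pass requests all remaining
    elements and receives at most mvl of them.
    """
    if mvl == 0 and total_elements > 0:
        raise ValueError("MVL of zero can never make progress")
    chunks = []
    remaining = total_elements
    while remaining > 0:
        vl = compute_vl(remaining, mvl)
        chunks.append(vl)
        remaining -= vl
    return chunks
```

Because VL is set exactly, every pass except possibly the last uses the full MVL in this model, and the final pass handles whatever remains.
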
Programmer's Note: conceptual understanding of Cray-style Vectors is far beyond the scope
of the Power ISA Technical Reference. Guidance on the 50-year-old Cray Vector paradigm is
best sought elsewhere: good studies include Academic Courses given on the 1970s
Cray Supercomputers over at least the past three decades.

**SUBVL - Sub Vector Length**

This is a "group by quantity" that effectively asks each iteration
of the hardware loop to load SUBVL elements of width elwidth at a
time. Effectively, SUBVL is like a SIMD multiplier: instead of just 1
operation issued, SUBVL operations are issued.

The main effect of SUBVL is that predication bits are applied per
**group**, rather than by individual element. Legal values are 0 to 3,
representing 1 operation (1 element) through 4 operations (4 elements) respectively.
Elements are best thought of in the context of 3D, Audio and Video: two Left and Right
Channel "elements" or four ARGB "elements", or three XYZ coordinate "elements".

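The per-group predication described above can be sketched as a nested loop. This is a simplified model under stated assumptions: elements are indexed flatly as `step * SUBVL + substep`, bit `step` of the predicate (counting from the least significant bit) controls group `step`, and Pack/Unpack reordering is ignored. The function name `element_indices` is hypothetical.

```python
def element_indices(vl, subvl_field, predicate):
    """List the flat element indices that would be processed.

    vl          -- Vector Length (number of groups)
    subvl_field -- encoded SUBVL, 0..3, meaning 1..4 elements per group
    predicate   -- integer bitmask; ASSUMPTION: bit `step` (LSB0) enables
                   the whole group at that step
    """
    if not 0 <= subvl_field <= 3:
        raise ValueError("SUBVL field must be in the range 0..3")
    subvl = subvl_field + 1
    indices = []
    for step in range(vl):
        if not (predicate >> step) & 1:
            continue  # one predicate bit masks out the entire group
        for substep in range(subvl):
            indices.append(step * subvl + substep)
    return indices
```

With three XYZ elements per group (`subvl_field` of 2), clearing one predicate bit skips all three coordinates of that group together.
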
`subvl` is again primarily set by the `setvl` instruction. Not to be confused
with `hphint`.

Directly related to `subvl` are the `pack` and `unpack` Mode bits of `SVSTATE`.
See `svstep` instruction for how to set Pack and Unpack Modes.

**Horizontal Parallelism**

A problem exists for hardware where it may not be able to detect
that a programmer (or compiler) knows of opportunities for parallelism
and lack of overlap between loops.

For hphint, the number chosen must be consistently
executed **every time**. Hardware is not permitted to execute five
computations for one instruction then three on the next.
hphint is a hint from the compiler to hardware that exactly this
many elements may be safely executed in parallel, without hazards
(including Memory accesses).
Interestingly, when hphint is set equal to VL, it is in effect
as if Vertical First mode were not set, because the hardware is
given the option to run through all elements in an instruction.
This is exactly what Horizontal-First is: a for-loop from 0 to VL-1
except that the hardware may *choose* the number of elements.

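The grouping rule `FLOOR(srcstep/hphint)` can be illustrated with a short Python sketch that partitions the element steps 0..VL-1 into parallelism groups before any REMAP is applied. The function name `hphint_groups` is hypothetical, and returning `None` for the "no hint" case is a choice made for this sketch only.

```python
def hphint_groups(vl, hphint):
    """Partition srcstep values 0..vl-1 into hphint parallelism groups.

    Elements with equal srcstep // hphint share a group. Returns None when
    hphint is zero, which the field description defines as "no hint".
    """
    if hphint == 0:
        return None
    groups = {}
    for srcstep in range(vl):
        groups.setdefault(srcstep // hphint, []).append(srcstep)
    return [groups[key] for key in sorted(groups)]
```

When `hphint` equals VL the result is a single group containing every element, which matches the observation that this case behaves like Horizontal-First.
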
*Note to programmers: changing VL during the middle of such modes
should be done only with due care and respect for the fact that SVSTATE
has exactly the same peer-level status as a Program Counter.*

## SVLR

SV Link Register, exactly analogous to LR (Link Register), may
be used for temporary storage of SVSTATE, and, in particular,
Vectorised Branch-Conditional instructions may interchange
SVLR and SVSTATE whenever LR and NIA are.

Note that there is no equivalent Link variant of SVREMAP or
SVSHAPE0-3 (it would be too costly), so SVLR has limited applicability:
REMAP SPRs must be saved and restored explicitly.

[[!tag standards]]