1 # Rewrite of SVP64 for OpenPower ISA v3.1
2
3 * [[svp64/discussion]]
4
5
6 The plan is to create an encoding for SVP64, then to create an encoding for
7 SVP48, then to reorganize them both to improve field overlap, reducing the
8 amount of decoder hardware necessary.
9
10 All bit numbers are in MSB0 form (the bits are numbered from 0 at the MSB and
11 counting up as you move to the LSB end). All bit ranges are inclusive (so
12 `4:6` means bits 4, 5, and 6).
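
The MSB0 convention can be illustrated with a small helper. This is only a sketch: the function name `msb0_field` and the default 32-bit width are assumptions made for illustration and are not part of the specification.

```python
def msb0_field(word: int, start: int, end: int, width: int = 32) -> int:
    """Return bits start..end (inclusive, MSB0 numbering) of a width-bit word.

    Bit 0 is the most significant bit, bit (width - 1) the least significant.
    """
    if not (0 <= start <= end < width):
        raise ValueError("bit range out of bounds")
    nbits = end - start + 1
    # Distance of the field's last bit from the LSB end of the word.
    shift = width - 1 - end
    return (word >> shift) & ((1 << nbits) - 1)
```

For example, `msb0_field(word, 4, 6)` selects bits 4, 5, and 6, returned as a 3-bit integer with bit 4 as its most significant bit.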
13
14 64-bit instructions are split into two 32-bit words, the prefix and the suffix. The prefix always comes before the suffix in PC order.
15
16 ## Remapped Encoding (`RM[0:23]`)
17
18 To allow relatively easy remapping of which portions of the Prefix Opcode Map
19 are used for SVP64 without needing to rewrite a large portion of the SVP64
20 spec, a mapping is defined from the OpenPower v3.1 prefix bits to a new 24-bit
21 Remapped Encoding denoted `RM[0]` at the MSB to `RM[23]` at the LSB.
22
23 The mapping from the OpenPower v3.1 prefix bits to the Remapped Encoding is
24 defined in the Prefix Fields section.
25
26 ## Remapped Encoding Fields
27
28 | Remapped Encoding Field Name | Field bits | Description |
29 |------------------------------|------------|----------------|
30 | MASK | `0:3` | Execution Mask |
31 | TBD | `4:23` | TBD |
32
33 ## MASK Encoding
34
35 TODO: split out (remove) bit 3 into a separate, standalone field (see [[discussion]]) so that twin predication can use the same encoding, and split the table into two halves. The standalone bit selects whether *both* src and dest predication are CR based or *both* are INT based. This saves one bit and is less complex to implement in hardware.
36
37 Integer based predication. Twin predication uses the same encoding, which allows either the same register (r3 or r10) to be used for both src and dest, or different registers (one for src, one for dest).
38
39 | Value | Mnemonic | Description |
40 |-------|-------------------|--------------------------------------------------------|
41 | 0000 | - | Reserved (causes an illegal instruction trap) |
42 | 0001 | ALWAYS (implicit) | Operation is not masked (see [[discussion]]) |
43 | 0010 | R3 | Element `i` is enabled if `R3 & (1 << i)` is non-zero |
44 | 0011 | ~R3 | Element `i` is enabled if `R3 & (1 << i)` is zero |
45 | 0100 | R10 | Element `i` is enabled if `R10 & (1 << i)` is non-zero |
46 | 0101 | ~R10 | Element `i` is enabled if `R10 & (1 << i)` is zero |
47 | 0110 | R30 | Element `i` is enabled if `R30 & (1 << i)` is non-zero |
48 | 0111 | ~R30 | Element `i` is enabled if `R30 & (1 << i)` is zero |
49
50 CR based predication. TODO: select an alternate CR for twin predication? See [[discussion]]. Overlap of the two CR based predicates must be taken into account. The options are: start one of them at a suitably high CR field; accept that, for twin predication, VL must not exceed the range at which overlap occurs; *or* use the same starting point for both but select different *bits* of the same CRs.
51
52 | Value | Mnemonic | Description |
53 |-------|-------------------|--------------------------------------------------------|
54 | 1000 | lt | Element `i` is enabled if `CR[6+i].LT` is set |
55 | 1001 | nl/ge | Element `i` is enabled if `CR[6+i].LT` is clear |
56 | 1010 | gt | Element `i` is enabled if `CR[6+i].GT` is set |
57 | 1011 | ng/le | Element `i` is enabled if `CR[6+i].GT` is clear |
58 | 1100 | eq | Element `i` is enabled if `CR[6+i].EQ` is set |
59 | 1101 | ne | Element `i` is enabled if `CR[6+i].EQ` is clear |
60 | 1110 | so/un | Element `i` is enabled if `CR[6+i].FU` is set |
61 | 1111 | ns/nu | Element `i` is enabled if `CR[6+i].FU` is clear |
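
The two MASK tables above can be read as a single decision procedure. The following is a minimal sketch, not a reference implementation. It assumes a hypothetical model in which `gpr` is a list of integer register values and `cr` is a list of 4-bit CR field values with LT as the most significant bit, followed by GT, EQ, and SO/FU. The function name `element_enabled` is likewise an illustration only.

```python
def element_enabled(mask: int, i: int, gpr, cr) -> bool:
    """Decide whether element i is enabled for a 4-bit MASK value.

    gpr: sequence of integer register values (hypothetical model).
    cr:  sequence of 4-bit CR fields, LT=8, GT=4, EQ=2, SO/FU=1 (assumed layout).
    """
    if mask == 0b0000:
        # Reserved encoding: the spec says this traps as an illegal instruction.
        raise ValueError("reserved MASK value (illegal instruction)")
    if mask == 0b0001:
        return True  # ALWAYS: operation is not masked
    invert = bool(mask & 1)  # odd encodings are the inverted predicates
    if mask < 0b1000:
        # Integer predication: 001x -> R3, 010x -> R10, 011x -> R30
        reg = {0b001: 3, 0b010: 10, 0b011: 30}[mask >> 1]
        bit_set = bool((gpr[reg] >> i) & 1)
    else:
        # CR predication: middle two bits pick LT, GT, EQ or SO/FU
        bit_index = (mask >> 1) & 0b11
        bit_set = bool((cr[6 + i] >> (3 - bit_index)) & 1)
    return bit_set != invert
```

The sketch indexes `cr[6 + i]` literally, as the table does, and leaves any wrap-around of CR field numbers to the caller.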
62
63 ## Prefix Opcode Map (64-bit instruction encoding) (prefix bits 6:11)
64
65 (shows both PowerISA v3.1 instructions and new SVP instructions; empty cells are yet-to-be-allocated Illegal Instructions)
66
67 | bits 6:11 | ---000 | ---001 | ---010 | ---011 | ---100 | ---101 | ---110 | ---111 |
68 |-----------|----------|------------|----------|----------|----------|----------|----------|----------|
69 | 000--- | 8LS-form | 8LS-form | 8LS-form | 8LS-form | 8LS-form | 8LS-form | 8LS-form | 8LS-form |
70 | 001--- | | | | | | | | |
71 | 010--- | 8RR-form | | | | SVP64 | SVP64 | SVP64 | SVP64 |
72 | 011--- | | | | | SVP64 | SVP64 | SVP64 | SVP64 |
73 | 100--- | MLS-form | MLS-form | MLS-form | MLS-form | MLS-form | MLS-form | MLS-form | MLS-form |
74 | 101--- | | | | | | | | |
75 | 110--- | MRR-form | | | | SVP64 | SVP64 | SVP64 | SVP64 |
76 | 111--- | | MMIRR-form | | | SVP64 | SVP64 | SVP64 | SVP64 |
77
78 ## Prefix Fields
79
80 | Prefix Field Name | Field bits | Constant Value | Description |
81 |---------------------|------------|----------------|--------------------------------------------|
82 | PO (Primary Opcode) | `0:5` | `1` | Indicates this is a 64-bit instruction |
83 | `RM[0]` | `6` | | Bit 0 of the Remapped Encoding |
84 | SVP64_7 | `7` | `1` | Indicates this is an SVP64 instruction |
85 | `RM[1]` | `8` | | Bit 1 of the Remapped Encoding |
86 | SVP64_9 | `9` | `1` | Indicates this is an SVP64 instruction |
87 | `RM[2:23]` | `10:31` | | Bits 2 through 23 of the Remapped Encoding |
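
As an illustration of the Prefix Fields table, the sketch below checks the three constant fields and gathers the scattered `RM` bits into one 24-bit integer. The function names and the choice to return `None` for non-SVP64 prefixes are assumptions made for this example; the bit positions come from the table above.

```python
def msb0_bits(word: int, start: int, end: int, width: int = 32) -> int:
    """Extract bits start..end (inclusive, MSB0 numbering) from a word."""
    nbits = end - start + 1
    return (word >> (width - 1 - end)) & ((1 << nbits) - 1)


def decode_svp64_prefix(prefix: int):
    """Return the 24-bit Remapped Encoding, or None if not an SVP64 prefix.

    RM[0] ends up as the most significant bit of the returned integer.
    """
    if msb0_bits(prefix, 0, 5) != 1:       # PO must be 1 (64-bit instruction)
        return None
    if msb0_bits(prefix, 7, 7) != 1:       # SVP64_7 must be 1
        return None
    if msb0_bits(prefix, 9, 9) != 1:       # SVP64_9 must be 1
        return None
    rm0 = msb0_bits(prefix, 6, 6)          # RM[0]
    rm1 = msb0_bits(prefix, 8, 8)          # RM[1]
    rm_rest = msb0_bits(prefix, 10, 31)    # RM[2:23], 22 bits
    return (rm0 << 23) | (rm1 << 22) | rm_rest


def mask_field(rm: int) -> int:
    """MASK is RM[0:3], i.e. the top four bits of the 24-bit value."""
    return (rm >> 20) & 0xF
```

The two constant bits at positions 7 and 9 correspond to the SVP64 cells of the Prefix Opcode Map: rows `-1----` and columns `---1--` of bits 6:11.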
88
89 # Register Naming
90
91 SV Registers are numbered using the notation `SV[F]R<N>_<M>`, where `<N>` is a decimal integer and `<M>` is a binary integer. Two integers are used so that future register expansions can add more registers by appending more LSB bits to `<M>`.
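
The naming notation can be made concrete with a small parser. This is a sketch only: the function name `parse_sv_reg` and its return shape are hypothetical. `<M>` is kept as a bit string rather than an integer so that its width is preserved, since the notation allows more LSB bits to be appended later.

```python
import re

# SV, optional F for floating-point, R, decimal <N>, underscore, binary <M>
_SV_REG_RE = re.compile(r"SV(F?)R([0-9]+)_([01]+)")


def parse_sv_reg(name: str):
    """Split an SV register name into (is_fp, N, M-as-bit-string)."""
    match = _SV_REG_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not an SV register name: {name!r}")
    is_fp = match.group(1) == "F"
    n = int(match.group(2), 10)
    m_bits = match.group(3)
    return is_fp, n, m_bits
```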
92
93 ## Integer Registers
94
95 Standard PowerISA Integer registers are aliased to some of the SV integer registers:
96
97 | Integer<br/>Register | SV Integer<br/>Register | Integer<br/>Register | SV Integer<br/>Register | Integer<br/>Register | SV Integer<br/>Register | Integer<br/>Register | SV Integer<br/>Register |
98 |----------------------|-------------------------|----------------------|-------------------------|----------------------|-------------------------|----------------------|-------------------------|
99 | R0 | SVR0_00 | R8 | SVR8_00 | R16 | SVR16_00 | R24 | SVR24_00 |
100 | | SVR0_01 | | SVR8_01 | | SVR16_01 | | SVR24_01 |
101 | | SVR0_10 | | SVR8_10 | | SVR16_10 | | SVR24_10 |
102 | | SVR0_11 | | SVR8_11 | | SVR16_11 | | SVR24_11 |
103 | R1 | SVR1_00 | R9 | SVR9_00 | R17 | SVR17_00 | R25 | SVR25_00 |
104 | | SVR1_01 | | SVR9_01 | | SVR17_01 | | SVR25_01 |
105 | | SVR1_10 | | SVR9_10 | | SVR17_10 | | SVR25_10 |
106 | | SVR1_11 | | SVR9_11 | | SVR17_11 | | SVR25_11 |
107 | R2 | SVR2_00 | R10 | SVR10_00 | R18 | SVR18_00 | R26 | SVR26_00 |
108 | | SVR2_01 | | SVR10_01 | | SVR18_01 | | SVR26_01 |
109 | | SVR2_10 | | SVR10_10 | | SVR18_10 | | SVR26_10 |
110 | | SVR2_11 | | SVR10_11 | | SVR18_11 | | SVR26_11 |
111 | R3 | SVR3_00 | R11 | SVR11_00 | R19 | SVR19_00 | R27 | SVR27_00 |
112 | | SVR3_01 | | SVR11_01 | | SVR19_01 | | SVR27_01 |
113 | | SVR3_10 | | SVR11_10 | | SVR19_10 | | SVR27_10 |
114 | | SVR3_11 | | SVR11_11 | | SVR19_11 | | SVR27_11 |
115 | R4 | SVR4_00 | R12 | SVR12_00 | R20 | SVR20_00 | R28 | SVR28_00 |
116 | | SVR4_01 | | SVR12_01 | | SVR20_01 | | SVR28_01 |
117 | | SVR4_10 | | SVR12_10 | | SVR20_10 | | SVR28_10 |
118 | | SVR4_11 | | SVR12_11 | | SVR20_11 | | SVR28_11 |
119 | R5 | SVR5_00 | R13 | SVR13_00 | R21 | SVR21_00 | R29 | SVR29_00 |
120 | | SVR5_01 | | SVR13_01 | | SVR21_01 | | SVR29_01 |
121 | | SVR5_10 | | SVR13_10 | | SVR21_10 | | SVR29_10 |
122 | | SVR5_11 | | SVR13_11 | | SVR21_11 | | SVR29_11 |
123 | R6 | SVR6_00 | R14 | SVR14_00 | R22 | SVR22_00 | R30 | SVR30_00 |
124 | | SVR6_01 | | SVR14_01 | | SVR22_01 | | SVR30_01 |
125 | | SVR6_10 | | SVR14_10 | | SVR22_10 | | SVR30_10 |
126 | | SVR6_11 | | SVR14_11 | | SVR22_11 | | SVR30_11 |
127 | R7 | SVR7_00 | R15 | SVR15_00 | R23 | SVR23_00 | R31 | SVR31_00 |
128 | | SVR7_01 | | SVR15_01 | | SVR23_01 | | SVR31_01 |
129 | | SVR7_10 | | SVR15_10 | | SVR23_10 | | SVR31_10 |
130 | | SVR7_11 | | SVR15_11 | | SVR23_11 | | SVR31_11 |
131
132 ## Floating-Point Registers
133
134 Standard PowerISA floating-point and VSX registers are aliased to some of the SV floating-point registers:
135
136 | FP<br/>Register | VSX Register | SV FP<br/>Register | FP<br/>Register | VSX Register | SV FP<br/>Register |
137 |-----------------|-----------------------|--------------------|-----------------|-----------------------|--------------------|
138 | FPR\[0\] | VSR\[0\]\.dword\[0\] | SVFR0\_00 | FPR\[16\] | VSR\[16\]\.dword\[0\] | SVFR16\_00 |
139 | | VSR\[0\]\.dword\[1\] | SVFR0\_01 | | VSR\[16\]\.dword\[1\] | SVFR16\_01 |
140 | | VSR\[32\]\.dword\[0\] | SVFR0\_10 | | VSR\[48\]\.dword\[0\] | SVFR16\_10 |
141 | | VSR\[32\]\.dword\[1\] | SVFR0\_11 | | VSR\[48\]\.dword\[1\] | SVFR16\_11 |
142 | FPR\[1\] | VSR\[1\]\.dword\[0\] | SVFR1\_00 | FPR\[17\] | VSR\[17\]\.dword\[0\] | SVFR17\_00 |
143 | | VSR\[1\]\.dword\[1\] | SVFR1\_01 | | VSR\[17\]\.dword\[1\] | SVFR17\_01 |
144 | | VSR\[33\]\.dword\[0\] | SVFR1\_10 | | VSR\[49\]\.dword\[0\] | SVFR17\_10 |
145 | | VSR\[33\]\.dword\[1\] | SVFR1\_11 | | VSR\[49\]\.dword\[1\] | SVFR17\_11 |
146 | FPR\[2\] | VSR\[2\]\.dword\[0\] | SVFR2\_00 | FPR\[18\] | VSR\[18\]\.dword\[0\] | SVFR18\_00 |
147 | | VSR\[2\]\.dword\[1\] | SVFR2\_01 | | VSR\[18\]\.dword\[1\] | SVFR18\_01 |
148 | | VSR\[34\]\.dword\[0\] | SVFR2\_10 | | VSR\[50\]\.dword\[0\] | SVFR18\_10 |
149 | | VSR\[34\]\.dword\[1\] | SVFR2\_11 | | VSR\[50\]\.dword\[1\] | SVFR18\_11 |
150 | FPR\[3\] | VSR\[3\]\.dword\[0\] | SVFR3\_00 | FPR\[19\] | VSR\[19\]\.dword\[0\] | SVFR19\_00 |
151 | | VSR\[3\]\.dword\[1\] | SVFR3\_01 | | VSR\[19\]\.dword\[1\] | SVFR19\_01 |
152 | | VSR\[35\]\.dword\[0\] | SVFR3\_10 | | VSR\[51\]\.dword\[0\] | SVFR19\_10 |
153 | | VSR\[35\]\.dword\[1\] | SVFR3\_11 | | VSR\[51\]\.dword\[1\] | SVFR19\_11 |
154 | FPR\[4\] | VSR\[4\]\.dword\[0\] | SVFR4\_00 | FPR\[20\] | VSR\[20\]\.dword\[0\] | SVFR20\_00 |
155 | | VSR\[4\]\.dword\[1\] | SVFR4\_01 | | VSR\[20\]\.dword\[1\] | SVFR20\_01 |
156 | | VSR\[36\]\.dword\[0\] | SVFR4\_10 | | VSR\[52\]\.dword\[0\] | SVFR20\_10 |
157 | | VSR\[36\]\.dword\[1\] | SVFR4\_11 | | VSR\[52\]\.dword\[1\] | SVFR20\_11 |
158 | FPR\[5\] | VSR\[5\]\.dword\[0\] | SVFR5\_00 | FPR\[21\] | VSR\[21\]\.dword\[0\] | SVFR21\_00 |
159 | | VSR\[5\]\.dword\[1\] | SVFR5\_01 | | VSR\[21\]\.dword\[1\] | SVFR21\_01 |
160 | | VSR\[37\]\.dword\[0\] | SVFR5\_10 | | VSR\[53\]\.dword\[0\] | SVFR21\_10 |
161 | | VSR\[37\]\.dword\[1\] | SVFR5\_11 | | VSR\[53\]\.dword\[1\] | SVFR21\_11 |
162 | FPR\[6\] | VSR\[6\]\.dword\[0\] | SVFR6\_00 | FPR\[22\] | VSR\[22\]\.dword\[0\] | SVFR22\_00 |
163 | | VSR\[6\]\.dword\[1\] | SVFR6\_01 | | VSR\[22\]\.dword\[1\] | SVFR22\_01 |
164 | | VSR\[38\]\.dword\[0\] | SVFR6\_10 | | VSR\[54\]\.dword\[0\] | SVFR22\_10 |
165 | | VSR\[38\]\.dword\[1\] | SVFR6\_11 | | VSR\[54\]\.dword\[1\] | SVFR22\_11 |
166 | FPR\[7\] | VSR\[7\]\.dword\[0\] | SVFR7\_00 | FPR\[23\] | VSR\[23\]\.dword\[0\] | SVFR23\_00 |
167 | | VSR\[7\]\.dword\[1\] | SVFR7\_01 | | VSR\[23\]\.dword\[1\] | SVFR23\_01 |
168 | | VSR\[39\]\.dword\[0\] | SVFR7\_10 | | VSR\[55\]\.dword\[0\] | SVFR23\_10 |
169 | | VSR\[39\]\.dword\[1\] | SVFR7\_11 | | VSR\[55\]\.dword\[1\] | SVFR23\_11 |
170 | FPR\[8\] | VSR\[8\]\.dword\[0\] | SVFR8\_00 | FPR\[24\] | VSR\[24\]\.dword\[0\] | SVFR24\_00 |
171 | | VSR\[8\]\.dword\[1\] | SVFR8\_01 | | VSR\[24\]\.dword\[1\] | SVFR24\_01 |
172 | | VSR\[40\]\.dword\[0\] | SVFR8\_10 | | VSR\[56\]\.dword\[0\] | SVFR24\_10 |
173 | | VSR\[40\]\.dword\[1\] | SVFR8\_11 | | VSR\[56\]\.dword\[1\] | SVFR24\_11 |
174 | FPR\[9\] | VSR\[9\]\.dword\[0\] | SVFR9\_00 | FPR\[25\] | VSR\[25\]\.dword\[0\] | SVFR25\_00 |
175 | | VSR\[9\]\.dword\[1\] | SVFR9\_01 | | VSR\[25\]\.dword\[1\] | SVFR25\_01 |
176 | | VSR\[41\]\.dword\[0\] | SVFR9\_10 | | VSR\[57\]\.dword\[0\] | SVFR25\_10 |
177 | | VSR\[41\]\.dword\[1\] | SVFR9\_11 | | VSR\[57\]\.dword\[1\] | SVFR25\_11 |
178 | FPR\[10\] | VSR\[10\]\.dword\[0\] | SVFR10\_00 | FPR\[26\] | VSR\[26\]\.dword\[0\] | SVFR26\_00 |
179 | | VSR\[10\]\.dword\[1\] | SVFR10\_01 | | VSR\[26\]\.dword\[1\] | SVFR26\_01 |
180 | | VSR\[42\]\.dword\[0\] | SVFR10\_10 | | VSR\[58\]\.dword\[0\] | SVFR26\_10 |
181 | | VSR\[42\]\.dword\[1\] | SVFR10\_11 | | VSR\[58\]\.dword\[1\] | SVFR26\_11 |
182 | FPR\[11\] | VSR\[11\]\.dword\[0\] | SVFR11\_00 | FPR\[27\] | VSR\[27\]\.dword\[0\] | SVFR27\_00 |
183 | | VSR\[11\]\.dword\[1\] | SVFR11\_01 | | VSR\[27\]\.dword\[1\] | SVFR27\_01 |
184 | | VSR\[43\]\.dword\[0\] | SVFR11\_10 | | VSR\[59\]\.dword\[0\] | SVFR27\_10 |
185 | | VSR\[43\]\.dword\[1\] | SVFR11\_11 | | VSR\[59\]\.dword\[1\] | SVFR27\_11 |
186 | FPR\[12\] | VSR\[12\]\.dword\[0\] | SVFR12\_00 | FPR\[28\] | VSR\[28\]\.dword\[0\] | SVFR28\_00 |
187 | | VSR\[12\]\.dword\[1\] | SVFR12\_01 | | VSR\[28\]\.dword\[1\] | SVFR28\_01 |
188 | | VSR\[44\]\.dword\[0\] | SVFR12\_10 | | VSR\[60\]\.dword\[0\] | SVFR28\_10 |
189 | | VSR\[44\]\.dword\[1\] | SVFR12\_11 | | VSR\[60\]\.dword\[1\] | SVFR28\_11 |
190 | FPR\[13\] | VSR\[13\]\.dword\[0\] | SVFR13\_00 | FPR\[29\] | VSR\[29\]\.dword\[0\] | SVFR29\_00 |
191 | | VSR\[13\]\.dword\[1\] | SVFR13\_01 | | VSR\[29\]\.dword\[1\] | SVFR29\_01 |
192 | | VSR\[45\]\.dword\[0\] | SVFR13\_10 | | VSR\[61\]\.dword\[0\] | SVFR29\_10 |
193 | | VSR\[45\]\.dword\[1\] | SVFR13\_11 | | VSR\[61\]\.dword\[1\] | SVFR29\_11 |
194 | FPR\[14\] | VSR\[14\]\.dword\[0\] | SVFR14\_00 | FPR\[30\] | VSR\[30\]\.dword\[0\] | SVFR30\_00 |
195 | | VSR\[14\]\.dword\[1\] | SVFR14\_01 | | VSR\[30\]\.dword\[1\] | SVFR30\_01 |
196 | | VSR\[46\]\.dword\[0\] | SVFR14\_10 | | VSR\[62\]\.dword\[0\] | SVFR30\_10 |
197 | | VSR\[46\]\.dword\[1\] | SVFR14\_11 | | VSR\[62\]\.dword\[1\] | SVFR30\_11 |
198 | FPR\[15\] | VSR\[15\]\.dword\[0\] | SVFR15\_00 | FPR\[31\] | VSR\[31\]\.dword\[0\] | SVFR31\_00 |
199 | | VSR\[15\]\.dword\[1\] | SVFR15\_01 | | VSR\[31\]\.dword\[1\] | SVFR31\_01 |
200 | | VSR\[47\]\.dword\[0\] | SVFR15\_10 | | VSR\[63\]\.dword\[0\] | SVFR31\_10 |
201 | | VSR\[47\]\.dword\[1\] | SVFR15\_11 | | VSR\[63\]\.dword\[1\] | SVFR31\_11 |
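
The pattern in the table above is regular: the high bit of `<M>` selects the upper or lower half of the VSX register file, and the low bit selects the doubleword. A minimal sketch of that mapping follows, with the hypothetical function name `svfr_to_vsr` and `<M>` passed as a 2-bit integer:

```python
from typing import Tuple


def svfr_to_vsr(n: int, m: int) -> Tuple[int, int]:
    """Map SVFR<n>_<m> to (VSR index, dword index) as laid out in the table.

    n: 0..31, m: 2-bit value (0b00..0b11).
    """
    if not (0 <= n < 32 and 0 <= m < 4):
        raise ValueError("register number out of range")
    vsr = n + 32 * (m >> 1)   # high bit of <M>: VSR[n] or VSR[n + 32]
    dword = m & 1             # low bit of <M>: dword[0] or dword[1]
    return vsr, dword
```

With this mapping, `FPR[n]` corresponds to `svfr_to_vsr(n, 0b00)`, that is, `VSR[n].dword[0]`.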
202
203
204 # Operation
205
206 ## CR fields as inputs/outputs of vector operations
207
208 When vectorized, CR inputs are read from, and CR outputs are written to, 4-bit
209 CR fields starting at CR6 and incrementing from there. If CR63 is reached, the
210 next CR field used wraps around to CR0, and incrementing continues from there.
211
212 CR6 was chosen as a balance between two goals: avoiding the need to save
213 CR2-CR4 (which are callee-saved) just to use SV vectors with VL <= 61, and
214 keeping the first few CR fields used readily accessible to standard CR
215 instructions and branches. Additionally, CR6 is used as the implicit result of
216 an OpenPower ISA v3.1 standard vector instruction with Rc=1.
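
The wrap-around rule can be written as a one-line calculation. The sketch below assumes 64 CR fields (CR0 to CR63), which is implied by the wrap from CR63 to CR0. The function and parameter names are illustrative only.

```python
def cr_field_for_element(i: int, start: int = 6, num_fields: int = 64) -> int:
    """Return the CR field number used by vector element i.

    Fields are allocated from CR<start> upwards and wrap from the last
    field back to CR0.
    """
    if i < 0:
        raise ValueError("element index must be non-negative")
    return (start + i) % num_fields
```

For instance, element 0 uses CR6, element 57 uses CR63, and element 58 wraps around to CR0.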
217
218 # Register Profiles
219
220 Instructions are broken down by Register Profiles as listed in the following auto-generated page:
221 [[opcode_regs_deduped]]. "non-SV" indicates that the operations with this Register Profile cannot be Vectorised (for example mtspr, bc, dcbz, twi).
222
223 ## LDST-1R-1W-imm
224 TBD
225 ## LDST-1R-2W-imm
226 TBD
227 ## LDST-2R-imm
228 TBD
229 ## LDST-2R-1W
230 TBD
231 ## LDST-2R-1W-imm
232 TBD
233 ## LDST-2R-2W
234 TBD
235 ## LDST-3R
236 TBD
237 ## LDST-3R-CRo
238 TBD
239 ## LDST-3R-1W
240 TBD
241 ## CRi
242 non-SV
243 ## CRio
244 TBD
245 ## CR=2R1W
246 TBD
247 ## 1W
248 non-SV
249 ## 1W-CRi
250 TBD
251 ## 1R
252 non-SV
253 ## 1R-imm
254 non-SV
255 ## 1R-CRo
256 TBD
257 ## 1R-CRio
258 TBD
259 ## 1R-1W
260 TBD
261 ## 1R-1W-imm
262 TBD
263 ## 1R-1W-CRo
264 TBD
265 ## 1R-1W-CRo
266 TBD
267 ## 1R-1W-CRio
268 TBD
269 ## 2R
270 non-SV
271 ## 2R-CRo
272 TBD
273 ## 2R-CRio
274 TBD
275 ## 2R-1W
276 TBD
277 ## 2R-1W-CRo
278 TBD
279 ## 2R-1W-CRo
280 TBD
281 ## 2R-1W-CRi
282 TBD
283 ## 2R-1W-CRio
284 TBD
285 ## 3R-1W-CRio
286
287 Remapped Encoding Fields:
288
289 | |
290 |--|
291 | |