# RFC ls007 Ternary/Binary GPR and CR Field bit-operations

**URLs**:

* <https://libre-soc.org/openpower/sv/bitmanip/>
* <https://libre-soc.org/openpower/sv/rfc/ls007/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1017>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Severity**: Major

**Status**: New

**Date**: 20 Oct 2022

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Instructions added

* `ternlogi` -- Ternary Logic Immediate
* `crternlogi` -- Condition Register Ternary Logic Immediate
* `binlog` -- Dynamic Binary Logic
* `crbinlog` -- Condition Register Dynamic Binary Logic

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of two new GPR-based instructions
* Addition of two new CR-field-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
GPR, CR-Field, bit-manipulation, ternary, binary, dynamic, look-up-table (LUT), FPGA
```

**Motivation**

* `ternlogi` is similar to the existing `and`/`or`/`xor`/etc. instructions, but
  allows any arbitrary 3-input 1-output bitwise operation. This can be used to
  combine several instructions into one: for example, `A ^ (~B & (C | A))`
  becomes a single instruction, as does the bitwise MUX `(A & B) | (~A & C)`.
* `binlog` is like `ternlogi` except that it supports any arbitrary 2-input
  1-output bitwise operation, where the operation can be selected dynamically
  at runtime. It operates similarly to a programmable LUT in an FPGA.
* `crternlogi` is like `ternlogi` except that it works with CR Fields instead of GPRs.
* `crbinlog` is like `binlog` except that it works with CR Fields instead of GPRs.
  Likewise, it is similar to a programmable LUT in an FPGA.
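
As a rough illustration of how a 3-input expression collapses into an 8-entry look-up table, the sketch below tabulates the two example expressions from the list above. The helper name `truth_table` is hypothetical, and entry `j` is simply the output for inputs `j = A*4 + B*2 + C`; how those eight entries map onto the bits of an instruction immediate depends on the operand order and bit numbering of the instruction, which this sketch does not assume.

```python
from itertools import product


def truth_table(f):
    """Return the 8 outputs of f(a, b, c); entry j is for inputs j = a*4 + b*2 + c."""
    return [f(a, b, c) & 1 for a, b, c in product((0, 1), repeat=3)]


# The two example expressions from the text, evaluated on single bits.
combined = truth_table(lambda a, b, c: a ^ (~b & (c | a)))
mux = truth_table(lambda a, b, c: (a & b) | (~a & c))

print("A ^ (~B & (C | A)) :", combined)
print("(A & B) | (~A & C) :", mux)
```

Any expression of three bitwise inputs reduces to such a table, which is why a single table-driven instruction can stand in for a whole sequence of `and`/`or`/`xor` operations.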

**Notes and Observations**:

* `ternlogi` is like the existing `xxeval` instruction, except that it operates on
  GPRs instead of VSRs and does not require VSX/VMX. The SFS and SFFS Compliancy
  Subsets, which cannot rely on `xxeval`, are less powerful as a result.
* `crternlogi` is similar to the group of CR Operations (`crand`, `cror`, etc.) which have
  been identified as a Binary Lookup Group, except that an 8-bit
  immediate is used instead of a 4-bit one, and up to 4 bits of a CR Field may
  be computed at once, saving 3 CR operations.
* `crbinlog` is similar to the Binary Lookup Group of CR Operations except that the
  4-bit lookup table comes from a CR Field instead of from an immediate. Also,
  like `crternlogi`, up to 4 bits may be computed at once.

**Changes**

Add the following entries to:

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# CRB-FORM

Add the following section to Book I 1.6.1

```
|0    |6    |9    |12   |15   |18   |21   |29   |31   |
| PO  | BF  | BFA | BFB | BFC | msk | TLI | XO  | msk |
```

# TLI-FORM

Add the following section to Book I 1.6.1

```
|0    |6    |11   |16   |21   |29   |31   |
| PO  | RT  | RA  | RB  | TLI | XO  | Rc  |
```
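
To make the TLI-FORM layout concrete, here is a minimal sketch that packs the seven fields into a 32-bit word, with bit 0 as the most significant bit as in the diagram above. The function name `pack_tli_form` is hypothetical, and the RFC does not assign values for `PO` or `XO`, so the zeros used in the usage line are placeholders only.

```python
def pack_tli_form(po, rt, ra, rb, tli, xo, rc):
    """Pack TLI-form fields into a 32-bit word (bit 0 is the most significant)."""
    # (value, width) pairs in left-to-right order of the form diagram:
    # PO 0:5, RT 6:10, RA 11:15, RB 16:20, TLI 21:28, XO 29:30, Rc 31
    fields = [(po, 6), (rt, 5), (ra, 5), (rb, 5), (tli, 8), (xo, 2), (rc, 1)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


# Placeholder PO=0 and XO=0: the real opcode values are not given in this RFC.
print(hex(pack_tli_form(po=0, rt=3, ra=4, rb=6, tli=0b11011000, xo=0, rc=0)))
```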

# VA-FORM

Add the following entry to VA-FORM in Book I 1.6.1.12

```
|0    |6    |11   |16   |21   |26   |27   |
| PO  | RT  | RA  | RB  | RC  | nh  | XO  |
```

# Word Instruction Fields

Add the following to Book I 1.6.2

```
msk (9:10,14:15)
    Field used by crternlogi to decide which CR bits to modify.
    Formats: CRB

nh (26)
    Nibble High. Field used by binlog to decide if the look-up-table should
    be taken from bits 60:63 or 56:59 of RC.
    Formats: VA

TLI (21:28)
    Field used by the ternlogi instruction as the
    look-up table.
    Formats: TLI
TLI (21:25,19:20,31)
    Field used by the crternlogi instruction as the
    look-up table.
    Formats: CRB

XO (29:30)
    Extended opcode field.
    Formats: TLI
XO (26:30)
    Extended opcode field.
    Formats: CRB
```

* Add `TLI` to the `Formats:` list of all of `RA`, `RB`, `RT`, and `Rc`.
* Add `CRB` to the `Formats:` list of all of `BF`, `BFA`, `BFB`, and `BFC`.
* Add `TLI` to the `Formats:` list of `XO (29:30)`.
* Add `CRB` to the `Formats:` list of `XO (26:31)`.
* Add `VA` to the `Formats:` list of `XO (27:31)`.

----------

\newpage{}

# Ternary Logic Immediate

TLI-form

Add this section to Book I 3.3.13

* `ternlogi RT, RA, RB, TLI` (`Rc=0`)
* `ternlogi. RT, RA, RB, TLI` (`Rc=1`)

| 0-5 | 6-10 | 11-15 | 16-20 | 21-28 | 29-30 | 31 | Form     |
|-----|------|-------|-------|-------|-------|----|----------|
| PO  | RT   | RA    | RB    | TLI   | XO    | Rc | TLI-Form |

Pseudocode:

```
result <- (~RT&~RA&~RB & TLI[0]*XLEN) |
          (~RT&~RA& RB & TLI[1]*XLEN) |
          (~RT& RA&~RB & TLI[2]*XLEN) |
          (~RT& RA& RB & TLI[3]*XLEN) |
          ( RT&~RA&~RB & TLI[4]*XLEN) |
          ( RT&~RA& RB & TLI[5]*XLEN) |
          ( RT& RA&~RB & TLI[6]*XLEN) |
          ( RT& RA& RB & TLI[7]*XLEN)
RT <- result
```

For each integer value i, 0 to XLEN-1, do the following.

Let j be the value of the concatenation of the
contents of bit i of RT, bit i of RA, bit i of RB.
The value of bit j of TLI is placed into bit i of RT.

See Table 145, "xxeval(A, B, C, TLI) Equivalent
Functions," on page 968 for the equivalent function
evaluated by this instruction for any given value of TLI.
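
The per-bit description above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: registers are plain integers, `XLEN` defaults to 64, and bit `j` of `TLI` is taken in MSB0 numbering exactly as this section words it. Other pseudocode in this RFC indexes its look-up table in LSB0 order, so the bit numbering here should be treated as an assumption drawn from this section alone.

```python
XLEN = 64


def ternlogi(rt, ra, rb, tli, xlen=XLEN):
    """Model of the prose above: bit j of TLI (MSB0) is placed into bit i of RT."""
    result = 0
    for i in range(xlen):
        shift = xlen - 1 - i  # MSB0 bit i corresponds to this shift amount
        # j = bit i of RT || bit i of RA || bit i of RB
        j = (((rt >> shift) & 1) << 2) | (((ra >> shift) & 1) << 1) | ((rb >> shift) & 1)
        bit = (tli >> (7 - j)) & 1  # bit j of the 8-bit TLI, MSB0 numbering (assumed)
        result |= bit << shift
    return result


# Under this numbering, TLI = 0b01101001 selects j = 1, 2, 4, 7: a three-way XOR.
print(hex(ternlogi(0x0F0F, 0x3333, 0x5555, 0b01101001)))
```

Note that `RT` is read as a third source operand before being overwritten, which is what lets a three-operand instruction express a three-input function.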

Special registers altered:

```
CR0                    (if Rc=1)
```

----------

\newpage{}

# Condition Register Ternary Logic Immediate

CRB-form

Add this section to Book I 2.5.1

* `crternlogi BF, BFA, BFB, BFC, TLI, msk`

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-20 | 21-28 | 29-30 | 31  | Form     |
|-----|-----|------|-------|-------|-------|-------|-------|-----|----------|
| PO  | BF  | BFA  | BFB   | BFC   | msk   | TLI   | XO    | msk | CRB-Form |

Pseudocode:

```
a <- CR[4*BFA+32:4*BFA+35]
b <- CR[4*BFB+32:4*BFB+35]
c <- CR[4*BFC+32:4*BFC+35]
do i = 0 to 3
    idx <- a[i] || b[i] || c[i]  # compute index from current bits
    result <- TLI[7 - idx]       # subtract from 7 to index in LSB0 order
    if msk[i] = 1 then
        CR[4*BF+32+i] <- result
```
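
A minimal Python model of the pseudocode above may help when checking immediates by hand. It assumes the CR is held as a 32-bit integer whose most significant bit is CR bit 32, and that `msk` is passed as a 4-bit integer whose most significant bit is `msk[0]`; both representations, and the helper name `cr_field`, are choices made for this sketch rather than anything the RFC specifies.

```python
def cr_field(cr, n):
    """Return the 4 bits of CR field n as a list; element 0 is CR bit 4*n+32."""
    nibble = (cr >> (28 - 4 * n)) & 0xF
    return [(nibble >> (3 - i)) & 1 for i in range(4)]


def crternlogi(cr, bf, bfa, bfb, bfc, tli, msk):
    """Return the new 32-bit CR value after crternlogi BF,BFA,BFB,BFC,TLI,msk."""
    # Read all three source fields before any destination bit is written.
    a, b, c = cr_field(cr, bfa), cr_field(cr, bfb), cr_field(cr, bfc)
    for i in range(4):
        idx = (a[i] << 2) | (b[i] << 1) | c[i]
        result = (tli >> idx) & 1  # TLI[7 - idx] in MSB0 is bit idx in LSB0
        if (msk >> (3 - i)) & 1:   # msk[i], with msk[0] as the MSB (assumed)
            pos = 31 - (4 * bf + i)  # shift amount of CR bit 4*BF+32+i
            cr = (cr & ~(1 << pos)) | (result << pos)
    return cr


# CR1=0b1100, CR2=0b1010, CR3=0b1111; TLI=0x80 is a three-way AND into CR0.
print(hex(crternlogi(0x0CAF0000, 0, 1, 2, 3, 0x80, 0b1111)))
```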

Special registers altered:

```
CR field BF
```

----------

\newpage{}

# Dynamic Binary Logic

VA-form

Add this section to Book I 3.3.13

* `binlog RT, RA, RB, RC, nh`

| 0-5 | 6-10 | 11-15 | 16-20 | 21-25 | 26 | 27-31 | Form    |
|-----|------|-------|-------|-------|----|-------|---------|
| PO  | RT   | RA    | RB    | RC    | nh | XO    | VA-Form |

Pseudocode:

```
if nh = 1 then
    lut <- (RC)[56:59]
else
    lut <- (RC)[60:63]
do i = 0 to 63
    idx <- (RB)[i] || (RA)[i]  # compute index from current bits
    result[i] <- lut[3 - idx]  # subtract from 3 to index in LSB0 order
RT <- result
```
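
The pseudocode above translates fairly directly into Python. In this sketch registers are 64-bit unsigned integers, so `(RC)[60:63]` is the low nibble and `(RC)[56:59]` is the next nibble up; the function returns the new value of `RT` rather than writing a register file.

```python
MASK64 = (1 << 64) - 1


def binlog(ra, rb, rc, nh):
    """Return the value binlog would place in RT, for 64-bit integer operands."""
    # (RC)[56:59] is bits 7..4 in LSB0 terms; (RC)[60:63] is bits 3..0.
    lut = (rc >> 4) & 0xF if nh else rc & 0xF
    result = 0
    for i in range(64):
        idx = (((rb >> i) & 1) << 1) | ((ra >> i) & 1)  # (RB)[i] || (RA)[i]
        result |= ((lut >> idx) & 1) << i               # lut[3 - idx] in MSB0
    return result & MASK64


# The same two inputs under three run-time selected tables: AND, XOR, RA & ~RB.
for table in (0b1000, 0b0110, 0b0010):
    print(bin(table), bin(binlog(0b1100, 0b1010, table, 0)))
```

Because the table comes from a register, the operation applied to `RA` and `RB` can change from one execution to the next without modifying the instruction stream.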

Special registers altered:

```
None
```

**Programming Note**:

Dynamic Ternary Logic may be emulated by an appropriate combination of `binlog` and
`ternlogi`, using the `nh` (Nibble High) operand to select the first and second nibble:

```
# compute r3 = ternlog(r4, r5, r6, table=r7)
# compute the values for when r6[i] = 0:
binlog r3, r4, r5, r7, 0  # takes look-up-table from the lowest 4 bits
# compute the values for when r6[i] = 1:
binlog r4, r4, r5, r7, 1  # takes look-up-table from the second-lowest 4 bits
# mux the two results together: r3 = (r3 & ~r6) | (r4 & r6)
ternlogi r3, r4, r6, 0b11011000
```
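
The sequence in the Programming Note can be sanity-checked against a direct 8-entry lookup. The sketch below is an assumption-laden model, not a definitive implementation: `binlog` follows the pseudocode of this section, the function names `dynamic_ternlog_reference` and `dynamic_ternlog_emulated` are hypothetical, and the final merge is written as the boolean expression from the note's own comment instead of going through a `ternlogi` immediate.

```python
MASK64 = (1 << 64) - 1


def binlog(ra, rb, rc, nh):
    """binlog as in the pseudocode: a 4-bit table taken from the low byte of RC."""
    lut = (rc >> 4) & 0xF if nh else rc & 0xF
    result = 0
    for i in range(64):
        idx = (((rb >> i) & 1) << 1) | ((ra >> i) & 1)
        result |= ((lut >> idx) & 1) << i
    return result


def dynamic_ternlog_reference(r4, r5, r6, table):
    """Direct lookup: index r6[i] || r5[i] || r4[i] into the 8-bit table (LSB0)."""
    result = 0
    for i in range(64):
        idx = (((r6 >> i) & 1) << 2) | (((r5 >> i) & 1) << 1) | ((r4 >> i) & 1)
        result |= ((table >> idx) & 1) << i
    return result


def dynamic_ternlog_emulated(r4, r5, r6, r7):
    """Two binlog steps followed by the mux from the Programming Note's comment."""
    lo = binlog(r4, r5, r7, 0)  # values for when r6[i] = 0
    hi = binlog(r4, r5, r7, 1)  # values for when r6[i] = 1
    return (lo & ~r6 & MASK64) | (hi & r6)


# 0xAA, 0xCC, 0xF0 cover all eight input combinations in their low 8 bits.
mismatches = [t for t in range(256)
              if dynamic_ternlog_emulated(0xAA, 0xCC, 0xF0, t)
              != dynamic_ternlog_reference(0xAA, 0xCC, 0xF0, t)]
print("tables that disagree:", mismatches)
```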

----------

\newpage{}

# Condition Register Dynamic Binary Logic

With ternary (LUT3) dynamic instructions being very costly,
and CR Fields being only 4 bits wide, a binary (LUT2) variant, `crbinlog`, is the better choice.

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-21 | 22-30     | 31 |
|-----|-----|------|-------|-------|-------|-----------|----|
| NN  | BT  | BA   | BB    | BC    | m0-m3 | 000101110 | 0  |

    mask = m0..m3
    for i in range(4):
        a, b = CRs[BA][i], CRs[BB][i]
        if mask[i]: CRs[BT][i] = lut2(CRs[BC], a, b)
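
The pseudocode above leaves `lut2` undefined. The sketch below fills it in by assuming the same indexing as the `binlog` pseudocode (index `b || a`, table read in LSB0 order); that choice, the list-of-bits representation of CR Fields, and returning an updated copy instead of modifying in place are all assumptions of this example.

```python
def crbinlog(crs, bt, ba, bb, bc, mask):
    """Return a copy of crs with field BT updated.

    crs is a list of CR fields, each a list of 4 bits (element 0 first);
    mask is a list of 4 bits, m0..m3.
    """
    lut = crs[bc]
    # Assumed lut2 convention: treat the 4 bits of CR field BC as a 4-bit value.
    lut_value = (lut[0] << 3) | (lut[1] << 2) | (lut[2] << 1) | lut[3]
    new = [list(field) for field in crs]
    for i in range(4):
        a, b = crs[ba][i], crs[bb][i]
        if mask[i]:
            idx = (b << 1) | a  # same index order as binlog (assumed)
            new[bt][i] = (lut_value >> idx) & 1
    return new


crs = [[0, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
crs += [[0, 0, 0, 0] for _ in range(4)]
# CR3 holds 0b1000, which under the assumed convention is a bitwise AND.
print(crbinlog(crs, 0, 1, 2, 3, [1, 1, 1, 1])[0])
```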

When SVP64 Vectorised, any of the 4 operands may be Scalar or
Vector, including `BC`, meaning that multiple different dynamic
lookups may be performed with a single instruction.

*Programmer's note: just as with `binlog` and `ternlogi`, a pair
of `crbinlog` instructions followed by a merging `crternlogi` may
be deployed to synthesise dynamic ternary (LUT3) CR Field
manipulation.*

----------

\newpage{}

----------

# Appendices

* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

|Form| Book | Page | Version | mnemonic | Description             |
|----|------|------|---------|----------|-------------------------|
|TLI | I    | #    | 3.2B    | ternlogi | Ternary Logic Immediate |

----------------

[[!tag opf_rfc]]