# RFC ls007 Ternary/Binary GPR and CR Field bit-operations

**URLs**:

* <https://libre-soc.org/openpower/sv/bitmanip/>
* <https://libre-soc.org/openpower/sv/rfc/ls007/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1017>
* <https://git.openpower.foundation/isa/PowerISA/issues/117>

**Severity**: Major

**Status**: new

**Date**: 20 Oct 2022, 1st draft submitted 2023mar22

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Instructions added

* `ternlogi` -- GPR Ternary Logic Immediate
* `crternlogi` -- Condition Register Field Ternary Logic Immediate
* `binlog` -- GPR Dynamic Binary Logic
* `crbinlog` -- Condition Register Field Dynamic Binary Logic

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of two new GPR-based instructions
* Addition of two new CR-field-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
GPR, CR-Field, bit-manipulation, ternary, binary, dynamic, look-up-table
(LUT), FPGA, JIT
```

**Motivation**

* `ternlogi` is similar to the existing `and`/`or`/`xor`/etc. instructions, but
  allows any arbitrary 3-input 1-output bitwise operation. This can be used to
  combine several instructions into one. E.g. `A ^ (~B & (C | A))` can become
  one instruction. It can also provide a single-instruction
  bitwise MUX `(A & B) | (~A & C)`.
* `binlog` is like `ternlogi` except it supports any arbitrary 2-input
  1-output bitwise operation, where the operation can be selected dynamically
  at runtime. This operates similarly to a Programmable LUT in an FPGA.
* `crternlogi` is like `ternlogi` except it works with CR Fields instead of GPRs.
* `crbinlog` is like `binlog` except it works with CR Fields instead of GPRs. Likewise it
  is similar to a Programmable LUT in an FPGA.
* Combined, these instructions reduce instruction count and also help accelerate
  AI and JIT runtimes.
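
As an illustration of the first point, the following Python sketch (not part of the RFC; `truth_table` and `apply_lut3` are hypothetical helper names) derives the 8-entry truth table of an arbitrary 3-input bitwise function and then applies that table across 64-bit values, which is what a single ternary-logic instruction would do in place of several `and`/`or`/`xor` instructions. The table is kept as a Python list indexed by `j = 4*a + 2*b + c`, so no assumption is made here about how it would be packed into an immediate.

```python
MASK64 = (1 << 64) - 1

def truth_table(f):
    """Evaluate a 3-input single-bit function for all 8 input combinations.

    Entry j of the result corresponds to inputs a, b, c with j = 4*a + 2*b + c.
    """
    return [f((j >> 2) & 1, (j >> 1) & 1, j & 1) & 1 for j in range(8)]

def apply_lut3(table, a, b, c):
    """Apply an 8-entry truth table bitwise across three 64-bit values."""
    result = 0
    for j, bit in enumerate(table):
        if not bit:
            continue
        # Select each input or its complement according to the bits of j.
        ta = a if j & 4 else ~a
        tb = b if j & 2 else ~b
        tc = c if j & 1 else ~c
        result |= ta & tb & tc
    return result & MASK64

# The two expressions quoted in the Motivation text.
mux_table = truth_table(lambda a, b, c: (a & b) | (~a & c))
fused_table = truth_table(lambda a, b, c: a ^ (~b & (c | a)))
```

Both expressions reduce to a single table lookup per bit, regardless of how many two-input operations the original expression contained.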

**Notes and Observations**:

* `ternlogi` is like the existing `xxeval` instruction, except that it operates on GPRs instead
  of VSRs and does not require VSX/VMX. SFS and SFFS are comparatively compromised.
* `crternlogi` is similar to the group of CR Operations (`crand`, `cror` etc.) which have
  been identified as a Binary Lookup Group, except an 8-bit
  immediate is used instead of a 4-bit one, and up to 4 bits of a CR Field may
  be computed at once, saving at least 3 groups of CR operations.
* `crbinlog` is similar to the Binary Lookup Group of CR Operations except that the
  4-bit lookup table comes from a CR Field instead of from an Immediate. Also,
  like `crternlogi`, up to 4 bits may be computed at once.

**Changes**

Add the following entries to:

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# GPR Ternary Logic Immediate

Add this section to Book I 3.3.13

TLI-form

* `ternlogi RT, RA, RB, TLI` (`Rc=0`)
* `ternlogi. RT, RA, RB, TLI` (`Rc=1`)

| 0-5 | 6-10 | 11-15 | 16-20 | 21-28 | 29-30 | 31 | Form     |
|-----|------|-------|-------|-------|-------|----|----------|
| PO  | RT   | RA    | RB    | TLI   | XO    | Rc | TLI-Form |

Pseudocode:

```
result <- (~RT & ~RA & ~RB & TLI[0]*64) | # 64 copies of TLI[0]
          (~RT & ~RA &  RB & TLI[1]*64) | # ...
          (~RT &  RA & ~RB & TLI[2]*64) |
          (~RT &  RA &  RB & TLI[3]*64) |
          ( RT & ~RA & ~RB & TLI[4]*64) |
          ( RT & ~RA &  RB & TLI[5]*64) |
          ( RT &  RA & ~RB & TLI[6]*64) | # ...
          ( RT &  RA &  RB & TLI[7]*64)   # 64 copies of TLI[7]
RT <- result
```

For each integer value i, 0 to 63, do the following.

Let j be the value of the concatenation of the
contents of bit i of RT, bit i of RA, bit i of RB.
The value of bit j of TLI is placed into bit i of RT.

See Table 145, "xxeval(A, B, C, TLI) Equivalent
Functions," on page 968 for the equivalent function
evaluated by this instruction for any given value of TLI.

*Programmer's Note: this is a Read-Modify-Write instruction on RT.*

Special registers altered:

```
CR0 (if Rc=1)
```
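
The per-bit description above can be modelled in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `ternlogi` mirrors the mnemonic for readability only, CR0 update for `Rc=1` is not modelled, and it assumes that `TLI[j]` follows Power ISA MSB0 numbering, so that `TLI[0]` is the most significant bit of the 8-bit immediate. If `TLI` is instead meant to be indexed from the least significant bit, `7 - j` becomes `j` and every immediate is bit-reversed.

```python
MASK64 = (1 << 64) - 1

def ternlogi(rt, ra, rb, tli):
    """Return the new RT value for 64-bit inputs and an 8-bit immediate.

    ASSUMPTION: TLI[j] is counted MSB0, i.e. TLI[0] is the top bit of tli.
    """
    result = 0
    for i in range(64):
        # j is the 3-bit value formed from bit i of RT, RA and RB.
        j = (((rt >> i) & 1) << 2) | (((ra >> i) & 1) << 1) | ((rb >> i) & 1)
        bit = (tli >> (7 - j)) & 1
        result |= bit << i
    return result
```

Under this assumption `0b00000001` is a three-way AND, `0b01101001` is a three-way XOR, and a mux that keeps RT where RB is 0 and takes RA where RB is 1 is `0b00011011`. With least-significant-bit-first indexing the same mux would be written `0b11011000`.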

----------

\newpage{}

# Condition Register Ternary Logic Immediate

Add this section to Book I 2.5.1

CRB-form

* `crternlogi BF, BFA, BFB, TLI, msk`

| 0-5 | 6-8 | 9-10 | 11-13 | 14-15 | 16-18 | 19-25 | 26-30 | 31  | Form     |
|-----|-----|------|-------|-------|-------|-------|-------|-----|----------|
| PO  | BF  | msk  | BFA   | msk   | BFB   | TLI   | XO    | TLI | CRB-Form |

Pseudocode:

```
a <- CR[4*BF+32:4*BF+35]
b <- CR[4*BFA+32:4*BFA+35]
c <- CR[4*BFB+32:4*BFB+35]
ternary <- (~a & ~b & ~c & TLI[0]*4) | # 4 copies of TLI[0]
           (~a & ~b &  c & TLI[1]*4) | # 4 copies of TLI[1]
           (~a &  b & ~c & TLI[2]*4) | # ...
           (~a &  b &  c & TLI[3]*4) |
           ( a & ~b & ~c & TLI[4]*4) |
           ( a & ~b &  c & TLI[5]*4) |
           ( a &  b & ~c & TLI[6]*4) | # ...
           ( a &  b &  c & TLI[7]*4)   # 4 copies of TLI[7]
do i = 0 to 3
    if msk[i] = 1 then
        CR[4*BF+32+i] <- ternary[i]
```

For each integer value i, 0 to 3, do the following.

Let j be the value of the concatenation of the
contents of bit i of CR Field BF, bit i of CR Field BFA,
bit i of CR Field BFB.

If bit i of msk is set to 1 then the value of bit j of TLI
is placed into bit i of CR Field BF.

Otherwise, if bit i of msk is a zero then bit i of
CR Field BF is unchanged.

See Table 145, "xxeval(A, B, C, TLI) Equivalent
Functions," on page 968 for the equivalent function
evaluated by this instruction for any given value of TLI.

If `msk` is zero an Illegal Instruction trap is raised.

*Programmer's Note: this instruction is a "masked" overwrite on CR
Field BF. For each bit set in `msk` a Write is performed
but for each bit clear in `msk` the corresponding bit of BF is
preserved. Overall this makes `crternlogi` a conditionally
Read-Modify-Write instruction on CR Field BF*

Special registers altered:

```
CR field BF
```
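
The masked update can be sketched in Python as follows. This is an illustrative model only: the function takes and returns 4-bit CR Field *values* rather than field numbers, it assumes MSB0 numbering for `TLI`, `msk` and the bits within a CR Field, and it models the Illegal Instruction trap with a Python exception.

```python
def crternlogi(bf, bfa, bfb, tli, msk):
    """Return the new 4-bit value of CR Field BF.

    bf, bfa, bfb are the current 4-bit contents of the three CR Fields.
    ASSUMPTION: bit i of a field, of msk, and TLI[j] are all counted MSB0.
    """
    if msk == 0:
        raise ValueError("msk = 0: Illegal Instruction")
    out = bf
    for i in range(4):
        shift = 3 - i  # position of MSB0 bit i inside a 4-bit value
        if not (msk >> shift) & 1:
            continue  # bit i of BF is preserved
        j = ((((bf >> shift) & 1) << 2)
             | (((bfa >> shift) & 1) << 1)
             | ((bfb >> shift) & 1))
        bit = (tli >> (7 - j)) & 1
        out = (out & ~(1 << shift)) | (bit << shift)
    return out
```

With `msk = 0b1111` all four bits of BF are recomputed; with a partial mask only the selected bits change, which is the "masked overwrite" behaviour described in the Programmer's Note.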

----------

\newpage{}

# GPR Dynamic Binary Logic

Add this section to Book I 3.3.13

VA-form

* `binlog RT, RA, RB, RC, nh`

| 0-5 | 6-10 | 11-15 | 16-20 | 21-25 | 26 | 27-31 | Form    |
|-----|------|-------|-------|-------|----|-------|---------|
| PO  | RT   | RA    | RB    | RC    | nh | XO    | VA-Form |

Pseudocode:

```
if nh = 1 then lut <- (RC)[56:59]
else           lut <- (RC)[60:63]
result <- (~RA & ~RB & lut[0]*64) |
          (~RA &  RB & lut[1]*64) |
          ( RA & ~RB & lut[2]*64) |
          ( RA &  RB & lut[3]*64)
RT <- result
```

For each integer value i, 0 to 63, do the following.

If nh contains a 0, let lut be the four LSBs of RC
(bits 60 to 63). Otherwise let lut be the next
four LSBs of RC (bits 56 to 59).

Let j be the value of the concatenation of the
contents of bit i of RA with bit i of RB.

The value of bit j of lut is placed into bit i of RT.

Special registers altered:

```
None
```
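
A minimal Python sketch of the lookup follows, under the assumption that `lut[j]` is counted MSB0 within the selected nibble, so that `lut[0]` is the nibble's most significant bit. The function name mirrors the mnemonic purely for readability and is not a reference implementation.

```python
def binlog(ra, rb, rc, nh):
    """Return the new RT value: a 2-input LUT applied bitwise to RA and RB.

    ASSUMPTION: lut[j] is MSB0 within the 4-bit nibble taken from RC.
    """
    # (RC)[60:63] is the lowest nibble, (RC)[56:59] the next one up.
    lut = (rc >> 4) & 0xF if nh else rc & 0xF
    result = 0
    for i in range(64):
        j = (((ra >> i) & 1) << 1) | ((rb >> i) & 1)
        result |= ((lut >> (3 - j)) & 1) << i
    return result
```

Because the table comes from a register rather than an immediate, the same instruction can compute AND (`0b0001`), XOR (`0b0110`), OR (`0b0111`) or any other 2-input function chosen at runtime.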

**Programmer's Note**:

Dynamic (non-immediate-based) Ternary Logic, suitable for FPGA-style LUT3
dynamic lookups and for JIT runtime acceleration, may be emulated by
appropriate combination of `binlog` and `ternlogi`, using the `nh`
(next half) operand to select first and second nibble:

```
# compute r3 = ternlog(r4, r5, r6, table=r7)
# compute the values for when r6[i] = 0:
binlog r3, r4, r5, r7, 0 # takes look-up-table from LSB 4 bits
# compute the values for when r6[i] = 1:
binlog r4, r4, r5, r7, 1 # takes look-up-table from second-to-LSB 4 bits
# mux the two results together: r3 = (r3 & ~r6) | (r4 & r6)
ternlogi r3, r4, r6, 0b11011000
```
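
The decomposition used in this note (two 2-input lookups selected by a third input) can be checked with a short Python sketch. It is an illustration under stated assumptions rather than a model of the exact instruction encodings: `lut2`, `mux`, `dynamic_ternlog` and `reference_lut3` are hypothetical helper names, the 8-bit table holds the "third input = 0" nibble in its low four bits and the "third input = 1" nibble in the next four, and bits within each nibble are indexed MSB0.

```python
MASK64 = (1 << 64) - 1

def lut2(a, b, lut):
    """Bitwise 2-input lookup; lut[j] is MSB0 within the nibble (assumed)."""
    result = 0
    for i in range(64):
        j = (((a >> i) & 1) << 1) | ((b >> i) & 1)
        result |= ((lut >> (3 - j)) & 1) << i
    return result

def mux(sel, when0, when1):
    """Per-bit select: when0 where sel is 0, when1 where sel is 1."""
    return ((when0 & ~sel) | (when1 & sel)) & MASK64

def dynamic_ternlog(a, b, c, table):
    """Two 2-input lookups followed by a mux on c, as in the note above."""
    low = lut2(a, b, table & 0xF)          # corresponds to nh=0
    high = lut2(a, b, (table >> 4) & 0xF)  # corresponds to nh=1
    return mux(c, low, high)

def reference_lut3(a, b, c, table):
    """Direct per-bit 3-input lookup, used only to check the decomposition."""
    result = 0
    for i in range(64):
        nibble = (table >> 4) & 0xF if (c >> i) & 1 else table & 0xF
        j = (((a >> i) & 1) << 1) | ((b >> i) & 1)
        result |= ((nibble >> (3 - j)) & 1) << i
    return result
```

The two functions agree for every 8-bit table, which is why three instructions suffice to emulate a runtime-selected 3-input lookup.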

----------

\newpage{}

# Condition Register Field Dynamic Binary Logic

Add this section to Book I 2.5.1

CRB-form

| 0-5 | 6-8 | 9-10 | 11-13 | 14-15 | 16-18 | 19-25 | 26-30 | 31 | Form     |
|-----|-----|------|-------|-------|-------|-------|-------|----|----------|
| PO  | BF  | msk  | BFA   | msk   | BFB   | //    | XO    | // | CRB-Form |

Pseudocode:

```
a   <- CR[4*BF+32:4*BF+35]
b   <- CR[4*BFA+32:4*BFA+35]
lut <- CR[4*BFB+32:4*BFB+35]
binary <- (~a & ~b & lut[0]*4) |
          (~a &  b & lut[1]*4) |
          ( a & ~b & lut[2]*4) |
          ( a &  b & lut[3]*4)
do i = 0 to 3
    if msk[i] = 1 then
        CR[4*BF+32+i] <- binary[i]
```

For each integer value i, 0 to 3, do the following.

Let j be the value of the concatenation of the
contents of bit i of CR Field BF with bit i of CR Field BFA.

If bit i of msk is set to 1 then the value of bit j of
CR Field BFB is placed into bit i of CR Field BF.

Otherwise, if bit i of msk is a zero then bit i of
CR Field BF is unchanged.

If `msk` is zero an Illegal Instruction trap is raised.

Special registers altered:

```
CR field BF
```

*Programmer's Note: just as with `binlog` and `ternlogi`, a pair
of `crbinlog` instructions followed by a merging `crternlogi` may
be deployed to synthesise dynamic ternary (LUT3) CR Field
manipulation*

*Programmer's Note: this instruction is a "masked" overwrite on CR
Field BF. For each bit set in `msk` a Write is performed
but for each bit clear in `msk` the corresponding bit of BF is
preserved. Overall this makes `crbinlog` a conditionally
Read-Modify-Write instruction on CR Field BF*
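
For illustration, the behaviour described above can be sketched in Python. As with the other sketches this is not a reference implementation: the function operates on 4-bit CR Field *values* rather than field numbers, assumes MSB0 numbering for `msk`, for the bits within a CR Field and for the lookup table held in BFB, and models the Illegal Instruction trap with a Python exception.

```python
def crbinlog(bf, bfa, bfb, msk):
    """Return the new 4-bit value of CR Field BF.

    bf and bfa are the two inputs; bfb supplies the 4-entry lookup table.
    ASSUMPTION: all bit indices (fields, msk, lut) are counted MSB0.
    """
    if msk == 0:
        raise ValueError("msk = 0: Illegal Instruction")
    out = bf
    for i in range(4):
        shift = 3 - i  # position of MSB0 bit i inside a 4-bit value
        if not (msk >> shift) & 1:
            continue  # bit i of BF is preserved
        j = (((bf >> shift) & 1) << 1) | ((bfa >> shift) & 1)
        bit = (bfb >> (3 - j)) & 1
        out = (out & ~(1 << shift)) | (bit << shift)
    return out
```

For example, loading `0b0110` into the table field makes the instruction an XOR of BF and BFA, restricted to the bits selected by `msk`.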

----------

\newpage{}

# CRB-FORM

Add the following section to Book I 1.6.1

```
|0   |6   |9    |11   |14   |16   |19   |26   |31   |
| PO | BF | msk | BFA | msk | BFB | TLI | XO  | TLI |
| PO | BF | msk | BFA | msk | BFB | //  | XO  | /   |
```

# TLI-FORM

Add the following section to Book I 1.6.1

```
|0   |6   |11  |16  |21   |29  |31  |
| PO | RT | RA | RB | TLI | XO | Rc |
```

# VA-FORM

Add the following entry to VA-FORM in Book I 1.6.1.12

```
|0   |6   |11  |16  |21  |26  |27  |
| PO | RT | RA | RB | RC | nh | XO |
```

# Word Instruction Fields

Add the following to Book I 1.6.2

```
msk (9:10,14:15)
    Field used by crternlogi and crbinlog to decide which CR Field bits to
    modify.
    Formats: CRB
nh (26)
    Nibble High. Field used by binlog to decide if the look-up-table should
    be taken from bits 60:63 (nh=0) or 56:59 (nh=1) of RC.
    Formats: VA
TLI (21:28)
    Field used by the ternlogi instruction as the
    look-up table.
    Formats: TLI
TLI (21:25,19:20,31)
    Field used by the crternlogi instruction as the
    look-up table.
    Formats: CRB
```

* Add `TLI` to the `Formats:` list of all of `RA`, `RB`, `RT`, and `Rc`.
* Add `CRB` to the `Formats:` list of all of `BF`, `BFA`, and `BFB`.
* Add `TLI` to the `Formats:` list of `XO (29:30)`.
* Add `CRB` to the `Formats:` list of `XO (26:30)`.
* Add `VA` to the `Formats:` list of `XO (27:31)`.
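
Since `msk` and the CRB-form `TLI` are split across non-adjacent bits of the instruction word, a decoder has to gather and concatenate them. The Python sketch below shows one way this might be done. It assumes MSB0 bit numbering of the 32-bit word and that the ranges are concatenated in the order listed, most significant first; `bits`, `concat` and `decode_crb` are hypothetical helper names, not part of any existing tool.

```python
def bits(insn, start, end):
    """Extract MSB0 bits start..end (inclusive) of a 32-bit word."""
    width = end - start + 1
    return (insn >> (31 - end)) & ((1 << width) - 1)

def concat(*parts):
    """Concatenate (value, width) pairs, first pair most significant."""
    value = 0
    for part, width in parts:
        value = (value << width) | part
    return value

def decode_crb(insn):
    """Split a CRB-form word into its operand fields (PO and XO omitted)."""
    return {
        "BF": bits(insn, 6, 8),
        "BFA": bits(insn, 11, 13),
        "BFB": bits(insn, 16, 18),
        # msk (9:10,14:15)
        "msk": concat((bits(insn, 9, 10), 2), (bits(insn, 14, 15), 2)),
        # TLI (21:25,19:20,31), ASSUMED to be concatenated in listed order
        "TLI": concat((bits(insn, 21, 25), 5),
                      (bits(insn, 19, 20), 2),
                      (bits(insn, 31, 31), 1)),
    }
```

For `crbinlog` the `TLI` positions are reserved, so a decoder would ignore that entry.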

----------

# Appendices

* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

|Form| Book | Page | Version | mnemonic   | Description                      |
|----|------|------|---------|------------|----------------------------------|
|TLI | I    | #    | 3.2B    | ternlogi   | GPR Ternary Logic Immediate      |
|VA  | I    | #    | 3.2B    | binlog     | GPR Dynamic Binary Logic         |
|CRB | I    | #    | 3.2B    | crternlogi | CR Field Ternary Logic Immediate |
|CRB | I    | #    | 3.2B    | crbinlog   | CR Field Dynamic Binary Logic    |

----------------

[[!tag opf_rfc]]
403
404 [[!tag opf_rfc]]