# RFC ls007 Ternary/Binary GPR and CR Field bit-operations

**URLs**:

* <https://libre-soc.org/openpower/sv/bitmanip/>
* <https://libre-soc.org/openpower/sv/rfc/ls007/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1017>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Severity**: Major

**Status**: New

**Date**: 20 Oct 2022

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Instructions added

* `ternlogi` -- Ternary Logic Immediate
* `crternlogi` -- Condition Register Ternary Logic Immediate
* `binlog` -- Dynamic Binary Logic
* `crbinlog` -- Condition Register Dynamic Binary Logic

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of two new GPR-based instructions
* Addition of two new CR-field-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
GPR, CR-Field, bit-manipulation, ternary, binary, dynamic, look-up-table (LUT), FPGA
```

**Motivation**

* `ternlogi` is similar to the existing `and`/`or`/`xor`/etc. instructions, but
  allows any arbitrary 3-input 1-output bitwise operation. This can be used to
  combine several instructions into one. E.g. `A ^ (~B & (C | A))` can become
  one instruction. This can also be used to have one instruction for
  bitwise MUX `(A & B) | (~A & C)`.
* `binlog` is like `ternlogi` except it supports any arbitrary 2-input
  1-output bitwise operation, where the operation can be selected dynamically
  at runtime. This operates similarly to a Programmable LUT in an FPGA.
* `crternlogi` is like `ternlogi` except it works with CR Fields instead of GPRs.
* `crbinlog` is like `binlog` except it works with CR Fields instead of GPRs.
  Likewise it is similar to a Programmable LUT in an FPGA.

**Notes and Observations**:

* `ternlogi` is like the existing `xxeval` instruction, except that it operates on
  GPRs instead of VSRs and doesn't require VSX/VMX.
* `crternlogi` is similar to the group of CR Operations (`crand`, `cror` etc.) which have
  been identified as a Binary Lookup Group, except that an 8-bit
  immediate is used instead of a 4-bit one, and up to 4 bits of a CR Field may
  be computed at once, saving 3 CR operations.
* `crbinlog` is similar to the Binary Lookup Group of CR Operations except that the
  4-bit lookup table comes from a CR Field instead of from an Immediate. Also
  like `crternlogi` up to 4 bits may be computed at once.

**Changes**

Add the following entries to:

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# CRB-FORM

Add the following section to Book I 1.6.1

```
|0   |6   |9    |12   |15   |18   |21   |29  |31   |
| PO | BF | BFA | BFB | BFC | msk | TLI | XO | msk |
```

# TLI-FORM

Add the following section to Book I 1.6.1

```
|0   |6   |11  |16  |21   |29  |31  |
| PO | RT | RA | RB | TLI | XO | Rc |
```
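As a rough illustration of the TLI-form layout, the following Python sketch packs the fields into a 32-bit word using the MSB0 bit positions from the table above. The helper name `encode_tli_form` is hypothetical, and the `po` and `xo` values in the example call are placeholders only: this RFC does not state the opcode assignments.

```python
def encode_tli_form(po, rt, ra, rb, tli, xo, rc):
    """Pack TLI-form fields into a 32-bit word (bit 0 is the MSB)."""
    fields = [  # (value, first bit, last bit) in MSB0 numbering
        (po, 0, 5), (rt, 6, 10), (ra, 11, 15), (rb, 16, 20),
        (tli, 21, 28), (xo, 29, 30), (rc, 31, 31),
    ]
    word = 0
    for value, first, last in fields:
        width = last - first + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in bits {first}:{last}")
        word |= value << (31 - last)  # convert MSB0 position to a left shift
    return word

# po=5 and xo=0 are placeholder values, NOT real opcode assignments.
word = encode_tli_form(po=5, rt=3, ra=4, rb=6, tli=0b11011000, xo=0, rc=0)
print(f"{word:#010x}")
```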

# VA-FORM

Add the following entry to VA-FORM in Book I 1.6.1.12

```
|0   |6   |11  |16  |21  |26  |27  |
| PO | RT | RA | RB | RC | nh | XO |
```

# Word Instruction Fields

Add the following to Book I 1.6.2

```
msk (18:20, 31)
    Field used by crternlogi to decide which CR bits to modify.
    Formats: CRB
```

```
nh (26)
    Nibble High. Field used by binlog to decide if the look-up-table should
    be taken from bits 60:63 (nh=0) or 56:59 (nh=1) of RC.
    Formats: VA
```

```
TLI (21:28)
    Field used by the ternlogi and crternlogi instructions as the
    look-up table.
    Formats: TLI, CRB
```
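One way to derive a `TLI` value for a given 3-input function is to enumerate all eight input combinations, as in the Python sketch below. It follows the `ternlogi` pseudocode, where the index is `(RT)[i] || (RA)[i] || (RB)[i]` and `TLI[7 - idx]` in MSB0 order is bit `idx` counting from the least significant bit. The helper `tli_for` is hypothetical, and the mapping of the Motivation section's `A`, `B`, `C` onto `RT`, `RA`, `RB` is an assumption made only for this example.

```python
def tli_for(f):
    """Build the 8-bit TLI immediate for a 1-bit function f(rt, ra, rb)."""
    tli = 0
    for idx in range(8):
        rt, ra, rb = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        if f(rt, ra, rb) & 1:
            tli |= 1 << idx  # LSB0 bit idx is TLI[7 - idx] in MSB0 order
    return tli

# Bitwise MUX (A & B) | (~A & C), assuming A=RB, B=RA, C=RT.
mux = tli_for(lambda rt, ra, rb: (rb & ra) | (~rb & rt))
# A ^ (~B & (C | A)), assuming A=RT, B=RA, C=RB.
fused = tli_for(lambda rt, ra, rb: rt ^ (~ra & (rb | rt)))

print(f"{mux:#010b}")    # 0b11011000
print(f"{fused:#010b}")  # 0b11000010
```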

```
XO (29:30)
    Extended opcode field.
    Formats: TLI, CRB
```

* Add `TLI` to the `Formats:` list of all of `RA`, `RB`, `RT`, and `Rc`.
* Add `CRB` to the `Formats:` list of all of `BF`, `BFA`, `BFB`, and `BFC`.
* Add `VA` to the `Formats:` list of `XO (27:31)`.

----------

\newpage{}

# Ternary Logic Immediate

TLI-form

Add this section to Book I 3.3.13

* `ternlogi RT, RA, RB, TLI` (`Rc=0`)
* `ternlogi. RT, RA, RB, TLI` (`Rc=1`)

| 0-5 | 6-10 | 11-15 | 16-20 | 21-28 | 29-30 | 31 | Form     |
|-----|------|-------|-------|-------|-------|----|----------|
| PO  | RT   | RA    | RB    | TLI   | XO    | Rc | TLI-Form |

Pseudocode:

```
result <- [0] * 64
do i = 0 to 63
    idx <- (RT)[i] || (RA)[i] || (RB)[i]  # compute index from current bits
    result[i] <- TLI[7 - idx]             # subtract from 7 to index in LSB0 order
RT <- result
```

Special registers altered:

```
CR0    (if Rc=1)
```
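The pseudocode can be modelled in a few lines of Python, which may help when checking a chosen `TLI` value before committing it to assembly. This is only a behavioural sketch on plain 64-bit integers: it does not model the `Rc=1` update of CR0, and the function name simply mirrors the mnemonic. Note that `RT` is read as the first input as well as being written. Because every bit position is computed independently, the model is free to iterate from the least significant bit.

```python
MASK64 = (1 << 64) - 1

def ternlogi(rt, ra, rb, tli):
    """Behavioural model of ternlogi; returns the new value of RT."""
    result = 0
    for i in range(64):
        idx = (((rt >> i) & 1) << 2) | (((ra >> i) & 1) << 1) | ((rb >> i) & 1)
        # TLI[7 - idx] in MSB0 order is bit idx counting from the LSB
        result |= ((tli >> idx) & 1) << i
    return result & MASK64

rt = 0x0123456789ABCDEF
ra = 0xFEDCBA9876543210
rb = 0x00000000FFFFFFFF

# 0b11011000 selects RA where RB is 1 and RT where RB is 0 (bitwise MUX)
print(hex(ternlogi(rt, ra, rb, 0b11011000)))  # 0x123456776543210
# 0x96 is the 3-input XOR table
print(ternlogi(rt, ra, rb, 0x96) == rt ^ ra ^ rb)  # True
```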

----------

\newpage{}

# Condition Register Ternary Logic Immediate

CRB-form

Add this section to Book I 2.5.1

* `crternlogi BF, BFA, BFB, BFC, TLI, msk`

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-20 | 21-28 | 29-30 | 31  | Form     |
|-----|-----|------|-------|-------|-------|-------|-------|-----|----------|
| PO  | BF  | BFA  | BFB   | BFC   | msk   | TLI   | XO    | msk | CRB-Form |

Pseudocode:

```
a <- CR[4*BFA+32:4*BFA+35]
b <- CR[4*BFB+32:4*BFB+35]
c <- CR[4*BFC+32:4*BFC+35]
do i = 0 to 3
    idx <- a[i] || b[i] || c[i]  # compute index from current bits
    result <- TLI[7 - idx]       # subtract from 7 to index in LSB0 order
    if msk[i] = 1 then
        CR[4*BF+32+i] <- result
```

Special registers altered:

```
CR field BF
```
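A small Python model can make the effect of `msk` easier to see. In this sketch the CR is represented as a list of eight fields, each a list of four bits in the same `[0..3]` order the pseudocode uses, and `msk` is likewise a list of four bits. That representation is an assumption chosen to sidestep the question of how the split `msk` field (bits 18:20 and 31) is assembled, which this sketch does not model.

```python
def crternlogi(cr, bf, bfa, bfb, bfc, tli, msk):
    """Behavioural model of crternlogi; returns a new CR, leaving cr untouched."""
    a, b, c = cr[bfa], cr[bfb], cr[bfc]  # sources are read before any write
    new_field = list(cr[bf])
    for i in range(4):
        idx = (a[i] << 2) | (b[i] << 1) | c[i]
        if msk[i]:
            # TLI[7 - idx] in MSB0 order is bit idx counting from the LSB
            new_field[i] = (tli >> idx) & 1
    out = [list(field) for field in cr]
    out[bf] = new_field
    return out

cr = [[0, 0, 0, 0] for _ in range(8)]
cr[1] = [1, 1, 0, 0]
cr[2] = [1, 0, 1, 0]
cr[3] = [1, 0, 0, 1]

# 0x80 is the 3-input AND table; all four bits of CR field 0 are updated
print(crternlogi(cr, 0, 1, 2, 3, 0x80, [1, 1, 1, 1])[0])  # [1, 0, 0, 0]
# 0xFE is the 3-input OR table; only bits 0 and 2 are updated
print(crternlogi(cr, 0, 1, 2, 3, 0xFE, [1, 0, 1, 0])[0])  # [1, 0, 1, 0]
```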

----------

\newpage{}

# Dynamic Binary Logic

VA-form

Add this section to Book I 3.3.13

* `binlog RT, RA, RB, RC, nh`

| 0-5 | 6-10 | 11-15 | 16-20 | 21-25 | 26 | 27-31 | Form    |
|-----|------|-------|-------|-------|----|-------|---------|
| PO  | RT   | RA    | RB    | RC    | nh | XO    | VA-Form |

Pseudocode:

```
if nh = 1 then
    lut <- (RC)[56:59]
else
    lut <- (RC)[60:63]
result <- [0] * 64
do i = 0 to 63
    idx <- (RB)[i] || (RA)[i]  # compute index from current bits
    result[i] <- lut[3 - idx]  # subtract from 3 to index in LSB0 order
RT <- result
```

Special registers altered:

```
None
```
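For experimentation, the pseudocode can be mirrored in Python as below. This is a behavioural sketch on plain integers, with the function name borrowed from the mnemonic. It relies on `(RC)[60:63]` being the least significant four bits of a 64-bit register in MSB0 numbering, and `(RC)[56:59]` being the next four. The example packs two tables into one register: AND (`0b1000`) in the low nibble and XOR (`0b0110`) in the high nibble.

```python
def binlog(ra, rb, rc, nh):
    """Behavioural model of binlog; returns the new value of RT."""
    lut = (rc >> 4) & 0xF if nh else rc & 0xF  # (RC)[56:59] or (RC)[60:63]
    result = 0
    for i in range(64):
        idx = (((rb >> i) & 1) << 1) | ((ra >> i) & 1)  # (RB)[i] || (RA)[i]
        # lut[3 - idx] in MSB0 order is bit idx counting from the LSB
        result |= ((lut >> idx) & 1) << i
    return result

a = 0x0123456789ABCDEF
b = 0xFEDCBA9876543210
tables = 0x68  # high nibble 0b0110 (XOR), low nibble 0b1000 (AND)

print(binlog(a, b, tables, 0) == a & b)  # True
print(binlog(a, b, tables, 1) == a ^ b)  # True
```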

**Programming Note**:

Dynamic Ternary Logic may be emulated by an appropriate combination of `binlog` and `ternlogi`,
using the `nh` (Nibble High) operand to select the first and second nibble:

```
# compute r3 = ternlog(r4, r5, r6, table=r7)
# compute the values for when r6[i] = 0:
binlog r3, r4, r5, r7, 0 # takes look-up-table from the LSB 4 bits (60:63)
# compute the values for when r6[i] = 1 (this overwrites r4):
binlog r4, r4, r5, r7, 1 # takes look-up-table from the next 4 bits (56:59)
# mux the two results together: r3 = (r3 & ~r6) | (r4 & r6)
ternlogi r3, r4, r6, 0b11011000
```
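The three-instruction sequence can be checked against a direct 8-entry lookup with a short Python sketch. The bit-parallel helper `lut_eval` is hypothetical, and `ternlogi` and `binlog` here are compact behavioural models written for this check rather than anything defined by the RFC. Working through the sequence, the per-bit index into the low 8 bits of `r7` comes out as `r6[i] || r5[i] || r4[i]`, and the loop compares that against the emulation for all 256 tables. The three input patterns are chosen so that every one of the eight index values occurs.

```python
MASK64 = (1 << 64) - 1

def lut_eval(table, inputs):
    """Bit-parallel LUT; inputs[0] supplies the most significant index bit."""
    n = len(inputs)
    result = 0
    for idx in range(1 << n):
        if (table >> idx) & 1:
            term = MASK64
            for k, x in enumerate(inputs):
                bit = (idx >> (n - 1 - k)) & 1
                term &= x if bit else ~x & MASK64
            result |= term
    return result

def ternlogi(rt, ra, rb, tli):
    return lut_eval(tli, [rt, ra, rb])

def binlog(ra, rb, rc, nh):
    return lut_eval((rc >> 4 if nh else rc) & 0xF, [rb, ra])

def dynamic_ternlog(r4, r5, r6, r7):
    r3 = binlog(r4, r5, r7, 0)               # binlog r3, r4, r5, r7, 0
    r4 = binlog(r4, r5, r7, 1)               # binlog r4, r4, r5, r7, 1
    return ternlogi(r3, r4, r6, 0b11011000)  # ternlogi r3, r4, r6, 0b11011000

r4, r5, r6 = 0xAAAAAAAAAAAAAAAA, 0xCCCCCCCCCCCCCCCC, 0xF0F0F0F0F0F0F0F0
ok = all(dynamic_ternlog(r4, r5, r6, t) == lut_eval(t, [r6, r5, r4])
         for t in range(256))
print(ok)  # True
```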

----------

\newpage{}

## crbinlog

With ternary (LUT3) dynamic instructions being very costly,
and CR Fields being only 4 bits wide, a binary (LUT2) variant is the better fit.

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-21 | 22-30     | 31 |
| --- | --- | ---- | ----- | ----- | ----- | --------- | -- |
| NN  | BT  | BA   | BB    | BC    | m0-m3 | 000101110 | 0  |

    mask = m0..m3
    for i in range(4):
        a, b = CRs[BA][i], CRs[BB][i]
        if mask[i]:
            CRs[BT][i] = lut2(CRs[BC], a, b)

When SVP64 Vectorised any of the 4 operands may be Scalar or
Vector, including `BC` meaning that multiple different dynamic
lookups may be performed with a single instruction.

*Programmer's note: just as with binlog and ternlogi, a pair
of crbinlog instructions followed by a merging crternlogi may
be deployed to synthesise dynamic ternary (LUT3) CR Field
manipulation*

----------

\newpage{}

----------

# Appendices

* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

|Form| Book | Page | Version | mnemonic | Description |
|----|------|------|---------|----------|-------------|
|TLI | I    | #    | 3.2B    | ternlogi | Ternary Logic Immediate |

----------------

[[!tag opf_rfc]]