# RFC ls007 Ternary/Binary GPR and CR Field bit-operations

**URLs**:

* <https://libre-soc.org/openpower/sv/bitmanip/>
* <https://libre-soc.org/openpower/sv/rfc/ls007/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1017>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Severity**: Major

**Status**: New

**Date**: 20 Oct 2022

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**: **UPDATE**

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

**Summary**

Instructions added

* `ternlogi` -- Ternary Logic Immediate
* `crternlogi` -- Condition Register Ternary Logic Immediate
* `binlog` -- Dynamic Binary Logic
* `crbinlog` -- Condition Register Dynamic Binary Logic

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of two new GPR-based instructions
* Addition of two new CR-field-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,
  and related tools.

**Keywords**:

```
GPR, CR-Field, bit-manipulation, ternary, binary, dynamic, look-up-table (LUT), FPGA
```

**Motivation**

* `ternlogi` is similar to the existing `and`/`or`/`xor`/etc. instructions, but
  allows any arbitrary 3-input 1-output bitwise operation. This can be used to
  combine several instructions into one: for example, `A ^ (~B & (C | A))` becomes
  a single instruction, as does the bitwise MUX `(A & B) | (~A & C)`.
* `binlog` is like `ternlogi` except that it supports any arbitrary 2-input
  1-output bitwise operation, where the operation is selected dynamically
  at runtime. It operates similarly to a programmable LUT in an FPGA.
* `crternlogi` is like `ternlogi` except that it operates on CR Fields instead of GPRs.
* `crbinlog` is like `binlog` except that it operates on CR Fields instead of GPRs.
  Likewise it is similar to a programmable LUT in an FPGA.

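As a rough illustration of how such expressions collapse into one immediate, the sketch below derives the 8-bit look-up table for a 3-input bit function. It assumes the index convention of the `ternlogi` pseudocode in this RFC (the `RT`, `RA` and `RB` bits concatenated, `RT` most significant) and an arbitrary operand mapping of A=`RT`, B=`RA`, C=`RB`. The helper name `tli_immediate` is hypothetical and not part of the proposal.

```python
def tli_immediate(f):
    """Build the 8-bit look-up table for a 3-input bit function f(a, b, c)."""
    tli = 0
    for idx in range(8):
        # idx is the 3-bit value a || b || c, with a as the most significant bit
        a, b, c = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        # table bit idx (LSB0 numbering) holds the function result for this input
        tli |= (f(a, b, c) & 1) << idx
    return tli


# The two expressions mentioned in the Motivation, under the assumed mapping
combined = tli_immediate(lambda a, b, c: a ^ (~b & (c | a)))
mux = tli_immediate(lambda a, b, c: (a & b) | (~a & c))
print(f"{combined:#010b} {mux:#010b}")
```
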
**Notes and Observations**:

* `ternlogi` is like the existing `xxeval` instruction, except that it operates on GPRs
  instead of VSRs and does not require VSX/VMX. Because `xxeval` requires VSX, the SFS
  and SFFS Compliancy Subsets cannot use it and are therefore less powerful.
* `crternlogi` is similar to the group of CR Operations (`crand`, `cror`, etc.) which have
  been identified as a Binary Lookup Group, except that an 8-bit
  immediate is used instead of a 4-bit one, and up to 4 bits of a CR Field may
  be computed at once, saving 3 CR operations.
* `crbinlog` is similar to the Binary Lookup Group of CR Operations except that the
  4-bit lookup table comes from a CR Field instead of from an Immediate. Also,
  like `crternlogi`, up to 4 bits may be computed at once.

**Changes**

Add the following entries to:

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Book I 1.6.1 and 1.6.2

----------------

\newpage{}

# CRB-FORM

Add the following section to Book I 1.6.1

```
|0    |6    |9    |12   |15   |18   |21   |29   |31   |
| PO  | BF  | BFA | BFB | BFC | msk | TLI | XO  | msk |
```

# TLI-FORM

Add the following section to Book I 1.6.1

```
|0    |6    |11   |16   |21   |29   |31   |
| PO  | RT  | RA  | RB  | TLI | XO  | Rc  |
```

# VA-FORM

Add the following entry to VA-FORM in Book I 1.6.1.12

```
|0    |6    |11   |16   |21   |26   |27   |
| PO  | RT  | RA  | RB  | RC  | nh  | XO  |
```

# Word Instruction Fields

Add the following to Book I 1.6.2

```
msk (18:20,31)
    Field used by crternlogi to decide which CR bits to modify.
    Formats: CRB

nh (26)
    Nibble High. Field used by binlog to decide if the look-up-table should
    be taken from bits 60:63 (nh=0) or 56:59 (nh=1) of RC.
    Formats: VA

TLI (21:28)
    Field used by the ternlogi and crternlogi instructions as the
    look-up table.
    Formats: CRB, TLI

XO (29:30)
    Extended opcode field.
    Formats: CRB, TLI
```

* Add `TLI` to the `Formats:` list of all of `RA`, `RB`, `RT`, and `Rc`.
* Add `CRB` to the `Formats:` list of all of `BF`, `BFA`, `BFB`, and `BFC`.
* Add `CRB` and `TLI` to the `Formats:` list of `XO (29:30)`.
* Add `VA` to the `Formats:` list of `XO (27:31)`.

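To make the MSB0 bit numbering of the forms concrete, here is a minimal sketch that packs TLI-form fields into a 32-bit instruction word. The function names are hypothetical, and because this RFC does not assign `PO` or `XO` values, the example passes zero for both purely as placeholders.

```python
def insert_field(word, start, end, value):
    """Place value into bits start..end (inclusive, MSB0 numbering) of a 32-bit word."""
    width = end - start + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value} does not fit in {width} bits")
    shift = 31 - end  # MSB0 bit 31 is the least significant bit
    mask = ((1 << width) - 1) << shift
    return (word & ~mask) | (value << shift)


def extract_field(word, start, end):
    """Read bits start..end (inclusive, MSB0 numbering) of a 32-bit word."""
    width = end - start + 1
    return (word >> (31 - end)) & ((1 << width) - 1)


def encode_tli_form(po, rt, ra, rb, tli, xo, rc):
    """Pack TLI-form fields; po and xo are caller-supplied placeholders."""
    layout = [(0, 5, po), (6, 10, rt), (11, 15, ra), (16, 20, rb),
              (21, 28, tli), (29, 30, xo), (31, 31, rc)]
    word = 0
    for start, end, value in layout:
        word = insert_field(word, start, end, value)
    return word


# ternlogi. r3, r4, r6, 0b11011000 with placeholder PO=0 and XO=0
word = encode_tli_form(po=0, rt=3, ra=4, rb=6, tli=0b11011000, xo=0, rc=1)
print(f"{word:#010x}", extract_field(word, 21, 28) == 0b11011000)
```
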
----------

\newpage{}

# Ternary Logic Immediate

TLI-form

Add this section to Book I 3.3.13

* `ternlogi RT, RA, RB, TLI` (`Rc=0`)
* `ternlogi. RT, RA, RB, TLI` (`Rc=1`)

| 0-5 | 6-10 | 11-15 | 16-20 | 21-28 | 29-30 | 31 | Form     |
|-----|------|-------|-------|-------|-------|----|----------|
| PO  | RT   | RA    | RB    | TLI   | XO    | Rc | TLI-Form |

Pseudocode:

```
result <- [0] * 64
do i = 0 to 63
    idx <- (RT)[i] || (RA)[i] || (RB)[i]  # compute index from current bits
    result[i] <- TLI[7 - idx]             # subtract from 7 to index in LSB0 order
RT <- result
```

Special registers altered:

```
CR0                    (if Rc=1)
```

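The pseudocode above can be modelled in a few lines of Python. This is only a sketch: registers are plain 64-bit integers, the `Rc=1` update of CR0 is not modelled, and the function name simply mirrors the mnemonic. Since every bit lane is independent, the loop order (MSB0 or LSB0) does not affect the result, and `TLI[7 - idx]` in MSB0 numbering is bit `idx` of the immediate counted from the least significant end.

```python
MASK64 = (1 << 64) - 1


def ternlogi(rt, ra, rb, tli):
    """Model of ternlogi on 64-bit integers; returns the new value of RT."""
    result = 0
    for i in range(64):
        # 3-bit index: RT bit is most significant, RB bit least significant
        idx = (((rt >> i) & 1) << 2) | (((ra >> i) & 1) << 1) | ((rb >> i) & 1)
        result |= ((tli >> idx) & 1) << i
    return result


rt, ra, rb = 0x0123456789ABCDEF, 0xFEDCBA9876543210, 0x00FF00FF00FF00FF
# 0b11011000 selects RA where RB is 1 and the old RT where RB is 0
muxed = ternlogi(rt, ra, rb, 0b11011000)
print(muxed == ((rt & ~rb) | (ra & rb)) & MASK64)
# 0x96 is the three-way XOR table
print(ternlogi(rt, ra, rb, 0x96) == rt ^ ra ^ rb)
```
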
----------

\newpage{}

# Condition Register Ternary Logic Immediate

CRB-form

Add this section to Book I 2.5.1

* `crternlogi BF, BFA, BFB, BFC, TLI, msk`

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-20 | 21-28 | 29-30 | 31  | Form     |
|-----|-----|------|-------|-------|-------|-------|-------|-----|----------|
| PO  | BF  | BFA  | BFB   | BFC   | msk   | TLI   | XO    | msk | CRB-Form |

Pseudocode:

```
a <- CR[4*BFA+32:4*BFA+35]
b <- CR[4*BFB+32:4*BFB+35]
c <- CR[4*BFC+32:4*BFC+35]
do i = 0 to 3
    idx <- a[i] || b[i] || c[i]  # compute index from current bits
    result <- TLI[7 - idx]       # subtract from 7 to index in LSB0 order
    if msk[i] = 1 then
        CR[4*BF+32+i] <- result
```

Special registers altered:

```
CR field BF
```

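A small Python model may help when checking mask behaviour. It is a sketch under stated assumptions: the CR is represented as a list of eight fields, each a list of four bits in MSB0 order, and `msk` is taken as a 4-bit value whose most significant bit is `msk[0]` (how the split `msk` field bits are concatenated is assumed here, not specified above).

```python
def crternlogi(cr, bf, bfa, bfb, bfc, tli, msk):
    """Return a new CR (list of 8 four-bit fields) after applying crternlogi."""
    # Read all three source fields first, as the pseudocode does
    a, b, c = list(cr[bfa]), list(cr[bfb]), list(cr[bfc])
    new_cr = [list(field) for field in cr]
    for i in range(4):
        idx = (a[i] << 2) | (b[i] << 1) | c[i]
        result = (tli >> idx) & 1      # TLI[7 - idx] in MSB0 numbering
        if (msk >> (3 - i)) & 1:       # msk[i] in MSB0 numbering
            new_cr[bf][i] = result
    return new_cr


cr = [[0, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0]]
cr += [[0, 0, 0, 0] for _ in range(4)]
# 0x80 is the three-input AND table; only the first two bits of CR0 are written
out = crternlogi(cr, bf=0, bfa=1, bfb=2, bfc=3, tli=0x80, msk=0b1100)
print(out[0])
```
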
----------

\newpage{}

# Dynamic Binary Logic

VA-form

Add this section to Book I 3.3.13

* `binlog RT, RA, RB, RC, nh`

| 0-5 | 6-10 | 11-15 | 16-20 | 21-25 | 26 | 27-31 | Form    |
|-----|------|-------|-------|-------|----|-------|---------|
| PO  | RT   | RA    | RB    | RC    | nh | XO    | VA-Form |

Pseudocode:

```
if nh = 1 then
    lut <- (RC)[56:59]
else
    lut <- (RC)[60:63]
result <- [0] * 64
do i = 0 to 63
    idx <- (RB)[i] || (RA)[i]  # compute index from current bits
    result[i] <- lut[3 - idx]  # subtract from 3 to index in LSB0 order
RT <- result
```

Special registers altered:

```
None
```

**Programming Note**:

Dynamic Ternary Logic may be emulated by an appropriate combination of `binlog` and `ternlogi`,
using the `nh` (nibble high) operand to select the first and second nibble:

```
# compute r3 = ternlog(r4, r5, r6, table=r7)
# compute the values for when r6[i] = 0:
binlog r3, r4, r5, r7, 0  # takes look-up-table from LSB 4 bits
# compute the values for when r6[i] = 1 (note: overwrites r4):
binlog r4, r4, r5, r7, 1  # takes look-up-table from second-to-LSB 4 bits
# mux the two results together: r3 = (r3 & ~r6) | (r4 & r6)
ternlogi r3, r4, r6, 0b11011000
```

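The three-instruction sequence in the Programming Note can be checked with a short Python model. This is a sketch, not a reference implementation: registers are plain 64-bit integers, `binlog` and `ternlogi` follow the pseudocode in this RFC, and `reference_lut3` is a hypothetical direct 3-input lookup that assumes the 8-bit table is indexed by the `r6`, `r5` and `r4` bits in that order, most significant first.

```python
def bit(x, i):
    return (x >> i) & 1


def binlog(ra, rb, rc, nh):
    """Model of binlog: 2-input lookup with the table taken from RC."""
    lut = (rc >> 4) & 0xF if nh else rc & 0xF  # bits 56:59 or 60:63 (MSB0)
    result = 0
    for i in range(64):
        idx = (bit(rb, i) << 1) | bit(ra, i)
        result |= bit(lut, idx) << i           # lut[3 - idx] in MSB0 numbering
    return result


def ternlogi(rt, ra, rb, tli):
    """Model of ternlogi: 3-input lookup with an 8-bit immediate table."""
    result = 0
    for i in range(64):
        idx = (bit(rt, i) << 2) | (bit(ra, i) << 1) | bit(rb, i)
        result |= bit(tli, idx) << i
    return result


def dynamic_ternlog(r4, r5, r6, r7):
    """The sequence from the Programming Note."""
    r3 = binlog(r4, r5, r7, 0)               # results for r6[i] = 0
    r4 = binlog(r4, r5, r7, 1)               # results for r6[i] = 1
    return ternlogi(r3, r4, r6, 0b11011000)  # mux on r6


def reference_lut3(a, b, c, table):
    """Hypothetical direct lookup: index is c || b || a."""
    result = 0
    for i in range(64):
        idx = (bit(c, i) << 2) | (bit(b, i) << 1) | bit(a, i)
        result |= bit(table, idx) << i
    return result


# These three patterns together exercise all eight input combinations
a, b, c = 0xAAAAAAAAAAAAAAAA, 0xCCCCCCCCCCCCCCCC, 0xF0F0F0F0F0F0F0F0
print(all(dynamic_ternlog(a, b, c, t) == reference_lut3(a, b, c, t)
          for t in range(256)))
```
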
----------

\newpage{}

## crbinlog

With ternary (LUT3) dynamic instructions being very costly,
and CR Fields being only 4 bits wide, a binary (LUT2) variant is preferable.

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-21 | 22-30     | 31 |
|-----|-----|------|-------|-------|-------|-----------|----|
| NN  | BT  | BA   | BB    | BC    | m0-m3 | 000101110 | 0  |

    mask = m0..m3
    for i in range(4):
        a, b = CRs[BA][i], CRs[BB][i]
        if mask[i]:
            CRs[BT][i] = lut2(CRs[BC], a, b)

When SVP64 Vectorised any of the 4 operands may be Scalar or
Vector, including `BC`, meaning that multiple different dynamic
lookups may be performed with a single instruction.

*Programmer's note: just as with `binlog` and `ternlogi`, a pair
of `crbinlog` instructions followed by a merging `crternlogi` may
be deployed to synthesise dynamic ternary (LUT3) CR Field
manipulation.*

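The loop above can be turned into a runnable Python sketch. Note the assumptions: `lut2` is not defined in this section, so the version here borrows the `binlog` convention (index formed as `b || a`, table read as `lut[3 - idx]` with the CR Field held in MSB0 order), and CR Fields are modelled as lists of four bits. Both choices are illustrative only.

```python
def lut2(lut, a, b):
    """Assumed 2-input lookup, following the binlog index convention."""
    idx = (b << 1) | a
    return lut[3 - idx]  # lut is a list of 4 bits in MSB0 order


def crbinlog(crs, bt, ba, bb, bc, mask):
    """Return new CR Fields after crbinlog; mask is the list [m0, m1, m2, m3]."""
    # Read the sources and the table before writing, in case BT overlaps them
    a_field, b_field, lut = list(crs[ba]), list(crs[bb]), list(crs[bc])
    out = [list(field) for field in crs]
    for i in range(4):
        if mask[i]:
            out[bt][i] = lut2(lut, a_field[i], b_field[i])
    return out


crs = [[0, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
crs += [[0, 0, 0, 0] for _ in range(4)]
# CR Field 3 holds [1, 0, 0, 0], which under this convention is the AND table
anded = crbinlog(crs, bt=0, ba=1, bb=2, bc=3, mask=[1, 1, 1, 0])
print(anded[0])
```
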
----------

\newpage{}

----------

# Appendices

* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description             |
|------|------|------|---------|----------|-------------------------|
| TLI  | I    | #    | 3.2B    | ternlogi | Ternary Logic Immediate |

----------------

[[!tag opf_rfc]]