# RFC ls007 Ternary/Binary GPR and CR Field bit-operations

* <https://libre-soc.org/openpower/sv/bitmanip/>
* <https://libre-soc.org/openpower/sv/rfc/ls007/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1017>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Books and Section affected**: **UPDATE**

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

* `ternlogi` -- Ternary Logic Immediate
* `crternlogi` -- Condition Register Ternary Logic Immediate
* `binlog` -- Dynamic Binary Logic
* `crbinlog` -- Condition Register Dynamic Binary Logic

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of two new GPR-based instructions
* Addition of two new CR-field-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,

GPR, CR-Field, bit-manipulation, ternary, binary, dynamic, look-up-table (LUT), FPGA

* `ternlogi` is similar to the existing `and`/`or`/`xor`/etc. instructions, but
  allows any arbitrary 3-input 1-output bitwise operation. This can be used to
  combine several instructions into one: for example, `A ^ (~B & (C | A))` can
  become one instruction. It can also provide a single-instruction
  bitwise MUX `(A & B) | (~A & C)`.
* `binlog` is like `ternlogi` except that it supports any arbitrary 2-input
  1-output bitwise operation, where the operation can be selected dynamically
  at runtime. This operates similarly to a Programmable LUT in an FPGA.
* `crternlogi` is like `ternlogi` except that it works with CR Fields instead of GPRs.
* `crbinlog` is like `binlog` except that it works with CR Fields instead of GPRs.
  Likewise it is similar to a Programmable LUT in an FPGA.

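
As a rough illustration of how an 8-bit immediate can encode an arbitrary 3-input operation, the following Python sketch builds truth tables for the two expressions mentioned above. The helper name `make_lut3` is hypothetical, and the mapping of `A`, `B`, `C` onto index bits (with `A` most significant) is an assumption made only for this example; the instruction definition decides the real operand order.

```python
def make_lut3(fn):
    """Build an 8-bit truth table for a 3-input boolean function.

    Hypothetical helper, not part of the RFC. Bit `idx` (LSB0) of the
    result holds fn(a, b, c), where idx = a*4 + b*2 + c. That mapping
    of operands to index bits is an assumption for illustration.
    """
    imm = 0
    for idx in range(8):
        a, b, c = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        imm |= (fn(a, b, c) & 1) << idx
    return imm


# A ^ (~B & (C | A)) collapsed into a single truth table
combined = make_lut3(lambda a, b, c: a ^ (~b & (c | a)))

# bitwise MUX: A selects between B and C
mux = make_lut3(lambda a, b, c: (a & b) | (~a & c))

print(f"combined = {combined:#010b}")
print(f"mux      = {mux:#010b}")
```

Any 3-input bitwise expression reduces to one such 8-bit constant, which is why a single immediate field is enough to replace a chain of `and`/`or`/`xor` instructions.
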
**Notes and Observations**:

* `ternlogi` is like the existing `xxeval` instruction, except that it operates on
  GPRs instead of VSRs and does not require VSX/VMX.
* `crternlogi` is similar to the group of CR Operations (`crand`, `cror`, etc.) which have
  been identified as a Binary Lookup Group, except that an 8-bit
  immediate is used instead of a 4-bit one, and up to 4 bits of a CR Field may
  be computed at once, saving 3 CR operations.
* `crbinlog` is similar to the Binary Lookup Group of CR Operations except that the
  4-bit lookup table comes from a CR Field instead of from an Immediate. Also,
  like `crternlogi`, up to 4 bits may be computed at once.

Add the following entries to:

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Book I 1.6.1 and 1.6.2

Add the following section to Book I 1.6.1

    |0     |6     |9     |12    |15    |18    |21    |29    |31    |
    | PO   |  BF  | BFA  | BFB  | BFC  | msk  | TLI  | XO   | msk  |

Add the following section to Book I 1.6.1

    |0     |6     |11    |16    |21    |29    |31    |
    | PO   |  RT  |  RA  |  RB  | TLI  | XO   | Rc   |

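
To make the TLI-Form layout above concrete, here is a minimal Python sketch that packs the fields into a 32-bit word using MSB0 bit numbering. The function name `encode_tli_form` is hypothetical, and the `po` and `xo` values passed in the usage line are placeholders: this document does not state the actual opcode allocation.

```python
def encode_tli_form(po, rt, ra, rb, tli, xo, rc=0):
    """Pack TLI-Form fields into a 32-bit instruction word.

    Field positions follow the layout above (MSB0 numbering, bit 0 is
    the most significant bit). Hypothetical helper for illustration.
    """
    fields = [  # (value, first bit, width in bits)
        (po, 0, 6),
        (rt, 6, 5),
        (ra, 11, 5),
        (rb, 16, 5),
        (tli, 21, 8),
        (xo, 29, 2),
        (rc, 31, 1),
    ]
    word = 0
    for value, start, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        # convert an MSB0 start position into a left-shift amount
        word |= value << (32 - start - width)
    return word


# po=0 and xo=0 are placeholders, not the real opcode assignment
word = encode_tli_form(po=0, rt=3, ra=4, rb=6, tli=0b11011000, xo=0)
print(f"{word:#010x}")
```
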
Add the following entry to VA-FORM in Book I 1.6.1.12

    |0     |6     |11    |16    |21    |26|27    |
    | PO   |  RT  |  RA  |  RB  |  RC  |nh| XO   |

# Word Instruction Fields

Add the following to Book I 1.6.2

Field used by crternlogi to decide which CR bits to modify.

Nibble High. Field used by binlog to decide if the look-up-table should
be taken from bits 60:63 or 56:59 of RC.

Field used by the ternlogi instruction as the

Extended opcode field.

Add `TLI` to the `Formats:` list of all of `RA`, `RB`, `RT`, and `Rc`.
Add `CRB` to the `Formats:` list of all of `BF`, `BFA`, `BFB`, and `BFC`.
Add `VA` to the `Formats:` list of `XO (27:31)`.

# Ternary Logic Immediate

Add this section to Book I 3.3.13

* `ternlogi RT, RA, RB, TLI` (`Rc=0`)
* `ternlogi. RT, RA, RB, TLI` (`Rc=1`)

| 0-5 | 6-10 | 11-15 | 16-20 | 21-28 | 29-30 | 31 | Form     |
|-----|------|-------|-------|-------|-------|----|----------|
| PO  | RT   | RA    | RB    | TLI   | XO    | Rc | TLI-Form |

    idx <- (RT)[i] || (RA)[i] || (RB)[i]  # compute index from current bits
    result[i] <- TLI[7 - idx]             # subtract from 7 to index in LSB0 order

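
The per-bit lookup above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: registers are treated as 64-bit unsigned integers, only the `Rc=0` behaviour is modelled, and the function name `ternlogi` is reused purely for readability. Because the same lookup is applied at every bit position, the MSB0/LSB0 distinction for `i` does not change the result; `TLI[7 - idx]` in MSB0 terms is bit `idx` of `TLI` counted from the least significant end.

```python
MASK64 = (1 << 64) - 1


def ternlogi(rt, ra, rb, tli):
    """Model of the per-bit lookup: returns the new RT value.

    Assumes 64-bit registers and ignores Rc=1 (CR0 update).
    """
    result = 0
    for i in range(64):
        # idx = RT bit || RA bit || RB bit, RT bit most significant
        idx = (((rt >> i) & 1) << 2) | (((ra >> i) & 1) << 1) | ((rb >> i) & 1)
        # select bit idx of the 8-bit immediate, counted from the LSB
        result |= ((tli >> idx) & 1) << i
    return result


# 0b11011000 acts as a mux: take RT where RB is 0, RA where RB is 1
rt, ra, rb = 0x0123456789ABCDEF, 0xFEDCBA9876543210, 0x00FF00FF00FF00FF
muxed = ternlogi(rt, ra, rb, 0b11011000)
print(f"{muxed:#018x}")
```
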
Special registers altered:

# Condition Register Ternary Logic Immediate

Add this section to Book I 2.5.1

* `crternlogi BF, BFA, BFB, BFC, TLI, msk`

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-20 | 21-28 | 29-30 | 31  | Form     |
|-----|-----|------|-------|-------|-------|-------|-------|-----|----------|
| PO  | BF  | BFA  | BFB   | BFC   | msk   | TLI   | XO    | msk | CRB-Form |

    a <- CR[4*BFA+32:4*BFA+35]
    b <- CR[4*BFB+32:4*BFB+35]
    c <- CR[4*BFC+32:4*BFC+35]
    idx <- a[i] || b[i] || c[i]  # compute index from current bits
    result <- TLI[7 - idx]       # subtract from 7 to index in LSB0 order
    CR[4*BF+32+i] <- result

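
A small Python model may help show how the mask interacts with the lookup. It is a sketch, not the normative definition: CR Fields are passed as 4-bit integer values rather than field numbers, and pairing bit `i` of `msk` with bit `i` of the field (both counted from the least significant end) is an assumption, since the assembly of the 4-bit mask from its two encoding fields is not spelled out here.

```python
def crternlogi(bf, bfa, bfb, bfc, tli, msk):
    """Model of a masked 3-input lookup on 4-bit CR Field values.

    bf, bfa, bfb, bfc are field *contents* (0..15), not field numbers.
    Assumption: msk bit i guards field bit i (LSB0 for both).
    Returns the new contents of the destination field.
    """
    out = bf
    for i in range(4):
        if (msk >> i) & 1:
            idx = (((bfa >> i) & 1) << 2) | (((bfb >> i) & 1) << 1) | ((bfc >> i) & 1)
            bit = (tli >> idx) & 1
            out = (out & ~(1 << i)) | (bit << i)  # replace only bit i
    return out & 0xF


# three-way AND of all four bits at once
print(bin(crternlogi(0b0000, 0b1100, 0b1010, 0b1001, 0x80, 0b1111)))
# same lookup, but only the low two bits of the destination are updated
print(bin(crternlogi(0b1111, 0b0000, 0b0000, 0b0000, 0x80, 0b0011)))
```

With all mask bits set, one instruction does the work of four single-bit CR operations, which is the saving described in the notes above.
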
Special registers altered:

# Dynamic Binary Logic

Add this section to Book I 3.3.13

* `binlog RT, RA, RB, RC, nh`

| 0-5 | 6-10 | 11-15 | 16-20 | 21-25 | 26 | 27-31 | Form    |
|-----|------|-------|-------|-------|----|-------|---------|
| PO  | RT   | RA    | RB    | RC    | nh | XO    | VA-Form |

    idx <- (RB)[i] || (RA)[i]  # compute index from current bits
    result[i] <- lut[3 - idx]  # subtract from 3 to index in LSB0 order

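
The following Python sketch models the lookup, assuming 64-bit registers passed as unsigned integers. It follows the field description of `nh`: the 4-bit table is taken from RC bits 60:63 (the least significant nibble) when `nh=0` and from bits 56:59 (the next nibble) when `nh=1`. The function name and the integer-based register model are illustrative choices, not part of the specification.

```python
MASK64 = (1 << 64) - 1


def binlog(ra, rb, rc, nh):
    """Model of the 2-input dynamic lookup: returns the new RT value."""
    # nh selects which nibble of RC supplies the look-up table
    lut = (rc >> 4) & 0xF if nh else rc & 0xF
    result = 0
    for i in range(64):
        # idx = RB bit || RA bit, RB bit most significant
        idx = (((rb >> i) & 1) << 1) | ((ra >> i) & 1)
        result |= ((lut >> idx) & 1) << i
    return result


ra, rb = 0x00000000FFFFFFFF, 0x0000FFFF0000FFFF
table = 0b1000_0110  # high nibble: AND, low nibble: XOR
print(f"{binlog(ra, rb, table, 0):#018x}")  # uses the low nibble
print(f"{binlog(ra, rb, table, 1):#018x}")  # uses the high nibble
```

Since the table lives in a register, the same instruction can compute a different operation on each execution without being re-encoded.
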
Special registers altered:

**Programming Note**:

Dynamic Ternary Logic may be emulated by an appropriate combination of `binlog` and `ternlogi`,
using the `nh` (Nibble High) operand to select first the low and then the high nibble of the look-up table:

    # compute r3 = ternlog(r4, r5, r6, table=r7)
    # compute the values for when r6[i] = 0:
    binlog r3, r4, r5, r7, 0  # takes look-up-table from LSB 4 bits
    # compute the values for when r6[i] = 1:
    binlog r4, r4, r5, r7, 1  # takes look-up-table from second-to-LSB 4 bits
    # mux the two results together: r3 = (r3 & ~r6) | (r4 & r6)
    ternlogi r3, r4, r6, 0b11011000

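
The three-instruction sequence above can be checked against a direct 3-input lookup with a short Python model. The helper names are hypothetical, registers are assumed to be 64 bits wide, and the table index used by `reference_lut3` (`r6` bit, then `r5` bit, then `r4` bit) is inferred from the sequence itself rather than stated in the text.

```python
def binlog(ra, rb, rc, nh):
    """2-input lookup; table is the low (nh=0) or next (nh=1) nibble of rc."""
    lut = (rc >> 4) & 0xF if nh else rc & 0xF
    return sum(((lut >> ((((rb >> i) & 1) << 1) | ((ra >> i) & 1))) & 1) << i
               for i in range(64))


def ternlogi(rt, ra, rb, tli):
    """3-input lookup with an 8-bit immediate table."""
    return sum(((tli >> ((((rt >> i) & 1) << 2) | (((ra >> i) & 1) << 1)
                         | ((rb >> i) & 1))) & 1) << i
               for i in range(64))


def emulated_ternlog(r4, r5, r6, r7):
    """Mirror of the assembly sequence in the Programming Note."""
    r3 = binlog(r4, r5, r7, 0)        # results for bits where r6 is 0
    r4 = binlog(r4, r5, r7, 1)        # results for bits where r6 is 1
    return ternlogi(r3, r4, r6, 0b11011000)  # mux on r6


def reference_lut3(r4, r5, r6, table):
    """Direct 8-entry lookup; index order is an inferred assumption."""
    return sum(((table >> ((((r6 >> i) & 1) << 2) | (((r5 >> i) & 1) << 1)
                           | ((r4 >> i) & 1))) & 1) << i
               for i in range(64))


a, b, c = 0x0F0F0F0F0F0F0F0F, 0x3333333333333333, 0x5555555555555555
print(emulated_ternlog(a, b, c, 0b10010110) == reference_lut3(a, b, c, 0b10010110))
```

Note that the second `binlog` overwrites `r4`, so the sequence as written does not preserve that input register.
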
With ternary (LUT3) dynamic instructions being very costly,
and CR Fields being only 4 bits wide, a binary (LUT2) variant is better.

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-21 | 22-30     | 31 |
| --- | --- | ---- | ----- | ----- | ----- | --------- | -- |
| NN  | BT  | BA   | BB    | BC    | m0-m3 | 000101110 | 0  |

    a, b = CRs[BA][i], CRs[BB][i]
    if mask[i]: CRs[BT][i] = lut2(CRs[BC], a, b)

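
The pseudocode above can be fleshed out as a small Python model. Several details are assumptions made for illustration and are labelled as such in the code: CR Fields are passed as 4-bit values rather than field numbers, the loop runs over all four bits of the field, `lut2` indexes the table with `b` as the more significant bit (mirroring `binlog`), and mask bit `i` guards field bit `i`.

```python
def lut2(lut, a, b):
    """Look up one bit in a 4-bit table.

    Assumption: index = b*2 + a, mirroring the binlog index order.
    """
    return (lut >> ((b << 1) | a)) & 1


def crbinlog(bt, ba, bb, bc, mask):
    """Model of a masked dynamic 2-input lookup on CR Field values.

    All arguments are 4-bit field contents; bc supplies the table.
    Assumption: mask bit i guards field bit i (LSB0 for both).
    Returns the new contents of the destination field.
    """
    out = bt
    for i in range(4):
        if (mask >> i) & 1:
            a, b = (ba >> i) & 1, (bb >> i) & 1
            out = (out & ~(1 << i)) | (lut2(bc, a, b) << i)
    return out & 0xF


# table 0b0110 behaves as XOR, table 0b1000 as AND
print(bin(crbinlog(0b0000, 0b0101, 0b0011, 0b0110, 0b1111)))
print(bin(crbinlog(0b0000, 0b0101, 0b0011, 0b1000, 0b1111)))
```
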
When SVP64 Vectorised, any of the 4 operands may be Scalar or
Vector, including `BC`, meaning that multiple different dynamic
lookups may be performed with a single instruction.

*Programmer's note: just as with binlog and ternlogi, a pair
of crbinlog instructions followed by a merging crternlogi may
be deployed to synthesise dynamic ternary (LUT3) CR Field

* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

|Form| Book | Page | Version | mnemonic | Description             |
|----|------|------|---------|----------|-------------------------|
|TLI | I    | #    | 3.2B    | ternlogi | Ternary Logic Immediate |