# RFC ls007 Ternary/Binary GPR and CR Field bit-operations

* <https://libre-soc.org/openpower/sv/bitmanip/>
* <https://libre-soc.org/openpower/sv/rfc/ls007/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1017>
* <https://git.openpower.foundation/isa/PowerISA/issues/todo>

**Books and Section affected**: **UPDATE**

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

* `ternlogi` -- Ternary Logic Immediate
* `crternlogi` -- Condition Register Ternary Logic Immediate
* `binlog` -- Dynamic Binary Logic
* `crbinlog` -- Condition Register Dynamic Binary Logic

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

* Addition of two new GPR-based instructions
* Addition of two new CR-field-based instructions

**Impact on software**:

* Requires support for new instructions in assembler, debuggers,

GPR, CR-Field, bit-manipulation, ternary, binary, dynamic, look-up-table (LUT), FPGA

* `ternlogi` is similar to the existing `and`/`or`/`xor`/etc. instructions, but
  allows any arbitrary 3-input 1-output bitwise operation. This can be used to
  combine several instructions into one: e.g. `A ^ (~B & (C | A))` can become
  one instruction. It also provides a single-instruction
  bitwise MUX `(A & B) | (~A & C)`.
* `binlog` is like `ternlogi` except that it supports any arbitrary 2-input
  1-output bitwise operation, where the operation can be selected dynamically
  at runtime. This operates similarly to a programmable LUT in an FPGA.
* `crternlogi` is like `ternlogi` except that it works with CR Fields instead of GPRs.
* `crbinlog` is like `binlog` except that it works with CR Fields instead of GPRs. Likewise it
  is similar to a programmable LUT in an FPGA.
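
As a rough illustration of how an 8-bit `ternlogi` immediate might be derived for the two expressions mentioned above, the Python sketch below enumerates all eight input combinations and packs the results into a truth table. The helper name `tli_immediate` is hypothetical, and the mapping of `A`, `B`, `C` onto index bits (first operand as the most significant index bit, result stored at LSB0 bit `idx`) is an assumption based on the `ternlogi` pseudocode in this RFC.

```python
def tli_immediate(fn):
    """Build an 8-bit truth table for a 3-input bitwise function.

    Assumption: idx = a || b || c with `a` as the most significant index
    bit, and bit `idx` (LSB0 numbering) of the immediate holds fn(a, b, c).
    """
    imm = 0
    for idx in range(8):
        a, b, c = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        if fn(a, b, c) & 1:          # keep only bit 0: ~x on a Python int is negative
            imm |= 1 << idx
    return imm


# Bitwise MUX: select B where A is 1, otherwise C.
mux = tli_immediate(lambda a, b, c: (a & b) | (~a & c))
# The combined expression used as an example in the text.
combined = tli_immediate(lambda a, b, c: a ^ (~b & (c | a)))
print(f"mux: {mux:#010b}  combined: {combined:#010b}")
```

Under these assumptions each of the two expressions collapses to a single immediate, which is what allows a multi-instruction sequence to be replaced by one `ternlogi`.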

**Notes and Observations**:

* `ternlogi` is like the existing `xxeval` instruction, except that it operates on GPRs instead
  of VSRs and does not require VSX/VMX. The SFS and SFFS Compliancy Subsets cannot rely on
  `xxeval` and are therefore less powerful without `ternlogi`.
* `crternlogi` is similar to the group of CR Operations (`crand`, `cror` etc.) which have
  been identified as a Binary Lookup Group, except that an 8-bit
  immediate is used instead of a 4-bit one, and up to 4 bits of a CR Field may
  be computed at once, saving 3 CR operations.
* `crbinlog` is similar to the Binary Lookup Group of CR Operations except that the
  4-bit lookup table comes from a CR Field instead of from an immediate. Also,
  like `crternlogi`, up to 4 bits may be computed at once.

Add the following entries to:

* Book I 2.5.1 Condition Register Logical Instructions
* Book I 3.3.13 Fixed-Point Logical Instructions
* Book I 1.6.1 and 1.6.2

Add the following section to Book I 1.6.1

    |0    |6    |9    |12   |15   |18   |21   |29   |31   |
    | PO  | BF  | BFA | BFB | BFC | msk | TLI | XO  | msk |

Add the following section to Book I 1.6.1

    |0    |6    |11   |16   |21   |29   |31   |
    | PO  | RT  | RA  | RB  | TLI | XO  | Rc  |

Add the following entry to VA-FORM in Book I 1.6.1.12

    |0    |6    |11   |16   |21   |26|27   |
    | PO  | RT  | RA  | RB  | RC  |nh| XO  |

# Word Instruction Fields

Add the following to Book I 1.6.2

Field used by crternlogi to decide which CR bits to modify.

Nibble High. Field used by binlog to decide whether the look-up-table should
be taken from bits 60:63 or 56:59 of RC.

Field used by the ternlogi instruction as the

Field used by the crternlogi instruction as the

Extended opcode field.

Extended opcode field.

Add `TLI` to the `Formats:` list of all of `RA`, `RB`, `RT`, and `Rc`.
Add `CRB` to the `Formats:` list of all of `BF`, `BFA`, `BFB`, and `BFC`.
Add `TLI` to the `Formats:` list of `XO (29:30)`.
Add `CRB` to the `Formats:` list of `XO (26:31)`.
Add `VA` to the `Formats:` list of `XO (27:31)`.

# Ternary Logic Immediate

Add this section to Book I 3.3.13

* `ternlogi RT, RA, RB, TLI` (`Rc=0`)
* `ternlogi. RT, RA, RB, TLI` (`Rc=1`)

| 0-5 | 6-10 | 11-15 | 16-20 | 21-28 | 29-30 | 31 | Form     |
|-----|------|-------|-------|-------|-------|----|----------|
| PO  | RT   | RA    | RB    | TLI   | XO    | Rc | TLI-Form |

    idx <- (RT)[i] || (RA)[i] || (RB)[i]  # compute index from current bits
    result[i] <- TLI[7 - idx]             # subtract from 7 to index in LSB0 order
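
The two pseudocode lines above can be modelled in a few lines of Python, which may help when checking immediates by hand. This is a minimal sketch rather than a reference implementation: the function name and the `MASK64` constant are hypothetical, and it assumes the loop runs over all 64 bits of the GPRs. Because the operation is purely bitwise, the direction of bit numbering does not affect the result.

```python
MASK64 = (1 << 64) - 1


def ternlogi(rt, ra, rb, tli):
    """Model of ternlogi on 64-bit values; returns the new value of RT.

    Note that RT is both an input (most significant index bit) and the
    destination, as in the pseudocode.
    """
    result = 0
    for i in range(64):  # assumption: one iteration per GPR bit
        idx = (((rt >> i) & 1) << 2) | (((ra >> i) & 1) << 1) | ((rb >> i) & 1)
        # TLI[7 - idx] in MSB0 numbering is bit idx in LSB0 numbering.
        result |= ((tli >> idx) & 1) << i
    return result & MASK64


rt, ra, rb = 0x0123456789ABCDEF, 0xFFFF0000FFFF0000, 0x00FF00FF00FF00FF
# 0b11001010 selects RA where RT is 1 and RB where RT is 0 (bitwise MUX).
print(hex(ternlogi(rt, ra, rb, 0b11001010)))
```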

Special registers altered:

# Condition Register Ternary Logic Immediate

Add this section to Book I 2.5.1

* `crternlogi BF, BFA, BFB, BFC, TLI, msk`

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-20 | 21-28 | 29-30 | 31  | Form     |
|-----|-----|------|-------|-------|-------|-------|-------|-----|----------|
| PO  | BF  | BFA  | BFB   | BFC   | msk   | TLI   | XO    | msk | CRB-Form |

    a <- CR[4*BFA+32:4*BFA+35]
    b <- CR[4*BFB+32:4*BFB+35]
    c <- CR[4*BFC+32:4*BFC+35]
    idx <- a[i] || b[i] || c[i]  # compute index from current bits
    result <- TLI[7 - idx]       # subtract from 7 to index in LSB0 order
    CR[4*BF+32+i] <- result
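
A small Python model of the CR variant is sketched below. It rests on several assumptions that are not spelled out in the pseudocode lines above: the CR is modelled as a 32-bit integer in which CR bit 32 (MSB0 numbering) is the most significant bit, `i` ranges over the four bits of a CR Field, and `msk` is treated as a 4-bit value whose bit `i` (MSB0) enables the write to bit `i` of `BF`. The helper names are hypothetical.

```python
def cr_bit(cr, field, i):
    """Return bit i (MSB0 within the field) of CR Field `field`."""
    return (cr >> (31 - (4 * field + i))) & 1


def crternlogi(cr, bf, bfa, bfb, bfc, tli, msk):
    """Model of crternlogi on a 32-bit CR image; returns the new CR."""
    for i in range(4):  # assumption: one iteration per CR Field bit
        if ((msk >> (3 - i)) & 1) == 0:  # assumption: msk[i] gates bit i
            continue
        idx = (cr_bit(cr, bfa, i) << 2) | (cr_bit(cr, bfb, i) << 1) | cr_bit(cr, bfc, i)
        bit = (tli >> idx) & 1           # TLI[7 - idx] in MSB0 order
        pos = 31 - (4 * bf + i)          # LSB0 position of CR[4*BF+32+i]
        cr = (cr & ~(1 << pos)) | (bit << pos)
    return cr


# CR Fields 1, 2, 3 hold 0b1100, 0b1010, 0b0101; CR Field 0 starts as zero.
cr = (0b1100 << 24) | (0b1010 << 20) | (0b0101 << 16)
xor3 = crternlogi(cr, 0, 1, 2, 3, 0x96, 0b1111)  # 0x96 is a three-input XOR
print(f"{(xor3 >> 28) & 0xF:04b}")
```

With a full mask, all four bits of the target field are produced by one instruction, which is the saving of three CR operations mentioned in the notes.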

Special registers altered:

# Dynamic Binary Logic

Add this section to Book I 3.3.13

* `binlog RT, RA, RB, RC, nh`

| 0-5 | 6-10 | 11-15 | 16-20 | 21-25 | 26 | 27-31 | Form    |
|-----|------|-------|-------|-------|----|-------|---------|
| PO  | RT   | RA    | RB    | RC    | nh | XO    | VA-Form |

    idx <- (RB)[i] || (RA)[i]  # compute index from current bits
    result[i] <- lut[3 - idx]  # subtract from 3 to index in LSB0 order
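
To make the table lookup concrete, the following Python sketch models `binlog` on 64-bit values. The function name is hypothetical. The choice of nibble follows the field description and the Programming Note (`nh=0` takes the least significant 4 bits of `RC`, i.e. bits 60:63, and `nh=1` takes bits 56:59); the 64-iteration loop is an assumption.

```python
def binlog(ra, rb, rc, nh):
    """Model of binlog on 64-bit values; returns the value written to RT."""
    # Select the 4-bit look-up table from RC: bits 56:59 if nh else 60:63 (MSB0).
    lut = (rc >> 4) & 0xF if nh else rc & 0xF
    result = 0
    for i in range(64):  # assumption: one iteration per GPR bit
        idx = (((rb >> i) & 1) << 1) | ((ra >> i) & 1)
        # lut[3 - idx] in MSB0 numbering is bit idx in LSB0 numbering.
        result |= ((lut >> idx) & 1) << i
    return result


ra, rb = 0b1100, 0b1010
for name, table in (("and", 0b1000), ("xor", 0b0110), ("or", 0b1110)):
    print(name, bin(binlog(ra, rb, table, 0)))
```

Since the table is an ordinary register value, the same instruction can perform a different 2-input operation on every execution, for example inside a loop that loads the table from memory.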

Special registers altered:

**Programming Note**:

Dynamic Ternary Logic may be emulated by an appropriate combination of `binlog` and `ternlogi`,
using the `nh` (nibble high) operand to select the first and second nibbles:

    # compute r3 = ternlog(r4, r5, r6, table=r7)
    # compute the values for when r6[i] = 0:
    binlog r3, r4, r5, r7, 0  # takes look-up-table from the least-significant 4 bits
    # compute the values for when r6[i] = 1 (note: overwrites r4):
    binlog r4, r4, r5, r7, 1  # takes look-up-table from the next-higher 4 bits
    # mux the two results together: r3 = (r3 & ~r6) | (r4 & r6)
    ternlogi r3, r4, r6, 0b11011000
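
The three-instruction sequence above can be checked against a direct 8-entry lookup with a short Python sketch. The models of `binlog` and `ternlogi` follow the pseudocode in this RFC; `dynamic_ternlog` is a hypothetical reference whose index order (`r6 || r5 || r4` into the low byte of the table register) is an assumption chosen to match the comments in the sequence, not something the RFC defines.

```python
import random


def binlog(ra, rb, rc, nh):
    lut = (rc >> 4) & 0xF if nh else rc & 0xF
    out = 0
    for i in range(64):
        idx = (((rb >> i) & 1) << 1) | ((ra >> i) & 1)
        out |= ((lut >> idx) & 1) << i
    return out


def ternlogi(rt, ra, rb, tli):
    out = 0
    for i in range(64):
        idx = (((rt >> i) & 1) << 2) | (((ra >> i) & 1) << 1) | ((rb >> i) & 1)
        out |= ((tli >> idx) & 1) << i
    return out


def dynamic_ternlog(a, b, c, table):
    """Hypothetical reference: bit (c || b || a) of the low byte of `table`."""
    out = 0
    for i in range(64):
        idx = (((c >> i) & 1) << 2) | (((b >> i) & 1) << 1) | ((a >> i) & 1)
        out |= ((table >> idx) & 1) << i
    return out


def emulate(r4, r5, r6, r7):
    r3 = binlog(r4, r5, r7, 0)               # results for r6[i] = 0
    r4 = binlog(r4, r5, r7, 1)               # results for r6[i] = 1
    return ternlogi(r3, r4, r6, 0b11011000)  # r6 ? r4 : r3


rng = random.Random(0)
samples = [tuple(rng.getrandbits(64) for _ in range(4)) for _ in range(50)]
print(all(emulate(*s) == dynamic_ternlog(*s) for s in samples))
```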

Because dynamic ternary (LUT3) instructions are very costly
and CR Fields are only 4 bits wide, a binary (LUT2) variant is the better fit.

| 0-5 | 6-8 | 9-11 | 12-14 | 15-17 | 18-21 | 22-30     | 31 |
| --- | --- | ---- | ----- | ----- | ----- | --------- | -- |
| NN  | BT  | BA   | BB    | BC    | m0-m3 | 000101110 | 0  |

    a, b = CRs[BA][i], CRs[BB][i]
    if mask[i]: CRs[BT][i] = lut2(CRs[BC], a, b)
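
The pseudocode above leaves `lut2` and the loop bounds implicit, so the Python sketch below fills them in with explicit assumptions: `i` runs over the four bits of a CR Field, `mask` is the list `[m0, m1, m2, m3]`, and `lut2` indexes the table in the same way as `binlog` (index `b || a`, table in MSB0 order). None of these details are confirmed by the text, and the data layout (a list of eight 4-bit lists) is purely illustrative.

```python
def lut2(lut, a, b):
    """Assumed LUT2 lookup, mirroring binlog: result = lut[3 - (b || a)]."""
    idx = (b << 1) | a
    return lut[3 - idx]


def crbinlog(crs, bt, ba, bb, bc, mask):
    """Model of crbinlog; `crs` is a list of eight CR Fields (4 bits each)."""
    new = [list(field) for field in crs]  # copy so the input is left untouched
    for i in range(4):                    # assumption: one iteration per bit
        a, b = crs[ba][i], crs[bb][i]
        if mask[i]:
            new[bt][i] = lut2(crs[bc], a, b)
    return new


crs = [[0, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 1, 0]] + [[0] * 4] * 4
# CR Field 3 holds the table 0b0110, which under these assumptions is XOR.
print(crbinlog(crs, 0, 1, 2, 3, [1, 1, 1, 1])[0])
```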

When SVP64 Vectorised, any of the 4 operands may be Scalar or
Vector, including `BC`, meaning that multiple different dynamic
lookups may be performed with a single instruction.

*Programmer's note: just as with `binlog` and `ternlogi`, a pair
of `crbinlog` instructions followed by a merging `crternlogi` may
be deployed to synthesise dynamic ternary (LUT3) CR Field

* Appendix E Power ISA sorted by opcode
* Appendix F Power ISA sorted by version
* Appendix G Power ISA sorted by Compliancy Subset
* Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description             |
|------|------|------|---------|----------|-------------------------|
| TLI  | I    | #    | 3.2B    | ternlogi | Ternary Logic Immediate |