| NN | RT | RA | RB | im0-4 | im5-7 00 |1 | grevlog |
| NN | | | | | ----- 01 |m3| crternlog |
| NN | RT | RA | RB | RC | mode 010 |Rc| bitmask\* |
-| NN | | | | | 00 011 | | rsvd |
+| NN | RT | RA | RB | RC | 00 011 |nh| binlut |
| NN | | | | | 01 011 |0 | svshape |
| NN | | | | | 01 011 |1 | svremap |
| NN | | | | | 10 011 |Rc| svstep |
| NN | RT | RA | RB | 1 | 11 | 1110 110 |Rc| clmulh | X-Form |
| NN | | | | | | --11 110 |Rc| rsvd | |
-# ternlog bitops
+# binary and ternary bitops
Similar to FPGA LUTs: for every bit, perform a lookup into a table that is supplied either as an 8-bit immediate or in another register.
    for i in range(64):
        RT[i] = lut3(imm, RB[i], RA[i], RT[i])
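The per-bit lookup above can be modelled in a few lines of Python. This is only a sketch: `lut3` is assumed here to index the 8-bit table in LSB0 order with `c` as the most significant index bit (by analogy with `lut2`), and the function name `ternlogi` and its argument order are chosen for illustration. Registers are modelled as plain 64-bit integers; because every bit position is treated identically, the register bit-numbering convention does not affect the result.

```python
MASK64 = (1 << 64) - 1

def lut3(imm, a, b, c):
    """Return one bit of the 8-bit table imm, indexed in LSB0 order.

    Assumption: c is the most significant index bit, a the least.
    """
    idx = (c << 2) | (b << 1) | a
    return (imm >> idx) & 1

def ternlogi(imm, RT, RA, RB):
    """Apply lut3 independently at each of the 64 bit positions."""
    result = 0
    for i in range(64):
        bit = lut3(imm, (RB >> i) & 1, (RA >> i) & 1, (RT >> i) & 1)
        result |= bit << i
    return result & MASK64

# Example tables under this index ordering:
#   0x96 -> three-input XOR, 0x80 -> three-input AND
rt, ra, rb = 0x0F0F0F0F0F0F0F0F, 0x3333333333333333, 0x5555555555555555
xor3 = ternlogi(0x96, rt, ra, rb)
and3 = ternlogi(0x80, rt, ra, rb)
```

Under these assumptions, any three-input boolean function is selected purely by the choice of `imm`, which is what makes the instruction behave like a LUT3.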
-## ternlogv
-
-also, another possible variant involving swizzle-like selection
-and masking, this only requires 3 64 bit registers (RA, RS, RB) and
-only 16 LUT3s.
-
-Note however that unless XLEN matches sz, this instruction
-is a Read-Modify-Write: RS must be read as a second operand
-and all unmodified bits preserved. SVP64 may provide limited
-alternative destination for RS from RS-as-source, but again
-all unmodified bits must still be copied.
-
-| 0.5|6.10|11.15|16.20|21.28 | 29.30 |31|
-| -- | -- | --- | --- | ---- | ----- |--|
-| NN | RS | RA | RB |idx0-3| 01 |sz|
-
- SZ = (1+sz) * 8 # 8 or 16
- raoff = MIN(XLEN, idx0 * SZ)
- rboff = MIN(XLEN, idx1 * SZ)
- rcoff = MIN(XLEN, idx2 * SZ)
- rsoff = MIN(XLEN, idx3 * SZ)
- imm = RB[0:8]
- for i in range(MIN(XLEN, SZ)):
- ra = RA[raoff:+i]
- rb = RA[rboff+i]
- rc = RA[rcoff+i]
- res = lut3(imm, ra, rb, rc)
- RS[rsoff+i] = res
+## binlut
+
+`binlut` is a dynamic LUT2 counterpart of `ternlogi`. It differs in two
+ways: the lookup table is 4 bits wide rather than 8, and it is taken
+from a register (RC) rather than from an immediate.
+
+| 0.5|6.10|11.15|16.20| 21.25 |26.30 |31|
+| -- | -- | --- | --- | ----- | ---- |--|
+| NN | RT | RA | RB | RC |00011 |nh|
+
+    lut2(imm, a, b):
+        idx = b << 1 | a
+        return imm[idx] # idx by LSB0 order
+
+    imm = (RC>>(nh*4))&0b1111
+    for i in range(64):
+        RT[i] = lut2(imm, RB[i], RA[i])
+
+*Programmer's note: a dynamic ternary lookup may be synthesised from
+a pair of `binlut` instructions followed by a `ternlogi` that selects,
+bit by bit, which of the two results to keep. `nh` selects which nibble
+of the RC source register is used as the lookup table (`nh=1` selects
+the high nibble).*
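
The programmer's note can be illustrated with a small Python model. It is a sketch under stated assumptions rather than a reference implementation: `lut3` is assumed to use `idx = c << 2 | b << 1 | a` in LSB0 order, the helper `dynamic_ternlog` and the constant `MUX_IMM` are hypothetical names, and registers are modelled as 64-bit integers. The idea is that the low nibble of an 8-bit table covers the cases where the third input is 0 and the high nibble covers the cases where it is 1, so two `binlut` results can be merged by a `ternlogi` acting as a per-bit multiplexer.

```python
MASK64 = (1 << 64) - 1

def lut2(imm, a, b):
    idx = (b << 1) | a
    return (imm >> idx) & 1          # idx by LSB0 order

def lut3(imm, a, b, c):
    idx = (c << 2) | (b << 1) | a    # assumed ordering, see lead-in
    return (imm >> idx) & 1

def binlut(RA, RB, RC, nh):
    """Model of binlut: 4-bit table taken from nibble nh of RC."""
    imm = (RC >> (nh * 4)) & 0b1111
    result = 0
    for i in range(64):
        result |= lut2(imm, (RB >> i) & 1, (RA >> i) & 1) << i
    return result

def ternlogi(imm, RT, RA, RB):
    """Model of ternlogi: 8-bit immediate table, RT is also a source."""
    result = 0
    for i in range(64):
        result |= lut3(imm, (RB >> i) & 1, (RA >> i) & 1, (RT >> i) & 1) << i
    return result

# Immediate for "RT ? RA : RB" under the assumed lut3 index ordering.
MUX_IMM = 0xCA

def dynamic_ternlog(RA, RB, RC, sel):
    """Hypothetical helper: ternary lookup with the 8-bit table in RC."""
    lo = binlut(RA, RB, RC, nh=0)    # result where the sel bit is 0
    hi = binlut(RA, RB, RC, nh=1)    # result where the sel bit is 1
    return ternlogi(MUX_IMM, sel, hi, lo)

ra, rb, sel = 0x0123456789ABCDEF, 0xA5A5A5A55A5A5A5A, 0x00FF00FF0F0F3355
xor2 = binlut(ra, rb, 0b0110, nh=0)  # table 0b0110 is two-input XOR
```

In this model the three-instruction sequence gives the same result as a `ternlogi` whose immediate equals the low byte of RC, at the cost of two temporary registers for `lo` and `hi`.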
## ternlogcr