## crternlogi
-another mode selection would be CRs not Ints.
+another mode selection would be CRs not Ints.
-| 0.5|6.8 | 9.11|12.14|15.17|18.20|21.28 | 29.30|31|
-| -- | -- | --- | --- | --- |-----|----- | -----|--|
-| NN | BT | BA | BB | BC |m0-2 | imm | 01 |m3|
+CRB-Form:
+
+| 0.5|6.8 |9.10|11.13|14.15|16.18|19.25|26.30| 31|
+|----|----|----|-----|-----|-----|-----|-----|---|
+| NN | BF | msk|BFA | msk | BFB | TLI | XO |TLI|
- mask = m0-3
for i in range(4):
- a,b,c = CRs[BA][i], CRs[BB][i], CRs[BC][i])
- if mask[i] CRs[BT][i] = lut3(imm, a, b, c)
+ a,b,c = CRs[BF][i], CRs[BFA][i], CRs[BFB][i]
+ if msk[i]: CRs[BF][i] = lut3(TLI, a, b, c)
This instruction is remarkably similar to the existing crops, `crand` etc.
which have been noted to be a 4-bit (binary) LUT. In effect `crternlogi`
-is the ternary LUT version of crops, having an 8-bit LUT.
+is the ternary LUT version of crops, having an 8-bit LUT. Unlike crops,
+however, it is an overwrite instruction: the mask already requires the
+contents of `BF` to be both read and written, so using `BF` as a source
+as well saves on register file ports.
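
As an illustration only, the overwrite behaviour described above can be sketched in Python. This is a minimal model, not the specification's definition: CR Fields are represented as lists of four bits, the helper names and argument order are hypothetical, and the index ordering inside `lut3` (`c` as the most significant index bit) is an assumption that this section does not state.

```python
def lut3(imm, a, b, c):
    """Select one bit of an 8-bit table. Index ordering is an assumption."""
    idx = (c << 2) | (b << 1) | a
    return (imm >> idx) & 1

def crternlogi(crs, bf, bfa, bfb, imm, msk):
    """Model of the overwrite form: BF is both a source and the destination."""
    for i in range(4):
        a, b, c = crs[bf][i], crs[bfa][i], crs[bfb][i]
        if msk[i]:
            # only unmasked bits of BF are written back
            crs[bf][i] = lut3(imm, a, b, c)
    return crs

# Hypothetical example: imm=0b10000000 selects a three-input AND.
crs = [[1, 1, 0, 1], [1, 0, 1, 0], [1, 1, 1, 1]]
result = crternlogi(crs, bf=0, bfa=1, bfb=2, imm=0b10000000, msk=[1, 1, 1, 0])
print(result[0])
```

Bit 3 of the destination keeps its old value because `msk[3]` is zero, which is why `BF` has to be read even when it is about to be overwritten.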
## crbinlog
With ternary (LUT3) dynamic instructions being very costly,
and CR Fields being only 4 bits wide, a binary (LUT2) variant is better.
-| 0.5|6.8 | 9.11|12.14|15.17|18.21|22...30 |31|
-| -- | -- | --- | --- | --- |-----| -------- |--|
-| NN | BT | BA | BB | BC |m0-m3|000101110 |0 |
+CRB-Form:
+
+| 0.5|6.8 |9.10|11.13|14.15|16.18|19.25|26.30| 31|
+|----|----|----|-----|-----|-----|-----|-----|---|
+| NN | BF | msk|BFA | msk | BFB | // | XO | //|
- mask = m0..m3
for i in range(4):
- a,b = CRs[BA][i], CRs[BB][i])
- if mask[i] CRs[BT][i] = lut2(CRs[BC], a, b)
+ a,b = CRs[BF][i], CRs[BFA][i]
+ if msk[i]: CRs[BF][i] = lut2(CRs[BFB], a, b)
When SVP64 Vectorised, any of the operands may be Scalar or
-Vector, including `BC` meaning that multiple different dynamic
-lookups may be performed with a single instruction.
+Vector, including `BFB` meaning that multiple different dynamic
+lookups may be performed with a single instruction. Note that
+this instruction is deliberately an overwrite in order to reduce
+the number of register file ports required: like `crternlogi`
+the contents of `BF` **must** be read due to the mask only
+writing back to non-masked-out bits of `BF`.
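
A rough Python sketch of the dynamic lookup may help here. It is illustrative rather than normative: CR Fields are modelled as lists of four bits, the function names are hypothetical, the two inputs are assumed to come from `BF` and `BFA`, the index ordering inside `lut2` is an assumption, and the table is snapshotted before the loop (a modelling choice, so that `BFB` equal to `BF` does not change the table part-way through).

```python
def lut2(lut, a, b):
    """Select one bit of a 4-entry table. Index ordering is an assumption."""
    idx = (b << 1) | a
    return lut[idx]

def crbinlog(crs, bf, bfa, bfb, msk):
    """Model of crbinlog: the lookup table is itself the CR Field BFB."""
    lut = list(crs[bfb])  # snapshot of the dynamic table (modelling choice)
    for i in range(4):
        a, b = crs[bf][i], crs[bfa][i]
        if msk[i]:
            # masked-out bits of BF keep their previous contents
            crs[bf][i] = lut2(lut, a, b)
    return crs

# Hypothetical example: a table of [0, 1, 1, 0] behaves as XOR.
crs = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 1, 0]]
result = crbinlog(crs, bf=0, bfa=1, bfb=2, msk=[1, 1, 1, 1])
print(result[0])
```

Because the table lives in a register rather than an immediate, passing a different `bfb` per element is how a Vectorised form could perform a different lookup for each element.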
*Programmer's note: just as with binlut and ternlogi, a pair
of crbinlog instructions followed by a merging crternlogi may