From 8e5326caf15e238494a20cb0937b0c41c0ea9d20 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Wed, 15 Mar 2023 15:00:29 +0000
Subject: [PATCH] rewrite crternlogi and crbinlog to match new format, required to reduce both instructions to 3-read 1-write.

https://bugs.libre-soc.org/show_bug.cgi?id=1023#c2
---
 openpower/sv/bitmanip.mdwn | 41 +++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/openpower/sv/bitmanip.mdwn b/openpower/sv/bitmanip.mdwn
index a4d68a0c6..15713958f 100644
--- a/openpower/sv/bitmanip.mdwn
+++ b/openpower/sv/bitmanip.mdwn
@@ -121,38 +121,47 @@ the second nh=1.*
 
 ## crternlogi
 
-another mode selection would be CRs not Ints.
+another mode selection would be CRs not Ints. 
 
-| 0.5|6.8 | 9.11|12.14|15.17|18.20|21.28 | 29.30|31|
-| -- | -- | --- | --- | --- |-----|----- | -----|--|
-| NN | BT | BA  | BB  | BC  |m0-2 | imm  |  01  |m3|
+CRB-Form:
+
+| 0.5|6.8 |9.10|11.13|14.15|16.18|19.25|26.30| 31|
+|----|----|----|-----|-----|-----|-----|-----|---|
+| NN | BF | msk|BFA  | msk | BFB | TLI | XO  |TLI|
 
-    mask = m0-3
     for i in range(4):
-        a,b,c = CRs[BA][i], CRs[BB][i], CRs[BC][i])
-        if mask[i] CRs[BT][i] = lut3(imm, a, b, c)
+        a,b,c = CRs[BF][i], CRs[BFA][i], CRs[BFB][i])
+        if msk[i] CRs[BF][i] = lut3(imm, a, b, c)
 
 This instruction is remarkably similar to the existing crops, `crand` etc.
 which have been noted to be a 4-bit (binary) LUT.  In effect `crternlogi`
-is the ternary LUT version of crops, having an 8-bit LUT.
+is the ternary LUT version of crops, having an 8-bit LUT.  However it
+is an overwrite instruction in order to save on register file ports,
+due to the mask requiring the contents of the BF to be both read and
+written.

 ## crbinlog
 
 With ternary (LUT3) dynamic instructions being very costly,
 and CR Fields being only 4 bit, a binary (LUT2) variant is better
 
-| 0.5|6.8 | 9.11|12.14|15.17|18.21|22...30  |31|
-| -- | -- | --- | --- | --- |-----| -------- |--|
-| NN | BT | BA  | BB  | BC  |m0-m3|000101110 |0 |
+CRB-Form:
+
+| 0.5|6.8 |9.10|11.13|14.15|16.18|19.25|26.30| 31|
+|----|----|----|-----|-----|-----|-----|-----|---|
+| NN | BF | msk|BFA  | msk | BFB | //  | XO  | //|
 
-    mask = m0..m3
     for i in range(4):
-        a,b = CRs[BA][i], CRs[BB][i])
-        if mask[i] CRs[BT][i] = lut2(CRs[BC], a, b)
+        a,b = CRs[BF][i], CRs[BF][i])
+        if msk[i] CRs[BF][i] = lut2(CRs[BFB], a, b)
 
 When SVP64 Vectorised any of the 4 operands may be Scalar or
-Vector, including `BC` meaning that multiple different dynamic
-lookups may be performed with a single instruction.
+Vector, including `BFB` meaning that multiple different dynamic
+lookups may be performed with a single instruction.  Note that
+this instruction is deliberately an overwrite in order to reduce
+the number of register file ports required: like `crternlogi`
+the contents of `BF` **must** be read due to the mask only
+writing back to non-masked-out bits of `BF`.
 
 *Programmer's note: just as with binlut and ternlogi, a pair
 of crbinlog instructions followed by a merging crternlogi may
-- 
2.30.2
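
Review note: the pseudocode in the added lines is not runnable as written (stray closing parenthesis, no colon after `if msk[i]`), and the new `crbinlog` reads `CRs[BF][i]` for both `a` and `b`. The sketch below is one possible executable reading of the masked-overwrite behaviour, not the specification. It rests on several assumptions that the patch does not state: each CR Field is modelled as a 4-bit integer with bit `i` at position `i`; the two 2-bit `msk` fields and the split `TLI` field are already joined into a single 4-bit `msk` and 8-bit `imm`; the LUT index is `c<<2 | b<<1 | a`; and the second `crbinlog` input is taken from `BFA`, which appears to be the intent. The helper names are hypothetical.

```python
# Hypothetical model of the patch's crternlogi / crbinlog pseudocode.
# Assumptions (not from the patch): CR Fields are 4-bit ints, bit i is
# (field >> i) & 1, and LUT entries are indexed LSB-first.

def bit(field, i):
    """Return bit i of a small integer field."""
    return (field >> i) & 1

def lut2(lut, a, b):
    """2-input lookup in a 4-bit table; index assumed to be b<<1 | a."""
    return (lut >> ((b << 1) | a)) & 1

def lut3(lut, a, b, c):
    """3-input lookup in an 8-bit table; index assumed to be c<<2 | b<<1 | a."""
    return (lut >> ((c << 2) | (b << 1) | a)) & 1

def crternlogi(crs, bf, bfa, bfb, imm, msk):
    """Masked ternary LUT: BF is both a source and the destination."""
    fa, fb, fc = crs[bf], crs[bfa], crs[bfb]  # snapshot the three reads
    result = fa                               # masked-out bits keep old BF
    for i in range(4):
        if bit(msk, i):
            r = lut3(imm, bit(fa, i), bit(fb, i), bit(fc, i))
            result = (result & ~(1 << i)) | (r << i)
    crs[bf] = result & 0xF                    # the single write

def crbinlog(crs, bf, bfa, bfb, msk):
    """Masked binary LUT whose 4-bit table comes from CR Field BFB.

    Assumption: b is read from BFA; the patch text reads BF twice.
    """
    fa, fb, table = crs[bf], crs[bfa], crs[bfb]
    result = fa
    for i in range(4):
        if bit(msk, i):
            r = lut2(table, bit(fa, i), bit(fb, i))
            result = (result & ~(1 << i)) | (r << i)
    crs[bf] = result & 0xF
```

Under this reading both instructions perform three CR Field reads (`BF`, `BFA`, `BFB`) and one write (`BF`), which is the 3-read 1-write budget named in the subject line. For instance, `crbinlog` with `CRs[BFB] = 0b0110` behaves as a per-bit XOR of `BF` and `BFA` on the bits selected by `msk`.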