From: lkcl
Date: Sat, 21 May 2022 14:02:31 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~2148
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=566122ef0f962628bed5dc65ba7a6a45926dfab9;p=libreriscv.git

---

diff --git a/openpower/sv/bitmanip.mdwn b/openpower/sv/bitmanip.mdwn
index 7bf2182d5..c18e01aa4 100644
--- a/openpower/sv/bitmanip.mdwn
+++ b/openpower/sv/bitmanip.mdwn
@@ -161,7 +161,9 @@ the [[sv/av_opcodes]])
 
 # binary and ternary bitops
 
-Similar to FPGA LUTs: for every bit perform a lookup into a table using an 8-8-bit immediate (for the ternary instructions), or in another register (4-bit
+Similar to FPGA LUTs: for two (binary) or three (ternary) inputs take
+bits from each input, concatenate them and
+perform a lookup into a table using an 8-8-bit immediate (for the ternary instructions), or in another register (4-bit
 for the binary instructions). The binary lookup instructions have CR Field
 lookup variants due to CR Fields being 4 bit.
 
@@ -206,6 +208,9 @@ For bincrlut, `BFA` selects the 4-bit CR Field as the LUT2:
     for i in range(64):
         RT[i] = lut2(CRs{BFA}, RB[i], RA[i])
 
+When Vectorised with SVP64, as usual both source and destination may be
+Vector or Scalar.
+
 *Programmer's note: a dynamic ternary lookup may be synthesised from
 a pair of `binlut` instructions followed by a `ternlogi` to select which
 to merge. Use `nh` to select which nibble to use as the lookup table
@@ -224,6 +229,10 @@ another mode selection would be CRs not Ints.
         a,b,c = CRs[BA][i], CRs[BB][i], CRs[BC][i])
         if mask[i] CRs[BT][i] = lut3(imm, a, b, c)
 
+This instruction is remarkably similar to the existing crops, `crand` etc.
+which have been noted to be a 4-bit (binary) LUT. In effect `crternlogi`
+is the ternary LUT version of crops, having an 8-bit LUT.
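
Reviewer's sketch, not part of the patch: the claim that `crand` is a 4-bit LUT and `crternlogi` its 8-bit ternary counterpart can be checked with a small Python model. The names `lut2` and `lut3` mirror the pseudocode in the diff. The index order (`a` as the least significant index bit), the LSB0 numbering of the table, and the names `CRAND_TABLE`, `MAJ3_TABLE` and `crternlogi_sketch` are assumptions made here for illustration and may not match the specification.

```python
def lut2(table, a, b):
    """Look up one bit in a 4-bit table, indexed by two input bits."""
    idx = (b << 1) | a          # assumed concatenation order
    return (table >> idx) & 1   # assumed LSB0 numbering of the table


def lut3(table, a, b, c):
    """Look up one bit in an 8-bit table, indexed by three input bits."""
    idx = (c << 2) | (b << 1) | a
    return (table >> idx) & 1


# Under the assumed ordering, AND sets only the entry for a=1, b=1.
CRAND_TABLE = 0b1000   # hypothetical name
# Three-input majority: entries 3, 5, 6 and 7 are set.
MAJ3_TABLE = 0xE8      # hypothetical name


def crternlogi_sketch(imm, bt, ba, bb, bc, mask=0b1111):
    """Model of the crternlogi loop on 4-bit CR Field values.

    Each of the 4 bits of the result is lut3(imm, a, b, c) where the
    corresponding mask bit is set; other bits of bt are left unchanged.
    """
    result = bt
    for i in range(4):
        if (mask >> i) & 1:
            bit = lut3(imm, (ba >> i) & 1, (bb >> i) & 1, (bc >> i) & 1)
            result = (result & ~(1 << i)) | (bit << i)
    return result
```

With these assumptions, `lut2(CRAND_TABLE, a, b)` equals `a & b` for all four input combinations, and `crternlogi_sketch(0x80, 0, ba, bb, bc)` computes the bitwise AND of three CR Fields.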
+
 ## crbinlog
 
 With ternary (LUT3) dynamic instructions being very costly,
@@ -238,6 +247,10 @@ and CR Fields being only 4 bit, a binary (LUT2) variant is better
         a,b = CRs[BA][i], CRs[BB][i])
         if mask[i] CRs[BT][i] = lut2(CRs[BC], a, b)
 
+When SVP64 Vectorised any of the 4 operands may be Scalar or
+Vector, including `BC` meaning that multiple different dynamic
+lookups may be performed with a single instruction.
+
 *Programmer's note: just as with binlut and ternlogi, a pair of crbinlog
 instructions followed by a merging crternlogi may be deployed to synthesise
 dynamic ternary (LUT3) CR Field