From: Jacob Lifshay
Date: Tue, 17 May 2022 02:26:13 +0000 (-0700)
Subject: add grev/gorc design doc
X-Git-Tag: opf_rfc_ls005_v1~2187
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=d3a2483e42b0f0615b06158959e8edaaf77039c2;p=libreriscv.git

add grev/gorc design doc
---

diff --git a/openpower/sv/bitmanip.mdwn b/openpower/sv/bitmanip.mdwn
index fbedb496b..2f208d36d 100644
--- a/openpower/sv/bitmanip.mdwn
+++ b/openpower/sv/bitmanip.mdwn
@@ -421,6 +421,8 @@ uint_xlen_t bmextrev(RA, RB, sh)
 
 # grevlut
 
+([3x lower latency alternative](bitmanip/grev_gorc_design.mdwn))
+
 generalised reverse combined with a pair of LUT2s and allowing
 a constant `0b0101...0101` when RA=0, and an option to invert
 (including when RA=0, giving a constant 0b1010...1010 as the
diff --git a/openpower/sv/bitmanip/grev_gorc_design.mdwn b/openpower/sv/bitmanip/grev_gorc_design.mdwn
new file mode 100644
index 000000000..9d2a2f60a
--- /dev/null
+++ b/openpower/sv/bitmanip/grev_gorc_design.mdwn
@@ -0,0 +1,19 @@
+# GRev/GOrC combination instruction design
+
+The design is derived from a circuit for GRev made with muxes:
+
+![grev_made_with_muxes.svg](grev_made_with_muxes.svg)
+
+First, we convert that circuit to use And-Or-Invert gates, since that's an efficient way to implement the muxes:
+
+![grev_made_with_aoi_gates.svg](grev_made_with_aoi_gates.svg)
+
+Notice how each And-Or-Invert gate has both a bit of `SH` and the corresponding bit of `~SH` as inputs? Those can be converted to separate inputs, each still controlled by the bits of `SH`, by using the instruction's immediate as a pair of 2-bit look-up tables. This requires 4 bits of immediate.
+
+This gives us our final design:
+
+![grev_gorc_combination.svg](grev_gorc_combination.svg)
+
+Notice how this still has an overall circuit latency that is essentially equivalent to grev's latency (or shift/rotate's latency). Also notice how this circuit allows specifying much more than just `grev` or `gorc` instructions. A final layer of XOR gates can be added at the input and output, allowing it to function as a `gandc` instruction too, requiring a total of 6 bits of immediate.
+
+We will also want versions of `grev` where the shift amount is an immediate (needed for bit reversal, byte reversal, and other similar instructions). The immediate-shift-amount version can be specified to always do a `grev` (or maybe only `grev`/`gorc`) operation to save encoding space, since I'd guess it's much more common than any of the other immediate-shift variants.
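
To make the LUT idea in the new design doc concrete, here is a minimal behavioural sketch in Python. It is not taken from the commit: the names `grev_gorc_combo`, `swap_stage`, `GREV_LUTS` and `GORC_LUTS`, the LUT bit ordering, and the 64-bit default width are all assumptions made for illustration. It also models each layer as a plain, non-inverting and-or stage rather than real And-Or-Invert gates, and it leaves out the optional XOR layers, so it shows only how a pair of 2-bit LUTs indexed by each `SH` bit can select between `grev` and `gorc` behaviour.

```python
XLEN = 64  # assumed register width; the design doc does not fix one

# Hypothetical LUT encodings: bit n of each 2-bit LUT is the control value
# used when the current SH bit equals n. Together they use 4 immediate bits.
GREV_LUTS = (0b01, 0b10)  # keep when SH bit is 0, swap when SH bit is 1
GORC_LUTS = (0b11, 0b10)  # always keep, and OR in the partner when SH bit is 1


def swap_stage(value, step, xlen):
    """Return value with every bit i replaced by bit (i XOR step)."""
    swapped = 0
    for i in range(xlen):
        swapped |= ((value >> (i ^ step)) & 1) << i
    return swapped


def grev_gorc_combo(ra, sh, luts, xlen=XLEN):
    """Behavioural model: one and-or layer per bit of sh.

    Assumption: layers are non-inverting, so this ignores the inversion
    that real And-Or-Invert gates would introduce at each layer.
    """
    lut_keep, lut_swap = luts
    value = ra & ((1 << xlen) - 1)
    stage = 0
    while (1 << stage) < xlen:
        sh_bit = (sh >> stage) & 1
        keep = (lut_keep >> sh_bit) & 1  # gates the straight-through input
        swap = (lut_swap >> sh_bit) & 1  # gates the swapped-partner input
        swapped = swap_stage(value, 1 << stage, xlen)
        value = (value if keep else 0) | (swapped if swap else 0)
        stage += 1
    return value
```

Under these assumptions, `grev_gorc_combo(0x01, 7, GORC_LUTS)` returns `0xFF` (the set bit is OR-combined across its byte), and `grev_gorc_combo(x, 56, GREV_LUTS)` byte-reverses a 64-bit `x`. The other LUT values give the additional operations the doc alludes to, but their usefulness is not evaluated here.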