add grev/gorc design doc

author Jacob Lifshay <programmerjake@gmail.com>

Tue, 17 May 2022 02:26:13 +0000 (19:26 -0700)

committer Jacob Lifshay <programmerjake@gmail.com>

Tue, 17 May 2022 02:26:13 +0000 (19:26 -0700)
diff --git a/openpower/sv/bitmanip.mdwn b/openpower/sv/bitmanip.mdwn

index fbedb496b162aa8eee79f633f240d83312632582..2f208d36d1e1b630ca417bfb652f1cc40183560e 100644 (file)
--- a/openpower/sv/bitmanip.mdwn
+++ b/openpower/sv/bitmanip.mdwn
@@ -421,6 +421,8 @@ uint_xlen_t bmextrev(RA, RB, sh)
  
  # grevlut
  
+([3x lower latency alternative](bitmanip/grev_gorc_design.mdwn))
+
  generalised reverse combined with a pair of LUT2s and allowing
  a constant `0b0101...0101` when RA=0, and an option to invert
  (including when RA=0, giving a constant 0b1010...1010 as the
diff --git a/openpower/sv/bitmanip/grev_gorc_design.mdwn b/openpower/sv/bitmanip/grev_gorc_design.mdwn

new file mode 100644 (file)

index 0000000..9d2a2f6
--- /dev/null
+++ b/openpower/sv/bitmanip/grev_gorc_design.mdwn
@@ -0,0 +1,19 @@
+# GRev/GOrC combination instruction design
+
+The design is derived from a circuit for GRev made with muxes:
+
+![grev_made_with_muxes.svg](grev_made_with_muxes.svg)
+
+First, we convert that circuit to use And-Or-Invert gates, since that is an efficient way to implement the muxes:
+
+![grev_made_with_aoi_gates.svg](grev_made_with_aoi_gates.svg)
+
+Notice that each And-Or-Invert gate takes both a bit of `SH` and its complement `~SH` as inputs. Those can be converted to separate inputs, each derived from the bits of `SH` by using the instruction's immediate as a pair of 2-bit look-up tables. This requires 4 bits of immediate.
+
+This gives us our final design:
+
+![grev_gorc_combination.svg](grev_gorc_combination.svg)
+
+Notice that this still has an overall circuit latency essentially equivalent to grev's latency (or shift/rotate's latency). Also notice that this circuit can express much more than just the `grev` and `gorc` instructions. A final layer of XOR gates can be added at the input and output, allowing it to function as a `gandc` instruction too, requiring a total of 6 bits of immediate.
+
+We will also want versions of `grev` where the shift amount is an immediate (needed for bitwise reverse, byte reversal, and other similar instructions). The immediate-shift-amount version can be specified to always do a `grev` (or maybe only `grev`/`gorc`) operation to save encoding space, since I'd guess `grev` is much more common than any of the other immediate-shift variants.
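
As a reading aid for the design doc above, here is a minimal behavioural sketch in C of how one pair of 2-bit look-up tables can select between `grev` and `gorc`. It is not taken from the commit and makes several assumptions: the names `stage`, `grev_gorc_model`, `lut_keep` and `lut_swap` are hypothetical, the 64-bit width is chosen for illustration, and the assignment of immediate bits to the two tables is a guess. It models only the logical function of each mux stage (pass the bit through, take the partner bit, or OR the two). It ignores the inversions of the And-Or-Invert gates and the optional XOR layers.

```c
#include <assert.h>
#include <stdint.h>

/* One butterfly stage k: every bit is paired with the bit 2^k positions away.
 * keep != 0 passes the bit through in place, swap != 0 passes the partner bit;
 * with both set the two are ORed together, with neither the output is 0. */
static uint64_t stage(uint64_t x, unsigned k, int keep, int swap)
{
    static const uint64_t masks[6] = {
        0x5555555555555555ULL, 0x3333333333333333ULL, 0x0F0F0F0F0F0F0F0FULL,
        0x00FF00FF00FF00FFULL, 0x0000FFFF0000FFFFULL, 0x00000000FFFFFFFFULL
    };
    unsigned d = 1u << k;
    uint64_t m = masks[k];
    uint64_t partner = ((x & m) << d) | ((x & ~m) >> d);
    return (keep ? x : 0) | (swap ? partner : 0);
}

/* Hypothetical model: lut_keep and lut_swap are the two 2-bit look-up tables
 * (4 bits of immediate in total). Bit k of sh indexes both tables for stage k,
 * replacing the fixed ~SH / SH inputs of the plain grev circuit. */
uint64_t grev_gorc_model(uint64_t x, unsigned sh,
                         unsigned lut_keep, unsigned lut_swap)
{
    for (unsigned k = 0; k < 6; k++) {
        unsigned b = (sh >> k) & 1;
        x = stage(x, k, (lut_keep >> b) & 1, (lut_swap >> b) & 1);
    }
    return x;
}

/* Assumed table values: keep = ~SH, swap = SH gives grev;
 * keep = 1, swap = SH gives gorc. */
#define LUT_GREV_KEEP 0x1u /* table: SH bit 0 -> 1, SH bit 1 -> 0 */
#define LUT_GORC_KEEP 0x3u /* table: always 1 */
#define LUT_SWAP      0x2u /* table: SH bit 0 -> 0, SH bit 1 -> 1 */
```

Under these assumptions, `grev_gorc_model(x, 63, LUT_GREV_KEEP, LUT_SWAP)` is a full bit reversal, a shift amount of 56 is a byte swap, and `grev_gorc_model(x, 7, LUT_GORC_KEEP, LUT_SWAP)` ORs the bits within each byte. The other table values correspond to the additional operations the doc mentions the circuit can express.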