# grevlut
+([3x lower latency alternative](bitmanip/grev_gorc_design.mdwn))
+
generalised reverse combined with a pair of LUT2s and allowing
a constant `0b0101...0101` when RA=0, and an option to invert
(including when RA=0, giving a constant 0b1010...1010 as the
--- /dev/null
+# GRev/GOrC combination instruction design
+
+The design is derived from a circuit for GRev made with muxes:
+
+![grev_made_with_muxes.svg](grev_made_with_muxes.svg)
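
As a rough software model of the mux-based circuit above, the sketch below treats each layer as a 2:1 mux that picks either the unchanged word or the word with adjacent blocks swapped, selected by one bit of `SH`. The 64-bit width and the names `MASKS` and `grev64` are assumptions made for illustration, not taken from the design page.

```python
# Masks selecting the low half of each block pair, for block sizes 1..32 bits.
MASKS = [
    0x5555555555555555,
    0x3333333333333333,
    0x0F0F0F0F0F0F0F0F,
    0x00FF00FF00FF00FF,
    0x0000FFFF0000FFFF,
    0x00000000FFFFFFFF,
]
ALL_ONES = (1 << 64) - 1


def grev64(x, sh):
    """Generalised reverse on a 64-bit value (illustrative model only)."""
    x &= ALL_ONES
    for i, lo in enumerate(MASKS):
        step = 1 << i
        # Swap adjacent blocks of `step` bits.
        swapped = ((x & lo) << step) | ((x >> step) & lo)
        # 2:1 mux: bit i of sh selects the swapped or the unchanged word.
        x = swapped if (sh >> i) & 1 else x
    return x
```

With `sh = 63` every layer swaps, giving a full bit reversal; with `sh = 56` only the byte, halfword and word layers swap, giving a byte reversal.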
+
+First, we convert that circuit to use And-Or-Invert (AOI) gates, since they are an efficient way to implement the muxes:
+
+![grev_made_with_aoi_gates.svg](grev_made_with_aoi_gates.svg)
+
+Notice that each And-Or-Invert gate takes both a bit of `SH` and its complement `~SH` as inputs. Those two inputs can be made independent: each is driven by a 2-bit look-up table, held in the instruction's immediate and indexed by the corresponding bit of `SH`. The pair of look-up tables requires 4 bits of immediate.
+
+This gives us our final design:
+
+![grev_gorc_combination.svg](grev_gorc_combination.svg)
+
+Note that the overall circuit latency is still essentially the same as that of `grev` (or of a shift/rotate). The circuit can also express many more operations than just `grev` and `gorc`. Adding a layer of XOR gates at the input and another at the output lets it function as a `gandc` instruction too, bringing the total to 6 bits of immediate.
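
To make the look-up-table idea concrete, here is a minimal behavioural sketch. It assumes a particular reading of the design: each layer computes `(x AND keep) OR (swapped AND swap)`, where `keep` and `swap` each come from a 2-bit table indexed by the layer's `SH` bit. The per-gate inversion of the real And-Or-Invert gates and the optional XOR layers are left out, and the names `grev_lut64`, `keep_lut` and `swap_lut` are hypothetical.

```python
MASKS = [
    0x5555555555555555,
    0x3333333333333333,
    0x0F0F0F0F0F0F0F0F,
    0x00FF00FF00FF00FF,
    0x0000FFFF0000FFFF,
    0x00000000FFFFFFFF,
]
ALL_ONES = (1 << 64) - 1


def grev_lut64(x, sh, keep_lut, swap_lut):
    """Model of the combined circuit; each *_lut is a 2-bit table (assumed layout)."""
    x &= ALL_ONES
    for i, lo in enumerate(MASKS):
        step = 1 << i
        swapped = ((x & lo) << step) | ((x >> step) & lo)
        sh_bit = (sh >> i) & 1
        # Each table is indexed by this layer's SH bit and yields one enable bit.
        keep = ALL_ONES if (keep_lut >> sh_bit) & 1 else 0
        swap = ALL_ONES if (swap_lut >> sh_bit) & 1 else 0
        # And-Or stage with the two enables now independent of each other.
        x = (x & keep) | (swapped & swap)
    return x


# Under these assumptions, grev keeps when SH=0 and swaps when SH=1,
# while gorc always keeps and additionally ORs in the swap when SH=1.
GREV = (0b01, 0b10)
GORC = (0b11, 0b10)
```

For example, `grev_lut64(x, 56, *GREV)` reverses the bytes of `x`, while `grev_lut64(x, 7, *GORC)` sets every bit of each byte that contains at least one set bit.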
+
+We will also want versions of `grev` that take the shift amount as an immediate (needed for bit reversal, byte reversal and similar instructions). To save encoding space, the immediate-shift-amount version can be specified to always perform a `grev` (or perhaps only `grev`/`gorc`) operation, since I'd guess that is much more common than any of the other immediate-shift variants.