# GRev/GOrC combination instruction design

The design is derived from a circuit for GRev made with muxes:

<img src="../grev_made_with_muxes.svg" width="100%" height="100%"/>
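
As a rough behavioural model of the mux circuit above, the following sketch treats GRev as one layer of 2:1 muxes per bit of `SH`. The 64-bit width, the name `grev_mux`, and the choice to process the lowest bit of `SH` first are assumptions made for illustration; for GRev the layer order does not change the result, since output bit `j` always ends up being input bit `j ^ SH`.

```python
WIDTH = 64  # assumed register width, not stated in the text above


def grev_mux(x: int, sh: int) -> int:
    """Model GRev as log2(WIDTH) layers of 2:1 muxes, one layer per SH bit."""
    for i in range(WIDTH.bit_length() - 1):  # layers 0..5 for 64 bits
        step = 1 << i
        sel = (sh >> i) & 1  # one bit of SH drives every mux in this layer
        out = 0
        for j in range(WIDTH):
            straight = (x >> j) & 1          # bit j passed straight through
            swapped = (x >> (j ^ step)) & 1  # partner bit, `step` positions away
            out |= (swapped if sel else straight) << j
        x = out
    return x


if __name__ == "__main__":
    # Applying the same shift amount twice restores the input.
    value = 0x0123456789ABCDEF
    print(hex(grev_mux(grev_mux(value, 37), 37)))
```

The per-bit loop is deliberately slow and literal so that it can be compared against the diagram mux by mux.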

First, we convert that circuit to use And-Or-Invert gates, since that is an efficient way to implement the muxes:

<img src="../grev_made_with_aoi_gates.svg" width="100%" height="100%" />

Notice that each And-Or-Invert gate has both a bit of `SH` and its complement `~SH` as inputs. Those two can be turned into separate inputs, each derived from that bit of `SH` by using the instruction's immediate as a pair of 2-bit look-up tables. This requires 4 bits of immediate (2 tables of 2 bits each).

This gives us our final design:

<img src="../grev_gorc_combination.svg" width="100%" height="100%" />
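
A minimal functional sketch of the combined design, under several assumptions: the names `lut_a` and `lut_b`, the bit order inside each 2-bit table (bit 0 is used when the `SH` bit is 0, bit 1 when it is 1), and the 64-bit width are all chosen for illustration and are not the instruction's actual encoding. The model computes `(a AND straight) OR (b AND swapped)` per layer and does not try to reproduce the inverting polarity of the real And-Or-Invert gates.

```python
# Masks selecting the low half of every pair swapped by layer i.
LAYER_MASKS = [
    0x5555555555555555,
    0x3333333333333333,
    0x0F0F0F0F0F0F0F0F,
    0x00FF00FF00FF00FF,
    0x0000FFFF0000FFFF,
    0x00000000FFFFFFFF,
]


def swap_pairs(x: int, i: int) -> int:
    """Exchange adjacent groups of 2**i bits across the 64-bit word."""
    step = 1 << i
    m = LAYER_MASKS[i]
    return ((x & m) << step) | ((x >> step) & m)


def grev_gorc(x: int, sh: int, lut_a: int, lut_b: int) -> int:
    """Each layer ORs a gated straight path with a gated swapped path."""
    for i in range(6):
        s = (sh >> i) & 1
        a = (lut_a >> s) & 1  # gate for the straight-through path
        b = (lut_b >> s) & 1  # gate for the swapped path
        straight = x if a else 0
        swapped = swap_pairs(x, i) if b else 0
        x = straight | swapped
    return x


# Hypothetical table settings: (lut_a, lut_b).
GREV_LUTS = (0b01, 0b10)  # a = ~SH bit, b = SH bit: plain mux behaviour
GORC_LUTS = (0b11, 0b10)  # a = 1,       b = SH bit: OR the swapped path in

if __name__ == "__main__":
    print(hex(grev_gorc(0x1, 7, *GREV_LUTS)))
    print(hex(grev_gorc(0x1, 7, *GORC_LUTS)))
```

With these settings the two tables reproduce `grev` and `gorc`; the remaining table values select other per-layer functions of the straight and swapped bits.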

Notice that this still has an overall circuit latency essentially equivalent to that of `grev` (or of shift/rotate). Notice also that the circuit can express much more than just the `grev` and `gorc` operations. Layers of XOR gates can be added at the input and output, allowing it to function as a `gandc` instruction too. That brings the total to 6 bits of immediate: 1 bit for inverting the input, 1 bit for inverting the output, and 4 bits for the look-up tables.
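
To illustrate why input and output inversion is enough, here is a small sketch assuming that `gandc` denotes the AND analogue of `gorc` (the text above does not define it). By De Morgan's law, inverting the input, running `gorc`, and inverting the output gives the same result as AND-combining directly. The function names and the 64-bit width are hypothetical.

```python
MASK64 = (1 << 64) - 1

LAYER_MASKS = [
    0x5555555555555555,
    0x3333333333333333,
    0x0F0F0F0F0F0F0F0F,
    0x00FF00FF00FF00FF,
    0x0000FFFF0000FFFF,
    0x00000000FFFFFFFF,
]


def swap_pairs(x: int, i: int) -> int:
    """Exchange adjacent groups of 2**i bits across the 64-bit word."""
    step = 1 << i
    m = LAYER_MASKS[i]
    return ((x & m) << step) | ((x >> step) & m)


def gorc(x: int, sh: int) -> int:
    """Reference OR-combine: OR each selected layer's swapped copy into x."""
    for i in range(6):
        if (sh >> i) & 1:
            x |= swap_pairs(x, i)
    return x


def gandc_via_xor(x: int, sh: int) -> int:
    """gorc wrapped in the input and output XOR (inversion) layers."""
    return gorc(x ^ MASK64, sh) ^ MASK64


def gandc_direct(x: int, sh: int) -> int:
    """Reference AND-combine, used only to check the identity."""
    for i in range(6):
        if (sh >> i) & 1:
            x &= swap_pairs(x, i)
    return x


if __name__ == "__main__":
    x = 0x0123456789ABCDEF
    print(all(gandc_via_xor(x, sh) == gandc_direct(x, sh) for sh in range(64)))
```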

We will also want versions of `grev` in which the shift amount is an immediate; these are needed for bitwise reversal, byte reversal, and other similar instructions. To save encoding space, the immediate-shift-amount version can be specified to always perform a `grev` (or perhaps only `grev`/`gorc`) operation, since I'd guess that is much more common than any of the other immediate-shift variants.
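
For reference, the shift amounts that such an immediate form would need for the common reversals can be checked with a short sketch. The 64-bit width and the constant names are assumptions for illustration; the values follow from output bit `j` of `grev` being input bit `j ^ SH`.

```python
LAYER_MASKS = [
    0x5555555555555555,
    0x3333333333333333,
    0x0F0F0F0F0F0F0F0F,
    0x00FF00FF00FF00FF,
    0x0000FFFF0000FFFF,
    0x00000000FFFFFFFF,
]


def grev(x: int, sh: int) -> int:
    """64-bit generalized reverse using one mask-and-shift swap per layer."""
    for i in range(6):
        if (sh >> i) & 1:
            step = 1 << i
            m = LAYER_MASKS[i]
            x = ((x & m) << step) | ((x >> step) & m)
    return x


# Hypothetical names for the immediate shift amounts.
BIT_REVERSE_64 = 63        # all six layers: reverse every bit
BYTE_REVERSE_64 = 56       # layers 3..5 only: reverse the byte order
BIT_REVERSE_IN_BYTES = 7   # layers 0..2 only: reverse bits within each byte

if __name__ == "__main__":
    x = 0x0123456789ABCDEF
    print(hex(grev(x, BYTE_REVERSE_64)))
    print(hex(grev(x, BIT_REVERSE_64)))
    print(hex(grev(x, BIT_REVERSE_IN_BYTES)))
```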

# Twin LUT4s

The AND/OR (AOI) gate-saving technique can also be applied to grevlut. TODO: produce a version of the diagram in SVG/DIA.

<img src="https://ftp.libre-soc.org/2022-05-17_11-05.png" width=800 />