# RFC ls014 Advanced Scalar Bitmanipulation

**URLs**:

*
*
*

**Severity**: Major

**Status**: New

**Date**: 14 Apr 2023

**Target**: v3.2B

**Source**: v3.1B

**Books and Section affected**:

```
    Book I Fixed-Point Instructions
    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic
```

**Summary**

```
    Instructions added: bmask, grevlut, grevluti
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
    Addition of new GPR-based instructions
```

**Impact on software**:

```
    Requires support for new instructions in assembler, debuggers,
    and related tools.
```

**Keywords**:

```
    LUTs, Bitmanipulation, GPR
```

**Motivation**

Other high-end ISAs have had scalar Bitmanipulation (BMI) subsets for over a
decade. Their use and benefits are well understood and compiler integration is
well established. `bmask` brings *twenty four* BMI instructions to the Power ISA.

`grevlut` on the other hand is highly experimental and extremely powerful.
Normally only `grev` (Generalised Reverse) and occasionally `gor` are added to
a Bitmanip-strong ISA: grevlut utilises LUTs and inversion to add 512
Generalised Reverse instructions. Desirable savings in general binary size are
achieved.

**Notes and Observations**:

1. bmask is a synthesis and generalisation of every "TBM" instruction, with
   additional options not found in any other ISA BMI group.
2. grevluti as a 32-bit Defined Word is capable of generating over a thousand
   useful regular-patterned 64-bit "magic constants" that otherwise require
   either a Load or several instructions to synthesise.
3. Word, halfword, byte, nibble, 2-bit and 1-bit reversal at multiple levels
   are all achieved with grevlut. Some of these instructions were explicitly
   added in Power ISA v3.1 but grevlut is akin to xxeval.
4. grevlut can be expensive in hardware (estimated 20,000 gates) but like
   xxeval provides 512 equivalent instructions.

**Changes**

Add the following entries to:

* the Appendices of Book I
* Book I 3.3.13 Fixed-Point Logical Instructions
* Book I 1.6.1 and 1.6.2

----------

\newpage{}

# Rationale

## bmask

Based on RVV masked set-before-first, set-after-first etc. and on Intel and AMD
Bitmanip instructions, generalised and then advanced further to include masks,
this is a single instruction covering 24 individual instructions in other ISAs.

The patterns within the pseudocode for AMD TBM and x86 BMI1 are as follows:

* first pattern A: two options `x` or `~x`
* second pattern B: three options `|` `&` or `^`
* third pattern C: four options `x+1`, `x-1`, `~(x+1)` or `(~x)+1`

Thus it makes sense to create a single instruction that covers all of these
(2 x 3 x 4 = 24 combinations). A crucial addition, essential for Scalable
Vector usage as Predicate Masks, is the second mask parameter (RB). The
additional parameter, L, if set, leaves bits of RA masked by RB unaltered;
otherwise those bits are set to zero. Note that when `RB=0` the mask is set to
all ones instead of being read from the register file.

Executable pseudocode demo:

```
[[!inline pages="openpower/sv/bmask.py" quick="yes" raw="yes" ]]
```

----------

\newpage{}

[[!inline pages="openpower/sv/vector_ops" raw=yes ]]

----------

\newpage{}

# Instruction Formats

Add the following entries to Book I 1.6.1 Word Instruction Formats:

## MM-FORM

```
    |0   |6    |11   |16   |21   |24 |25  |31  |
    | PO | FRT | FRA | FRB | FMM     | XO | Rc |
    | PO | RT  | RA  | RB  | MMM | / | XO | Rc |
```

Add the following new fields to Book I 1.6.2 Word Instruction Fields:

```
    FMM (21:24)
        Field used to specify minimum/maximum mode for fminmax[s].

        Formats: MM

    MMM (21:23)
        Field used to specify minimum/maximum mode for integer minmax.

        Formats: MM
```

Add `MM` to the `Formats:` list for all of `FRT`, `FRA`, `FRB`, `XO (25:30)`,
`Rc`, `RT`, `RA` and `RB`.

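
As an informal illustration of the integer row of the MM-FORM layout, the following Python sketch packs and unpacks the fields using the Power ISA convention that bit 0 is the most significant bit of the 32-bit word. It is not part of the proposed specification text. The helper names (`extract_field`, `encode_mm_int`, `decode_mm_int`) are hypothetical, and the `PO` and `XO` values in the usage note are placeholders chosen only to exercise the code, since this RFC does not state the allocated opcodes.

```python
# Sketch only: field positions are copied from the MM-FORM layout
# (integer row). Bit numbering is MSB0: bit 0 is the most significant
# bit of the 32-bit instruction word.
MM_FORM_INT = {
    "PO": (0, 5),
    "RT": (6, 10),
    "RA": (11, 15),
    "RB": (16, 20),
    "MMM": (21, 23),
    # bit 24 is the reserved "/" bit and is left as zero
    "XO": (25, 30),
    "Rc": (31, 31),
}


def extract_field(insn: int, start: int, end: int) -> int:
    """Return bits start..end (inclusive, MSB0) of a 32-bit word."""
    width = end - start + 1
    shift = 31 - end  # distance of the field's last bit from the LSB
    return (insn >> shift) & ((1 << width) - 1)


def encode_mm_int(fields: dict) -> int:
    """Pack named field values into a 32-bit word (hypothetical helper)."""
    insn = 0
    for name, (start, end) in MM_FORM_INT.items():
        width = end - start + 1
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        insn |= value << (31 - end)
    return insn


def decode_mm_int(insn: int) -> dict:
    """Unpack a 32-bit word into the named MM-FORM integer fields."""
    return {
        name: extract_field(insn, start, end)
        for name, (start, end) in MM_FORM_INT.items()
    }
```

For example, `decode_mm_int(encode_mm_int({"PO": 22, "RT": 3, "RA": 4, "RB": 5, "MMM": 6, "XO": 33, "Rc": 1}))` returns the same field values it was given. The floating-point row differs only in that `FMM` occupies bits 21:24, so there is no reserved bit.
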

----------

\newpage{}

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | Mnemonic | Description                     |
|------|------|------|---------|----------|---------------------------------|
| MM   | I    | #    | 3.2B    | fminmax  | Floating Minimum/Maximum        |
| MM   | I    | #    | 3.2B    | fminmaxs | Floating Minimum/Maximum Single |
| MM   | I    | #    | 3.2B    | minmax   | Minimum/Maximum                 |

[[!tag opf_rfc]]