# RFC ls002 Floating-Point Load-Immediate

**URLs**:

* <https://libre-soc.org/openpower/sv/int_fp_mv/#fmvis>
* <https://libre-soc.org/openpower/sv/rfc/ls002/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=944>
* <https://git.openpower.foundation/isa/PowerISA/issues/87>

**Severity**: Major

**Status**: New

**Date**: 03 Oct 2022

**Target**: v3.2

**Source**: v3.0B

**Books and Section affected**:

```
    Book I Scalar Floating-Point 4.6.2.1
    Appendix D Power ISA sorted by opcode
    Appendix E Power ISA sorted by version
    Appendix F Power ISA sorted by mnemonic
```

**Summary**

```
    Instructions added
    fmvis - Floating-Point Move Immediate, Single
    fishmv - Floating-Point Immediate, Second-half Move
    (Potentially 64-bit prefixed of the same)
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
    Addition of two new FPR-based instructions
    (potentially 3 if EXT001 Prefixed variants added)
```

**Impact on software**:

```
    Requires support for new instructions in assembler, debuggers,
    and related tools.
```

**Keywords**:

```
    FPR, Floating-point, Load-immediate, BF16, FP32
```

**Motivation**

Similar to `lxvkq`, but extended to a full BF16 in one
32-bit instruction and a full FP32 in two 32-bit instructions,
these instructions always save a Data Load and the associated L1
and TLB lookups. Even clearing an FPR to zero presently requires a Load.

**Notes and Observations**:

1. There is no need for an Rc=1 variant because this is an
  immediate-loading instruction (an FPR equivalent of `li`).
2. There is no need for Special Registers (FP Flags) because this
  is an immediate-loading instruction.  No FPR Load operations
  alter `FPSCR`, nor does `lxvkq`, and on that basis neither
  should these instructions.
3. EXT001 variants, which would save similar Data-Load and Data-TLB
  lookups, are mentioned for completeness but are not included as part
  of this RFC. Another Stakeholder with a vested interest in 64-bit
  Prefixed instructions may wish to consider submitting them.
4. `fishmv` as an FRS-only Read-Modify-Write (instead of an unnecessary
  FRS,FRA pair) saves five potential bits, making
  the difference between a 5-bit XO (VA/DX-Form) and requiring an entire
  Primary Opcode.

**Changes**

Add the following entries to the Appendices, and add the instructions
below to Book I as a new Section 4.6.2.1.

----------------

# Appendices

    Appendix D Power ISA sorted by opcode
    Appendix E Power ISA sorted by version
    Appendix F Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description |
|------|------|------|---------|----------|-------------|
| DX   | I    | #    | 3.0B    | fmvis    | Floating-point Move Immediate, Single |
| DX   | I    | #    | 3.0B    | fishmv   | Floating-point Immediate, Second-half Move |

\newpage{}

# Floating-Point Move Immediate

`fmvis FRS, D`

|  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form    |
|--------|------|-------|-------|-------|-----|---------|
|  Major | FRS  | d1    | d0    | XO    | d2  | DX-Form |
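
The field layout above can be illustrated with a minimal Python sketch. It assumes, as the pseudocode for this instruction states, that the 16-bit immediate is `D = d0 || d1 || d2` with `d0` most significant. The helper names `split_d`, `encode_dx` and `decode_d` are hypothetical, and `major` and `xo` are left as parameters because this text does not assign opcode values.

```python
def split_d(d):
    """Split a 16-bit immediate D into the DX-Form fields (d0, d1, d2)."""
    if not 0 <= d <= 0xFFFF:
        raise ValueError("D must fit in 16 bits")
    d0 = (d >> 6) & 0x3FF  # upper 10 bits of D, instruction bits 16-25
    d1 = (d >> 1) & 0x1F   # next 5 bits of D, instruction bits 11-15
    d2 = d & 0x1           # lowest bit of D, instruction bit 31
    return d0, d1, d2


def encode_dx(major, frs, d, xo):
    """Assemble a 32-bit DX-Form word (MSB0 bit numbering, as in the table).

    `major` and `xo` are placeholders supplied by the caller: no opcode
    values are assumed here.
    """
    if not (0 <= major <= 0x3F and 0 <= frs <= 0x1F and 0 <= xo <= 0x1F):
        raise ValueError("field out of range")
    d0, d1, d2 = split_d(d)
    return (major << 26) | (frs << 21) | (d1 << 16) | (d0 << 6) | (xo << 1) | d2


def decode_d(word):
    """Recover the 16-bit immediate D from a DX-Form instruction word."""
    d0 = (word >> 6) & 0x3FF
    d1 = (word >> 16) & 0x1F
    d2 = word & 0x1
    return (d0 << 6) | (d1 << 1) | d2
```

For example, `split_d(0x3F80)` yields `(254, 0, 0)`, and `decode_d(encode_dx(major, frs, d, xo))` returns `d` for any in-range field values.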

Pseudocode:

```
    bf16 <- d0 || d1 || d2 # create BF16 immediate
    fp32 <- bf16 || [0]*16 # convert BF16 to FP32
    FRS <- DOUBLE(fp32)    # convert FP32 to FP64
```

Special registers altered:

    None

Reinterprets `D << 16` (where `D` is the 16-bit immediate
`d0 || d1 || d2`) as a 32-bit float, which is then converted to a
64-bit float and written to `FRS`.  This is equivalent to reinterpreting
`D` as a `BF16` and converting to 64-bit float.

Examples:

```
    fmvis f4, 0 # writes +0.0 to f4 (clears an FPR)
    fmvis f4, 0x8000 # writes -0.0 to f4
    fmvis f4, 0x3F80 # writes +1.0 to f4
    fmvis f4, 0xBFC0 # writes -1.5 to f4
    fmvis f4, 0x7FC0 # writes +qNaN to f4
    fmvis f4, 0x7F80 # writes +Infinity to f4
    fmvis f4, 0xFF80 # writes -Infinity to f4
    fmvis f4, 0x3FFF # writes +1.9921875 to f4
```
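
The examples above can be cross-checked with a small Python model of the pseudocode. This is a sketch only: the function names `fmvis` and `fp64_bits` are hypothetical, Python's `float` stands in for the FP64 contents of `FRS`, and exact NaN payload handling is not modelled.

```python
import struct


def fmvis(d):
    """Return the FP64 value that `fmvis FRS, D` would write, per the pseudocode."""
    if not 0 <= d <= 0xFFFF:
        raise ValueError("D must fit in 16 bits")
    fp32_bits = d << 16  # bf16 || [0]*16
    # Reinterpret the 32-bit pattern as an FP32; the resulting Python float
    # is an FP64, which plays the role of DOUBLE(fp32).
    (value,) = struct.unpack(">f", struct.pack(">I", fp32_bits))
    return value


def fp64_bits(x):
    """Return the raw 64-bit pattern of a Python float (for inspection)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits


if __name__ == "__main__":
    for d in (0x0000, 0x8000, 0x3F80, 0xBFC0, 0x7F80, 0xFF80, 0x3FFF):
        print(f"fmvis f4, 0x{d:04X} -> {fmvis(d)!r}")
```

Under these assumptions `fmvis(0x3F80)` returns `1.0`, `fmvis(0xBFC0)` returns `-1.5` and `fmvis(0x3FFF)` returns `1.9921875`, matching the comments in the examples.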

# Floating-Point Immediate Second-Half Move

`fishmv FRS, D`

DX-Form:

|  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form    |
|--------|------|-------|-------|-------|-----|---------|
|  Major | FRS  | d1    | d0    | XO    | d2  | DX-Form |

Pseudocode:

```
    n <- (FRS)                     # read FRS
    fp32 <- SINGLE(n)              # convert to FP32
    fp32[16:31] <- d0 || d1 || d2  # replace LSB half
    FRS <- DOUBLE(fp32)            # convert back to FP64
```

Special registers altered:

    None

An additional 16 bits of immediate are
inserted into `FRS` to extend its precision to
a full FP32, which is then stored in the usual FP64 Format within the FPR.

**This instruction performs a Read-Modify-Write.** *FRS is read, the
additional
16-bit immediate is inserted, and the result is written back to FRS.
This is strategically similar to how `li` combined with `oris` is
used to construct 32-bit Integers.
`fishmv` may be macro-op-fused with `fmvis`.*
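
The Read-Modify-Write behaviour can be sketched in Python along the lines of the pseudocode. The function names `fishmv` and `fp64_bits` are hypothetical. The sketch assumes the incoming `FRS` value is already exactly representable as an FP32 (as it is after `fmvis`); Python's `struct` conversion rounds or raises for other inputs, which is not necessarily what `SINGLE()` does.

```python
import struct


def fishmv(frs_value, d):
    """Model `fishmv FRS, D`: replace the low 16 bits of FRS viewed as an FP32."""
    if not 0 <= d <= 0xFFFF:
        raise ValueError("D must fit in 16 bits")
    # SINGLE(n): view the register contents as a 32-bit FP32 pattern.
    (fp32_bits,) = struct.unpack(">I", struct.pack(">f", frs_value))
    # fp32[16:31] <- D: keep the upper half, replace the lower half.
    fp32_bits = (fp32_bits & 0xFFFF0000) | d
    # DOUBLE(fp32): reinterpret as FP32; the Python float returned is an FP64.
    (result,) = struct.unpack(">f", struct.pack(">I", fp32_bits))
    return result


def fp64_bits(x):
    """Return the raw 64-bit pattern of a Python float (for inspection)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits


if __name__ == "__main__":
    value = fishmv(1.0, 0x8000)  # FRS previously set to +1.0
    print(value, hex(fp64_bits(value)))
```

With `FRS` holding `+1.0`, `fishmv(1.0, 0x8000)` gives `1.00390625`, whose FP64 bit pattern is `0x3ff0100000000000`.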

Programmer's note:
If a prior `fmvis` instruction has been used to
set the upper 16 bits of an FP32 value, `fishmv` may be used
to set the lower 16 bits.

Example:

```
    # these two combined instructions write 0x3f808000
    # into f4 as an FP32 to be converted to an FP64.
    # actual contents in f4 after conversion: 0x3ff0_1000_0000_0000
    # first the upper bits, happens to be +1.0
    fmvis f4, 0x3F80 # writes +1.0 to f4
    # now write the lower 16 bits of an FP32
    fishmv f4, 0x8000 # writes +1.00390625 to f4
```
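
A tool generating such a pair needs to split an FP32 constant into its two 16-bit immediates. The following Python sketch shows one way this might be done; `fp32_immediates` and `load_fp32_constant` are hypothetical helpers, not part of any existing assembler. Omitting `fishmv` when the lower half is zero is a choice made here, relying on `fmvis` already zeroing those bits.

```python
import struct


def fp32_immediates(value):
    """Return (hi, lo): the 16-bit immediates for fmvis and fishmv.

    `value` is rounded to the nearest FP32 by struct.pack before splitting.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits >> 16, bits & 0xFFFF


def load_fp32_constant(reg, value):
    """Return assembly lines that load `value` (as FP32) into FPR `reg`."""
    hi, lo = fp32_immediates(value)
    lines = [f"fmvis {reg}, 0x{hi:04X}"]
    if lo != 0:
        # Only needed when the constant does not fit in a BF16.
        lines.append(f"fishmv {reg}, 0x{lo:04X}")
    return lines


if __name__ == "__main__":
    print("\n".join(load_fp32_constant("f4", 1.00390625)))
```

For `1.00390625` this produces the `fmvis f4, 0x3F80` / `fishmv f4, 0x8000` pair from the example, while a constant such as `-1.5` needs only `fmvis f4, 0xBFC0`.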

[[!tag opf_rfc]]

-------------