openpower/sv/int_fp_mv.mdwn

   1 [[!tag standards]]
   2
   3 # FPR-to-GPR and GPR-to-FPR
   4
   5 **Draft Status** under development, for submission as an RFC
   6
   7 Links:
   8
   9 * <https://bugs.libre-soc.org/show_bug.cgi?id=650>
  10 * <https://bugs.libre-soc.org/show_bug.cgi?id=230#c71>
  11 * <https://bugs.libre-soc.org/show_bug.cgi?id=230#c74>
  12 * <https://bugs.libre-soc.org/show_bug.cgi?id=230#c76>
  13 * [[int_fp_mv/appendix]]
  14
  15 Trademarks:
  16
  17 * Rust is a Trademark of the Rust Foundation
* Java and JavaScript are Trademarks of Oracle
  19 * LLVM is a Trademark of the LLVM Foundation
  20 * SPIR-V is a Trademark of the Khronos Group
  21 * OpenCL is a Trademark of Apple, Inc.
  22
  23 Referring to these Trademarks within this document
  24 is by necessity, in order to put the semantics of each language
  25 into context, and is considered "fair use" under Trademark
  26 Law.
  27
  28 Introduction:
  29
High-performance CPU/GPU software often needs to convert between integers
and floating-point, so fast conversion/data-movement instructions
are needed.  In addition, initialisation of floats tends to take up
considerable space (even just to load 0.0), so the inclusion of two compact
float-immediate instructions using 16-bit immediates is up for
consideration. The first loads a BF16 immediate; the second allows a
full-accuracy FP32 to be constructed.
  37
  38 Libre-SOC will be compliant with the
  39 **Scalar Floating-Point Subset** (SFFS) i.e. is not implementing VMX/VSX,
  40 and with its focus on modern 3D GPU hybrid workloads represents an
  41 important new potential use-case for OpenPOWER.
  42
Prior to the formation of the Compliancy Levels, first introduced
in v3.0C and v3.1,
the progressive historic development of the Scalar parts of the Power ISA assumed
that VSX would always be there to complement it. However, with VMX/VSX
**not available** in the newly-introduced SFFS Compliancy Level, the
existing non-VSX conversion/data-movement instructions require
a sequence of load/store
instructions (slow and expensive) to transfer data between the FPRs and
the GPRs.  For a modern 3D GPU this kills any possibility of a
competitive edge.
Also, because SimpleV needs efficient scalar instructions in
order to generate efficient vector instructions, adding new instructions
for data-transfer/conversion between FPRs and GPRs multiplies the savings.
  56
  57 In addition, the vast majority of GPR <-> FPR data-transfers are as part
  58 of a FP <-> Integer conversion sequence, therefore reducing the number
  59 of instructions required is a priority.
  60
  61 Therefore, we are proposing adding:
  62
  63 * FPR load-immediate instructions, one equivalent to `BF16`, the
  64   other increasing accuracy to `FP32`
  65 * FPR <-> GPR data-transfer instructions that just copy bits without conversion
  66 * FPR <-> GPR combined data-transfer/conversion instructions that do
  67   Integer <-> FP conversions
  68
  69 If adding new Integer <-> FP conversion instructions,
  70 the opportunity may be taken to modernise the instructions and make them
  71 well-suited for common/important conversion sequences:
  72
  73 * **standard IEEE754** - used by most languages and CPUs
  74 * **standard OpenPOWER** - saturation with NaN
  75   converted to minimum valid integer
  76 * **Java** - saturation with NaN converted to 0
  77 * **JavaScript** - modulo wrapping with Inf/NaN converted to 0
  78
The assembly listings in the [[int_fp_mv/appendix]] show how costly
some of these language-specific conversions are: JavaScript, the
worst case, is 32 scalar instructions including seven branch instructions.
  82
  83 # Proposed New Scalar Instructions
  84
All of the following instructions use the standard OpenPower conversion to/from 64-bit float format when reading/writing a 32-bit float from/to a FPR.  All integers, however, are sourced from or stored in the *GPR*.

Integer operands and results being in the GPR is the key differentiator
(and the entire rationale) between the proposed instructions and the existing Scalar Power ISA.
  89 In all existing Power ISA Scalar conversion instructions, all
  90 operands are FPRs, even if the format of the source or destination
  91 data is actually a scalar integer.
  92
  93 *(The existing Scalar instructions being FP-FP only is based on an assumption
  94 that VSX will be implemented, and VSX is not part of the SFFS Compliancy
  95 Level. An earlier version of the Power ISA used to have similar
  96 FPR<->GPR instructions to these:
  97 they were deprecated due to this incorrect assumption that VSX would
  98 always be present).*
  99
 100 Note that source and destination widths can be overridden by SimpleV
 101 SVP64, and that SVP64 also has Saturation Modes *in addition*
 102 to those independently described here. SVP64 Overrides and Saturation
 103 work on *both* Fixed *and* Floating Point operands and results.
The interactions with SVP64
are explained in the [[int_fp_mv/appendix]].
 106
 107 # Float load immediate  <a name="fmvis"></a>
 108
 109 These are like a variant of `fmvfg` and `oris`, combined.
 110 Power ISA currently requires a large
 111 number of instructions to get Floating Point constants into registers.
 112 `fmvis` on its own is equivalent to BF16 to FP32/64 conversion,
 113 but if followed up by `frlsi` an additional 16 bits of accuracy in the
 114 mantissa may be achieved.
 115
*IBM may consider it worthwhile to extend these two instructions to
v3.1 Prefixed (`pfmvis` and `pfrlsi`). If so, it is recommended that
`pfmvis` load a full FP32 immediate and `pfrlsi` supply the three high
missing exponent bits (numbered 8 to 10) and the lower additional
29 mantissa bits (23 to 51) needed to construct a full FP64 immediate.*
 121
 122 ## Load BF16 Immediate
 123
 124 `fmvis FRS, D`
 125
 126 Reinterprets `D << 16` as a 32-bit float, which is then converted to a
 127 64-bit float and written to `FRS`.  This is equivalent to reinterpreting
 128 `D` as a `BF16` and converting to 64-bit float.
 129 There is no need for an Rc=1 variant because this is an immediate loading
 130 instruction.
 131
 132 Example:
 133
 134 ```
 135 # clearing a FPR
 136 fmvis f4, 0 # writes +0.0 to f4
 137 # loading handy constants
 138 fmvis f4, 0x8000 # writes -0.0 to f4
 139 fmvis f4, 0x3F80 # writes +1.0 to f4
 140 fmvis f4, 0xBF80 # writes -1.0 to f4
 141 fmvis f4, 0xBFC0 # writes -1.5 to f4
 142 fmvis f4, 0x7FC0 # writes +qNaN to f4
 143 fmvis f4, 0x7F80 # writes +Infinity to f4
 144 fmvis f4, 0xFF80 # writes -Infinity to f4
 145 fmvis f4, 0x3FFF # writes +1.9921875 to f4
 146
 147 # clearing 128 FPRs with 2 SVP64 instructions
 148 # by issuing 32 vec4 (subvector length 4) ops
 149 setvli VL=MVL=32
 150 sv.fmvis/vec4 f0, 0 # writes +0.0 to f0-f127
 151 ```
 152 Important: If the float load immediate instruction(s) are left out,
 153 change all [GPR to FPR conversion instructions](#GPR-to-FPR-conversions)
 154 to instead write `+0.0` if `RA` is register `0`, at least
 155 allowing clearing FPRs.
 156
 157 `fmvis` fits with DX-Form:
 158
 159 |  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form |
 160 |--------|------|-------|-------|-------|-----|-----|
 161 |  Major | FRS  | d1    | d0    | XO    | d2  | DX-Form |
 162
 163 Pseudocode:
 164
 165     bf16 = d0 || d1 || d2
 166     fp32 = bf16 || [0]*16
 167     FRS = Single_to_Double(fp32)
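
The pseudocode above can be modelled in a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `fmvis` and the use of Python's `struct` module to stand in for `Single_to_Double` are assumptions made for illustration, and signalling-NaN payloads may not survive the host's float conversion.

```python
import struct

def fmvis(d: int) -> int:
    """Return the 64-bit FPR image after `fmvis FRS, D` (D is a 16-bit immediate)."""
    assert 0 <= d <= 0xFFFF
    fp32_bits = d << 16  # fp32 = bf16 || [0]*16
    # Reinterpret the 32-bit pattern as an IEEE754 single.
    (value,) = struct.unpack(">f", struct.pack(">I", fp32_bits))
    # Single_to_Double: Python floats are FP64, so repacking as a
    # double yields the bit pattern that would be held in the FPR.
    (fp64_bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return fp64_bits
```

For example, `hex(fmvis(0x3F80))` gives `0x3ff0000000000000`, the FP64 encoding of +1.0.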
 168
 169 ## Float Replace Lower-Half Single, Immediate <a name="frlsi"></a>
 170
 171 `frlsi FRS, D`
 172
 173 DX-Form:
 174
 175 |  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form |
 176 |--------|------|-------|-------|-------|-----|-----|
 177 |  Major | FRS  | d1    | d0    | XO    | d2  | DX-Form |
 178
Strategically similar to how `oris` is used to construct
32-bit Integers, an additional 16 bits of immediate are
inserted into `FRS` to extend its accuracy to
a full FP32 (stored as usual in FP64 Format within the FPR).
If a prior `fmvis` instruction has been used to
set the upper 16 bits of an FP32 value, `frlsi` supplies the
lower 16 bits.

The key difference between using `li` and `oris` to construct 32-bit
GPR Immediates and using `fmvis` and `frlsi` is that `fmvis` will
already have converted the `BF16` immediate to FP64 (Double) format.
This is taken into consideration,
as can be seen in the pseudocode below.
 192
 193 Pseudocode:
 194
 195     fp32 = Double_to_Single(FRS)
 196     n = fp32[0:15] || d0 || d1 || d2
 197     FRS = Single_to_Double(n)
 198
*This instruction performs a Read-Modify-Write. FRS is read, the additional
16-bit immediate is inserted, and the result is written back to FRS.*
 201
 202 Example:
 203
 204 ```
 205 # these two combined instructions write 0x3f808000
 206 # into f4 as an FP32 to be converted to an FP64.
 207 # actual contents in f4 after conversion: 0x3ff0_1000_0000_0000
 208 # first the upper bits, happens to be +1.0
 209 fmvis f4, 0x3F80 # writes +1.0 to f4
 210 # now write the lower 16 bits of an FP32
 211 frlsi f4, 0x8000 # writes +1.00390625 to f4
 212 ```
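
The Read-Modify-Write behaviour can be checked against the worked example with a short Python model. This is only a sketch under stated assumptions: the helper names are hypothetical, `struct` is used to approximate `Double_to_Single` and `Single_to_Double`, and the model is only meaningful when `FRS` already holds a value representable as FP32 (as it does after `fmvis`). Python raises `OverflowError` for doubles too large to pack as a single.

```python
import struct

def double_bits_to_single_bits(fp64_bits: int) -> int:
    """Approximate Double_to_Single on raw bit patterns."""
    (value,) = struct.unpack(">d", struct.pack(">Q", fp64_bits))
    (fp32_bits,) = struct.unpack(">I", struct.pack(">f", value))
    return fp32_bits

def single_bits_to_double_bits(fp32_bits: int) -> int:
    """Approximate Single_to_Double on raw bit patterns."""
    (value,) = struct.unpack(">f", struct.pack(">I", fp32_bits))
    (fp64_bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return fp64_bits

def frlsi(frs_bits: int, d: int) -> int:
    """Return the new FRS image after `frlsi FRS, D`."""
    assert 0 <= d <= 0xFFFF
    fp32 = double_bits_to_single_bits(frs_bits)
    # fp32[0:15] in Power ISA (MSB0) numbering is the upper half:
    # keep it and replace the lower 16 bits with the immediate.
    n = (fp32 & 0xFFFF0000) | d
    return single_bits_to_double_bits(n)
```

Starting from the FP64 image of +1.0 (`0x3FF0000000000000`, as left by `fmvis f4, 0x3F80`), `frlsi(0x3FF0000000000000, 0x8000)` returns `0x3FF0100000000000`, matching the register contents quoted in the example.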
 213
 214 # Moves
 215
 216 These instructions perform a straight unaltered bit-level copy from one Register
 217 File to another.
 218
## FPR to GPR moves
 220
 221 * `fmvtg RT, FRA`
 222 * `fmvtg. RT, FRA`
 223
Move a 64-bit float from a FPR to a GPR, just copying bits directly.
As this is a direct bitcopy, no exceptions occur and no status flags are set.
 226
 227 Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
 228 operations.
 229
 230 * `fmvtgs RT, FRA`
 231 * `fmvtgs. RT, FRA`
 232
Move a 32-bit float from a FPR to a GPR, with no Integer conversion. Converts the
64-bit float in `FRA` to a 32-bit float, then writes the bits of the 32-bit float to
`RT`. Effectively, `fmvtgs` is a macro-fusion of `frsp fmvtg`
and therefore has exactly the same exception and flags behaviour as `frsp`.
 237
 238 Unlike `frsp` however, with RT being a GPR, Rc=1 follows
 239 standard *integer* behaviour, i.e. tests RT and sets CR0.
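
A minimal Python sketch of the two FPR-to-GPR moves and the Rc=1 test follows. The function names mirror the mnemonics but are otherwise hypothetical. Two points are assumptions not stated in the text above: that `fmvtgs` leaves the upper 32 bits of `RT` as zero, and that narrowing uses round-to-nearest-even (real hardware would follow `frsp` and honour `FPSCR`). Exception and flag behaviour is not modelled.

```python
import struct

MASK64 = (1 << 64) - 1

def fmvtg(fra_bits: int) -> int:
    """fmvtg: copy all 64 bits of the FPR into the GPR unchanged."""
    return fra_bits & MASK64

def fmvtgs(fra_bits: int) -> int:
    """fmvtgs: narrow the FP64 in FRA to FP32 and return its bit pattern."""
    (value,) = struct.unpack(">d", struct.pack(">Q", fra_bits & MASK64))
    # Assumption: host default rounding (nearest-even), not FPSCR.
    (fp32_bits,) = struct.unpack(">I", struct.pack(">f", value))
    return fp32_bits  # assumption: upper 32 bits of RT are zero

def cr0_of(rt: int) -> str:
    """Rc=1: test RT as a signed 64-bit integer (SO bit omitted)."""
    signed = rt - (1 << 64) if rt >> 63 else rt
    return "LT" if signed < 0 else "GT" if signed > 0 else "EQ"
```

Note that `cr0_of(fmvtg(...))` reports `LT` for any negative float, because the FP sign bit lands in the integer sign position.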
 240
## GPR to FPR moves
 242
 243 `fmvfg FRT, RA`
 244
Move a 64-bit float from a GPR to a FPR, just copying bits. No exceptions
are raised and no flags of any kind are altered.
 247
 248 Rc=1 tests FRT and sets CR1
 249
 250 `fmvfgs FRT, RA`
 251
Move a 32-bit float from a GPR to a FPR, with no Integer conversion. Converts the
32-bit float in `RA` to a 64-bit float, then writes the 64-bit float to
`FRT`. Effectively, `fmvfgs` is a macro-fusion of `fmvfg frsp` and
therefore has exactly the same exception and flags behaviour as `frsp`.
 256
 257 Rc=1 tests FRT and sets CR1
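
The GPR-to-FPR direction can be sketched in the same way. The names below are hypothetical, and the sketch assumes that `fmvfgs` takes the FP32 pattern from the low 32 bits of `RA`, which the text above does not state explicitly. CR1 and exception behaviour are deliberately left out.

```python
import struct

MASK64 = (1 << 64) - 1

def fmvfg(ra: int) -> int:
    """fmvfg: copy all 64 bits of the GPR into the FPR unchanged."""
    return ra & MASK64

def fmvfgs(ra: int) -> int:
    """fmvfgs: widen the FP32 pattern in RA to the FP64 image held in FRT."""
    fp32_bits = ra & 0xFFFFFFFF  # assumption: low 32 bits of RA are used
    (value,) = struct.unpack(">f", struct.pack(">I", fp32_bits))
    # Widening single to double is exact, so no rounding is involved.
    (fp64_bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return fp64_bits
```

For instance, `fmvfgs(0x3F808000)` returns `0x3FF0100000000000`, the same FP64 image that the `fmvis`/`frlsi` pair produces for 1.00390625.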
 258
TODO: add a clear statement, following evaluation, as to whether exceptions or flags are raised as part of the **FP** conversion (not the integer bitcopy part, the conversion part). The semantics should really be the same as `frsp`.
 260
 261 v3.0C section 4.6.7.1 states:
 262
 263 FPRF is set to the class and sign of the result, except for Invalid Operation Exceptions when VE=1.
 264
 265     Special Registers Altered:
 266       FPRF FR FI
 267       FX OX UX XX VXSNAN
 268       CR1 (if Rc=1)
 269
 270 # Conversions
 271
 272 Unlike the move instructions
 273 these instructions perform conversions between Integer and
 274 Floating Point. Truncation can therefore occur, as well
 275 as exceptions.
 276
 277 Mode values:
 278
 279 | Mode | `rounding_mode` | Semantics                        |
 280 |------|-----------------|----------------------------------|
 281 | 000  | from `FPSCR`    | [OpenPower semantics]            |
 282 | 001  | Truncate        | [OpenPower semantics]            |
 283 | 010  | from `FPSCR`    | [Java semantics]                 |
 284 | 011  | Truncate        | [Java semantics]                 |
 285 | 100  | from `FPSCR`    | [JavaScript semantics]           |
 286 | 101  | Truncate        | [JavaScript semantics]           |
 287 | rest | --              | illegal instruction trap for now |
 288
 289 [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
 290 [Java semantics]: #fp-to-int-java-conversion-semantics
 291 [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
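
The Mode table can be read as two independent fields: the lowest bit selects the rounding source and the upper two bits select the semantics. The following Python sketch illustrates that reading; the function name `decode_mode` and the returned strings are hypothetical, and raising `ValueError` merely stands in for the illegal instruction trap.

```python
def decode_mode(mode: int):
    """Split a 3-bit Mode value into (rounding_mode, semantics)."""
    if not 0 <= mode <= 0b111:
        raise ValueError("Mode is a 3-bit field")
    if mode > 0b101:
        # 110 and 111 are the reserved "rest" row of the table.
        raise ValueError("illegal instruction")
    rounding = "Truncate" if mode & 1 else "FPSCR"
    semantics = ("OpenPower", "Java", "JavaScript")[mode >> 1]
    return rounding, semantics
```

For example, `decode_mode(0b011)` returns `("Truncate", "Java")`.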
 292
 293 ## GPR to FPR conversions
 294
 295 **Format**
 296
 297 |  0-5   | 6-10 | 11-15  | 16-25 | 26-30 | 31 | Form |
 298 |--------|------|--------|-------|-------|----|------|
 299 |  Major | FRT  | //Mode | RA    | XO    | Rc |X-Form|
 300
 301 All of the following GPR to FPR conversions use the rounding mode from `FPSCR`.
 302
 303 * `fcvtfgw FRT, RA`
 304   Convert from 32-bit signed integer in the GPR `RA` to 64-bit float in
 305   `FRT`.
 306 * `fcvtfgws FRT, RA`
 307   Convert from 32-bit signed integer in the GPR `RA` to 32-bit float in
 308   `FRT`.
 309 * `fcvtfguw FRT, RA`
 310   Convert from 32-bit unsigned integer in the GPR `RA` to 64-bit float in
 311   `FRT`.
 312 * `fcvtfguws FRT, RA`
 313   Convert from 32-bit unsigned integer in the GPR `RA` to 32-bit float in
 314   `FRT`.
 315 * `fcvtfgd FRT, RA`
 316   Convert from 64-bit signed integer in the GPR `RA` to 64-bit float in
 317   `FRT`.
 318 * `fcvtfgds FRT, RA`
 319   Convert from 64-bit signed integer in the GPR `RA` to 32-bit float in
 320   `FRT`.
 321 * `fcvtfgud FRT, RA`
 322   Convert from 64-bit unsigned integer in the GPR `RA` to 64-bit float in
 323   `FRT`.
 324 * `fcvtfguds FRT, RA`
 325   Convert from 64-bit unsigned integer in the GPR `RA` to 32-bit float in
 326   `FRT`.
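
As an illustration, the four 32-bit-source conversions can be sketched in Python. This is a hedged model only: it assumes the integer is taken from the low 32 bits of `RA`, and it assumes round-to-nearest-even (the host default) rather than reading the rounding mode from `FPSCR`. The 64-bit-source variants are omitted because converting via a Python double and then narrowing would round twice.

```python
import struct

def to_single(value: float) -> float:
    """Round a double to the nearest FP32 value (host rounding mode)."""
    return struct.unpack(">f", struct.pack(">f", value))[0]

def fcvtfgw(ra: int) -> float:
    """Signed 32-bit integer to 64-bit float (always exact)."""
    low = ra & 0xFFFFFFFF  # assumption: low 32 bits of RA are the source
    signed = low - (1 << 32) if low & 0x80000000 else low
    return float(signed)

def fcvtfguw(ra: int) -> float:
    """Unsigned 32-bit integer to 64-bit float (always exact)."""
    return float(ra & 0xFFFFFFFF)

def fcvtfgws(ra: int) -> float:
    """Signed 32-bit integer to 32-bit float (may round)."""
    return to_single(fcvtfgw(ra))

def fcvtfguws(ra: int) -> float:
    """Unsigned 32-bit integer to 32-bit float (may round)."""
    return to_single(fcvtfguw(ra))
```

The same register value `0xFFFFFFFF` converts to `-1.0` with `fcvtfgw` and to `4294967295.0` with `fcvtfguw`, which is why separate signed and unsigned instructions are needed.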
 327
 328 ## FPR to GPR (Integer) conversions
 329
 330 <div id="fpr-to-gpr-conversion-mode"></div>
 331
Different programming languages turn out to have completely different
semantics for FP to Integer conversion.  Below is an overview
of the different variants, listing the languages and hardware that
implement each variant.
 336
 337 **Standard IEEE754 conversion**
 338
This conversion is outlined in the IEEE754 specification.  It is used
by nearly all programming languages and CPUs.  In the case of OpenPOWER,
the rounding mode is read from FPSCR.
 342
 343 **Standard OpenPower conversion**
 344
 345 This conversion, instead of exact IEEE754 Compliance, performs
 346 "saturation with NaN converted to minimum valid integer". This
 347 is also exactly the same as the x86 ISA conversion semantics.
 348 OpenPOWER however has instructions for both:
 349
 350 * rounding mode read from FPSCR
 351 * rounding mode always set to truncate
 352
 353 **Java conversion**
 354
For the sake of simplicity, the FP -> Integer conversion semantics generalized from those used by Java (and by Rust's `as` operator) will be referred to as
 356 [Java conversion semantics](#fp-to-int-java-conversion-semantics).
 357
 358 Those same semantics are used in some way by all of the following languages (not necessarily for the default conversion method):
 359
 360 * Java's
 361   [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 362 * Rust's FP -> Integer conversion using the
 363   [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 364 * LLVM's
 365   [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
 366   [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
 367 * SPIR-V's OpenCL dialect's
 368   [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
 369   [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
  instructions when decorated with
  [the `SaturatedConversion` decoration](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
* WebAssembly has also introduced
  [`trunc_sat_u`](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
  [`trunc_sat_s`](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
 375
 376 **JavaScript conversion**
 377
For the sake of simplicity, the FP -> Integer conversion semantics generalized from those used by JavaScript's `ToInt32` abstract operation will be referred to as [JavaScript conversion semantics](#fp-to-int-javascript-conversion-semantics).
 379
An instruction with these semantics is present in ARM assembler as `FJCVTZS`:
 381 <https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
 382
 383 **Format**
 384
 385 |  0-5   | 6-10 | 11-15  | 16-25 | 26-30 | 31 | Form |
 386 |--------|------|--------|-------|-------|----|------|
 387 |  Major | RT   | //Mode | FRA   | XO    | Rc |X-Form|
 388
 389 **Rc=1 and OE=1**
 390
All of these instructions have an Rc=1 mode which sets CR0
 392 in the normal way for any instructions producing a GPR result.
 393 Additionally, when OE=1, if the numerical value of the FP number
 394 is not 100% accurately preserved (due to truncation or saturation
 395 and including when the FP number was NaN) then this is considered
 396 to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
 397 are all set as normal for any GPR instructions that overflow.
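
The overflow condition described above amounts to asking whether the integer result is numerically equal to the original FP value. A minimal Python sketch of just that test is given below; the function name is hypothetical, and the sketch does not model the actual setting of CR0.SO, XER.SO or XER.OV.

```python
import math

def conversion_sets_overflow(v: float, result: int) -> bool:
    """True when `result` does not exactly preserve the value of `v`."""
    if math.isnan(v) or math.isinf(v):
        # NaN and infinities can never be represented as an integer.
        return True
    # Python compares float and int exactly, so this catches both
    # lost fractions (truncation/rounding) and saturation.
    return v != result
```

For example, converting `1.5` to `1` counts as overflow under this rule, whereas converting `2.0` to `2` does not.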
 398
 399 **Instructions**
 400
 401 * `fcvttgw RT, FRA, Mode`
 402   Convert from 64-bit float to 32-bit signed integer, writing the result
 403   to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctiw` or `fctiwz`
 404 * `fcvttguw RT, FRA, Mode`
 405   Convert from 64-bit float to 32-bit unsigned integer, writing the result
 406   to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctiwu` or `fctiwuz`
 407 * `fcvttgd RT, FRA, Mode`
 408   Convert from 64-bit float to 64-bit signed integer, writing the result
 409   to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctid` or `fctidz`
 410 * `fcvttgud RT, FRA, Mode`
 411   Convert from 64-bit float to 64-bit unsigned integer, writing the result
 412   to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctidu` or `fctiduz`
 413 * `fcvtstgw RT, FRA, Mode`
 414   Convert from 32-bit float to 32-bit signed integer, writing the result
 415   to the GPR `RT`. Converts using [mode `Mode`]
 416 * `fcvtstguw RT, FRA, Mode`
 417   Convert from 32-bit float to 32-bit unsigned integer, writing the result
 418   to the GPR `RT`. Converts using [mode `Mode`]
 419 * `fcvtstgd RT, FRA, Mode`
 420   Convert from 32-bit float to 64-bit signed integer, writing the result
 421   to the GPR `RT`. Converts using [mode `Mode`]
 422 * `fcvtstgud RT, FRA, Mode`
 423   Convert from 32-bit float to 64-bit unsigned integer, writing the result
 424   to the GPR `RT`. Converts using [mode `Mode`]
 425
 426 [mode `Mode`]: #fpr-to-gpr-conversion-mode
 427
 428 ## FP to Integer Conversion Pseudo-code
 429
 430 Key for pseudo-code:
 431
 432 | term                      | result type | definition                                                                                         |
 433 |---------------------------|-------------|----------------------------------------------------------------------------------------------------|
 434 | `fp`                      | --          | `f32` or `f64` (or other types from SimpleV)                                                       |
 435 | `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                              |
 436 | `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                                |
 437 | `int::BITS`               | `int`       | the bit-width of `int`                                                                             |
| `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                                                            |
| `uint::MAX_VALUE`         | `uint`      | the maximum value `uint` can store: `2^int::BITS - 1`                                              |
| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store: `-2^(int::BITS-1)`                                              |
| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store: `2^(int::BITS-1) - 1`                                           |
| `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`); too big to fit in `int`            |
 443 | `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode`      |
 444
 445 <div id="fp-to-int-openpower-conversion-semantics"></div>
 446 OpenPower conversion semantics (section A.2 page 999 (page 1023) of OpenPower ISA v3.1):
 447
 448 ```
 449 def fp_to_int_open_power<fp, int>(v: fp) -> int:
 450     if v is NaN:
 451         return int::MIN_VALUE
 452     if v >= int::MAX_VALUE:
 453         return int::MAX_VALUE
 454     if v <= int::MIN_VALUE:
 455         return int::MIN_VALUE
 456     return (int)rint(v, rounding_mode)
 457 ```
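
A runnable Python rendering of the OpenPower pseudo-code is sketched below for experimentation. The `bits`, `signed` and `rint` parameters are assumptions introduced to replace the generic `int` type and the implicit `rounding_mode`: `rint` defaults to `math.trunc` (Truncate), and Python's built-in `round` can be passed for round-to-nearest-even. `FPSCR` itself is not modelled.

```python
import math

def fp_to_int_open_power(v: float, bits: int, signed: bool, rint=math.trunc) -> int:
    """Saturating conversion with NaN mapped to the minimum valid integer."""
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return lo
    # Range checks come first, so rint() never sees an infinity.
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return rint(v)
```

For the unsigned types `lo` is `0`, so NaN and all negative inputs convert to `0`.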
 458
 459 <div id="fp-to-int-java-conversion-semantics"></div>
 460 [Java conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
 461 /
 462 [Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
 463 (with adjustment to add non-truncate rounding modes):
 464
 465 ```
 466 def fp_to_int_java<fp, int>(v: fp) -> int:
 467     if v is NaN:
 468         return 0
 469     if v >= int::MAX_VALUE:
 470         return int::MAX_VALUE
 471     if v <= int::MIN_VALUE:
 472         return int::MIN_VALUE
 473     return (int)rint(v, rounding_mode)
 474 ```
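
The Java variant differs from the OpenPower one only in its NaN result, as the following Python sketch shows. As before, the `bits`, `signed` and `rint` parameters are illustrative assumptions standing in for the generic `int` type and `rounding_mode`, with `math.trunc` as the default to match Java's own truncating behaviour.

```python
import math

def fp_to_int_java(v: float, bits: int, signed: bool, rint=math.trunc) -> int:
    """Saturating conversion with NaN mapped to 0."""
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return 0
    # Saturate before rounding so infinities are handled here.
    if v >= hi:
        return hi
    if v <= lo:
        return lo
    return rint(v)
```

With `bits=32, signed=True` this matches what a Java `(int)` cast of a `double` produces for NaN, infinities and out-of-range values.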
 475
 476 <div id="fp-to-int-javascript-conversion-semantics"></div>
 477 Section 7.1 of the ECMAScript / JavaScript
 478 [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32) (with adjustment to add non-truncate rounding modes):
 479
 480 ```
 481 def fp_to_int_java_script<fp, int>(v: fp) -> int:
 482     if v is NaN or infinite:
 483         return 0
 484     v = rint(v, rounding_mode)  # assume no loss of precision in result
 485     v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
 486     bits = (uint)v
 487     return (int)bits
 488 ```
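
The modulo-wrapping behaviour is easiest to see with concrete values, so a Python sketch of the pseudo-code follows. The `bits`, `signed` and `rint` parameters are assumptions added for illustration (JavaScript's own `ToInt32` corresponds to `bits=32`, `signed=True` and truncation). Python integers are arbitrary precision, which satisfies the "no loss of precision" requirement on the rounded value.

```python
import math

def fp_to_int_java_script(v: float, bits: int = 32, signed: bool = True,
                          rint=math.trunc) -> int:
    """Modulo-wrapping conversion with NaN and infinities mapped to 0."""
    if math.isnan(v) or math.isinf(v):
        return 0
    n = rint(v)       # exact Python int, however large
    n %= 1 << bits    # non-negative, since the modulus is positive
    if signed and n >= 1 << (bits - 1):
        n -= 1 << bits  # reinterpret the bit pattern as two's complement
    return n
```

For example, `fp_to_int_java_script(2147483648.0)` returns `-2147483648`, where the saturating variants would return `2147483647`.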
 489