diff --git a/openpower/sv/int_fp_mv.mdwn b/openpower/sv/int_fp_mv.mdwn

index 796de2b479c2649b8e480bb12342856db5bcd577..27eecaf7ab204ee85327fc0eebd86df4ade139c0 100644
--- a/openpower/sv/int_fp_mv.mdwn
+++ b/openpower/sv/int_fp_mv.mdwn
@@ -1,3 +1,5 @@
+[[!tag standards]]
+
  # FPR-to-GPR and GPR-to-FPR
  
  **Draft Status** under development, for submission as an RFC
@@ -15,15 +17,17 @@ High-performance CPU/GPU software needs to often convert between integers
  and floating-point, therefore fast conversion/data-movement instructions
  are needed.  Also given that initialisation of floats tends to take up
  considerable space (even to just load 0.0) the inclusion of compact
-format float immediate is up for consideration using BF16
+format float immediate is up for consideration using BF16 as a base.
  
  Libre-SOC will be compliant with the
  **Scalar Floating-Point Subset** (SFFS) i.e. is not implementing VMX/VSX,
  and with its focus on modern 3D GPU hybrid workloads represents an
  important new potential use-case for OpenPOWER.
  
-The progressive development of the Scalar parts of the Power ISA assumed
-that VSX would be there to complement it. However With VMX/VSX 
+Prior to the formation of the Compliancy Levels, first introduced
+in v3.0C and v3.1,
+the progressive historic development of the Scalar parts of the Power ISA assumed
+that VSX would always be there to complement it. However, with VMX/VSX 
  **not available** in the newly-introduced SFFS Compliancy Level, the
  existing non-VSX conversion/data-movement instructions require load/store
  instructions (slow and expensive) to transfer data between the FPRs and
@@ -50,7 +54,7 @@ well suited for common/important conversion sequences:
  * standard Integer -> FP IEEE754 conversion (used by most languages and CPUs)
  * standard OpenPower FP -> Integer conversion (saturation with NaN
    converted to minimum valid integer)
-* Rust FP -> Integer conversion (saturation with NaN converted to 0)
+* Java FP -> Integer conversion (saturation with NaN converted to 0)
  * JavaScript FP -> Integer conversion (modular with Inf/NaN converted to 0)
  
  The assembly listings in the [[int_fp_mv/appendix]] show how costly
@@ -80,16 +84,17 @@ OpenPOWER however has instructions for both:
  * rounding mode read from FPSCR
  * rounding mode always set to truncate
  
-### Rust FP -> Integer conversion
+### Java FP -> Integer conversion
  
-For the sake of simplicity, the FP -> Integer conversion semantics generalized from those used by Rust's `as` operator will be referred to as [Rust conversion semantics](#fp-to-int-rust-conversion-semantics).
+For the sake of simplicity, the FP -> Integer conversion semantics generalized from those used by Java (and by Rust's `as` operator) will be referred to as
+[Java conversion semantics](#fp-to-int-java-conversion-semantics).
  
  Those same semantics are used in some way by all of the following languages (not necessarily for the default conversion method):
  
-* Rust's FP -> Integer conversion using the
-  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
  * Java's
    [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
+* Rust's FP -> Integer conversion using the
+  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
  * LLVM's
    [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
    [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
@@ -103,14 +108,29 @@ Those same semantics are used in some way by all of the following languages (not
  
  For the sake of simplicity, the FP -> Integer conversion semantics generalized from those used by JavaScript's `ToInt32` abstract operation will be referred to as [JavaScript conversion semantics](#fp-to-int-javascript-conversion-semantics).
  
+ARM provides this conversion as a single instruction, `FJCVTZS`:
+<https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
+
  ### Other languages
  
  TODO: review and investigate other language semantics
  
  # Proposed New Scalar Instructions
  
-All of the following instructions use the standard OpenPower conversion to/from 64-bit float format when reading/writing a 32-bit float from/to a FPR. This can be overridden by SimpleV, as explained in the 
- [[int_fp_mv/appendix]]
+All of the following instructions use the standard OpenPower conversion to/from 64-bit float format when reading/writing a 32-bit float from/to a FPR.  All integers however are sourced/stored in the *GPR*.
+
+Keeping integer operands and results in the GPR is what distinguishes the proposed instructions
+from the existing Scalar Power ISA, and is their entire rationale.
+In all existing Power ISA Scalar conversion instructions, all
+operands are FPRs, even if the format of the source or destination
+data is actually a scalar integer.
+
+Note that source and destination widths can be overridden by SimpleV
+SVP64, and that SVP64 also has Saturation Modes *in addition*
+to those independently described here. SVP64 Overrides and Saturation
+work on *both* Fixed *and* Floating Point operands and results.
+The interactions with SVP64
+are explained in the [[int_fp_mv/appendix]].
  
  ## FPR to GPR moves
  
@@ -118,15 +138,21 @@ All of the following instructions use the standard OpenPower conversion to/from
  * `fmvtg. RT, FRA`
  
  move a 64-bit float from a FPR to a GPR, just copying bits directly.
-Rc=1 tests RT and sets CR0
+As a direct bitcopy, no exceptions occur and no status flags are set.
+
+Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
+operations.
  
  * `fmvtgs RT, FRA`
  * `fmvtgs. RT, FRA`
  
  move a 32-bit float from a FPR to a GPR, just copying bits. Converts the
  64-bit float in `FRA` to a 32-bit float, then writes the 32-bit float to
-`RT`.
-Rc=1 tests RT and sets CR0
+`RT`. Effectively, `fmvtgs` is a macro-fusion of `frsp fmvtg`
+and therefore has exactly the same exception and flags behaviour as `frsp`.
+
+Unlike `frsp` however, with RT being a GPR, Rc=1 follows
+standard *integer* behaviour, i.e. tests RT and sets CR0.
  
  ## GPR to FPR moves
  
@@ -135,15 +161,16 @@ Rc=1 tests RT and sets CR0
  move a 64-bit float from a GPR to a FPR, just copying bits. No exceptions
  are raised, no flags are altered of any kind.
  
-TODO: Rc=1 variants?
+Rc=1 tests FRT and sets CR1
  
  `fmvfgs FRT, RA`
  
  move a 32-bit float from a GPR to a FPR, just copying bits. Converts the
  32-bit float in `RA` to a 64-bit float, then writes the 64-bit float to
-`FRT`. Effectively, `fmvfgs` is a macro-fusion of `fmvfg frsp`.
+`FRT`. Effectively, `fmvfgs` is a macro-fusion of `fmvfg frsp` and
+therefore has exactly the same exception and flags behaviour as `frsp`.
  
-TODO: Rc=1 variants?
+Rc=1 tests FRT and sets CR1
  
  TODO: clear statement on evaluation as to whether exceptions or flags raised as part of the **FP** conversion (not the int bitcopy part, the conversion part.  the semantics should really be the same as frsp)
  
@@ -156,7 +183,9 @@ FPRF is set to the class and sign of the result, except for Invalid Operation Ex
        FX OX UX XX VXSNAN
        CR1 (if Rc=1)
  
-### Float load immediate (kinda a variant of `fmvfg`)
+### Float load immediate <a name="fmvis"></a>
+
+This is like a variant of `fmvfg`
  
  `fmvis FRT, FI`
  
@@ -164,6 +193,10 @@ Reinterprets `FI << 16` as a 32-bit float, which is then converted to a
  64-bit float and written to `FRT`.  This is equivalent to reinterpreting
  `FI` as a `BF16` and converting to 64-bit float.
  
+There is no need for an Rc=1 variant because this is an immediate loading
+instruction. This frees up one extra bit in the X-Form format for packing
+a full `BF16`.
+
  Example:
  
  ```
@@ -189,16 +222,13 @@ change all [GPR to FPR conversion instructions](#GPR-to-FPR-conversions)
  to instead write `+0.0` if `RA` is register `0`, at least
  allowing clearing FPRs.
  
-|  0-5   | 6-10 | 11-25 | 26-30 | 31  |
-|--------|------|-------|-------|-----|
-|  Major | FRT  | FI    | XO    | FI0 |
+`fmvis` fits well with DX-Form:
  
-The above fits reasonably well with Minor 19 and follows the
-pattern shown by `addpcis`, which uses an entire column of Minor 19
-XO.  15 bits of FI fit into bits 11 to 25,
-the top bit FI0 (MSB0 numbered 0) makes 16.
+|  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form |
+|--------|------|-------|-------|-------|-----|-----|
+|  Major | FRT  | d1    | d0    | XO    | d2  | DX-Form |
  
-    bf16 = FI0 || FI
+    bf16 = d0 || d1 || d2
      fp32 = bf16 || [0]*16
      FRT = Single_to_Double(fp32)
  
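Reviewer's sketch (not part of the patch): the DX-Form packing and `fmvis` expansion described in the hunk above can be modelled in a few lines of Python. The function names are hypothetical, and the field widths (10-bit `d0`, 5-bit `d1`, 1-bit `d2`) are assumed from the bit ranges in the table. Python's `struct` module stands in for `Single_to_Double`, since unpacking a single yields a Python float, which is a double.

```python
import struct


def fmvis_fields_to_bf16(d0: int, d1: int, d2: int) -> int:
    """Concatenate the DX-Form fields: bf16 = d0 || d1 || d2.

    Widths are assumed from the table: d0 is 10 bits (16-25),
    d1 is 5 bits (11-15), d2 is 1 bit (31).
    """
    assert 0 <= d0 < (1 << 10) and 0 <= d1 < (1 << 5) and 0 <= d2 < 2
    return (d0 << 6) | (d1 << 1) | d2


def fmvis(bf16: int) -> float:
    """Model of: fp32 = bf16 || [0]*16; FRT = Single_to_Double(fp32)."""
    fp32_bits = (bf16 & 0xFFFF) << 16
    # Reinterpret the 32-bit pattern as an IEEE 754 single. Python floats
    # are doubles, so the unpack also performs the single -> double widening.
    (value,) = struct.unpack(">f", struct.pack(">I", fp32_bits))
    return value
```

For instance, the BF16 pattern `0x3F80` (the top half of the single-precision encoding of 1.0) expands to `1.0`, and `0x0000` gives `+0.0`.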
@@ -219,55 +249,40 @@ Mode values:
  |------|-----------------|----------------------------------|
  | 000  | from `FPSCR`    | [OpenPower semantics]            |
  | 001  | Truncate        | [OpenPower semantics]            |
-| 010  | from `FPSCR`    | [Rust semantics]                 |
-| 011  | Truncate        | [Rust semantics]                 |
+| 010  | from `FPSCR`    | [Java semantics]                 |
+| 011  | Truncate        | [Java semantics]                 |
  | 100  | from `FPSCR`    | [JavaScript semantics]           |
  | 101  | Truncate        | [JavaScript semantics]           |
  | rest | --              | illegal instruction trap for now |
  
  [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
-[Rust semantics]: #fp-to-int-rust-conversion-semantics
+[Java semantics]: #fp-to-int-java-conversion-semantics
  [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
  
-`fcvttgw RT, FRA, Mode`
-
-Convert from 64-bit float to 32-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvttguw RT, FRA, Mode`
-
-Convert from 64-bit float to 32-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvttgd RT, FRA, Mode`
-
-Convert from 64-bit float to 64-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvttgud RT, FRA, Mode`
-
-Convert from 64-bit float to 64-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstgw RT, FRA, Mode`
-
-Convert from 32-bit float to 32-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstguw RT, FRA, Mode`
-
-Convert from 32-bit float to 32-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstgd RT, FRA, Mode`
-
-Convert from 32-bit float to 64-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstgud RT, FRA, Mode`
-
-Convert from 32-bit float to 64-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvttgw RT, FRA, Mode`
+  Convert from 64-bit float to 32-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvttguw RT, FRA, Mode`
+  Convert from 64-bit float to 32-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvttgd RT, FRA, Mode`
+  Convert from 64-bit float to 64-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvttgud RT, FRA, Mode`
+  Convert from 64-bit float to 64-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvtstgw RT, FRA, Mode`
+  Convert from 32-bit float to 32-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvtstguw RT, FRA, Mode`
+  Convert from 32-bit float to 32-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvtstgd RT, FRA, Mode`
+  Convert from 32-bit float to 64-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvtstgud RT, FRA, Mode`
+  Convert from 32-bit float to 64-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
  
  [mode `Mode`]: #fpr-to-gpr-conversion-mode
  
@@ -275,37 +290,30 @@ to the GPR `RT`. Converts using [mode `Mode`]
  
  All of the following GPR to FPR conversions use the rounding mode from `FPSCR`.
  
-`fcvtfgw FRT, RA`
-
-Convert from 32-bit signed integer in the GPR `RA` to 64-bit float in `FRT`.
-
-`fcvtfgws FRT, RA`
-
-Convert from 32-bit signed integer in the GPR `RA` to 32-bit float in `FRT`.
-
-`fcvtfguw FRT, RA`
-
-Convert from 32-bit unsigned integer in the GPR `RA` to 64-bit float in `FRT`.
-
-`fcvtfguws FRT, RA`
-
-Convert from 32-bit unsigned integer in the GPR `RA` to 32-bit float in `FRT`.
-
-`fcvtfgd FRT, RA`
-
-Convert from 64-bit signed integer in the GPR `RA` to 64-bit float in `FRT`.
-
-`fcvtfgds FRT, RA`
-
-Convert from 64-bit signed integer in the GPR `RA` to 32-bit float in `FRT`.
-
-`fcvtfgud FRT, RA`
-
-Convert from 64-bit unsigned integer in the GPR `RA` to 64-bit float in `FRT`.
-
-`fcvtfguds FRT, RA`
-
-Convert from 64-bit unsigned integer in the GPR `RA` to 32-bit float in `FRT`.
+* `fcvtfgw FRT, RA`
+  Convert from 32-bit signed integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfgws FRT, RA`
+  Convert from 32-bit signed integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
+* `fcvtfguw FRT, RA`
+  Convert from 32-bit unsigned integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfguws FRT, RA`
+  Convert from 32-bit unsigned integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
+* `fcvtfgd FRT, RA`
+  Convert from 64-bit signed integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfgds FRT, RA`
+  Convert from 64-bit signed integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
+* `fcvtfgud FRT, RA`
+  Convert from 64-bit unsigned integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfguds FRT, RA`
+  Convert from 64-bit unsigned integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
  
  # FP to Integer Conversion Pseudo-code
  
@@ -336,11 +344,14 @@ def fp_to_int_open_power<fp, int>(v: fp) -> int:
      return (int)rint(v, rounding_mode)
  ```
  
-<div id="fp-to-int-rust-conversion-semantics"></div>
-Rust [conversion semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics) (with adjustment to add non-truncate rounding modes):
+<div id="fp-to-int-java-conversion-semantics"></div>
+[Java conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
+/
+[Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
+(with adjustment to add non-truncate rounding modes):
  
  ```
-def fp_to_int_rust<fp, int>(v: fp) -> int:
+def fp_to_int_java<fp, int>(v: fp) -> int:
      if v is NaN:
          return 0
      if v >= int::MAX_VALUE:
@@ -351,7 +362,8 @@ def fp_to_int_rust<fp, int>(v: fp) -> int:
  ```
  
  <div id="fp-to-int-javascript-conversion-semantics"></div>
-JavaScript [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32) (with adjustment to add non-truncate rounding modes):
+Section 7.1 of the ECMAScript / JavaScript
+[conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32) (with adjustment to add non-truncate rounding modes):
  
  ```
  def fp_to_int_java_script<fp, int>(v: fp) -> int:
@@ -363,6 +375,10 @@ def fp_to_int_java_script<fp, int>(v: fp) -> int:
      return (int)bits
  ```
  
-# Equivalent OpenPower ISA v3.0 Assembly Language for FP -> Integer Conversion Modes
+# Power ISA v3.0 Assembly equivalents
+
+Moved to [[int_fp_mv/appendix]], which demonstrates how much assembler
+is required in order to perform each of the three language-specific
+FP -> Integer Conversion Modes. In the case of JavaScript an astonishing
+35 instructions are required, with 5 branches.
  
-Moved to [[int_fp_mv/appendix]]
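
Reviewer's sketch (not part of the patch): an executable Python rendering of the saturating (Java) and modular (JavaScript) conversions from the pseudo-code above, to show how the two diverge on out-of-range inputs. It models the Truncate mode only; the `FPSCR` rounding variants are deliberately left out. The `bits` and `signed` parameters are assumptions standing in for the generic `int` type of the pseudo-code.

```python
import math


def fp_to_int_java(v: float, bits: int = 32, signed: bool = True) -> int:
    """Saturating conversion: NaN -> 0, out-of-range clamps to min/max."""
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if math.isnan(v):
        return 0
    if v >= hi:  # also catches +Inf
        return hi
    if v <= lo:  # also catches -Inf
        return lo
    return math.trunc(v)  # Truncate mode only (assumption of this sketch)


def fp_to_int_java_script(v: float, bits: int = 32, signed: bool = True) -> int:
    """Modular conversion: NaN/Inf -> 0, otherwise truncate then wrap."""
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = math.trunc(v) % (1 << bits)  # Python's % is always non-negative here
    if signed and wrapped >= (1 << (bits - 1)):
        wrapped -= 1 << bits  # reinterpret as two's complement
    return wrapped
```

With the defaults, an input of `1e20` saturates to `2147483647` under the Java rule, while `2**32 + 5.5` wraps to `5` under the JavaScript rule.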