hack to put comparison table on one page

diff --git a/openpower/sv/int_fp_mv.mdwn b/openpower/sv/int_fp_mv.mdwn

index c497f6f74ed2302f9fcca97f441387a9d2c560fd..723ac5385c528d6bb74e8762a5036109e8417a54 100644
--- a/openpower/sv/int_fp_mv.mdwn
+++ b/openpower/sv/int_fp_mv.mdwn
@@ -1,231 +1,281 @@
+[[!tag standards]]
+
  # FPR-to-GPR and GPR-to-FPR
  
+TODO special constants instruction (e, tau/N, ln 2, sqrt 2, etc.) -- exclude any constants available through fmvis
+
+**Draft Status** under development, for submission as an RFC
+
+Links:
+
+* <https://bugs.libre-soc.org/show_bug.cgi?id=650>
+* <https://bugs.libre-soc.org/show_bug.cgi?id=230#c71>
+* <https://bugs.libre-soc.org/show_bug.cgi?id=230#c74>
+* <https://bugs.libre-soc.org/show_bug.cgi?id=230#c76>
+* <https://bugs.libre-soc.org/show_bug.cgi?id=887> fmvis
+* [[int_fp_mv/appendix]]
+
+Trademarks:
+
+* Rust is a Trademark of the Rust Foundation
+* Java and JavaScript are Trademarks of Oracle
+* LLVM is a Trademark of the LLVM Foundation
+* SPIR-V is a Trademark of the Khronos Group
+* OpenCL is a Trademark of Apple, Inc.
+
+Referring to these Trademarks within this document
+is by necessity, in order to put the semantics of each language
+into context, and is considered "fair use" under Trademark
+Law.
+
  Introduction:
  
  High-performance CPU/GPU software often needs to convert between integers
  and floating-point, therefore fast conversion/data-movement instructions
  are needed.  Also given that initialisation of floats tends to take up
-considerable space (even to just load 0.0) the inclusion of float immediate
-is up for consideration (BF16 as immediates)
+considerable space (even to just load 0.0) the inclusion of two compact
+format float immediate instructions is up for consideration using 16-bit
+immediates. BF16 is one of the formats: a second instruction allows a full
+accuracy FP32 to be constructed.
  
  Libre-SOC will be compliant with the
  **Scalar Floating-Point Subset** (SFFS) i.e. is not implementing VMX/VSX,
  and with its focus on modern 3D GPU hybrid workloads represents an
  important new potential use-case for OpenPOWER.
-With VMX/VSX not available in the SFFS Compliancy Level, the 
-existing non-VSX conversion/data-movement instructions require load/store
+
+Prior to the formation of the Compliancy Levels, first introduced
+in v3.0C and v3.1,
+the progressive historic development of the Scalar parts of the Power ISA assumed
+that VSX would always be there to complement them. However, with VMX/VSX
+**not available** in the newly-introduced SFFS Compliancy Level, the
+existing non-VSX conversion/data-movement instructions require
+a Vector of load/store
  instructions (slow and expensive) to transfer data between the FPRs and
-the GPRs.  Also, because SimpleV needs efficient scalar instructions in
+the GPRs.  For a modern 3D GPU this kills any possibility of a
+competitive edge.
+Also, because SimpleV needs efficient scalar instructions in
  order to generate efficient vector instructions, adding new instructions
-for data-transfer/conversion between FPRs and GPRs seems necessary.
+for data-transfer/conversion between FPRs and GPRs multiplies the savings.
  
  In addition, the vast majority of GPR <-> FPR data-transfers are as part
  of a FP <-> Integer conversion sequence, therefore reducing the number
-of instructions required to the minimum seems necessary.
+of instructions required is a priority.
  
  Therefore, we are proposing adding:
  
-* FPR load-immediate using `BF16` as the constant
+* FPR load-immediate instructions, one equivalent to `BF16`, the
+  other increasing accuracy to `FP32`
  * FPR <-> GPR data-transfer instructions that just copy bits without conversion
  * FPR <-> GPR combined data-transfer/conversion instructions that do
    Integer <-> FP conversions
  
-If we're adding new Integer <-> FP conversion instructions, we may
-as well take this opportunity to modernise the instructions and make them
-well suited for common/important conversion sequences:
-
-* standard Integer -> FP conversion (**TODO, which standard?** can it
-  be described in words? how does it differ from the other "standards"?)
-* standard OpenPower FP -> Integer conversion (saturation with NaN
-  converted to minimum valid integer)
-* Rust FP -> Integer conversion (saturation with NaN converted to 0)
-* JavaScript FP -> Integer conversion (modular with Inf/NaN converted to 0)
-
-# A bit more research into integer - fp conversion
-
-here is a paragraph which explains that there are different semantics
-for conversion, i don't know what the paragraph should say, but it needs
-to be here, to give some background.  it also acts as a lead-in to the
-sub-sections, introducing them and explaining why they are here, as
-justifications and background research as to why the ISA should support
-the feature being proposed.
-
-*nothing* can be left to chance or guesswork.
+If adding new Integer <-> FP conversion instructions, 
+the opportunity may be taken to modernise the instructions and make them
+well-suited for common/important conversion sequences:
  
-## standard Integer -> FP conversion
+* **standard IEEE754** - used by most languages and CPUs
+* **standard OpenPOWER** - saturation with NaN
+  converted to minimum valid integer
+* **Java** - saturation with NaN converted to 0
+* **JavaScript** - modulo wrapping with Inf/NaN converted to 0
  
-TODO, explain this further
+The assembly listings in the [[int_fp_mv/appendix]] show how costly
+some of these language-specific conversions are: JavaScript, the
+worst case, is 32 scalar instructions including seven branch instructions.
  
-- rounding mode read from FPSCR
+# Proposed New Scalar Instructions
  
-# standard OpenPower FP -> Integer conversion
+All of the following instructions use the standard OpenPower conversion to/from 64-bit float format when reading/writing a 32-bit float from/to a FPR.  All integers however are sourced/stored in the *GPR*.
+
+Integer operands and results being in the GPR is the key differentiator (and the entire rationale)
+between the proposed instructions and the existing Scalar Power ISA.
+In all existing Power ISA Scalar conversion instructions, all
+operands are FPRs, even if the format of the source or destination
+data is actually a scalar integer.
+
+*(The existing Scalar instructions being FP-FP only is based on an assumption
+that VSX will be implemented, and VSX is not part of the SFFS Compliancy
+Level. An earlier version of the Power ISA used to have similar
+FPR<->GPR instructions to these:
+they were deprecated due to this incorrect assumption that VSX would
+always be present).*
+
+Note that source and destination widths can be overridden by SimpleV
+SVP64, and that SVP64 also has Saturation Modes *in addition*
+to those independently described here. SVP64 Overrides and Saturation
+work on *both* Fixed *and* Floating Point operands and results.
+The interactions with SVP64
+are explained in the [[int_fp_mv/appendix]].
+
+# Float load immediate  <a name="fmvis"></a>
+
+These are like a variant of `fmvfg` and `oris`, combined.
+Power ISA currently requires a large
+number of instructions to get Floating Point constants into registers.
+`fmvis` on its own is equivalent to BF16 to FP32/64 conversion,
+but if followed up by `frlsi` an additional 16 bits of accuracy in the
+mantissa may be achieved.
+
+*IBM may consider it worthwhile to extend these two instructions to
+v3.1 Prefixed (`pfmvis` and `pfrlsi`). If so it is recommended that
+`pfmvis` load a full FP32 immediate and `pfrlsi` supplies the three high
+missing exponent bits (numbered 8 to 10) and the lower additional
+29 mantissa bits (23 to 51) needed to construct a full FP64 immediate.*
+
+## Load BF16 Immediate
+
+`fmvis FRS, D`
+
+Reinterprets `D << 16` as a 32-bit float, which is then converted to a
+64-bit float and written to `FRS`.  This is equivalent to reinterpreting
+`D` as a `BF16` and converting to 64-bit float.
+There is no need for an Rc=1 variant because this is an immediate loading
+instruction.
  
-TODO, explain this further, make this a complete sentence:
-"saturation with NaN converted to minimum valid integer"
+Example:
  
-  - Matches x86's conversion semantics
-  - Has instructions for both:
-    * rounding mode read from FPSCR
-    * rounding mode is always truncate
+```
+# clearing a FPR
+fmvis f4, 0 # writes +0.0 to f4
+# loading handy constants
+fmvis f4, 0x8000 # writes -0.0 to f4
+fmvis f4, 0x3F80 # writes +1.0 to f4
+fmvis f4, 0xBF80 # writes -1.0 to f4
+fmvis f4, 0xBFC0 # writes -1.5 to f4
+fmvis f4, 0x7FC0 # writes +qNaN to f4
+fmvis f4, 0x7F80 # writes +Infinity to f4
+fmvis f4, 0xFF80 # writes -Infinity to f4
+fmvis f4, 0x3FFF # writes +1.9921875 to f4
  
-## Rust FP -> Integer conversion
+# clearing 128 FPRs with 2 SVP64 instructions
+# by issuing 32 vec4 (subvector length 4) ops
+setvli VL=MVL=32
+sv.fmvis/vec4 f0, 0 # writes +0.0 to f0-f127
+```
+Important: If the float load immediate instruction(s) are left out,
+change all [GPR to FPR conversion instructions](#GPR-to-FPR-conversions)
+to instead write `+0.0` if `RA` is register `0`, at least
+allowing clearing FPRs.
  
-TODO, explain this further, the following is not a complete sentence,
-"saturation with NaN converted to 0"
+`fmvis` fits with DX-Form:
  
-Semantics required by all of:
-(what does this mean, what is "required"?
-what semantics are being referred to? the sentence needs completing:
-"For Rust integer conversion, the semantics required are shown by the
-following, all of which are supported in XYZ" something like that)
+|  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form |
+|--------|------|-------|-------|-------|-----|-----|
+|  Major | FRS  | d1    | d0    | XO    | d2  | DX-Form |
  
-* Rust's FP -> Integer conversion using the
-  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
-* Java's
-  [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
-* LLVM's
-  [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
-  [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
-* SPIR-V's OpenCL dialect's
-  [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
-  [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
-  instructions when decorated with
-  [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
+Pseudocode:
  
-## JavaScript FP -> Integer conversion
+    bf16 = d0 || d1 || d2
+    fp32 = bf16 || [0]*16
+    FRS = Single_to_Double(fp32)
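The pseudocode above can be modelled in a few lines of Python. This is only an illustrative sketch: the helper names `split_dx` and `fmvis` are hypothetical, the field widths (10, 5 and 1 bits) are read off the DX-Form table, and a Python `float` stands in for the 64-bit contents of `FRS`.

```python
import struct

def split_dx(d):
    """Split a 16-bit immediate into DX-Form fields d0 (10 bits), d1 (5 bits), d2 (1 bit)."""
    assert 0 <= d <= 0xFFFF
    d0 = (d >> 6) & 0x3FF   # instruction bits 16-25
    d1 = (d >> 1) & 0x1F    # instruction bits 11-15
    d2 = d & 0x1            # instruction bit 31
    return d0, d1, d2

def fmvis(d0, d1, d2):
    """Model of fmvis: bf16 = d0 || d1 || d2, widened to FP32, then to FP64."""
    bf16 = (d0 << 6) | (d1 << 1) | d2
    fp32_bits = bf16 << 16                      # bf16 || [0]*16
    # Reinterpret the 32-bit pattern as a single; Python widens it to a double.
    (value,) = struct.unpack(">f", struct.pack(">I", fp32_bits))
    return value

print(fmvis(*split_dx(0x3F80)))  # +1.0, matching the assembly example
print(fmvis(*split_dx(0xBFC0)))  # -1.5
```

Because every BF16 value is exactly representable as FP32 and FP64, the widening steps never round.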
  
-modular with Inf/NaN converted to 0
+## Float Replace Lower-Half Single, Immediate <a name="frlsi"></a>
  
-TODO, explain this further, it is not a sentence:
-"Semantics required by JavaScript"
+`frlsi FRS, D`
  
-## Other languages
+DX-Form:
  
-TODO: review and investigate other language semantics
+|  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form |
+|--------|------|-------|-------|-------|-----|-----|
+|  Major | FRS  | d1    | d0    | XO    | d2  | DX-Form |
  
-# Links
+Strategically similar to how `oris` is used to construct
+32-bit Integers, an additional 16 bits of immediate is
+inserted into `FRS` to extend its accuracy to
+a full FP32 (stored as usual in FP64 Format within the FPR).
+If a prior `fmvis` instruction had been used to
+set the upper 16 bits of an FP32 value, `frlsi` supplies the
+lower 16 bits.
  
-* <https://bugs.libre-soc.org/show_bug.cgi?id=650>
-* <https://bugs.libre-soc.org/show_bug.cgi?id=230#c71>
-* <https://bugs.libre-soc.org/show_bug.cgi?id=230#c74>
-* <https://bugs.libre-soc.org/show_bug.cgi?id=230#c76>
+The key difference between using `li` and `oris` to construct 32-bit
+GPR Immediates and `frlsi` is that the `fmvis` will have converted
+the `BF16` immediate to FP64 (Double) format.
+This is taken into consideration
+as can be seen in the pseudocode below.
  
-# Proposed New Scalar Instructions
+Pseudocode:
  
-All of the following instructions use the standard OpenPower conversion to/from 64-bit float format when reading/writing a 32-bit float from/to a FPR.
+    fp32 = Double_to_Single(FRS)
+    n = fp32[0:15] || d0 || d1 || d2
+    FRS = Single_to_Double(n)
  
-This can be overridden by SimpleV, which sets the following
-operation "reinterpretation" rules:
+*This instruction performs a Read-Modify-Write: FRS is read, the additional
+16-bit immediate is inserted, and the result is written back to FRS.*
  
-* any operation whose assembler mnemonic does not end in "s"
-  (being defined in v3.0B as a "double" operation) is
-  instead an operation at the overridden elwidth for the
-  relevant operand.
-* any operation nominally defined as a "single" FP operation
-  is redefined to be **half the elwidth** rather than
-  "half of 64 bit".
+Example:
  
-Examples:
+```
+# these two combined instructions write 0x3f808000
+# into f4 as an FP32 to be converted to an FP64.
+# actual contents in f4 after conversion: 0x3ff0_1000_0000_0000
+# first the upper bits, happens to be +1.0
+fmvis f4, 0x3F80 # writes +1.0 to f4
+# now write the lower 16 bits of an FP32
+frlsi f4, 0x8000 # writes +1.00390625 to f4
+```
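The two-instruction sequence above can be checked with a small Python model. It is a sketch under stated assumptions: the function names are hypothetical, a Python `float` stands in for the FPR, and `struct` packing stands in for `Double_to_Single` / `Single_to_Double` (NaN payload handling is platform-dependent and not modelled).

```python
import struct

def single_bits_to_double(bits32):
    """Single_to_Double: reinterpret a 32-bit pattern as FP32, widened to FP64."""
    return struct.unpack(">f", struct.pack(">I", bits32))[0]

def double_to_single_bits(value):
    """Double_to_Single: round to FP32 and return its 32-bit pattern."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

def double_bits(value):
    """Raw 64-bit pattern of a double, for inspecting the FPR contents."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]

def fmvis(d):
    return single_bits_to_double((d & 0xFFFF) << 16)

def frlsi(frs, d):
    fp32 = double_to_single_bits(frs)
    n = (fp32 & 0xFFFF0000) | (d & 0xFFFF)   # fp32[0:15] || D (MSB0 numbering)
    return single_bits_to_double(n)

f4 = fmvis(0x3F80)        # +1.0
f4 = frlsi(f4, 0x8000)    # +1.00390625
print(f4, hex(double_bits(f4)))
```

The model reproduces the values quoted in the example comments: the FP32 pattern `0x3f808000` ends up as `0x3ff0100000000000` in FP64 format.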
  
-* `sv.fmvtg/sw=32 RT.v, FRA.v` is defined as treating FRA
-   as a vector of *FP32* source operands each *32* bits wide
-   which are to be placed into *64* bit integer destination elements.
-* `sv.fmvfgs/dw=32 FRT.v, RA.v` is defined as taking the bottom
-   32 bits of each RA integer source, then performing a **32 bit**
-   FP32 to **FP16** conversion and storing the result in the
-   **32 bits** of an FRT destination element.
+# Moves
  
-"Single" is therefore redefined in SVP64 to be "half elwidth"
-rather than Double width hardcoded to 64 and Single width
-hardcoded to 32.  This allows a full range of conversions
-between FP64, FP32, FP16 and BF16.
+These instructions perform a straight unaltered bit-level copy from one Register
+File to another.
  
-## FPR to GPR moves
+# FPR to GPR moves
  
  * `fmvtg RT, FRA`
  * `fmvtg. RT, FRA`
  
  move a 64-bit float from a FPR to a GPR, just copying bits directly.
-Rc=1 tests RT and sets CR0
+As a direct bitcopy, no exceptions occur and no status flags are set.
+
+Rc=1 tests RT and sets CR0, exactly like all other Scalar Fixed-Point
+operations.
  
  * `fmvtgs RT, FRA`
  * `fmvtgs. RT, FRA`
  
  move a 32-bit float from a FPR to a GPR, just copying bits. Converts the
  64-bit float in `FRA` to a 32-bit float, then writes the 32-bit float to
-`RT`.
-Rc=1 tests RT and sets CR0
+`RT`. Effectively, `fmvtgs` is a macro-fusion of `frsp fmvtg`
+and therefore has exactly the same exception and flags behaviour as `frsp`.
+
+Unlike `frsp` however, with RT being a GPR, Rc=1 follows
+standard *integer* behaviour, i.e. tests RT and sets CR0.
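A rough Python model of the two FPR to GPR moves and the Rc=1 test is sketched below. The function names are hypothetical, exceptions and FPSCR flags are not modelled, and `struct` packing (round-to-nearest, raising `OverflowError` for out-of-range finite values) merely stands in for the `frsp` step of `fmvtgs`.

```python
import struct

def fmvtg(fra):
    """Bit-level copy: return the raw 64-bit pattern of the double in FRA."""
    return struct.unpack(">Q", struct.pack(">d", fra))[0]

def fmvtgs(fra):
    """Round FRA to single precision, then return the 32-bit pattern."""
    return struct.unpack(">I", struct.pack(">f", fra))[0]

def cr0_field(rt):
    """Integer-style Rc=1 test: compare the 64-bit RT, as signed, against zero."""
    signed = rt - (1 << 64) if rt & (1 << 63) else rt
    if signed < 0:
        return "LT"
    if signed > 0:
        return "GT"
    return "EQ"

print(hex(fmvtg(1.0)))            # 0x3ff0000000000000
print(hex(fmvtgs(1.0)))           # 0x3f800000
print(cr0_field(fmvtg(-0.0)))     # LT: only the sign bit is set
```

Note how `-0.0` tests as "less than zero" here: the CR0 test sees an integer bit pattern, not a floating-point value.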
  
-## GPR to FPR moves
+# GPR to FPR moves
  
  `fmvfg FRT, RA`
  
-move a 64-bit float from a GPR to a FPR, just copying bits.
+move a 64-bit float from a GPR to a FPR, just copying bits. No exceptions
+are raised, no flags are altered of any kind.
+
+Rc=1 tests FRT and sets CR1
  
  `fmvfgs FRT, RA`
  
  move a 32-bit float from a GPR to a FPR, just copying bits. Converts the
  32-bit float in `RA` to a 64-bit float, then writes the 64-bit float to
-`FRT`.
-
-TODO: Rc=1 variants?
+`FRT`. Effectively, `fmvfgs` is a macro-fusion of `fmvfg frsp` and
+therefore has exactly the same exception and flags behaviour as `frsp`.
  
-### Float load immediate (kinda a variant of `fmvfg`)
+Rc=1 tests FRT and sets CR1
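The reverse direction can be sketched the same way. The names are hypothetical, and the sketch assumes that `fmvfgs` takes its 32-bit float from the low 32 bits of `RA`, which the text does not state explicitly; exceptions, flags and CR1 are not modelled.

```python
import struct

MASK64 = (1 << 64) - 1

def fmvfg(ra):
    """Bit-level copy: reinterpret the 64-bit GPR value as a double."""
    return struct.unpack(">d", struct.pack(">Q", ra & MASK64))[0]

def fmvfgs(ra):
    """Reinterpret the low 32 bits of RA (assumption) as FP32, widened to FP64."""
    return struct.unpack(">f", struct.pack(">I", ra & 0xFFFFFFFF))[0]

print(fmvfg(0x3FF0000000000000))  # 1.0
print(fmvfgs(0xBFC00000))         # -1.5
```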
  
-`fmvis FRT, FI`
+TODO: clear statement on evaluation as to whether exceptions or flags raised as part of the **FP** conversion (not the int bitcopy part, the conversion part.  the semantics should really be the same as frsp)
  
-Reinterprets `FI << 16` as a 32-bit float, which is then converted to a
-64-bit float and written to `FRT`.  This is equivalent to reinterpreting
-`FI` as a `BF16` and converting to 64-bit float.
+v3.0C section 4.6.7.1 states:
  
-Example:
+FPRF is set to the class and sign of the result, except for Invalid Operation Exceptions when VE=1.
  
-```
-# clearing a FPR
-fmvis f4, 0 # writes +0.0 to f4
-# loading handy constants
-fmvis f4, 0x8000 # writes -0.0 to f4
-fmvis f4, 0x3F80 # writes +1.0 to f4
-fmvis f4, 0xBF80 # writes -1.0 to f4
-fmvis f4, 0xBFC0 # writes -1.5 to f4
-fmvis f4, 0x7FC0 # writes +qNaN to f4
-fmvis f4, 0x7F80 # writes +Infinity to f4
-fmvis f4, 0xFF80 # writes -Infinity to f4
-fmvis f4, 0x3FFF # writes +1.9921875 to f4
+    Special Registers Altered:
+      FPRF FR FI
+      FX OX UX XX VXSNAN
+      CR1 (if Rc=1)
  
-# clearing 128 FPRs with 2 SVP64 instructions
-# by issuing 32 vec4 (subvector length 4) ops
-setvli VL=MVL=32
-sv.fmvis/vec4 f0, 0 # writes +0.0 to f0-f127
-```
-Important: If the float load immediate instruction(s) are left out,
-change all [GPR to FPR conversion instructions](#GPR-to-FPR-conversions)
-to instead write `+0.0` if `RA` is register `0`, at least
-allowing clearing FPRs.
-
-|  0-5   | 6-10 | 11-25 | 26-30 | 31  |
-|--------|------|-------|-------|-----|
-|  Major | FRT  | FI    | XO    | FI0 |
-
-The above fits reasonably well with Minor 19 and follows the
-pattern shown by `addpcis`, which uses an entire column of Minor 19
-XO.  15 bits of FI fit into bits 11 to 25,
-the top bit FI0 (MSB0 numbered 0) makes 16.
-
-    bf16 = FI0 || FI
-    fp32 = bf16 || [0]*16
-    FRT = Single_to_Double(fp32)
-
-## FPR to GPR conversions
-
-<div id="fpr-to-gpr-conversion-mode"></div>
+# Conversions
  
-X-Form:
-
-|  0-5   | 6-10 | 11-15  | 16-25 | 26-30 | 31 |
-|--------|------|--------|-------|-------|----|
-|  Major | RT   | //Mode | FRA   | XO    | Rc |
-|  Major | FRT  | //Mode | RA    | XO    | Rc |
+Unlike the move instructions
+these instructions perform conversions between Integer and
+Floating Point. Truncation can therefore occur, as well
+as exceptions.
  
  Mode values:
  
@@ -233,95 +283,152 @@ Mode values:
  |------|-----------------|----------------------------------|
  | 000  | from `FPSCR`    | [OpenPower semantics]            |
  | 001  | Truncate        | [OpenPower semantics]            |
-| 010  | from `FPSCR`    | [Rust semantics]                 |
-| 011  | Truncate        | [Rust semantics]                 |
+| 010  | from `FPSCR`    | [Java semantics]                 |
+| 011  | Truncate        | [Java semantics]                 |
  | 100  | from `FPSCR`    | [JavaScript semantics]           |
  | 101  | Truncate        | [JavaScript semantics]           |
  | rest | --              | illegal instruction trap for now |
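As an illustration only, the Mode table can be read as a lookup from the 3-bit Mode value to a (rounding source, semantics) pair. The dictionary and function names below are hypothetical, and the table's column headings are not visible in this hunk, so the pair labels are an assumption.

```python
# Mode value -> (where the rounding mode comes from, conversion semantics)
MODES = {
    0b000: ("FPSCR", "OpenPower"),
    0b001: ("Truncate", "OpenPower"),
    0b010: ("FPSCR", "Java"),
    0b011: ("Truncate", "Java"),
    0b100: ("FPSCR", "JavaScript"),
    0b101: ("Truncate", "JavaScript"),
}

def decode_mode(mode):
    """Return (rounding, semantics); unlisted values model the illegal-instruction trap."""
    if mode not in MODES:
        raise ValueError(f"illegal instruction: reserved Mode {mode:03b}")
    return MODES[mode]

print(decode_mode(0b011))  # ('Truncate', 'Java')
```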
  
  [OpenPower semantics]: #fp-to-int-openpower-conversion-semantics
-[Rust semantics]: #fp-to-int-rust-conversion-semantics
+[Java semantics]: #fp-to-int-java-conversion-semantics
  [JavaScript semantics]: #fp-to-int-javascript-conversion-semantics
  
-`fcvttgw RT, FRA, Mode`
-
-Convert from 64-bit float to 32-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvttguw RT, FRA, Mode`
-
-Convert from 64-bit float to 32-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvttgd RT, FRA, Mode`
-
-Convert from 64-bit float to 64-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvttgud RT, FRA, Mode`
-
-Convert from 64-bit float to 64-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstgw RT, FRA, Mode`
-
-Convert from 32-bit float to 32-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstguw RT, FRA, Mode`
-
-Convert from 32-bit float to 32-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstgd RT, FRA, Mode`
-
-Convert from 32-bit float to 64-bit signed integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-`fcvtstgud RT, FRA, Mode`
-
-Convert from 32-bit float to 64-bit unsigned integer, writing the result
-to the GPR `RT`. Converts using [mode `Mode`]
-
-[mode `Mode`]: #fpr-to-gpr-conversion-mode
-
  ## GPR to FPR conversions
  
-All of the following GPR to FPR conversions use the rounding mode from `FPSCR`.
-
-`fcvtfgw FRT, RA`
+**Format**
  
-Convert from 32-bit signed integer in the GPR `RA` to 64-bit float in `FRT`.
+|  0-5   | 6-10 | 11-15  | 16-25 | 26-30 | 31 | Form |
+|--------|------|--------|-------|-------|----|------|
+|  Major | FRT  | //Mode | RA    | XO    | Rc |X-Form|
  
-`fcvtfgws FRT, RA`
+All of the following GPR to FPR conversions use the rounding mode from `FPSCR`.
  
-Convert from 32-bit signed integer in the GPR `RA` to 32-bit float in `FRT`.
+* `fcvtfgw FRT, RA`
+  Convert from 32-bit signed integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfgws FRT, RA`
+  Convert from 32-bit signed integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
+* `fcvtfguw FRT, RA`
+  Convert from 32-bit unsigned integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfguws FRT, RA`
+  Convert from 32-bit unsigned integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
+* `fcvtfgd FRT, RA`
+  Convert from 64-bit signed integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfgds FRT, RA`
+  Convert from 64-bit signed integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
+* `fcvtfgud FRT, RA`
+  Convert from 64-bit unsigned integer in the GPR `RA` to 64-bit float in 
+  `FRT`.
+* `fcvtfguds FRT, RA`
+  Convert from 64-bit unsigned integer in the GPR `RA` to 32-bit float in 
+  `FRT`.
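The four double-precision variants above can be sketched in Python as follows. This is a simplified model, not a definition: it assumes the 32-bit forms read the low 32 bits of `RA`, and Python's `float(int)` (round-to-nearest-even) stands in for whatever rounding mode `FPSCR` selects. The single-precision (`s`) variants are omitted because rounding a 64-bit integer via a double could round twice.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's complement signed integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def fcvtfgw(ra):   # 32-bit signed integer -> 64-bit float
    return float(to_signed(ra & MASK32, 32))

def fcvtfguw(ra):  # 32-bit unsigned integer -> 64-bit float
    return float(ra & MASK32)

def fcvtfgd(ra):   # 64-bit signed integer -> 64-bit float
    return float(to_signed(ra & MASK64, 64))

def fcvtfgud(ra):  # 64-bit unsigned integer -> 64-bit float
    return float(ra & MASK64)

print(fcvtfgw(0xFFFFFFFF), fcvtfguw(0xFFFFFFFF))  # -1.0 4294967295.0
```

The same bit pattern gives very different results depending on the signed or unsigned interpretation, which is why each width has both variants.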
+
+## FPR to GPR (Integer) conversions
  
-`fcvtfguw FRT, RA`
+<div id="fpr-to-gpr-conversion-mode"></div>
  
-Convert from 32-bit unsigned integer in the GPR `RA` to 64-bit float in `FRT`.
+Different programming languages turn out to have completely different
+semantics for FP to Integer conversion.  Below is an overview
+of the different variants, listing the languages and hardware that
+implement each variant.
  
-`fcvtfguws FRT, RA`
+**Standard IEEE754 conversion**
  
-Convert from 32-bit unsigned integer in the GPR `RA` to 32-bit float in `FRT`.
+This conversion is outlined in the IEEE754 specification.  It is used
+by nearly all programming languages and CPUs.  In the case of OpenPOWER,
+the rounding mode is read from FPSCR.
  
-`fcvtfgd FRT, RA`
+**Standard OpenPower conversion**
  
-Convert from 64-bit signed integer in the GPR `RA` to 64-bit float in `FRT`.
+This conversion, instead of exact IEEE754 Compliance, performs
+"saturation with NaN converted to minimum valid integer". This
+is also exactly the same as the x86 ISA conversion semantics.
+OpenPOWER however has instructions for both:
  
-`fcvtfgds FRT, RA`
+* rounding mode read from FPSCR
+* rounding mode always set to truncate
  
-Convert from 64-bit signed integer in the GPR `RA` to 32-bit float in `FRT`.
+**Java conversion**
  
-`fcvtfgud FRT, RA`
+For the sake of simplicity, the FP -> Integer conversion semantics generalized from those used by Java (and by Rust's `as` operator) will be referred to as
+[Java conversion semantics](#fp-to-int-java-conversion-semantics).
  
-Convert from 64-bit unsigned integer in the GPR `RA` to 64-bit float in `FRT`.
+Those same semantics are used in some way by all of the following languages (not necessarily for the default conversion method):
  
-`fcvtfguds FRT, RA`
+* Java's
+  [FP -> Integer conversion](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
+* Rust's FP -> Integer conversion using the
+  [`as` operator](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
+* LLVM's
+  [`llvm.fptosi.sat`](https://llvm.org/docs/LangRef.html#llvm-fptosi-sat-intrinsic) and
+  [`llvm.fptoui.sat`](https://llvm.org/docs/LangRef.html#llvm-fptoui-sat-intrinsic) intrinsics
+* SPIR-V's OpenCL dialect's
+  [`OpConvertFToU`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToU) and
+  [`OpConvertFToS`](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpConvertFToS)
+  instructions when decorated with
+  [the `SaturatedConversion` decorator](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_decoration_a_decoration).
+* WebAssembly has also introduced
+ [trunc_sat_u](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u) and
+ [trunc_sat_s](https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-s)
+
+**JavaScript conversion**
+
+For the sake of simplicity, the FP -> Integer conversion semantics generalized from those used by JavaScript's `ToInt32` abstract operation will be referred to as [JavaScript conversion semantics](#fp-to-int-javascript-conversion-semantics).
+
+This conversion is present in ARM assembler as the `FJCVTZS` instruction
+<https://developer.arm.com/documentation/dui0801/g/hko1477562192868>
+
+**Format**
+
+|  0-5   | 6-10 | 11-15  | 16-25 | 26-30 | 31 | Form |
+|--------|------|--------|-------|-------|----|------|
+|  Major | RT   | //Mode | FRA   | XO    | Rc |X-Form|
+
+**Rc=1 and OE=1**
+
+All of these instructions have an Rc=1 mode which sets CR0
+in the normal way for any instructions producing a GPR result.
+Additionally, when OE=1, if the numerical value of the FP number
+is not 100% accurately preserved (due to truncation or saturation
+and including when the FP number was NaN) then this is considered
+to be an integer Overflow condition, and CR0.SO, XER.SO and XER.OV
+are all set as normal for any GPR instructions that overflow.
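One possible reading of the OE=1 rule is sketched below: overflow is flagged whenever the integer result does not have exactly the numerical value of the FP input. The function name is hypothetical and the sketch says nothing about how the flags are written to CR0 or XER.

```python
import math

def conversion_overflowed(v, result):
    """True if integer `result` does not exactly preserve the value of float `v`."""
    if math.isnan(v) or math.isinf(v):
        return True          # NaN and infinities can never be preserved
    # Python compares a float against an int exactly, without rounding.
    return v != result

print(conversion_overflowed(2.0, 2))             # False: value preserved
print(conversion_overflowed(1.5, 1))             # True: truncation lost the fraction
print(conversion_overflowed(1e10, 2**31 - 1))    # True: saturated
```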
+
+**Instructions**
+
+* `fcvttgw RT, FRA, Mode`
+  Convert from 64-bit float to 32-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctiw` or `fctiwz`
+* `fcvttguw RT, FRA, Mode`
+  Convert from 64-bit float to 32-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctiwu` or `fctiwuz`
+* `fcvttgd RT, FRA, Mode`
+  Convert from 64-bit float to 64-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctid` or `fctidz`
+* `fcvttgud RT, FRA, Mode`
+  Convert from 64-bit float to 64-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]. Similar to `fctidu` or `fctiduz`
+* `fcvtstgw RT, FRA, Mode`
+  Convert from 32-bit float to 32-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvtstguw RT, FRA, Mode`
+  Convert from 32-bit float to 32-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvtstgd RT, FRA, Mode`
+  Convert from 32-bit float to 64-bit signed integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
+* `fcvtstgud RT, FRA, Mode`
+  Convert from 32-bit float to 64-bit unsigned integer, writing the result
+  to the GPR `RT`. Converts using [mode `Mode`]
  
-Convert from 64-bit unsigned integer in the GPR `RA` to 32-bit float in `FRT`.
+[mode `Mode`]: #fpr-to-gpr-conversion-mode
  
-# FP to Integer Conversion Pseudo-code
+## FP to Integer Conversion Pseudo-code
  
  Key for pseudo-code:
  
@@ -331,8 +438,10 @@ Key for pseudo-code:
  | `int`                     | --          | `u32`/`u64`/`i32`/`i64` (or other types from SimpleV)                                              |
  | `uint`                    | --          | the unsigned integer of the same bit-width as `int`                                                |
  | `int::BITS`               | `int`       | the bit-width of `int`                                                                             |
-| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store (`0` if unsigned, `-2^(int::BITS-1)` if signed)                  |
-| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store (`2^int::BITS - 1` if unsigned, `2^(int::BITS-1) - 1` if signed) |
+| `uint::MIN_VALUE`         | `uint`      | the minimum value `uint` can store: `0`                   |
+| `uint::MAX_VALUE`          | `uint`       | the maximum value `uint` can store: `2^int::BITS - 1`  |
+| `int::MIN_VALUE`          | `int`       | the minimum value `int` can store : `-2^(int::BITS-1)`              |
+| `int::MAX_VALUE`          | `int`       | the maximum value `int` can store :  `2^(int::BITS-1) - 1`  |
  | `int::VALUE_COUNT`        | Integer     | the number of different values `int` can store (`2^int::BITS`). too big to fit in `int`.           |
  | `rint(fp, rounding_mode)` | `fp`        | rounds the floating-point value `fp` to an integer according to rounding mode `rounding_mode`      |
  
@@ -350,11 +459,14 @@ def fp_to_int_open_power<fp, int>(v: fp) -> int:
      return (int)rint(v, rounding_mode)
  ```
  
-<div id="fp-to-int-rust-conversion-semantics"></div>
-Rust [conversion semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics) (with adjustment to add non-truncate rounding modes):
+<div id="fp-to-int-java-conversion-semantics"></div>
+[Java conversion semantics](https://docs.oracle.com/javase/specs/jls/se16/html/jls-5.html#jls-5.1.3)
+/
+[Rust semantics](https://doc.rust-lang.org/reference/expressions/operator-expr.html#semantics)
+(with adjustment to add non-truncate rounding modes):
  
  ```
-def fp_to_int_rust<fp, int>(v: fp) -> int:
+def fp_to_int_java<fp, int>(v: fp) -> int:
      if v is NaN:
          return 0
      if v >= int::MAX_VALUE:
@@ -365,18 +477,16 @@ def fp_to_int_rust<fp, int>(v: fp) -> int:
  ```
  
  <div id="fp-to-int-javascript-conversion-semantics"></div>
-JavaScript [conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32) (with adjustment to add non-truncate rounding modes):
+Section 7.1 of the ECMAScript / JavaScript
+[conversion semantics](https://262.ecma-international.org/11.0/#sec-toint32) (with adjustment to add non-truncate rounding modes):
  
  ```
  def fp_to_int_java_script<fp, int>(v: fp) -> int:
      if v is NaN or infinite:
          return 0
-    v = rint(v, rounding_mode)
+    v = rint(v, rounding_mode)  # assume no loss of precision in result
      v = v mod int::VALUE_COUNT  # 2^32 for i32, 2^64 for i64, result is non-negative
      bits = (uint)v
      return (int)bits
  ```
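For experimentation, the three conversion semantics can be written as runnable Python. This is a sketch restricted to the Truncate rounding mode; the `bits` and `signed` parameters are hypothetical stand-ins for the pseudo-code's `int` type parameter, and the OpenPower variant is reconstructed from the one-line description "saturation with NaN converted to minimum valid integer".

```python
import math

def int_range(bits, signed):
    """(MIN_VALUE, MAX_VALUE) of the target integer type."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1

def saturate(v, bits, signed):
    lo, hi = int_range(bits, signed)
    if v >= hi:              # also covers +Infinity
        return hi
    if v <= lo:              # also covers -Infinity
        return lo
    return math.trunc(v)     # Truncate rounding mode only

def fp_to_int_open_power(v, bits=32, signed=True):
    if math.isnan(v):
        return int_range(bits, signed)[0]   # NaN -> minimum valid integer
    return saturate(v, bits, signed)

def fp_to_int_java(v, bits=32, signed=True):
    if math.isnan(v):
        return 0                            # NaN -> 0
    return saturate(v, bits, signed)

def fp_to_int_java_script(v, bits=32, signed=True):
    if math.isnan(v) or math.isinf(v):
        return 0
    wrapped = math.trunc(v) % (1 << bits)   # non-negative, modulo VALUE_COUNT
    if signed and wrapped >= (1 << (bits - 1)):
        wrapped -= 1 << bits                # reinterpret uint bits as int
    return wrapped

print(fp_to_int_open_power(math.nan), fp_to_int_java(math.nan))  # -2147483648 0
print(fp_to_int_java(1e10), fp_to_int_java_script(2.0**32 + 5))  # 2147483647 5
```

The only difference between the OpenPower and Java variants in this model is the NaN result, while the JavaScript variant wraps instead of saturating.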
  
-# Equivalent OpenPower ISA v3.0 Assembly Language for FP -> Integer Conversion Modes
-
-Moved to [[int_fp_mv/appendix]]