From: lkcl
Date: Wed, 25 May 2022 11:29:00 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~2101
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=95a4fab2a7fd1ee1c8b33744d7ab7f25d6f419fa;p=libreriscv.git

---

diff --git a/openpower/sv/int_fp_mv.mdwn b/openpower/sv/int_fp_mv.mdwn
index 3baf1fa83..cfa6dbba4 100644
--- a/openpower/sv/int_fp_mv.mdwn
+++ b/openpower/sv/int_fp_mv.mdwn
@@ -17,8 +17,9 @@ Introduction:
 High-performance CPU/GPU software needs to often convert between integers
 and floating-point, therefore fast conversion/data-movement instructions
 are needed. Also given that initialisation of floats tends to take up
-considerable space (even to just load 0.0) the inclusion of compact
-format float immediate is up for consideration using BF16 as a base.
+considerable space (even to just load 0.0) the inclusion of two compact
+format float immediate instructions is up for consideration using 16-bit
+immediates. BF16 is one of the formats.
 
 Libre-SOC will be compliant with the
 **Scalar Floating-Point Subset** (SFFS) i.e. is not implementing VMX/VSX,
@@ -43,7 +44,7 @@ of instructions required to the minimum seems necessary.
 
 Therefore, we are proposing adding:
 
-* FPR load-immediate using `BF16` as the constant
+* FPR load-immediate equivalent partially to `BF16`
 * FPR <-> GPR data-transfer instructions that just copy bits without conversion
 * FPR <-> GPR combined data-transfer/conversion instructions that do
   Integer <-> FP conversions
@@ -79,13 +80,16 @@ work on *both* Fixed *and* Floating Point operands and results.
 
 The interactions with SVP64 are explained in the [[int_fp_mv/appendix]]
 
-# Float load immediate
+# Float load immediate
 
-These arelike a variant of `fmvfg`. Power ISA currently requires a large
+These are like a variant of `fmvfg` and `oris`, combined.
+Power ISA currently requires a large
 number of instructions to get Floating Point constants into registers.
-FP16 and BF16 Formats both fit into 16-bit immediates.
+`fmvis` on its own is equivalent to BF16 to FP32/64 conversion,
+but if followed up by `fishmv` an additional 16 bits of accuracy in the
+mantissa may be achieved.
 
-## Load BF16 Immediate
+## Load BF16 Immediate
 
 `fmvis FRT, FI`
 
@@ -134,28 +138,28 @@ Pseudocode:
     fp32 = bf16 || [0]*16
     FRT = Single_to_Double(fp32)
 
-## Load FP16 Immediate
+## Floating Extend Immediate
 
-`fishmv FRT, FI`
+`fishmv FRS, FI`
 
-Interprets `FI` as an IEEE754 16-bit float, which is then converted to a
-64-bit float and written to `FRT`. This is equivalent to interpreting
-`FI` as a `FP16` and converting to 64-bit float.
-
-There is no need for an Rc=1 variant because this is an immediate loading
-instruction. This frees up one extra bit in the DX-Form format for packing
-a full `FP16`.
+Equivalent to `oris`, an additional 16-bits of immediate is
+strategically inserted into `FRS` to extend its accuracy to
+a full FP32, if a prior `fmvis` instruction had been used to
+set the upper 16-bits.
 
 `fishmv` fits with DX-Form:
 
 | 0-5    | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form |
 |--------|------|-------|-------|-------|-----|-----|
-| Major  | FRT  | d1    | d0    | XO    | d2  | DX-Form |
+| Major  | FRS  | d1    | d0    | XO    | d2  | DX-Form |
 
 Pseudocode:
 
-    fp16 = d0 || d1 || d2
-    FRT = Half_to_Double(fp16)
+    fp32 = FRS[48:63] || d0 || d1 || d2
+    FRT = Single_to_Double(fp32)
+
+*This instruction performs a Read-Modify-Write. FRS is read, the additional
+16 bit immediate inserted, and the result also written to FRS*
 
 # Moves
 
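
Reviewer note: the `fmvis` / `fishmv` pairing introduced by this patch can be sanity-checked with a minimal Python sketch. It is a model under stated assumptions, not the Libre-SOC implementation. All helper names (`fi_from_dx`, `bits_to_f32`, `f32_to_bits`, and the Python functions `fmvis` and `fishmv`) are hypothetical. The field widths (d0 = 10 bits, d1 = 5 bits, d2 = 1 bit) are read off the DX-Form table in the patch. The `fishmv` model follows the prose (keep the upper 16 bits of the FP32 value set by a prior `fmvis`, insert the immediate as the lower 16 bits) rather than the literal `FRS[48:63]` indexing in the new pseudocode, so treat that part as an interpretation.

```python
import struct


def fi_from_dx(d0: int, d1: int, d2: int) -> int:
    """Assemble the 16-bit immediate as d0 || d1 || d2 (d0 most significant)."""
    return ((d0 & 0x3FF) << 6) | ((d1 & 0x1F) << 1) | (d2 & 0x1)


def bits_to_f32(bits: int) -> float:
    """Reinterpret 32 bits as an IEEE754 single; Python widens it to a double."""
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFFFFFF))[0]


def f32_to_bits(value: float) -> int:
    """Narrow a double holding a single-representable value back to 32 bits."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def fmvis(fi: int) -> float:
    """fp32 = bf16 || [0]*16 ; FRT = Single_to_Double(fp32)."""
    return bits_to_f32((fi & 0xFFFF) << 16)


def fishmv(frs: float, fi: int) -> float:
    """Read-modify-write (assumed semantics): keep the upper 16 bits of the
    FP32 view of FRS and insert the immediate as the lower 16 bits."""
    upper = f32_to_bits(frs) & 0xFFFF0000
    return bits_to_f32(upper | (fi & 0xFFFF))


if __name__ == "__main__":
    coarse = fmvis(0x4049)           # BF16-precision approximation of pi
    fine = fishmv(coarse, 0x0FDB)    # extended to full FP32 precision
    print(coarse, fine)
```

In this model `fmvis(0x4049)` gives a BF16-precision value near pi, and the follow-up `fishmv` with `0x0FDB` fills in the remaining 16 mantissa bits, which is the two-instruction sequence the patch describes. NaN payload handling through `struct` is not modelled carefully here.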