# RFC ls002 v2 Floating-Point Load-Immediate

**URLs**:

* <https://libre-soc.org/openpower/sv/int_fp_mv/#fmvis>
* <https://libre-soc.org/openpower/sv/rfc/ls002/>
* <https://bugs.libre-soc.org/show_bug.cgi?id=944>
* <https://git.openpower.foundation/isa/PowerISA/issues/87>

**Severity**: Major

**Status**: New

**Date**: 05 Oct 2022

**Target**: v3.2B

**Source**: v3.0B

**Books and Section affected**:

```
    Book I Scalar Floating-Point 4.6.2.1
    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic
```

**Summary**

```
    Instructions added
    fmvis - Floating-Point Move Immediate, Shifted
    fishmv - Floating-Point Immediate, Second-half Move
```

**Submitter**: Luke Leighton (Libre-SOC)

**Requester**: Libre-SOC

**Impact on processor**:

```
    Addition of two new FPR-based instructions
```

**Impact on software**:

```
    Requires support for new instructions in assembler, debuggers,
    and related tools.
```

**Keywords**:

```
    FPR, Floating-point, Load-immediate, BF16, bfloat16, BFP32
```

**Motivation**

These instructions are similar to `lxvkq` but extend the idea: a bfloat16
immediate is loaded with one 32-bit instruction, and a full FP32 immediate
with two 32-bit instructions. They always save a Data Load and the
associated L1 and TLB lookups. Even quickly clearing an FPR to zero
presently needs a Load.

**Notes and Observations**:

1. There is no need for an Rc=1 variant because these are
  immediate-loading instructions (an FPR equivalent to `li`).
2. There is no need to alter Special Registers (FP Flags) because these
  are immediate-loading instructions.  No FPR Load Operations
  alter `FPSCR`, nor does `lxvkq`, and on that basis neither
  should these instructions.
3. `fishmv` as a FRT-only Read-Modify-Write (instead of an unnecessary
  FRT,FRA pair) saves five potential bits, making
  the difference between a 5-bit XO (VA/DX-Form) and requiring an entire
  Primary Opcode.

**Changes**

Add the following entries to:

* the Appendices of Book I
* Instructions of Book I as a new Section 4.6.2.1
* DX-Form of Book I Section 1.6.1.6 and 1.6.2
* Floating-Point Data Format of Book I Section 4.3.1

----------------

\newpage{}

# Floating-Point Move Immediate

`fmvis FRT, D`

|  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form    |
|--------|------|-------|-------|-------|-----|---------|
|  Major | FRT  | d1    | d0    | XO    | d2  | DX-Form |

Pseudocode:

```
    bf16 <- d0 || d1 || d2  # create bfloat16 immediate
    bfp32 <- bf16 || [0]*16 # convert bfloat16 to BFP32
    FRT <- DOUBLE(bfp32)    # convert BFP32 to BFP64
```

Special registers altered:

    None

The value `D << 16` is interpreted as a 32-bit float, converted to a
64-bit float and written to `FRT`.  This is equivalent to reinterpreting
`D` as a `bfloat16` and converting to 64-bit float.

Examples:

```
    fmvis f4, 0 # writes +0.0 to f4 (clears an FPR)
    fmvis f4, 0x8000 # writes -0.0 to f4
    fmvis f4, 0x3F80 # writes +1.0 to f4
    fmvis f4, 0xBFC0 # writes -1.5 to f4
    fmvis f4, 0x7FC0 # writes +qNaN to f4
    fmvis f4, 0x7F80 # writes +Infinity to f4
    fmvis f4, 0xFF80 # writes -Infinity to f4
    fmvis f4, 0x3FFF # writes +1.9921875 to f4
```
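
The pseudocode and examples above can be checked with a small software model. The following is a minimal Python sketch, not a reference implementation: the helper names `dx_immediate`, `fmvis` and `fpr_value` are hypothetical, and the BFP32-to-BFP64 step relies on Python's `struct` module performing the (exact) widening conversion.

```python
import struct

def dx_immediate(d0: int, d1: int, d2: int) -> int:
    """Concatenate the DX-Form fields as d0 || d1 || d2 (10 + 5 + 1 bits)."""
    return ((d0 & 0x3FF) << 6) | ((d1 & 0x1F) << 1) | (d2 & 0x1)

def fmvis(d: int) -> int:
    """Model of `fmvis FRT, D`: return the 64-bit image written to FRT."""
    bfp32_bits = (d & 0xFFFF) << 16  # bf16 || [0]*16
    # Reinterpret the 32 bits as BFP32; Python floats are BFP64,
    # so unpacking performs the DOUBLE() widening step.
    (value,) = struct.unpack(">f", struct.pack(">I", bfp32_bits))
    (bfp64_bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return bfp64_bits

def fpr_value(bfp64_bits: int) -> float:
    """Interpret a 64-bit FPR image as a Python float (illustration only)."""
    (value,) = struct.unpack(">d", struct.pack(">Q", bfp64_bits))
    return value
```

With this model, `fpr_value(fmvis(0xBFC0))` evaluates to `-1.5` and `fmvis(0x3F80)` to `0x3FF0000000000000`, matching the examples above. NaN payload handling may differ between host platforms, so the sketch should not be relied on for signalling NaNs.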

# Floating-Point Immediate Second-Half Move

`fishmv FRT, D`

DX-Form:

|  0-5   | 6-10 | 11-15 | 16-25 | 26-30 | 31  | Form    |
|--------|------|-------|-------|-------|-----|---------|
|  Major | FRT  | d1    | d0    | XO    | d2  | DX-Form |

Pseudocode:

```
    n <- (FRT)                      # read FRT
    bfp32 <- SINGLE(n)              # convert to BFP32
    bfp32[16:31] <- d0 || d1 || d2  # replace LSB half
    FRT <- DOUBLE(bfp32)            # convert back to BFP64
```

Special registers altered:

    None

An additional 16 bits of immediate are
inserted into the low-order half of the single-format value
corresponding to the contents of FRT.

**This instruction performs a Read-Modify-Write on FRT.**
In hardware, `fishmv` may be macro-op-fused with `fmvis`.
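
The Read-Modify-Write behaviour can be modelled in software as follows. This is a minimal Python sketch with hypothetical helper names (`single_bits`, `double_bits`, `fishmv`). It assumes FRT already holds a value that is exactly representable as BFP32, as is the case after `fmvis`; for other values Python's `struct` rounds (or raises `OverflowError`), which is not necessarily what `SINGLE()` does.

```python
import struct

def single_bits(bfp64_bits: int) -> int:
    """SINGLE(): 64-bit FPR image -> BFP32 bit pattern.

    Assumption: the value is exactly representable as BFP32.
    """
    (value,) = struct.unpack(">d", struct.pack(">Q", bfp64_bits))
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits

def double_bits(bfp32_bits: int) -> int:
    """DOUBLE(): BFP32 bit pattern -> 64-bit FPR image."""
    (value,) = struct.unpack(">f", struct.pack(">I", bfp32_bits))
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    return bits

def fishmv(frt_bits: int, d: int) -> int:
    """Model of `fishmv FRT, D`: return the new 64-bit image of FRT."""
    bfp32 = single_bits(frt_bits)                # read FRT, convert to BFP32
    bfp32 = (bfp32 & 0xFFFF0000) | (d & 0xFFFF)  # bfp32[16:31] <- D
    return double_bits(bfp32)                    # convert back to BFP64
```

Starting from the image of +1.0 (`0x3FF0000000000000`), `fishmv(..., 0x8000)` yields `0x3FF0100000000000` in this model.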

Programmer's note:
The use of these two instructions is strategically similar to
how `li` combined with `oris` may be used to construct 32-bit Integers.
If a prior `fmvis` instruction has been used to set the upper 16 bits
of a BFP32 value, `fishmv` may then be used to set the lower 16 bits.

Example:

```
    # these two combined instructions write 0x3f808000
    # into f4 as a BFP32 to be converted to a BFP64.
    # actual contents in f4 after conversion: 0x3ff0_1000_0000_0000
    # first the upper bits, happens to be +1.0
    fmvis f4, 0x3F80 # writes +1.0 to f4
    # now write the lower 16 bits of a BFP32
    fishmv f4, 0x8000 # writes +1.00390625 to f4
```
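
An assembler or compiler could derive the two immediates in the example above mechanically. The following Python sketch shows one way to do so; `split_bfp32` and `emit` are hypothetical helpers, and the input is assumed to be exactly representable as BFP32 (otherwise `struct` rounds it first).

```python
import struct

def split_bfp32(value: float) -> tuple:
    """Return (upper, lower) 16-bit halves of the BFP32 encoding of value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits >> 16, bits & 0xFFFF

def emit(reg: str, value: float) -> list:
    """Produce the fmvis / fishmv sequence that loads value into reg."""
    upper, lower = split_bfp32(value)
    lines = [f"fmvis {reg}, 0x{upper:04X}"]
    if lower:  # fishmv is only needed when the low half is non-zero
        lines.append(f"fishmv {reg}, 0x{lower:04X}")
    return lines
```

For `1.00390625` this produces the `fmvis f4, 0x3F80` and `fishmv f4, 0x8000` pair shown above; for `1.0` a single `fmvis` suffices.
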

-------------

\newpage{}

# DX-Form

Add the following to Book I, 1.6.1.6, DX-Form

```
  |0    |6   |11   |16   |26   |31
  | PO  | FRT|   d1|   d0|   XO|d2
```
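
To illustrate how the 16-bit immediate is scattered across the DX-Form fields, the following Python sketch packs and unpacks an instruction word. It is an assumption-laden illustration: `encode_dx` and `decode_dx` are hypothetical names, and the `po` and `xo` arguments are placeholders because this RFC does not assign opcode values here. Bit 0 is the most significant bit of the word, as in the layout above.

```python
def encode_dx(po: int, frt: int, d: int, xo: int) -> int:
    """Pack a DX-Form word; po and xo are caller-supplied placeholders."""
    d &= 0xFFFF
    d0 = d >> 6           # upper 10 bits of D -> bits 16:25
    d1 = (d >> 1) & 0x1F  # next 5 bits of D   -> bits 11:15
    d2 = d & 0x1          # lowest bit of D    -> bit 31
    return (((po & 0x3F) << 26) | ((frt & 0x1F) << 21) | (d1 << 16)
            | (d0 << 6) | ((xo & 0x1F) << 1) | d2)

def decode_dx(word: int) -> dict:
    """Unpack a DX-Form word and rebuild D as d0 || d1 || d2."""
    d1 = (word >> 16) & 0x1F
    d0 = (word >> 6) & 0x3FF
    d2 = word & 0x1
    return {
        "PO": (word >> 26) & 0x3F,
        "FRT": (word >> 21) & 0x1F,
        "XO": (word >> 1) & 0x1F,
        "D": (d0 << 6) | (d1 << 1) | d2,
    }
```

A round trip such as `decode_dx(encode_dx(5, 4, 0xBFC1, 3))` returns the original field values, where `5` and `3` are arbitrary stand-ins for PO and XO.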

Add `DX` to `FRT` Field in Book I, 1.6.2

```
 FRT (6:10)
     Field used to specify an FPR to be used as a
     target.
     Formats: D, X, DX
```

# bfloat16 definition

Add the following to Book I, 4.3.1:

The format may be a 16-bit bfloat16, 32-bit single format for a
single-precision value...

The bfloat16 format is used as an immediate.

The structure of the bfloat16, single and double formats is shown below.

```
  |S |EXP| FRACTION|
  |0 |1 8|9      15|
```

Figure #. Binary floating-point bfloat16 format
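
As an illustration of the field layout above, the following Python sketch decodes a bfloat16 bit pattern directly from its S, EXP and FRACTION fields, using the same exponent bias (127) as the single format. The function name `bfloat16_value` is hypothetical, and the sketch does not distinguish quiet from signalling NaNs.

```python
import math

def bfloat16_value(bits: int) -> float:
    """Decode a 16-bit bfloat16 pattern (bit 0 is the most significant)."""
    s = (bits >> 15) & 0x1     # bit 0: sign
    exp = (bits >> 7) & 0xFF   # bits 1:8: biased exponent
    frac = bits & 0x7F         # bits 9:15: fraction
    sign = -1.0 if s else 1.0
    if exp == 0xFF:            # infinity or NaN
        return sign * math.inf if frac == 0 else math.nan
    if exp == 0:               # zero or denormal
        return sign * frac / 128 * 2.0 ** -126
    return sign * (1 + frac / 128) * 2.0 ** (exp - 127)
```

For example, `bfloat16_value(0x3F80)` is `1.0` and `bfloat16_value(0xBFC0)` is `-1.5`.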

# Appendices

    Appendix E Power ISA sorted by opcode
    Appendix F Power ISA sorted by version
    Appendix G Power ISA sorted by Compliancy Subset
    Appendix H Power ISA sorted by mnemonic

| Form | Book | Page | Version | mnemonic | Description |
|------|------|------|---------|----------|-------------|
| DX   | I    | #    | 3.0B    | fmvis    | Floating-Point Move Immediate, Shifted |
| DX   | I    | #    | 3.0B    | fishmv   | Floating-Point Immediate, Second-half Move |