# SimpleV SVP64 polymorphic element width overrides

SimpleV, the Draft Cray-style Vectorisation for OpenPOWER, may
independently override both or either of the source or destination
register bitwidth in the base operation used to create the Vector
operation.  In the case of IEEE754 FP operands this gives an
opportunity to add `FP16` as well.as `BF16` to the Power ISA
with no actual new Scalar opcodes.

However there is the potential for confusion as to the definition
of what Single and Double mean when the operand width has been
over-ridden.  Simple-V therefore sets the following
 "reinterpretation" rules:

* any operation whose assembler mnemonic does not end in "s"
  (being defined in v3.0B as a "double" operation) is
  instead an operation at the overridden elwidth for the
  relevant operand, instead of a 64 bit "Double"
* any operation nominally defined as a "single" FP operation
  is redefined to be **half the elwidth** rather than
  "half of 64 bit" (32 bit, aka "Single")

Examples:

* `sv.fmvtg/sw=32 RT.v, FRA.v` is defined as treating FRA
   as a vector of *FP32* source operands each *32* bits wide
   which are to be placed into *64* bit integer destination elements.
* `sv.fmvfgs/dw=32 FRT.v, RA.v` is defined as taking the bottom
   32 bits of each RA integer source, then performing a **32 bit**
   FP32 to **FP16** conversion and storing the result in the
   **32 bits** of an FRT destination element.

"Single" is therefore redefined in SVP64 to be "half elwidth"
rather than Double width hardcoded to 64 and Single width
hardcoded to 32.  This allows a full range of conversions
between FP64, FP32, FP16 and BF16.

Note however that attempts to perform "Single" operations on
FP16 elwidths will raise an illegal instruction trap: Half
of FP16 is FP8, which is not defined as a legal IEEE754 format.

# Simple-V SVP64 Saturation

SVP64 also allows for Saturation, such that the result is truncated
to the maximum or minimum range of the result operand rather than
overflowing.

There will be some interaction here with Conversion routines which
will need careful application of the SVP64 Saturation rules: some
work will be duplicated by the operation itself, but in some cases
it will change the result.

The critical thing to note is that SVP64 Saturation is to be considered
as the "priority override" where the operation should take place at
"Infinite bitwidth followed by a result post-analysis phase".

Thus if by chance an unsigned conversion to INT was carried out,
with a destination override to 16 bit results, in combination
with a **signed** SVP64 Saturation override, the result would
be truncated to within the range 0 to 0x7FFF.  The actual
operation itself, being an *Unsigned* conversion, would set the
minimum value to zero, whilst the SVP64 *Signed* Saturation
would set the maximum to a Signed 16 bit integer.

As always with SVP64, some thought and care has to be put into
how the override behaviour will interact with the base scalar
operation.

# Equivalent OpenPower ISA v3.0 Assembly Language for FP -> Integer Conversion Modes

## c (IEEE754 standard compliant)

```
int32_t toInt32(double number)
{
    uint32_t result = (int32_t)number;
    return result;
}
```

### 64-bit float -> 32-bit signed integer

```
toInt32(double):
        fctiwz 1,1
        addi 9,1,-16
        stfiwx 1,0,9
        lwz 3,-16(1)
        extsw 3,3
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0
```

## Rust

```pub fn fcvttgd_rust(v: f64) -> i64 {
    v as i64
}

pub fn fcvttgud_rust(v: f64) -> u64 {
    v as u64
}

pub fn fcvttgw_rust(v: f64) -> i32 {
    v as i32
}

pub fn fcvttguw_rust(v: f64) -> u32 {
    v as u32
}
```

### 64-bit float -> 64-bit signed integer

```
.LCPI0_0:
        .long   0xdf000000
.LCPI0_1:
        .quad   0x43dfffffffffffff
example::fcvttgd_rust:
.Lfunc_gep0:
        addis 2, 12, .TOC.-.Lfunc_gep0@ha
        addi 2, 2, .TOC.-.Lfunc_gep0@l
        addis 3, 2, .LCPI0_0@toc@ha
        fctidz 2, 1
        fcmpu 5, 1, 1
        li 4, 1
        li 5, -1
        lfs 0, .LCPI0_0@toc@l(3)
        addis 3, 2, .LCPI0_1@toc@ha
        rldic 4, 4, 63, 0
        fcmpu 0, 1, 0
        lfd 0, .LCPI0_1@toc@l(3)
        stfd 2, -8(1)
        ld 3, -8(1)
        fcmpu 1, 1, 0
        cror 24, 0, 3
        isel 3, 4, 3, 24
        rldic 4, 5, 0, 1
        isel 3, 4, 3, 5
        isel 3, 0, 3, 23
        blr
        .long   0
        .quad   0
```

### 64-bit float -> 64-bit unsigned integer

```
.LCPI1_0:
        .long   0x00000000
.LCPI1_1:
        .quad   0x43efffffffffffff
example::fcvttgud_rust:
.Lfunc_gep1:
        addis 2, 12, .TOC.-.Lfunc_gep1@ha
        addi 2, 2, .TOC.-.Lfunc_gep1@l
        addis 3, 2, .LCPI1_0@toc@ha
        fctiduz 2, 1
        li 4, -1
        lfs 0, .LCPI1_0@toc@l(3)
        addis 3, 2, .LCPI1_1@toc@ha
        fcmpu 0, 1, 0
        lfd 0, .LCPI1_1@toc@l(3)
        stfd 2, -8(1)
        ld 3, -8(1)
        fcmpu 1, 1, 0
        cror 20, 0, 3
        isel 3, 0, 3, 20
        isel 3, 4, 3, 5
        blr
        .long   0
        .quad   0
```

### 64-bit float -> 32-bit signed integer

```
.LCPI2_0:
        .long   0xcf000000
.LCPI2_1:
        .quad   0x41dfffffffc00000
example::fcvttgw_rust:
.Lfunc_gep2:
        addis 2, 12, .TOC.-.Lfunc_gep2@ha
        addi 2, 2, .TOC.-.Lfunc_gep2@l
        addis 3, 2, .LCPI2_0@toc@ha
        fctiwz 2, 1
        lis 4, -32768
        lis 5, 32767
        lfs 0, .LCPI2_0@toc@l(3)
        addis 3, 2, .LCPI2_1@toc@ha
        fcmpu 0, 1, 0
        lfd 0, .LCPI2_1@toc@l(3)
        addi 3, 1, -4
        stfiwx 2, 0, 3
        fcmpu 5, 1, 1
        lwz 3, -4(1)
        fcmpu 1, 1, 0
        cror 24, 0, 3
        isel 3, 4, 3, 24
        ori 4, 5, 65535
        isel 3, 4, 3, 5
        isel 3, 0, 3, 23
        blr
        .long   0
        .quad   0
```

### 64-bit float -> 32-bit unsigned integer

```
.LCPI3_0:
        .long   0x00000000
.LCPI3_1:
        .quad   0x41efffffffe00000
example::fcvttguw_rust:
.Lfunc_gep3:
        addis 2, 12, .TOC.-.Lfunc_gep3@ha
        addi 2, 2, .TOC.-.Lfunc_gep3@l
        addis 3, 2, .LCPI3_0@toc@ha
        fctiwuz 2, 1
        li 4, -1
        lfs 0, .LCPI3_0@toc@l(3)
        addis 3, 2, .LCPI3_1@toc@ha
        fcmpu 0, 1, 0
        lfd 0, .LCPI3_1@toc@l(3)
        addi 3, 1, -4
        stfiwx 2, 0, 3
        lwz 3, -4(1)
        fcmpu 1, 1, 0
        cror 20, 0, 3
        isel 3, 0, 3, 20
        isel 3, 4, 3, 5
        blr
        .long   0
        .quad   0
```

## JavaScript

```
#include <stdint.h>

namespace WTF {
template<typename Target, typename Src>
inline Target bitwise_cast(Src v) {
    union {
        Src s;
        Target t;
    } u;
    u.s = v;
…    if (exp < 32) {
        int32_t missingOne = 1 << exp;
        result &= missingOne - 1;
        result += missingOne;
    }

    // If the input value was negative (we could test either 'number' or 'bits',
    // but testing 'bits' is likely faster) invert the result appropriately.
    return bits < 0 ? -result : result;
}
```

### 64-bit float -> 32-bit signed integer

```
toInt32(double):
        stfd 1,-16(1)
        li 3,0
        ori 2,2,0
        ld 9,-16(1)
        rldicl 8,9,12,53
        addi 10,8,-1023
        cmplwi 7,10,83
        bgtlr 7
        cmpwi 7,10,52
        bgt 7,.L7
        cmpwi 7,10,31
        subfic 3,10,52
        srad 3,9,3
        extsw 3,3
        bgt 7,.L4
        li 8,1
        slw 10,8,10
        addi 8,10,-1
        and 3,8,3
        add 10,10,3
        extsw 3,10
.L4:
        cmpdi 7,9,0
        bgelr 7
.L8:
        neg 3,3
        extsw 3,3
        blr
.L7:
        cmpdi 7,9,0
        addi 3,8,-1075
        sld 3,9,3
        extsw 3,3
        bgelr 7
        b .L8
        .long 0
        .byte 0,9,0,0,0,0,0,0
```