# RFC ls013 Min/Max GPR/FPR
-**URLs**:
-
+* Funded by NLnet under the Privacy and Enhanced Trust Programme, EU
+ Horizon2020 Grant 825310, and NGI0 Entrust No 101069594
* <https://libre-soc.org/openpower/sv/rfc/ls013/>
* <https://git.openpower.foundation/isa/PowerISA/issues/TODO>
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057>
**Notes and Observations**:
1. SVP64 REMAP Parallel Reduction needs a single Scalar instruction to
- work with, for best effectiveness. With no SFFS minimum/maximum instructions
- Simple-V min/max Parallel Reduction is severely compromised.
-2. Once one FP min/max mode is implemented the rest are not much more
- hardware.
-3. There exists similar instructions in VSX (not IEEE754-2019 though).
- This is frequently used to justify not
- adding them. However SVP64/VSX may have different meaning from SVP64/SFFS,
- so it is *really* crucial to have SFFS ops even if "equivalent" to VSX
- in order for SVP64 to not be compromised (non-orthogonal).
+ work with, for best effectiveness. With no SFFS minimum/maximum
+ instructions Simple-V min/max Parallel Reduction is severely compromised.
+2. Once one FP min/max mode is implemented, the rest need little extra hardware.
+3. There exist similar instructions in VSX (not IEEE754-2019 though).
+ This is frequently used to justify not adding them. However SVP64/VSX may
+ have different meaning from SVP64/SFFS, so it is *really* crucial to have
+ SFFS ops even if "equivalent" to VSX in order for SVP64 to not be
+ compromised (non-orthogonal).
4. FP min/max are rather complex to implement in software: the most commonly
- used FP max function `fmax` from glibc compiled for SFFS is an
- astounding 32 instructions.
+ used FP max function `fmax` from glibc compiled for SFFS is an astounding
+ 32 instructions.
**Changes**
<a id="fmm-floating-min-max-mode"></a>
-| `FMM` | Assembly Alias | Origin | Semantics |
-|-------|-------------------------------|--------------------------------|-------------------------------------------------|
-| 0000 | fminnum08[s] FRT, FRA, FRB | IEEE 754-2008 | FRT = minNum(FRA, FRB) (1) |
-| 0001 | fmin19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = minimum(FRA, FRB) |
-| 0010 | fminnum19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = minimumNumber(FRA, FRB) |
-| 0011 | fminc[s] FRT, FRA, FRB | x86 minss or Win32's min macro | FRT = FRA \< FRB ? FRA : FRB |
-| 0100 | fminmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3)) | FRT = minmaxmag(FRA, FRB, False, fminnum08) (2) |
-| 0101 | fminmag19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = minmaxmag(FRA, FRB, False, fmin19) (2) |
-| 0110 | fminmagnum19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = minmaxmag(FRA, FRB, False, fminnum19) (2) |
-| 0111 | fminmagc[s] FRT, FRA, FRB | - | FRT = minmaxmag(FRA, FRB, False, fminc) (2) |
-| 1000 | fmaxnum08[s] FRT, FRA, FRB | IEEE 754-2008 | FRT = maxNum(FRA, FRB) (1) |
-| 1001 | fmax19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = maximum(FRA, FRB) |
-| 1010 | fmaxnum19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = maximumNumber(FRA, FRB) |
-| 1011 | fmaxc[s] FRT, FRA, FRB | x86 maxss or Win32's max macro | FRT = FRA > FRB ? FRA : FRB |
-| 1100 | fmaxmagnum08[s] FRT, FRA, FRB | IEEE 754-2008 (TODO: (3)) | FRT = minmaxmag(FRA, FRB, True, fmaxnum08) (2) |
-| 1101 | fmaxmag19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = minmaxmag(FRA, FRB, True, fmax19) (2) |
-| 1110 | fmaxmagnum19[s] FRT, FRA, FRB | IEEE 754-2019 | FRT = minmaxmag(FRA, FRB, True, fmaxnum19) (2) |
-| 1111 | fmaxmagc[s] FRT, FRA, FRB | - | FRT = minmaxmag(FRA, FRB, True, fmaxc) (2) |
+<!-- hyphens in table determine width of columns for pandoc -- -->
+| `FMM`| Extended Mnemonic | Origin | Semantics |
+|------|-------------------------------|--------------------|--------------------------------------------|
+| 0000 | fminnum08[s] FRT,FRA,FRB | IEEE 754-2008 | minNum(FRA,FRB) (1) |
+| 0001 | fmin19[s] FRT,FRA,FRB | IEEE 754-2019 | minimum(FRA,FRB) |
+| 0010 | fminnum19[s] FRT,FRA,FRB | IEEE 754-2019 | minimumNumber(FRA,FRB) |
+| 0011 | fminc[s] FRT,FRA,FRB | x86 minss (4) | FRA\<FRB ? FRA:FRB |
+| 0100 | fminmagnum08[s] FRT,FRA,FRB | IEEE 754-2008 (3) | mmmag(FRA,FRB,False,fminnum08) (2) |
+| 0101 | fminmag19[s] FRT,FRA,FRB | IEEE 754-2019 | mmmag(FRA,FRB,False,fmin19) (2) |
+| 0110 | fminmagnum19[s] FRT,FRA,FRB | IEEE 754-2019 | mmmag(FRA,FRB,False,fminnum19) (2) |
+| 0111 | fminmagc[s] FRT,FRA,FRB | - | mmmag(FRA,FRB,False,fminc) (2) |
+| 1000 | fmaxnum08[s] FRT,FRA,FRB | IEEE 754-2008 | maxNum(FRA,FRB) (1) |
+| 1001 | fmax19[s] FRT,FRA,FRB | IEEE 754-2019 | maximum(FRA,FRB) |
+| 1010 | fmaxnum19[s] FRT,FRA,FRB | IEEE 754-2019 | maximumNumber(FRA,FRB) |
+| 1011 | fmaxc[s] FRT,FRA,FRB | x86 maxss (4) | FRA\>FRB ? FRA:FRB |
+| 1100 | fmaxmagnum08[s] FRT,FRA,FRB | IEEE 754-2008 (3) | mmmag(FRA,FRB,True,fmaxnum08) (2) |
+| 1101 | fmaxmag19[s] FRT,FRA,FRB | IEEE 754-2019 | mmmag(FRA,FRB,True,fmax19) (2) |
+| 1110 | fmaxmagnum19[s] FRT,FRA,FRB | IEEE 754-2019 | mmmag(FRA,FRB,True,fmaxnum19) (2) |
+| 1111 | fmaxmagc[s] FRT,FRA,FRB | - | mmmag(FRA,FRB,True,fmaxc) (2) |
Note (1): for the purposes of minNum/maxNum, -0.0 is defined to be less than
+0.0. This is left unspecified in IEEE 754-2008.
-Note (2): minmaxmag(x, y, cmp, fallback) is defined as:
+Note (2): mmmag(x, y, cmp, fallback) is defined as:
```python
-def minmaxmag(x, y, is_max, fallback):
+def mmmag(x, y, is_max, fallback):
a = abs(x) < abs(y)
b = abs(x) > abs(y)
if is_max:
Note (3): TODO: check whether IEEE 754-2008 has min/maxMagNum like IEEE 754-2019's
minimum/maximumMagnitudeNumber
+Note (4): or Win32's min/max macros
+
----------------
\newpage{}
-## Floating Minimum/Maximum
-
-A-Form
-
+## Floating Minimum/Maximum MM-form
* fminmax FRT, FRA, FRB, FMM
* fminmax. FRT, FRA, FRB, FMM
```
- |0 |6 |11 |16 |21 |26 |31 |
- | PO | FRT | FRA | FRB | FMM[0:3] / | XO | Rc |
+ |0    |6    |11   |16   |21   |25   |31  |
+ | PO  | FRT | FRA | FRB | FMM | XO  | Rc |
```
+```
+ result <- [0] * 64
+ a <- (FRA)
+ b <- (FRB)
+ abs_a <- 0b0 || a[1:63]
+ abs_b <- 0b0 || b[1:63]
+ a_is_nan <- abs_a >u 0x7FF0_0000_0000_0000
+ a_is_snan <- a_is_nan and a[12] = 0
+ b_is_nan <- abs_b >u 0x7FF0_0000_0000_0000
+ b_is_snan <- b_is_nan and b[12] = 0
+ any_snan <- a_is_snan or b_is_snan
+ a_quieted <- a
+ a_quieted[12] <- 1
+ b_quieted <- b
+ b_quieted[12] <- 1
+ if a_is_nan or b_is_nan then
+     if FMM[2:3] = 0b00 then # min/maxnum08
+         if a_is_snan then result <- a_quieted
+         else if b_is_snan then result <- b_quieted
+         else if a_is_nan and b_is_nan then result <- a_quieted
+         else if a_is_nan then result <- b
+         else result <- a
+     if FMM[2:3] = 0b01 then # min/max19
+         if a_is_nan then result <- a_quieted
+         else result <- b_quieted
+     if FMM[2:3] = 0b10 then # min/maxnum19
+         if a_is_nan and b_is_nan then result <- a_quieted
+         else if a_is_nan then result <- b
+         else result <- a
+     if FMM[2:3] = 0b11 then # min/maxc
+         result <- b
+ else
+     cmp_l <- a
+     cmp_r <- b
+     if FMM[1] then # min/maxmag
+         if abs_a != abs_b then
+             cmp_l <- abs_a
+             cmp_r <- abs_b
+     if FMM[2:3] = 0b11 then # min/maxc
+         if abs_a = 0 then cmp_l <- 0
+         if abs_b = 0 then cmp_r <- 0
+     if FMM[0] then # max
+         # swap cmp_* so comparison goes the other way
+         cmp_l, cmp_r <- cmp_r, cmp_l
+     if cmp_l[0] = 1 then
+         if cmp_r[0] = 0 then result <- a
+         else if cmp_l >u cmp_r then
+             # IEEE 754 is sign-magnitude,
+             # so bigger magnitude negative is smaller
+             result <- a
+         else result <- b
+     else if cmp_r[0] = 1 then result <- b
+     else if cmp_l <u cmp_r then result <- a
+     else result <- b
+ if any_snan then SetFX(FPSCR.VXSNAN)
+ if FPSCR.VE = 0 or ¬any_snan then (FRT) <- result
+```
+
+Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
+result in FRT.
+
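
As an illustration only, the selection logic described above can be modelled on raw 64-bit patterns in Python. This is a minimal sketch, not part of the proposal: the names `fminmax_model`, `order_key` and `to_bits` are hypothetical, the sign-magnitude comparison is replaced by an equivalent integer sort key, and the FPSCR side effects are reduced to a returned `any_snan` flag.

```python
import struct

ABS_MASK = (1 << 63) - 1            # clears the sign bit (MSB0 bit 0)
INF_BITS = 0x7FF0_0000_0000_0000    # magnitudes above this are NaNs
QUIET_BIT = 1 << 51                 # MSB0 bit 12: set for quiet NaNs


def to_bits(x):
    """Bit pattern of a Python float as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def order_key(bits):
    """Map a sign-magnitude pattern to an int that sorts like the value,
    with -0.0 ordered just below +0.0 (assumed helper, not in the RFC)."""
    mag = bits & ABS_MASK
    return -mag - 1 if bits >> 63 else mag


def fminmax_model(a, b, fmm):
    """Hypothetical model: returns (result_bits, any_snan)."""
    is_max = bool(fmm & 0b1000)     # FMM[0]
    is_mag = bool(fmm & 0b0100)     # FMM[1]
    family = fmm & 0b0011           # FMM[2:3]
    abs_a, abs_b = a & ABS_MASK, b & ABS_MASK
    a_nan, b_nan = abs_a > INF_BITS, abs_b > INF_BITS
    a_snan = a_nan and not (a & QUIET_BIT)
    b_snan = b_nan and not (b & QUIET_BIT)
    any_snan = a_snan or b_snan
    if a_nan or b_nan:
        if family == 0b00:          # min/maxnum08
            if a_snan:
                result = a | QUIET_BIT
            elif b_snan:
                result = b | QUIET_BIT
            elif a_nan and b_nan:
                result = a | QUIET_BIT
            else:
                result = b if a_nan else a
        elif family == 0b01:        # min/max19: NaNs propagate
            result = (a if a_nan else b) | QUIET_BIT
        elif family == 0b10:        # min/maxnum19: prefer the number
            if a_nan and b_nan:
                result = a | QUIET_BIT
            else:
                result = b if a_nan else a
        else:                       # min/maxc: second operand wins
            result = b
        return result, any_snan
    cmp_l, cmp_r = a, b
    if is_mag and abs_a != abs_b:   # min/maxmag: compare magnitudes
        cmp_l, cmp_r = abs_a, abs_b
    if family == 0b11:              # min/maxc: -0.0 and +0.0 compare equal
        if abs_a == 0:
            cmp_l = 0
        if abs_b == 0:
            cmp_r = 0
    if is_max:                      # swap so "less than" selects the max
        cmp_l, cmp_r = cmp_r, cmp_l
    result = a if order_key(cmp_l) < order_key(cmp_r) else b
    return result, any_snan
```

With this model, `fminmax_model(to_bits(1.0), to_bits(float("nan")), 0b1010)` (fmaxnum19) selects the bit pattern of `1.0`, while the same operands with `0b1001` (fmax19) select a quiet NaN.
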
Special Registers altered:
```
FX VXSNAN
CR1 (if Rc=1)
```
-Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
-result in FRT.
-Assembly Aliases: see
-[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)
+Extended Mnemonics:
-----------
+see [`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)
-## Floating Minimum/Maximum Single
+----------
-A-Form
+## Floating Minimum/Maximum Single MM-form
* fminmaxs FRT, FRA, FRB, FMM
* fminmaxs. FRT, FRA, FRB, FMM
```
- |0 |6 |11 |16 |21 |26 |31 |
- | PO | FRT | FRA | FRB | FMM[0:3] / | XO | Rc |
+ |0    |6    |11   |16   |21   |25   |31  |
+ | PO  | FRT | FRA | FRB | FMM | XO  | Rc |
```
+Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
+result in FRT.
+
Special Registers altered:
```
CR1 (if Rc=1)
```
+Extended Mnemonics:
-Compute the minimum/maximum of FRA and FRB, according to FMM, and store the
-result in FRT.
-
-Assembly Aliases: see
-[`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)
+see [`FMM` -- Floating Min/Max Mode](#fmm-floating-min-max-mode)
----------
These are signed and unsigned, min or max. SVP64 Prefixing defines Saturation
semantics, so Saturated variants of these instructions need not be proposed.
-## Integer Min/Max Mode
+## `MMM` -- Integer Min/Max Mode
+
+<a id="mmm-integer-min-max-mode"></a>
* bit 0: set if word variant else dword
* bit 1: set if signed else unsigned
* bit 2: set if max else min
-| `IMM` | Assembly Alias | Semantics |
-|-------|------------------|----------------------------------------------|
-| 000 | `minu RT,RA,RB` | `RT = (uint64_t)RA < (uint64_t)RB ? RA : RB` |
-| 001 | `maxu RT,RA,RB` | `RT = (uint64_t)RA > (uint64_t)RB ? RA : RB` |
-| 010 | `mins RT,RA,RB` | `RT = (int64_t)RA < (int64_t)RB ? RA : RB` |
-| 011 | `maxs RT,RA,RB` | `RT = (int64_t)RA > (int64_t)RB ? RA : RB` |
-| 100 | `minuw RT,RA,RB` | `RT = (uint32_t)RA < (uint32_t)RB ? RA : RB` |
-| 101 | `maxuw RT,RA,RB` | `RT = (uint32_t)RA > (uint32_t)RB ? RA : RB` |
-| 110 | `minsw RT,RA,RB` | `RT = (int32_t)RA < (int32_t)RB ? RA : RB` |
-| 111 | `maxsw RT,RA,RB` | `RT = (int32_t)RA > (int32_t)RB ? RA : RB` |
+| `MMM` | Extended Mnemonic | Semantics |
+|-------|-------------------|----------------------------------------------|
+| 000 | `minu RT,RA,RB` | `(uint64_t)RA < (uint64_t)RB ? RA : RB` |
+| 001 | `maxu RT,RA,RB` | `(uint64_t)RA > (uint64_t)RB ? RA : RB` |
+| 010 | `mins RT,RA,RB` | ` (int64_t)RA < (int64_t)RB ? RA : RB` |
+| 011 | `maxs RT,RA,RB` | ` (int64_t)RA > (int64_t)RB ? RA : RB` |
+| 100 | `minuw RT,RA,RB` | `(uint32_t)RA < (uint32_t)RB ? RA : RB` |
+| 101 | `maxuw RT,RA,RB` | `(uint32_t)RA > (uint32_t)RB ? RA : RB` |
+| 110 | `minsw RT,RA,RB` | ` (int32_t)RA < (int32_t)RB ? RA : RB` |
+| 111 | `maxsw RT,RA,RB` | ` (int32_t)RA > (int32_t)RB ? RA : RB` |
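
The table's semantics can be sketched in Python as follows. This is an illustrative model only: `minmax_model` and `to_signed` are hypothetical names, registers are taken to be 64-bit unsigned integers, and CR0 handling is left out.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned `bits`-wide value as two's complement."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def minmax_model(ra, rb, mmm):
    """Hypothetical model of the MMM table: returns the value for RT."""
    word = bool(mmm & 0b100)        # MMM[0]: compare the low 32 bits only
    signed = bool(mmm & 0b010)      # MMM[1]: signed comparison
    is_max = bool(mmm & 0b001)      # MMM[2]: max instead of min
    bits = 32 if word else 64
    mask = (1 << bits) - 1
    a, b = ra & mask, rb & mask
    if signed:
        a, b = to_signed(a, bits), to_signed(b, bits)
    chosen_a = a > b if is_max else a < b
    # the entire selected source is returned, even in word mode
    return ra if chosen_a else rb
```

For example, `minmax_model(0x1_0000_0000, 5, 0b100)` (minuw) compares only the low words `0` and `5`, yet returns the full `0x1_0000_0000`.
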
-## Integer Min/Max MM-Form
+## Minimum/Maximum MM-Form
* minmax RT, RA, RB, MMM
* minmax. RT, RA, RB, MMM
```
```
- a <- (RA)
+ a <- (RA|0)
b <- (RB)
if MMM[0] then # word mode
# shift left by XLEN/2 to make the dword comparison
if MMM[1] then # signed mode
# invert sign bits to make the unsigned comparison
# do signed comparison of the original inputs
- a[0] <- !a[0] # convert
- b[0] <- !b[0]
+ a[0] <- ¬a[0]
+ b[0] <- ¬b[0]
+ # if Rc = 1 then store the result of comparing a and b to CR0
+ if Rc = 1 then
+ if a <u b then
+ CR0 <- 0b100 || XER.SO
+ if a = b then
+ CR0 <- 0b001 || XER.SO
+ if a >u b then
+ CR0 <- 0b010 || XER.SO
if MMM[2] then # max mode
# swap a and b to make the less than comparison do
# greater than comparison of the original inputs
a <- b
b <- t
# store the entire selected source (even in word mode)
- if a <u b then RT <- (RA)
- else RT <- (RB)
+ # if Rc = 1 then store the result of comparing a and b to CR0
+ if a <u b then RT <- (RA|0)
+ else RT <- (RB)
```
-Compute the integer minimum/maximum according to `MMM` of `RA` and `RB` and
-store the result in `RT`.
+Compute the integer minimum/maximum according to `MMM` of `(RA|0)` and `(RB)`
+and store the result in `RT`.
Special Registers altered:
CR0 (if Rc=1)
```
+Extended Mnemonics:
+
+see [`MMM` -- Integer Min/Max Mode](#mmm-integer-min-max-mode)
+
----------
\newpage{}
# Instruction Formats
-Add the following entries to Book I 1.6.1.15 X-FORM:
+Add the following entries to Book I 1.6.1 Word Instruction Formats:
+
+## MM-FORM
```
- |0 |6 |11 |16 |21 |26 |31 |
- | PO | FRT | FRA | FRB | FMM[0:3] / | XO | Rc |
+ |0    |6    |11   |16   |21   |24   |25   |31  |
+ | PO  | FRT | FRA | FRB | FMM       | XO  | Rc |
+ | PO  | RT  | RA  | RB  | MMM | /   | XO  | Rc |
```
-Add a new field to Book I 1.6.2 Word Instruction Fields:
+Add the following new fields to Book I 1.6.2 Word Instruction Fields:
```
FMM (21:24)
Field used to specify minimum/maximum mode for fminmax[s].
- Formats: A
+ Formats: MM
+
+ MMM (21:23)
+ Field used to specify minimum/maximum mode for integer minmax.
+
+ Formats: MM
```
+Add `MM` to the `Formats:` list for all of `FRT`, `FRA`, `FRB`, `XO (25:30)`,
+`Rc`, `RT`, `RA` and `RB`.
+
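
To make the field positions concrete, here is a small Python sketch that pulls the MM-form fields out of a 32-bit instruction word using the MSB0 bit numbering of the layout above. The helper names `field` and `decode_mm` are hypothetical, and no PO or XO values are assumed since none are given here.

```python
def field(insn, first, last):
    """Extract bits first..last (MSB0 numbering, inclusive) of a 32-bit word."""
    width = last - first + 1
    return (insn >> (31 - last)) & ((1 << width) - 1)


def decode_mm(insn):
    """Split a word into MM-form fields (illustrative, not a real decoder)."""
    return {
        "PO": field(insn, 0, 5),
        "T": field(insn, 6, 10),     # FRT or RT
        "A": field(insn, 11, 15),    # FRA or RA
        "B": field(insn, 16, 20),    # FRB or RB
        "FMM": field(insn, 21, 24),  # floating-point forms
        "MMM": field(insn, 21, 23),  # integer form; bit 24 is reserved
        "XO": field(insn, 25, 30),
        "Rc": field(insn, 31, 31),
    }
```

`FMM` and `MMM` overlap by design: which one is meaningful depends on the instruction selected by PO and XO.
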
----------
\newpage{}
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
-| Form | Book | Page | Version | mnemonic | Description |
+| Form | Book | Page | Version | Mnemonic | Description |
|------|------|------|---------|----------|-------------|
-| A | I | # | 3.2B | fminmax | Floating Minimum/Maximum |
-| A | I | # | 3.2B | fminmaxs | Floating Minimum/Maximum Single |
-| ??? | I | # | 3.2B | minmax | Minimum/max Signed/Unsigned |
+| MM | I | # | 3.2B | fminmax | Floating Minimum/Maximum |
+| MM | I | # | 3.2B | fminmaxs | Floating Minimum/Maximum Single |
+| MM | I | # | 3.2B | minmax | Minimum/Maximum |
## fmax instruction count
32 instructions are required in SFFS to emulate fmax.
-<https://gcc.godbolt.org/z/6xba61To6>
+```
+ #include <stdint.h>
+ #include <string.h>
+
+ inline uint64_t asuint64(double f) {
+     union {
+         double f;
+         uint64_t i;
+     } u = {f};
+     return u.i;
+ }
+
+ inline int issignaling(double v) {
+     // copied from glibc:
+     // https://github.com/bminor/glibc/blob/e2756903/sysdeps/ieee754/dbl-64/math_config.h#L101
+     uint64_t ix = asuint64(v);
+     return 2 * (ix ^ 0x0008000000000000) > 2 * 0x7ff8000000000000ULL;
+ }
+
+ double fmax(double x, double y) {
+     // copied from glibc:
+     // https://github.com/bminor/glibc/blob/e2756903/math/s_fmax_template.c
+     if(__builtin_isgreaterequal(x, y))
+         return x;
+     else if(__builtin_isless(x, y))
+         return y;
+     else if(issignaling(x) || issignaling(y))
+         return x + y;
+     else
+         return __builtin_isnan(y) ? x : y;
+ }
+```
+
+Assembly listing:
```
fmax(double, double):