# RFC ls013 Min/Max GPR/FPR
**URLs**:
*
*
*
**Severity**: Major
**Status**: New
**Date**: 14 Apr 2023
**Target**: v3.2B
**Source**: v3.1B
**Books and Section affected**:
```
Book I Fixed-Point and Floating-Point Instructions
Appendix E Power ISA sorted by opcode
Appendix F Power ISA sorted by version
Appendix G Power ISA sorted by Compliancy Subset
Appendix H Power ISA sorted by mnemonic
```
**Summary**
```
Instructions added
```
**Submitter**: Luke Leighton (Libre-SOC)
**Requester**: Libre-SOC
**Impact on processor**:
```
Addition of new GPR-based and FPR-based instructions
```
**Impact on software**:
```
Requires support for new instructions in assembler, debuggers,
and related tools.
```
**Keywords**:
```
GPR, FPR, min, max, fmin, fmax
```
**Motivation**
Minimum/Maximum are common operations that can take an astounding number of
operations to implement in software. Additionally, Vector Reduce-Min/Max are
common vector operations, and SVP64 Parallel Reduction needs a single Scalar
instruction in order to effectively implement Reduce-Min/Max.
**Notes and Observations**:
1. SVP64 REMAP Parallel Reduction needs a single Scalar instruction to
work with, for best effectiveness. With no SFFS minimum/maximum
instructions Simple-V min/max Parallel Reduction is severely compromised.
2. Once one FP min/max mode is implemented the rest are not much more hardware.
3. There exists similar instructions in VSX (not IEEE754-2019 though).
This is frequently used to justify not adding them. However SVP64/VSX may
have different meaning from SVP64/SFFS, so it is *really* crucial to have
SFFS ops even if "equivalent" to VSX in order for SVP64 to not be
compromised (non-orthogonal).
4. FP min/max are rather complex to implement in software, the most commonly
used FP max function `fmax` from glibc compiled for SFFS is an astounding
32 instructions.
**Changes**
Add the following entries to:
* the Appendices of Book I
* Book I 3.3.9 Fixed-Point Arithmetic Instructions
* Book I 4.6.6.1 Floating-Point Elementary Arithmetic Instructions
* Book I 1.6.1 and 1.6.2
----------------
\newpage{}
# Floating-Point Instructions
This group is to provide Floating-Point min/max however with IEEE754 having advanced
to 2019 there are now subtle differences. These are selectable with a Mode Field, `FMM`.
## `FMM` -- Floating Min/Max Mode
| `FMM`| Extended Mnemonic | Origin | Semantics |
|------|-------------------------------|--------------------|--------------------------------------------|
| 0000 | fminnum08[s] FRT,FRA,FRB | IEEE 754-2008 | minNum(FRA,FRB) (1) |
| 0001 | fmin19[s] FRT,FRA,FRB | IEEE 754-2019 | minimum(FRA,FRB) |
| 0010 | fminnum19[s] FRT,FRA,FRB | IEEE 754-2019 | minimumNumber(FRA,FRB) |
| 0011 | fminc[s] FRT,FRA,FRB | x86 minss (4) | FRA\FRB ? FRA:FRB |
| 1100 | fmaxmagnum08[s] FRT,FRA,FRB | IEEE 754-2008 (3) | mmmag(FRA,FRB,True,fmaxnum08) (2) |
| 1101 | fmaxmag19[s] FRT,FRA,FRB | IEEE 754-2019 | mmmag(FRA,FRB,True,fmax19) (2) |
| 1110 | fmaxmagnum19[s] FRT,FRA,FRB | IEEE 754-2019 | mmmag(FRA,FRB,True,fmaxnum19) (2) |
| 1111 | fmaxmagc[s] FRT,FRA,FRB | - | mmmag(FRA,FRB,True,fmaxc) (2) |
Note (1): for the purposes of minNum/maxNum, -0.0 is defined to be less than
+0.0. This is left unspecified in IEEE 754-2008.
Note (2): mmmag(x, y, cmp, fallback) is defined as:
```python
def mmmag(x, y, is_max, fallback):
a = abs(x) < abs(y)
b = abs(x) > abs(y)
if is_max:
a, b = b, a # swap
if a:
return x
if b:
return y
# equal magnitudes, or NaN input(s)
return fallback(x, y)
```
Note (3): TODO: icr if IEEE 754-2008 has min/maxMagNum like IEEE 754-2019's
minimum/maximumMagnitudeNumber
Note (4) or Win32's min macro
----------------
\newpage{}
## Floating Minimum/Maximum MM-form
* fminmax FRT, FRA, FRB, FMM
* fminmax. FRT, FRA, FRB, FMM
```
|0 |6 |11 |16 |21 |25 |31 |
| PO | FRT | FRA | FRB | FMM | XO | Rc |
```
```
result <- [0] * 64
a <- (FRA)
b <- (FRB)
abs_a <- 0b0 || a[1:63]
abs_b <- 0b0 || b[1:63]
a_is_nan <- abs_a >u 0x7FF0_0000_0000_0000
a_is_snan <- a_is_nan and a[12] = 0
b_is_nan <- abs_b >u 0x7FF0_0000_0000_0000
b_is_snan <- b_is_nan and b[12] = 0
any_snan <- a_is_snan or b_is_snan
a_quieted <- a
a_quieted[12] = 1
b_quieted <- b
b_quieted[12] = 1
if a_is_nan or b_is_nan then
if FMM[2:3] = 0b00 then # min/maxnum08
if a_is_snan then result <- a_quieted
else if b_is_snan then result <- b_quieted
else if a_is_nan and b_is_nan then result <- a_quieted
else if a_is_nan then result <- b
else result <- a
if FMM[2:3] = 0b01 then # min/max19
if a_is_nan then result <- a_quieted
else result <- b_quieted
if FMM[2:3] = 0b10 then # min/maxnum19
if a_is_nan and b_is_nan then result <- a_quieted
else if a_is_nan then result <- b
else result <- a
if FMM[2:3] = 0b11 then # min/maxc
result <- b
else
cmp_l <- a
cmp_r <- b
if FMM[1] then # min/maxmag
if abs_a != abs_b then
cmp_l <- abs_a
cmp_r <- abs_b
if FMM[2:3] = 0b11 then # min/maxc
if abs_a = 0 then cmp_l <- 0
if abs_b = 0 then cmp_r <- 0
if FMM[0] then # max
# swap cmp_* so comparison goes the other way
cmp_l, cmp_r <- cmp_r, cmp_l
if cmp_l[0] = 1 then
if cmp_r[0] = 0 then result <- a
else if cmp_l >u cmp_r then
# IEEE 754 is sign-magnitude,
# so bigger magnitude negative is smaller
result <- a
else result <- b
else if cmp_r[0] = 1 then result <- b
else if cmp_l
* bit 0: set if word variant else dword
* bit 1: set if signed else unsigned
* bit 2: set if max else min
| `MMM` | Extended Mnemonic | Semantics |
|-------|-------------------|----------------------------------------------|
| 000 | `minu RT,RA,RB` | `(uint64_t)RA < (uint64_t)RB ? RA : RB` |
| 001 | `maxu RT,RA,RB` | `(uint64_t)RA > (uint64_t)RB ? RA : RB` |
| 010 | `mins RT,RA,RB` | ` (int64_t)RA < (int64_t)RB ? RA : RB` |
| 011 | `maxs RT,RA,RB` | ` (int64_t)RA > (int64_t)RB ? RA : RB` |
| 100 | `minuw RT,RA,RB` | `(uint32_t)RA < (uint32_t)RB ? RA : RB` |
| 101 | `maxuw RT,RA,RB` | `(uint32_t)RA > (uint32_t)RB ? RA : RB` |
| 110 | `minsw RT,RA,RB` | ` (int32_t)RA < (int32_t)RB ? RA : RB` |
| 111 | `maxsw RT,RA,RB` | ` (int32_t)RA > (int32_t)RB ? RA : RB` |
## Minimum/Maximum MM-Form
* minmax RT, RA, RB, MMM
* minmax. RT, RA, RB, MMM
```
|0 |6 |11 |16 |21 |24 |25 |31 |
| PO | RT | RA | RB | MMM | / | XO | Rc |
```
```
a <- (RA|0)
b <- (RB)
if MMM[0] then # word mode
# shift left by XLEN/2 to make the dword comparison
# do word comparison of the original inputs
a <- a[XLEN/2:XLEN-1] || [0] * XLEN/2
b <- b[XLEN/2:XLEN-1] || [0] * XLEN/2
if MMM[1] then # signed mode
# invert sign bits to make the unsigned comparison
# do signed comparison of the original inputs
a[0] <- ¬a[0]
b[0] <- ¬b[0]
# if Rc = 1 then store the result of comparing a and b to CR0
if Rc = 1 then
if a u b then
CR0 <- 0b010 || XER.SO
if MMM[2] then # max mode
# swap a and b to make the less than comparison do
# greater than comparison of the original inputs
t <- a
a <- b
b <- t
# store the entire selected source (even in word mode)
# if Rc = 1 then store the result of comparing a and b to CR0
if a
#include
inline uint64_t asuint64(double f) {
union {
double f;
uint64_t i;
} u = {f};
return u.i;
}
inline int issignaling(double v) {
// copied from glibc:
// https://github.com/bminor/glibc/blob/e2756903/sysdeps/ieee754/dbl-64/math_config.h#L101
uint64_t ix = asuint64(v);
return 2 * (ix ^ 0x0008000000000000) > 2 * 0x7ff8000000000000ULL;
}
double fmax(double x, double y) {
// copied from glibc:
// https://github.com/bminor/glibc/blob/e2756903/math/s_fmax_template.c
if(__builtin_isgreaterequal(x, y))
return x;
else if(__builtin_isless(x, y))
return y;
else if(issignaling(x) || issignaling(y))
return x + y;
else
return __builtin_isnan(y) ? x : y;
}
```
Assembly listing:
```
fmax(double, double):
fcmpu 0,1,2
fmr 0,1
cror 30,1,2
beq 7,.L12
blt 0,.L13
stfd 1,-16(1)
lis 9,0x8
li 8,-1
sldi 9,9,32
rldicr 8,8,0,11
ori 2,2,0
ld 10,-16(1)
xor 10,10,9
sldi 10,10,1
cmpld 0,10,8
bgt 0,.L5
stfd 2,-16(1)
ori 2,2,0
ld 10,-16(1)
xor 9,10,9
sldi 9,9,1
cmpld 0,9,8
ble 0,.L6
.L5:
fadd 1,0,2
blr
.L13:
fmr 1,2
blr
.L6:
fcmpu 0,2,2
fmr 1,2
bnulr 0
.L12:
fmr 1,0
blr
.long 0
.byte 0,9,0,0,0,0,0,0
```
[[!tag opf_rfc]]