# Transcendental operations
-To be updated to OpenPOWER.
-
Summary:
-*This proposal extends OpenPOWER scalar floating point operations to
+*This proposal extends Power ISA scalar floating point operations to
add IEEE754 transcendental functions (pow, log etc) and trigonometric
functions (sin, cos etc). These functions are also 98% shared with the
Khronos Group OpenCL Extended Instruction Set.*
See:
-* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
+* <http://bugs.libre-soc.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
* Discussion: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002342.html>
-* [[rv_major_opcode_1010011]] for opcode listing.
-* [[zfpacc_proposal]] for accuracy settings proposal
+* [[power_trans_ops]] for opcode listing.
Extension subsets:
Minimum recommended requirements for 3D: Zftrans, Ztrignpi,
Zarctrignpi, with Ztrigpi and Zarctrigpi as augmentations.
-Minimum recommended requirements for Mobile-Embedded 3D: Ztrignpi, Zftrans, with Ztrigpi as an augmentation.
+Minimum recommended requirements for Mobile-Embedded 3D:
+Ztrignpi, Zftrans, with Ztrigpi as an augmentation.
# TODO:
# Requirements <a name="requirements"></a>
-This proposal is designed to meet a wide range of extremely diverse needs,
-allowing implementors from all of them to benefit from the tools and hardware
-cost reductions associated with common standards adoption in RISC-V (primarily IEEE754 and Vulkan).
+This proposal is designed to meet a wide range of extremely diverse
+needs, allowing implementors from all of them to benefit from the tools
+and hardware cost reductions associated with common standards adoption
+in Power ISA (primarily IEEE754 and Vulkan).
**There are *four* different, disparate platforms' needs (two new)**:
to use the (equivalent) proposed opcode covering the same function.
* "Fast" opcodes are *not* being proposed, because the Khronos Specification
fast\_length, fast\_normalise and fast\_distance OpenCL opcodes require
- vectors (or can be done as scalar operations using other RISC-V instructions).
+ vectors (or can be done as scalar operations using other Power ISA
+ instructions).
The OpenCL FP32 opcodes are **direct** equivalents to the proposed opcodes.
Deviation from conformance with the Khronos Specification - including the
[[!table data="""
opcode | OpenCL FP32 | OpenCL FP16 | OpenCL native | OpenCL fast | IEEE754 | Power ISA |
-FSIN | sin | half\_sin | native\_sin | NONE | sin | |
-FCOS | cos | half\_cos | native\_cos | NONE | cos | |
-FTAN | tan | half\_tan | native\_tan | NONE | tan | |
-NONE (1) | sincos | NONE | NONE | NONE | NONE | |
-FASIN | asin | NONE | NONE | NONE | asin | |
-FACOS | acos | NONE | NONE | NONE | acos | |
-FATAN | atan | NONE | NONE | NONE | atan | |
-FSINPI | sinpi | NONE | NONE | NONE | sinPi | |
-FCOSPI | cospi | NONE | NONE | NONE | cosPi | |
-FTANPI | tanpi | NONE | NONE | NONE | tanPi | |
-FASINPI | asinpi | NONE | NONE | NONE | asinPi | |
-FACOSPI | acospi | NONE | NONE | NONE | acosPi | |
-FATANPI | atanpi | NONE | NONE | NONE | atanPi | |
-FSINH | sinh | NONE | NONE | NONE | sinh | |
-FCOSH | cosh | NONE | NONE | NONE | cosh | |
-FTANH | tanh | NONE | NONE | NONE | tanh | |
-FASINH | asinh | NONE | NONE | NONE | asinh | |
-FACOSH | acosh | NONE | NONE | NONE | acosh | |
-FATANH | atanh | NONE | NONE | NONE | atanh | |
-FATAN2 | atan2 | NONE | NONE | NONE | atan2 | |
-FATAN2PI | atan2pi | NONE | NONE | NONE | atan2pi | |
+FSIN | sin | half\_sin | native\_sin | NONE | sin | NONE |
+FCOS | cos | half\_cos | native\_cos | NONE | cos | NONE |
+FTAN | tan | half\_tan | native\_tan | NONE | tan | NONE |
+NONE (1) | sincos | NONE | NONE | NONE | NONE | NONE |
+FASIN | asin | NONE | NONE | NONE | asin | NONE |
+FACOS | acos | NONE | NONE | NONE | acos | NONE |
+FATAN | atan | NONE | NONE | NONE | atan | NONE |
+FSINPI | sinpi | NONE | NONE | NONE | sinPi | NONE |
+FCOSPI | cospi | NONE | NONE | NONE | cosPi | NONE |
+FTANPI | tanpi | NONE | NONE | NONE | tanPi | NONE |
+FASINPI | asinpi | NONE | NONE | NONE | asinPi | NONE |
+FACOSPI | acospi | NONE | NONE | NONE | acosPi | NONE |
+FATANPI | atanpi | NONE | NONE | NONE | atanPi | NONE |
+FSINH | sinh | NONE | NONE | NONE | sinh | NONE |
+FCOSH | cosh | NONE | NONE | NONE | cosh | NONE |
+FTANH | tanh | NONE | NONE | NONE | tanh | NONE |
+FASINH | asinh | NONE | NONE | NONE | asinh | NONE |
+FACOSH | acosh | NONE | NONE | NONE | acosh | NONE |
+FATANH | atanh | NONE | NONE | NONE | atanh | NONE |
+FATAN2 | atan2 | NONE | NONE | NONE | atan2 | NONE |
+FATAN2PI | atan2pi | NONE | NONE | NONE | atan2pi | NONE |
FRSQRT | rsqrt | half\_rsqrt | native\_rsqrt | NONE | rSqrt | frsqrte, frsqrtes (4) |
-FCBRT | cbrt | NONE | NONE | NONE | NONE (2) | |
-FEXP2 | exp2 | half\_exp2 | native\_exp2 | NONE | exp2 | |
-FLOG2 | log2 | half\_log2 | native\_log2 | NONE | log2 | |
-FEXPM1 | expm1 | NONE | NONE | NONE | expm1 | |
-FLOG1P | log1p | NONE | NONE | NONE | logp1 | |
-FEXP | exp | half\_exp | native\_exp | NONE | exp | |
-FLOG | log | half\_log | native\_log | NONE | log | |
-FEXP10 | exp10 | half\_exp10 | native\_exp10 | NONE | exp10 | |
-FLOG10 | log10 | half\_log10 | native\_log10 | NONE | log10 | |
-FPOW | pow | NONE | NONE | NONE | pow | |
-FPOWN | pown | NONE | NONE | NONE | pown | |
-FPOWR | powr | half\_powr | native\_powr | NONE | powr | |
-FROOTN | rootn | NONE | NONE | NONE | rootn | |
-FHYPOT | hypot | NONE | NONE | NONE | hypot | |
+FCBRT | cbrt | NONE | NONE | NONE | NONE (2) | NONE |
+FEXP2 | exp2 | half\_exp2 | native\_exp2 | NONE | exp2 | NONE |
+FLOG2 | log2 | half\_log2 | native\_log2 | NONE | log2 | NONE |
+FEXPM1 | expm1 | NONE | NONE | NONE | expm1 | NONE |
+FLOG1P | log1p | NONE | NONE | NONE | logp1 | NONE |
+FEXP | exp | half\_exp | native\_exp | NONE | exp | NONE |
+FLOG | log | half\_log | native\_log | NONE | log | NONE |
+FEXP10 | exp10 | half\_exp10 | native\_exp10 | NONE | exp10 | NONE |
+FLOG10 | log10 | half\_log10 | native\_log10 | NONE | log10 | NONE |
+FPOW | pow | NONE | NONE | NONE | pow | NONE |
+FPOWN | pown | NONE | NONE | NONE | pown | NONE |
+FPOWR | powr | half\_powr | native\_powr | NONE | powr | NONE |
+FROOTN | rootn | NONE | NONE | NONE | rootn | NONE |
+FHYPOT | hypot | NONE | NONE | NONE | hypot | NONE |
FRECIP | NONE | half\_recip | native\_recip | NONE | NONE (3) | fre, fres (4) |
-NONE | NONE | NONE | NONE | NONE | compound | |
-NONE | NONE | NONE | NONE | NONE | exp2m1 | |
-NONE | NONE | NONE | NONE | NONE | exp10m1 | |
-NONE | NONE | NONE | NONE | NONE | log2p1 | |
-NONE | NONE | NONE | NONE | NONE | log10p1 | |
+NONE | NONE | NONE | NONE | NONE | compound | NONE |
+NONE | NONE | NONE | NONE | NONE | exp2m1 | NONE |
+NONE | NONE | NONE | NONE | NONE | exp10m1 | NONE |
+NONE | NONE | NONE | NONE | NONE | log2p1 | NONE |
+NONE | NONE | NONE | NONE | NONE | log10p1 | NONE |
"""]]
Note (1) FSINCOS is macro-op fused (see below).
## List of 2-arg opcodes
-[[!table data="""
-opcode | Description | pseudocode | Extension |
-FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
-FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
-FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
-FPOWN | x power of n (n int) | rd = pow(rs1, rs2) | ZftransAdv |
-FPOWR | x power of y (x +ve) | rd = exp(rs1 log(rs2)) | ZftransAdv |
-FROOTN | x power 1/n (n integer)| rd = pow(rs1, 1/rs2) | ZftransAdv |
-FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | ZftransAdv |
-"""]]
+| opcode | Description | pseudocode | Extension |
+| ------ | ---------------- | ---------------- | ----------- |
+| FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
+| FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
+| FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
+| FPOWN | x power of n (n int) | rd = pow(rs1, rs2) | ZftransAdv |
+| FPOWR | x power of y (x +ve) | rd = exp(rs2 * log(rs1)) | ZftransAdv |
+| FROOTN | x power 1/n (n integer)| rd = pow(rs1, 1/rs2) | ZftransAdv |
+| FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | ZftransAdv |
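
The pseudocode column above can be read as a set of scalar reference functions. The following is a minimal sketch in Python, assuming double precision and the standard `math` module; the function names (`fatan2pi`, `fpowr`, `frootn`, `fhypot`) are hypothetical helpers chosen to mirror the opcode names, not part of any specification. Special cases (NaN, infinities, zero bases) are deliberately not modelled.

```python
import math

def fatan2pi(y, x):
    # atan2 takes the y coordinate first, then x; result is scaled by 1/pi
    return math.atan2(y, x) / math.pi

def fpowr(x, y):
    # powr is only defined for a non-negative base; this sketch assumes x > 0
    return math.exp(y * math.log(x))

def frootn(x, n):
    # n-th root with integer n; a negative x is only valid for odd n
    if x < 0.0 and n % 2 != 0:
        return -math.pow(-x, 1.0 / n)
    return math.pow(x, 1.0 / n)

def fhypot(x, y):
    # math.hypot avoids the intermediate overflow of sqrt(x*x + y*y)
    return math.hypot(x, y)

print(fatan2pi(1.0, 1.0))  # a quarter turn of pi
print(fpowr(2.0, 3.0))
print(frootn(-8.0, 3))
print(fhypot(3.0, 4.0))
```

Writing `fpowr` as `exp(y * log(x))` makes the restriction to a positive base explicit, which is the property that distinguishes it from the general `pow`.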
## List of 1-arg transcendental opcodes
-[[!table data="""
-opcode | Description | pseudocode | Extension |
-FRSQRT | Reciprocal Square-root | rd = sqrt(rs1) | Zfrsqrt |
-FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | ZftransAdv |
-FRECIP | Reciprocal | rd = 1.0 / rs1 | Zftrans |
-FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
-FLOG2 | log2 | rd = log(2. rs1) | Zftrans |
-FEXPM1 | exponential minus 1 | rd = pow(e, rs1) - 1.0 | ZftransExt |
-FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | ZftransExt |
-FEXP | exponential | rd = pow(e, rs1) | ZftransExt |
-FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
-FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
-FLOG10 | log base 10 | rd = log(10, rs1) | ZftransExt |
-"""]]
+| opcode | Description | pseudocode | Extension |
+| ------ | ---------------- | ---------------- | ----------- |
+| FRSQRT | Reciprocal Square-root | rd = 1.0 / sqrt(rs1) | Zfrsqrt |
+| FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | ZftransAdv |
+| FRECIP | Reciprocal | rd = 1.0 / rs1 | Zftrans |
+| FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
+| FLOG2 | log2 | rd = log(2, rs1) | Zftrans |
+| FEXPM1 | exponential minus 1 | rd = pow(e, rs1) - 1.0 | ZftransExt |
+| FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | ZftransExt |
+| FEXP | exponential | rd = pow(e, rs1) | ZftransExt |
+| FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
+| FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
+| FLOG10 | log base 10 | rd = log(10, rs1) | ZftransExt |
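
As an illustration of why dedicated opcodes such as FEXPM1 and FLOG1P are worth having alongside FEXP and FLOG, the sketch below models a few of the 1-arg operations in Python and compares `expm1` against the naive `exp(x) - 1` for a small argument. The function names are hypothetical and double precision is assumed; special cases (NaN, infinities, overflow) are not modelled.

```python
import math

def frsqrt(x):
    # reciprocal square root
    return 1.0 / math.sqrt(x)

def fcbrt(x):
    # cube root that also accepts negative inputs
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def fexp2(x):
    return 2.0 ** x

def flog2(x):
    return math.log2(x)

def fexpm1(x):
    # exp(x) - 1 without cancellation for small x
    return math.expm1(x)

def flog1p(x):
    # log(1 + x) without losing the low bits of small x
    return math.log1p(x)

x = 1e-10
naive = math.exp(x) - 1.0   # subtracting 1.0 cancels most significant bits
accurate = fexpm1(x)

# For tiny x, exp(x) - 1 is approximately x, so compare both against x.
print("naive relative deviation:   ", abs(naive - x) / x)
print("accurate relative deviation:", abs(accurate - x) / x)
```

The naive form loses roughly half of the significant digits here, which is the motivation for treating the "minus 1" and "plus 1" variants as separate operations rather than compositions.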
## List of 1-arg trigonometric opcodes
-[[!table data="""
-opcode | Description | pseudo-code | Extension |
-FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
-FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
-FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
-FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
-FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
-FATAN | arctan (radians) | rd = atan(rs1) | Zarctrignpi |
-FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
-FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
-FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
-FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
-FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
-FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
-FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
-FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
-FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
-FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
-FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
-FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
-"""]]
+| opcode | Description | pseudocode | Extension |
+| ------ | ---------------- | ---------------- | ----------- |
+| FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
+| FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
+| FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
+| FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
+| FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
+| FATAN | arctan (radians) | rd = atan(rs1) | Zarctrignpi |
+| FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
+| FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
+| FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
+| FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
+| FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
+| FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
+| FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
+| FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
+| FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
+| FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
+| FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
+| FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
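
The "times pi" variants differ from the radian forms in one practically useful way: argument reduction can be carried out exactly, because the period is 2 rather than the irrational 2*pi. A minimal Python sketch of this idea for FSINPI follows; `fsinpi` is a hypothetical reference helper, not a description of any hardware algorithm, and it assumes a finite double-precision input.

```python
import math

def fsinpi(x):
    # fmod is exact for floats, so reducing modulo the period 2 loses nothing
    r = math.fmod(x, 2.0)          # r in (-2, 2)
    if r > 1.0:
        r -= 2.0
    elif r < -1.0:
        r += 2.0
    # r is now in [-1, 1]; fold onto [-0.5, 0.5] using sin(pi*r) = sin(pi*(1 - r))
    if r > 0.5:
        r = 1.0 - r
    elif r < -0.5:
        r = -1.0 - r
    return math.sin(math.pi * r)

print(fsinpi(1.0))                 # exact zero at integer arguments
print(math.sin(math.pi * 1.0))     # naive form leaves a small residue
print(fsinpi(1.5))
```

With the radian form, the rounding error in representing pi is baked into the argument before the function ever sees it, so `sin(pi * 1.0)` cannot return an exact zero.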
# Subsets