From 400eefe43c64441798ab2f3ab898508bcaf43c97 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Thu, 7 Jul 2022 16:13:46 +0100
Subject: [PATCH] update transcendental tables

---
 openpower/transcendentals.mdwn | 189 ++++++++++++++++-----------------
 1 file changed, 93 insertions(+), 96 deletions(-)

diff --git a/openpower/transcendentals.mdwn b/openpower/transcendentals.mdwn
index 725167099..a789982d9 100644
--- a/openpower/transcendentals.mdwn
+++ b/openpower/transcendentals.mdwn
@@ -1,10 +1,8 @@
 # Transcendental operations
 
-To be updated to OpenPOWER.
-
 Summary:
 
-*This proposal extends OpenPOWER scalar floating point operations to
+*This proposal extends Power ISA scalar floating point operations to
 add IEEE754 transcendental functions (pow, log etc) and trigonometric
 functions (sin, cos etc). These functions are also 98% shared with the
 Khronos Group OpenCL Extended Instruction Set.*
@@ -22,11 +20,10 @@ With thanks to:
 
 See:
 
-*
+*
 *
 * Discussion:
-* [[rv_major_opcode_1010011]] for opcode listing.
-* [[zfpacc_proposal]] for accuracy settings proposal
+* [[power_trans_ops]] for opcode listing.
 
 Extension subsets:
 
@@ -45,7 +42,8 @@ Extension subsets:
 Minimum recommended requirements for 3D: Zftrans, Ztrignpi, Zarctrignpi,
 with Ztrigpi and Zarctrigpi as augmentations.
 
-Minimum recommended requirements for Mobile-Embedded 3D: Ztrignpi, Zftrans, with Ztrigpi as an augmentation.
+Minimum recommended requirements for Mobile-Embedded 3D:
+Ztrignpi, Zftrans, with Ztrigpi as an augmentation.
 
 # TODO:
 
@@ -61,9 +59,10 @@ Minimum recommended requirements for Mobile-Embedded 3D: Ztrignpi, Zftrans, with
 
 # Requirements
 
-This proposal is designed to meet a wide range of extremely diverse needs,
-allowing implementors from all of them to benefit from the tools and hardware
-cost reductions associated with common standards adoption in RISC-V (primarily IEEE754 and Vulkan).
+This proposal is designed to meet a wide range of extremely diverse
+needs, allowing implementors from all of them to benefit from the tools
+and hardware cost reductions associated with common standards adoption
+in Power ISA (primarily IEEE754 and Vulkan).
 
 **There are *four* different, disparate platform's needs (two new)**:
 
@@ -134,7 +133,8 @@ and to use the (equivalent) proposed opcode covering the same function.
 
 * "Fast" opcodes are *not* being proposed, because the Khronos Specification
   fast\_length, fast\_normalise and fast\_distance OpenCL opcodes require
-  vectors (or can be done as scalar operations using other RISC-V instructions).
+  vectors (or can be done as scalar operations using other Power ISA
+  instructions).
 
 The OpenCL FP32 opcodes are **direct** equivalents to the proposed opcodes.
 Deviation from conformance with the Khronos Specification - including the
@@ -148,48 +148,48 @@ compound, exp2m1, exp10m1, log2p1, log10p1, pown (integer power) and powr.
 
 [[!table data="""
 opcode | OpenCL FP32 | OpenCL FP16 | OpenCL native | OpenCL fast | IEEE754 | Power ISA |
-FSIN | sin | half\_sin | native\_sin | NONE | sin | |
-FCOS | cos | half\_cos | native\_cos | NONE | cos | |
-FTAN | tan | half\_tan | native\_tan | NONE | tan | |
-NONE (1) | sincos | NONE | NONE | NONE | NONE | |
-FASIN | asin | NONE | NONE | NONE | asin | |
-FACOS | acos | NONE | NONE | NONE | acos | |
-FATAN | atan | NONE | NONE | NONE | atan | |
-FSINPI | sinpi | NONE | NONE | NONE | sinPi | |
-FCOSPI | cospi | NONE | NONE | NONE | cosPi | |
-FTANPI | tanpi | NONE | NONE | NONE | tanPi | |
-FASINPI | asinpi | NONE | NONE | NONE | asinPi | |
-FACOSPI | acospi | NONE | NONE | NONE | acosPi | |
-FATANPI | atanpi | NONE | NONE | NONE | atanPi | |
-FSINH | sinh | NONE | NONE | NONE | sinh | |
-FCOSH | cosh | NONE | NONE | NONE | cosh | |
-FTANH | tanh | NONE | NONE | NONE | tanh | |
-FASINH | asinh | NONE | NONE | NONE | asinh | |
-FACOSH | acosh | NONE | NONE | NONE | acosh | |
-FATANH | atanh | NONE | NONE | NONE | atanh | |
-FATAN2 | atan2 | NONE | NONE | NONE | atan2 | |
-FATAN2PI | atan2pi | NONE | NONE | NONE | atan2pi | |
+FSIN | sin | half\_sin | native\_sin | NONE | sin | NONE |
+FCOS | cos | half\_cos | native\_cos | NONE | cos | NONE |
+FTAN | tan | half\_tan | native\_tan | NONE | tan | NONE |
+NONE (1) | sincos | NONE | NONE | NONE | NONE | NONE |
+FASIN | asin | NONE | NONE | NONE | asin | NONE |
+FACOS | acos | NONE | NONE | NONE | acos | NONE |
+FATAN | atan | NONE | NONE | NONE | atan | NONE |
+FSINPI | sinpi | NONE | NONE | NONE | sinPi | NONE |
+FCOSPI | cospi | NONE | NONE | NONE | cosPi | NONE |
+FTANPI | tanpi | NONE | NONE | NONE | tanPi | NONE |
+FASINPI | asinpi | NONE | NONE | NONE | asinPi | NONE |
+FACOSPI | acospi | NONE | NONE | NONE | acosPi | NONE |
+FATANPI | atanpi | NONE | NONE | NONE | atanPi | NONE |
+FSINH | sinh | NONE | NONE | NONE | sinh | NONE |
+FCOSH | cosh | NONE | NONE | NONE | cosh | NONE |
+FTANH | tanh | NONE | NONE | NONE | tanh | NONE |
+FASINH | asinh | NONE | NONE | NONE | asinh | NONE |
+FACOSH | acosh | NONE | NONE | NONE | acosh | NONE |
+FATANH | atanh | NONE | NONE | NONE | atanh | NONE |
+FATAN2 | atan2 | NONE | NONE | NONE | atan2 | NONE |
+FATAN2PI | atan2pi | NONE | NONE | NONE | atan2pi | NONE |
 FRSQRT | rsqrt | half\_rsqrt | native\_rsqrt | NONE | rSqrt | fsqrte, fsqrtes (4) |
-FCBRT | cbrt | NONE | NONE | NONE | NONE (2) | |
-FEXP2 | exp2 | half\_exp2 | native\_exp2 | NONE | exp2 | |
-FLOG2 | log2 | half\_log2 | native\_log2 | NONE | log2 | |
-FEXPM1 | expm1 | NONE | NONE | NONE | expm1 | |
-FLOG1P | log1p | NONE | NONE | NONE | logp1 | |
-FEXP | exp | half\_exp | native\_exp | NONE | exp | |
-FLOG | log | half\_log | native\_log | NONE | log | |
-FEXP10 | exp10 | half\_exp10 | native\_exp10 | NONE | exp10 | |
-FLOG10 | log10 | half\_log10 | native\_log10 | NONE | log10 | |
-FPOW | pow | NONE | NONE | NONE | pow | |
-FPOWN | pown | NONE | NONE | NONE | pown | |
-FPOWR | powr | half\_powr | native\_powr | NONE | powr | |
-FROOTN | rootn | NONE | NONE | NONE | rootn | |
-FHYPOT | hypot | NONE | NONE | NONE | hypot | |
+FCBRT | cbrt | NONE | NONE | NONE | NONE (2) | NONE |
+FEXP2 | exp2 | half\_exp2 | native\_exp2 | NONE | exp2 | NONE |
+FLOG2 | log2 | half\_log2 | native\_log2 | NONE | log2 | NONE |
+FEXPM1 | expm1 | NONE | NONE | NONE | expm1 | NONE |
+FLOG1P | log1p | NONE | NONE | NONE | logp1 | NONE |
+FEXP | exp | half\_exp | native\_exp | NONE | exp | NONE |
+FLOG | log | half\_log | native\_log | NONE | log | NONE |
+FEXP10 | exp10 | half\_exp10 | native\_exp10 | NONE | exp10 | NONE |
+FLOG10 | log10 | half\_log10 | native\_log10 | NONE | log10 | NONE |
+FPOW | pow | NONE | NONE | NONE | pow | NONE |
+FPOWN | pown | NONE | NONE | NONE | pown | NONE |
+FPOWR | powr | half\_powr | native\_powr | NONE | powr | NONE |
+FROOTN | rootn | NONE | NONE | NONE | rootn | NONE |
+FHYPOT | hypot | NONE | NONE | NONE | hypot | NONE |
 FRECIP | NONE | half\_recip | native\_recip | NONE | NONE (3) | fre, fres (4) |
-NONE | NONE | NONE | NONE | NONE | compound | |
-NONE | NONE | NONE | NONE | NONE | exp2m1 | |
-NONE | NONE | NONE | NONE | NONE | exp10m1 | |
-NONE | NONE | NONE | NONE | NONE | log2p1 | |
-NONE | NONE | NONE | NONE | NONE | log10p1 | |
+NONE | NONE | NONE | NONE | NONE | compound | NONE |
+NONE | NONE | NONE | NONE | NONE | exp2m1 | NONE |
+NONE | NONE | NONE | NONE | NONE | exp10m1 | NONE |
+NONE | NONE | NONE | NONE | NONE | log2p1 | NONE |
+NONE | NONE | NONE | NONE | NONE | log10p1 | NONE |
 """]]
 
 Note (1) FSINCOS is macro-op fused (see below).
@@ -203,57 +203,54 @@ software emulation
 
 ## List of 2-arg opcodes
 
-[[!table data="""
-opcode | Description | pseudocode | Extension |
-FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
-FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
-FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
-FPOWN | x power of n (n int) | rd = pow(rs1, rs2) | ZftransAdv |
-FPOWR | x power of y (x +ve) | rd = exp(rs1 log(rs2)) | ZftransAdv |
-FROOTN | x power 1/n (n integer)| rd = pow(rs1, 1/rs2) | ZftransAdv |
-FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | ZftransAdv |
-"""]]
+| opcode | Description | pseudocode | Extension |
+| ------ | ---------------- | ---------------- | ----------- |
+| FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
+| FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
+| FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
+| FPOWN | x power of n (n int) | rd = pow(rs1, rs2) | ZftransAdv |
+| FPOWR | x power of y (x +ve) | rd = exp(rs1 log(rs2)) | ZftransAdv |
+| FROOTN | x power 1/n (n integer)| rd = pow(rs1, 1/rs2) | ZftransAdv |
+| FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | ZftransAdv |
 
 ## List of 1-arg transcendental opcodes
 
-[[!table data="""
-opcode | Description | pseudocode | Extension |
-FRSQRT | Reciprocal Square-root | rd = sqrt(rs1) | Zfrsqrt |
-FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | ZftransAdv |
-FRECIP | Reciprocal | rd = 1.0 / rs1 | Zftrans |
-FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
-FLOG2 | log2 | rd = log(2. rs1) | Zftrans |
-FEXPM1 | exponential minus 1 | rd = pow(e, rs1) - 1.0 | ZftransExt |
-FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | ZftransExt |
-FEXP | exponential | rd = pow(e, rs1) | ZftransExt |
-FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
-FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
-FLOG10 | log base 10 | rd = log(10, rs1) | ZftransExt |
-"""]]
+| opcode | Description | pseudocode | Extension |
+| ------ | ---------------- | ---------------- | ----------- |
+| FRSQRT | Reciprocal Square-root | rd = sqrt(rs1) | Zfrsqrt |
+| FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | ZftransAdv |
+| FRECIP | Reciprocal | rd = 1.0 / rs1 | Zftrans |
+| FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
+| FLOG2 | log2 | rd = log(2. rs1) | Zftrans |
+| FEXPM1 | exponential minus 1 | rd = pow(e, rs1) - 1.0 | ZftransExt |
+| FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | ZftransExt |
+| FEXP | exponential | rd = pow(e, rs1) | ZftransExt |
+| FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
+| FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
+| FLOG10 | log base 10 | rd = log(10, rs1) | ZftransExt |
 
 ## List of 1-arg trigonometric opcodes
 
-[[!table data="""
-opcode | Description | pseudo-code | Extension |
-FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
-FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
-FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
-FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
-FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
-FATAN | arctan (radians) | rd = atan(rs1) | Zarctrignpi |
-FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
-FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
-FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
-FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
-FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
-FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
-FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
-FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
-FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
-FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
-FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
-FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
-"""]]
+| opcode | Description | pseudocode | Extension |
+| ------ | ---------------- | ---------------- | ----------- |
+| FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
+| FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
+| FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
+| FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
+| FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
+| FATAN | arctan (radians) | rd = atan(rs1) | Zarctrignpi |
+| FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
+| FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
+| FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
+| FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
+| FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
+| FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
+| FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
+| FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
+| FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
+| FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
+| FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
+| FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
 
 # Subsets
 
-- 
2.30.2
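
The pseudocode columns in the opcode tables can be read as a small software reference model. The following is a minimal sketch of that reading, not part of the patch and not the proposal's definition: the names `ONE_ARG`, `TWO_ARG` and `model` are hypothetical, operands are plain Python floats (FP64) rather than register contents, and exceptional cases (zero, negative, NaN and infinite inputs) are not modelled. Where a row's Description and pseudocode columns disagree, the sketch assumes the Description is intended: FRSQRT is taken as `1 / sqrt(rs1)`, FLOG2 as the base-2 logarithm, and FPOWR as `exp(rs2 * log(rs1))` with `rs1` as the base, matching the operand order of the FPOW row.

```python
import math

# Hypothetical 1-arg reference model; keys mirror the opcode mnemonics.
ONE_ARG = {
    "FRSQRT": lambda a: 1.0 / math.sqrt(a),         # assumed: reciprocal sqrt
    "FCBRT": lambda a: math.copysign(abs(a) ** (1.0 / 3), a),
    "FRECIP": lambda a: 1.0 / a,
    "FEXP2": lambda a: 2.0 ** a,
    "FLOG2": math.log2,                             # assumed: log base 2
    "FEXPM1": math.expm1,
    "FLOG1P": math.log1p,
    "FEXP": math.exp,
    "FLOG": math.log,
    "FEXP10": lambda a: 10.0 ** a,
    "FLOG10": math.log10,
    "FSIN": math.sin,
    "FCOS": math.cos,
    "FTAN": math.tan,
    "FSINPI": lambda a: math.sin(math.pi * a),
    "FCOSPI": lambda a: math.cos(math.pi * a),
    "FTANPI": lambda a: math.tan(math.pi * a),
    "FASINPI": lambda a: math.asin(a) / math.pi,
    "FACOSPI": lambda a: math.acos(a) / math.pi,
    "FATANPI": lambda a: math.atan(a) / math.pi,
}

# Hypothetical 2-arg reference model; a is rs1, b is rs2.
TWO_ARG = {
    "FATAN2": lambda a, b: math.atan2(b, a),        # table: atan2(rs2, rs1)
    "FATAN2PI": lambda a, b: math.atan2(b, a) / math.pi,
    "FPOW": lambda a, b: math.pow(a, b),
    "FPOWN": lambda a, n: math.pow(a, int(n)),      # n is an integer power
    "FPOWR": lambda a, b: math.exp(b * math.log(a)),  # assumed operand order
    "FROOTN": lambda a, n: math.pow(a, 1.0 / int(n)),
    "FHYPOT": lambda a, b: math.hypot(a, b),
}


def model(opcode, rs1, rs2=None):
    """Return rd for the given mnemonic, using FP64 host arithmetic."""
    if rs2 is None:
        return ONE_ARG[opcode](rs1)
    return TWO_ARG[opcode](rs1, rs2)
```

For example, `model("FHYPOT", 3.0, 4.0)` evaluates `sqrt(rs1^2 + rs2^2)` for a 3-4-5 triangle, and `model("FRSQRT", 4.0)` evaluates the reciprocal square root of 4. A model like this is only a convenience for cross-checking table entries; accuracy requirements (ULP bounds) come from the IEEE754 and Khronos specifications, not from host `math` behaviour.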