-# Ztrans - transcendental operations
+# Zftrans - transcendental operations
See:
* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
+* Discussion: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002342.html>
+* [[rv_major_opcode_1010011]] for opcode listing.
+* [[zfpacc_proposal]] for accuracy settings proposal
+
+Extension subsets:
+
+* **Zftrans**: standard transcendentals (best suited to 3D)
+* **ZftransExt**: extra functions (useful, not generally needed for 3D,
+ can be synthesised using Zftrans)
+* **Ztrigpi**: trig. xxx-pi: sinpi, cospi, tanpi
+* **Ztrignpi**: trig. non-xxx-pi: sin, cos, tan
+* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi, asinpi, acospi
+* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
+* **Zfhyp**: hyperbolic/inverse-hyperbolic. sinh, cosh, tanh, asinh,
+ acosh, atanh (can be synthesised - see below)
+* **ZftransAdv**: much more complex to implement in hardware
+* **Zfrsqrt**: Reciprocal square-root.
+
+Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,
+Zarctrignpi
[[!toc levels=2]]
+# TODO:
+
+* Decision on accuracy, moved to [[zfpacc_proposal]]
+<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002355.html>
+* Errors **MUST** be repeatable.
+* How about four Platform Specifications? 3DUNIX, UNIX, 3DEmbedded and Embedded?
+<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002361.html>
+ Accuracy requirements for dual (triple) purpose implementations must
+ meet the higher standard.
+* Reciprocal Square-root is in its own separate extension (Zfrsqrt) as
+ it is desirable on its own by other implementors. This is to be evaluated.
+
+
# List of 2-arg opcodes
[[!table data="""
opcode | Description | pseudo-code | Extension |
-FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Ztrans |
-FATAN2PI | atan arc tangent / pi | rd = atan2(rs2, rs1) / pi | |
-FPOW | power of | rd = pow(rs1, rs2) | Ztrans |
+FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
+FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
+FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
+FROOT | x power 1/y | rd = pow(rs1, 1/rs2) | ZftransAdv |
+FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | Zftrans |
+"""]]
+
+# List of 1-arg transcendental opcodes
+
+[[!table data="""
+opcode | Description | pseudo-code | Extension |
+FRSQRT | Reciprocal Square-root | rd = 1.0 / sqrt(rs1) | Zfrsqrt |
+FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | Zftrans |
+FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
+FLOG2 | log2 | rd = log2(rs1) | Zftrans |
+FEXPM1 | exponent minus 1 | rd = pow(e, rs1) - 1.0 | Zftrans |
+FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | Zftrans |
+FEXP | exponent | rd = pow(e, rs1) | ZftransExt |
+FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
+FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
+FLOG10 | log base 10 | rd = log10(rs1) | ZftransExt |
"""]]
-# List of 1-arg opcodes
+# List of 1-arg trigonometric opcodes
[[!table data="""
opcode | Description | pseudo-code | Extension |
-FCBRT | Cube Root | rd = pow(rs1, 3) | |
-FEXP2 | power-of-2 | rd = pow(2, rs1) | |
-FLOG2 | log2 | rd = log2(rs1) | |
-FEXPM1 | exponent minus 1 | rd = pow(e, rs1) - 1.0 | |
-FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | |
-FEXP | exponent | rd = pow(e, rs1) | |
-FLOG | natural log (base e) | rd = log(e, rs1) | |
-FEXP10 | power-of-10 | rd = pow(10, rs1) | |
-FLOG10 | log base 10 | rd = log10(rs1) | |
-FSIN | sin (radians) | | Ztrans |
-FCOS | cos (radians) | | Ztrans |
-FTAN | tan (radians) | | Ztrans |
-FSINPI | sin times pi | rd = sin(pi * rs1) | |
-FCOSPI | cos times pi | rd = cos(pi * rs1) | |
-FTANPI | tan times pi | rd = tan(pi * rs1) | |
-FSINH | hyperbolic sin (radians) | | |
-FCOSH | hyperbolic cos (radians) | | |
-FTANH | hyperbolic tan (radians) | | |
-FASINH | inverse hyperbolic sin | | |
-FACOSH | inverse hyperbolic cos | | |
-FATANH | inverse hyperbolic tan | | |
+FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
+FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
+FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
+FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
+FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
+FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
+FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
+FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
+FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
+FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
+FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
+FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
+FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
+FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
+FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
+FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
+FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
"""]]
-# Pseudo-code ops
+# Synthesis, Pseudo-code ops and macro-ops
+
+The pseudo-ops below are best synthesised by the compiler rather than
+defined as actual pseudo-ops: the compiler allocates one scalar FP
+register as a loop-invariant constant, set to "1.0" at the beginning of
+a function or other suitable code block.
* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
-* SINCOS - fused macro-op between FSIN and FCOS (issued in that order).
-* SINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
+* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0) (uses FATAN2)
+* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0) (uses FATAN2PI)
+* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
+* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
+
+FATANPI example pseudo-code:
+
+ lui t0, 0x3F800 // upper bits of f32 1.0
+ fmv.s.x ft0, t0 // move integer bit pattern into FP register
+ fatan2pi.s rd, rs1, ft0
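The sequence above can be checked with a small reference model. This is a minimal sketch, not a definition of the opcode: `lui_f32`, `fatan2pi` and `fatanpi` are hypothetical helper names, the model takes its arguments in C `atan2(y, x)` order, and the mapping of rs1/rs2 onto those arguments is left to the opcode table.

```python
import math
import struct


def lui_f32(imm20: int) -> float:
    """Model 'lui' followed by an integer-to-FP move: reinterpret
    (imm20 << 12) as an IEEE 754 binary32 bit pattern."""
    bits = (imm20 << 12) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fatan2pi(y: float, x: float) -> float:
    """Reference model (assumed argument order y, x): atan2(y, x) / pi."""
    return math.atan2(y, x) / math.pi


def fatanpi(x: float) -> float:
    """FATANPI synthesised from FATAN2PI and a constant 1.0."""
    one = lui_f32(0x3F800)  # 0x3F800000 is the f32 encoding of 1.0
    return fatan2pi(x, one)
```

Calling `fatanpi(1.0)` should give a value very close to 0.25, since atan(1) is pi/4.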
+
+Hyperbolic function example (obviates the need for Zfhyp except for
+high-performance implementations):
+
+ ASINH( x ) = ln( x + SQRT(x**2+1) )
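The ASINH identity can be sanity-checked numerically. The sketch below is illustrative only: `asinh_synth` is a hypothetical name, and `math.hypot(x, 1.0)` stands in for what FHYPOT would compute for sqrt(x**2 + 1).

```python
import math


def asinh_synth(x: float) -> float:
    """asinh(x) synthesised as ln(x + sqrt(x**2 + 1)).

    math.hypot(x, 1.0) plays the role of FHYPOT here. The direct formula
    loses accuracy for large negative x (cancellation) and for very
    small |x|, so this is a sketch rather than a production routine.
    """
    return math.log(x + math.hypot(x, 1.0))
```

For moderate positive inputs the result should agree with `math.asinh` to within a few ulps.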
+
+LOG / LOG1P and EXP / EXPM1 example:
+
+ LOG(x) = LOG1P(x - 1.0)
+ EXP(x) = EXPM1(x) + 1.0
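From the table definitions log1p(x) = log(1 + x) and expm1(x) = exp(x) - 1, it follows that log(x) = log1p(x - 1) and exp(x) = expm1(x) + 1. A minimal numerical check of those two relations, with hypothetical helper names, could look like this:

```python
import math


def log_via_log1p(x: float) -> float:
    """log(x) expressed through log1p and one subtraction."""
    return math.log1p(x - 1.0)


def exp_via_expm1(x: float) -> float:
    """exp(x) expressed through expm1 and one addition."""
    return math.expm1(x) + 1.0
```

For inputs of moderate size both helpers should match `math.log` and `math.exp` closely; the behaviour for very small x is a separate question.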
+
+# To evaluate: should LOG be replaced with LOG1P (and EXP with EXPM1)?
+
+RISC principle says "exclude LOG because it's covered by LOG1P plus an ADD".
+Research is needed to ensure that implementors are not compromised by such
+a decision:
+<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002358.html>
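One point such research might look at is precision for small arguments. The sketch below (double precision, hypothetical helper name) shows that forming `x - 1.0` before a log1p can discard all of x when x is far smaller than 1, whereas a direct log does not. It is an illustration of the concern, not a measurement of any hardware.

```python
import math


def log_from_log1p_naive(x: float) -> float:
    """log(x) computed as log1p(x - 1.0), as the 'LOG1P plus an ADD'
    argument suggests. Returns -inf when the subtraction rounds to -1.0."""
    arg = x - 1.0
    if arg <= -1.0:
        # log1p(-1.0) is a pole; Python's math.log1p would raise ValueError.
        return -math.inf
    return math.log1p(arg)


tiny = 1e-30
direct = math.log(tiny)                   # finite, roughly -69.08
synthesised = log_from_log1p_naive(tiny)  # -inf: tiny was lost in x - 1.0
```

Here `tiny - 1.0` rounds to exactly -1.0, so the synthesised result is minus infinity while the direct logarithm is finite.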