X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=ztrans_proposal.mdwn;h=7e05e85106c4e65b6eb34a27390dfcf71883d718;hb=c1aa578aee508dcb57b7c8453e1019bb999dd3fa;hp=15c6ea766d2ef1c6c9d699c9c8ed0675d61e24d6;hpb=539b741e656ac7b1816142fef502663377fa9141;p=libreriscv.git

diff --git a/ztrans_proposal.mdwn b/ztrans_proposal.mdwn
index 15c6ea766..7e05e8510 100644
--- a/ztrans_proposal.mdwn
+++ b/ztrans_proposal.mdwn
@@ -1,51 +1,127 @@
-# Ztrans - transcendental operations
+# Zftrans - transcendental operations
 
 See:
 
 * <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
 * <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
+* Discussion: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002342.html>
+* [[rv_major_opcode_1010011]] for opcode listing.
+* [[zfpacc_proposal]] for accuracy settings proposal
+
+Extension subsets:
+
+* **Zftrans**: standard transcendentals (best suited to 3D)
+* **ZftransExt**: extra functions (useful, not generally needed for 3D,
+  can be synthesised using Zftrans)
+* **Ztrigpi**: trig. xxx-pi: sinpi, cospi, tanpi
+* **Ztrignpi**: trig. non-xxx-pi: sin, cos, tan
+* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi, asinpi, acospi
+* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
+* **Zfhyp**: hyperbolic/inverse-hyperbolic.  sinh, cosh, tanh, asinh,
+  acosh, atanh (can be synthesised - see below)
+* **ZftransAdv**: much more complex to implement in hardware
+* **Zfrsqrt**: Reciprocal square-root.
+
+Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,
+Zarctrignpi
 
 [[!toc levels=2]]
 
+# TODO:
+
+* Decision on accuracy, moved to [[zfpacc_proposal]]
+<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002355.html>
+* Errors **MUST** be repeatable.
+* How about four Platform Specifications? 3DUNIX, UNIX, 3DEmbedded and Embedded?
+<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002361.html>
+  Accuracy requirements for dual (triple) purpose implementations must
+  meet the higher standard.
+* Reciprocal Square-root is in its own separate extension (Zfrsqrt) as
+  other implementors want it on its own.  This is to be evaluated.
+
+
 # List of 2-arg opcodes
 
 [[!table  data="""
 opcode    | Description           | pseudo-code                | Extension |
-FATAN2    | atan2 arc tangent     | rd = atan2(rs2, rs1)       | Ztrans    |
-FATAN2PI  | atan arc tangent / pi | rd = atan2(rs2, rs1) / pi  |           |
-FPOW      | power of              | rd = pow(rs1, rs2)         | Ztrans    |
+FATAN2    | atan2 arc tangent     | rd = atan2(rs2, rs1)       | Zarctrignpi |
+FATAN2PI  | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
+FPOW      | x power of y          | rd = pow(rs1, rs2)         | ZftransAdv |
+FROOT     | x power 1/y           | rd = pow(rs1, 1/rs2)       | ZftransAdv |
+FHYPOT    | hypotenuse            | rd = sqrt(rs1^2 + rs2^2)       | Zftrans    |
+"""]]
+
+# List of 1-arg transcendental opcodes
+
+[[!table  data="""
+opcode   | Description              | pseudo-code             | Extension |
+FRSQRT   | Reciprocal Square-root   | rd = 1.0 / sqrt(rs1)    | Zfrsqrt    |
+FCBRT    | Cube Root                | rd = pow(rs1, 1.0 / 3)  | Zftrans    |
+FEXP2    | power-of-2               | rd = pow(2, rs1)        | Zftrans    |
+FLOG2    | log2                     | rd = log2(rs1)          | Zftrans    |
+FEXPM1   | exponent minus 1         | rd = pow(e, rs1) - 1.0  | Zftrans    |
+FLOG1P   | log plus 1               | rd = log(e, 1 + rs1)    | Zftrans    |
+FEXP     | exponent                 | rd = pow(e, rs1)        | ZftransExt |
+FLOG     | natural log (base e)     | rd = log(e, rs1)        | ZftransExt |
+FEXP10   | power-of-10              | rd = pow(10, rs1)       | ZftransExt |
+FLOG10   | log base 10              | rd = log10(rs1)         | ZftransExt |
 """]]
 
-# List of 1-arg opcodes
+# List of 1-arg trigonometric opcodes
 
 [[!table  data="""
 opcode   | Description              | pseudo-code             | Extension |
-FCBRT    | Cube Root                | rd = pow(rs1, 3)        |           |
-FEXP2    | power-of-2               | rd = pow(2, rs1)        |           |
-FLOG2    | log2                     | rd = log2(rs1)          |           |
-FEXPM1   | exponent minus 1         | rd = pow(e, rs1) - 1.0  |           |
-FLOG1P   | log plus 1               | rd = log(e, 1 + rs1)    |           |
-FEXP     | exponent                 | rd = pow(e, rs1)        |           |
-FLOG     | natural log (base e)     | rd = log(e, rs1)        |           |
-FEXP10   | power-of-10              | rd = pow(10, rs1)       |           |
-FLOG10   | log base 10              | rd = log10(rs1)         |           |
-FSIN     | sin (radians)            |                         | Ztrans    |
-FCOS     | cos (radians)            |                         | Ztrans    |
-FTAN     | tan (radians)            |                         | Ztrans    |
-FSINPI   | sin times pi             | rd = sin(pi * rs1)      |           |
-FCOSPI   | cos times pi             | rd = cos(pi * rs1)      |           |
-FTANPI   | tan times pi             | rd = tan(pi * rs1)      |           |
-FSINH    | hyperbolic sin (radians) |                         |           |
-FCOSH    | hyperbolic cos (radians) |                         |           |
-FTANH    | hyperbolic tan (radians) |                         |           |
-FASINH   | inverse hyperbolic sin   |                         |           |
-FACOSH   | inverse hyperbolic cos   |                         |           |
-FATANH   | inverse hyperbolic tan   |                         |           |
+FSIN     | sin (radians)            | rd = sin(rs1)           | Ztrignpi    |
+FCOS     | cos (radians)            | rd = cos(rs1)           | Ztrignpi    |
+FTAN     | tan (radians)            | rd = tan(rs1)           | Ztrignpi    |
+FASIN    | arcsin (radians)         | rd = asin(rs1)          | Zarctrignpi |
+FACOS    | arccos (radians)         | rd = acos(rs1)          | Zarctrignpi |
+FSINPI   | sin times pi             | rd = sin(pi * rs1)      | Ztrigpi |
+FCOSPI   | cos times pi             | rd = cos(pi * rs1)      | Ztrigpi |
+FTANPI   | tan times pi             | rd = tan(pi * rs1)      | Ztrigpi |
+FASINPI  | arcsin / pi              | rd = asin(rs1) / pi     | Zarctrigpi |
+FACOSPI  | arccos / pi              | rd = acos(rs1) / pi     | Zarctrigpi |
+FATANPI  | arctan / pi              | rd = atan(rs1) / pi     | Zarctrigpi |
+FSINH    | hyperbolic sin (radians) | rd = sinh(rs1)          | Zfhyp |
+FCOSH    | hyperbolic cos (radians) | rd = cosh(rs1)          | Zfhyp |
+FTANH    | hyperbolic tan (radians) | rd = tanh(rs1)          | Zfhyp |
+FASINH   | inverse hyperbolic sin   | rd = asinh(rs1)         | Zfhyp |
+FACOSH   | inverse hyperbolic cos   | rd = acosh(rs1)         | Zfhyp |
+FATANH   | inverse hyperbolic tan   | rd = atanh(rs1)         | Zfhyp |
 """]]
 
-# Pseudo-code ops
+# Synthesis, Pseudo-code ops and macro-ops
+
+The pseudo-ops below are best left to the compiler rather than defined as
+actual pseudo-ops: the compiler allocates one scalar FP register as a
+(loop-invariant) constant, set to "1.0" at the beginning of a function or
+other suitable code block.
 
 * FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
-* SINCOS - fused macro-op between FSIN and FCOS (issued in that order).
-* SINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
+* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0) (uses FATAN2)
+* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0) (uses FATAN2PI)
+* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
+* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
+
+FATANPI example pseudo-code:
+
+    lui t0, 0x3F800 // upper bits of f32 1.0
+    fmv.w.x ft0, t0 // integer bits -> FP register (formerly fmv.s.x)
+    fatan2pi.s rd, rs1, ft0
+
+Hyperbolic example (obviates need for Zfhyp except for high-performance):
+
+    ASINH( x ) = ln( x + SQRT(x**2+1) )
+
+LOG / LOG1P example:
+
+    LOG(x) = LOG1P(x - 1.0)
+    EXP(x) = EXPM1(x) + 1.0
+
+# To evaluate: should LOG be replaced with LOG1P (and EXP with EXPM1)?
+
+RISC principle says "exclude LOG because it's covered by LOG1P plus an ADD".
+Research is needed to ensure that implementors are not compromised by such
+a decision.
+<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002358.html>
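
The synthesis identities this proposal relies on (LOG from LOG1P, EXP from EXPM1, ASINH from LOG and SQRT) can be sanity-checked numerically. The sketch below is only an illustration: the function names are hypothetical, and Python's `math` module stands in for the proposed FLOG1P, FEXPM1 and FLOG opcodes. It says nothing about the accuracy a hardware implementation would achieve.

```python
import math

# Hypothetical helper names; math.log1p / math.expm1 / math.log stand in
# for the proposed FLOG1P / FEXPM1 / FLOG opcodes (an assumption made
# purely for illustration).

def log_via_log1p(x):
    # LOG(x) = LOG1P(x - 1.0): one subtract feeding LOG1P
    return math.log1p(x - 1.0)

def exp_via_expm1(x):
    # EXP(x) = EXPM1(x) + 1.0: EXPM1 followed by one add
    return math.expm1(x) + 1.0

def asinh_via_log(x):
    # ASINH(x) = ln(x + sqrt(x**2 + 1)); only exercised for x >= 0 here,
    # because x + sqrt(...) cancels badly for large negative x
    return math.log(x + math.sqrt(x * x + 1.0))

for x in (0.5, 1.0, 2.0, 10.0):
    assert math.isclose(log_via_log1p(x), math.log(x), rel_tol=1e-12, abs_tol=1e-12)
    assert math.isclose(exp_via_expm1(x), math.exp(x), rel_tol=1e-12)
    assert math.isclose(asinh_via_log(x), math.asinh(x), rel_tol=1e-12)
```

The checks pass for these mid-range inputs in double precision. Whether the extra add or subtract costs accuracy near the edges of the input range is the kind of question the research item above would need to answer.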