# Zftrans - transcendental operations

See:

* <http://bugs.libre-riscv.org/show_bug.cgi?id=127>
* <https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html>
* Discussion: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002342.html>
* [[rv_major_opcode_1010011]] for opcode listing.

Extension subsets:

* **Zftrans**: standard transcendentals (best suited to 3D)
* **ZftransExt**: extra functions (useful, not generally needed for 3D, can be synthesised using Zftrans)
* **Ztrigpi**: trig. xxx-pi: sinpi, cospi, tanpi
* **Ztrignpi**: trig. non-xxx-pi: sin, cos, tan
* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi, asinpi, acospi
* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
* **Zfhyp**: hyperbolic/inverse-hyperbolic: sinh, cosh, tanh, asinh, acosh, atanh
* **ZftransAdv**: much more complex to implement in hardware

Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,
Zarctrignpi

[[!toc levels=2]]

# TODO:

* Decision on accuracy <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002355.html>

# List of 2-arg opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
FATAN2PI | atan2 arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
FROOT | x power 1/y | rd = pow(rs1, 1/rs2) | ZftransAdv |
FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | Zftrans |
"""]]

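The pseudo-code column above can be read as the following minimal Python reference sketch. The function names are hypothetical, the argument order is copied literally from the table, and `atan2` is assumed to follow the C library convention (first argument is y, second is x). It says nothing about the accuracy a hardware implementation must deliver.

```python
import math

def fatan2(rs1, rs2):
    # Table pseudo-code: rd = atan2(rs2, rs1)
    return math.atan2(rs2, rs1)

def fatan2pi(rs1, rs2):
    # Table pseudo-code: rd = atan2(rs2, rs1) / pi
    return math.atan2(rs2, rs1) / math.pi

def fpow(rs1, rs2):
    # Table pseudo-code: rd = pow(rs1, rs2)
    return math.pow(rs1, rs2)

def froot(rs1, rs2):
    # Table pseudo-code: rd = pow(rs1, 1/rs2)
    return math.pow(rs1, 1.0 / rs2)

def fhypot(rs1, rs2):
    # Table pseudo-code: rd = sqrt(rs1^2 + rs2^2).
    # math.hypot avoids overflow in the intermediate squares.
    return math.hypot(rs1, rs2)

def naive_hypot(a, b):
    # Literal evaluation of the formula, for comparison only:
    # the squares overflow to infinity for large inputs.
    return math.sqrt(a * a + b * b)
```

One reason a dedicated FHYPOT can be attractive: the literal formula overflows for large operands (for example `naive_hypot(1e200, 1e200)` is infinite), while the true result is still representable.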
# List of 1-arg transcendental opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FCBRT | Cube Root | rd = pow(rs1, 1.0 / 3) | Zftrans |
FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
FLOG2 | log2 | rd = log2(rs1) | Zftrans |
FEXPM1 | exponent minus 1 | rd = pow(e, rs1) - 1.0 | Zftrans |
FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | Zftrans |
FEXP | exponent | rd = pow(e, rs1) | ZftransExt |
FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
FLOG10 | log base 10 | rd = log10(rs1) | ZftransExt |
"""]]

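As an illustration of how the ZftransExt functions could be synthesised from the Zftrans ones, here is a minimal Python sketch that builds each of them from a base-2 primitive plus one multiply by a constant. The helper names and constants are hypothetical, `math.pow(2.0, x)` stands in for FEXP2, and `math.log2` stands in for FLOG2. The extra multiply introduces a rounding step, so this sketch does not settle the accuracy question listed under TODO.

```python
import math

# Constants a compiler would materialise once (names are illustrative).
LOG2_E = math.log2(math.e)    # log2(e)
LN_2 = math.log(2.0)          # ln(2)
LOG2_10 = math.log2(10.0)     # log2(10)
LOG10_2 = math.log10(2.0)     # log10(2)

def fexp2(x):
    # Stand-in for FEXP2: rd = pow(2, rs1)
    return math.pow(2.0, x)

def flog2(x):
    # Stand-in for FLOG2: rd = log2(rs1)
    return math.log2(x)

def exp_synth(x):
    # e**x == 2**(x * log2(e))
    return fexp2(x * LOG2_E)

def log_synth(x):
    # ln(x) == log2(x) * ln(2)
    return flog2(x) * LN_2

def exp10_synth(x):
    # 10**x == 2**(x * log2(10))
    return fexp2(x * LOG2_10)

def log10_synth(x):
    # log10(x) == log2(x) * log10(2)
    return flog2(x) * LOG10_2
```

Each synthesised function costs one transcendental op plus one FMUL, which is why these four sit in a separate, optional subset.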
# List of 1-arg trigonometric opcodes

[[!table data="""
opcode | Description | pseudo-code | Extension |
FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
FASINPI | arcsin / pi | rd = asin(rs1) / pi | Zarctrigpi |
FACOSPI | arccos / pi | rd = acos(rs1) / pi | Zarctrigpi |
FATANPI | arctan / pi | rd = atan(rs1) / pi | Zarctrigpi |
FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
"""]]

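To make the difference between the radians forms and the times-pi forms concrete, the following Python sketch gives a reference `sinpi` and `cospi` for finite inputs. The range reduction shown (subtract the nearest integer, then evaluate on [-0.5, 0.5]) is one common software approach and is an assumption here, not something this proposal mandates. The point is that the reduction is exact in floating point, whereas multiplying by a rounded pi first is not.

```python
import math

def sinpi(x):
    # sin(pi * x) for finite x. Split x = n + r with n the nearest
    # integer and r in [-0.5, 0.5]; the subtraction is exact.
    n = round(x)
    r = x - n
    s = math.sin(math.pi * r)
    # sin(pi * (n + r)) == (-1)**n * sin(pi * r)
    return -s if n % 2 else s

def cospi(x):
    # cos(pi * x) for finite x, using cos(pi*r) == sin(pi*(0.5 - |r|))
    # so that odd multiples of 0.5 give an exact zero.
    n = round(x)
    r = x - n
    c = math.sin(math.pi * (0.5 - abs(r)))
    return -c if n % 2 else c
```

With these definitions `sinpi(1.0)` is exactly zero, while `math.sin(math.pi * 1.0)` is a small nonzero value because `math.pi` is itself rounded.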
# Pseudo-code ops and macro-ops

The pseudo-ops are best left to the compiler rather than defined as actual
pseudo-ops: the compiler allocates one scalar FP register as a
(loop-invariant) constant, set to "1.0" at the beginning of a function or
other suitable code block.

* FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0) (uses FATAN2)
* FATANPI - pseudo-code alias for rd = atan2pi(rs1, 1.0) (uses FATAN2PI)
* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).

FATANPI example pseudo-code:

    lui t0, 0x3F800 // upper bits of f32 1.0
    fmv.s.x ft0, t0 // move the integer bit pattern into an FP register
    fatan2pi.s rd, rs1, ft0

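The two facts this example relies on can be checked with a short Python sketch: the value left in `t0` by `lui t0, 0x3F800` is the IEEE 754 single-precision bit pattern of 1.0, and a one-argument arctangent equals `atan2` with 1.0 as the second argument (C library convention, y first). The helper name `fatanpi` is hypothetical.

```python
import math
import struct

# lui places its 20-bit immediate in bits 31..12 and zeroes the rest.
bits = 0x3F800 << 12

# Reinterpret the 32-bit integer as an IEEE 754 single-precision float,
# which is what moving the raw bits into an FP register achieves.
one = struct.unpack("<f", struct.pack("<I", bits))[0]

def fatanpi(x):
    # atan(x) / pi expressed through the two-argument form.
    return math.atan2(x, one) / math.pi
```

Here `bits` is `0x3F800000` and `one` is `1.0`, so `fatanpi(x)` agrees with `math.atan(x) / math.pi`.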
Hyperbolic function example (obviates need for Zfhyp except for high-performance):

    ASINH( x ) = ln( x + SQRT(x**2+1))

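A minimal Python sketch of this synthesis, with the hypothetical name `asinh_synth`, using only a natural log and a square root:

```python
import math

def asinh_synth(x):
    # asinh(x) == ln(x + sqrt(x**2 + 1)), evaluated literally.
    # Illustrative only: for small |x|, or for large negative x,
    # the addition loses precision, so a production routine would
    # need extra care.
    return math.log(x + math.sqrt(x * x + 1.0))
```

For moderate arguments the result agrees closely with the library `math.asinh`. The precision caveat in the comment is one reason a dedicated Zfhyp implementation may still be preferred where accuracy or speed matters.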
# To evaluate: should LOG be replaced with LOG1P (and EXP with EXPM1)?

    LOG(x) = LOG1P(x - 1.0)
    EXP(x) = EXPM1(x) + 1.0

RISC principle says "exclude LOG because it's covered by LOG1P plus an ADD".
Research is needed to ensure that implementors are not compromised by such
a decision:
<http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002358.html>
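The substitution under evaluation can be sketched in Python directly from the definitions log1p(x) = ln(1 + x) and expm1(x) = e**x - 1. The helper names are hypothetical, and the sketch only checks the identities numerically at a few points; it is not a substitute for the accuracy research called for above.

```python
import math

def log_via_log1p(x):
    # ln(x) == log1p(x - 1), since log1p(y) == ln(1 + y)
    return math.log1p(x - 1.0)

def exp_via_expm1(x):
    # e**x == expm1(x) + 1, since expm1(y) == e**y - 1
    return math.expm1(x) + 1.0
```

Each replacement costs one extra FP add or subtract per call, which is the trade-off the RISC argument accepts.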