X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=ztrans_proposal.mdwn;h=7e05e85106c4e65b6eb34a27390dfcf71883d718;hb=c1aa578aee508dcb57b7c8453e1019bb999dd3fa;hp=15c6ea766d2ef1c6c9d699c9c8ed0675d61e24d6;hpb=539b741e656ac7b1816142fef502663377fa9141;p=libreriscv.git

diff --git a/ztrans_proposal.mdwn b/ztrans_proposal.mdwn
index 15c6ea766..7e05e8510 100644
--- a/ztrans_proposal.mdwn
+++ b/ztrans_proposal.mdwn
@@ -1,51 +1,127 @@
-# Ztrans - transcendental operations
+# Zftrans - transcendental operations
 
 See:
 
 *
 *
+* Discussion:
+* [[rv_major_opcode_1010011]] for opcode listing.
+* [[zfpacc_proposal]] for accuracy settings proposal
+
+Extension subsets:
+
+* **Zftrans**: standard transcendentals (best suited to 3D)
+* **ZftransExt**: extra functions (useful, not generally needed for 3D,
+ can be synthesised using Ztrans)
+* **Ztrigpi**: trig. xxx-pi sinpi cospi tanpi
+* **Ztrignpi**: trig non-xxx-pi sin cos tan
+* **Zarctrigpi**: arc-trig. a-xxx-pi: atan2pi asinpi acospi
+* **Zarctrignpi**: arc-trig. non-a-xxx-pi: atan2, asin, acos
+* **Zfhyp**: hyperbolic/inverse-hyperbolic. sinh, cosh, tanh, asinh,
+ acosh, atanh (can be synthesised - see below)
+* **ZftransAdv**: much more complex to implement in hardware
+* **Zfrsqrt**: Reciprocal square-root.
+
+Minimum recommended requirements for 3D: Zftrans, Ztrigpi, Zarctrigpi,
+Zarctrignpi
 
 [[!toc levels=2]]
 
+# TODO:
+
+* Decision on accuracy, moved to [[zfpacc_proposal]]
+
+* Errors **MUST** be repeatable.
+* How about four Platform Specifications? 3DUNIX, UNIX, 3DEmbedded and Embedded?
+
+ Accuracy requirements for dual (triple) purpose implementations must
+ meet the higher standard.
+* Reciprocal Square-root is in its own separate extension (Zfrsqrt) as
+ it is desirable on its own by other implementors. This to be evaluated.
+
+
 # List of 2-arg opcodes
 
 [[!table data="""
 opcode | Description | pseudo-code | Extension |
-FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Ztrans |
-FATAN2PI | atan arc tangent / pi | rd = atan2(rs2, rs1) / pi | |
-FPOW | power of | rd = pow(rs1, rs2) | Ztrans |
+FATAN2 | atan2 arc tangent | rd = atan2(rs2, rs1) | Zarctrignpi |
+FATAN2PI | atan arc tangent / pi | rd = atan2(rs2, rs1) / pi | Zarctrigpi |
+FPOW | x power of y | rd = pow(rs1, rs2) | ZftransAdv |
+FROOT | x power 1/y | rd = pow(rs1, 1/rs2) | ZftransAdv |
+FHYPOT | hypotenuse | rd = sqrt(rs1^2 + rs2^2) | Zftrans |
+"""]]
+
+# List of 1-arg transcendental opcodes
+
+[[!table data="""
+opcode | Description | pseudo-code | Extension |
+FRSQRT | Reciprocal Square-root | rd = sqrt(rs1) | Zfrsqrt |
+FCBRT | Cube Root | rd = pow(rs1, 3) | Zftrans |
+FEXP2 | power-of-2 | rd = pow(2, rs1) | Zftrans |
+FLOG2 | log2 | rd = log2(rs1) | Zftrans |
+FEXPM1 | exponent minus 1 | rd = pow(e, rs1) - 1.0 | Zftrans |
+FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | Zftrans |
+FEXP | exponent | rd = pow(e, rs1) | ZftransExt |
+FLOG | natural log (base e) | rd = log(e, rs1) | ZftransExt |
+FEXP10 | power-of-10 | rd = pow(10, rs1) | ZftransExt |
+FLOG10 | log base 10 | rd = log10(rs1) | ZftransExt |
 """]]
 
-# List of 1-arg opcodes
+# List of 1-arg trigonometric opcodes
 
 [[!table data="""
 opcode | Description | pseudo-code | Extension |
-FCBRT | Cube Root | rd = pow(rs1, 3) | |
-FEXP2 | power-of-2 | rd = pow(2, rs1) | |
-FLOG2 | log2 | rd = log2(rs1) | |
-FEXPM1 | exponent minus 1 | rd = pow(e, rs1) - 1.0 | |
-FLOG1P | log plus 1 | rd = log(e, 1 + rs1) | |
-FEXP | exponent | rd = pow(e, rs1) | |
-FLOG | natural log (base e) | rd = log(e, rs1) | |
-FEXP10 | power-of-10 | rd = pow(10, rs1) | |
-FLOG10 | log base 10 | rd = log10(rs1) | |
-FSIN | sin (radians) | | Ztrans |
-FCOS | cos (radians) | | Ztrans |
-FTAN | tan (radians) | | Ztrans |
-FSINPI | sin times pi | rd = sin(pi * rs1) | |
-FCOSPI | cos times pi | rd = cos(pi * rs1) | |
-FTANPI | tan times pi | rd = tan(pi * rs1) | |
-FSINH | hyperbolic sin (radians) | | |
-FCOSH | hyperbolic cos (radians) | | |
-FTANH | hyperbolic tan (radians) | | |
-FASINH | inverse hyperbolic sin | | |
-FACOSH | inverse hyperbolic cos | | |
-FATANH | inverse hyperbolic tan | | |
+FSIN | sin (radians) | rd = sin(rs1) | Ztrignpi |
+FCOS | cos (radians) | rd = cos(rs1) | Ztrignpi |
+FTAN | tan (radians) | rd = tan(rs1) | Ztrignpi |
+FASIN | arcsin (radians) | rd = asin(rs1) | Zarctrignpi |
+FACOS | arccos (radians) | rd = acos(rs1) | Zarctrignpi |
+FSINPI | sin times pi | rd = sin(pi * rs1) | Ztrigpi |
+FCOSPI | cos times pi | rd = cos(pi * rs1) | Ztrigpi |
+FTANPI | tan times pi | rd = tan(pi * rs1) | Ztrigpi |
+FASINPI | arcsin times pi | rd = asin(pi * rs1) | Zarctrigpi |
+FACOSPI | arccos times pi | rd = acos(pi * rs1) | Zarctrigpi |
+FATANPI | arctan times pi | rd = atan(pi * rs1) | Zarctrigpi |
+FSINH | hyperbolic sin (radians) | rd = sinh(rs1) | Zfhyp |
+FCOSH | hyperbolic cos (radians) | rd = cosh(rs1) | Zfhyp |
+FTANH | hyperbolic tan (radians) | rd = tanh(rs1) | Zfhyp |
+FASINH | inverse hyperbolic sin | rd = asinh(rs1) | Zfhyp |
+FACOSH | inverse hyperbolic cos | rd = acosh(rs1) | Zfhyp |
+FATANH | inverse hyperbolic tan | rd = atanh(rs1) | Zfhyp |
 """]]
 
-# Pseudo-code ops
+# Synthesis, Pseudo-code ops and macro-ops
+
+The pseudo-ops are best left up to the compiler rather than being actual
+pseudo-ops, by allocating one scalar FP register for use as a constant
+(loop invariant) set to "1.0" at the beginning of a function or other
+suitable code block.
 
 * FRCP rd, rs1 - pseudo-code alias for rd = 1.0 / rs1
-* SINCOS - fused macro-op between FSIN and FCOS (issued in that order).
-* SINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
+* FATAN - pseudo-code alias for rd = atan2(rs1, 1.0) - FATAN2
+* FATANPI - pseudo alias for rd = atan2pi(rs1, 1.0) - FATAN2PI
+* FSINCOS - fused macro-op between FSIN and FCOS (issued in that order).
+* FSINCOSPI - fused macro-op between FSINPI and FCOSPI (issued in that order).
+
+FATANPI example pseudo-code:
+
+    lui t0, 0x3F800 // upper bits of f32 1.0
+    fmv.x.s ft0, t0
+    fatan2pi.s rd, rs1, ft0
+
+Hypotenuse example (obviates need for Zfhyp except for high-performance):
+
+    ASINH( x ) = ln( x + SQRT(x**2+1)
+
+LOG / LOGP1 example:
+
+    LOG(x) = LOGP1(x) + 1.0
+    EXP(x) = EXPM1(x-1.0)
+
+# To evaluate: should LOG be replaced with LOG1P (and EXP with EXPM1)?
+
+RISC principle says "exclude LOG because it's covered by LOGP1 plus an ADD".
+Research needed to ensure that implementors are not compromised by such
+a decision
+
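
Note on the synthesis identities: the relations this patch relies on can be checked numerically. The sketch below is not part of the commit. It assumes FLOG1P and FEXPM1 carry the usual libm meanings, which is what the table's pseudo-code (`log(e, 1 + rs1)` and `pow(e, rs1) - 1.0`) suggests, and every function name in it is hypothetical. Under that assumption the relations come out as LOG(x) = LOG1P(x - 1.0) and EXP(x) = EXPM1(x) + 1.0, and the ASINH formula needs its closing parenthesis: ln(x + SQRT(x**2 + 1)).

```python
import math
import struct

# Hypothetical helper names; each mirrors one synthesis rule from the page.

def log_via_log1p(x):
    # log1p(y) = ln(1 + y), so ln(x) = log1p(x - 1)
    return math.log1p(x - 1.0)

def exp_via_expm1(x):
    # expm1(y) = e**y - 1, so e**x = expm1(x) + 1
    return math.expm1(x) + 1.0

def asinh_via_log(x):
    # asinh(x) = ln(x + sqrt(x**2 + 1))
    return math.log(x + math.sqrt(x * x + 1.0))

def atan_via_atan2(x):
    # FATAN alias: atan2 with the second argument held at the constant 1.0
    return math.atan2(x, 1.0)

def atanpi_via_atan2(x):
    # FATANPI alias: the same, with the result scaled by 1/pi
    return math.atan2(x, 1.0) / math.pi

# Bit pattern of IEEE 754 binary32 1.0, as loaded by "lui t0, 0x3F800"
# (lui places its 20-bit immediate in bits 31..12 and zeroes the rest).
ONE_F32_BITS = struct.unpack("<I", struct.pack("<f", 1.0))[0]

if __name__ == "__main__":
    for x in (0.25, 1.0, 2.5, 10.0):
        print(x, log_via_log1p(x), exp_via_expm1(x), asinh_via_log(x))
    print(hex(ONE_F32_BITS))
```

The last constant confirms that `lui t0, 0x3F800` yields the bit pattern of f32 1.0. Separately, in the ratified F extension the integer-to-float register move is spelled `fmv.w.x` (older assemblers: `fmv.s.x`); `fmv.x.s` names the opposite direction, so the mnemonic in the FATANPI example is worth double-checking.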