+
# FP Accuracy proposal
+Credits:
+
+* Bruce Hoult
+* Allen Baum
+* Dan Petroski
+* Jacob Lifshay
+
TODO: complete writeup
* <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002400.html>
* <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002412.html>
-Zfpacc: a proposal to allow implementations to dynamically set the bit-accuracy
-of results, trading speed (reduced latency) for accuracy (higher latency).
-IEE754 format is preserved: only ULP (Unit in Last Place) is permitted to be non-zero.
+Zfpacc: a proposal to allow implementations to dynamically set the
+bit-accuracy of floating-point results, trading speed (reduced latency)
+*at runtime* for accuracy (higher latency). IEEE754 format is preserved:
+instruction operand and result format requirements are unmodified by
+this proposal. Only ULP (Unit in Last Place) of the instruction *result*
+is permitted to meet alternative accuracy requirements, whilst still
+retaining the instruction's requested format.
+
+This proposal is *only* suitable for adding pre-existing accuracy standards
+where it is clearly established, well in advance of applications being
+written that conform to that standard, that dealing with variations in
+accuracy across hardware implementations is the responsibility of the
+application writer. This is the case for both Vulkan and OpenCL.
+
+This proposal is *not* suitable for inclusion of "de-facto" (proprietary)
+accuracy standards (historic IBM Mainframe vs Amdahl incompatibility)
+where there was no prior agreement or notification to applications
+writers that variations in accuracy across hardware implementations
+would occur. In the unlikely event that they *are* ever to be included
+in the future, rather than as a Custom Extension, then, unlike Vulkan
+and OpenCL, they must **only** be added as "bit-for-bit compatible".
# Extension of FCSR
| 0b000 | IEEE754 | correctly rounded |
| 0b010 | ULP<1 | Unit Last Place < 1 |
| 0b100 | Vulkan | Vulkan compliant |
-| 0b110 | Appx | Machine Learning |
+| 0b110 | Appx | Machine Learning |
+
+(TODO: review alternative idea: ULP0.5, ULP1, ULP2, ULP4, ULP16)
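
A minimal sketch of how software might read and write the facc field, using the encodings from the table above. The position of the field within FCSR is not stated in this section, so `FACC_SHIFT` is a hypothetical placeholder, and the names `Facc`, `get_facc` and `set_facc` are illustrative only:

```python
from enum import IntEnum

# ASSUMPTION: the bit offset of facc inside FCSR is not given in this
# section; 8 is a placeholder chosen purely for illustration.
FACC_SHIFT = 8
FACC_MASK = 0b111


class Facc(IntEnum):
    """facc encodings as listed in the table above."""
    IEEE754 = 0b000   # correctly rounded
    ULP_LT_1 = 0b010  # Unit in Last Place < 1
    VULKAN = 0b100    # Vulkan compliant
    APPX = 0b110      # Machine Learning


def get_facc(fcsr: int) -> Facc:
    """Extract the facc mode; encodings not in the table raise ValueError."""
    return Facc((fcsr >> FACC_SHIFT) & FACC_MASK)


def set_facc(fcsr: int, mode: Facc) -> int:
    """Return fcsr with only the facc field replaced by `mode`."""
    cleared = fcsr & ~(FACC_MASK << FACC_SHIFT)
    return cleared | (int(mode) << FACC_SHIFT)
```

With this layout, an FCSR value of zero decodes as `Facc.IEEE754`.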
-Note that the format of the operands and result remain the same for all opcodes. The only change is in the *accuracy* of the result, not its format.
+Notes:
+
+* facc=0 matches current RISC-V behaviour, where these bits were formerly reserved and set to zero.
+* The format of the operands and result remains the same for
+all opcodes. The only change is in the *accuracy* of the result, not
+its format.
+* facc sets the *minimum* accuracy. It is acceptable to provide *more* accurate results than is requested by a given facc mode (although, clearly, the opportunity for reduced power and latency would be missed).
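
To illustrate the "minimum accuracy" rule, here is a rough sketch of measuring a double-precision result's error in ULPs against an exact reference. It is an assumption-laden illustration, not part of the proposal: the ULP size is taken at the computed result rather than at the exact value, the bound is treated as inclusive, and the helper names are hypothetical. It needs Python 3.9 or later for `math.ulp`.

```python
import math
from fractions import Fraction


def ulp_error(result: float, exact: Fraction) -> Fraction:
    """Distance between `result` and `exact`, in ULPs of `result`.

    ASSUMPTION: the ULP is measured at `result`; a formal definition
    may measure it at the exact value instead.
    """
    return abs(Fraction(result) - exact) / Fraction(math.ulp(result))


def meets_minimum(result: float, exact: Fraction, max_ulp) -> bool:
    """True if `result` is at least as accurate as `max_ulp` requires."""
    return ulp_error(result, exact) <= max_ulp


exact = Fraction(1, 3)
rounded = 1 / 3                            # correctly rounded double
neighbour = math.nextafter(rounded, 1.0)   # next double above it

# A correctly rounded result also satisfies a looser bound, which is
# why returning more accuracy than requested is always acceptable.
print(meets_minimum(rounded, exact, 1))
print(meets_minimum(neighbour, exact, Fraction(1, 2)))
```

The first check passes and the second fails: the neighbouring double is within 1 ULP of one third but not within 0.5 ULP.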
+
+## Discussion
maybe a solution would be to add an extra field to the fp control csr
to allow selecting one of several accurate or fast modes:
- fully-accurate-mode: correctly rounded in all cases
- maybe more modes?
+extra mode suggestions:
+
+ it might be reasonable to add a mode saying you're prepared to accept
+ worse than 0.5 ULP accuracy, perhaps with a few options: 1, 2, 4,
+ 16 or something like that.
+
Question: should better accuracy than is requested be permitted? Example:
Amdahl-370 issues.
There are also 8 and 9-bit floating point formats that could be useful
<https://en.wikipedia.org/wiki/Minifloat>
+
+### Function accuracy in standards (OpenCL, Vulkan)
+
+[[resources]] for OpenCL and Vulkan
+
+Vulkan requires full IEEE754 precision for all F/D instructions except for fdiv and fsqrt.
+
+<https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/chap40.html#spirvenv-precision-operation>
+
+Source is here:
+<https://github.com/KhronosGroup/Vulkan-Docs/blob/master/appendices/spirvenv.txt#L1172>
+
+OpenCL is slightly different; suggest adding it as an extra entry.
+
+<https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_Env.html#relative-error-as-ulps>
+
+The search link below finds version 2.1 of the OpenCL environment specification, table 8.4.1. It needs checking whether that table is the same as the one linked above, which has "SPIRV" in it and is version 2.2, not 2.1.
+
+<https://www.google.com/search?q=opencl+environment+specification>
+
+2.1 is superseded by 2.2:
+<https://github.com/KhronosGroup/OpenCL-Docs/blob/master/env/numerical_compliance.asciidoc>
+
+### Compliance
+
+Dan Petroski:
+
+ It’s a bit more complicated than that. Different FP
+ representations/algorithms have different quantization ranges, so you
+ can get more or less precise depending on how large the arguments are.
+
+ For instance, machine A can compute within ULP3 from 0 to 10000, but
+ ULP2 from 10000 upwards. Machine B can compute within ULP2 from 0 to
+ 6000, then ULP3 for 6000+. How do you design a compliance suite which
+ guarantees behavior across all fpaccs?
+
+and from Allen Baum:
+
+ In the example above, you'd need a ratified spec with the defined
+    ranges (possibly per range and per op) - and then implementations
+ would need to at least meet that spec (but could be more accurate)
+
+ so - not impossible, but a lot more work to write different kinds
+ of tests than standard IEEE compatible test would have.
+
+ And, by the way, if you want it to be a ratified spec, it needs a
+ compliance suite, and whoever has defined the spec is responsible
+    for writing it.
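
The per-range idea discussed above can be sketched as a table of `(low, high, max ULP)` entries per operation, with an implementation judged against a ratified profile. This is only an illustration: the machine profiles reuse the numbers from the example in the quote, while the half-open range boundaries, the all-ULP3 profile and the function names are assumptions. Checking at sample points is not a proof of compliance.

```python
import math

# Per-range accuracy bounds for one operation: (low, high, max ULP),
# with low inclusive and high exclusive (an assumption of this sketch).
MACHINE_A = [(0.0, 10000.0, 3), (10000.0, math.inf, 2)]
MACHINE_B = [(0.0, 6000.0, 2), (6000.0, math.inf, 3)]

# Hypothetical ratified profile loose enough to admit both machines.
RATIFIED = [(0.0, math.inf, 3)]


def bound_for(profile, x):
    """Return the max ULP error the profile allows for argument x."""
    for low, high, max_ulp in profile:
        if low <= x < high:
            return max_ulp
    raise ValueError(f"no range covers {x}")


def meets_spec(impl, spec, points):
    """True if impl is at least as accurate as spec at every sample point."""
    return all(bound_for(impl, x) <= bound_for(spec, x) for x in points)


points = [1.0, 5000.0, 8000.0, 20000.0]
print(meets_spec(MACHINE_A, RATIFIED, points))
print(meets_spec(MACHINE_B, RATIFIED, points))
print(meets_spec(MACHINE_B, MACHINE_A, points))
```

Neither machine satisfies the other's profile, which is why a ratified set of ranges that implementations may exceed, but not fall short of, would be needed.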
+