X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=zfpacc_proposal.mdwn;h=8b22582a5d984dacb72bb766a487a30045959e86;hb=57398591b40ccf7cf3d9995b2c199f541f263df7;hp=f7e4fa559ed40fe57de6bf8149098d94b9f546ff;hpb=7613c09b6f9c6debe7353e3db749688f223fdb59;p=libreriscv.git

diff --git a/zfpacc_proposal.mdwn b/zfpacc_proposal.mdwn
index f7e4fa559..8b22582a5 100644
--- a/zfpacc_proposal.mdwn
+++ b/zfpacc_proposal.mdwn
@@ -1,5 +1,13 @@
+
 # FP Accuracy proposal
 
+Credits:
+
+* Bruce Hoult
+* Allen Baum
+* Dan Petroski
+* Jacob Lifshay
+
 TODO: complete writeup
 
 *
@@ -13,6 +21,20 @@ this proposal. Only ULP (Unit in Last Place) of the instruction *result*
 is permitted to meet alternative accuracy requirements, whilst still
 retaining the instruction's requested format.
 
+This proposal is *only* suitable for adding pre-existing accuracy standards
+where it is clearly established, well in advance of applications being
+written that conform to that standard, that dealing with variations in
+accuracy across hardware implementations is the responsibility of the
+application writer. This is the case for both Vulkan and OpenCL.
+
+This proposal is *not* suitable for inclusion of "de-facto" (proprietary)
+accuracy standards (historic IBM Mainframe vs Amdahl incompatibility)
+where there was no prior agreement or notification to application
+writers that variations in accuracy across hardware implementations
+would occur. In the unlikely event that they *are* ever to be included
+in the future, rather than as a Custom Extension, then, unlike Vulkan
+and OpenCL, they must **only** be added as "bit-for-bit compatible".
+
 # Extension of FCSR
 
 Zfpacc would use some of the the reserved bits of FCSR.
It would be treated
@@ -86,32 +108,20 @@ The values for the field facc to include the following:
 
 | facc  | mode    | description         |
 | ----- | ------- | ------------------- |
-| 0b00H | IEEE754 | correctly rounded   |
-| 0b01H | ULP<1   | Unit Last Place < 1 |
-| 0b10H | Vulkan  | Vulkan compliant    |
-| 0b11H | Appx    | Machine Learning    |
-
-When bit 0 (H) of facc is set to zero, half-precision mode is
-disabled. When set, an automatic down conversion (FCVT) to half the
-instruction bitwidth (FP32 opcode would convert to FP16) on operands
-is performed, followed by the operation occuring at half precision,
-followed by automatic up conversion back to the instruction's bitwidth.
-
-Note that the format of the operands and result remain the same for
-all opcodes. The only change is in the *accuracy* of the result, not
-its format.
+| 0b000 | IEEE754 | correctly rounded   |
+| 0b010 | ULP<1   | Unit Last Place < 1 |
+| 0b100 | Vulkan  | Vulkan compliant    |
+| 0b110 | Appx    | Machine Learning    |
+
+(TODO: review alternative idea: ULP0.5, ULP1, ULP2, ULP4, ULP16)
 
-Pseudocode for half accuracy mode:
+Notes:
 
-    def fpadd32(op1, op2):
-        if FCSR.facc.halfmode:
-            op1 = fcvt32to16(op1)
-            op2 = fcvt32to16(op2)
-            result = fpadd32(op1, op2)
-            return fcvt16to32(result)
-        else:
-            # TODO, reduced accuracy if requested
-            return op1 + op2
+* facc=0 matches current RISC-V behaviour, where these bits were formerly reserved and set to zero.
+* The format of the operands and result remains the same for
+all opcodes. The only change is in the *accuracy* of the result, not
+its format.
+* facc sets the *minimum* accuracy. It is acceptable to provide *more* accurate results than is requested by a given facc mode (although, clearly, the opportunity for reduced power and latency would be missed).
 
 ## Discussion
 
@@ -127,6 +137,12 @@ to allow selecting one of several accurate or fast modes:
 - fully-accurate-mode: correctly rounded in all cases
 - maybe more modes?
 
+extra mode suggestions:
+
+    it might be reasonable to add a mode saying you're prepared to accept
+    worse than 0.5 ULP accuracy, perhaps with a few options: 1, 2, 4,
+    16 or something like that.
+
 Question: should better accuracy than is requested be permitted?
 Example: Ahmdahl-370 issues.
 
@@ -149,3 +165,52 @@ Comments:
 
 There are also 8 and 9-bit floating point formats that could be useful
 
+
+### Function accuracy in standards (OpenCL, Vulkan)
+
+[[resources]] for OpenCL and Vulkan
+
+Vulkan requires full IEEE754 precision for all F/D instructions except for fdiv and fsqrt.
+
+
+
+Source is here:
+
+
+OpenCL is slightly different; suggest adding it as an extra entry.
+
+
+
+Link: finds version 2.1 of the OpenCL environment specification, table 8.4.1. However, this needs checking to see if it is the same as the above, which has "SPIRV" in it and is 2.2, not 2.1.
+
+https://www.google.com/search?q=opencl+environment+specification
+
+2.1 is superseded by 2.2
+
+
+### Compliance
+
+Dan Petroski:
+
+    It’s a bit more complicated than that. Different FP
+    representations/algorithms have different quantization ranges, so you
+    can get more or less precise depending on how large the arguments are.
+
+    For instance, machine A can compute within ULP3 from 0 to 10000, but
+    ULP2 from 10000 upwards. Machine B can compute within ULP2 from 0 to
+    6000, then ULP3 for 6000+. How do you design a compliance suite which
+    guarantees behavior across all fpaccs?
+
+and from Allen Baum:
+
+    In the example above, you'd need a ratified spec with the defined
+    ranges (possibly per range and per op) - and then implementations
+    would need to at least meet that spec (but could be more accurate)
+
+    so - not impossible, but a lot more work to write different kinds
+    of tests than standard IEEE compatible test would have.
+
+    And, by the way, if you want it to be a ratified spec, it needs a
+    compliance suite, and whoever has defined the spec is responsible
+    for writing it.
+
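The per-range requirement raised in the Compliance comments (a spec that defines ranges, each with its own ULP bound, which implementations must at least meet) can be illustrated with a small checker. This is only a sketch under stated assumptions: the `SPEC` table reuses the "machine A" figures from the example purely as placeholder data, the names `ulp_error`, `max_ulp_for`, `complies`, `exact_third`, `rounded_third` and `sloppy_third` are hypothetical, and it measures errors on Python doubles (using `math.ulp`, Python 3.9+) rather than on any RISC-V FP format.

```python
import math
from fractions import Fraction

# Hypothetical accuracy spec: (low, high, max ULP error) per argument range.
# Figures are illustrative only, borrowed from the "machine A" example.
SPEC = [
    (0.0, 10000.0, 3),
    (10000.0, math.inf, 2),
]

def ulp_error(result, exact):
    """Return |result - exact| in ULPs of the double nearest to `exact`."""
    ulp = Fraction(math.ulp(float(exact)))
    return abs(Fraction(result) - exact) / ulp

def max_ulp_for(x, spec=SPEC):
    """Look up the permitted ULP error for argument x."""
    for low, high, bound in spec:
        if low <= x < high:
            return bound
    raise ValueError(f"no accuracy bound defined for {x}")

def complies(impl, exact_fn, samples, spec=SPEC):
    """True if impl stays within the bound at every sample.

    Results that are more accurate than required also pass.
    """
    return all(
        ulp_error(impl(x), exact_fn(x)) <= max_ulp_for(x, spec)
        for x in samples
    )

# Hypothetical operation under test: division by three.
def exact_third(x):
    return Fraction(x) / 3          # exact rational reference

def rounded_third(x):
    return x / 3.0                  # correctly rounded double division

def sloppy_third(x):
    return (x / 3.0) * (1 + 2**-48) # deliberately off by well over 3 ULP

samples = [1.0, 7.0, 9999.0, 10000.0, 123456.0]
print(complies(rounded_third, exact_third, samples))  # True
print(complies(sloppy_third, exact_third, samples))   # False
```

A real compliance suite would need far denser sampling, per-operation tables, and handling of subnormals, infinities and NaNs; the sketch only shows the shape of a range-indexed bound check.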