From: Luke Kenneth Casson Leighton
Date: Fri, 9 Aug 2019 04:38:28 +0000 (+0100)
Subject: add cf to zfaccuracy proposal
X-Git-Tag: convert-csv-opcode-to-binary~4250
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=c605bd3e2ea172a47de2bec63160771f2361ff25;p=libreriscv.git

add cf to zfaccuracy proposal
---

diff --git a/zfpacc_proposal.mdwn b/zfpacc_proposal.mdwn
new file mode 100644
index 000000000..23feec9c1
--- /dev/null
+++ b/zfpacc_proposal.mdwn
@@ -0,0 +1,61 @@
+# FP Accuracy proposal
+
+TODO: writeup
+
+
+ A natural place for a standard reduced-accuracy extension "Zfpacc"
+ would be in the reserved bits of FCSR. It could be treated very
+ similarly to how dynamic frm is treated now. Currently, there are 5
+ bits of fflags, 3 bits of frm and 24 reserved bits. The L (decimal
+ floating-point) extension will presumably use some, but not all of
+ them. I'm unable to find any public proposals for L bit encodings
+ in FCSR.
+
+ For reference, frm is treated as follows: Floating-point operations
+ use either a static rounding mode encoded in the instruction, or
+ a dynamic rounding mode held in frm. Rounding modes are encoded
+ as shown in Table 11.1. A value of 111 in the instruction’s rm
+ field selects the dynamic rounding mode held in frm. If frm is set
+ to an invalid value (101–111), any subsequent attempt to execute
+ a floating-point operation with a dynamic rounding mode will raise
+ an illegal instruction exception.
+
+ Let's say that we wish to support up to 4 accuracy modes -- 2 'fam'
+ bits. Default would be IEEE-compliant, encoded as 00. This means
+ that all current hardware would be compliant with the default mode.
+
+ The unsupported modes would cause a trap to allow emulation where
+ traps are supported. Emulation of unsupported modes would be required
+ for Unix platforms.
+
+ As with frm, an implementation can choose to support any permutation
+ of dynamic fam-instruction pairs. It will raise an illegal-instruction
+ trap upon executing an unsupported fam-instruction pair.
+ The implementation can then emulate the accuracy mode required.
+
+ There would be a mechanism for user-mode code to detect which modes
+ are emulated (CSR? syscall?) (if the supervisor decides to make the
+ emulation visible) that would allow user code to switch to faster
+ software implementations if it chooses to.
+
+ If the bits are in FCSR, then the switch itself would be exposed
+ to user mode. User mode would not be able to detect emulation vs
+ hardware-supported instructions, however (by design). That would
+ require some platform-specific code.
+
+ Now, which accuracy modes should be included is a question outside
+ of my expertise and would require a literature review of instruction
+ frequency in key workloads, PPA analysis of simple and advanced
+ implementations, etc. (Thanks for the insights, Mitch!)
+
+ Emulation of unsupported modes would be required for Unix platforms.
+
+ I don't see why Unix should be required to emulate some arbitrary
+ reduced-accuracy ML mode. My guess would be that the Unix Platform Spec
+ requires support for IEEE, whereas an arbitrary ML platform requires
+ support for Mode XYZ. Of course, implementations of either platform
+ would be free to support any/all modes that they find valuable.
+ Compiling for a specific platform means that support for required
+ accuracy modes is guaranteed (and therefore does not need discovery
+ sequences), while allowing portable code to execute discovery
+ sequences to detect support for alternative accuracy modes.
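
Note on the proposal text in this patch: the FCSR layout and the trap-on-unsupported-pair behaviour it describes can be modelled in a few lines. The following is a minimal Python sketch, not part of the commit and not a definitive encoding. The fflags and frm positions follow the 5-bit/3-bit layout quoted in the text. Placing the 2-bit fam field at bits 9:8 is purely an assumption, since the proposal does not say which reserved bits would be used. The names `decode_fcsr`, `set_fam`, `execute_fp` and `IllegalInstructionTrap` are hypothetical helpers for illustration.

```python
# Sketch only. fflags/frm positions follow the layout quoted in the proposal
# (5 bits of fflags, then 3 bits of frm). FAM_SHIFT is an ASSUMPTION: the
# proposal does not fix which of the 24 reserved bits would hold fam.
FFLAGS_MASK = 0x1F               # FCSR bits 4:0
FRM_SHIFT, FRM_MASK = 5, 0x7     # FCSR bits 7:5
FAM_SHIFT, FAM_MASK = 8, 0x3     # ASSUMED: FCSR bits 9:8, 2 'fam' bits

FAM_IEEE = 0b00                  # default mode in the proposal: IEEE-compliant


class IllegalInstructionTrap(Exception):
    """Stands in for the illegal-instruction trap (hypothetical model)."""


def decode_fcsr(fcsr):
    """Split an FCSR value into its fflags, frm and (assumed) fam fields."""
    return {
        "fflags": fcsr & FFLAGS_MASK,
        "frm": (fcsr >> FRM_SHIFT) & FRM_MASK,
        "fam": (fcsr >> FAM_SHIFT) & FAM_MASK,
    }


def set_fam(fcsr, fam):
    """Return fcsr with the fam field replaced, leaving other bits untouched."""
    if not 0 <= fam <= FAM_MASK:
        raise ValueError("fam must fit in 2 bits")
    return (fcsr & ~(FAM_MASK << FAM_SHIFT)) | (fam << FAM_SHIFT)


def execute_fp(fcsr, op, hw_pairs, emulated_pairs=frozenset()):
    """Classify one FP operation under the current dynamic fam.

    hw_pairs: set of (fam, op) pairs the hardware implements directly.
    emulated_pairs: pairs a trap handler would emulate after the trap.
    """
    pair = (decode_fcsr(fcsr)["fam"], op)
    if pair in hw_pairs:
        return "hardware"
    # Hardware would trap here; the handler may then emulate the mode.
    if pair in emulated_pairs:
        return "emulated"
    raise IllegalInstructionTrap(f"unsupported fam-instruction pair {pair}")


if __name__ == "__main__":
    fcsr = set_fam(0x23, 0b10)   # fflags=3, frm=1, then select fam=2
    print(decode_fcsr(fcsr))     # {'fflags': 3, 'frm': 1, 'fam': 2}
```

Modelling support as a set of (fam, op) pairs mirrors the "any permutation of dynamic fam-instruction pairs" wording. A real implementation would make this decision in decode logic and a trap handler rather than in a lookup.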