From: lkcl
Date: Mon, 10 Apr 2023 14:56:38 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls012_v1~25
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=51bd873b545cd30b4ed5dcf0686b3e3cfd6db0cc;p=libreriscv.git

---
diff --git a/openpower/sv/rfc/ls012.mdwn b/openpower/sv/rfc/ls012.mdwn
index c24eaff7e..1017b5733 100644
--- a/openpower/sv/rfc/ls012.mdwn
+++ b/openpower/sv/rfc/ls012.mdwn
@@ -487,10 +487,32 @@ terms is just asking for trouble.
 
 **How long will it take to complete?**
 
-In the case of divide or Transcendentals they are so complex that simple
+In the case of divide or Transcendentals the algorithms needed are so complex that simple
 implementations can often take an astounding 128 clock cycles to complete.
-Other instructions waiting for the results back up and eventually stall,
-where in-order systems just stall straight away.
+Other instructions waiting for the results will back up and eventually stall,
+where in-order systems pretty much just stall straight away.
+
+Less extreme examples include instructions that take only a few cycles to complete,
+but if used in tight loops with Conditional Branches, an Out-of-Order system with
+Speculative capability may need significantly more Reservation Stations to hold
+in-flight data for instructions which take longer than those which do not.
+
+**Can one instruction do the job of many?**
+
+Large numbers of disparate instructions adversely affect resource utilisation in
+In-Order systems. However it is not always that simple: every one of the Power
+ISA "add" and "subtract" instructions, as shown by the Microwatt source code, may
+be micro-coded as one single instruction where RA may optionally be inverted,
+output likewise, and Carry-In set to 1, 0 or XER.CA. From these options the
+*entire* suite of add/subtract may be synthesised (subtract: inverting RA and
+adding an extra 1 produces the 2s-complement of RA).
+
+`bmask` for example is to be proposed as a single instruction with a 5-bit "Mode"
+operand, greatly simplifying some micro-architectural implementations. Likewise
+the FP-INT conversion instructions are grouped as a set of four, instead of
+over 30 separate instructions. Aside from anything else, this strategy makes
+the ISA Working Group's evaluation task easier, as well as reducing the work
+of writing a Compliance Test Suite.
 
 **Summary**
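
The add/subtract unification described in the added text can be illustrated with a small Python sketch. This is not Microwatt's code (which is VHDL) and not part of the commit. The function names, the 64-bit register width, and the keyword arguments are all assumptions chosen for illustration. It shows only the idea that one adder pass, with an optional inversion of RA, an optional inversion of the output, and a selectable Carry-In, is enough to express several add and subtract variants.

```python
MASK = (1 << 64) - 1  # assumption: 64-bit registers


def unified_add(ra, rb, invert_a=False, carry_in=0, invert_out=False):
    """One adder pass: (RA or ~RA) + RB + carry_in.

    Hypothetical helper, returns (result, carry_out).
    """
    a = (~ra & MASK) if invert_a else (ra & MASK)
    total = a + (rb & MASK) + carry_in
    result = total & MASK
    if invert_out:
        result = ~result & MASK
    return result, total >> 64


def add(ra, rb):
    # Plain add: no inversion, Carry-In = 0
    return unified_add(ra, rb)[0]


def subf(ra, rb):
    # Subtract-from (RB - RA): invert RA, Carry-In = 1,
    # i.e. add the 2s-complement of RA
    return unified_add(ra, rb, invert_a=True, carry_in=1)[0]


def adde(ra, rb, xer_ca):
    # Add extended: Carry-In taken from XER.CA
    return unified_add(ra, rb, carry_in=xer_ca)[0]


def subfe(ra, rb, xer_ca):
    # Subtract-from extended: invert RA, Carry-In taken from XER.CA
    return unified_add(ra, rb, invert_a=True, carry_in=xer_ca)[0]
```

For instance, `subf(5, 12)` evaluates `~5 + 12 + 1` modulo 2^64, which is 7. The wrappers differ only in the control values passed to the single shared function, which is the property the text attributes to a micro-coded implementation.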