**How long will it take to complete?**
-In the case of divide or Transcendentals they are so complex that simple
+In the case of divide or Transcendentals the algorithms needed are so complex that simple
implementations can often take an astounding 128 clock cycles to complete.
-Other instructions waiting for the results back up and eventually stall,
-where in-order systems just stall straight away.
+Other instructions waiting for the results will back up and eventually stall,
+where in-order systems pretty much just stall straight away.
+
+Less extreme examples include instructions that take only a few cycles to complete,
+but if used in tight loops with Conditional Branches, an Out-of-Order system with
+Speculative capability may need significantly more Reservation Stations to hold
+in-flight data for instructions which take longer than those which do not.
+
+**Can one instruction do the job of many?**
+
+Large numbers of disparate instructions adversely affect resource utilisation in
+In-Order systems. However it is not always that simple: every one of the Power
+ISA "add" and "subtract" instructions, as shown by the Microwatt source code, may
+be micro-coded as one single instruction where RA may optionally be inverted,
+output likewise, and Carry-In set to 1, 0 or XER.CA. From these options the
+*entire* suite of add/subtract may be synthesised (for subtract, inverting RA and
+adding an extra 1 produces the 2s-complement of RA).
+
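The add/subtract synthesis described above can be illustrated with a minimal Python sketch. It is a behavioural model under stated assumptions, not Microwatt's actual implementation: the function name `unified_add`, the parameter names `invert_a` and `invert_out`, and the fixed 64-bit width are all hypothetical choices made for illustration.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit register width


def unified_add(ra, rb, carry_in, invert_a=False, invert_out=False):
    """Hypothetical single adder operation: optionally invert RA, add RB
    and a carry-in of 0, 1 or XER.CA, then optionally invert the output.
    Returns (result, carry_out)."""
    a = (~ra & MASK64) if invert_a else (ra & MASK64)
    total = a + (rb & MASK64) + carry_in
    result = total & MASK64
    carry_out = total >> 64
    if invert_out:
        result = ~result & MASK64
    return result, carry_out


# Each "instruction" is only a choice of options on the one operation.
def add(ra, rb):
    return unified_add(ra, rb, 0)[0]


def subf(ra, rb):
    # RB - RA: ~RA + 1 is the 2s-complement of RA
    return unified_add(ra, rb, 1, invert_a=True)[0]


def adde(ra, rb, xer_ca):
    return unified_add(ra, rb, xer_ca)


def subfe(ra, rb, xer_ca):
    return unified_add(ra, rb, xer_ca, invert_a=True)


def neg(ra):
    # 0 - RA
    return unified_add(ra, 0, 1, invert_a=True)[0]
```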
+`bmask` for example is to be proposed as a single instruction with a 5-bit "Mode"
+operand, greatly simplifying some micro-architectural implementations. Likewise
+the FP-INT conversion instructions are grouped as a set of four, instead of
+over 30 separate instructions. Aside from anything else, this strategy makes
+the ISA Working Group's evaluation task easier, as well as reducing the work
+of writing a Compliance Test Suite.
**Summary**