|
R2,CA = A2+B2+CA adde r2,a2,b2
+This pattern - sequential execution of individual instructions
+with incrementing register numbers - is precisely the definition
+of how SVP64 works!
Thus, due to sequential execution of `adde` both consuming and producing
a CA Flag, `sv.adde` is in effect an alias for Vectorised add. As such,
implementors are entirely at liberty to recognise Horizontal-First Vector
adds and send the vector of registers to a much larger and wider back-end
ALU, and/or short-cut the intermediate storage of XER.CA on an element
level and implement a Vector-aware carry propagation algorithm
-in back-end hardware that need only take the first incoming XER.CA and
+in back-end hardware that need only read the first incoming XER.CA and
only store the last XER.CA. The size of the underlying back-end SIMD ALU
is entirely at the discretion of the implementer.
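
The equivalence claimed above can be sketched in a few lines of Python. This is a minimal illustrative model, not the SVP64 specification or any real implementation: the function names (`adde`, `sv_adde_sequential`, `sv_adde_wide`), the 64-bit element width, and the convention that element 0 is the least-significant limb are all assumptions made for the example. It compares an element-by-element chain of `adde` operations against a single wide addition that reads only the first incoming carry and produces only the final carry.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit element width


def adde(a, b, ca):
    """Model of one scalar add-extended: returns (result, carry_out)."""
    total = a + b + ca
    return total & MASK64, total >> 64


def sv_adde_sequential(a_vec, b_vec, ca_in):
    """Chain adde over the elements, passing CA from each to the next."""
    result = []
    ca = ca_in
    for a, b in zip(a_vec, b_vec):
        r, ca = adde(a, b, ca)
        result.append(r)
    return result, ca


def sv_adde_wide(a_vec, b_vec, ca_in):
    """Hypothetical wide back-end: one big add, first CA in, last CA out."""
    n = len(a_vec)
    # Pack the limbs into single wide integers (element 0 least significant).
    a = sum(x << (64 * i) for i, x in enumerate(a_vec))
    b = sum(x << (64 * i) for i, x in enumerate(b_vec))
    total = a + b + ca_in
    result = [(total >> (64 * i)) & MASK64 for i in range(n)]
    return result, total >> (64 * n)


a_vec = [MASK64, MASK64, 0]
b_vec = [1, 0, 0]
seq = sv_adde_sequential(a_vec, b_vec, 0)
wide = sv_adde_wide(a_vec, b_vec, 0)
```

In this model both functions return the same limbs and the same final carry, which is why the intermediate per-element CA values never need to be stored architecturally.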