# Twin (implicit) result operations
Some operations in the Power ISA already target two 64-bit scalar
-registers: `lq` for example. Some mathematical algorithms are more
+registers: `lq` for example, and LD with update.
+Some mathematical algorithms are more
efficient when there are two outputs rather than one, providing
-feedback loops between elements. 64-bit multiply
+feedback loops between elements (the best-known being add with
+carry). 64-bit multiply
for example actually internally produces a 128-bit result, which clearly
cannot be stored in a single 64-bit register. Some ISAs recommend
"macro-op fusion": the practice of setting a convention whereby if
The practice and convention of macro-op fusion, however, is not compatible
with SVP64 Horizontal-First, because Horizontal Mode may only
-be applied to a single instruction at a time. Thus it becomes
+be applied to a single instruction at a time, and SVP64 is based on
+the principle of strict Program Order even at the element
+level. Thus it becomes
necessary to add explicit, more complex single instructions with
more operands than would normally be seen in another ISA. If it
were not for the Power ISA already having LD/ST with update as well as
into consideration, the starting point for the implicit destination
is best illustrated in pseudocode:
- # demo of madded
- for (i = 0; i < VL; i++)
+ # demo of madded
+ for (i = 0; i < VL; i++)
if (predval & 1<<i) # predication
src1 = get_polymorphed_reg(RA, srcwid, irs1)
src2 = get_polymorphed_reg(RB, srcwid, irs2)