From 65451445bc21677b0c1a6ce1e6e494056ad21443 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 30 Apr 2022 19:48:18 +0100
Subject: [PATCH]

---
 openpower/sv/svp64/appendix.mdwn | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/openpower/sv/svp64/appendix.mdwn b/openpower/sv/svp64/appendix.mdwn
index 8fb2a629c..add177fc2 100644
--- a/openpower/sv/svp64/appendix.mdwn
+++ b/openpower/sv/svp64/appendix.mdwn
@@ -1005,9 +1005,11 @@ An example ADD operation with predication and element width overrides:
 # Twin (implicit) result operations
 
 Some operations in the Power ISA already target two 64-bit scalar
-registers: `lq` for example. Some mathematical algorithms are more
+registers: `lq` for example, and LD with update.
+Some mathematical algorithms are more
 efficient when there are two outputs rather than one, providing
-feedback loops between elements. 64-bit multiply
+feedback loops between elements (the most well-known being add with
+carry). 64-bit multiply
 for example actually internally produces a 128 bit result, which clearly
 cannot be stored in a single 64 bit register. Some ISAs recommend
 "macro op fusion": the practice of setting a convention whereby if
@@ -1019,7 +1021,9 @@ internally.
 
 The practice and convention of macro-op fusion however is not compatible
 with SVP64 Horizontal-First, because Horizontal Mode may only
-be applied to a single instruction at a time. Thus it becomes
+be applied to a single instruction at a time, and SVP64 is based on
+the principle of strict Program Order even at the element
+level. Thus it becomes
 necessary to add explicit more complex single instructions with
 more operands than would normally be seen in another ISA.
 If it was not for Power ISA already having LD/ST with update as well as
@@ -1045,8 +1049,8 @@ and bear in mind that element-width overrides still have to be taken
 into consideration, the starting point for the implicit destination
 is best illustrated in pseudocode:
 
- # demo of madded
-  for (i = 0; i < VL; i++)
+ # demo of madded
+  for (i = 0; i < VL; i++)
  if (predval & 1<