(no commit message)

author lkcl <lkcl@web>

Sat, 30 Apr 2022 18:48:18 +0000 (19:48 +0100)

committer IkiWiki <ikiwiki.info>

Sat, 30 Apr 2022 18:48:18 +0000 (19:48 +0100)
diff --git a/openpower/sv/svp64/appendix.mdwn b/openpower/sv/svp64/appendix.mdwn
index 8fb2a629c000d41138afb8fa547a5f73ea0b39a2..add177fc261d83cab55a9df7f5fc5abd949609ba 100644
--- a/openpower/sv/svp64/appendix.mdwn
+++ b/openpower/sv/svp64/appendix.mdwn
@@ -1005,9 +1005,11 @@ An example ADD operation with predication and element width overrides:
  # Twin (implicit) result operations
  
  Some operations in the Power ISA already target two 64-bit scalar
-registers: `lq` for example. Some mathematical algorithms are more
+registers: `lq` for example, and LD with update.
+Some mathematical algorithms are more
  efficient when there are two outputs rather than one, providing
-feedback loops between elements.  64-bit multiply
+feedback loops between elements (the most well-known being add with
+carry).  64-bit multiply
  for example actually internally produces a 128 bit result, which clearly
  cannot be stored in a single 64 bit register.  Some ISAs recommend
  "macro op fusion": the practice of setting a convention whereby if
@@ -1019,7 +1021,9 @@ internally.
  
  The practice and convention of macro-op fusion however is not compatible
  with SVP64 Horizontal-First, because Horizontal Mode may only
-be applied to a single instruction at a time.  Thus it becomes
+be applied to a single instruction at a time, and SVP64 is based on
+the principle of strict Program Order even at the element
+level.  Thus it becomes
  necessary to add explicit more complex single instructions with
  more operands than would normally be seen in another ISA. If it
  was not for Power ISA already having LD/ST with update as well as
@@ -1045,8 +1049,8 @@ and bear in mind that element-width overrides still have to be taken
  into consideration, the starting point for the implicit destination
  is best illustrated in pseudocode:
  
-      # demo of madded
-      for (i = 0; i < VL; i++)
+     # demo of madded
+     for (i = 0; i < VL; i++)
          if (predval & 1<<i) # predication
             src1 = get_polymorphed_reg(RA, srcwid, irs1)
             src2 = get_polymorphed_reg(RB, srcwid, irs2)