(no commit message)

author lkcl <lkcl@web>

Sun, 26 Jun 2022 14:26:24 +0000 (15:26 +0100)

committer IkiWiki <ikiwiki.info>

Sun, 26 Jun 2022 14:26:24 +0000 (15:26 +0100)
diff --git a/openpower/sv/svp64/appendix.mdwn b/openpower/sv/svp64/appendix.mdwn

index fc15904a7765efb05cd3a828aa08d79a734d6fa8..365d8a8f5f8e085fc591cb7053cbb69abbfe81c1 100644
--- a/openpower/sv/svp64/appendix.mdwn
+++ b/openpower/sv/svp64/appendix.mdwn
@@ -604,7 +604,7 @@ will **not** be overwritten and will **not** be zero'd.
  
  Note that when SVM is clear and SUBVL!=1 the sub-elements are
  *independent*, i.e. they are mapreduced per *sub-element* as a result.
-illustration with a vec2, assuming RA==RT, e.g `sv.add/mr/vec2 r4, r4, r16`
+illustration with a vec2, assuming RA==RT, e.g `sv.add/mr/vec2 r4, r4, r16.v`
  
      for i in range(0, VL):
          # RA==RT in the instruction. does not have to be
@@ -622,28 +622,33 @@ like a traditional Vector Processor Reduction instruction.
  Example for a vec2:
  
      for i in range(VL):
-        iregs[RT+i] = op(iregs[RA+i].x, iregs[RA+i].y)
+        iregs[RT+i] = op(iregs[RA+i].x, iregs[RB+i].y)
  
  Example for a vec3:
  
      for i in range(VL):
-        iregs[RT+i] = op(iregs[RA+i].x, iregs[RA+i].y)
-        iregs[RT+i] = op(iregs[RT+i]  , iregs[RA+i].z)
+        iregs[RT+i] = op(iregs[RA+i].x, iregs[RB+i].y)
+        iregs[RT+i] = op(iregs[RT+i]  , iregs[RB+i].z)
  
  Example for a vec4:
  
      for i in range(VL):
-        iregs[RT+i] = op(iregs[RA+i].x, iregs[RA+i].y)
-        iregs[RT+i] = op(iregs[RT+i]  , iregs[RA+i].z)
-        iregs[RT+i] = op(iregs[RT+i]  , iregs[RA+i].w)
+        iregs[RT+i] = op(iregs[RA+i].x, iregs[RB+i].y)
+        iregs[RT+i] = op(iregs[RT+i]  , iregs[RB+i].z)
+        iregs[RT+i] = op(iregs[RT+i]  , iregs[RB+i].w)
  
  In this mode, when Rc=1 the Vector of CRs is as normal: each result
  element creates a corresponding CR element (for the final, reduced, result).
  
-Note that the destination (RT) is automatically used as an "Accumulator"
-register, and consequently the Sub-Vector Loop is interruptible.
-If RT is a Scalar then as usual the main VL Loop terminates at the
-first predicated element (or the first element if unpredicated).
+Note:
+
+1. that the destination (RT) is inherently used as an "Accumulator"
+   register, and consequently the Sub-Vector Loop is interruptible.
+   If RT is a Scalar then as usual the main VL Loop terminates at the
+   first predicated element (or the first element if unpredicated).
+2. that the Sub-Vector designation applies to RA and RB *but not RT*.
+3. that the number of operations executed is one less than the Sub-vector
+   length
  
  # Fail-on-first
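
Review note: the vec2/vec3/vec4 examples in the second hunk all follow one pattern, which the sketch below generalises over the Sub-Vector length. This is only an illustration under stated assumptions, not the specification's reference pseudocode: `subvector_reduce` is a hypothetical helper, registers are modelled as plain Python lists of tuples rather than `iregs`, and predication, Rc=1 handling and the Scalar-RT early exit are left out. It also counts operations, to check the added note that one fewer operation than the Sub-Vector length is executed per element.

```python
from operator import add


def subvector_reduce(op, ra, rb, subvl):
    """Reduce each sub-vector element to a single scalar result.

    ra, rb : lists of tuples, one tuple of `subvl` sub-elements per element
             (a stand-in for iregs[RA+i] and iregs[RB+i]; an assumption).
    Returns (results, op_count). RT holds scalars, not sub-vectors.
    """
    if subvl < 2:
        raise ValueError("reduction needs a Sub-Vector length of at least 2")
    results = []
    op_count = 0
    for a, b in zip(ra, rb):
        # first step: op(RA.x, RB.y), as in the vec2 example
        acc = op(a[0], b[1])
        op_count += 1
        # remaining steps: RT accumulates RB.z, RB.w, ...
        for k in range(2, subvl):
            acc = op(acc, b[k])
            op_count += 1
        results.append(acc)
    return results, op_count


ra = [(1, 2, 3, 4), (10, 20, 30, 40)]
rb = [(5, 6, 7, 8), (50, 60, 70, 80)]
results, op_count = subvector_reduce(add, ra, rb, subvl=4)
print(results, op_count)
```

With `VL=2` and a vec4, each element takes three operations, so `op_count` comes to `VL * (SUBVL - 1)`.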