(no commit message)

author lkcl <lkcl@web>

Wed, 23 Dec 2020 01:06:59 +0000 (01:06 +0000)

committer IkiWiki <ikiwiki.info>

Wed, 23 Dec 2020 01:06:59 +0000 (01:06 +0000)
diff --git a/openpower/sv/svp_rewrite/svp64.mdwn b/openpower/sv/svp_rewrite/svp64.mdwn

index d90c33e304200b4e9706f5db32bfdd25700b71a3..9548f67d8d2365cfa94fa48e158effe271b2304e 100644 (file)
--- a/openpower/sv/svp_rewrite/svp64.mdwn
+++ b/openpower/sv/svp_rewrite/svp64.mdwn
@@ -681,14 +681,19 @@ are mapreduced per *sub-element* as a result.  illustration with a vec2:
          result.x = op(result.x, iregs[RA+i].x)
          result.y = op(result.y, iregs[RA+i].y)
  
-When SVM is set and SUBVL!=1, another variant is enabled.
+Note here that Rc=1 does not make sense when SVM is clear and SUBVL!=1.
+
+
+When SVM is set and SUBVL!=1, another variant is enabled: horizontal subvector mode.  Example for a vec3:
  
      for i in range(VL):
          result = op(iregs[RA+i].x, iregs[RA+i].x)
-        result = op(result, iregs[RA+i].z)
+        result = op(result, iregs[RA+i].y)
          result = op(result, iregs[RA+i].z)
          iregs[RT+i] = result
  
+In this mode, when Rc=1 the Vector of CRs is as normal: each result element creates a corresponding CR element.
+
  ## Fail-on-first
  
  Data-dependent fail-on-first has two distinct variants: one for LD/ST,
@@ -730,6 +735,10 @@ One extremely important aspect of ffirst is:
    vectorised operations are effectively `nops` which is
    *precisely the desired and intended behaviour*.
  
+Another aspect is that for ffirst LD/STs, VL may be truncated arbitrarily to a nonzero value for any implementation-specific reason.  For example: it is perfectly reasonable for implementations to alter VL when ffirst LD or ST operations are initiated on a nonaligned boundary, such that the next iteration of the loop begins its ffirst LD/ST operations on an aligned boundary.  Likewise, VL may be truncated to reduce workloads or balance resources.
+
+CR-based data-dependent fail-on-first, on the other hand, MUST NOT truncate VL arbitrarily.  This is because it is a precise test on which algorithms will rely.
+
  ## pred-result mode
  
  This mode merges common CR testing with predication, saving on instruction count. Below is the pseudocode excluding predicate zeroing and elwidth overrides.
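
Note on the first hunk (horizontal subvector mode): the changed line replaces a second read of `.z` with a read of `.y`. A minimal Python sketch of the pseudocode before and after the change makes the difference visible. The register file modelled as a dict of `(x, y, z)` tuples and the function names `horizontal_old` and `horizontal_new` are assumptions made purely for illustration; they are not part of the specification.

```python
# Sketch only: iregs is modelled as a dict mapping a register number to
# either a vec3 tuple (x, y, z) or a scalar result. This layout is an
# assumption for illustration, not how the specification stores registers.

def horizontal_old(op, iregs, RA, RT, VL):
    """Pseudocode as it stood before the commit: .z is combined twice."""
    for i in range(VL):
        x, y, z = iregs[RA + i]
        result = op(x, x)
        result = op(result, z)
        result = op(result, z)   # .y is never read
        iregs[RT + i] = result

def horizontal_new(op, iregs, RA, RT, VL):
    """Pseudocode as written after the commit: .x, then .y, then .z."""
    for i in range(VL):
        x, y, z = iregs[RA + i]
        result = op(x, x)
        result = op(result, y)
        result = op(result, z)
        iregs[RT + i] = result

# max is used because op(x, x) == x for it, so the seed step is neutral.
before = {0: (1, 9, 4), 1: (7, 2, 3)}
after = dict(before)
horizontal_old(max, before, RA=0, RT=8, VL=2)
horizontal_new(max, after, RA=0, RT=8, VL=2)
print(before[8], after[8])
```

With `max` as the operation, the old form misses the largest sub-element whenever it sits in `.y`, while the new form covers all three sub-elements and writes one scalar per element to `iregs[RT+i]`.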
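
Note on the second hunk (ffirst LD/ST truncation): the added paragraph permits an implementation to shorten VL to any nonzero value, for example so that the next loop iteration starts on an aligned boundary. One possible policy is sketched below in Python. The names `truncate_vl` and `strip_mine`, and the `elwidth_bytes` and `line_bytes` parameters, are hypothetical and chosen for illustration; the text only requires that the truncated VL be nonzero, and does not mandate this or any other policy.

```python
# Sketch of one implementation-specific truncation policy (an assumption,
# not mandated by the specification): shorten VL so the following
# iteration begins on a line_bytes-aligned address.

def truncate_vl(addr, vl, elwidth_bytes, line_bytes):
    """Return a nonzero VL <= vl that ends at the next aligned boundary."""
    offset = addr % line_bytes
    if offset == 0:
        return vl                      # already aligned: leave VL alone
    elems_to_boundary = (line_bytes - offset) // elwidth_bytes
    # VL must stay nonzero even if no whole element fits before the boundary.
    return max(1, min(vl, elems_to_boundary))

def strip_mine(base, n, maxvl, elwidth_bytes, line_bytes):
    """Model a loop of ffirst loads; return the (address, VL) of each pass."""
    addr, remaining, passes = base, n, []
    while remaining > 0:
        vl = truncate_vl(addr, min(maxvl, remaining), elwidth_bytes, line_bytes)
        passes.append((addr, vl))
        addr += vl * elwidth_bytes     # the loop advances by the VL actually used
        remaining -= vl
    return passes

# 20 eight-byte elements starting 16 bytes past a 64-byte boundary.
print(strip_mine(base=0x1010, n=20, maxvl=8, elwidth_bytes=8, line_bytes=64))
```

In this sketch the first pass is shortened to 6 elements and every later pass starts aligned. The loop stays correct because it advances by whatever VL was actually used, which is why arbitrary truncation is tolerable for LD/ST but not for the CR-based variant, whose truncation point is itself the result the algorithm depends on.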