(no commit message)

author lkcl <lkcl@web>

Mon, 11 Apr 2022 08:14:01 +0000 (09:14 +0100)

committer IkiWiki <ikiwiki.info>

Mon, 11 Apr 2022 08:14:01 +0000 (09:14 +0100)
diff --git a/openpower/sv/svp64/appendix.mdwn b/openpower/sv/svp64/appendix.mdwn

index 18be50fba8abe91a922d1fb720d2ac8f0dec9a76..6f23727ba054565a4b2724b4b74003637bff07f0 100644 (file)
--- a/openpower/sv/svp64/appendix.mdwn
+++ b/openpower/sv/svp64/appendix.mdwn
@@ -836,7 +836,9 @@ achieved in DCT and FFT REMAP**
                  // reduction operation -- we still use this algorithm even
                  // if the reduction operation isn't associative or
                  // commutative.
-/// `temp_pred` is a user-visible Vector Condition register 
+    XXX VIOLATION OF SVP64 DESIGN PRINCIPLES               XXXX
+/// XXX `pred` is a user-visible Vector Condition register XXXX
+    XXX VIOLATION OF SVP64 DESIGN PRINCIPLES               XXXX
  ///
  /// all input arrays have length `vl`
  def reduce(vl, vec, pred):
@@ -855,7 +857,7 @@ def reduce(vl, vec, pred):
              pred[i] |= other_pred;
  ```
  
-The principle in SVP64 being violated is that SVP64 is a fully-independent
+The first principle in SVP64 being violated is that SVP64 is a fully-independent
  Abstraction of hardware-looping in between issue and execute phases 
  that has no relation to the operation it issues.  The above pseudocode
  conditionally changes not only the type of element operation issued
@@ -863,7 +865,17 @@ conditionally changes not only the type of element operation issued
  At the very least, for Vertical-First Mode this will result in unanticipated and unexpected behaviour (maximise "surprises" for programmers) in
  the middle of loops, that will be far too hard to explain.
  
-An alternative algorithm is therefore required that does not perform MVs.
+The second principle being violated by the above algorithm is the expectation
+that temporary storage is available for a modified predicate: there is no
+such space.  SVP64 is founded on the principle that all operations are
+"re-entrant" with respect to interrupts and exceptions: SVSTATE must
+be saved and restored alongside PC and MSR, but nothing more. It is perfectly
+fine to have context-switching back to the operation be somewhat slower,
+through "reconstruction" of temporary internal state based on what SVSTATE
+contains, but nothing more.
+
+An alternative algorithm is therefore required that does not perform MVs,
+and does not require additional state to be saved on context-switching.
  
  ```
  def reduce(  vl,  vec, pred, pred,):
@@ -878,15 +890,21 @@ def reduce(  vl,  vec, pred, pred,):
          halfstep = step // 2
          for i in (0..vl).step_by(step)
              other = vi[i + halfstep]
-            i = vi[i]
+            ir = vi[i]
              other_pred = other < vl && pred[other]
              if pred[i] && other_pred
-                vec[i] += vec[other]
-            pred[i] |= other_pred
+                vec[ir] += vec[other]
+            else if other_pred:
+               vi[ir] = vi[other] # index redirection, no MV
+            pred[ir] |= other_pred # reconstructed on context-switch
           step *= 2
-
  ```
  
+In this version the need for an explicit MV is made unnecessary by instead
+leaving elements *in situ*.  The internal modifications to the predicate may,
+due to the reduction being entirely deterministic, be "reconstructed"
+on a context-switch. This may make some implementations slower.
+
  *Implementor's Note: many SIMD-based Parallel Reduction Algorithms are
  implemented in hardware with MVs that ensure lane-crossing is minimised.
  In SIMD ISAs the internal SIMD Architectural design is exposed and imposed on the programmer. Cray-style Vector ISAs on the other hand provide convenient,
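The index-redirection idea introduced by this commit can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the text of the specification: the function name `reduce_in_situ` and the working predicate `live` are hypothetical, `vi` is assumed to start as the identity mapping, and the bounds check is applied to the tree position before `vi` is consulted. The sketch works on copies so that the caller's vector and predicate are left untouched.

```python
def reduce_in_situ(vec, pred):
    """Predicated tree reduction using index redirection instead of MVs.

    Hypothetical sketch: `vi` maps each tree position to the element that
    currently holds that position's partial result, so a masked-out
    position is "filled" by redirecting its index, never by copying data.
    """
    vl = len(vec)
    vec = list(vec)        # work on a copy of the vector
    live = list(pred)      # working predicate; the caller's pred is not modified
    vi = list(range(vl))   # assumption: redirection table starts as identity
    step = 1
    while step < vl:
        step *= 2
        halfstep = step // 2
        for i in range(0, vl, step):
            other = i + halfstep
            other_live = other < vl and live[other]
            if live[i] and other_live:
                # both halves hold a partial result: combine them in place
                vec[vi[i]] += vec[vi[other]]
            elif other_live:
                # only the right half holds a result: redirect, no MV
                vi[i] = vi[other]
            live[i] = live[i] or other_live
    result = vec[vi[0]] if vl and live[0] else None
    return result, vec
```

With `vec = [1, 2, 3, 4]` and `pred = [False, True, True, False]`, the sketch returns `5` and leaves the vector as `[1, 5, 3, 4]`: the partial result stays in element 1, and the masked-out elements 0 and 3 are never written. Because operands are always combined left-to-right, the same ordering holds for non-commutative operations such as string concatenation.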