(no commit message)

author lkcl <lkcl@web>

Sun, 10 Apr 2022 15:33:29 +0000 (16:33 +0100)

committer IkiWiki <ikiwiki.info>

Sun, 10 Apr 2022 15:33:29 +0000 (16:33 +0100)
diff --git a/openpower/sv/svp64/appendix.mdwn b/openpower/sv/svp64/appendix.mdwn

index fac669f0c0a706aff9b4ee339f70cbf7470f3008..3067efe4f59152d29bd071bd0c8382ddfa5d9956 100644 (file)
--- a/openpower/sv/svp64/appendix.mdwn
+++ b/openpower/sv/svp64/appendix.mdwn
@@ -16,29 +16,47 @@ Table of contents:
  
  Vector systems are expected to be high performance.  This is achieved
  through parallelism, which requires that elements in the vector be
-independent.  XER SO and other global "accumulation" flags (CR.OV) cause
+independent.  XER SO/OV and other global "accumulation" flags (CR.SO) cause
  Read-Write Hazards on single-bit global resources, having a significant
  detrimental effect.
  
-Consequently in SV, XER.SO and CR.OV behaviour is disregarded (including
+Consequently in SV, XER.SO and OV behaviour is disregarded (including
  in `cmp` instructions).  XER is simply neither read nor written.
  This includes when `scalar identity behaviour` occurs.  If precise
  OpenPOWER v3.0/1 scalar behaviour is desired then OpenPOWER v3.0/1
  instructions should be used without an SV Prefix.
  
+Of note here is that XER.SO and OV may already be disregarded in the
+Power ISA v3.0/1 SFFS (Scalar Fixed and Floating) Compliancy Subset.
+SVP64 simply makes it mandatory to disregard even for other Subsets,
+but only for SVP64 Prefixed Operations.
+
  An interesting side-effect of this decision is that the OE flag is now
-free for other uses when SV Prefixing is used.
+free for other uses when SV Prefixing is used, and CR.SO may likewise be
+used for other purposes (saturation for example).
  
  XER.CA/CA32 on the other hand is expected and required to be implemented
  according to standard Power ISA Scalar behaviour.  Interestingly, due
  to SVP64 being in effect a hardware for-loop around Scalar instructions
  executing in precise Program Order, a little thought shows that a Vectorised
-Carry-In-Out add is in effect a Big Integer Add, taking a single bit CarryIn
+Carry-In-Out add is in effect a Big Integer Add, taking a single bit Carry In
  and producing, at the end, a single bit Carry out.  High performance
  implementations may exploit this observation to deploy efficient
  Parallel Carry Lookahead.
  
-    sv.
+    # assume VL=4, this results in 4 sequential ops (below)
+    sv.adde r0.v, r4.v, r8.v
+
+    # instructions that get executed in backend hardware:
+    adde r0, r4, r8 # takes carry-in, produces carry-out
+    adde r1, r5, r9 # takes carry from previous
+    ...
+    adde r3, r7, r11 # likewise
+
+It can clearly be seen that the carry chains from one
+64-bit add to the next, the end result being that a
+256-bit "Big Integer Add" has been performed, and that
+CA contains the 257th bit.
  
  # v3.0B/v3.1 relevant instructions
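The carry chaining described in the added text can be checked with a small software model. The following is only a sketch, not the SVP64 specification or simulator: the names `adde`, `sv_adde`, `limbs_to_int` and the flat `regs` list are hypothetical, and it assumes element 0 holds the least significant 64 bits, which is what the order of the expanded `adde` sequence implies.

```python
MASK64 = (1 << 64) - 1


def adde(ra, rb, ca):
    """Model one 64-bit add-extended: returns (result, carry_out)."""
    total = ra + rb + ca
    return total & MASK64, total >> 64


def sv_adde(regs, rt, ra, rb, vl, ca=0):
    """Hypothetical model of a vectorised adde: VL scalar adde ops run
    in program order, each consuming the carry of the previous one."""
    for i in range(vl):
        regs[rt + i], ca = adde(regs[ra + i], regs[rb + i], ca)
    return ca


def limbs_to_int(regs, base, vl):
    """Combine VL consecutive 64-bit registers into one Python integer
    (assumption: register `base` is the least significant limb)."""
    return sum(regs[base + i] << (64 * i) for i in range(vl))


# r4..r7 hold 2**256 - 1, r8..r11 hold 1, so the sum overflows 256 bits.
regs = [0] * 12
regs[4:8] = [MASK64] * 4
regs[8] = 1
a = limbs_to_int(regs, 4, 4)
b = limbs_to_int(regs, 8, 4)

ca = sv_adde(regs, rt=0, ra=4, rb=8, vl=4)

# The 256-bit result plus the final carry (the 257th bit) equals a + b.
assert limbs_to_int(regs, 0, 4) + (ca << 256) == a + b
```

With these inputs the four result registers are all zero and the final carry is 1, matching the statement that CA holds the 257th bit of the sum.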