(no commit message)

author lkcl <lkcl@web>

Thu, 2 Jun 2022 11:18:21 +0000 (12:18 +0100)

committer IkiWiki <ikiwiki.info>

Thu, 2 Jun 2022 11:18:21 +0000 (12:18 +0100)
diff --git a/openpower/sv/svp64_quirks.mdwn b/openpower/sv/svp64_quirks.mdwn

index b68c3168fb7ad5d4e046dcab858da38a429cb876..ad9beddc4bc595a2d1189acb3f3d3789b49860c2 100644 (file)
--- a/openpower/sv/svp64_quirks.mdwn
+++ b/openpower/sv/svp64_quirks.mdwn
@@ -52,6 +52,8 @@ makes no sense at all, such as `sc` or `mtmsr`). The categories are:
  * Condition Register Field operations
  * branch
  
+**Arithmetic**
+
  Arithmetic (known as "normal" mode) is where Scalar and Parallel
  Reduction can be done: Saturation as well, and two new innovative
  modes for Vector ISAs: data-dependent fail-first and predicate result.
@@ -62,6 +64,8 @@ getting used to, as it may result in invalid results, but ultimately
  it is critical to think in terms of the "rules", that everything is
  Scalar instructions in strict Program Order.
  
+**Branches**
+
  Branch is the one and only place where the Scalar
  (non-prefixed) operations differ from the Vector (element)
  instructions, as explained in a separate section.
@@ -74,6 +78,8 @@ order to support a wide range of parallel boolean condition options
  which are expected of a Vector / GPU ISA. These save a considerable
  number of instructions in tight inner loop situations.
  
+**CR Field Ops**
+
  Condition Register Fields are 4-bit wide and consequently element-width
  overrides make absolutely no sense whatsoever. Therefore the elwidth
  override field bits can be used for other purposes when Vectorising
@@ -85,9 +91,27 @@ All of these differences, which require quite a lot of logical
  reasoning and deduction, help explain why there is an entirely different
  CR ops Vectorisation Category.
  
+**Load/Store**
+
  LOAD/STORE is another area that has different needs: this time it is
  down to limitations in Scalar LD/ST. Vector ISAs have Load/Store modes
-which simply make no sense in a RISC Scalar ISA: 
+which simply make no sense in a RISC Scalar ISA: element-stride and
+unit-stride, and indeed the entire concept of a stride itself (a spacing
+between elements), have no place at all in a Scalar ISA. The problems
+come when trying to *retrofit* the concept of "Vector Elements" onto
+a Scalar ISA, and it required a couple of bits (Modes) in the SVP64
+RM Prefix to convey the stride mode, changing the Effective Address
+computation as a result. Of particular interest to Hardware
+designers: it did turn out to be possible to perform pre-multiplication
+of the D/DS Immediate by the stride amount, making it possible to avoid
+actually modifying the LD/ST Pipeline itself.
+
+Other areas where LD/ST went quirky: element-width overrides, especially
+when combined with Saturation, given that LD/ST operations have byte,
+halfword, word, dword and quad variants. The interaction between these
+widths as part of the actual operation, and the source and destination
+elwidth overrides, was particularly obscure and hard to derive: some care
+and attention is advised here when reading the specification.
  
  # CR weird instructions
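
Editorial note on the Load/Store hunk above: the pre-multiplication of the D/DS Immediate can be illustrated with a minimal Python sketch. It is not taken from the commit or from the specification. The function names (`scalar_ea`, `strided_immediate`, `vector_eas`), the mode strings, and the exact offset formulas (element-stride as `i * imm`, unit-stride as `imm + i * op_width_bytes`) are assumptions made for illustration, and the real SVP64 rules (predication, elwidth overrides, Saturation) are deliberately left out. The point it tries to show is that only the immediate changes per element, while the scalar Effective Address adder is left untouched.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def scalar_ea(ra_value, d):
    """Unmodified scalar D-form Effective Address: EA = (RA) + D, modulo 2**64."""
    return (ra_value + d) & MASK64


def strided_immediate(imm, i, op_width_bytes, mode):
    """Per-element immediate, computed *before* the scalar EA adder.

    Hypothetical helper: 'element' and 'unit' are illustrative mode names,
    and the formulas are an assumption about how the stride is applied.
    """
    if mode == "element":
        # element-stride: the immediate itself is the spacing between elements
        return i * imm
    if mode == "unit":
        # unit-stride: elements are packed back to back, op_width_bytes apart
        return imm + i * op_width_bytes
    raise ValueError(f"unknown stride mode: {mode!r}")


def vector_eas(ra_value, imm, vl, op_width_bytes, mode):
    """Effective Addresses for elements 0..vl-1, reusing scalar_ea unchanged."""
    return [
        scalar_ea(ra_value, strided_immediate(imm, i, op_width_bytes, mode))
        for i in range(vl)
    ]


if __name__ == "__main__":
    # Halfword (2-byte) operation, immediate 8, four elements, unit-stride
    print([hex(ea) for ea in vector_eas(0x1000, 8, 4, 2, "unit")])
    # Doubleword (8-byte) operation, immediate 24, three elements, element-stride
    print([hex(ea) for ea in vector_eas(0x1000, 24, 3, 8, "element")])
```

In this sketch `scalar_ea` never learns that it is part of a Vector loop, which is the property the hunk describes as avoiding modification of the LD/ST Pipeline.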