(no commit message)

author lkcl <lkcl@web>

Thu, 11 May 2023 16:16:36 +0000 (17:16 +0100)

committer IkiWiki <ikiwiki.info>

Thu, 11 May 2023 16:16:36 +0000 (17:16 +0100)
diff --git a/openpower/sv/svp64.mdwn b/openpower/sv/svp64.mdwn
index 0f198f13e0ffa435d589b24bd1984de8771a864b..e0c0a9ee03b94ead03ec7ec0ea7b4bd60e73d4cc 100644
--- a/openpower/sv/svp64.mdwn
+++ b/openpower/sv/svp64.mdwn
@@ -372,21 +372,6 @@ the example having VL=5.  Thus on "wrapping" - sequential progression
  from GPR(1) into GPR(2) - the 5th result modifies **only** the bottom
  16 LSBs of GPR(1).
  
-*Engineering note: to avoid a Read-Modify-Write at the register
-file it is strongly recommended to implement byte-level write-enable lines
-exactly as has been implemented in DRAM ICs for many decades. Additionally
-the predicate mask bit is advised to be associated with the element
-operation and alongside the result ultimately passed to the register file.
-When element-width is set to 64-bit the relevant predicate mask bit
-may be repeated eight times and pull all eight write-port byte-level
-lines HIGH. Clearly when element-width is set to 8-bit the relevant
-predicate mask bit corresponds directly with one single byte-level
-write-enable line.  It is up to the Hardware Architect to then amortise
-(merge) elements together into both PredicatedSIMD Pipelines as well
-as simultaneous non-overlapping Register File writes, to achieve High
-Performance designs.  Overall it helps to think of the GPR and FPR
-register files as being much more akin to a 64-bit-wide byte-level-addressable SRAM.*
-
  If the 16-bit operation were to be followed up with a 32-bit Vectorised
  Operation, the exact same contents would be viewed as follows:
  
@@ -410,6 +395,21 @@ form because `MSR.LE` is directly in control of the Memory-to-Register
  byte-ordering. This section is exclusively about how to correctly perceive
  Simple-V-Augmented **Register** Files.
  
+*Engineering note: to avoid a Read-Modify-Write at the register
+file it is strongly recommended to implement byte-level write-enable lines
+exactly as has been implemented in DRAM ICs for many decades. Additionally
+the predicate mask bit is advised to be associated with the element
+operation and alongside the result ultimately passed to the register file.
+When element-width is set to 64-bit the relevant predicate mask bit
+may be repeated eight times and pull all eight write-port byte-level
+lines HIGH. Clearly when element-width is set to 8-bit the relevant
+predicate mask bit corresponds directly with one single byte-level
+write-enable line.  It is up to the Hardware Architect to then amortise
+(merge) elements together into both PredicatedSIMD Pipelines as well
+as simultaneous non-overlapping Register File writes, to achieve High
+Performance designs.  Overall it helps to think of the GPR and FPR
+register files as being much more akin to a 64-bit-wide byte-level-addressable SRAM.*
+
  **Comparative equivalent using VSR registers**
  
  For a comparative data point the VSR Registers may be expressed in the
@@ -425,7 +425,6 @@ element (numbered zero) being at the bitwise-numbered **LSB** end of the
  register, where VSX does the reverse: places the numerically-*highest*
  (last-numbered) element at the LSB end of the register.
  
-
  ```
      #pragma pack
      typedef union {
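Reviewer note: the engineering note relocated by this commit describes fanning a single predicate mask bit out to byte-level write-enable lines. The following is a minimal behavioural sketch of that idea in C, not the Libre-SOC implementation. The function names `byte_enables` and `regfile_write` are hypothetical, the element width is assumed to be 1, 2, 4 or 8 bytes, and element 0 is assumed to sit at the LSB end of the 64-bit register, as the surrounding text states for Simple-V.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the 8 byte-level write-enable lines for
 * element `idx` of a 64-bit register, given the element width in bytes
 * and the predicate mask bit for that element.
 * The predicate bit is simply repeated once per byte of the element. */
uint8_t byte_enables(unsigned elwidth_bytes, unsigned idx, unsigned pred_bit)
{
    assert(elwidth_bytes == 1 || elwidth_bytes == 2 ||
           elwidth_bytes == 4 || elwidth_bytes == 8);
    assert(idx < 8 / elwidth_bytes);
    if (!pred_bit)
        return 0;                       /* masked out: no byte is written */
    /* a run of `elwidth_bytes` ones, shifted to the element's byte lanes */
    uint8_t run = (uint8_t)((1u << elwidth_bytes) - 1u);
    return (uint8_t)(run << (idx * elwidth_bytes));
}

/* Hypothetical model of one register-file write port with byte-level
 * write enables. In hardware each enable line gates one byte lane, so
 * no read of the old value is needed; the merge below only models the
 * resulting register contents in software. */
uint64_t regfile_write(uint64_t old, uint64_t data, uint8_t we)
{
    uint64_t lane_mask = 0;
    for (unsigned b = 0; b < 8; b++) {
        if (we & (1u << b))
            lane_mask |= (uint64_t)0xFF << (8 * b);
    }
    return (old & ~lane_mask) | (data & lane_mask);
}
```

With a 64-bit element width and the predicate bit set, `byte_enables(8, 0, 1)` asserts all eight lines; with an 8-bit element width, each predicate bit maps to exactly one line, which matches the two cases spelled out in the note.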