# unit / element stride mode, compute 64 bit address
srcbase = get_polymorphed_reg(RA, 64, 0)
# adjust for unit/el-stride
srcbase += ....  # unit/el-stride adjustment uses op_width here
# read the underlying memory
memread <= MEM(srcbase + imm_offs, op_width)
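# (byteswap of memread into the regfile's LE-defined order happens at
#  this point; it is shown explicitly in the LD/Indexed pseudocode below)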
# truncate/extend to over-ridden dest width.
memread = adjust_wid(memread, op_width, svctx.elwidth)
# takes care of inserting memory-read (now correctly byteswapped)
# into regfile underlying LE-defined order, into the right place
# using Element-Packing starting at register RT, respecting destination
# element bitwidth, and the element index (j)
set_polymorphed_reg(RT, svctx.elwidth, j, memread)
# increments both src and dest element indices (no predication here)
i++;
j++;
```
Note above that the source elwidth is *not used at all* in LD-immediate: RA
never has elwidth overrides, leaving the elwidth free for truncation/extension
of the result.

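To make the truncate/extend and Element-Packing steps concrete, here is a
minimal executable sketch in Python. The flat byte-array regfile, the
zero-extension behaviour and the helper signatures are illustrative
assumptions only, not the normative definitions of `adjust_wid` and
`set_polymorphed_reg`:

```
# toy model: 128 x 64-bit registers held as a flat little-endian byte
# array, so element-packing at any elwidth reduces to byte addressing
regfile = bytearray(128 * 8)

def adjust_wid(value, op_width, dest_width):
    # truncate (or trivially zero-extend) a memory read performed at
    # op_width bits to the destination element width; sign-extending
    # variants are omitted from this sketch
    return value & ((1 << dest_width) - 1)

def set_polymorphed_reg(rt, elwidth, j, value):
    # pack element j, elwidth bits wide, contiguously starting at
    # register rt, in the regfile's underlying LE byte order
    bytes_per_el = elwidth // 8
    offs = rt * 8 + j * bytes_per_el
    regfile[offs:offs + bytes_per_el] = value.to_bytes(bytes_per_el, "little")

# four 16-bit loads packed into register 4: elements 0..3 fill bytes
# 32..39, i.e. exactly one 64-bit register
for j, memread in enumerate([0x1234, 0xBEEF, 0x0042, 0x7FFF]):
    set_polymorphed_reg(4, 16, j, adjust_wid(memread, 16, 16))
assert regfile[32:40] == bytes.fromhex("3412efbe4200ff7f")
```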
For LD/Indexed, the key is that in the calculation of the Effective Address,
RA has no elwidth override but RB does. The pseudocode below is simplified:

```
memread = byteswap(memread, op_width)
# truncate/extend to the implicitly-determined dest width
# (there is no dest elwidth override here: see Programmer's note below)
dest_width = op_width if RT.isvec else 64
memread = adjust_wid(memread, op_width, dest_width)
# takes care of inserting memory-read (now correctly byteswapped)
# into regfile underlying LE-defined order, into the right place
# within the NEON-like register, respecting destination element
# bitwidth, and the element index (j)
set_polymorphed_reg(RT, dest_width, j, memread)
# increments both src and dest element indices (no predication here)
i++;
j++;
```
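The simplified pseudocode above omits the Effective Address step itself. As
a hedged sketch only: RA is read as a plain 64-bit scalar while the
per-element offset comes from RB at its overridden elwidth; whether the
narrowed offset is sign- or zero-extended is an assumption made here
(sign-extension shown):

```
def indexed_ea(ra_val, rb_elem, rb_elwidth):
    # RA: full 64-bit scalar read, no elwidth override
    # rb_elem: the i-th element of RB, read at the overridden elwidth
    if rb_elem & (1 << (rb_elwidth - 1)):  # assumption: the offset is
        rb_elem -= 1 << rb_elwidth         # sign-extended to 64 bits
    return (ra_val + rb_elem) & ((1 << 64) - 1)

# a 16-bit offset element of 0xFFFE acts as -2 under this assumption
assert indexed_ea(0x1000, 0xFFFE, 16) == 0x0FFE
```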
*Programmer's note: with no destination elwidth override the destination
width must be implicitly ascertained. The assumption is that if the
destination is a Scalar then the entire 64-bit register must be written,
thus the width is extended to 64-bit. If however the destination is a
Vector then it is deemed appropriate to use the LD/ST width and to perform
contiguous register element packing at that width. The justification is
that if further sign-extension or saturation is required after a LD, these
may be performed by a follow-up instruction that uses a source elwidth
override matching the exact width of the LD operation. Correspondingly,
for a ST, a destination elwidth override on a prior instruction may match
the exact width of the ST instruction.*

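The Scalar/Vector rule from the note can be captured in one line. Below is
a small sketch restating the `dest_width` assignment from the pseudocode
above, with `rt_isvec` standing in for RT.isvec:

```
def implicit_dest_width(op_width, rt_isvec):
    # no destination elwidth override on the LD/ST:
    # Scalar RT -> write the whole 64-bit register
    # Vector RT -> pack elements at the LD/ST operation width
    return op_width if rt_isvec else 64

assert implicit_dest_width(16, rt_isvec=False) == 64  # e.g. lhz to a Scalar
assert implicit_dest_width(16, rt_isvec=True) == 16   # e.g. lhz to a Vector
```
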
## Remapped LD/ST
In the [[sv/remap]] page the concept of "Remapping" is described. Whilst