add ls001.po9 RFC

diff --git a/openpower/sv/mv.vec.mdwn b/openpower/sv/mv.vec.mdwn
index fd24fb13cee4194a7c1393db4a5a5fdb0a7dc6ab..9190fcdc7dc81f06a23b9c44da96422fd105d825 100644
--- a/openpower/sv/mv.vec.mdwn
+++ b/openpower/sv/mv.vec.mdwn
@@ -2,127 +2,26 @@
  
  # Vector Pack/Unpack operations
  
-In the SIMD VSX set, section 6.8.1 and 6.8.2 p254 of v3.0B has a series of pack and unpack operations. This page covers those and more.  [[svp64]] provides the Vector Context to also add saturation as well as predication.
+In the SIMD VSX set, sections 6.8.1 and 6.8.2 (p254) of v3.0B define a series of pack and unpack operations. Additional pixel pack/unpack instructions
+also exist.
+
+In SVP64, Pack and Unpack are provided *in the abstract*, so that they apply to *all*
+Vectoriseable instructions.
  
  * See <https://bugs.libre-soc.org/show_bug.cgi?id=230#c30>
  * <https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-June/004911.html>
  
-Pack and unpack may be covered by [[sv/remap]] by using Matrix 2D layouts on either source or destination but is quite expensive to do so.  Additionally,
+The effect of Pack and Unpack could be achieved with [[sv/remap]] by using Matrix 2D layouts on either source or destination, but doing so is quite expensive.  Additionally,
  with pressure on the Scalar 32-bit opcode space it is more appropriate to
-compromise by adding required capability in SVP64 on top of a
-base pre-existing Scalar mv instruction.  [[sv/mv.swizzle]] is sufficiently
+compromise by adding the required capability to SVP64 as a high priority
+(as part of the Management Instructions).  [[sv/mv.swizzle]] is sufficiently
  unusual to justify a base Scalar 32-bit instruction but pack/unpack is not.
-Both may benefit from a use of the `RM.EXTRA` field to provide an
-additional mode, that may be applied to vec2/3/4.
-
-# REMAP concept for pack/unpack
-
-It may be possible to use one standard mv instruction to perform packing
-and unpacking: Matrix allows for both reordering and offsets. At the very least a predicate mask potentially can
-be used.
-
-* If a single src-dest mv is used, then it potentially requires
-  two separate REMAP and two separate sv.mvs: remap-even, sv.mv,
-  remap-odd, sv.mv
-* If adding twin-src and twin-dest that is a lot of instructions,
-  particularly if triple is added as well. FPR mv, GPR mv
-* Unless twin or triple is added, how is it possible to determine
-  the extra register(s) to be merged (or split)?
-
-How about instead relying on the implicit RS=MAXVL+RT trick and
-extending that to RS=MAXVL+RA as a source?  One spare bit in the
-EXTRA RM area says whether the sv.mv is a pack (RS-as-src=RA+MAXVL)
-or unpack (RS-as-dest=RT+MAXVL)
-
-Alternatively, given that Matrix is up to 3 Dimensions, not even
-be concerned about RS, just simply use one of those dimensions to
-span the packing:
-
-Example 1:
-
-* RA set to linear
-* RT set to YX, ydim=2, xdim=4
-* VL=MAXVL=8
-
-The indices match up as follows:
-
-    | RA | (0 1) (2 3) (4 5) (6 7) |
-    | RT |   0 2 4 8     1 3 5 7   |
-
-This results in a 2-element "unpack"
-
-Example 2:
-
-* RT set to linear
-* RT set to YX, ydim=3, xdim=3
-* VL=MAXVL=9
-
-The indices match up as follows:
-
-    | RA |  0 1 2   3 4 5   6 7 8  |
-    | RT | (0 3 6) (1 4 7) (2 5 8) |
-
-This results in a 3-element "pack"
-
-Both examples become particularly fun when Twin Predication is thrown
-into the mix.
-
-There exists room within the `svshape` instruction of  [[sv/remap]]
-to request some alternative Matrix mappings, and there is also
-room within the reserved bits of `svremap` as well.
-
-# RM Pack/unpack
-
-Also used on [[sv/mv.swizzle]] 
-
-MVRM-2P-1S1D:
-
-| Field Name | Field bits | Description                     |
-|------------|------------|----------------------------|
-| Rdest_EXTRA2 | `10:11`  | extends Rdest (R\*\_EXTRA2 Encoding)   |
-| Rsrc_EXTRA2  | `12:13`  | extends Rsrc  (R\*\_EXTRA2 Encoding)   |
-| PACK_en      | `14`     | Enable pack              |
-| UNPACK_en    | `15`     | Enable unpack             |
-| MASK_SRC     | `16:18`  | Execution Mask for Source     |
-
-The usual RM-2P-1S1D is reduced from EXTRA3 to EXTRA2, making
-room for 2 extra bits that enable either "packing" or "unpacking"
-on the subvectors vec2/3/4.
-
-Illustrating a
-"normal" SVP64 operation with `SUBVL!=1:` (assuming no elwidth overrides):
-
-    def index():
-        for i in range(VL):
-            for j in range(SUBVL):
-                yield i*SUBVL+j
-
-    for idx in index():
-        operation_on(RA+idx)
-
-For pack/unpack (again, no elwidth overrides):
-
-    # yield an outer-SUBVL or inner VL loop with SUBVL
-    def index_p(outer):
-        if outer:
-            for j in range(SUBVL):
-                for i in range(VL):
-                    yield i+VL*j
-        else:
-            for i in range(VL):
-                for j in range(SUBVL):
-                    yield i*SUBVL+j
-
-     # walk through both source and dest indices simultaneously
-     for src_idx, dst_idx in zip(index_p(PACK), index_p(UNPACK)):
-         move_operation(RT+dst_idx, RA+src_idx)
+What was ultimately decided was to make Pack/Unpack part of the
+`SVSTATE` [[sv/spr]].
  
-"yield" from python is used here for simplicity and clarity.
-The two Finite State Machines for the generation of the source
-and destination element offsets progress incrementally in
-lock-step.
+# SVSTATE Pack/Unpack Mode bits
  
-Setting of both `PACK_en` and `UNPACK_en` is neither prohibited nor
-`UNDEFINED` because the reordering is fully deterministic, and
-additional REMAP reordering may be applied. For Matrix this would
-give potentially up to 4 Dimensions of reordering.
+Described in [[svp64/appendix]], the Pack/Unpack Modes allow selective
+Transposition of Sub-vector elements on both source and destination.
+[[sv/mv.swizzle]] is unique in that the Subvector length may be different
+for source and destination.
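
The selective Transposition of Sub-vector elements can be illustrated with a small Python sketch, modelled on the `index_p` generator that appears in the text removed by this change. It is only an illustration, not the normative definition (that lives in [[svp64/appendix]]). It assumes no elwidth overrides, and it assumes that the SUBVL-outer walk is meant to visit offset `i*SUBVL+j` with `j` in the outer loop, so that the walk is a genuine transposition. The helper `pack_unpack_moves` and the example values `VL=4`, `SUBVL=2` are hypothetical.

```python
def index_p(outer, vl, subvl):
    """Yield element offsets for one operand.

    outer=True  : SUBVL is the outer loop (transposed walk).
    outer=False : VL is the outer loop (normal SVP64 sub-vector walk).
    Assumption: both walks address offset i*subvl+j.
    """
    if outer:
        for j in range(subvl):
            for i in range(vl):
                yield i * subvl + j
    else:
        for i in range(vl):
            for j in range(subvl):
                yield i * subvl + j


def pack_unpack_moves(pack, unpack, vl, subvl):
    """Hypothetical helper: return (dst_offset, src_offset) pairs.

    The source walk is controlled by `pack` and the destination walk by
    `unpack`; the two generators advance in lock-step.
    """
    src = index_p(pack, vl, subvl)
    dst = index_p(unpack, vl, subvl)
    return [(d, s) for s, d in zip(src, dst)]


# Example: four vec2 elements (x, y) stored interleaved.
source = ["x0", "y0", "x1", "y1", "x2", "y2", "x3", "y3"]
dest = [None] * len(source)
for d, s in pack_unpack_moves(pack=True, unpack=False, vl=4, subvl=2):
    dest[d] = source[s]
# dest now holds all x components followed by all y components
```

With only the source walk transposed, the interleaved `x`/`y` pairs are gathered into two contiguous runs; transposing only the destination walk performs the reverse scatter.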