to request some alternative Matrix mappings, and there is also
room within the reserved bits of `svremap` as well.
-# RM Mode Concept:
+# RM Pack/unpack
+
+Similar to [[sv/mv.swizzle]]
MVRM-2P-2S1D:
| src_SUBVL | `14:15` | SUBVL for Source |
| MASK_SRC | `16:18` | Execution Mask for Source |
-The inclusion of a separate src SUBVL would allow either
-`sv.mv RT.vecN RA.vecN` to mean contiguous sequential copy
-or it could mean zip/unzip (pack/unpack).
+The inclusion of a separate src SUBVL allows
+`sv.mv.swiz RT.vecN RA.vecN` to mean zip/unzip (pack/unpack).
+This is conceptually achieved by having both source and
+destination SUBVL be "outer" loops instead of inner loops.
+
+To illustrate, a "normal" SVP64 operation with `SUBVL != 1`
+(assuming no elwidth overrides) iterates as follows:
+
+    def index():
+        for i in range(VL):
+            for j in range(SUBVL):
+                yield i*SUBVL+j
+
+    for idx in index():
+        operation_on(RA+idx)
+
+For a separate source/dest SUBVL (again, no elwidth overrides):
+
+    # yield an outer-SUBVL, inner VL loop with SRC SUBVL
+    def index_src():
+        for j in range(SRC_SUBVL):
+            for i in range(VL):
+                yield i+VL*j
+
+    # yield an outer-SUBVL, inner VL loop with DEST SUBVL
+    def index_dest():
+        for j in range(SUBVL):
+            for i in range(VL):
+                yield i+VL*j
+
+    # inner looping when SUBVLs are equal
+    if SRC_SUBVL == SUBVL:
+        for idx in index():
+            move_operation(RT+idx, RA+idx)
+    else:
+        # walk through both source and dest indices simultaneously
+        for src_idx, dst_idx in zip(index_src(), index_dest()):
+            move_operation(RT+dst_idx, RA+src_idx)
+
+Python's `yield` is used here for simplicity and clarity.
+The two Finite State Machines for the generation of the source
+and destination element offsets progress incrementally in
+lock-step.
+
+In normal usage, `SRC_SUBVL=1, SUBVL=2/3/4` gives
+a "pack" effect, and `SUBVL=1, SRC_SUBVL=2/3/4` gives an
+"unpack". Setting both SUBVL and SRC_SUBVL to greater than
+1 will, unlike [[sv/mv.swizzle]], produce defined, deterministic results,
+even if they are a little hard to understand. The loops run through
+`MIN(SUBVL, SRC_SUBVL) * VL` elements.
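
The element count follows from `zip` stopping as soon as the shorter of the two generators is exhausted. A minimal sketch of this, transcribing the pseudocode with `VL`, `SUBVL` and `SRC_SUBVL` passed as explicit parameters; the helper names `element_pairs` and `element_count` are assumptions made for illustration and are not part of the specification:

```python
def index(vl, subvl):
    # "normal" SVP64 order: outer VL loop, inner SUBVL loop
    for i in range(vl):
        for j in range(subvl):
            yield i * subvl + j


def index_src(vl, src_subvl):
    # outer SRC_SUBVL loop, inner VL loop (as in the pseudocode)
    for j in range(src_subvl):
        for i in range(vl):
            yield i + vl * j


def index_dest(vl, subvl):
    # outer (dest) SUBVL loop, inner VL loop (as in the pseudocode)
    for j in range(subvl):
        for i in range(vl):
            yield i + vl * j


def element_pairs(vl, subvl, src_subvl):
    # hypothetical helper: list the (dst_idx, src_idx) offsets visited
    if src_subvl == subvl:
        return [(idx, idx) for idx in index(vl, subvl)]
    # zip advances both generators in lock-step and ends with the shorter
    return list(zip(index_dest(vl, subvl), index_src(vl, src_subvl)))


def element_count(vl, subvl, src_subvl):
    # hypothetical helper: number of element moves performed
    return len(element_pairs(vl, subvl, src_subvl))
```

For example, `element_count(4, 2, 3)` evaluates to `min(2, 3) * 4`, i.e. 8, because the destination generator runs out after 8 offsets while the source generator could supply 12.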