The inclusion of a separate src SUBVL allows
`sv.mv.swiz RT.vecN RA.vecN` to mean zip/unzip (pack/unpack).
This is conceptually achieved by having both source and
-destination SUBVL be "outer" loops instead of inner loops.
+destination SUBVL be "outer" loops instead of inner loops,
+exactly as in [[sv/remap]] Matrix mode.
Illustrating a
"normal" SVP64 operation with `SUBVL!=1` (assuming no elwidth overrides):
     # yield an outer-SUBVL, inner VL loop with SRC SUBVL
     def index_src():
-        for j in range(SRC_SUBVL):
+        for j in range(SUBVL):
             for i in range(VL):
                 yield i+VL*j
     # yield an outer-SUBVL, inner VL loop with DEST SUBVL
     def index_dest():
-        for j in range(SUBVL):
+        for j in range(dst_subvl):
             for i in range(VL):
                 yield i+VL*j
- # inner looping when SUBVLs are equal
- if SRC_SUBVL == SUBVL:
- for idx in index():
- move_operation(RT+idx, RA+idx)
- else:
- # walk through both source and dest indices simultaneously
- for src_idx, dst_idx in zip(index_src(), index_dst()):
- move_operation(RT+dst_idx, RA+src_idx)
-
"yield" from python is used here for simplicity and clarity.
The two Finite State Machines for the generation of the source
and destination element offsets progress incrementally in
lock-step.
-Ether `SRC_SUBVL=1, SUBVL=2/3/4` gives
-a "pack" effect, and `SUBVL=1, SRC_SUBVL=2/3/4` gives an
-"unpack". Setting both SUBVL and SRC_SUBVL to greater than
-1 is `UNDEFINED`.
+Just as in [[sv/mv.vec]], when `PACK_en` is set it is the source
+that swaps to Outer-subvector loops, and when `UNPACK_en` is set
+it is the destination that swaps its loop-order. Setting both
+`PACK_en` and `UNPACK_en` is neither prohibited nor `UNDEFINED`
+because the behaviour is fully deterministic.
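
The loop-order swap described in the added text can be sketched as a small runnable model. This is only an illustration under stated assumptions, not the specification's pseudocode: it assumes that sub-element `j` of element `i` lives at offset `i*SUBVL+j`, and that "Outer-subvector" order means walking that same layout with `j` in the outer loop. The names `index_inner`, `index_outer` and `mv` are hypothetical, and `PACK_en` / `UNPACK_en` are modelled as plain booleans.

```python
VL = 4      # number of elements (assumed value for the demo)
SUBVL = 2   # sub-elements per element (assumed value for the demo)

def index_inner():
    # assumed "normal" order: VL outer, SUBVL inner
    for i in range(VL):
        for j in range(SUBVL):
            yield i * SUBVL + j

def index_outer():
    # swapped order: SUBVL outer, VL inner, same offset formula (assumption)
    for j in range(SUBVL):
        for i in range(VL):
            yield i * SUBVL + j

def mv(src_regs, pack_en=False, unpack_en=False):
    # hypothetical model of the move: PACK_en swaps the source loop order,
    # UNPACK_en swaps the destination loop order
    dst_regs = [None] * (VL * SUBVL)
    src_idx = index_outer() if pack_en else index_inner()
    dst_idx = index_outer() if unpack_en else index_inner()
    # the two generators advance in lock-step, one offset each per step
    for s, d in zip(src_idx, dst_idx):
        dst_regs[d] = src_regs[s]
    return dst_regs

interleaved = ["x0", "y0", "x1", "y1", "x2", "y2", "x3", "y3"]
packed = mv(interleaved, pack_en=True)
print(packed)
print(mv(packed, unpack_en=True))
print(mv(interleaved, pack_en=True, unpack_en=True))
```

Under these assumptions, setting only `pack_en` gathers all `x` values ahead of all `y` values, setting only `unpack_en` reverses that, and setting both makes source and destination walk the same offsets so the result is a plain copy, which is one way to see why the combination is deterministic.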