From 11cd47c08f7807d7a22af00568bce60d4909dadf Mon Sep 17 00:00:00 2001
From: lkcl
Date: Tue, 14 Jun 2022 16:50:41 +0100
Subject: [PATCH]

---
 openpower/sv/mv.swizzle.mdwn | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/openpower/sv/mv.swizzle.mdwn b/openpower/sv/mv.swizzle.mdwn
index bee79fcd3..48954eefb 100644
--- a/openpower/sv/mv.swizzle.mdwn
+++ b/openpower/sv/mv.swizzle.mdwn
@@ -228,7 +228,8 @@ MVRM-2P-1S1D:
 The inclusion of a separate src SUBVL allows
 `sv.mv.swiz RT.vecN RA.vecN` to mean zip/unzip (pack/unpack).
 This is conceptually achieved by having both source and
-destination SUBVL be "outer" loops instead of inner loops.
+destination SUBVL be "outer" loops instead of inner loops,
+exactly as in [[sv/remap]] Matrix mode.
 
 Illustrating a "normal" SVP64 operation with `SUBVL!=1:` (assuming no
 elwidth overrides):
@@ -245,31 +246,23 @@ For a separate source/dest SUBVL (again, no elwidth overrides):
 
     # yield an outer-SUBVL, inner VL loop with SRC SUBVL
     def index_src():
-        for j in range(SRC_SUBVL):
+        for j in range(SUBVL):
             for i in range(VL):
                 yield i+VL*j
 
     # yield an outer-SUBVL, inner VL loop with DEST SUBVL
     def index_dest():
-        for j in range(SUBVL):
+        for j in range(dst_subvl):
             for i in range(VL):
                 yield i+VL*j
 
-    # inner looping when SUBVLs are equal
-    if SRC_SUBVL == SUBVL:
-        for idx in index():
-            move_operation(RT+idx, RA+idx)
-    else:
-        # walk through both source and dest indices simultaneously
-        for src_idx, dst_idx in zip(index_src(), index_dst()):
-            move_operation(RT+dst_idx, RA+src_idx)
-
 "yield" from python is used here for simplicity and clarity.
 The two Finite State Machines for the generation of
 the source and destination element offsets progress incrementally in
 lock-step.
 
-Ether `SRC_SUBVL=1, SUBVL=2/3/4` gives
-a "pack" effect, and `SUBVL=1, SRC_SUBVL=2/3/4` gives an
-"unpack". Setting both SUBVL and SRC_SUBVL to greater than
-1 is `UNDEFINED`.
+Just as in [[sv/mv.vec]], when `PACK_en` is set it is the source
+that swaps to Outer-subvector loops, and when `UNPACK_en` is set
+it is the destination that swaps its loop-order. Setting both
+`PACK_en` and `UNPACK_en` is neither prohibited nor `UNDEFINED`
+because the behaviour is fully deterministic.
-- 
2.30.2
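
Reviewer's illustration (not part of the patch): a minimal, standalone sketch of the loop-order swap that the new paragraph describes, under stated assumptions. The helper names `index_inner`, `index_outer` and `move`, and the flat-list register model, are hypothetical. The sketch also assumes elements are addressed as `i*subvl+j` in both loop orders so that the two orders visibly differ; the hunk's own generators yield `i+VL*j`, so this is an illustration of the idea rather than the patch's definition.

```python
def index_inner(vl, subvl):
    # Assumed "normal" order: all elements of one subvector, then the next.
    for i in range(vl):
        for j in range(subvl):
            yield i * subvl + j


def index_outer(vl, subvl):
    # Outer-subvector order: element j of every subvector, then j+1.
    # Same i*subvl+j addressing as above (an assumption of this sketch).
    for j in range(subvl):
        for i in range(vl):
            yield i * subvl + j


def move(src, vl, subvl, pack_en=False, unpack_en=False):
    # PACK_en swaps the source loop order, UNPACK_en the destination's,
    # following the wording of the paragraph added by the patch.
    src_order = index_outer if pack_en else index_inner
    dst_order = index_outer if unpack_en else index_inner
    dst = [None] * (vl * subvl)
    # The two index generators advance in lock-step.
    for s, d in zip(src_order(vl, subvl), dst_order(vl, subvl)):
        dst[d] = src[s]
    return dst


src = ["x0", "y0", "x1", "y1", "x2", "y2"]        # VL=3, SUBVL=2
packed = move(src, 3, 2, pack_en=True)             # gathers x's, then y's
restored = move(packed, 3, 2, unpack_en=True)      # interleaves them again
both = move(src, 3, 2, pack_en=True, unpack_en=True)
```

In this model, setting both flags applies the same permutation to source and destination, so the result is a plain element-for-element copy, which is one way to see why the combination can be called deterministic rather than `UNDEFINED`.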