From 90252acd4f6b2a5d54347a855783a3fe62344571 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Tue, 14 Jun 2022 20:17:48 +0100
Subject: [PATCH]

---
 openpower/sv/mv.swizzle.mdwn | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/openpower/sv/mv.swizzle.mdwn b/openpower/sv/mv.swizzle.mdwn
index 8c6e2fc5f..d5b3e4287 100644
--- a/openpower/sv/mv.swizzle.mdwn
+++ b/openpower/sv/mv.swizzle.mdwn
@@ -271,3 +271,28 @@ that swaps to Outer-subvector loops, and when `UNPACK_en` is set
 it is the destination that swaps its loop-order. Setting both
 `PACK_en` and `UNPACK_en` is neither prohibited nor `UNDEFINED`
 because the behaviour is fully deterministic.
+
+*However*, in Vertical-First Mode, when both are enabled,
+with both source and destination being outer loops, only a
+**single** step of srcstep and dststep is performed. Contrast
+this with the case where only `PACK_en` is set: it is the
+*destination* that is an inner subvector loop, and therefore
+Vertical-First runs through the entire `dst_subvl` group.
+Likewise, when only `UNPACK_en` is set it is the source
+subvector that is run through as a group.
+
+```
+if VERTICAL_FIRST:
+    # must run through SUBVL or dst_subvl elements, to keep
+    # the subvector "together".  weirdness occurs due to
+    # PACK_en/UNPACK_en
+    num_runs = SUBVL # 1-4
+    if PACK_en:
+        num_runs = dst_subvl # destination still an inner loop
+    if PACK_en and UNPACK_en:
+        num_runs = 1 # both are outer loops
+    for substep in range(num_runs):
+        (src_idx, offs) = yield from index_src()
+        dst_idx = yield from index_dst()
+        move_operation(RT+dst_idx, RA+src_idx+offs)
+```
-- 
2.30.2
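
Note on the hunk above: the `num_runs` selection in the added pseudocode can be checked in isolation. The following is only a minimal sketch, not part of the patch or the specification. The helper name `vertical_first_substeps` and the example values `SUBVL=3`, `dst_subvl=2` are assumptions made purely for illustration; the branch order mirrors the pseudocode.

```python
def vertical_first_substeps(subvl, dst_subvl, pack_en, unpack_en):
    """Element moves per Vertical-First step (hypothetical helper).

    Mirrors the num_runs selection in the patch's pseudocode; it does
    not model srcstep/dststep or the actual move operation.
    """
    if pack_en and unpack_en:
        return 1          # both source and destination are outer loops
    if pack_en:
        return dst_subvl  # destination is still an inner subvector loop
    return subvl          # UNPACK_en only, or neither: source subvector group


# Tabulate all four PACK_en/UNPACK_en combinations for example sizes.
for pack_en in (False, True):
    for unpack_en in (False, True):
        n = vertical_first_substeps(3, 2, pack_en, unpack_en)
        print(f"PACK_en={pack_en!s:5} UNPACK_en={unpack_en!s:5} -> {n}")
```

Only the case with both flags set collapses to a single substep; every other combination walks one whole subvector group, which matches the prose in the hunk.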