From 62ea697e72da387a591c00dcb54a83253d1abebb Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 7 May 2022 21:17:17 +0100
Subject: [PATCH]

---
 openpower/sv/SimpleV_rationale.mdwn | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index 36dc950c9..aecdc58ef 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -435,9 +435,14 @@ normally otherwise encountered
 this results in contention between the L1 D and I Caches at the L2 Bus, slowing down
 execution even further. Power ISA 3.1 MMA (Matrix-Multiply-Assist) requires
 loop-unrolling to contend with non-power-of-two Matrix
-sizes: SVP64 does not, as hinted at below.
-
-Additional savings come in the form of `SVREMAP`. This is a hardware
+sizes: SVP64 does not (as hinted at below).
+[Figures 8 and 9](https://arxiv.org/abs/2104.03142)
+illustrate the process of concatenating copies of data in order
+to match RADIX2 limitations of MMA.
+
+Additional savings come in the form of `SVREMAP`. Like the
+hardware-assist of Google's TPU mentioned on p9 of the above MMA paper,
+`SVREMAP` is a hardware
 index transformation system where the normally sequentially-linear
 Vector element access may be "Re-Mapped" to limited but algorithmic-tailored
 commonly-used deterministic schedules, for example Matrix Multiply,
-- 
2.30.2
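
The index-transformation idea described in the patched paragraph can be pictured with a small software model. The sketch below is purely conceptual and is not the `SVREMAP` specification: the names `matmul_schedule` and `remapped_matmul` are hypothetical, and the particular ordering of steps is an assumption chosen for readability. It only shows how a single linear element counter can be turned into a deterministic Matrix Multiply schedule, and why such a schedule has no need for power-of-two dimensions or padded copies of the data.

```python
def matmul_schedule(rows, inner, cols):
    """Yield (a_off, b_off, c_off) flat offsets, one triple per element step.

    Hypothetical model: a linear step counter is decomposed into (i, k, j),
    which are then turned into row-major offsets into A, B and C.
    """
    for step in range(rows * inner * cols):
        i, rem = divmod(step, inner * cols)  # row of A and C
        k, j = divmod(rem, cols)             # inner index, column of B and C
        yield i * inner + k, k * cols + j, i * cols + j


def remapped_matmul(a, b, rows, inner, cols):
    """Multiply flat row-major matrices using only the remapped offsets."""
    c = [0] * (rows * cols)
    # A single flat loop: no unrolling, and no padding to a power of two.
    for a_off, b_off, c_off in matmul_schedule(rows, inner, cols):
        c[c_off] += a[a_off] * b[b_off]
    return c


# A 2x3 matrix times a 3x2 matrix: neither dimension pair is a power of two.
result = remapped_matmul([1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], 2, 3, 2)
print(result)  # [58, 64, 139, 154]
```

In this model the multiply-accumulate body never changes with the matrix size; only the schedule parameters do, which is the property the paragraph contrasts with loop-unrolled MMA code.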