From: lkcl
Date: Thu, 7 Jul 2022 09:11:04 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~1305
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=56eb674b93719bc5aa435f3311eefe7489b01ae1;p=libreriscv.git

---

diff --git a/openpower/sv/remap.mdwn b/openpower/sv/remap.mdwn
index bc67a847c..57f13f556 100644
--- a/openpower/sv/remap.mdwn
+++ b/openpower/sv/remap.mdwn
@@ -112,7 +112,23 @@ and briefly goes over their characteristics and limitations.
 
 ## Matrix (1D/2D/3D shaping)
 
-TODO
+Matrix Multiplication is a huge part of High-Performance Compute,
+and 3D.
+In many PackedSIMD as well as Scalable Vector ISAs, non-power-of-two
+Matrix sizes are a serious challenge. PackedSIMD ISAs, in order to
+cope with, for example, 3x4 Matrices, recommend rolling data-repetition and loop-unrolling.
+Aside from the cost of the load on the L1 I-Cache, the trick only
+works if one of the dimensions X or Y is a power of two. Prime-number
+dimensions (5x7, 3x5) become deeply problematic to unroll.
+
+Even traditional Scalable Vector ISAs have issues with Matrices, often
+having to perform data Transpose by pushing out through Memory and back,
+or computing Transposition Indices (costly) then copying to another
+Vector (costly).
+
+Matrix REMAP was thus designed to solve these issues by providing
+"Schedules" that allow what would otherwise be limited to a strictly
+linear Vector to be viewed as 2D (even 3D) and reordered in-place.
 
 ## FFT/DCT Triple Loop
 
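
The "Schedule" idea in the added Matrix text can be illustrated with a small Python sketch. This is not the SVP64 REMAP specification or its register encoding: the names `matrix_schedule` and `matmul_flat` are hypothetical, and the sketch only assumes a flat, row-major vector whose elements are visited in an order produced by nested loops over the dimensions. It shows that a transposed walk, or a matrix multiply over prime-sized dimensions, needs neither a copy of the data nor power-of-two padding.

```python
def matrix_schedule(xdim, ydim, transpose=False):
    """Yield offsets into a flat row-major vector of ydim rows by xdim columns.

    Hypothetical illustration only: with transpose=True the same storage is
    walked column by column, so no transposed copy is ever made.
    """
    if transpose:
        for x in range(xdim):
            for y in range(ydim):
                yield y * xdim + x
    else:
        for y in range(ydim):
            for x in range(xdim):
                yield y * xdim + x


def matmul_flat(a, b, n, m, p):
    """Multiply an n-by-m matrix by an m-by-p matrix, both stored flat.

    Three index streams (one per operand and one for the result) address
    the flat vectors directly, so dimensions such as 3x5 or 5x7 need no
    padding or unrolling tricks.
    """
    c = [0] * (n * p)
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i * p + j] += a[i * m + k] * b[k * p + j]
    return c


# A 3-row by 4-column matrix stored as one linear vector.
flat = list(range(12))
transposed_view = [flat[i] for i in matrix_schedule(4, 3, transpose=True)]
```

Here `transposed_view` reads the twelve elements column by column while `flat` itself is never rearranged; a hardware Schedule would play the role of the index generator, leaving the arithmetic instruction unchanged.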