From: lkcl
Date: Thu, 7 Jul 2022 10:00:55 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~1304
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=7aebe6389a3a7e38c488b972b838f136bc4ae9a8;p=libreriscv.git

---

diff --git a/openpower/sv/remap.mdwn b/openpower/sv/remap.mdwn
index 57f13f556..0ea1cc15c 100644
--- a/openpower/sv/remap.mdwn
+++ b/openpower/sv/remap.mdwn
@@ -126,9 +126,26 @@ having to perform data Transpose by pushing out through Memory and back,
 or computing Transposition Indices (costly) then copying to another
 Vector (costly).
 
-Matrix REMAP was thus designed to solve these issues by providing
+Matrix REMAP was thus designed to solve these issues by providing Hardware
+Assisted
 "Schedules" that can view what would otherwise be limited to a strictly
-linear Vector as instead being 2D (even 3D) in-place reordered.
+linear Vector as instead being 2D (even 3D) *in-place* reordered.
+With both Transposition and non-power-two being supported the issues
+faced by other ISAs are mitigated.
+
+Limitations of Matrix REMAP are that the Vector Length (VL) is currently
+restricted to 127: up to 127 FMAs may be performed in total (potentially
+127 vec2/3/4 FMAs may be used but this requires additional research).
+Also given that it is in-registers only at present some care has to be
+taken on regfile resource utilisation. However it is perfectly possible
+to utilise Matrix REMAP to perform the three inner-most "kernel" loops of
+the usual 6-level large Matrix Multiply, without the usual difficulties
+associated with SIMD.
+
+Also the `svshape` instruction only provides access to part of the
+Matrix REMAP capability. Rotation and mirroring need to be done by
+programming the SVSHAPE SPRs directly, which can take a lot more
+instructions.
 
 ## FFT/DCT Triple Loop
 
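The idea in the patched paragraph, a "Schedule" that reads a strictly linear Vector as a 2D matrix in-place, can be illustrated with a small software model. The sketch below is only an illustration under stated assumptions and is not the Libre-SOC hardware algorithm: the names `remap_indices`, `matmul_schedule`, `matmul_in_registers` and `MAX_VL` are hypothetical, the row-major layout is assumed, and the 127 limit is simply the figure quoted in the text. It shows that transposition and non-power-of-two dimensions need only a reordered index stream, with no data copied.

```python
MAX_VL = 127  # assumption: the Vector Length limit quoted in the text


def remap_indices(xdim, ydim, transpose=False):
    """Yield offsets into a flat, row-major vector of ydim rows by xdim columns.

    With transpose=True the same storage is walked column by column,
    so the data itself is never moved or copied.
    """
    if transpose:
        for x in range(xdim):
            for y in range(ydim):
                yield y * xdim + x
    else:
        for y in range(ydim):
            for x in range(xdim):
                yield y * xdim + x


def matmul_schedule(rows, inner, cols):
    """Yield (a_idx, b_idx, c_idx) offsets, one triple per multiply-add.

    These are the three inner-most "kernel" loops flattened into a
    single linear sequence of element operations.
    """
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                yield i * inner + k, k * cols + j, i * cols + j


def matmul_in_registers(a, b, rows, inner, cols):
    """Multiply flat matrices a (rows x inner) and b (inner x cols)."""
    total_fmas = rows * inner * cols
    if total_fmas > MAX_VL:
        raise ValueError(f"{total_fmas} FMAs exceeds the limit of {MAX_VL}")
    c = [0] * (rows * cols)
    for a_idx, b_idx, c_idx in matmul_schedule(rows, inner, cols):
        c[c_idx] += a[a_idx] * b[b_idx]  # one fused multiply-add per step
    return c


if __name__ == "__main__":
    flat = [0, 1, 2, 3, 4, 5]  # viewed as 2 rows of 3 columns
    print([flat[i] for i in remap_indices(3, 2, transpose=True)])
    print(matmul_in_registers([1, 2, 3, 4, 5, 6],
                              [7, 8, 9, 10, 11, 12], 2, 3, 2))
```

In this model a 5x5 by 5x5 multiply needs 125 multiply-adds and fits under the limit, whereas 5x5 by 5x6 needs 150 and is rejected, which is the kind of sizing care the Limitations paragraph describes.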