From: lkcl
Date: Sat, 15 Apr 2023 22:30:26 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls009_v1~60
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=7fc74252d07be595d141ed62c0432e5fcaf8246d;p=libreriscv.git

---

diff --git a/openpower/sv/remap.mdwn b/openpower/sv/remap.mdwn
index 1fb8b3edf..2bc8a598f 100644
--- a/openpower/sv/remap.mdwn
+++ b/openpower/sv/remap.mdwn
@@ -38,19 +38,21 @@ that can be structure-packed (LD/ST and Move operations
 being the most common), REMAP may be applied to
 literally any instruction: CRs, Arithmetic, Logical, LD/ST, anything.
 
-When SUBVL is greater than 1 the group of Subvector
-elements are kept together, effectively the group becomes the
+When SUBVL is greater than 1 a given group of Subvector
+elements are kept together: effectively the group becomes the
 element, and the group is REMAPed together.
 Swizzle *can* however be applied to the same instruction as REMAP,
 providing re-sequencing of
-Subvector elements that REMAP cannot. Also as explained in [[sv/mv.swizzle]], [[sv/mv.vec]] and the [[svp64/appendix]], Pack and Unpack EXTRA Mode bits
-can extend down into Sub-vector elements to perform vec2/vec3/vec4
+Subvector elements which REMAP cannot. Also as explained in [[sv/mv.swizzle]], [[sv/mv.vec]] and the [[svp64/appendix]], Pack and Unpack Mode bits
+can extend down into Sub-vector elements to influence vec2/vec3/vec4
 sequential reordering, but even here, REMAP is not extended down to
 the actual sub-vector elements themselves.
 
 In its general form, REMAP is quite expensive to set up, and on some
 implementations may introduce latency, so should realistically be used
 only where it is worthwhile.
+Given that most other ISAs require full loop-unrolling for Matrix,
+DCT and FFT, savings are still anticipated.
 
 Commonly-used patterns such as Matrix Multiply, DCT and FFT have
 helper instruction options which make REMAP easier to use.
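
The SUBVL paragraph touched by this hunk says that, when SUBVL is greater than 1, a group of Subvector elements becomes the element and is REMAPed together. A minimal Python sketch of that rule follows. It is an illustration only, not the specification's pseudocode: the function name `remap_with_subvl`, the flat-list model of the register file, and the example permutation `[2, 0, 1]` are all assumptions made here.

```python
def remap_with_subvl(values, remap_order, subvl):
    """Reorder `values` in groups of `subvl`, keeping each group intact.

    `remap_order[i]` is a hypothetical remapped element index for step i;
    real REMAP schedules are generated by the hardware, not passed as a list.
    """
    out = []
    for element in remap_order:      # REMAP selects whole elements (groups)
        base = element * subvl       # flat offset of the selected group
        for sub in range(subvl):     # sub-elements keep their original order
            out.append(values[base + sub])
    return out


# Three vec2 groups: (x0, y0), (x1, y1), (x2, y2)
values = ["x0", "y0", "x1", "y1", "x2", "y2"]
reordered = remap_with_subvl(values, remap_order=[2, 0, 1], subvl=2)
print(reordered)  # ['x2', 'y2', 'x0', 'y0', 'x1', 'y1']
```

The inner loop never permutes `sub`, which mirrors the statement that REMAP is not extended down to the sub-vector elements themselves. Reordering within a group is the job the text assigns to Swizzle.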