From a6220790a87d0154b87c3830f8a491bd9046c19e Mon Sep 17 00:00:00 2001
From: lkcl <lkcl@web>
Date: Tue, 28 Mar 2023 00:50:40 +0100
Subject: [PATCH]

---
 openpower/sv/rfc/ls009.mdwn | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/openpower/sv/rfc/ls009.mdwn b/openpower/sv/rfc/ls009.mdwn
index 755a80801..022193bf0 100644
--- a/openpower/sv/rfc/ls009.mdwn
+++ b/openpower/sv/rfc/ls009.mdwn
@@ -119,16 +119,21 @@ Vector ISAs which would typically only have a limited set of instructions
 that can be structure-packed (LD/ST typically), REMAP may be applied to
 literally any instruction: CRs, Arithmetic, Logical, LD/ST, anything.
 
-Note that REMAP does not *directly* apply to sub-vector elements: that 
+Note that REMAP does not *directly* apply to sub-vector elements, but
+only to the group as a whole: that
 is what swizzle is for.  Swizzle *can* however be applied to the same
-instruction as REMAP.  As explained in [[sv/mv.swizzle]], [[sv/mv.vec]] and the [[svp64/appendix]], Pack and Unpack EXTRA Mode bits
+instruction as REMAP.  As explained in [[sv/mv.swizzle]]
+and the [[svp64/appendix]], Pack and Unpack EXTRA Mode bits
 can extend down into Sub-vector elements to perform vec2/vec3/vec4
-sequential reordering, but even here, REMAP is not extended down to
-the actual sub-vector elements themselves.
+sequential reordering, but even here, REMAP is not *individually*
+extended down to the actual sub-vector elements themselves.
 
 In its general form, REMAP is quite expensive to set up, and on some
 implementations may introduce
 latency, so should realistically be used only where it is worthwhile.
+Even with that latency, a single instruction can request that up to
+127 operations be issued, so it should be clear that REMAP should not
+be dismissed on the grounds of *potential* latency alone.
 Commonly-used patterns such as Matrix Multiply, DCT and FFT have
 helper instruction options which make REMAP easier to use.
 
-- 
2.30.2