From c3458afae8d76c29d4858081cf822bd54c38d64c Mon Sep 17 00:00:00 2001
From: lkcl <lkcl@web>
Date: Sat, 30 Apr 2022 15:32:45 +0100
Subject: [PATCH]

---
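Note on the `sv.madded` paragraph added in the second hunk: the register
numbering it describes can be sketched as below. This is a minimal model,
not the specification's pseudocode. The helper names are hypothetical, and
it assumes that element i of a Vectorised operation steps the destination
register number by i.

```python
# Hypothetical helpers: model only the destination register *numbers*
# that each element of a Vectorised madded would write to.

def paired_dests(rt, vl):
    """Hard-coded RT / RT+1 pairing, applied per element (assumed stepping)."""
    return [(rt + i, rt + i + 1) for i in range(vl)]

def maxvl_dests(rt, vl, maxvl):
    """Proposed scheme: first half in RT+i, second half in RT+MAXVL+i."""
    return [(rt + i, rt + maxvl + i) for i in range(vl)]

def has_overwrite(dests):
    """True if any register number is written by more than one element/half."""
    flat = [reg for pair in dests for reg in pair]
    return len(flat) != len(set(flat))

# With RT=8 and VL=MAXVL=4:
#   paired scheme -> (8,9) (9,10) (10,11) (11,12): r9, r10, r11 collide
#   MAXVL scheme  -> (8,12) (9,13) (10,14) (11,15): no collisions
```

In this model the MAXVL scheme stays collision-free as long as VL does not
exceed MAXVL.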
 openpower/sv/svp64/appendix.mdwn | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/openpower/sv/svp64/appendix.mdwn b/openpower/sv/svp64/appendix.mdwn
index 06e2a4aa8..9ec5bed07 100644
--- a/openpower/sv/svp64/appendix.mdwn
+++ b/openpower/sv/svp64/appendix.mdwn
@@ -66,8 +66,7 @@ may be performed by setting VL=8, and a one-instruction
 SV is primarily designed for use as an efficient hybrid 3D GPU / VPU /
 CPU ISA.
 
-As mentioned above, OE=1 is not applicable in SV, freeing this bit for
-alternative uses.  Additionally, Vectorisation of the VSX SIMD system
+Vectorisation of the VSX Packed SIMD system
 likewise makes no sense whatsoever. SV *replaces* VSX and provides,
 at the very minimum, predication (which VSX was designed without).
 Thus all VSX Major Opcodes - all of them - are "unused" and must raise
@@ -964,9 +963,15 @@ being only 32 bit, 5 operands is quite an ask.  `lq` however sets
 a precedent: `RTp` stands for "RT pair".  In other words the result
 is stored in RT and RT+1.  For Scalar operations, following this
 precedent is perfectly reasonable.  In Scalar mode,
-`umadded` therefore stores the two halves of the 128-bit multiply
+`madded` therefore stores the two halves of the 128-bit multiply
 into RT and RT+1.
 
+What, then, of `sv.madded`? If the destination is hard-coded to
+RT and RT+1, the instruction is not useful when Vectorised, because
+the value written to RT+1 is overwritten by the next element.
+The solution is simple: define the destination registers as RT and
+RT+MAXVL respectively.
+
 
 * [[isa/svfixedarith]]
 * [[isa/svfparith]]
-- 
2.30.2