From: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date: Sat, 18 Jun 2022 14:19:26 +0000 (+0100)
Subject: summary clarification
X-Git-Tag: opf_rfc_ls005_v1~1705
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=9b2519b08f3b3cd3aa7a330c5a5969de1533ca08;p=libreriscv.git

summary clarification
---

diff --git a/svp64-primer/summary.tex b/svp64-primer/summary.tex
index 48dacea03..a2fc94f0c 100644
--- a/svp64-primer/summary.tex
+++ b/svp64-primer/summary.tex
@@ -4,16 +4,18 @@ ONLY uses scalar instructions.
 
 \begin{itemize}
 \item The Power ISA v3.1 Specification is not altered in any way.
+  v3.1 Code-compatibility is guaranteed.
+\item Does not require sacrificing 32-bit Major Opcodes.
 \item Specifically designed to be easily implemented
   on top of an existing Micro-architecture (especially
   Superscalar Out-of-Order Multi-issue) without
   disruptive full architectural redesigns.
 \item Divided into Compliancy Levels to suit differing needs.
-\item At the highest Compliancy Level only requires four instructions
+\item At the highest Compliancy Level requires only five instructions
   (SVE2 requires appx 9,000. AVX-512 around 10,000. RVV around
   300).
-\item Predication, an often-requested feature, is added cleanly to the
-  Power ISA (without modifying the v3.1 Power ISA)
+\item Predication, an often-requested feature, is added cleanly
+  (without modifying the v3.1 Power ISA)
 \item In-registers arbitrary-sized Matrix Multiply is achieved in three
   instructions (without adding any v3.1 Power ISA instructions)
 \item Full DCT and FFT RADIX2 Triple-loops are achieved with dramatically
@@ -21,7 +23,7 @@ ONLY uses scalar instructions.
   reduce. Normally found only in high-end VLIW DSPs (TI MSP, Qualcomm
   Hexagon)
 \item Fail-First Load/Store allows strncpy to be implemented in around 14
-  instructions (Optimised VSX assembler is 240).
+  instructions (hand-optimised VSX assembler is 240).
 \item Inner loop of MP3 implemented in under 100 instructions
   (gcc produces 450 for the same function)
 \end{itemize}
@@ -179,6 +181,6 @@ SIMD implementations by:
 -for loop, increment registers RT, RA, RB
 -few instructions, easier to implement and maintain
 -example assembly code
--ARM has already started to add to libC SVE2 support 
+-ARM has already started to add to libC SVE2 support
 
 1970 x86 comparison