From fc5320595066024a8c7188a525aeae823f8a173d Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date: Mon, 16 Apr 2018 01:47:40 +0100
Subject: [PATCH] add comparison section

---
 simple_v_extension.mdwn | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/simple_v_extension.mdwn b/simple_v_extension.mdwn
index f851675fb..a2e6c3eed 100644
--- a/simple_v_extension.mdwn
+++ b/simple_v_extension.mdwn
@@ -1054,17 +1054,20 @@ SIMD is yet to be explicitly incorporated into this section.
 [[alt_rvp]]
 
 * plus: the simplicity of the lanes (combined with the regularity of
-  allocating identical opcodes multiple independent registers)
+  allocating identical opcodes multiple independent registers) meaning
+  that SRAM or 2R1W can be used for the entire regfile (potentially).
 * minus: a more complex instruction set where the parallelism is much
   more explicitly directly specified in the instruction and
 * minus: if you *don't* have an explicit instruction (opcode) and you
-  need one, the only place it can be added is... in the vector unit
+  need one, the only place it can be added is... in the vector unit and
+* minus: opcode functions (and associated ALUs) duplicated in Alt-RVP are
+  not usable or accessible in other Extensions.
 * plus-and-minus: Lanes may be utilised for high-speed context-switching
   but with the down-side that they're an all-or-nothing part of the Extension.
   No Alt-RVP: no fast register-bank switching.
 * plus: Lane-switching would mean that complex operations not suited to
   parallelisation can be carried out, followed by further parallel Lane-based
-  work
+  work, without moving register contents down to memory (and back)
 * minus: Access to registers across multiple lanes is challenging. "Solution"
   is to drop data into memory and immediately back in again (like MMX).
 
@@ -1081,7 +1084,7 @@ Simple-V
   operations not suited to parallelisation may be carried out interleaved
   between parallelised instructions *without* requiring data to be dropped
   down to memory and back (into a separate vectorised register engine).
-* plus-and-minus: re-use of integer and floating-point 32-wide register
+* plus-and-maybe-minus: re-use of integer and floating-point 32-wide register
   files means that huge parallel workloads would use up considerable
   chunks of the register file.  However in the case of RV64 and 32-bit
   operations, that effectively means 64 slots are available for parallel
@@ -1092,7 +1095,7 @@ RVV (as it stands, Draft 0.4 Section 17, RISC-V ISA V2.3-Draft)
 * plus: regular predictable workload means effects on L1/L2 Cache can
   be streamlined.
 * plus: regular and clear parallel workload also means that lanes
-  (similar to Alt-RVP) may be used as an implementation details,
+  (similar to Alt-RVP) may be used as an implementation detail,
   using either SRAM or 2R1W registers.
 * plus: separate engine with no impact on the rest of an implementation
 * minus: separate *complex* engine with no RTL (ALUs, Pipeline stages) reuse
-- 
2.30.2