From 4e7475e283fedd1013d29f1a8848c9f152f1f890 Mon Sep 17 00:00:00 2001
From: lkcl <lkcl@web>
Date: Sun, 8 May 2022 13:36:24 +0100
Subject: [PATCH]

---
 openpower/sv/SimpleV_rationale.mdwn | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index 1422cbbbd..36e20150b 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -804,6 +804,16 @@ Intel, ARM, MIPS, Power ISA and RISC-V have all already said "yes" on that,
 for several decades, and advanced programmers are comfortable with the
 practice.
 
+A further question is whether OpenCAPI, or its use in this particular
+scenario, requires that the PEs, even quite basic ones, implement a
+full RADIX MMU and associated TLB lookup. To ensure that programs may
+be cleanly and seamlessly transferred between PEs and CPU the answer
+is quite likely to be "yes", which is interesting in and of itself.
+Fortunately, the associated L1 Cache with TLB Translation does not
+have to be large, and the actual RADIX Tree Walk need not be done
+by the PEs themselves: it can be handled by the main CPU as a
+software-extension.
+
 **Use-case: Matrix and Convolutions**
 
 Imagine a large Matrix scenario, with several values close to zero that
@@ -838,6 +848,22 @@ main CPU.  In this way a large Sparse Matrix Multiply or Convolution
 may be achieved without having to pass unnecessary data through
 L1/L2/L3 Caches only to find, at the CPU, that it is zero.
 
+**Use-case: More powerful PEs in-memory**
+
+An obvious variant of the above is that, if there is inherently
+more parallelism in the data set, then the PEs get their own
+Multiply-and-Accumulate instruction and, rather than send the
+data to the CPU over OpenCAPI, perform the Matrix-Multiply
+directly themselves.
+
+However the source code and binary would be near-identical, if
+not identical in every respect, with the PEs implementing the full
+ZOLC capability in order to compact binary size to the bare minimum.
+
+One key strategic question does remain: do the PEs need to have
+a RADIX MMU and associated TLB-aware minimal L1 Cache, in order
+to support OpenCAPI properly?
+
 **Roadmap summary of Advanced SVP64**
 
 The future direction for SVP64, then, is:
-- 
2.30.2