From 4e7475e283fedd1013d29f1a8848c9f152f1f890 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 8 May 2022 13:36:24 +0100
Subject: [PATCH]

---
 openpower/sv/SimpleV_rationale.mdwn | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index 1422cbbbd..36e20150b 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -804,6 +804,16 @@ Intel, ARM, MIPS, Power ISA and RISC-V have all already said "yes" on
 that, for several decades, and advanced programmers are comfortable with
 the practice.
 
+Additional questions remain as to whether OpenCAPI or its use for this
+particular scenario requires that the PEs, even quite basic ones,
+implement a full RADIX MMU, and associated TLB lookup? In order to ensure
+that programs may be cleanly and seamlessly transferred between PEs
+and CPU the answer is quite likely to be "yes", which is interesting
+in and of itself. Fortunately, the associated L1 Cache with TLB
+Translation does not have to be large, and the actual RADIX Tree Walk
+need not explicitly be done by the PEs, it can be handled by the main
+CPU as a software-extension.
+
 **Use-case: Matrix and Convolutions**
 
 Imagine a large Matrix scenario, with several values close to zero that
@@ -838,6 +848,22 @@ main CPU. In this way a large Sparse Matrix Multiply or Convolution may be
 achieved without having to pass unnecessary data through L1/L2/L3 Caches
 only to find, at the CPU, that it is zero.
 
+**Use-case: More powerful PEs in-memory**
+
+An obvious variant of the above is that, if there is inherently
+more parallelism in the data set, then the PEs get their own
+Multiply-and-Accumulate instruction, and rather than send the
+data to the CPU over OpenCAPI, perform the Matrix-Multiply
+directly themselves.
+
+However the source code and binary would be near-identical if
+not identical in every respect, and the PEs implementing the full
+ZOLC capability in order to compact binary size to the bare minimum.
+
+One key strategic question does remain: do the PEs need to have
+a RADIX MMU and associated TLB-aware minimal L1 Cache, in order
+to support OpenCAPI properly?
+
 **Roadmap summary of Advanced SVP64**
 
 The future direction for SVP64, then, is:
-- 
2.30.2