From 18f362618c75f7b0ec093569e992dd7a75a1607a Mon Sep 17 00:00:00 2001
From: lkcl
Date: Mon, 9 May 2022 19:08:44 +0100
Subject: [PATCH]

---
 openpower/sv/SimpleV_rationale.mdwn | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index 2beed6557..fa24e2dc8 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -916,10 +916,23 @@ and how that helps
 Register Hazards and SIMD amortisation on a GB-OoO Micro-architecture)
 *
-Draft Image:
+Draft Image (placeholder):
+The program being executed is a simple loop with a conditional
+test that skips the multiply if the input is zero.
+
+* In the CPU-only case (top) the data goes through L1/L2
+  Cache before reaching the CPU.
+* However, the PE version does not send zero-data to the CPU,
+  and even when it does it goes into a Coherent FIFO: no real
+  compelling need to enter L1/L2 Cache or even the CPU Register
+  File (one of the key reasons why Snitch saves so much power).
+* In the PE-only version (see next use-case) the CPU is mostly
+  idle, serving RADIX MMU TLB requests for PEs, and OpenCAPI
+  requests.
+
 **Use-case variant: More powerful in-memory PEs**
 An obvious variant of the above is that, if there is inherently
-- 
2.30.2