From a671893b57905a63afeea6b56b808112e2068c33 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Fri, 6 May 2022 12:00:50 +0100
Subject: [PATCH]

---
 openpower/sv/SimpleV_rationale.mdwn | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index f3de0ae62..275cb0f1c 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -455,7 +455,7 @@ to move (distribute) processing closer to the DRAM Memory, firmly
 on the *opposite* side of the main CPU's L1/2/3/4 Caches. However
 the alarm bells ring here at the keyword "distributed", because by
 moving the processing down next to the Memory, the speed of any
-of the parallel Processing Elements has dropped
+of the parallel Processing Elements (PEs) has dropped
 by almost two orders of magnitude, the simplicity has for pure
 pragmatic reasons to drop by several orders of magnitude.
 Things that the average "sequential algorithm"
@@ -465,4 +465,14 @@ spinlocks (atomic locking), all of these are either outright gone
 or expected that the programmer shall explicitly contend with
 (even if that programmer is the Compiler Developer).
 
-
+To give an extreme example: Aspex's Array-String Processor, which
+was 4096 2-bit SIMD PEs each with 256 bytes of Content Addressable
+Memory was capable of literally a hundred-fold improvement in
+performance over Scalar CPUs such as the Pentium III of its era,
+all on a 3.5 watt budget at only 250 mhz in 130 nm. Yet to take
+proper advantage of its capability required an astounding 5-10
+*days* per line of assembly code. 20 lines of optimised
+Assembler taking six months to write can in no way be termed
+"productive".
+
+**In short, we are in "Programmer's nightmare" territory**
-- 
2.30.2