From: lkcl
Date: Fri, 6 May 2022 17:09:47 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~2364
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=94c4c0f2a3269e6c29cd220113e13be52bccb987;p=libreriscv.git

---

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index a387fdf55..5bcde068c 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -658,15 +658,25 @@ will be looking out for is a Basic Block of instructions that:
 * contains some instructions that a given PE is capable of executing
 * ends with a STORE (again: OpenCAPI)
 
-For best results that would be wrapped with a Zero-Overhead Loop, where
+For best results that would be wrapped with a Zero-Overhead Loop
+(which is offloaded - in full - down to the PE), where
 the Compiler (or hardware at runtime) could easily identify, in advance,
 the full range of Memory Addresses that the Loop is to encounter. Copies
 of loop-invariant data would need to be passed down to the remote PE:
 again, for simple-enough Basic Blocks, with assistance from the Compiler,
-loop-invariant inputs are easily identified.
+loop-invariant inputs are easily identified. Parallel Processing
+opportunities should also be easy enough to create, simply by farming out
+different parts of a given Deterministic Zero-Overhead Loop to
+different PEs based on their proximity, bandwidth or ease of access to
+given Memory.
 
 The importance of OpenCAPI in this mix cannot be underestimated, because
 it will be the means by which the main CPU coordinates its activities
 with the remote PEs, ensuring that LOAD/STORE Memory Hazards are not
-violated.
+violated. It should also be straightforward to ensure that the offloading
+is entirely transparent to the developer, in fact this is a hard requirement
+because at any given moment there is the possibility that the PEs may be
+busy and it is the main CPU that has to complete the Processing Task itself.
+
+It