From: lkcl
Date: Thu, 12 May 2022 11:49:16 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~2262
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=1c361a3eb28daff34647be0c662493957044c307;p=libreriscv.git

---

diff --git a/openpower/sv/SimpleV_rationale.mdwn b/openpower/sv/SimpleV_rationale.mdwn
index 7b937d750..2e9f78e8e 100644
--- a/openpower/sv/SimpleV_rationale.mdwn
+++ b/openpower/sv/SimpleV_rationale.mdwn
@@ -987,7 +987,7 @@ to the average high-end CPU.
 
 The informed reader will have noted the remarkable similarity between
 how a CPU communicates with a GPU to schedule tasks, and the proposed
-architecture. CPUs schedule tasks as follows:
+architecture. CPUs schedule tasks with GPUs as follows:
 
 * User-space program encounters an OpenGL function, in the
   CPU's ISA.
@@ -995,7 +995,23 @@ architecture. CPUs schedule tasks as follows:
   Shader Binary written in the GPU's ISA.
 * GPU Driver wishes to transfer both the data and the Shader Binary
   to the GPU. Both may only do so via Shared Memory, usually
-  DMA over PCIe.
+  DMA over PCIe (assuming a PCIe Graphics Card).
+* GPU Driver, which has been running in CPU userspace, notifies CPU
+  kernelspace of the desire to transfer data and GPU Shader Binary
+  to the GPU. A context-switch occurs...
+
+It is almost unfair to burden the reader with further details.
+The extraordinarily convoluted procedure is as bad as it sounds. Hundreds
+of thousands of tasks per second are scheduled this way, with hundreds
+of megabytes of data per second being exchanged as well.
+
+Yet the process is not that different from how things would work
+with the proposed microarchitecture: the differences, however, are key.
+
+* Both PEs and CPU run the exact same ISA. A major complexity of 3D GPU
+  and CUDA workloads (JIT compilation etc.) is eliminated, and, crucially,
+  the CPU may execute the PE's tasks, if needed.
+
 **Roadmap summary of Advanced SVP64**