+**Comparison of PE-CPU to GPU-CPU interaction**
+
+The informed reader will have noted the remarkable similarity between
+the proposed architecture and the way a CPU communicates with a GPU to
+schedule tasks. CPUs schedule tasks with GPUs as follows:
+
+* A user-space program calls an OpenGL function, executing in the
+ CPU's ISA.
+* The proprietary GPU Driver, still running in the CPU's ISA, prepares a
+ Shader Binary written in the GPU's ISA.
+* The GPU Driver wishes to transfer both the data and the Shader Binary
+ to the GPU. This may only be done via Shared Memory, usually
+ DMA over PCIe (assuming a PCIe Graphics Card).
+* The GPU Driver, which has been running in CPU userspace, notifies CPU
+ kernelspace of the desire to transfer the data and GPU Shader Binary
+ to the GPU. A context-switch occurs...
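+
+The submission path above can be sketched in heavily-simplified
+pseudocode. Every name here is hypothetical and for illustration only:
+no real driver, kernel, or OpenGL API is implied.
+
+```
+// hypothetical sketch of the CPU-side GPU submission path
+shader_src = lookup_shader(opengl_call)        // CPU ISA, userspace
+shader_bin = jit_compile(shader_src, GPU_ISA)  // driver JIT: GPU ISA output
+dma_buf    = map_shared_memory(PCIE_REGION)    // Shared Memory over PCIe
+copy_into(dma_buf, vertex_data, shader_bin)
+notify_kernel(gpu_fd, SUBMIT, dma_buf)         // userspace -> kernelspace:
+                                               // context-switch, then DMA
+```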
+
+It is almost unfair to burden the reader with further details.
+The extraordinarily convoluted procedure is as bad as it sounds. Hundreds
+of thousands of tasks per second are scheduled this way, with hundreds
+of megabytes of data per second being exchanged as well.
+
+Yet the process is not that different from how things would work
+with the proposed microarchitecture: the differences, however, are key.
+
+* Both the PEs and the CPU run the exact same ISA. A major source of
+ complexity in 3D GPU and CUDA workloads (JIT compilation etc.) is
+ eliminated, and, crucially, the CPU may directly execute the PE's
+ tasks if needed. This is simply not even remotely possible on GPU
+ Architectures.
+* Where GPU Drivers use PCIe Shared Memory, the proposed architecture
+ deploys OpenCAPI.
+* Where GPUs are a foreign architecture with a foreign ISA, the proposed
+ architecture only narrowly misses being defined as big/LITTLE Symmetric
+ Multi-Processing (SMP), by virtue of the massively-parallel PEs
+ being a bit light on L1 Cache in favour of large ALUs and proximity
+ to Memory, and requiring a modest amount of "helper" assistance with
+ their Virtual Memory Management.
+* The proposed architecture has the markup points embedded into the
+ binary programs where PEs may take over from the CPU, and there is
+ accompanying (planned) hardware-level assistance at the ISA level.
+ GPUs, which have to work with a wide range of commodity CPUs, cannot
+ in any way expect ARM or Intel to add support for GPU Task Scheduling
+ directly into the ARM or x86 ISAs!
+
+On this last point it is crucial to note that SVP64 drew its initial
+inspiration from a Hybrid CPU-GPU-VPU paradigm (such as ICubeCorp's
+IC3128) and consequently has a versatility that the separate
+specialisations of both GPU and CPU architectures lack.
+