* Both PEs and CPU run the exact same ISA. A major complexity of 3D GPU
and CUDA workloads (JIT compilation etc) is eliminated, and, crucially,
- the CPU may execute the PE's tasks, if needed.
+ the CPU may directly execute the PE's tasks, if needed. This is
+ simply not possible on GPU Architectures.
+* Where GPU Drivers use PCIe Shared Memory, the proposed architecture
+ deploys OpenCAPI.
+* Where GPUs are a foreign architecture and a foreign ISA, the proposed
+ architecture only narrowly misses being defined as big.LITTLE Symmetric
+ Multi-Processing (SMP) by virtue of the massively-parallel PEs
+ being a bit light on L1 Cache, in favour of large ALUs and proximity
+ to Memory, and requiring a modest amount of "helper" assistance with
+ their Virtual Memory Management.
**Roadmap summary of Advanced SVP64**