AMD, MIPS, Sun Microsystems, SGI, Cray, and many more. (*Hand-crafted
assembler and direct use of intrinsics is the Industry-standard norm
to achieve high-performance optimisation where it matters*).
-Rather: GPUs
-have ultra-specialist compilers (CUDA) that are designed from the ground up
+GPUs fill this void in both hardware and software terms by having
+ultra-specialist compilers (CUDA) that are designed from the ground up
to support Vector/SIMD parallelism, and associated standards
(SPIR-V, Vulkan, OpenCL) managed by
the Khronos Group, with multi-man-century development commitment from
this task, and what, in Computer Science, actually needs solving?
First hints are that whilst memory bitcells have not increased in speed
-since the 90s (around 150 mhz), increasing the bank width and
+since the 90s (around 150 MHz), increasing the bank width, striping, and
datapath widths and speeds to the same has allowed
significant apparent speed increases: 3200 MHz DDR4 and even faster DDR5,
and other advanced Memory interfaces such as HBM, Gen-Z, and OpenCAPI,