bare-bones microkernel
would be viable, or a Management Core closer to the PEs (on the same
die or Multi-Chip-Module as the PEs) would allow better bandwidth and
-reduce Management Overhead on the main CPUs.
+reduce Management Overhead on the main CPUs. However, once established
+and running, the same level of power saving as Snitch (1/6th) and
+the same sort of reduction in algorithm runtime (20 to 80%) are not
+unreasonable, and are compelling enough to warrant in-depth investigation.
**Use-case: Matrix and Convolutions**