moving to next **element**. Currently managed by `svstep`,
ZOLC may be deployed to manage the stepping, in a Deterministic manner.
+Second:
+The SVP64 Draft Matrix Multiply currently arranges a Schedule of
+Multiply-and-Accumulates, suitable for pipelining, that ultimately
+results in a Matrix Multiply. Normal processors have to perform
+"loop-unrolling" in order to achieve this same Schedule. SIMD
+processors are additionally forced to pre-arrange rotated copies of
+the data if the Matrices are not exactly on a power-of-two boundary.
+
+The current limitation of SVP64, however, is that both source and
+destination Matrices have to be in registers, in full. This applies
+at least when Horizontal-First is deployed, which is the mode that
+needs the fewest instructions. Vertical-First may be used to perform
+a LD/ST within the loop, covered by `svstep`, but it is still not
+ideal. This is where the Snitch and EXTRA-V concepts kick in.
+
+<img src="https://ftp.libre-soc.org/matrix_svremap.jpg" width=600 />
+
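As a rough illustration of the Schedule of Multiply-and-Accumulates described above, the sketch below flattens a Matrix Multiply into a single stream of index triples and then executes that stream with one plain loop, without unrolling and without padding to a power of two. The names `mac_schedule` and `matmul_flat` are hypothetical, and the ordering (inner dimension outermost, so that consecutive operations target different accumulators) is an assumption chosen for illustration: it is not claimed to be the ordering that SVP64 REMAP actually produces.

```python
def mac_schedule(rows, inner, cols):
    """Yield (c_idx, a_idx, b_idx) flat-index triples, one per multiply-accumulate.

    Assumption: k is the outermost loop, so back-to-back operations never
    write the same result element (friendlier to a pipelined MAC unit).
    """
    for k in range(inner):
        for i in range(rows):
            for j in range(cols):
                yield (i * cols + j, i * inner + k, k * cols + j)


def matmul_flat(a, b, rows, inner, cols):
    """Multiply row-major flat matrices a (rows x inner) and b (inner x cols)."""
    c = [0] * (rows * cols)
    # A single non-nested loop consumes the Schedule, one MAC per step.
    for c_idx, a_idx, b_idx in mac_schedule(rows, inner, cols):
        c[c_idx] += a[a_idx] * b[b_idx]
    return c


# 2x3 times 3x2: dimensions that are not powers of two need no special handling.
a = [1, 2, 3,
     4, 5, 6]
b = [7, 8,
     9, 10,
     11, 12]
result = matmul_flat(a, b, 2, 3, 2)
print(result)  # [58, 64, 139, 154]
```

Note that in this sketch both inputs and the output live entirely in flat arrays, which mirrors the in-registers limitation mentioned above.
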
Imagine a large Matrix scenario, with several values close to zero that
could be skipped: there is no need to include zero-multiplications, but a
traditional CPU cannot help in any way: only by loading the data through