## svshape
+The `svshape` instruction is a convenient way to access the SHAPE Special
+Purpose Registers (SPRs), which were added alongside the SVP64 looping
+system to support complex element indexing. Without the SHAPE SPRs, only the
+most basic, consecutive indexing of register elements (0, 1, 2, 3, ...) would
+be possible.
+
+### SHAPE Remapping SPRs
+
+* See [[openpower/sv/remap]] for the full breakdown of SPRs SHAPE0-3.
+
+For Matrix Multiply, SHAPE0 SPR is used:
+
+|0:5 |6:11 | 12:17 | 18:20 | 21:23 |24:27 |28:29 |30:31|
+|----- |----- | ------- | ------- | ------ |------|------ |---- |
+|xdimsz|ydimsz| zdimsz | permute | invxyz |offset|skip |mode |
+
+
+skip:
+
+- 0b00 - no dimensions skipped
+- 0b01 - skip '1st dim'
+- 0b10 - skip '2nd dim'
+- 0b11 - skip '3rd dim'
+
+invxyz (3 bits: one each for x, y and z):
+
+- If the corresponding dimension's bit is zero, indexing starts at zero and
+increments.
+- If the bit is set, indexing starts from xdimsz-1 (the x dimension size, or
+that of whichever dimension the bit controls) and decrements down to zero.
+
+offset shifts the result by `offset` elements (important when element width
+overrides are used).
+
+xdimsz, ydimsz and zdimsz are stored offset by 1, such that the encoded values
+0-0b111111 correspond to sizes 1-64. For example, xdimsz=2 indicates that the
+first dimension of the array has 3 elements.
+
+With the example Matrix X (2 rows, 3 columns, or 2x3 matrix), xdimsz=1,
+ydimsz=2, zdimsz=0.
+
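As an illustration of the field layout and the offset-by-1 size encoding described above, the following Python sketch packs and unpacks the fields listed in the table. It assumes a 32-bit register with bit 0 as the most significant bit (the usual Power ISA numbering). The table itself does not state the numbering, so treat that as an assumption. The helper names `pack_shape`, `unpack_shape` and `dim_sizes` are hypothetical and exist only for this example.

```python
# Field positions copied from the SHAPE0 table: name -> (first bit, last bit).
# ASSUMPTION: 32-bit register, MSB0 numbering (bit 0 is the most significant).
FIELDS = {
    "xdimsz": (0, 5),
    "ydimsz": (6, 11),
    "zdimsz": (12, 17),
    "permute": (18, 20),
    "invxyz": (21, 23),
    "offset": (24, 27),
    "skip": (28, 29),
    "mode": (30, 31),
}
WIDTH = 32


def pack_shape(**values):
    """Build a register value from named fields; unnamed fields stay zero."""
    word = 0
    for name, value in values.items():
        start, end = FIELDS[name]
        nbits = end - start + 1
        if not 0 <= value < (1 << nbits):
            raise ValueError(f"{name}={value} does not fit in {nbits} bits")
        # In MSB0 numbering the field's last bit sits (WIDTH-1-end) above bit 31.
        word |= value << (WIDTH - 1 - end)
    return word


def unpack_shape(word):
    """Split a register value back into a dict of named fields."""
    fields = {}
    for name, (start, end) in FIELDS.items():
        nbits = end - start + 1
        fields[name] = (word >> (WIDTH - 1 - end)) & ((1 << nbits) - 1)
    return fields


def dim_sizes(word):
    """Return the actual (x, y, z) sizes: each stored field is size minus one."""
    f = unpack_shape(word)
    return (f["xdimsz"] + 1, f["ydimsz"] + 1, f["zdimsz"] + 1)


# Matrix X from the text: xdimsz=1, ydimsz=2, zdimsz=0.
matrix_x = pack_shape(xdimsz=1, ydimsz=2, zdimsz=0)
print(dim_sizes(matrix_x))  # (2, 3, 1)
```

If the register turns out to use LSB0 numbering instead, only the shift expression `WIDTH - 1 - end` would need to change.
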
+permute setting:
+
+| permute | order | array format |
+| ------- | ----- | ------------------------ |
+| 000 | 0,1,2 | (xdim+1)(ydim+1)(zdim+1) |
+| 001 | 0,2,1 | (xdim+1)(zdim+1)(ydim+1) |
+| 010 | 1,0,2 | (ydim+1)(xdim+1)(zdim+1) |
+| 011 | 1,2,0 | (ydim+1)(zdim+1)(xdim+1) |
+| 100 | 2,0,1 | (zdim+1)(xdim+1)(ydim+1) |
+| 101 | 2,1,0 | (zdim+1)(ydim+1)(xdim+1) |
+| 110 | 0,1 | Indexed (xdim+1)(ydim+1) |
+| 111 | 1,0 | Indexed (ydim+1)(xdim+1) |
+
+Permute re-arranges the order of the nested for-loops used to iterate over the
+three dimensions. This allows for in-place transpose, in-place rotate, matrix
+multiply and convolutions, without restricting matrices to power-of-two sizes.
+
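The combined effect of permute, invxyz and offset can be sketched as three nested loops that produce a flat element index. This is a simplified model, not the normative algorithm, which is defined on the remap page. It assumes that the `order` column lists dimensions from fastest-varying to slowest-varying in the flat index. It takes actual dimension sizes rather than the encoded fields, and passes invxyz as a tuple of per-dimension flags, because the bit-to-dimension mapping is not given here. The Indexed modes (0b110, 0b111) and the skip field are left out. The function name `remap_indices` is hypothetical.

```python
# Orders copied from the permute table (0 = x, 1 = y, 2 = z).
PERMUTE_ORDER = {
    0b000: (0, 1, 2),
    0b001: (0, 2, 1),
    0b010: (1, 0, 2),
    0b011: (1, 2, 0),
    0b100: (2, 0, 1),
    0b101: (2, 1, 0),
}


def remap_indices(dims, permute=0b000, invxyz=(0, 0, 0), offset=0):
    """Return the flat indices produced by one pass over all three dimensions.

    dims   -- actual (x, y, z) sizes, i.e. the encoded fields plus one
    invxyz -- (x, y, z) flags; a set flag counts that dimension downwards
    ASSUMPTION: the first dimension named in the order is the fastest-varying.
    """
    ranges = [list(range(n)) for n in dims]
    for d in range(3):
        if invxyz[d]:
            ranges[d].reverse()
    order = PERMUTE_ORDER[permute]
    result = []
    for z in ranges[2]:
        for y in ranges[1]:
            for x in ranges[0]:
                pos = (x, y, z)
                index, mult = 0, 1
                for d in order:
                    index += pos[d] * mult  # place this dimension
                    mult *= dims[d]         # stride for the next one
                result.append(index + offset)
    return result


dims = (2, 3, 1)  # sizes for xdimsz=1, ydimsz=2, zdimsz=0
print(remap_indices(dims))                    # [0, 1, 2, 3, 4, 5]
print(remap_indices(dims, permute=0b010))     # [0, 3, 1, 4, 2, 5]
print(remap_indices(dims, invxyz=(1, 0, 0)))  # [1, 0, 3, 2, 5, 4]
```

With permute=0b010 the same six elements are visited in transposed order, which is the kind of reordering that makes in-place transpose possible without moving data first.
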
+Current limitations of Matrix REMAP:
+
+- Vector Length (VL) is limited to 127, so at most 127 Multiply-Accumulate
+(MAC) or other operations may be performed in total.
+For matrix multiply, this means both operand matrices and the result matrix
+can have no more than 127 elements in total.
+(Larger matrices can be split into tiles to work around this limit, which is
+out of scope for this document.)
+- The `svshape` instruction only provides part of the Matrix REMAP capability.
+For rotation and mirroring, the SVSHAPE SPRs must be programmed directly (thus
+requiring more assembler instructions). Future revisions of SVP64 will
+provide more comprehensive capability, reducing the need to write to the
+SVSHAPE SPRs directly.
+
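The VL limit in the first bullet above can be checked before setting up a multiply. The sketch below reads the limit as a bound on the total number of MAC operations, which for an (m x k) by (k x n) multiply is m*k*n. That reading is an assumption, and the helper names `mac_count` and `fits_in_one_pass` are hypothetical.

```python
MAX_VL = 127  # VL limit quoted in the text


def mac_count(rows_a, cols_a, cols_b):
    """MACs needed to multiply a rows_a x cols_a matrix by a cols_a x cols_b one."""
    return rows_a * cols_a * cols_b


def fits_in_one_pass(rows_a, cols_a, cols_b, max_vl=MAX_VL):
    """True if the whole multiply stays within the VL limit (assumed reading)."""
    return mac_count(rows_a, cols_a, cols_b) <= max_vl


print(mac_count(2, 3, 2))         # 12
print(fits_in_one_pass(2, 3, 2))  # True
print(fits_in_one_pass(5, 5, 6))  # False (150 MACs)
```

A multiply that fails this check would need to be split into tiles, as the first bullet notes.
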
+Returning to the assembler instruction used to set up the shape for matrix
+multiply:
+
+ svshape 2, 2, 3, 0, 0
+
+breakdown:
+
+- SVxd=2, SVyd=2, SVzd=3
+- SVRM=0 (Matrix mode, uses SHAPE0 SPR)
+-
## SVREMAP