From: Andrey Miroshnikov
Date: Wed, 25 Oct 2023 13:15:28 +0000 (+0000)
Subject: Adding section SHAPE, still drafting
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=a7204d79a1116352cff832365030b21eba55e711;p=libreriscv.git

Adding section SHAPE, still drafting
---

diff --git a/openpower/sv/cookbook/remap_matrix.mdwn b/openpower/sv/cookbook/remap_matrix.mdwn
index 0a337bb5f..e321914c9 100644
--- a/openpower/sv/cookbook/remap_matrix.mdwn
+++ b/openpower/sv/cookbook/remap_matrix.mdwn
@@ -162,7 +162,86 @@ ISACaller)*
 ## svshape
+The `svshape` instruction is a convenient way to access the SHAPE Special
+Purpose Registers (SPRs), which were added alongside the SVP64 looping
+system for complex element indexing. Without the SHAPE SPRs, only the most
+basic, consecutive indexing of register elements (0,1,2,3...) would
+be possible.
+### SHAPE Remapping SPRs
+
+* See [[openpower/sv/remap]] for the full breakdown of SPRs SHAPE0-3.
+
+For Matrix Multiply, SHAPE0 SPR is used:
+
+|0:5   |6:11  |12:17 |18:20  |21:23 |24:27 |28:29|30:31|
+|------|------|------|-------|------|------|-----|-----|
+|xdimsz|ydimsz|zdimsz|permute|invxyz|offset|skip |mode |
+
+
+skip:
+
+- 0b00 - no dimensions are skipped
+- 0b01 - skip '1st dim'
+- 0b10 - skip '2nd dim'
+- 0b11 - skip '3rd dim'
+
+invxyz (3-bit; 1 for x, 1 for y, 1 for z):
+
+- If the corresponding dim bit is zero, start the index from zero and increment
+- If the bit is set, start from xdimsz-1 (x dimension size, or whichever dimension
+bit is being looked at) and decrement down to zero.
+
+offset is used to offset the result by `offset` elements (important when
+element width overrides are used).
+
+xdimsz, ydimsz, zdimsz are offset by 1, such that field values 0-0b111111 correspond to
+sizes 1-64. A value of xdimsz=2 would indicate that in the first dimension there are
+3 elements in the array.
+
+With the example Matrix X (2 rows, 3 columns, or 2x3 matrix), xdimsz=1,
+ydimsz=2, zdimsz=0.
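
The offset-by-one size encoding and the `invxyz` direction bits described above can be sketched in Python. This is an illustrative sketch only, not code from the Libre-SOC simulator: the helper names `encode_dimsz`, `decode_dimsz` and `dim_indices` are hypothetical, and it assumes that an inverted dimension counts down from the decoded size minus one.

```python
def encode_dimsz(size):
    """Convert a dimension size (1..64) to its 6-bit field value (0..63)."""
    if not 1 <= size <= 64:
        raise ValueError("dimension size must be in the range 1..64")
    return size - 1


def decode_dimsz(field):
    """Convert a 6-bit field value (0..63) back to a dimension size (1..64)."""
    if not 0 <= field <= 0b111111:
        raise ValueError("field value must fit in 6 bits")
    return field + 1


def dim_indices(field, invert):
    """List the indices visited along one dimension.

    Assumption: with the invxyz bit clear the index counts up from zero;
    with it set the index counts down from (size - 1) to zero.
    """
    size = decode_dimsz(field)
    if invert:
        return list(range(size - 1, -1, -1))
    return list(range(size))


# Example Matrix X: dimension sizes 2 and 3, third dimension unused (size 1).
fields = [encode_dimsz(n) for n in (2, 3, 1)]
print(fields)
print(dim_indices(fields[1], invert=False))
print(dim_indices(fields[1], invert=True))
```

For the 2x3 example the computed field values are 1, 2 and 0, matching xdimsz=1, ydimsz=2, zdimsz=0 in the text.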
+
+permute setting:
+| permute | order | array format             |
+| ------- | ----- | ------------------------ |
+| 000     | 0,1,2 | (xdim+1)(ydim+1)(zdim+1) |
+| 001     | 0,2,1 | (xdim+1)(zdim+1)(ydim+1) |
+| 010     | 1,0,2 | (ydim+1)(xdim+1)(zdim+1) |
+| 011     | 1,2,0 | (ydim+1)(zdim+1)(xdim+1) |
+| 100     | 2,0,1 | (zdim+1)(xdim+1)(ydim+1) |
+| 101     | 2,1,0 | (zdim+1)(ydim+1)(xdim+1) |
+| 110     | 0,1   | Indexed (xdim+1)(ydim+1) |
+| 111     | 1,0   | Indexed (ydim+1)(xdim+1) |
+
+Permute re-arranges the order of the nested for-loops used to iterate over the
+three dimensions. This allows for in-place transpose, in-place rotate, matrix
+multiply and convolutions, without the limitation of Power-of-Two matrices.
+
+Limitations of Matrix REMAP are currently:
+
+- Vector Length (VL) is limited to 127, so up to 127 Multiply-Add Accumulates
+(MAC), or other operations, may be performed in total.
+For matrix multiply, it means both operand matrices and result matrix can
+have no more than 127 elements in total.
+(Larger matrices can be split into tiles to circumvent this issue, which is out
+of scope of this document).
+- The `svshape` instruction only provides part of the Matrix REMAP capability.
+For rotation and mirroring, SVSHAPE SPRs must be programmed directly (thus
+requiring more assembler instructions). Future revisions of SVP64 will
+provide more comprehensive capability, reducing the need to write to SVSHAPE
+SPRs directly.
+
+Going back to the assembler instruction used to set up the shape for matrix
+multiply:
+
+    svshape 2, 2, 3, 0, 0
+
+breakdown:
+
+- SVxd=2, SVyd=2, SVzd=3
+- SVRM=0 (Matrix mode, uses SHAPE0 SPR)
+-
 ## SVREMAP