cleared indicates that the REMAP operation shall only apply to the immediately-following
instruction. If set then REMAP remains permanently enabled until such time as it is
explicitly disabled, either by `setvl` setting a new MAXVL, or with another
-`svremap` instruction.
+`svremap` instruction. `svindex` and `svshape2` can also set or
+clear persistence, and they cover a subset of the capability of
+`svremap` to set register-to-SVSHAPE relationships.
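The persistence behaviour described above can be pictured with a small Python model. This is only a sketch under stated assumptions: the class `RemapState` and its method names are hypothetical, and it models nothing beyond "apply to the next instruction only" versus "stay enabled until explicitly disabled".

```python
class RemapState:
    """Toy model of REMAP enablement and the persistence bit (hypothetical names)."""

    def __init__(self):
        self.enabled = False
        self.persistent = False

    def svremap(self, enable, persist):
        # Another svremap always replaces the current setting.
        self.enabled = enable
        self.persistent = persist

    def setvl_new_maxvl(self):
        # Assumption: setting a new MAXVL disables REMAP and clears persistence.
        self.enabled = False
        self.persistent = False

    def execute(self):
        """Run one following instruction; return True if REMAP applied to it."""
        applied = self.enabled
        if self.enabled and not self.persistent:
            # Non-persistent REMAP covers only the immediately-following instruction.
            self.enabled = False
        return applied
```

With `persist=False` only the first call to `execute()` reports REMAP as applied; with `persist=True` every call does, until `setvl_new_maxvl()` or another `svremap()` changes the state.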
# SHAPE Remapping SPRs
| -- | -- | --- | ----- | ------ | -- | ------| -------- |
|OPCD| SVxd | SVyd | SVzd | SVRM | vf | XO | svshape |
+```
+ # for convenience, VL to be calculated and stored in SVSTATE
+ vlen <- [0] * 7
+ mscale[0:5] <- 0b000001 # for scaling MAXVL
+ itercount[0:6] <- [0] * 7
+ SVSTATE[0:31] <- [0] * 32
+ # only overwrite REMAP if "persistence" is zero
+ if (SVSTATE[62] = 0b0) then
+ SVSTATE[32:33] <- 0b00
+ SVSTATE[34:35] <- 0b00
+ SVSTATE[36:37] <- 0b00
+ SVSTATE[38:39] <- 0b00
+ SVSTATE[40:41] <- 0b00
+ SVSTATE[42:46] <- 0b00000
+ SVSTATE[62] <- 0b0
+ SVSTATE[63] <- 0b0
+ # clear out all SVSHAPEs
+ SVSHAPE0[0:31] <- [0] * 32
+ SVSHAPE1[0:31] <- [0] * 32
+ SVSHAPE2[0:31] <- [0] * 32
+ SVSHAPE3[0:31] <- [0] * 32
+
+ # set schedule up for multiply
+ if (SVrm = 0b0000) then
+ # VL in Matrix Multiply is xd*yd*zd
+ xd <- (0b00 || SVxd) + 1
+ yd <- (0b00 || SVyd) + 1
+ zd <- (0b00 || SVzd) + 1
+ n <- xd * yd * zd
+ vlen[0:6] <- n[14:20]
+ # set up template in SVSHAPE0, then copy to 1-3
+ SVSHAPE0[0:5] <- (0b0 || SVxd) # xdim
+ SVSHAPE0[6:11] <- (0b0 || SVyd) # ydim
+ SVSHAPE0[12:17] <- (0b0 || SVzd) # zdim
+ SVSHAPE0[28:29] <- 0b11 # skip z
+ # copy
+ SVSHAPE1[0:31] <- SVSHAPE0[0:31]
+ SVSHAPE2[0:31] <- SVSHAPE0[0:31]
+ SVSHAPE3[0:31] <- SVSHAPE0[0:31]
+ # set up FRA
+ SVSHAPE1[18:20] <- 0b001 # permute x,z,y
+ SVSHAPE1[28:29] <- 0b01 # skip z
+ # FRC
+ SVSHAPE2[18:20] <- 0b001 # permute x,z,y
+ SVSHAPE2[28:29] <- 0b11 # skip y
+
+ # set schedule up for FFT butterfly
+ if (SVrm = 0b0001) then
+ # calculate O(N log2 N)
+ n <- [0] * 3
+ do while n < 5
+ if SVxd[4-n] = 0 then
+ leave
+ n <- n + 1
+ n <- ((0b0 || SVxd) + 1) * n
+ vlen[0:6] <- n[1:7]
+ # set up template in SVSHAPE0, then copy to 1-3
+ # for FRA and FRT
+ SVSHAPE0[0:5] <- (0b0 || SVxd) # xdim
+ SVSHAPE0[12:17] <- (0b0 || SVzd) # zdim - "striding" (2D FFT)
+ mscale <- (0b0 || SVzd) + 1
+ SVSHAPE0[30:31] <- 0b01 # Butterfly mode
+ # copy
+ SVSHAPE1[0:31] <- SVSHAPE0[0:31]
+ SVSHAPE2[0:31] <- SVSHAPE0[0:31]
+ # set up FRB and FRS
+ SVSHAPE1[28:29] <- 0b01 # j+halfstep schedule
+ # FRC (coefficients)
+ SVSHAPE2[28:29] <- 0b10 # k schedule
+
+ # set schedule up for (i)DCT Inner butterfly
+ # SVrm Mode 2 (Mode 6 for iDCT) is for pre-calculated coefficients,
+ # SVrm Mode 4 (Mode 12 for iDCT) is for on-the-fly (Vertical-First Mode)
+ if ((SVrm = 0b0010) | (SVrm = 0b0100) |
+ (SVrm = 0b1010) | (SVrm = 0b1100)) then
+ # calculate O(N log2 N)
+ n <- [0] * 3
+ do while n < 5
+ if SVxd[4-n] = 0 then
+ leave
+ n <- n + 1
+ n <- ((0b0 || SVxd) + 1) * n
+ vlen[0:6] <- n[1:7]
+ # set up template in SVSHAPE0, then copy to 1-3
# set up FRB and FRS
SVSHAPE0[0:5] <- (0b0 || SVxd) # xdim
SVSHAPE0[12:17] <- (0b0 || SVzd) # zdim - "striding" (2D DCT)
SVSHAPE1[28:29] <- 0b01 # j+halfstep schedule
# reset costable "striding" to 1
SVSHAPE2[12:17] <- 0b000000
+
# set schedule up for DCT COS table generation
if (SVrm = 0b0101) | (SVrm = 0b1101) then
# calculate O(N log2 N)
# for cos coefficient
SVSHAPE1[28:29] <- 0b10 # ci schedule
SVSHAPE2[28:29] <- 0b11 # size schedule
+
# set schedule up for iDCT / DCT inverse of half-swapped ordering
if (SVrm = 0b0110) | (SVrm = 0b1110) | (SVrm = 0b1111) then
vlen[0:6] <- (0b00 || SVxd) + 0b0000001
else
SVSHAPE0[30:31] <- 0b11 # DCT mode
SVSHAPE0[6:11] <- 0b000101 # DCT "half-swap" mode
+
# set schedule up for parallel reduction
if (SVrm = 0b0111) then
# calculate the total number of operations (brute-force)
SVSHAPE1[0:31] <- SVSHAPE0[0:31]
# set up right operand (left operand 28:29 is zero)
SVSHAPE1[28:29] <- 0b01 # right operand
+
# set VL, MVL and Vertical-First
m[0:12] <- vlen * mscale
maxvl[0:6] <- m[6:12]
SVSTATE[0:6] <- maxvl # MAXVL
SVSTATE[7:13] <- vlen # VL
SVSTATE[63] <- vf
+```
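The VL and MAXVL arithmetic in the pseudocode above can be sketched in Python. This is an illustrative model rather than a reference implementation: the function names are hypothetical, the fields are assumed to use MSB0 bit numbering (so `SVxd[4]` is the least significant bit of the 5-bit field), and truncation to the low 7 bits is an assumption read from the bit-slices shown.

```python
def trailing_ones(value, width=5):
    """Count consecutive 1 bits starting at the least significant bit.

    Mirrors the 'do while n < 5' loop that tests SVxd[4-n].
    """
    count = 0
    while count < width and (value >> count) & 1:
        count += 1
    return count


def matrix_vl(svxd, svyd, svzd):
    """Matrix Multiply (SVrm=0b0000): VL is xd*yd*zd, each field stored off-by-one."""
    return ((svxd + 1) * (svyd + 1) * (svzd + 1)) & 0x7F


def butterfly_vl(svxd):
    """FFT butterfly (SVrm=0b0001): (SVxd+1) times the loop count computed above."""
    return ((svxd + 1) * trailing_ones(svxd)) & 0x7F


def scaled_maxvl(vlen, mscale=1):
    """MAXVL is vlen * mscale, truncated to 7 bits (assumed from m[6:12])."""
    return (vlen * mscale) & 0x7F
```

For example, `matrix_vl(1, 2, 3)` models a 2x3x4 schedule and gives a VL of 24, while `butterfly_vl(7)` models an xdim of 8 and gives 8 * 3 = 24.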
Special Registers Altered:
# svindex instruction <a name="svindex"> </a>
+
+| 0..5 | 6..10 | 11..15 | 16..20 | 21..25 | 26..31 | name | Form |
+| -- | -- | --- | ---- | ----------- | ------| -------- | ---- |
+|OPCD| SVG | rmm | SVd | ew/yx/mm/sk | XO | svindex | SVI-Form |
+
+SVI-Form
+
+* svindex SVG,rmm,SVd,ew,SVyx,mm,sk
+
+Pseudo-code:
+
+ # based on nearest MAXVL compute other dimension
+ MVL <- SVSTATE[0:6]
+ d <- [0] * 6
+ dim <- SVd+1
+ do while d*dim <u ([0]*4 || MVL)
+ d <- d + 1
+
+ # set up template, then copy once location identified
+ shape <- [0]*32
+ shape[30:31] <- 0b00 # mode
+ if SVyx = 0 then
+ shape[18:20] <- 0b110 # indexed xd/yd
+ shape[0:5] <- (0b0 || SVd) # xdim
+ if sk = 0 then shape[6:11] <- 0 # ydim
+ else shape[6:11] <- 0b111111 # ydim max
+ else
+ shape[18:20] <- 0b111 # indexed yd/xd
+ if sk = 1 then shape[6:11] <- 0 # ydim
+ else shape[6:11] <- d-1 # ydim max
+ shape[0:5] <- (0b0 || SVd) # ydim
+ shape[12:17] <- (0b0 || SVG) # SVGPR
+ shape[28:29] <- ew # element-width override
+ shape[21] <- sk # skip 1st dimension
+
+ # select the mode for updating SVSHAPEs
+ SVSTATE[62] <- mm # set or clear persistence
+ if mm = 0 then
+ # clear out all SVSHAPEs first
+ SVSHAPE0[0:31] <- [0] * 32
+ SVSHAPE1[0:31] <- [0] * 32
+ SVSHAPE2[0:31] <- [0] * 32
+ SVSHAPE3[0:31] <- [0] * 32
+ SVSTATE[32:41] <- [0] * 10 # clear REMAP.mi/o
+ SVSTATE[42:46] <- rmm # rmm exactly REMAP.SVme
+ idx <- 0
+ for bit = 0 to 4
+ if rmm[4-bit] then
+ # activate requested shape
+ if idx = 0 then SVSHAPE0 <- shape
+ if idx = 1 then SVSHAPE1 <- shape
+ if idx = 2 then SVSHAPE2 <- shape
+ if idx = 3 then SVSHAPE3 <- shape
+ SVSTATE[bit*2+32:bit*2+33] <- idx
+ # increment shape index, modulo 4
+ if idx = 3 then idx <- 0
+ else idx <- idx + 1
+ else
+ # refined SVSHAPE/REMAP update mode
+ bit <- rmm[0:2]
+ idx <- rmm[3:4]
+ if idx = 0 then SVSHAPE0 <- shape
+ if idx = 1 then SVSHAPE1 <- shape
+ if idx = 2 then SVSHAPE2 <- shape
+ if idx = 3 then SVSHAPE3 <- shape
+ SVSTATE[bit*2+32:bit*2+33] <- idx
+ SVSTATE[46-bit] <- 1
+
+Special Registers Altered:
+
+ SVSTATE, SVSHAPE0-3
+
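The two `rmm` interpretations and the "other dimension" loop in the `svindex` pseudocode can be sketched in Python. The function names are hypothetical, "slot" is used here for the 2-bit fields at `SVSTATE[32:41]`, and the sketch assumes MSB0 numbering for the 5-bit `rmm` field and that the slot assignment and index increment both sit inside the `if rmm[4-bit]` test.

```python
def other_dimension(mvl, svd):
    """Smallest d such that d * (SVd+1) >= MVL, as in the 'do while' loop."""
    dim = svd + 1
    d = 0
    while d * dim < mvl:
        d += 1
    return d


def assign_shapes_mask_mode(rmm):
    """mm=0: each set bit of rmm takes the next SVSHAPE index, modulo 4.

    Returns a dict mapping slot number (0..4) to SVSHAPE index (0..3).
    """
    mapping = {}
    idx = 0
    for bit in range(5):
        # rmm[4-bit] in MSB0 numbering is bit 'bit' counted from the LSB.
        if (rmm >> bit) & 1:
            mapping[bit] = idx
            idx = (idx + 1) % 4
    return mapping


def assign_shape_refined_mode(rmm):
    """mm=1: rmm[0:2] selects the slot, rmm[3:4] selects the SVSHAPE index."""
    slot = (rmm >> 2) & 0b111
    idx = rmm & 0b11
    return slot, idx
```

For instance, `assign_shapes_mask_mode(0b00101)` maps slot 0 to SVSHAPE0 and slot 2 to SVSHAPE1, whereas `assign_shape_refined_mode(0b01110)` updates only slot 3, pointing it at SVSHAPE2.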
`svindex` is a convenience instruction that reduces instruction
count for Indexed REMAP Mode. It sets up
(overwrites) all required SVSHAPE SPRs and can modify the REMAP
-SPR as well. The relevant SPRs *may* be directly programmed with
+area of the SVSTATE SPR as well. The relevant SPRs *may* be directly programmed with
`mtspr`; however, it is laborious to do so: `svindex` saves instructions
covering much of Indexed REMAP capability.
-Form: SVI-Form SV "Indexed" Form (see [[isatables/fields.text]])
-
- svindex SVG,rmm,SVd,ew,yx,mr,sk
-
-| 0.5|6.10 |11.15 |16.20 | 21..25 | 26..31| name | Form |
-| -- | -- | --- | ---- | ----------- | ------| -------- | ---- |
-|OPCD| SVG | rmm | SVd | ew/yx/mm/sk | XO | svindex | SVI-Form |
-
Fields:
* **SVd** - SV REMAP x/y dim
* **yx** - 2D reordering to be used if yx=1
* **mm** - mask mode. Determines how `rmm` is interpreted.
* **sk** - Dimension skipping enabled
-* **XO** - standard 6-bit XO field
*Note: SVd, like SVxd, SVyd and SVzd of `svshape`, are all stored
"off-by-one". In the assembler