# setvl: Set Vector Length

* <http://lists.libre-soc.org/pipermail/libre-soc-dev/2020-November/001366.html>
* <https://bugs.libre-soc.org/show_bug.cgi?id=535>
* <https://bugs.libre-soc.org/show_bug.cgi?id=587>
* <https://bugs.libre-soc.org/show_bug.cgi?id=568> TODO
* <https://bugs.libre-soc.org/show_bug.cgi?id=927> bug - RT>=32
* <https://bugs.libre-soc.org/show_bug.cgi?id=862> VF Predication
* <https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vsetvlivsetvl-instructions>
* pseudocode [[openpower/isa/simplev]]

Add the following section to the Simple-V Chapter

| 0-5 | 6-10 | 11-15 | 16-22 | 23 24 25 | 26-30 | 31 | FORM     |
| --- | ---- | ----- | ----- | -------- | ----- | -- | -------- |
| PO  | RT   | RA    | SVi   | ms vs vf | XO    | Rc | SVL-Form |

* setvl RT,RA,SVi,vf,vs,ms (Rc=0)
* setvl. RT,RA,SVi,vf,vs,ms (Rc=1)

    overflow <- 0b0 # sets CR.SO if set and if Rc=1
    if ms = 1 then MVL <- VLimm[0:6]
    else MVL <- SVSTATE[0:6]
    if vs = 0 then VL <- SVSTATE[7:13]
    if (RA) >u 0b1111111 then
    else VL <- (RA)[57:63]
    else if _RT = 0 then VL <- VLimm[0:6]
    else if CTR >u 0b1111111 then
    # limit VL to within MVL
    GPR(_RT) <- [0]*57 || VL
    # MAXVL is a static "state-reset" opportunity so VF is only set then.
    SVSTATE[63] <- vf # set Vertical-First mode
    SVSTATE[62] <- 0b0 # clear persist bit

Special Registers Altered:

* `SVi` - bits 16-22 - an immediate operand for setting MVL and/or VL
* `ms` - bit 23 - allows for setting of MVL
* `vs` - bit 24 - allows for setting of VL
* `vf` - bit 25 - sets "Vertical First Mode".
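
The field layout above can be sketched as a pack/unpack pair. This is an illustrative model only: it assumes the usual Power ISA MSB0 bit numbering (bit 0 is the most significant bit of the 32-bit word), and because the `PO` and `XO` values are not given in this section they are left as caller-supplied parameters. The function names are hypothetical.

```python
# Sketch of SVL-Form field packing, assuming MSB0 bit numbering.
# PO and XO values are NOT specified in this section: pass them in.

FIELDS = {            # name: (first_bit, last_bit), MSB0 numbering
    "po": (0, 5), "rt": (6, 10), "ra": (11, 15), "svi": (16, 22),
    "ms": (23, 23), "vs": (24, 24), "vf": (25, 25),
    "xo": (26, 30), "rc": (31, 31),
}

def encode_svl(**values):
    """Pack the named fields into a 32-bit instruction word."""
    word = 0
    for name, (first, last) in FIELDS.items():
        width = last - first + 1
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << (31 - last)   # MSB0 bit 'last' is LSB0 bit 31-last
    return word

def decode_svl(word):
    """Unpack a 32-bit instruction word into a dict of field values."""
    fields = {}
    for name, (first, last) in FIELDS.items():
        width = last - first + 1
        fields[name] = (word >> (31 - last)) & ((1 << width) - 1)
    return fields
```

Keeping the layout in a single table means the encoder and decoder cannot drift apart, which is convenient when checking an assembler against a simulator.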

Note that in immediate-setting mode VL and MVL start from **one**, but that
this is compensated for in the assembly notation: an immediate
value of 1 in assembler notation actually places the value 0b0000000 in
the `SVi` field bits. On execution the `setvl` instruction adds one to
the decoded `SVi` field bits, resulting in VL/MVL being set to 1. This
allows VL to be set to values ranging from 1 to 128 with only 7 bits
instead of 8. Setting VL/MVL to 0 would result in all Vector operations
becoming `nop`. If this is truly desired (nop behaviour) then setting
VL and MVL to zero is to be done via the [[SVSTATE SPR|sv/sprs]].
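
As a minimal sketch of the off-by-one convention described above, the assembler-side and execution-side conversions might look as follows. The function names are hypothetical and are not part of any Libre-SOC tool.

```python
def asm_to_svi(length):
    """Assembler side: map a requested VL/MVL (1..128) to the 7-bit SVi field."""
    if not 1 <= length <= 128:
        raise ValueError("immediate VL/MVL must be in the range 1..128")
    return length - 1

def svi_to_length(svi_field):
    """Execution side: setvl adds one to the decoded SVi field bits."""
    if not 0 <= svi_field <= 0b1111111:
        raise ValueError("SVi is a 7-bit field")
    return svi_field + 1
```

For example, `asm_to_svi(1)` yields `0` (the field value 0b0000000), and `svi_to_length(0)` recovers the length 1.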

Note that `setmvli` is a pseudo-op based on `RA=0` and `RT=0`, and `setvli` likewise:

    setvli VL=8    : setvl  r0, r0, VL=8, vf=0, vs=1, ms=0
    setvli. VL=8   : setvl. r0, r0, VL=8, vf=0, vs=1, ms=0
    setmvli MVL=8  : setvl  r0, r0, MVL=8, vf=0, vs=0, ms=1
    setmvli. MVL=8 : setvl. r0, r0, MVL=8, vf=0, vs=0, ms=1

An additional pseudo-op obtains VL without modifying it (or any other state):

    getvl r5  : setvl  r5, r0, vf=0, vs=0, ms=0
    getvl. r5 : setvl. r5, r0, vf=0, vs=0, ms=0

Note that whilst it is possible to set both MVL and VL from the same
immediate, it is not possible to set them to different immediates in
the same instruction. Doing so would require two instructions.

Use of `setvl` results in changes to the SVSTATE SPR. See [[sv/sprs]].

**Selecting sources for VL**

Opcode space is under considerable pressure. Consequently, the source
from which VL is set is selected by the `vs`, `RA` and `RT` fields as follows:

| condition             | effect                       |
| --------------------- | ---------------------------- |
| `vs=1, RA=0, RT!=0`   | VL,RT set to MIN(MVL, CTR)   |
| `vs=1, RA=0, RT=0`    | VL set to MIN(MVL, SVi+1)    |
| `vs=1, RA!=0, RT=0`   | VL set to MIN(MVL, RA)       |
| `vs=1, RA!=0, RT!=0`  | VL,RT set to MIN(MVL, RA)    |

The reasoning here is that the opportunity to set RT equal to the
immediate `SVi+1` is sacrificed in favour of setting from CTR.
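
The four `vs=1` rows of the table can be modelled with a short sketch. It covers only what the table states: the function name and argument names are hypothetical, register *numbers* and register *contents* are passed separately, and the 7-bit saturation and overflow handling of the full pseudocode are deliberately left out.

```python
def select_vl(ra_index, rt_index, ra_value, ctr, svi_field, mvl):
    """Model the vs=1 rows of the table.

    Returns (new_vl, writes_rt): RT is only written when its
    register number is non-zero.
    """
    if ra_index != 0:
        requested = ra_value          # rows 3 and 4: VL from (RA)
    elif rt_index != 0:
        requested = ctr               # row 1: VL from CTR
    else:
        requested = svi_field + 1     # row 2: VL from the immediate
    return min(mvl, requested), rt_index != 0
```

With `MVL=8` and `CTR=100`, `select_vl(0, 5, 0, 100, 0, 8)` gives `(8, True)`: VL is limited to MVL and also written to RT.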

**Unusual Rc=1 behaviour**

Normally, the return result from an instruction is in `RT`. Because the
`RT` field also takes part in selecting the source for VL (with `RA=0`,
`RT=0` selects the immediate and `RT!=0` selects `CTR`), and no register
is written when `RT=0`, some different semantics are needed.

CR Field 0, when `Rc=1`, may be set even if `RT=0`. The reason is that
overflow may occur: `VL`, if set either from an immediate or from `CTR`,
may not exceed `MAXVL`, and if it does, `CR0.SO` must be set.
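
The overflow rule can be illustrated with a small sketch of the "limit VL to within MVL" step. Only that final limiting step and the resulting `CR0.SO` bit are modelled, and the function name is hypothetical.

```python
def limit_vl(requested_vl, mvl, rc):
    """Clamp a requested VL to MVL and report whether CR0.SO would be set.

    Overflow is flagged whenever the request exceeds MVL, but it only
    reaches CR0.SO when Rc=1.
    """
    overflow = requested_vl > mvl
    vl = mvl if overflow else requested_vl
    cr0_so = overflow and rc == 1
    return vl, cr0_so
```

A request of 100 with `MVL=8` therefore yields `VL=8` in both cases, with `CR0.SO` reported only for the `Rc=1` form.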

In reality it is **`VL`** being set. Therefore, rather than `CR0`
testing `RT` when `Rc=1`, CR0.EQ is set if `VL=0`, CR0.GE is set if `VL`

Sub-vector elements are not to be considered "Vertical". A vec2/3/4
is to be treated as if it were a single element. Caveats exist for
[[sv/mv.swizzle]] and [[sv/mv.vec]] when Pack/Unpack is enabled, due
to the order in which VL and SUBVL loops are applied being swapped
(outer-inner becomes inner-outer).

### Core concept loop

This example illustrates the Cray-style Loop concept. However, where most
Cray-style Vector ISAs have a Maximum Vector Length hard-coded into the
architecture, Simple-V allows MVL to be set, but only as a static immediate,
so that compilers may embed the register resource allocation statically at
compile-time.

    setvl a3, a0, MVL=8    # update a3 with vl
                           # (# of elements this iteration)
                           # set a3=VL=MIN(a0,MVL)
    # do vector operations at up to 8 length (MVL=8)
    sub. a0, a0, a3        # Decrement count by vl, set CR0.eq
    bnez a0, loop          # Any more?
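
The same strip-mining pattern can be sketched in Python to show how the element count is consumed in chunks of at most MVL. This is an illustrative model only, with a hypothetical function name. Unlike the assembly loop above, it checks the count before the first iteration rather than after.

```python
def stripmine(data, mvl=8):
    """Split data into chunks, mimicking VL = MIN(remaining, MVL) per pass."""
    chunks = []
    start = 0
    remaining = len(data)              # plays the role of a0
    while remaining != 0:
        vl = min(remaining, mvl)       # what setvl would place in a3
        chunks.append(data[start:start + vl])   # "vector operations" on vl elements
        start += vl
        remaining -= vl                # sub. a0, a0, a3
    return chunks
```

For 19 elements and `MVL=8` the chunk lengths are 8, 8 and 3: every pass except the last runs at full length.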

In this example, the `setvl.` instruction has Rc=1 enabled, which
sets CR0.eq when VL becomes zero.

    setvli. r4, r3, MVL=64

### Load/Store-Multi (selective)

Up to 64 FPRs will be loaded here. `r3` holds one bit for each FP
register: a set bit means that the corresponding register is to be loaded.
The block of memory from which the registers are loaded is contiguous
(no gaps): any FP register which has a corresponding zero bit in `r3`
is *unaltered*. In essence this is a selective LD-multi with "Scatter"
capability.

    setvli r0, MVL=64, VL=64
    sv.fld/dm=r3 *r0, 0(r30) # selective load 64 FP registers
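
The effect of the destination mask can be sketched as follows. This is a behavioural model only, not the hardware algorithm: the function name is hypothetical, and it assumes that bit `i` of the mask (counting from the least significant bit) corresponds to register `i`.

```python
def selective_load(regs, memory, mask):
    """Load consecutive memory elements into the registers whose mask bit is set.

    Registers with a zero mask bit are left unaltered, and memory is
    consumed contiguously (no gaps).
    """
    result = list(regs)
    next_mem = 0
    for i in range(len(result)):
        if (mask >> i) & 1:            # assumption: bit i selects register i
            result[i] = memory[next_mem]
            next_mem += 1
    return result
```

With four registers initially zero, memory `[1.5, 2.5]` and mask `0b1010`, the result is `[0.0, 1.5, 0.0, 2.5]`: two contiguous memory elements are scattered into registers 1 and 3.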

Up to 64 FPRs will be saved here. Again, `r3` specifies, in a `VEXPAND`
fashion, which registers are stored.

    setvli r0, MVL=64, VL=64
    sv.stfd/sm=r3 *fp0, 0(r30) # selective store 64 FP registers