-# OpenPOWER SV setvl/setvli
+# setvl: Set Vector Length
+<!-- hide -->
See links:
* <http://lists.libre-soc.org/pipermail/libre-soc-dev/2020-November/001366.html>
* <https://bugs.libre-soc.org/show_bug.cgi?id=535>
+* <https://bugs.libre-soc.org/show_bug.cgi?id=587>
+* <https://bugs.libre-soc.org/show_bug.cgi?id=568> TODO
+* <https://bugs.libre-soc.org/show_bug.cgi?id=927> bug - RT>=32
+* <https://bugs.libre-soc.org/show_bug.cgi?id=862> VF Predication
* <https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vsetvlivsetvl-instructions>
+* [[sv/svstep]]
+* pseudocode [[openpower/isa/simplev]]
+<!-- show -->
+
+Add the following section to the Simple-V Chapter
+
+## setvl
+
+SVL-Form
+
+| 0-5|6-10|11-15|16-22 | 23 24 25 | 26-30 |31| FORM |
+| -- | -- | --- | ---- |----------| ----- |--|----------|
+|PO | RT | RA | SVi | ms vs vf | XO |Rc| SVL-Form |
+
+* setvl RT,RA,SVi,vf,vs,ms (Rc=0)
+* setvl. RT,RA,SVi,vf,vs,ms (Rc=1)
+
+Pseudo-code:
+
+```
+ overflow <- 0b0 # sets CR.SO if set and if Rc=1
+ VLimm <- SVi + 1
+ # set or get MVL
+ if ms = 1 then MVL <- VLimm[0:6]
+ else MVL <- SVSTATE[0:6]
+ # set or get VL
+ if vs = 0 then VL <- SVSTATE[7:13]
+ else if _RA != 0 then
+ if (RA) >u 0b1111111 then
+ VL <- 0b1111111
+ overflow <- 0b1
+ else VL <- (RA)[57:63]
+ else if _RT = 0 then VL <- VLimm[0:6]
+ else if CTR >u 0b1111111 then
+ VL <- 0b1111111
+ overflow <- 0b1
+ else VL <- CTR[57:63]
+ # limit VL to within MVL
+ if VL >u MVL then
+ overflow <- 0b1
+ VL <- MVL
+ SVSTATE[0:6] <- MVL
+ SVSTATE[7:13] <- VL
+ if _RT != 0 then
+ GPR(_RT) <- [0]*57 || VL
+ # MAXVL is a static "state-reset" opportunity so VF is only set then.
+ if ms = 1 then
+ SVSTATE[63] <- vf # set Vertical-First mode
+ SVSTATE[62] <- 0b0 # clear persist bit
+```
+
+Special Registers Altered:
+
+```
+ CR0 (if Rc=1)
+ SVSTATE
+```
+
+* `SVi` - bits 16-22 - an immediate operand for setting MVL and/or VL
+* `ms` - bit 23 - allows for setting of MVL
+* `vs` - bit 24 - allows for setting of VL
+* `vf` - bit 25 - sets "Vertical First Mode".
+
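The pseudo-code above can be sketched as a small executable model. This is a minimal sketch in Python, not a reference implementation: `SVState`, the `setvl` function signature and the list-based `gpr` are hypothetical names introduced for illustration. `VLimm[0:6]` is assumed to mean the 7-bit value of `SVi+1`, so `SVi=0b1111111` is rejected rather than guessed at. CR0 updates for Rc=1 are not modelled; the function only returns the overflow flag.

```python
from dataclasses import dataclass

MASK7 = 0b1111111  # largest value held by the 7-bit VL and MVL fields


@dataclass
class SVState:
    """Hypothetical container for the SVSTATE fields that setvl touches."""
    mvl: int = 0      # SVSTATE[0:6]
    vl: int = 0       # SVSTATE[7:13]
    persist: int = 0  # SVSTATE[62]
    vfirst: int = 0   # SVSTATE[63]


def setvl(state, gpr, ctr, RT, RA, SVi, vf, vs, ms):
    """Update state and gpr in place, returning the overflow flag."""
    if not 0 <= SVi < MASK7:
        # assumption: the SVi=0b1111111 case is deliberately not modelled
        raise ValueError("SVi outside the range modelled by this sketch")
    overflow = 0
    vlimm = SVi + 1
    # set or get MVL
    mvl = vlimm if ms else state.mvl
    # set or get VL
    if not vs:
        vl = state.vl
    elif RA != 0:
        if gpr[RA] > MASK7:
            vl, overflow = MASK7, 1
        else:
            vl = gpr[RA]
    elif RT == 0:
        vl = vlimm
    elif ctr > MASK7:
        vl, overflow = MASK7, 1
    else:
        vl = ctr
    # limit VL to within MVL
    if vl > mvl:
        overflow = 1
        vl = mvl
    state.mvl, state.vl = mvl, vl
    if RT != 0:
        gpr[RT] = vl
    # Vertical-First is only set when MVL is set
    if ms:
        state.vfirst = vf
        state.persist = 0
    return overflow


state = SVState()
gpr = [0] * 32
gpr[3] = 1000
# request 1000 elements with MVL=8 (SVi field holds 7): VL is limited to 8
ov_first = setvl(state, gpr, ctr=0, RT=4, RA=3, SVi=7, vf=0, vs=1, ms=1)
# read VL back without modifying any state (the getvl pseudo-op)
ov_get = setvl(state, gpr, ctr=0, RT=5, RA=0, SVi=0, vf=0, vs=0, ms=0)
# RA=0 with RT!=0 selects CTR as the source for VL
ov_ctr = setvl(state, gpr, ctr=5, RT=6, RA=0, SVi=0, vf=0, vs=1, ms=0)
```

In the first call both limits apply: the value in `gpr[3]` exceeds seven bits and then exceeds MVL, so the overflow flag is raised and VL ends up equal to MVL.
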
+Note that in immediate-setting mode VL and MVL start from **one**, but
+this is compensated for in the assembly notation: an immediate
+value of 1 in assembler notation actually places the value 0b0000000 in
+the `SVi` field bits. On execution the `setvl` instruction adds one to
+the decoded `SVi` field bits, resulting in VL/MVL being set to 1. In future
+this will allow VL to be set to values ranging from 1 to 128 with only 7 bits
+instead of 8. Setting VL/MVL to 0 would result in all Vector operations
+becoming `nop`. If this is truly desired (nop behaviour) then setting
+VL and MVL to zero is to be done via the [[SVSTATE SPR|sv/sprs]].
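
The off-by-one encoding described above can be illustrated with a short sketch. The helper names `encode_svi` and `decode_svi` are hypothetical, and accepting 128 as the upper bound reflects the future capability mentioned in the text rather than confirmed current behaviour.

```python
def encode_svi(value):
    """Map an assembler immediate (1..128) to the 7-bit SVi field."""
    if not 1 <= value <= 128:
        raise ValueError("immediate must be in the range 1..128")
    return value - 1


def decode_svi(field):
    """Recover the effective VL/MVL immediate from the SVi field bits."""
    if not 0 <= field <= 0b1111111:
        raise ValueError("SVi is a 7-bit field")
    return field + 1


field_for_one = encode_svi(1)      # assembler "1" is stored as 0b0000000
field_for_eight = encode_svi(8)    # assembler "8" is stored as 0b0000111
round_trip = decode_svi(encode_svi(64))
```

Zero has no encoding under this scheme, which is why nop behaviour has to be requested through the SPR instead.
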
-Use of setvl results in changes to the MVL, VL and STATE SPRs. see [[sv/sprs]]
+Note that `setmvli` is a pseudo-op, based on RA/RT=0, and `setvli` likewise:
-# Format
+```
+ setvli VL=8 : setvl r0, r0, VL=8, vf=0, vs=1, ms=0
+ setvli. VL=8 : setvl. r0, r0, VL=8, vf=0, vs=1, ms=0
+ setmvli MVL=8 : setvl r0, r0, MVL=8, vf=0, vs=0, ms=1
+ setmvli. MVL=8 : setvl. r0, r0, MVL=8, vf=0, vs=0, ms=1
+```
-| 0..5 |6..10|11..15|16.20|21.22.23.24..25|26.....30|31| name |
-|------|-----|------|-----|---------------|---------|--|---------|
-| 19 | RT | RA | | XO[0:4] | XO[5:9] |Rc| XL-Form |
-| 19 | RT | RA |imm | imm // vs ms | NNNNN |Rc| setvl |
+Additional pseudo-op for obtaining VL without modifying it (or any state):
-Note that imm spans 7 bits (16 to 22), and that bit 22 is reserved abd must be zero. Setting bit 22 causes an illegal exception.
+```
+ getvl r5 : setvl r5, r0, vf=0, vs=0, ms=0
+ getvl. r5 : setvl. r5, r0, vf=0, vs=0, ms=0
+```
-Note that VL and MVL start from **one** i.e. that an immediate value of zero will result in VL/MVL being set to 1. 0b111111 results in VL/MVL being set to 64. This is because setting VL/MVL to 1 results in "scalar identity" behaviour, where setting VL/MVL to 0 would result in all Vector operations becoming nop. If this is truly desired (nop behaviour) then setting VL and MVL to zero be done via the [[SV SPRs|sv/sprs]]
+Note that whilst it is possible to set both MVL and VL from the same
+immediate, it is not possible to set them to different immediates in
+the same instruction. Doing so would require two instructions.
-Note that setmvli is a pseudo-op, based on RA/RT=0, and setvli likewise
+Use of setvl results in changes to the SVSTATE SPR. See [[sv/sprs]].
-* setvli VL=8 : setvl r5, r0, VL=8
-* setmvli MVL=8 : setvl r0, r0, MVL=8
-
-# Pseudocode
-
- // instruction fields:
- rd = get_rt_field(); // bits 6..10
- ra = get_ra_field(); // bits 11..15
- // add one. MVL/VL=1..64 not 0..63
- vlimmed = get_immed_field()+1; // 16..22
- vs = get_vs_field(); // bit 24
- ms = get_ms_field(); // bit 25
- Rc = get_Rc_field(); // bit 31
-
- // set VL (or not).
- // 3 options: from SPR, from immed, from ra
- if vs {
- // VL to be sourced from fields/regs
- if ra != 0 {
- VL = GPR[ra]
- } else {
- VL = vlimmed
- }
- } else {
- // VL not to change, source from SPR
- VL = SPR[SV_VL]
- }
-
- // set MVL (or not).
- // 2 options: from SPR, from immed
- if ms {
- MVL = vlimmed
- } else {
- MVL = SPR[SV_MVL]
- }
-
- // calculate (limit) VL
- VL = min(VL, MVL)
-
- // store VL, MVL
- SPR[SV_VL] = VL
- SPR[SV_MVL] = MVL
-
- // write rd
- if rt != 0 {
- // rt is not zero
- regs[rt] = VL;
- }
- // write CR?
- if Rc {
- // update CR from VL (not rt)
- CR0 = ....
- }
-
-# Examples
-
-## Core concept loop
+**Selecting sources for VL**
- loop:
+There is considerable opcode pressure. Consequently, the source from which
+VL is set depends on whether RA and RT are zero, as follows:
+
+| condition | effect |
+| - | - |
+| `vs=1, RA=0, RT!=0` | VL,RT set to MIN(MVL, CTR) |
+| `vs=1, RA=0, RT=0` | VL set to MIN(MVL, SVi+1) |
+| `vs=1, RA!=0, RT=0` | VL set to MIN(MVL, RA) |
+| `vs=1, RA!=0, RT!=0` | VL,RT set to MIN(MVL, RA) |
+
+The reasoning here is that the opportunity to set RT equal to the
+immediate `SVi+1` is sacrificed in favour of setting from CTR.
+
+**Unusual Rc=1 behaviour**
+
+Normally, the return result from an instruction is in `RT`. With it
+being possible for `RT=0` to mean that `CTR` mode is to be read, some
+different semantics are needed.
+
+CR Field 0, when `Rc=1`, may be set even if `RT=0`. The reason is that
+overflow may occur: `VL`, if set either from an immediate or from `CTR`,
+may not exceed `MAXVL`, and if it does, `CR0.SO` must be set.
+
+In reality it is **`VL`** being set. Therefore, rather than `CR0`
+testing `RT` when `Rc=1`, CR0.EQ is set if `VL=0`, CR0.GE is set if `VL`
+is non-zero.
+
+**SUBVL**
+
+Sub-vector elements are not to be considered "Vertical". The vec2/3/4
+is to be considered as if it were a "single element". Caveats exist for
+[[sv/mv.swizzle]] and [[sv/mv.vec]] when Pack/Unpack is enabled, due
+to the order in which VL and SUBVL loops are applied being swapped
+(outer-inner becomes inner-outer).
+
+## Examples
+
+### Core concept loop
+
+This example illustrates the Cray-style Loop concept. However, where most Cray
+Vectors have a Max Vector Length hard-coded into the architecture, Simple-V
+allows MVL to be set, but only as a static immediate, so that compilers may
+embed the register resource allocation statically at compile-time.
+
+```
+loop:
setvl a3, a0, MVL=8 # update a3 with vl
# (# of elements this iteration)
- # set MVL to 8
+ # set MVL to 8 and
+ # set a3=VL=MIN(a0,MVL)
# do vector operations at up to 8 length (MVL=8)
# ...
- sub a0, a0, a3 # Decrement count by vl
+ sub. a0, a0, a3 # Decrement count by vl, set CR0.eq
bnez a0, loop # Any more?
+```
+
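The control flow of the loop above can be sketched in Python to show how many elements each iteration handles. This is only an illustration: `strip_mine` is a hypothetical helper, the vector operations themselves are omitted, and the loop body runs at least once because the assembly tests the count at the bottom of the loop.

```python
def strip_mine(n, mvl=8):
    """Return the VL chosen on each iteration when n elements remain."""
    lengths = []
    while True:
        vl = min(n, mvl)    # setvl a3, a0, MVL=8
        lengths.append(vl)  # vector operations on vl elements would go here
        n -= vl             # sub. a0, a0, a3
        if n == 0:          # bnez a0, loop
            break
    return lengths


twenty = strip_mine(20)
sixteen = strip_mine(16)
```

For 20 elements the final iteration runs with a shorter VL, so no scalar clean-up loop is needed.
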
+### Loop using Rc=1
-## Loop using Rc=1
+In this example, the `setvl.` instruction enables Rc=1, which
+sets CR0.eq when VL becomes zero. Testing of `r4` (cmpi) is thus redundant,
+saving one instruction.
+```
my_fn:
li r3, 1000
b test
sub r3, r3, r4
...
test:
- setvl. r4, r3, 64
+ setvli. r4, r3, MVL=64
bne cr0, loop
end:
blr
+```
+
+### Load/Store-Multi (selective)
+
+Up to 64 FPRs will be loaded here. `r3` has one bit set for each
+FP register required to be loaded. The block of memory from which the
+registers are loaded is contiguous (no gaps): any FP register which has
+a corresponding zero bit in `r3` is *unaltered*. In essence this is a
+selective LD-multi with "Scatter" (`VCOMPRESS`) capability.
+
+```
+ setvli r0, MVL=64, VL=64
+ sv.fld/dm=r3 *r0, 0(r30) # selective load 64 FP registers
+```
+
+Up to 64 FPRs will be saved, here. Again, `r3` specifies which
+registers are set in a `VEXPAND` fashion.
+
+```
+ setvli r0, MVL=64, VL=64
+ sv.stfd/sm=r3 *fp0, 0(r30) # selective store 64 FP registers
+```
+
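The effect of the mask in the two examples above can be sketched as follows. This is a rough model under stated assumptions, not the defined semantics of `sv.fld` or `sv.stfd`: `selective_load` and `selective_store` are hypothetical helpers, memory is modelled as a plain list, and bit `i` of the mask (counting from the least significant bit) is assumed to select register `i`.

```python
def selective_load(regs, mask, memory):
    """Copy consecutive memory words into the registers selected by mask."""
    src = 0
    for i in range(len(regs)):
        if (mask >> i) & 1:  # assumption: bit i (LSB0) selects register i
            regs[i] = memory[src]
            src += 1         # memory is consumed without gaps
    return src               # number of words loaded


def selective_store(regs, mask):
    """Return the contiguous block written for the registers selected by mask."""
    return [regs[i] for i in range(len(regs)) if (mask >> i) & 1]


fprs = [0.0] * 8
loaded = selective_load(fprs, 0b10110, [1.5, 2.5, 3.5])
saved = selective_store(fprs, 0b10110)
```

Registers whose mask bit is zero keep their previous contents, and the memory block has no gaps in either direction.
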
+[[!tag standards]]
+
+------
+
+\newpage{}
+