+* old page [[simple_v_extension/specification/sv.setvl]]
+* [[sv/svstep]]
+
+Use of setvl results in changes to the MVL, VL and STATE SPRs. See [[sv/sprs]].
+
+# Behaviour and Rationale
+
+SV's Vector Engine is based on Cray-style Variable-length Vectorisation,
+just like RVV. However, unlike RVV, SV sits on top of the standard Scalar
+regfiles: there is no separate Vector register numbering. Therefore, also
+unlike RVV, SV does not have hard-coded "Lanes": microarchitects
+may use *ordinary* in-order, out-of-order, or superscalar designs
+as the basis for SV. By contrast, the relevant parameter
+in RVV is "MAXVL", which is architecturally hard-coded into RVV systems
+and ranges anywhere from 1 to tens of thousands of Lanes in supercomputers.
+
+SV is more like how MMX used to sit on top of the x86 FP regfile.
+Therefore, when Vector operations are performed, the question has to
+be asked: "how much of the regfile do you want to allocate to
+this operation?" If the amount is too small, performance may
+be affected. If it is too large, the allocation would overlap other
+registers and corrupt their data or, even where the allocation is done
+correctly, would force those registers to be spilled to memory.
+
+The answer effectively needs to be parameterised. Hence: MAXVL (MVL)
+is set from an immediate, so that the compiler may decide, statically, a
+guaranteed resource allocation according to the needs of the application.
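The allocation question above can be illustrated with a minimal sketch. It assumes a vector starting at register `start` occupies `start` through `start + MVL - 1`. The function names and the 128-entry regfile size are hypothetical and chosen only for illustration.

```python
REGFILE_SIZE = 128  # assumption for this sketch, not taken from the text


def reg_span(start, mvl):
    """Registers reserved by a vector that begins at `start` with MAXVL `mvl`."""
    return range(start, start + mvl)


def fits(start, mvl, regfile_size=REGFILE_SIZE):
    """True if the whole allocation lies inside the register file."""
    return 0 <= start and start + mvl <= regfile_size


def overlaps(a_start, b_start, mvl):
    """True if two allocations of the same MVL share at least one register."""
    a, b = reg_span(a_start, mvl), reg_span(b_start, mvl)
    return a.start < b.stop and b.start < a.stop


# With MVL=8, vectors at r8 and r12 collide; vectors at r8 and r16 do not.
collide = overlaps(8, 12, 8)
separate = overlaps(8, 16, 8)
```

Because MVL is an immediate, a compiler could in principle run this kind of check statically, which is the point the paragraph above makes.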
+
+Whereas RVV's MAXVL is a hardware limit, SV's MVL is simply a loop
+optimization. It carries no architectural side-effects, although on
+a specific CPU it may affect hardware unit usage.
+
+Other than being able to set MVL, SV's VL (Vector Length) works just like
+RVV's VL, with one minor twist. RVV permits the `setvl` instruction to
+set VL to an arbitrary value. In SV, by contrast, within the limit of MVL,
+VL **MUST** be set to the requested explicit value. Given that RVV only
+works on Vector Loops, its approach is fine and part of its value and
+design. However, SV sits on top
+of the standard register files. When MVL=VL=2, a Vector Add on `r3`
+will perform two Scalar Adds: one on `r3` and one on `r4`.
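The MVL=VL=2 case can be modelled as a plain loop over register numbers. This is a minimal sketch, assuming all three operands are Vectors and that no predication or element-width override is in effect. The `vector_add` helper and the 32-entry list standing in for the regfile are hypothetical.

```python
def vector_add(regs, rt, ra, rb, vl):
    """Issue `vl` scalar adds, stepping every register number by one each time."""
    for i in range(vl):
        regs[rt + i] = regs[ra + i] + regs[rb + i]


regs = list(range(32))  # hypothetical regfile where regs[n] starts out equal to n
vector_add(regs, rt=3, ra=10, rb=20, vl=2)
# r3 now holds r10 + r20 and r4 holds r11 + r21; r5 is untouched.
```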
+
+Thus there is the opportunity to set VL to an explicit value (within the
+limits of MVL) with the reasonable expectation that if two operations
+are requested (by setting VL=2) then two operations are guaranteed.
+This avoids the need for a loop (with not-insignificant use of the
+regfiles for counters), simply two instructions:
+
+    setvli r0, MVL=64, VL=64
+    ld r0.v, 0(r30) # load exactly 64 registers from memory
+
+Page Faults etc. aside, this is *guaranteed* 100% without fail to perform
+64 unit-strided LDs starting from the address pointed to by r30 and put
+the contents into r0 through r63. Thus it becomes a "LOAD-MULTI". Twin
+Predication could even be used to only load relevant registers from
+the stack. This *only works if VL is set to the requested value* rather
+than, as in RVV, allowing the hardware to set VL to an arbitrary value
+(the caveat being that it may not exceed MVL).
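The two-instruction LOAD-MULTI above can be sketched as follows. This is an illustrative model only. It assumes 8-byte elements for `ld`, a base address read once before the loop, a dictionary standing in for memory, and clamping to MVL when the request exceeds it. `setvl` and `load_multi` are hypothetical helper names, not the specification's pseudocode.

```python
def setvl(requested_vl, mvl):
    """Return the new VL: exactly the requested value, limited only by MVL."""
    return min(requested_vl, mvl)


def load_multi(regs, memory, rt, base_addr, vl, width=8):
    """Load `vl` consecutive doublewords into regs[rt], regs[rt + 1], ..."""
    for i in range(vl):
        regs[rt + i] = memory[base_addr + i * width]


regs = [0] * 128                    # hypothetical regfile large enough for r0..r63
base_addr = 0x1000                  # stands in for the address held in r30
memory = {base_addr + 8 * i: 100 + i for i in range(64)}

vl = setvl(64, mvl=64)              # VL is 64 because 64 was requested
load_multi(regs, memory, rt=0, base_addr=base_addr, vl=vl)
```

If `setvl` were free to return a smaller value, as the text says RVV hardware may, some of r0 through r63 would be left unloaded and the sequence could not be relied on as a LOAD-MULTI.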
+
+Also available is the option to set VL from CTR (`VL = MIN(CTR, MVL)`).
+In combination with SVP64 [[sv/branches]] this can save one instruction
+inside critical inner loops.
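The `VL = MIN(CTR, MVL)` rule lends itself to a strip-mined loop, sketched below. The sketch assumes CTR holds the number of elements still to process and is reduced by VL on each pass. That decrement is an assumption of this example, not something stated here, and `strip_mine` is a hypothetical name.

```python
def strip_mine(total_elements, mvl):
    """Return the VL chosen on each pass of a loop driven by CTR."""
    ctr = total_elements
    vls = []
    while ctr > 0:
        vl = min(ctr, mvl)  # VL = MIN(CTR, MVL)
        vls.append(vl)
        ctr -= vl           # assumption: the loop consumes VL elements per pass
    return vls


passes = strip_mine(10, 4)  # two full passes of 4, then a tail of 2
```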