Incrementing and iteration through subvector state ssubstep and dsubstep is
possible with `sv.svstep/vecN` where as expected N may be 2/3/4. However it is necessary
to use the exact same Sub-Vector qualifier on any Prefixed
-instructions, within any given Vertical-First loop. Also valid
+instructions, within any given Vertical-First loop: `vec2/3/4` is **not**
+automatically applied to all instructions; it must be explicitly applied on
+a per-instruction basis. Also valid
is not specifying a Sub-vector
qualifier at all, but it is critically important to note that
operations will be repeated. For example if `sv.svstep/vec2`
-is used on `sv.addi` then each Vector element operation is
+is used but `sv.addi` has no `vec2` qualifier then each Vector element operation is
repeated twice. The reason is that whilst svstep will be
iterating through both the SUBVL and VL loops, the addi instruction
-only uses `srcstep` and `dststep`.
+only uses `srcstep` and `dststep` (not `ssubstep` or `dsubstep`). Illustrated below:
+
+```
+ def offset():
+     for step in range(VL):
+         for substep in range(SUBVL=2):
+             yield step, substep
+ for i, j in offset():
+     vec2_offs = i * SUBVL + j # calculate vec2 offset
+     addi RT+i, RA+i, 1 # but sv.addi is not vec2!
+     muli/vec2 RT+vec2_offs, RA+vec2_offs, 2 # this is vec2
+```
+
+Actual assembler would be:
+
+```
+ loop:
+ setvl VF=1, CTRmode
+ sv.addi *RT, *RA, 1 # no vec2
+ sv.muli/vec2 *RT, *RA, 2 # vec2
+ sv.svstep/vec2 # must match the muli
+ sv.bc CTRmode, loop # subtracts VL from CTR
+```
+
+This illustrates the correct but seemingly-anomalous behaviour: `sv.svstep/vec2`
+is being requested to update `SVSTATE` to follow a vec2 loop construct. The anomalous
+`sv.addi` is not prohibited, as it may in fact be desirable to execute operations twice
+or to re-load data that was overwritten, among other possibilities.
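
The repetition can be checked with a small runnable sketch. It is only a model of the iteration order described above, assuming that `sv.svstep/vec2` advances the substep as the innermost loop; `vf_steps` and `simulate` are hypothetical names chosen for illustration and are not part of any simulator or toolchain. It counts how often each element index is touched by an unqualified instruction (step only) versus a `vec2` instruction (step and substep).

```python
from collections import Counter


def vf_steps(vl, subvl):
    """Yield (step, substep) pairs, substep innermost (assumed svstep/vecN order)."""
    for step in range(vl):
        for substep in range(subvl):
            yield step, substep


def simulate(vl=4, subvl=2):
    """Count element accesses for an unqualified op and a vec2-qualified op."""
    addi_hits = Counter()  # models sv.addi: indexes by step only
    muli_hits = Counter()  # models sv.muli/vec2: indexes by step and substep
    for step, substep in vf_steps(vl, subvl):
        addi_hits[step] += 1
        muli_hits[step * subvl + substep] += 1
    return addi_hits, muli_hits


addi_hits, muli_hits = simulate()
print("unqualified op, hits per element:", dict(addi_hits))
print("vec2 op, hits per element:", dict(muli_hits))
```

With these assumptions every element index seen by the unqualified operation is visited `subvl` times, while each of the `vl * subvl` sub-elements seen by the `vec2` operation is visited once.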
-------------