+# Vertical-First Mode
+
+This is a relatively new addition to SVP64, under development as of
+July 2021. Where Horizontal-First is the standard Cray-style for-loop,
+Vertical-First typically executes just the **one** scalar element
+in each Vectorised operation. That element is selected by srcstep
+and dststep, *neither of which is changed as a side-effect of execution*.
+To create loops, a new instruction `svstep` must be called
+explicitly, with Rc=1. This is illustrated in the following pseudocode,
+which uses a branch to form the loop:
+
+```
+loop:
+ sv.addi r0.v, r8.v, 5 # GPR(0+dststep) = GPR(8+srcstep) + 5
+ sv.addi r0.v, r8, 5 # GPR(0+dststep) = GPR(8 ) + 5
+ sv.addi r0, r8.v, 5 # GPR(0 ) = GPR(8+srcstep) + 5
+ svstep. # srcstep++, dststep++, CR0.eq = srcstep==VL
+ bne loop # repeat until CR0.eq is set (srcstep==VL)
+```
+
+Three examples of different types of Scalar-Vector operations are
+illustrated. Note that in its simplest form **only one** element is
+executed per instruction, **not** multiple elements per instruction.
+(The more advanced version of Vertical-First mode may execute multiple
+elements per instruction; however, the number executed **must** remain
+a fixed quantity.)
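
The difference between the two modes can be sketched as a small Python model. This is only an illustration under simplifying assumptions, not the SVP64 specification: the names `State`, `sv_addi`, `svstep`, `run_vertical_first` and `horizontal_first_addi` are hypothetical, the register file is a plain list, and predication, element width and every other SVP64 feature are ignored. Only the fully vectorised form (`sv.addi r0.v, r8.v, 5`) is exercised by the loop.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Toy register file plus the step counters named in the text."""
    vl: int
    gpr: list = field(default_factory=lambda: [0] * 32)
    srcstep: int = 0
    dststep: int = 0


def sv_addi(st, rt, rt_vec, ra, ra_vec, imm):
    """Vertical-First model: process ONE element, leave the steps untouched."""
    src = ra + (st.srcstep if ra_vec else 0)
    dst = rt + (st.dststep if rt_vec else 0)
    st.gpr[dst] = st.gpr[src] + imm


def svstep(st):
    """Advance both steps and return the modelled CR0.eq (srcstep == VL)."""
    st.srcstep += 1
    st.dststep += 1
    return st.srcstep == st.vl


def run_vertical_first(st, imm):
    """Explicit loop: one element per pass, repeated until svstep reports VL."""
    while True:
        sv_addi(st, rt=0, rt_vec=True, ra=8, ra_vec=True, imm=imm)
        if svstep(st):
            break


def horizontal_first_addi(st, rt, ra, imm):
    """Horizontal-First model: one instruction loops over all VL elements."""
    for i in range(st.vl):
        st.gpr[rt + i] = st.gpr[ra + i] + imm


if __name__ == "__main__":
    st = State(vl=4)
    st.gpr[8:12] = [10, 20, 30, 40]
    run_vertical_first(st, imm=5)
    print(st.gpr[0:4])
```

In this model both modes leave the same values in the destination registers. What differs is where the loop lives: inside the instruction for Horizontal-First, and in the surrounding program for Vertical-First.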
+
+Because such explicit loops increment srcstep and dststep towards VL,
+a way is needed to test whether either has reached
+VL. This is achieved in one of two ways. First, [[sv/svstep]] has an Rc=1 mode
+in which CR0 is updated when VL is reached, and a standard v3.0B Branch
+Conditional may then test it. Alternatively, the number of elements
+may be transferred into CTR, as is standard practice in Power ISA.
+Here, SVP64 [[sv/branches]] have a mode which allows CTR to be decremented
+by the number of vertical elements executed.
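
The CTR-based alternative can be modelled in the same spirit. The sketch below is a guess at the general shape only: `run_with_ctr` and `elements_per_step` are hypothetical names, and the text above does not say how srcstep and dststep advance in this mode, so the model simply moves a single `step` counter forward alongside CTR. That pairing is an assumption, not documented behaviour.

```python
def run_with_ctr(values, imm, elements_per_step=1):
    """Hypothetical CTR-driven Vertical-First loop over a list of elements.

    Assumes len(values) is a positive multiple of elements_per_step,
    matching the requirement that the number executed stays fixed.
    """
    if elements_per_step < 1 or not values or len(values) % elements_per_step:
        raise ValueError("length must be a positive multiple of elements_per_step")

    ctr = len(values)   # number of elements transferred into CTR
    step = 0            # stands in for srcstep/dststep (assumed to advance together)
    out = [0] * len(values)

    while True:
        # Loop body: a fixed number of elements is executed per pass.
        for i in range(step, step + elements_per_step):
            out[i] = values[i] + imm
        step += elements_per_step
        # Modelled branch: CTR drops by the number of elements executed,
        # and the loop is taken again while CTR is non-zero.
        ctr -= elements_per_step
        if ctr == 0:
            break
    return out


if __name__ == "__main__":
    print(run_with_ctr([10, 20, 30, 40], imm=5))
```

Compared with testing CR0 after `svstep`, this style keeps the termination test in the branch itself, which is why the element count has to be loaded into CTR before the loop starts.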
+