may be assumed to be 4. SUBVL is considered to be the "source" subvector
length.
+Pseudocode exploiting python "yield" for clarity: element-width overrides,
+Saturation and Predication also left out, for clarity:
+
```
def index_src():
for i in range(VL):
else:
yield (i*SUBVL, swiz[j]-3)
- # yield an outer-SUBVL, inner VL loop with DEST SUBVL
def index_dest():
for i in range(VL):
for j in range(dst_subvl):