be performed, followed by a VSPLAT-augmented mv, copying the one *scalar*
value into multiple register destinations.
-Note also that cache-inhibited VSPLAT with Predicate-result is possible.
+Note also that cache-inhibited VSPLAT with Data-Dependent Fail-First is possible.
This allows for example to issue a massive batch of memory-mapped
peripheral reads, stopping at the first NULL-terminated character and
truncating VL to that point. No branch is needed to issue that large
break # stop looping
```
-**Data-Dependent Fault-First on Store-Conditional**
+**Data-Dependent Fail-First on Store-Conditional (Rc=1)**
There are very few instructions that allow Rc=1 for Load/Store:
one of those is the `stdcx.` and other Atomic Store-Conditional
-instructions. It should be self-evident that being able to
-Vectorise and then truncate a sequence of Atomic Store-Conditional
-operations at the point where a store was not performed, should
-be pretty important.
+instructions. Because Simple-V is a loop around Scalar instructions
+that strictly obeys Scalar Program Order, a Fail-First loop on an
+Atomic Store-Conditional in Horizontal-First Mode will always fail
+the second and all subsequent Store-Conditional instructions:
+Load-Reservation and Store-Conditional are required to be executed
+in pairs.
+
+By contrast, in Vertical-First Mode it is possible to issue the
+pairs, so allowing Vectorised Data-Dependent Fail-First is useful.
+Care should be taken, however, when VL is truncated in Vertical-First
+Mode.
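
The difference between the two modes can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of any real implementation: `ReservationModel`, `horizontal_first`, `vertical_first` and `failfirst_vl` are hypothetical names, the model assumes a single reservation per hardware thread, it treats a Store-Conditional to a non-reserved address as a failure, and it assumes Fail-First truncates VL so that the failing element is excluded.

```python
class ReservationModel:
    """Toy model with a single reservation (an assumption of this sketch)."""

    def __init__(self):
        self.reserved_addr = None
        self.memory = {}

    def load_reserve(self, addr):
        # A new Load-Reservation replaces any earlier reservation.
        self.reserved_addr = addr
        return self.memory.get(addr, 0)

    def store_conditional(self, addr, value):
        # Succeeds only if the reservation matches; always consumes it.
        ok = self.reserved_addr == addr
        if ok:
            self.memory[addr] = value
        self.reserved_addr = None
        return ok


def horizontal_first(addrs, values):
    """All element Load-Reservations, then all element Store-Conditionals."""
    m = ReservationModel()
    for a in addrs:
        m.load_reserve(a)
    return [m.store_conditional(a, v) for a, v in zip(addrs, values)]


def vertical_first(addrs, values):
    """One Load-Reservation / Store-Conditional pair per element."""
    m = ReservationModel()
    results = []
    for a, v in zip(addrs, values):
        m.load_reserve(a)
        results.append(m.store_conditional(a, v))
    return results


def failfirst_vl(results):
    """Index of the first failing element (assumed excluded from VL)."""
    for i, ok in enumerate(results):
        if not ok:
            return i
    return len(results)


addrs = [0x100, 0x108, 0x110, 0x118]
values = [1, 2, 3, 4]
hf = horizontal_first(addrs, values)
vf = vertical_first(addrs, values)
print("Horizontal-First:", hf, "VL ->", failfirst_vl(hf))
print("Vertical-First:  ", vf, "VL ->", failfirst_vl(vf))
```

In this model every Store-Conditional after the first fails in the Horizontal-First ordering because the reservation has already been consumed, whereas the Vertical-First ordering keeps each pair together. Whether the first element succeeds in Horizontal-First ordering depends on the model's assumption about mismatched addresses.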
## LOAD/STORE Elwidths <a name="elwidth"></a>