Data-dependent fail-on-first has two distinct variants: one for LD/ST,
the other for arithmetic operations (actually, CR-driven). Note in each
-case the assumption is that vector elements are required appear to be
-executed in sequential Program Order, element 0 being the first.
-
-* Data-driven (CR-driven) fail-on-first activates when Rc=1 or other
- CR-creating operation produces a result (including cmp). Similar to
- branch, an analysis of the CR is performed and if the test fails, the
- vector operation terminates and discards all element operations at and
- above the current one, and VL is truncated to either
- the *previous* element or the current one, depending on whether
- VLi (VL "inclusive") is set.
+case the assumption is that vector elements are required to appear to be
+executed in sequential Program Order. When REMAP is not active,
+element 0 would be the first.
+
+Data-driven (CR-driven) fail-on-first activates when Rc=1 or another
+CR-creating operation produces a result (including cmp). Similar to
+branch, an analysis of the CR is performed and if the test fails, the
+vector operation terminates and discards all element operations at and
+above the current one, and VL is truncated to either
+the *previous* element or the current one, depending on whether
+VLi (VL "inclusive") is set.
Thus the new VL comprises a contiguous vector of results,
-all of which pass the testing criteria (equal to zero, less than zero).
+all of which pass the testing criteria (equal to zero, less than zero, etc.,
+as defined by the CR-bit test).
The CR-based data-driven fail-on-first is "new" and not found in ARM
SVE or RVV. At the same time it is "old" because it is almost
against the `inv` field).
If the CR.eq bit is equal to `inv` then the Vector is truncated and
the loop ends.
-Note that when RC1=1 the result elements are never stored, only the CRs.
+Note that when RC1=1 the result elements are never stored, only the CR
+Fields.
VLi is only available as an option when `Rc=0` (or for instructions
which do not have Rc). When set, the current element is always
also included in the count (the new length that VL will be set to).
This may be useful in combination with "inv" to truncate the Vector
-to `exclude` elements that fail a test, or, in the case of implementations
+to *exclude* elements that fail a test, or, in the case of implementations
of strncpy, to include the terminating zero.
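
The truncation rule described above can be sketched in a few lines of Python. This is an illustrative model only, not the specification's pseudocode: the function name `ffirst_new_vl` and its parameters are hypothetical, the CR.eq bit is assumed to be set when an element's result equals zero, and only the computation of the new VL is modelled (not which results are written back).

```python
def ffirst_new_vl(results, vl, inv=False, vli=False):
    """Return the truncated VL for CR-based data-dependent fail-first.

    results: per-element results, in sequential Program Order
    vl:      the Vector Length before truncation
    inv:     value the CR.eq bit is tested against
    vli:     VL "inclusive": count the failing element as well
    """
    for i in range(vl):
        # Assumption: CR.eq is set when the element result is zero.
        cr_eq = (results[i] == 0)
        if cr_eq == inv:
            # Test fails here: truncate to the previous element,
            # or to the current one when VLi is set.
            return i + 1 if vli else i
    # No element failed the test: VL is left unchanged.
    return vl


# strncpy-like use: stop at the terminating zero byte.
data = [104, 105, 0, 120, 120]            # "hi\0xx"
print(ffirst_new_vl(data, 5, inv=True))            # zero excluded
print(ffirst_new_vl(data, 5, inv=True, vli=True))  # zero included
```

With `inv=True` the scan ends at the first zero element, and setting `vli` extends the count by one so that the terminating zero is included, matching the strncpy case mentioned above.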
In CR-based data-driven fail-on-first there is only the option to select
* LD/ST Failfirst may (beyond the initial first element
conditions) truncate VL for any architecturally
- suitable reason.
+  suitable reason. Beyond the first element, LD/ST Failfirst is
+  arbitrarily speculative and 100% non-deterministic.
* CR-based data-dependent first on the other hand MUST NOT truncate VL
arbitrarily to a length decided by the hardware: VL MUST only be
truncated based explicitly on whether a test fails.
-This because it is a precise test on which algorithms
-will rely.
+This is because it is a precise Deterministic test on which algorithms
+can and will rely.
**Floating-point Exceptions**
Operations that actually produce or alter CR Field as a result
have their own SVP64 Mode, described
-in [[sv/cr_ops]]
+in [[sv/cr_ops]].
# pred-result mode