* <https://bugs.libre-soc.org/show_bug.cgi?id=687>
* <https://bugs.libre-soc.org/show_bug.cgi?id=936> write on failfirst
+* <https://bugs.libre-soc.org/show_bug.cgi?id=1183> enable mapreduce with failfirst
* [[svp64]]
* [[sv/branches]]
* [[sv/cr_int_predication]]
interesting conceptual challenges for SVP64, which was designed
primarily for vectors of arithmetic and logical operations. However
if predicates may be bits of CR Fields it makes sense to extend
-Simple-V to cover CR Operations, especially given that Vectorised Rc=1
-may be processed by Vectorised CR Operations that usefully in turn
+Simple-V to cover CR Operations, especially given that Vectorized Rc=1
+may be processed by Vectorized CR Operations that usefully in turn
may become Predicate Masks to yet more Vector operations, like so:
```
operations are firmly out of scope for this section, being covered fully
by [[sv/normal]].
-* Examples of v3.0B instructions to which this section does
+* Examples of Vectorizeable Defined Word-instructions to which this section does
apply are
- `mfcr` and `cmpi` (3 bit operands) and
- `crnor` and `crand` (5 bit operands).
Reduction is useful for analysing a Vector of Condition Register Fields
and reducing it to one single Condition Register Field.
-Predicate-result does not make any sense because when Rc=1 a co-result
-is created (a CR Field). Testing the co-result allows the decision to
-be made to store or not store the main result, and for CR Ops the CR
-Field result *is* the main result.
+Special attention should be paid to the difference between Data-Dependent Fail-First
+on CR operations and [[openpower/sv/normal]] regarding the seemingly-contradictory
+behaviour of `Rc=1,VLi=0`. This is explained below.
## Format
a CR bit and whether it is set (inv=0) or unset (inv=1)
* **RG** Reverse-Gear: inverts the Vector Loop order (VL-1 downto 0) rather
than the normal 0 upto VL-1
-* **SVM** sets "subvector" reduce mode
* **VLi** VL inclusive: in fail-first mode, the truncation of
VL *includes* the current element at the failure point rather
than excludes it from the count.
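
The effect of **VLi** on truncation can be sketched in Python. This is a minimal illustrative model, not the specification pseudocode: the function name `failfirst_truncate` and the list of per-element test outcomes are hypothetical, and Reverse-Gear ordering is deliberately left out.

```python
def failfirst_truncate(test_passed, vl, vli):
    """Return the new VL after a data-dependent fail-first scan.

    test_passed[i] is True when element i passes the CR-bit test
    (hypothetical input, standing in for the BO/inv test on each result).
    """
    for i in range(vl):
        if not test_passed[i]:
            # VLi=1 includes the failing element in the count,
            # VLi=0 excludes it.
            return i + 1 if vli else i
    return vl  # no element failed: VL is left unchanged


outcomes = [True, True, False, True]  # element 2 is the failure point
vl_exclusive = failfirst_truncate(outcomes, 4, vli=False)
vl_inclusive = failfirst_truncate(outcomes, 4, vli=True)
```

With the failure at element 2, the model truncates VL to 2 when VLi=0 and to 3 when VLi=1.
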
decision. However with CR-based operations that CR Field result to be
tested is provided *by the operation itself*.
-Data-dependent SVP64 Vectorised Operations involving the creation
+Data-dependent SVP64 Vectorized Operations involving the creation
or modification of a CR can require an extra two bits, which are not
available in the compact space of the SVP64 RM `MODE` Field. With the
concept of element width overrides being meaningless for CR Fields it
[[sv/ldst]], be set to an arbitrary value. Deterministic behaviour
is *required*.
+It is also important to note that reduce mode is implied by Data-Dependent Fail-First.
+Normally, if the destination is Scalar, looping terminates at the first result.
+With Data-Dependent Fail-First the looping instead *continues*,
+just as it does in reduce mode. This effectively allows *conditional*
+reduction (one register is both a source and destination), where testing
+each result gives an option to exit.
+
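The conditional reduction described above can be sketched as follows. This is a simplified model under stated assumptions, not the specification pseudocode: `conditional_reduce`, `op` and `test` are hypothetical names, each CR Field is reduced to a single bit, and it is assumed that with VLi=0 the failing element's result is not written to the scalar destination.

```python
def conditional_reduce(acc, src, vl, vli, op, test):
    """Model fail-first with a scalar destination that is also a source.

    Returns (final accumulator value, truncated VL).
    """
    for i in range(vl):
        new = op(acc, src[i])
        if not test(new):
            if vli:
                # Assumption: VLi=1 keeps the failing element's result.
                return new, i + 1
            # Assumption: VLi=0 discards the failing element's result.
            return acc, i
        acc = new  # test passed: the loop continues, as in reduce mode
    return acc, vl


def bit_and(a, b):
    return a & b


def is_set(bit):
    return bit == 1


src_bits = [1, 1, 0, 1]
reduced_excl = conditional_reduce(1, src_bits, 4, False, bit_and, is_set)
reduced_incl = conditional_reduce(1, src_bits, 4, True, bit_and, is_set)
```

In this model the AND-reduction exits at element 2, where the accumulated bit first becomes 0, rather than running over all four elements.
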
+**Apparently contradictory behaviour compared to Rc=1,VLi=0**
+
+In [[openpower/sv/normal]] mode when Rc=1 and VLi=0 the Vector of
+co-results appears to ignore VLi=0 because the last CR Field co-result
+element tested is written out regardless of the setting of VLi.
+This is because when Rc=1 the CR Fields are co-results *not* actual
+results.
+
+When looking at the *actual* number of results written (arithmetic
+results on arithmetic operations vs CR-Field results on *CR-Field*
+operations), and ignoring the Rc=1 co-results entirely,
+the totals (the behaviours) are consistent whether
+VLi=0 or VLi=1.
+
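The counting argument above can be illustrated with a small model. It is a sketch under assumptions drawn only from the two preceding paragraphs: `written_counts` and its parameters are hypothetical, and the Rc=1 co-result count is taken to be every element tested, up to and including the failure point.

```python
def written_counts(fail_index, vl, vli, rc1_normal_mode):
    """Return (results_written, co_results_written) for one fail-first run.

    fail_index is the first failing element, or None if none fails.
    rc1_normal_mode=True models an arithmetic operation with Rc=1;
    False models a CR operation, whose CR Field *is* the result.
    """
    if fail_index is None or fail_index >= vl:
        new_vl, tested = vl, vl
    else:
        new_vl = fail_index + 1 if vli else fail_index
        tested = fail_index + 1  # the failing element is always tested
    results = new_vl
    # Co-results exist only when Rc=1 creates them alongside a main result.
    co_results = tested if rc1_normal_mode else 0
    return results, co_results


normal_vli0 = written_counts(2, 4, vli=False, rc1_normal_mode=True)
normal_vli1 = written_counts(2, 4, vli=True, rc1_normal_mode=True)
crop_vli0 = written_counts(2, 4, vli=False, rc1_normal_mode=False)
crop_vli1 = written_counts(2, 4, vli=True, rc1_normal_mode=False)
```

Comparing only the first number of each pair, the count of actual results matches between the two modes for a given VLi. Only the co-result count in the Rc=1 case stays the same regardless of VLi.
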
+*Programmer's Note: Data-dependent fail-first stores an updated
+VL in the SVSTATE SPR, not in any GPR. If needed,
+VL may be obtained by using the alias `getvl`.*
+
## Reduction and Iteration
Bearing in mind as described in the [[svp64/appendix]] SVP64 Horizontal
is a much easier proposition to consider.
The prohibitions utilise the CR Field numbers implicitly to
-split out Vectorised CR operations to be considered completely
+split out Vectorized CR operations to be considered completely
separate and distinct from Scalar CR operations *even though
they both use the same binary encoding*. This does in turn
mean that at the Decode Phase it becomes necessary to examine
not only the operation (`sv.crand`, `sv.cmp`) but also
the CR Field numbers as well as whether, in the EXTRA2/3 Mode
-bits, the operands are Vectorised.
+bits, the operands are Vectorized.
A future version of Power ISA, where SVP64Single is proposed,
would in fact introduce "Conditional Execution", including