- mr OR crm: "normal" map-reduce mode or CR-mode.
- mr.svm OR crm.svm: when vec2/3/4 set, sub-vector mapreduce is enabled
-# Proposed Parallel-reduction algorithm
+# Parallel-reduction algorithm
The principle of SVP64 is that it is a fully-independent
abstraction of hardware-looping, in between the issue and execute phases,
that has no relation to the operation it issues.
Additional state cannot be saved on context-switching beyond that
-of SVSTATE.
+of SVSTATE, making parallel reduction slightly tricky to implement.
Executable demo pseudocode, full version
[here](https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv/preduce.py;hb=HEAD)
return vec
```
-In this version the need for an explicit MV is made unnecessary by instead
-leaving elements *in situ*. The internal modifications to the predicate may,
-due to the reduction being entirely deterministic, be "reconstructed"
-on a context-switch. This may make some implementations slower.
+This algorithm works by noting when data remains in-place rather than
+being reduced, and referring to that alternative position on subsequent
+layers of reduction. It is re-entrant. If, however, it is interrupted and
+restored, some implementations may take longer to re-establish the
+context.
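
The in-place behaviour described above can be illustrated with a minimal Python sketch. It is not the linked `preduce.py`: the function name `preduce_in_situ`, the index table `ix`, and the use of addition as the reduction operation are all assumptions made for illustration only.

```python
def preduce_in_situ(vec, pred):
    """Illustrative tree reduction that leaves unreduced data in place.

    Hypothetical sketch: `ix` records where the live value of each
    tree position currently resides, so no element is ever moved.
    Addition stands in for whatever operation is being reduced.
    """
    vl = len(vec)
    vec = list(vec)          # work on a copy, leave the caller's data intact
    ix = list(range(vl))     # ix[i]: element index holding position i's value
    step = 1
    while step < vl:
        step *= 2
        for i in range(0, vl, step):
            other = i + step // 2
            if other >= vl:
                continue     # no partner at this layer
            ci, oi = ix[i], ix[other]
            if pred[ci] and pred[oi]:
                vec[ci] += vec[oi]   # both active: reduce into ci
            elif pred[oi]:
                ix[i] = oi           # only partner active: refer to it instead
    return vec


print(preduce_in_situ([1, 2, 3, 4], [1, 1, 1, 1]))
print(preduce_in_situ([1, 2, 3, 4], [0, 1, 0, 1]))
```

With a full mask the total accumulates in element 0. With the mask `[0, 1, 0, 1]` the total accumulates in element 1, the first active element, and the masked-out elements are untouched. With a mask that has a single bit set, or none, no reduction takes place in this sketch and the vector is returned unchanged.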
+
+By default it is applied as follows:
+
+* RA, FRA or BFA is the first operand register
 (the ci index offset in the above pseudocode)
+* RB, FRB or BFB is the second (co index offset)
+* RT (result) also uses ci **if RA==RT**
+
+For more complex applications a REMAP Schedule must be used.
+
+*Programmer's note:
+if passed a predicate mask with only one bit set, this algorithm
+takes no action, similar to when a predicate mask is all zero.*
*Implementor's Note: many SIMD-based Parallel Reduction Algorithms are
implemented in hardware with MVs that ensure lane-crossing is minimised.