Mode is an augmentation of SV behaviour, providing additional
functionality. Some of these alterations are element-based (saturation),
-others involve post-analysis (predicate result) and others are
-Vector-based (mapreduce, fail-on-first).
+others are Vector-based (mapreduce, fail-on-first).
[[sv/ldst]], [[sv/cr_ops]] and [[sv/branches]] are covered separately:
the following Modes apply to Arithmetic and Logical SVP64 operations:
is performed. See [[svp64/appendix]].
Note that there are comprehensive caveats when using this mode,
and it should not be confused with the Parallel Reduction [[sv/remap]].
+ Care is also needed with `hphint`.
Note that ffirst and reduce modes are not anticipated to be
high-performance in some implementations. ffirst due to interactions
-with VL, and reduce due to it requiring additional operations to produce
-a result. simple and saturate are however inter-element
+with VL, and reduce due to it creating overlapping operations in
+many of its uses. simple and saturate are however inter-element
independent and may easily be parallelised to give high performance,
regardless of the value of VL.
Dependency chain as long as Sequential Program Execution Order is preserved.
Easy examples include Reduction on Logical OR or AND operations.*
+**Horizontal Parallelism Hint**
+
+`SVSTATE.hphint` declares to hardware that groups of elements up to this
+size are 100% independent (free of all Hazards inter-element but not inter-group).
+Because Reduction creates Dependency Hazards on every element-level
+sub-instruction, it is clear that setting `hphint` *at all* would cause
+data corruption. However `sv.add *r0, *r4, *r0`, for example, clearly
+leaves room for four parallel elements. Programmers must
+be aware of this and exercise caution.
+
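As a rough illustration of the caution above, the following Python sketch models a vectorised three-operand operation as a list of element-level register reads and writes, then checks whether any two elements inside the same group conflict. Everything here is an assumption made for illustration only: the function names, the idea that groups are aligned runs of `hphint` elements starting at element 0, and the simplified hazard model based purely on register numbers. It is not taken from the specification.

```python
def element_ops(rt, ra, rb, vl, rt_vec=True, ra_vec=True, rb_vec=True):
    """Return one (written_reg, read_regs) pair per element.

    A vector operand advances by one register per element,
    a scalar operand stays on the same register (illustrative model).
    """
    ops = []
    for i in range(vl):
        wr = rt + (i if rt_vec else 0)
        rd = {ra + (i if ra_vec else 0), rb + (i if rb_vec else 0)}
        ops.append((wr, rd))
    return ops


def group_is_hazard_free(ops):
    """True if no element writes a register that a different element
    in the same group reads or writes."""
    for i, (wr_i, _) in enumerate(ops):
        for j, (wr_j, rd_j) in enumerate(ops):
            if i != j and (wr_i == wr_j or wr_i in rd_j):
                return False
    return True


def hphint_is_safe(rt, ra, rb, vl, hphint, **vec_flags):
    """Check every aligned group of `hphint` elements independently."""
    ops = element_ops(rt, ra, rb, vl, **vec_flags)
    return all(group_is_hazard_free(ops[s:s + hphint])
               for s in range(0, vl, hphint))


def max_safe_group(rt, ra, rb, vl, **vec_flags):
    """Largest group size for which every group is hazard-free."""
    best = 1
    for size in range(1, vl + 1):
        if hphint_is_safe(rt, ra, rb, vl, size, **vec_flags):
            best = size
    return best


# sv.add *r0, *r4, *r0 with VL=8: element 4 writes r4, which element 0 reads,
# so groups of up to four elements are independent under this model.
print(max_safe_group(0, 4, 0, 8))
# A scalar accumulator used as both destination and source conflicts
# on every element, leaving no room for grouping.
print(max_safe_group(3, 4, 3, 8, rt_vec=False, rb_vec=False))
```

Under this simplified model the first case permits groups of four and the second permits only single-element groups, which matches the reasoning in the paragraph above.
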
## Data-dependent Fail-on-first
Data-dependent fail-on-first is CR-field-driven and is completely separate
* LDST ffirst may never set VL equal to zero. This is because on the first
element an exception must be raised "as normal".
* CR-based data-dependent ffirst on the other hand **can** set VL equal
- to zero. This is the only means in the entirety of SV that VL may be set
- to zero (with the exception of via the SV.STATE SPR). When VL is set
+ to zero. When VL is set
 to zero due to the first element failing the CR bit-test, all subsequent
vectorised operations are effectively `nops` which is
*precisely the desired and intended behaviour*.
* CR-based data-dependent ffirst on the other hand MUST NOT truncate VL
 arbitrarily to a length decided by the hardware: VL MUST only be
 truncated based explicitly on whether a test fails. This is because it is
- a precise Deterministic test on which algorithms can and will will rely.
+ a precise Deterministic test on which algorithms can and will rely.
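
The deterministic truncation described in the list above can be sketched in a few lines of Python. This is a minimal model under stated assumptions: `ddffirst_vl` and `test` are hypothetical names, `test` stands in for the CR bit-test applied to each element's result, and the question of which element results are written back is deliberately not modelled.

```python
def ddffirst_vl(results, test, vl):
    """Return the truncated VL: the index of the first element whose
    test fails, or the original VL if every element passes.

    Elements are examined strictly in order, so the outcome depends
    only on the data, never on an implementation-chosen length.
    """
    for i in range(vl):
        if not test(results[i]):
            return i  # may be 0 if the very first element fails
    return vl


def is_nonzero(x):
    # Hypothetical stand-in for a CR bit-test such as "result is not zero".
    return x != 0


print(ddffirst_vl([3, 2, 0, 5], is_nonzero, 4))  # third element fails
print(ddffirst_vl([0, 2, 7, 5], is_nonzero, 4))  # first element fails, VL becomes 0

# With VL truncated to zero, a later vectorised loop has no elements to process.
new_vl = ddffirst_vl([0, 2, 7, 5], is_nonzero, 4)
processed = [i for i in range(new_vl)]
print(processed)
```
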
**Floating-point Exceptions**