is performed. See [[svp64/appendix]].
Note that there are comprehensive caveats when using this mode,
and it should not be confused with the Parallel Reduction [[sv/remap]].
+Care is also needed when setting `hphint`.
Note that ffirst and reduce modes are not anticipated to be
high-performance in some implementations. ffirst due to interactions
-with VL, and reduce due to it requiring additional operations to produce
-a result. simple and saturate are however inter-element
+with VL, and reduce because many of its uses create overlapping
+operations. simple and saturate are however inter-element
independent and may easily be parallelised to give high performance,
regardless of the value of VL.
Dependency chain as long as Sequential Program Execution Order is preserved.
Easy examples include Reduction on Logical OR or AND operations.*
+**Horizontal Parallelism Hint**
+
+`SVSTATE.hphint` declares to hardware that groups of elements up to this
+size are 100% independent. Where Reduction creates Dependency Hazards on
+every element-level sub-instruction, setting `hphint` *at all* would cause
+data corruption. However `sv.add *r0, *r4, *r0`, for example, overlaps its
+destination with a source only at an offset of four registers, and so
+leaves room for four parallel elements. Programmers must be aware of this
+and exercise caution.
+
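The reasoning above can be sketched with a small model. This is an illustration only, not the specification's definition of `hphint`. It assumes a three-operand vector operation whose element `i` writes register `rt+i` and reads `ra+i` and `rb+i`, with no predication, element-width overrides or REMAP. The function name `hazard_distance` and its parameters are hypothetical.

```python
def hazard_distance(rt, ra, rb, vl):
    """Return the smallest element distance d at which two element-level
    operations touch the same register with at least one of them writing.

    Assumed model (not taken from the specification): element i writes
    register rt+i and reads registers ra+i and rb+i. Groups of up to d
    consecutive elements are then free of hazards in this model.
    Returns vl if no pair of elements conflicts.
    """
    for d in range(1, vl):
        for i in range(vl - d):
            j = i + d
            write_i, reads_i = rt + i, {ra + i, rb + i}
            write_j, reads_j = rt + j, {ra + j, rb + j}
            # Read-after-write, write-after-read, or write-after-write.
            if write_i in reads_j or write_j in reads_i or write_i == write_j:
                return d
    return vl


# Operand offsets matching `sv.add *r0, *r4, *r0` under the assumed model.
overlap_at_four = hazard_distance(rt=0, ra=4, rb=0, vl=8)

# Hypothetical chained case: each element reads the previous element's result.
fully_chained = hazard_distance(rt=1, ra=0, rb=1, vl=8)
```

In this model the first case yields a distance of four, matching the "four parallel elements" observation, while the chained case yields one, meaning no two adjacent elements may safely run in parallel.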
## Data-dependent Fail-on-first
Data-dependent fail-on-first is CR-field-driven and is completely separate