the fundamental principle that SV is nothing more than a Sub-Program-Counter
sitting between Decode and Issue phases.
+For Scalar Reduction,
Microarchitectures *may* take opportunities to parallelise the reduction
-but only if in doing so they preserve Program Order at the Element Level.
+but only if in doing so they preserve strict Program Order at the Element Level.
Opportunities where this is possible include an `OR` operation
or a MIN/MAX operation: it may be possible to parallelise the reduction,
but for Floating Point it is not permitted due to different results