* [[svp64]]
Normal SVP64 Mode covers Arithmetic and Logical operations
-to provide suitable additional behaviour. The Mode
+to provide suitable additional behaviour. The Mode
field is bits 19-23 of the [[svp64]] RM Field.
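
As a rough illustration of where the Mode field sits, the following sketch extracts bits 19-23 from an RM value. It assumes a 24-bit RM field numbered in Power ISA MSB0 order (bit 0 is the most significant), so that bits 19-23 are the five least-significant bits. The helper names `msb0_field` and `rm_mode` are hypothetical and are not part of any SVP64 tooling.

```python
RM_WIDTH = 24  # assumption: the RM field is 24 bits wide


def msb0_field(value, start, end, width=RM_WIDTH):
    """Extract bits start..end (inclusive) using MSB0 numbering.

    In MSB0 numbering bit 0 is the most significant bit, so bit
    (width - 1) is the least significant one.
    """
    nbits = end - start + 1
    shift = width - 1 - end
    return (value >> shift) & ((1 << nbits) - 1)


def rm_mode(rm):
    """Return the 5-bit Mode field (RM bits 19-23, MSB0 assumed)."""
    return msb0_field(rm, 19, 23)
```

Under these assumptions `rm_mode(rm)` is equivalent to `rm & 0b11111`.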
Table of contents:
## Mode
Mode is an augmentation of SV behaviour, providing additional
-functionality. Some of these alterations are element-based (saturation),
+functionality. Some of these alterations are element-based (saturation),
others involve post-analysis (predicate result) and others are
Vector-based (mapreduce, fail-on-first).
[[sv/ldst]], [[sv/cr_ops]] and [[sv/branches]] are covered separately:
the following Modes apply to Arithmetic and Logical SVP64 operations:
-* **simple** mode is straight vectorisation. no augmentations: the
+* **simple** mode is straight vectorisation. No augmentations: the
vector comprises an array of independently created results.
* **ffirst** or data-dependent fail-on-first: see separate section.
- the vector may be truncated depending on certain criteria.
+ The vector may be truncated depending on certain criteria.
*VL is altered as a result*.
* **sat mode** or saturation: clamps each element result to a min/max
- rather than overflows / wraps. allows signed and unsigned clamping
+ rather than overflowing / wrapping. Allows signed and unsigned clamping
for both INT and FP.
-* **reduce mode**. if used correctly, a mapreduce (or a prefix sum)
- is performed. see [[svp64/appendix]].
- note that there are comprehensive caveats when using this mode.
+* **reduce mode**. If used correctly, a mapreduce (or a prefix sum)
+ is performed. See [[svp64/appendix]].
+ Note that there are comprehensive caveats when using this mode.
* **pred-result** will test the result (CR testing selects a bit of CR
and inverts it, just like branch conditional testing) and if the
test fails it is as if the *destination* predicate bit was zero even
* **sz / dz**: if predication is enabled, zeros are put into the dest
(or as src in the case of twin pred) when the predicate bit is zero.
- otherwise the element is ignored or skipped, depending on context.
+ Otherwise the element is ignored or skipped, depending on context.
* **zz**: both sz and dz are set equal to this flag
* **inv CR bit**: just as in branches (BO), these bits allow testing of
a CR bit and whether it is set (inv=0) or unset (inv=1)
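
The difference between zeroing and skipping in the **sz / dz** entry can be sketched for destination predication as follows. This is a simplified model only: the function name `predicated_add` is hypothetical, the predicate is assumed to be an integer mask whose bit `i` (counting from the least-significant end) controls element `i`, and twin predication is not modelled.

```python
def predicated_add(dest, src_a, src_b, pred_mask, dz):
    """Element-wise add under a destination predicate.

    pred_mask bit i (LSB-first, an assumption of this sketch) enables
    element i. Masked-out elements are zeroed when dz is set and are
    otherwise skipped, leaving the old destination value in place.
    """
    out = list(dest)
    for i in range(len(out)):
        if (pred_mask >> i) & 1:
            out[i] = src_a[i] + src_b[i]
        elif dz:
            out[i] = 0
        # else: element skipped, destination left unchanged
    return out
```

With a mask of `0b0101` and `dz` clear, elements 1 and 3 of the destination keep their previous contents; with `dz` set they become zero.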
given element hit saturation may be done using a mapreduced CR op (cror),
or by using the new crrweird instruction with Rc=1, which will transfer
the required CR bits to a scalar integer and update CR0, which will allow
-testing the scalar integer for nonzero. see [[sv/cr_int_predication]].
+testing the scalar integer for nonzero. See [[sv/cr_int_predication]].
Alternatively, a Data-Dependent Fail-First may be used to truncate the
Vector Length to non-saturated elements, greatly increasing the productivity
of parallelised inner hot-loops.*
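
A minimal sketch of per-element saturation, with a flag per element recording whether clamping took place. The flag list stands in for the per-element CR information; the exact CR bit used is not given here, so this is an assumption of the sketch, as are the names `saturate` and `sat_add_vector`. Taking `any()` of the flags plays the role of the OR-reduction (cror) mentioned above.

```python
def saturate(value, bits, signed):
    """Clamp value to the range of a bits-wide integer.

    Returns (clamped_value, hit) where hit is True if clamping occurred.
    """
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value


def sat_add_vector(a, b, bits=8, signed=True):
    """Element-wise saturating add of two equal-length vectors."""
    results, hits = [], []
    for x, y in zip(a, b):
        r, hit = saturate(x + y, bits, signed)
        results.append(r)
        hits.append(hit)
    return results, hits
```

For example, `any(sat_add_vector(a, b)[1])` reports whether any element hit saturation.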
As explained in the [[sv/appendix]], Reduce Mode switches off the check
which would normally stop looping if the result register is scalar.
Thus, the result scalar register, if also used as a source scalar,
-may be used to perform sequential accumulation. This *deliberately*
+may be used to perform sequential accumulation. This *deliberately*
sets up a chain of Register Hazard Dependencies, whereas Parallel Reduce
[[sv/remap]] deliberately issues a Tree-Schedule of operations that may
be parallelised.
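
The contrast between the two approaches can be sketched as below. The first function models a scalar destination that is also a scalar source, so every step depends on the previous one. The second is a generic pairwise tree, included only to illustrate why a tree can be parallelised: it is **not** the actual schedule issued by Parallel Reduce REMAP, and both function names are hypothetical.

```python
def scalar_reduce_add(acc, vec, vl):
    """Sequential accumulation: acc is both source and destination.

    Each iteration reads the acc written by the previous iteration,
    which forms a chain of register hazard dependencies.
    """
    for i in range(vl):
        acc = acc + vec[i]
    return acc


def tree_reduce_add(vec, vl):
    """Pairwise tree reduction (illustrative schedule only).

    All additions within one pass of the while loop are independent
    of each other and could be issued in parallel.
    """
    vals = list(vec[:vl])
    step = 1
    while step < len(vals):
        for i in range(0, len(vals) - step, step * 2):
            vals[i] = vals[i] + vals[i + step]
        step *= 2
    return vals[0] if vals else 0
```

For an associative operation such as integer add, both produce the same total; they differ in dependency structure rather than result.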
## Data-dependent Fail-on-first
Data-dependent fail-on-first is CR-field-driven and is completely separate
-and distinct from LD/ST Fail-First (also known as Fault-First). Note in
+and distinct from LD/ST Fail-First (also known as Fault-First). Note in
each case the assumption is that vector elements are required to appear
to be executed in sequential Program Order. When REMAP is not active,
element 0 would be the first.
Two extremely important aspects of ffirst are:
-* LDST ffirst may never set VL equal to zero. This because on the first
+* LDST ffirst may never set VL equal to zero. This is because on the first
element an exception must be raised "as normal".
* CR-based data-dependent ffirst on the other hand **can** set VL equal
to zero. This is the only means in the entirety of SV that VL may be set
non-deterministic.
* CR-based data-dependent ffirst on the other hand MUST NOT truncate VL
arbitrarily to a length decided by the hardware: VL MUST only be
- truncated based explicitly on whether a test fails. This because it is
+ truncated based explicitly on whether a test fails. This is because it is
a precise Deterministic test on which algorithms can and will rely.
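
The deterministic nature of CR-based truncation can be sketched as follows. Elements are examined in sequential Program Order and VL is truncated at the first element whose test fails, and nowhere else. In this sketch the test is an arbitrary Python predicate standing in for the CR bit test, and the failing element is assumed to be excluded from the new VL; the name `ffirst_truncate` is hypothetical.

```python
def ffirst_truncate(elements, vl, test):
    """Return the new VL after data-dependent fail-on-first.

    VL is truncated only at the first element for which test() is
    false. If element 0 fails, the new VL is zero.
    """
    for i in range(vl):
        if not test(elements[i]):
            return i  # failing element excluded (assumption of this sketch)
    return vl  # no test failed: VL is unchanged
```

Given the same inputs the result is always the same, which is what allows algorithms to rely on it.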
**Floating-point Exceptions**
When Floating-point exceptions are enabled, VL must be truncated at
the point where the Exception appears not to have occurred. If `VLi`
is set then VL must include the faulting element, and thus the faulting
-element will always raise its exception. If however `VLi` is clear then
+element will always raise its exception. If, however, `VLi` is clear, then
VL **excludes** the faulting element and thus the exception will **never**
be raised.
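
The effect of `VLi` on the truncation point can be sketched as below. The function name `vl_after_fp_exception` and the use of `None` to mean "no element faulted" are conventions of this sketch only.

```python
def vl_after_fp_exception(vl, fault_index, vli):
    """Return the truncated VL for a faulting element at fault_index.

    vli set:   VL includes the faulting element (exception is raised).
    vli clear: VL excludes the faulting element (exception never raised).
    fault_index of None means no element faulted.
    """
    if fault_index is None or fault_index >= vl:
        return vl
    return fault_index + 1 if vli else fault_index
```

Note that with `VLi` clear and a fault on element 0, the resulting VL is zero.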