is performed. See [[svp64/appendix]].
Note that there are comprehensive caveats when using this mode,
and it should not be confused with the Parallel Reduction [[sv/remap]].
-* **pred-result** will test the result (CR testing selects a bit of CR
- and inverts it, just like branch conditional testing) and if the
- test fails it is as if the *destination* predicate bit was zero even
- before starting the operation. When Rc=1 the CR element however is
- still stored in the CR regfile, even if the test failed. See appendix
- for details.
Note that ffirst and reduce modes are not anticipated to be
high-performance in some implementations. ffirst due to interactions
with VL, and reduce due to it requiring additional operations to produce
-a result. simple, saturate and pred-result are however inter-element
+a result. simple and saturate are however inter-element
independent and may easily be parallelised to give high performance,
regardless of the value of VL.
| 01 | inv | CR-bit | Rc=1: ffirst CR sel |
| 01 | inv | VLi RC1 | Rc=0: ffirst z/nonz |
| 10 | N | dz sz | sat mode: N=0/1 u/s |
-| 11 | inv | CR-bit | Rc=1: pred-result CR sel |
-| 11 | inv | zz RC1 | Rc=0: pred-result z/nonz |
+| 11 | / | / / | reserved |
Fields:
Operations that actually produce or alter CR Field as a result have
their own SVP64 Mode, described in [[sv/cr_ops]].
-## pred-result mode
-
-This mode merges common CR testing with predication, saving on instruction
-count. Below is the pseudocode excluding predicate zeroing and elwidth
-overrides. Note that the pseudocode for [[sv/cr_ops]] is slightly
-different.
-
-```
- for i in range(VL):
-     # predication test, skip all masked out elements.
-     if predicate_masked_out(i):
-         continue
-     result = op(iregs[RA+i], iregs[RB+i])
-     CRnew = analyse(result) # calculates eq/lt/gt
-     # Rc=1 always stores the CR field
-     if Rc=1 or RC1:
-         CR.field[offs+i] = CRnew
-     # now test CR, similar to branch
-     if RC1 or CRnew[BO[0:1]] != BO[2]:
-         continue # test failed: cancel store
-     # result optionally stored but CR always is
-     iregs[RT+i] = result
-```
-
-The reason for allowing the CR element to be stored is so that
-post-analysis of the CR Vector may be carried out. For example:
-Saturation may have occurred (and been prevented from updating, by the
-test) but it is desirable to know *which* elements fail saturation.
-
-Note that RC1 Mode basically turns all operations into `cmp`. The
-calculation is performed but it is only the CR that is written. The
-element result is *always* discarded, never written (just like `cmp`).
-
-Note that predication is still respected: predicate zeroing is slightly
-different: elements that fail the CR test *or* are masked out are zero'd.
-
[[!tag standards]]
--------