An additional caveat involves Condition Register Fields
when also used as Predicate Masks. An operation that
overwrites the same CR Fields that are simultaneously
-being used as a Predicate Mask is `UNDEFINED` behaviour.
+being used as a Predicate Mask is `UNDEFINED` behaviour
+if the overwritten CR field element was needed by a
+subsequent Element for its Predicate Mask bit.
This allows implementations to relax some of the
otherwise-draconian Register Hazards that would otherwise
occur, and to consider internal cacheing of the CR-based
Predicate
-bits.
+bits, but some implementations *may not necessarily
+perform pre-reading*, and consequently avoiding such an
+overwrite is the responsibility of the Programmer.
+Special care is particularly needed here when using REMAP.
## Register files, elements, and Element-width Overrides
Likewise CR based twin predication has a second set of 3 bits, allowing
a different test to be applied.
-Note that it is assumed that Predicate Masks (whether INT or CR) are
-read *before* the operations proceed. In practice (for CR Fields)
-this creates an unnecessary block on parallelism. Therefore, it is up
-to the programmer to ensure that the CR fields used as Predicate Masks
-are not being written to by any parallel Vector Loop. Doing so results
+Note that it cannot necessarily be assumed that Predicate Masks
+(whether INT or CR) are read in full *before* the operations proceed. In practice (for CR Fields)
+reading them in full beforehand would create an unnecessary block on parallelism, prohibiting
+"Vector Chaining". Therefore, it is up
+to the programmer to ensure that the CR field Elements used as Predicate Masks
+are not overwritten by any parallel Vector Loop. Overwriting them results
in **UNDEFINED** behaviour, according to the definition outlined in the
Power ISA v3.0B Specification.
needs to take place, safe in the knowledge that no programmer will have
issued a Vector Instruction where previous elements could have overwritten
(destroyed) not-yet-executed CR-Predicated element operations.
+This is particularly an issue when using REMAP, as the order in
+which CR-Field-based Predicate Mask bits could be read on a per-element
+execution basis could well conflict with the order in which prior
+elements wrote to the very same CR Field.
+
+Additionally, Programmers should avoid using r3, r10 or r30
+as destination registers when these are also used as a Predicate
+Mask. Doing so is again UNDEFINED behaviour.
### Integer Predication (MASKMODE=0)
r10 and r30 are at the high end of temporary and unused registers,
so as not to interfere with register allocation from ABIs.
+
### CR-based Predication (MASKMODE=1)
When the predicate mode bit is one the 3 bits are interpreted as below.