to a *CR Field* (CR0-CR7) and consequently these operands
(BF, BFA etc) are only 3-bits.
+(*It helps here to think of the top 3 bits of BA as selecting
+a CR Field, just as BFA does, and of the bottom 2 bits of BA
+as selecting one bit (LT, GT, EQ or SO, in that order)
+within that Field*)
+
With SVP64 extending the number of CR *Fields* to 128, the number of
32-bit CR *Registers* extends to 16, in order to hold all 128 CR *Fields*
(8 per CR Register). Then, it gets even more strange, when it comes
-to Vectorisation, which applies to the CR *Field* numbers. The
+to Vectorisation, which applies to the CR Field *numbers*. The
hardware-for-loop for Rc=1 for example starts at CR0 for element 0,
and moves to CR1 for element 1, and so on. The reason here is quite
simple: each element result has to have its own CR Field co-result.
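
The numbering described above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the helper names (`split_ba`, `cr_register_of`, `rc1_cr_fields`) are hypothetical, and the linear packing of Field N into CR Register N // 8 is an assumption, since the text only says that each Register holds 8 Fields.

```python
# Bit names within one 4-bit CR Field, indexed by the bottom 2 bits of BA.
CR_BIT_NAMES = ("LT", "GT", "EQ", "SO")


def split_ba(ba):
    """Split a 5-bit BA operand into (CR Field number, bit index in Field)."""
    if not 0 <= ba < 32:
        raise ValueError("BA is a 5-bit operand")
    return ba >> 2, ba & 0b11  # top 3 bits, bottom 2 bits


def cr_register_of(field):
    """Map an SVP64 CR Field (0-127) to (CR Register, slot in Register).

    ASSUMPTION: Fields are packed linearly, 8 per 32-bit Register.
    """
    if not 0 <= field < 128:
        raise ValueError("SVP64 has 128 CR Fields")
    return field // 8, field % 8


def rc1_cr_fields(vl, start=0):
    """CR Field written by each element of an Rc=1 hardware-for-loop."""
    return [start + element for element in range(vl)]


if __name__ == "__main__":
    field, bit = split_ba(6)
    print(f"BA=6 -> CR{field}.{CR_BIT_NAMES[bit]}")
    print("Rc=1, VL=4 ->", [f"CR{n}" for n in rc1_cr_fields(4)])
```

Under these assumptions, BA=6 selects the EQ bit of CR1, and an Rc=1 operation with a vector length of 4 places its co-results in CR0 through CR3, one Field per element.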