Scalar Power ISA CR Register is 32-bits, but actually comprises eight
CR Fields, CR0-CR7. With each CR Field being four bits (EQ, LT, GT, SO)
this makes up 32 bits, and therefore a CR operand referring to one bit
-of the CR will be 5 bits in length. *However*, some instructions refer
-to a *CR Field* (CR0-CR7) and consequently are only 3-bits.
+of the CR will be 5 bits in length (BA, BT).
+*However*, some instructions refer
+to a *CR Field* (CR0-CR7) and consequently these operands
+(BF, BFA, etc.) are only 3 bits.
With SVP64 extending the number of CR *Fields* to 128, the number of
CR *Registers* extends to 16, in order to hold all 128 CR *Fields*
(8 per CR Register). Then, it gets even more strange, when it comes
to Vectorisation, which applies to the CR *Field* numbers. The
hardware-for-loop for Rc=1 for example starts at CR0 for element 0,
-and moves to CR1 for element 1, and so on. In other words, the
+and moves to CR1 for element 1, and so on. The reason is quite
+simple: each element's result has to have its own CR Field co-result.
+
+In other words, the
element is the 4-bit CR *Field*, not the bits *of* the 32-bit
CR Register, and not the CR *Register* (of which there are now 16).
All quite logical, but a little mind-bending.
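
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the numbering, not a description of how any implementation or simulator does it: the function names are hypothetical, and the assumption that CR Fields are packed contiguously, eight to a CR Register in ascending order, is inferred from the "8 per CR Register" wording rather than taken from the specification.

```python
BITS_PER_FIELD = 4        # each CR Field is four bits
FIELDS_PER_REGISTER = 8   # eight CR Fields per 32-bit CR Register


def split_cr_bit_operand(operand):
    """Split a 5-bit CR bit operand (such as BA or BT) into
    (CR Field number, bit index within that Field)."""
    if not 0 <= operand < 32:
        raise ValueError("a scalar CR bit operand is 5 bits (0-31)")
    return operand // BITS_PER_FIELD, operand % BITS_PER_FIELD


def locate_cr_field(field, num_fields=128):
    """Return (CR Register index, slot within that Register) for a
    CR Field number. Contiguous packing is an assumption."""
    if not 0 <= field < num_fields:
        raise ValueError("CR Field number out of range")
    return field // FIELDS_PER_REGISTER, field % FIELDS_PER_REGISTER


def rc1_fields(vector_length, start=0):
    """CR Field used by each element of an Rc=1 hardware for-loop:
    element 0 uses CR Field `start`, element 1 the next, and so on."""
    return [start + element for element in range(vector_length)]


if __name__ == "__main__":
    print(split_cr_bit_operand(6))   # bit operand 6 lies in CR1
    print(locate_cr_field(127))      # last of the 128 CR Fields
    print(rc1_fields(4))             # one CR Field per element
```

Note that `rc1_fields` steps through CR *Field* numbers one at a time, so a vector of more than eight elements simply carries on into the next CR *Register*: the element is always the 4-bit Field.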