It is extremely important for implementors to note that the only circumstance
where upper portions of an underlying 64-bit register are zeroed out is
when the destination is a scalar. The ideal register file has byte-level
-write-enable lines, just like most SRAMs.
+write-enable lines, just like most SRAMs, in order to avoid READ-MODIFY-WRITE.
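The difference between the two write behaviours can be modelled with a minimal Python sketch. The function names `write_scalar` and `write_vector_element` are hypothetical, and byte 0 is assumed to be the least-significant byte of the 64-bit register. The software model has to merge the old value explicitly, whereas in hardware the byte-level write-enable lines achieve the same result without reading the register first.

```python
MASK64 = (1 << 64) - 1  # an underlying register is 64 bits wide


def write_scalar(value, width_bytes):
    """Scalar destination: store the element and zero every upper bit."""
    return value & ((1 << (8 * width_bytes)) - 1)


def write_vector_element(reg, value, byte_offset, width_bytes):
    """Vector destination: change only the byte lanes of one element.

    lane_mask plays the role of the byte-level write-enable lines
    (assumption: byte 0 is the least-significant byte).
    """
    lane_mask = ((1 << (8 * width_bytes)) - 1) << (8 * byte_offset)
    # Explicit merge here; real byte-enables would make this read unnecessary.
    kept = reg & ~lane_mask & MASK64
    return kept | ((value << (8 * byte_offset)) & lane_mask)
```

For an old register value of `0x1122334455667788`, an 8-bit scalar write of `0xAB` leaves `0xAB` with all upper bytes zero, while an 8-bit vector-element write at byte offset 2 changes only that one byte.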
An example ADD operation with predication and element width overrides:
if (RA.isvec) { irs1 += 1; }
if (RB.isvec) { irs2 += 1; }
+Elements are thus packed contiguously according to their element
+width, and the packing starts from the source (or destination) register
+specified by the instruction.
+
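The packing rule can be sketched as a small Python helper. This is an illustration under stated assumptions, not the specification's own pseudocode: `element_location` is a hypothetical name, registers are assumed to be 64 bits (8 bytes) wide, and elements are assumed to be packed contiguously from byte 0 of the starting register, spilling into the next register number once eight bytes are filled.

```python
REG_BYTES = 8  # assumption: 64-bit underlying registers


def element_location(start_reg, index, width_bytes):
    """Return (register number, byte offset) of element `index`.

    start_reg is the source or destination register named by the
    instruction; width_bytes is the overridden element width.
    """
    byte_pos = index * width_bytes  # position within the packed byte stream
    return start_reg + byte_pos // REG_BYTES, byte_pos % REG_BYTES
```

With 16-bit elements starting at register 5, elements 0 to 3 occupy register 5 and element 4 lands at byte 0 of register 6.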
# Twin (implicit) result operations
Some operations in the Power ISA already target two 64-bit scalar