Some thoughts on this: the sensible (sane) number of CRs to have is 64. A case could be made for having 128, but that is an awful lot. 64 CRs also has the advantage that, at 4 bits per CR, the whole set is only 4x 64-bit registers to save on a context switch.
+A practical issue is that accessing the CR regfile on a boundary that is not aligned to 8 CRs during Vector operations would significantly increase internal routing. Aligning Vector reads/writes to groups of 8 CRs means that only 32-bit aligned reads/writes are required.
+
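The arithmetic behind those two numbers can be checked with a minimal sketch. It assumes 4 bits per CR field; the function names and the choice to count bit offsets upward from CR0 are illustrative assumptions, not part of any spec.

```python
CR_BITS = 4    # assumption: each CR field is 4 bits wide
NUM_CRS = 64   # the proposed number of CRs
GROUP = 8      # CRs per aligned group

def context_switch_regs(num_crs=NUM_CRS, reg_width=64):
    """Number of reg_width-bit registers needed to hold all CRs."""
    return (num_crs * CR_BITS) // reg_width

def is_group_aligned(first_cr):
    """True if a Vector access starting at first_cr sits on an 8-CR boundary."""
    return first_cr % GROUP == 0

def group_bit_range(first_cr):
    """Half-open bit range (lo, hi) covered by one aligned group of 8 CRs.

    Hypothetical helper: bit offsets are counted upward from CR0.
    """
    if not is_group_aligned(first_cr):
        raise ValueError("Vector CR access must start on an 8-CR boundary")
    lo = first_cr * CR_BITS
    return lo, lo + GROUP * CR_BITS

print(context_switch_regs())      # 64 CRs
print(context_switch_regs(128))   # 128 CRs
print(group_bit_range(8))         # second aligned group
```

With 64 CRs the context-switch cost is 4 registers, against 8 for 128 CRs, and every aligned group spans exactly 32 bits.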
How to number them as vectors gets particularly interesting. A case could be made for treating the 64 CRs as an 8x8 square, and using the CR numbering (CR0-7) to pick where the VL for-loop begins: it increments first by row and, on rolling over, moves to the next column. For example: CR6 CR14 ... CR62, then CR7 CR15 ...
When the SV prefix marks them with 2 bits, one of those could be used to indicate scalar, and the other to indicate whether the 3-bit CR number is to be treated as a horizontal vector (CR incrementing straight by 1) or a vertical vector (incrementing by 8).
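A minimal sketch of how the VL loop might pick CRs under this idea. The function name `cr_elements`, the way the two marker bits are passed in as flags, the treatment of a scalar as the same CR repeated, and the wrap-around after CR63 are all assumptions made for illustration; the discussion above does not settle them.

```python
def cr_elements(cr, vl, scalar, vertical):
    """Return the CR numbers touched by a VL-length loop.

    cr       -- 3-bit CR number (0-7) from the instruction
    vl       -- vector length (number of loop iterations)
    scalar   -- hypothetical flag for the first marker bit
    vertical -- hypothetical flag for the second marker bit
    """
    if not 0 <= cr <= 7:
        raise ValueError("CR number must fit in 3 bits")
    if scalar:
        # Assumption: a scalar names the same CR on every iteration.
        return [cr] * vl
    if not vertical:
        # Horizontal: increment straight by 1 (wrapping at 64 is assumed).
        return [(cr + i) % 64 for i in range(vl)]
    result = []
    for i in range(vl):
        row = i % 8                 # step down the column, 8 CRs at a time
        col = (cr + i // 8) % 8     # on rollover, move to the next column
        result.append(row * 8 + col)
    return result

print(cr_elements(6, 10, scalar=False, vertical=True))
print(cr_elements(6, 4, scalar=False, vertical=False))
```

The vertical case starting at CR6 walks CR6, CR14, ... CR62 and then continues at CR7, CR15, matching the sequence given above.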