The primary purpose of this encoding is Twin Predication on LOAD
and STORE operations. See [[sv/ldst]] for detailed analysis.
-RM-2P-2S1D:
+**RM-2P-2S1D:**
| Field Name | Field bits | Description |
|------------|------------|----------------------------|
| Rdest_EXTRA2 | `10:11` | extends Rdest (R\*\_EXTRA2 Encoding) |
| Rsrc1_EXTRA2 | `12:13` | extends Rsrc1 (R\*\_EXTRA2 Encoding) |
| Rsrc2_EXTRA2 | `14:15` | extends Rsrc2 (R\*\_EXTRA2 Encoding) |
-| MASK_SRC | `16:18` | Execution Mask for Source |
+| MASK_SRC | `16:18` | Execution Mask for Source |
+
+**RM-2P-1S2D:**
-Note that for 1S2P the EXTRA2 dest and src names are switched (Rsrc_EXTRA2
+For RM-2P-1S2D the EXTRA2 dest and src names are switched (Rsrc_EXTRA2
is in bits 10:11, Rdest1_EXTRA2 in 12:13).
-Also that for 3S (to cover `stdx` etc.) the names are switched to 3 src:
+| Field Name | Field bits | Description |
+|------------|------------|----------------------------|
+| Rsrc2_EXTRA2 | `10:11` | extends Rsrc2 (R\*\_EXTRA2 Encoding) |
+| Rsrc1_EXTRA2 | `12:13` | extends Rsrc1 (R\*\_EXTRA2 Encoding) |
+| Rdest_EXTRA2 | `14:15` | extends Rdest (R\*\_EXTRA2 Encoding) |
+| MASK_SRC | `16:18` | Execution Mask for Source |
+
+**RM-2P-3S:**
+
+For RM-2P-3S (to cover `stdx` etc.) the names are switched to 3 src:
Rsrc1_EXTRA2, Rsrc2_EXTRA2, Rsrc3_EXTRA2.
-Note also that LD with update indexed, which takes 2 src and 2 dest
-(e.g. `lhaux RT,RA,RB`), does not have room for 4 registers and also
-Twin Predication. therefore these are treated as RM-2P-2S1D and the
-src spec for RA is also used for the same RA as a dest.
+| Field Name | Field bits | Description |
+|------------|------------|----------------------------|
+| Rsrc1_EXTRA2 | `10:11` | extends Rsrc1 (R\*\_EXTRA2 Encoding) |
+| Rsrc2_EXTRA2 | `12:13` | extends Rsrc2 (R\*\_EXTRA2 Encoding) |
+| Rsrc3_EXTRA2 | `14:15` | extends Rsrc3 (R\*\_EXTRA2 Encoding) |
+| MASK_SRC | `16:18` | Execution Mask for Source |
+
+Note also that LD with update indexed, which takes 2 src and
+creates 2 dest registers (e.g. `lhaux RT,RA,RB`), does not have room
+for both 4 registers and Twin Predication. These are therefore treated as
+RM-2P-2S1D, and the src spec for RA is reused for the same RA as a dest.
Note that if ELWIDTH != ELWIDTH_SRC this may result in reduced performance
or increased latency in some implementations due to lane-crossing.
The register files are therefore extended:
-* INT is extended from r0-31 to r0-127
-* FP is extended from fp0-32 to fp0-fp127
+* INT (GPR) is extended from r0-31 to r0-127
+* FP (FPR) is extended from fp0-fp31 to fp0-fp127
* CR Fields are extended from CR0-7 to CR0-127
However due to pressure in `RM.EXTRA` not all these registers