[[!table data="""
31 | 30..25 | 24..20 | 19..15 | 14..12 | 11..8 | 7 | 6..0 |
imm[12] | imm[10:5] |rs2 | rs1 | funct3 | imm[4:1] | imm[11] | opcode |
- 1 | 6 | 5 | 5 | 3 | 4 | 1 | 7 |
+ 1 | 6 | 5 | 5 | 3 | 4 | 1 | 7 |
offset[12,10:5] || src2 | src1 | BEQ | offset[11,4:1] || BRANCH |
"""]]
vew may be one of the following (giving a table "bytestable", used below):
-| vew | bitwidth |
-| --- | -------- |
-| 000 | default |
-| 001 | 8 |
-| 010 | 16 |
-| 011 | 32 |
-| 100 | 64 |
-| 101 | 128 |
-| 110 | rsvd |
-| 111 | rsvd |
+| vew | bitwidth | bytestable |
+| --- | -------- | ---------- |
+| 000 | default | XLEN/8 |
+| 001 | 8 | 1 |
+| 010 | 16 | 2 |
+| 011 | 32 | 4 |
+| 100 | 64 | 8 |
+| 101 | 128 | 16 |
+| 110 | rsvd | rsvd |
+| 111 | rsvd | rsvd |
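
The table can be read as a lookup from the 3-bit vew field to an element width in bytes. The following is a minimal sketch of that lookup, not the spec's own pseudocode. The names `BYTESTABLE` and `vew_to_bytes` and the `xlen` parameter are hypothetical, and raising an error for the reserved encodings is an assumption, since the table only marks them "rsvd" without saying how they are handled.

```python
# Element width in bytes for each non-default, non-reserved vew encoding,
# taken directly from the "bytestable" column of the table.
BYTESTABLE = {
    0b001: 1,   # 8-bit elements
    0b010: 2,   # 16-bit elements
    0b011: 4,   # 32-bit elements
    0b100: 8,   # 64-bit elements
    0b101: 16,  # 128-bit elements
}


def vew_to_bytes(vew: int, xlen: int = 64) -> int:
    """Return the element width in bytes for a 3-bit vew value.

    vew == 0b000 means "default", which the table gives as XLEN/8.
    The xlen default of 64 is an illustrative assumption.
    """
    if not 0 <= vew <= 0b111:
        raise ValueError(f"vew must be a 3-bit value, got {vew}")
    if vew == 0b000:
        return xlen // 8
    if vew in BYTESTABLE:
        return BYTESTABLE[vew]
    # 0b110 and 0b111 are reserved; rejecting them is an assumption.
    raise ValueError(f"reserved vew encoding: {vew:#05b}")
```

For example, `vew_to_bytes(0b011)` gives 4, and `vew_to_bytes(0b000, xlen=32)` gives 4 because the default width follows XLEN.
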
Pseudocode for vector length taking CSR SIMD-bitwidth into account:
Whilst the above may seem to be severe minuses, there are some strong
pluses:
-* Significant reduction of V's opcode space: over 85%.
+* Significant reduction of V's opcode space: over 95%.
* Smaller reduction of P's opcode space: around 10%.
* The potential to use Compressed instructions in both Vector and SIMD
due to the overloading of register meaning (implicit vectorisation,
> structure, as the microarchitectural guts have to be spilled to memory.)
-## Implementation Paradigms
+## Implementation Paradigms <a name="implementation_paradigms"></a>
TODO: assess various implementation paradigms. These are listed roughly
in order of simplicity (minimum compliance, for ultra-light-weight