# Under consideration <a name="issues"></a>

For element-grouping, if there is unused space within a register
(three 16-bit elements in a 64-bit register, for example), recommend:

* For the unused elements in an integer register, the used element
closest to the MSB is sign-extended on write and the unused elements

* The unused elements in a floating-point register are treated as if
they are set to all ones on write and are ignored on read, matching the
existing standard for storing smaller FP values in larger registers.
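
The write-side behaviour recommended above can be sketched in a few lines of Python. This is only an illustration, not part of the proposal: the function names are hypothetical, and it assumes elements are packed LSB-first (element 0 in the lowest bits), which the text does not state explicitly.

```python
def pack_int_group(elements, elwidth=16, xlen=64):
    """Pack signed integers LSB-first, sign-extending the top used element."""
    used_bits = len(elements) * elwidth
    if not elements or used_bits > xlen:
        raise ValueError("need at least one element, and all must fit in xlen")
    mask = (1 << elwidth) - 1
    reg = 0
    for i, value in enumerate(elements):
        reg |= (value & mask) << (i * elwidth)
    # If the top used element is negative, fill the unused bits with ones.
    if (reg >> (used_bits - 1)) & 1:
        reg |= ((1 << xlen) - 1) ^ ((1 << used_bits) - 1)
    return reg


def pack_fp_group(raw_elements, elwidth=16, xlen=64):
    """Pack raw FP bit patterns LSB-first, setting unused bits to all ones."""
    used_bits = len(raw_elements) * elwidth
    if used_bits > xlen:
        raise ValueError("elements do not fit in xlen")
    mask = (1 << elwidth) - 1
    # Start with ones in every unused bit position (assumed layout).
    reg = ((1 << xlen) - 1) ^ ((1 << used_bits) - 1)
    for i, bits in enumerate(raw_elements):
        reg |= (bits & mask) << (i * elwidth)
    return reg
```

With three 16-bit elements in a 64-bit register, `hex(pack_int_group([1, 2, -3]))` gives `0xfffffffd00020001`, and `hex(pack_fp_group([0x3c00, 0x4000, 0xc000]))` gives `0xffffc00040003c00`.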

> no, because it wastes space.

> One solution is to just not support LR/SC wider than a fixed
> implementation-dependent size, which must be at least
> 1 XLEN word. That size can be read from a read-only CSR
> that can also be used for info like the kind and width of
> hardware parallelism supported (128-bit SIMD, minimal virtual
> parallelism, etc.) and other things (like maybe the number
> of registers supported).
>
> That CSR would have to have a flag to make a read trap so
> a hypervisor can simulate different values.

> And what about instructions like JALR?

answer: they're not vectorised, so not a problem

TODO: document different lengths for INT / FP regfiles, and provide
them as part of the info CSR. 00=32, 01=64, 10=128, 11=reserved.
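
A decoder for the 2-bit width field listed in the TODO might look like the following. It is a sketch only: the function name is hypothetical, and where the field would sit inside the info CSR is not specified here, so the function takes the already-extracted 2-bit value.

```python
# Mapping taken from the TODO: 00=32, 01=64, 10=128, 11=reserved.
REGFILE_WIDTHS = {0b00: 32, 0b01: 64, 0b10: 128}


def decode_regfile_width(field):
    """Return the register-file width in bits for a 2-bit field value."""
    if not 0 <= field <= 0b11:
        raise ValueError("field must be a 2-bit value")
    if field not in REGFILE_WIDTHS:
        raise ValueError("encoding 0b11 is reserved")
    return REGFILE_WIDTHS[field]
```

Since the INT and FP regfiles may differ in length, a reader of the CSR would presumably call this once per regfile field.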

Could the 8-bit Register VBLOCK format use regnum<<1 instead, only accessing regs 0 to 64?

Expand the range of SUBVL and its associated svsrcoffs and svdestoffs by
adding a 2nd STATE CSR (or extending STATE to 64 bits). Future version?

TODO: evaluate - BRIEFLY (under 1 hour MAXIMUM) - why these rules exist,
by illustrating with pseudo-assembly DAXPY.

1. Trap if imm > XLEN.
3. Else If regs[rs1] > 2 * imm, then
4. Else If regs[rs1] > imm, then
    1. Set VL to regs[rs1] / 2 rounded down.
5. Else,
    1. Set VL to regs[rs1].
6. Set regs[rd] to VL.

TODO: adapt to the above rules.

    # a0 is n, a1 is pointer to x[0], a2 is pointer to y[0], fa0 is a
      4:  vsetdcfg t0             # enable 2 64b Fl.Pt. registers
    loop:
      8:  setvl  t0, a0           # vl = t0 = min(mvl, n)
      c:  vld    v0, a1           # load vector x
     10:  slli   t1, t0, 3        # t1 = vl * 8 (in bytes)
     14:  vld    v1, a2           # load vector y
     18:  add    a1, a1, t1       # increment pointer to x by vl*8
     1c:  vfmadd v1, v0, fa0, v1  # v1 += v0 * fa0 (y = a * x + y)
     20:  sub    a0, a0, t0       # n -= vl (t0)
     24:  vst    v1, a2           # store y
     28:  add    a2, a2, t1       # increment pointer to y by vl*8
     2c:  bnez   a0, loop         # repeat if n != 0
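
For comparison, the strip-mining structure of the DAXPY loop can be modelled in plain Python. This is a rough sketch under stated assumptions: `mvl` is passed in as a parameter (in the assembly it comes from the vector configuration), the function name is hypothetical, and the inner `for` loop stands in for the vector load, multiply-add, and store on `vl` elements.

```python
def daxpy_stripmined(a, x, y, mvl):
    """Compute y = a*x + y in place, at most mvl elements per pass."""
    if mvl < 1:
        raise ValueError("mvl must be at least 1")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)   # a0: elements remaining
    pos = 0      # stands in for the advancing a1/a2 pointers
    while n != 0:
        vl = min(mvl, n)                  # setvl t0, a0
        for i in range(pos, pos + vl):    # vld, vld, vfmadd, vst
            y[i] = a * x[i] + y[i]
        pos += vl                         # add a1/a2 by vl*8 bytes
        n -= vl                           # sub a0, a0, t0
    return y
```

The assembly tests `n` at the bottom of the loop, so it always executes the body once; with `n == 0` that pass operates on zero elements, so the `while` form here produces the same result.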

Swizzle needs a MV. See below for a potential way to use the funct7 to do a swizzle in rs2.

    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | Encoding      | 31:27       | 26:25 | 24:20    | 19:15    | 14:12  | 11:7     | 6:2    | 1:0    |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-I-type   + imm[11:0]                      + rs1[4:0] + funct3 | rd[4:0]  + opcode + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-I-type   + rsv[11:8] swizzle[7:0]         + rs1[4:0] + 0b000  | rd[4:0]  + OP-V   + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
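
The I-type layout in the table can be checked with a small field-packing sketch. Assumptions are flagged in the comments: the function names are hypothetical, the reserved bits rsv[11:8] are written as zero, and because the numeric value of the OP-V opcode field is not given here, it is passed in as the parameter `opcode_6_2`.

```python
def encode_mv_swizzle(rd, rs1, swizzle, opcode_6_2):
    """Pack the I-type MV-swizzle fields into a 32-bit instruction word."""
    fields = (("rd", rd, 5), ("rs1", rs1, 5),
              ("swizzle", swizzle, 8), ("opcode_6_2", opcode_6_2, 5))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    imm = swizzle                 # imm[7:0] = swizzle, rsv[11:8] = 0 (assumed)
    funct3 = 0b000                # from the table
    return ((imm << 20) | (rs1 << 15) | (funct3 << 12)
            | (rd << 7) | (opcode_6_2 << 2) | 0b11)


def decode_swizzle(insn):
    """Extract swizzle[7:0] from bits 27:20 of an instruction word."""
    return (insn >> 20) & 0xFF
```

For example, with a placeholder opcode field of `0b10101` (chosen only for illustration), `encode_mv_swizzle(rd=3, rs1=5, swizzle=0xE4, opcode_6_2=0b10101)` yields a word from which `decode_swizzle` recovers `0xE4`.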

swizzle (only active on SV or P48/P64 when SUBVL!=0):

potential MV.X? register-version of MV-swizzle?

    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | Encoding      | 31:27       | 26:25 | 24:20    | 19:15    | 14:12  | 11:7     | 6:2    | 1:0    |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-R-type   + funct7              + rs2[4:0] + rs1[4:0] + funct3 | rd[4:0]  + opcode + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-R-type   + 0b0000000           + rs2[4:0] + rs1[4:0] + 0b001  | rd[4:0]  + OP-V   + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+

potential funct7 = 0b0000001 to say that rs2 is a swizzle argument?

question: do we need a swizzle MV.X as well?