# Under consideration <a name="issues"></a>

for element-grouping, if there is unused space within a register
(three 16-bit elements in a 64-bit register, for example), recommend:

* For the unused elements in an integer register, the used element
  closest to the MSB is sign-extended on write and the unused elements
  are ignored on read.
* The unused elements in a floating-point register are treated as if
  they are set to all ones on write and are ignored on read, matching the
  existing standard for storing smaller FP values in larger registers.

> no, because it wastes space.

---

info register,

> One solution is to just not support LR/SC wider than a fixed
> implementation-dependent size, which must be at least
> 1 XLEN word, which can be read from a read-only CSR
> that can also be used for info like the kind and width of
> hw parallelism supported (128-bit SIMD, minimal virtual
> parallelism, etc.) and other things (like maybe the number
> of registers supported).

> That CSR would have to have a flag to make a read trap so
> a hypervisor can simulate different values.

----

> And what about instructions like JALR?

answer: they're not vectorised, so not a problem

---

TODO: document different lengths for INT / FP regfiles, and provide
as part of info CSR register. 00=32, 01=64, 10=128, 11=reserved.
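
A minimal Python sketch of how the 2-bit code above might be decoded. The function name and the choice to raise on the reserved value are assumptions for illustration only; where the field would sit inside the info CSR is not defined here.

```python
# Mapping taken from the TODO above: 00=32, 01=64, 10=128, 11=reserved.
REGFILE_LENGTH_CODES = {0b00: 32, 0b01: 64, 0b10: 128}


def decode_regfile_length(code):
    """Return the regfile length for a 2-bit code (hypothetical helper)."""
    if not 0 <= code <= 0b11:
        raise ValueError("code must fit in 2 bits")
    if code == 0b11:
        # 11 is reserved; raising here is an assumption, not spec behaviour.
        raise ValueError("reserved regfile length code")
    return REGFILE_LENGTH_CODES[code]
```

INT and FP would each carry their own code, so a reader of the CSR would call this once per regfile.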

---

Could the 8-bit Register VBLOCK format use regnum<<1 instead, only accessing regs 0 to 64?

---

Expand the range of SUBVL and its associated svsrcoffs and svdestoffs by
adding a 2nd STATE CSR (or extending STATE to 64 bits). Future version?

---

TODO: evaluate - BRIEFLY (under 1 hour MAXIMUM) - why these rules exist,
by illustrating with pseudo-assembly DAXPY

1. Trap if imm > XLEN.
2. If rs1 is x0, then
    1. Set VL to imm.
3. Else If regs[rs1] > 2 * imm, then
    1. Set VL to XLEN.
4. Else If regs[rs1] > imm, then
    1. Set VL to regs[rs1] / 2 rounded down.
5. Otherwise,
    1. Set VL to regs[rs1].
6. Set regs[rd] to VL.
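
The rules above, transcribed literally into Python as a starting point for the evaluation. `setvl`, `VLTrap`, the list-based `regs` and the default `xlen=64` are hypothetical names and choices, not part of the spec. A trap is modelled as an exception, and a write to x0 is discarded as in base RISC-V.

```python
class VLTrap(Exception):
    """Stand-in for a hardware trap (hypothetical, for illustration only)."""


def setvl(regs, rd, rs1, imm, xlen=64):
    """Apply the numbered VL rules literally and return the new VL."""
    if imm > xlen:                      # rule 1
        raise VLTrap("imm > XLEN")
    if rs1 == 0:                        # rule 2: rs1 is x0
        vl = imm
    elif regs[rs1] > 2 * imm:           # rule 3
        vl = xlen
    elif regs[rs1] > imm:               # rule 4: halve, rounding down
        vl = regs[rs1] // 2
    else:                               # rule 5
        vl = regs[rs1]
    if rd != 0:                         # rule 6 (x0 stays hardwired to zero)
        regs[rd] = vl
    return vl
```

For example, with imm = 8: regs[rs1] = 5 gives VL = 5, regs[rs1] = 12 gives VL = 6, and regs[rs1] = 100 gives VL = XLEN.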

TODO: adapt to the above rules.

    # a0 is n, a1 is pointer to x[0], a2 is pointer to y[0], fa0 is a
      0:  li t0, 2<<25
      4:  vsetdcfg t0             # enable 2 64b Fl.Pt. registers
    loop:
      8:  setvl  t0, a0           # vl = t0 = min(mvl, n)
      c:  vld    v0, a1           # load vector x
     10:  slli   t1, t0, 3        # t1 = vl * 8 (in bytes)
     14:  vld    v1, a2           # load vector y
     18:  add    a1, a1, t1       # increment pointer to x by vl*8
     1c:  vfmadd v1, v0, fa0, v1  # v1 += v0 * fa0 (y = a * x + y)
     20:  sub    a0, a0, t0       # n -= vl (t0)
     24:  vst    v1, a2           # store Y
     28:  add    a2, a2, t1       # increment pointer to y by vl*8
     2c:  bnez   a0, loop         # repeat if n != 0
     30:  ret                     # return
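
To make the strip-mining structure of the listing above explicit, here is a rough Python model of it. It follows the `vl = min(mvl, n)` comment on `setvl` as written in the listing, not the numbered VL rules. `daxpy_stripmined` and its `mvl` parameter are hypothetical names, and `a * x + y` in Python is not a fused multiply-add.

```python
def daxpy_stripmined(a, x, y, mvl):
    """Compute y = a*x + y in chunks of at most mvl elements."""
    if mvl <= 0:
        raise ValueError("mvl must be positive")
    n = len(x)                 # a0
    i = 0                      # element offset; stands in for a1/a2 pointers
    while n != 0:              # bnez a0, loop
        vl = min(mvl, n)       # setvl t0, a0
        for j in range(i, i + vl):
            y[j] = a * x[j] + y[j]   # vld / vld / vfmadd / vst
        i += vl                # add a1, a1, t1 and add a2, a2, t1
        n -= vl                # sub a0, a0, t0
    return y
```

With five elements and `mvl=2` the loop runs three times, with vl = 2, 2, 1.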

----

swizzle needs a MV. see below for a potential I-type encoding with the swizzle in the immediate, and (further down, under MV.X) a potential way to use the funct7 to do a swizzle in rs2.

    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | Encoding      | 31:27       | 26:25 | 24:20    | 19:15    | 14:12  | 11:7     | 6:2    | 1:0    |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-I-type   + imm[11:0]                      + rs1[4:0] + funct3 | rd[4:0]  + opcode + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-I-type   + rsv[11:8] swizzle[7:0]         + rs1[4:0] + 0b000  | rd[4:0]  + OP-V   + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+

* funct3 = MV
* OP-V = 0b1010111 (the full 7-bit opcode, including the low 0b11)

swizzle (only active on SV or P48/P64 when SUBVL!=0):

    +-----+---+
    | 1:0 | x |
    +-----+---+
    | 3:2 | y |
    +-----+---+
    | 5:4 | z |
    +-----+---+
    | 7:6 | w |
    +-----+---+
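
A small Python sketch of pulling the four 2-bit fields out of the 8-bit swizzle. The meaning of each field is not spelled out above; the sketch assumes each one is the index (0=x, 1=y, 2=z, 3=w) of the source sub-element copied into that destination position. `decode_swizzle` and `apply_swizzle` are hypothetical names.

```python
def decode_swizzle(swizzle):
    """Split swizzle[7:0] into its (x, y, z, w) 2-bit selectors."""
    if not 0 <= swizzle <= 0xFF:
        raise ValueError("swizzle must fit in 8 bits")
    # bits 1:0 -> x, 3:2 -> y, 5:4 -> z, 7:6 -> w
    return tuple((swizzle >> shift) & 0b11 for shift in (0, 2, 4, 6))


def apply_swizzle(src, swizzle):
    """Rearrange a 4-element group; ASSUMES selectors are source indices."""
    if len(src) != 4:
        raise ValueError("this sketch only handles 4 sub-elements")
    return [src[sel] for sel in decode_swizzle(swizzle)]
```

Under that assumption, `0b11100100` is the identity (x, y, z, w), `0b00011011` reverses the group, and `0b00000000` broadcasts x to all four positions.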

----

potential MV.X? register-version of MV-swizzle?

    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | Encoding      | 31:27       | 26:25 | 24:20    | 19:15    | 14:12  | 11:7     | 6:2    | 1:0    |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-R-type   + funct7              + rs2[4:0] + rs1[4:0] + funct3 | rd[4:0]  + opcode + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+
    | RV32-R-type   + 0b0000000           + rs2[4:0] + rs1[4:0] + 0b001  | rd[4:0]  + OP-V   + 0b11   |
    +---------------+-------------+-------+----------+----------+--------+----------+--------+--------+

* funct3 = MV.X
* OP-V = 0b1010111
* funct7 = 0b0000000
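
As a sanity check on the field layout, a Python sketch that packs the proposed MV.X into a 32-bit R-type word. `encode_mvx` is a hypothetical helper; the field positions are the standard RV32 R-type ones from the table, and the values are the tentative ones listed above.

```python
OP_V = 0b1010111        # full 7-bit opcode, low two bits are 0b11
FUNCT3_MVX = 0b001      # tentative funct3 for MV.X
FUNCT7_MVX = 0b0000000  # tentative funct7 for MV.X


def encode_mvx(rd, rs1, rs2, funct7=FUNCT7_MVX):
    """Pack rd/rs1/rs2 into an R-type instruction word for the proposed MV.X."""
    for name, reg in (("rd", rd), ("rs1", rs1), ("rs2", rs2)):
        if not 0 <= reg < 32:
            raise ValueError(name + " must be a 5-bit register number")
    if not 0 <= funct7 < 128:
        raise ValueError("funct7 must fit in 7 bits")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (FUNCT3_MVX << 12) | (rd << 7) | OP_V)
```

For example, `encode_mvx(rd=1, rs1=2, rs2=3)` gives `0x003110D7`. Passing a different `funct7` only changes bits 31:25.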

potential funct7 = 0b0000001 to say that rs2 is a swizzle argument?

question: do we need a swizzle MV.X as well?