simple_v_extension/specification/discussion.mdwn

# Under consideration <a name="issues"></a>

For element-grouping, if there is unused space within a register
(three 16-bit elements in a 64-bit register, for example), recommend:

* For the unused elements in an integer register, the used element
  closest to the MSB is sign-extended on write and the unused elements
  are ignored on read.
* The unused elements in a floating-point register are treated as if
  they are set to all ones on write and are ignored on read, matching the
  existing standard for storing smaller FP values in larger registers.

> no, because it wastes space.

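For reference, a minimal Python sketch of what the recommendation above would mean for three 16-bit elements in a 64-bit register. It assumes element 0 sits at the LSB end, which is not stated here, and the function names are hypothetical, chosen only for illustration.

```python
def _upper_fill(used_bits, reglen):
    """Mask covering the unused bits above the used elements."""
    return ((1 << reglen) - 1) & ~((1 << used_bits) - 1)

def pack_int_elements(elems, elwidth=16, reglen=64):
    """Pack elements LSB-first (assumed order); sign-extend the top used one."""
    used_bits = len(elems) * elwidth
    assert 0 < used_bits <= reglen
    mask = (1 << elwidth) - 1
    reg = 0
    for i, e in enumerate(elems):
        reg |= (e & mask) << (i * elwidth)
    # Sign bit of the used element closest to the MSB fills the unused space.
    if (reg >> (used_bits - 1)) & 1:
        reg |= _upper_fill(used_bits, reglen)
    return reg

def pack_fp_elements(bit_patterns, elwidth=16, reglen=64):
    """Pack raw FP bit patterns LSB-first; unused upper bits become all ones."""
    used_bits = len(bit_patterns) * elwidth
    assert 0 < used_bits <= reglen
    mask = (1 << elwidth) - 1
    reg = 0
    for i, b in enumerate(bit_patterns):
        reg |= (b & mask) << (i * elwidth)
    return reg | _upper_fill(used_bits, reglen)

def unpack_elements(reg, count, elwidth=16):
    """Read back only the used elements; the unused bits are ignored."""
    mask = (1 << elwidth) - 1
    return [(reg >> (i * elwidth)) & mask for i in range(count)]
```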
---

Info register:

> One solution is to just not support LR/SC wider than a fixed
> implementation-dependent size, which must be at least
> 1 XLEN word, which can be read from a read-only CSR
> that can also be used for info like the kind and width of
> hw parallelism supported (128-bit SIMD, minimal virtual
> parallelism, etc.) and other things (like maybe the number
> of registers supported).

> That CSR would have to have a flag to make a read trap so
> a hypervisor can simulate different values.

----

> And what about instructions like JALR?

Answer: they're not vectorised, so not a problem.

---

TODO: document the different lengths of the INT / FP regfiles, and provide
them as part of the info CSR. Encoding: 00=32, 01=64, 10=128, 11=reserved.

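A minimal Python sketch of decoding the 2-bit encoding in the TODO above. The bit positions are a purely hypothetical layout (INT regfile in bits [1:0], FP regfile in bits [3:2]), since no layout for the info CSR is given here. The function names are likewise illustrative, and the sketch takes no position on what unit the "length" is measured in.

```python
# Encoding taken from the TODO: 00=32, 01=64, 10=128, 11=reserved.
REGFILE_LENGTHS = {0b00: 32, 0b01: 64, 0b10: 128}

def decode_regfile_length(field):
    """Decode one 2-bit regfile-length field; None means 'reserved'."""
    if not 0 <= field <= 0b11:
        raise ValueError("field must fit in 2 bits")
    return REGFILE_LENGTHS.get(field)

def decode_info_csr(value):
    """Split a CSR value into INT and FP regfile lengths.

    Hypothetical layout: bits [1:0] = INT regfile, bits [3:2] = FP regfile.
    """
    return {
        "int": decode_regfile_length(value & 0b11),
        "fp": decode_regfile_length((value >> 2) & 0b11),
    }
```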
---

Could the 8-bit Register VBLOCK format use `regnum<<1` instead, accessing only regs 0 to 64?

---

Expand the range of SUBVL and its associated svsrcoffs and svdestoffs by
adding a 2nd STATE CSR (or extending STATE to 64 bits). Future version?

---

TODO: evaluate - BRIEFLY (under 1 hour MAXIMUM) - why these rules exist,
by illustrating with pseudo-assembly DAXPY.

1. Trap if imm > XLEN.
2. If rs1 is x0, then
    1. Set VL to imm.
3. Else If regs[rs1] > 2 * imm, then
    1. Set VL to XLEN.
4. Else If regs[rs1] > imm, then
    1. Set VL to regs[rs1] / 2 rounded down.
5. Otherwise,
    1. Set VL to regs[rs1].
6. Set regs[rd] to VL.

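The rules above can be sketched literally in Python, which may help when evaluating them. This is a transcription of the six rules exactly as listed, not a statement of final behaviour. The `Trap` exception and the `setvl` function signature are hypothetical names used only for illustration, and the default `xlen=64` is an assumption.

```python
class Trap(Exception):
    """Stands in for a hardware trap (hypothetical name)."""

def setvl(regs, rd, rs1, imm, xlen=64):
    """Apply rules 1-6 as written; returns the new VL."""
    if imm > xlen:                  # rule 1
        raise Trap("imm > XLEN")
    if rs1 == 0:                    # rule 2: rs1 is x0
        vl = imm
    elif regs[rs1] > 2 * imm:       # rule 3
        vl = xlen
    elif regs[rs1] > imm:           # rule 4: halve, rounding down
        vl = regs[rs1] // 2
    else:                           # rule 5
        vl = regs[rs1]
    if rd != 0:                     # rule 6 (x0 is hardwired to zero)
        regs[rd] = vl
    return vl
```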
TODO: adapt to the above rules.

    # a0 is n, a1 is pointer to x[0], a2 is pointer to y[0], fa0 is a
       0:  li     t0, 2<<25
       4:  vsetdcfg t0             # enable 2 64b Fl.Pt. registers
    loop:
       8:  setvl  t0, a0           # vl = t0 = min(mvl, n)
       c:  vld    v0, a1           # load vector x
      10:  slli   t1, t0, 3        # t1 = vl * 8 (in bytes)
      14:  vld    v1, a2           # load vector y
      18:  add    a1, a1, t1       # increment pointer to x by vl*8
      1c:  vfmadd v1, v0, fa0, v1  # v1 += v0 * fa0 (y = a * x + y)
      20:  sub    a0, a0, t0       # n -= vl (t0)
      24:  vst    v1, a2           # store vector y
      28:  add    a2, a2, t1       # increment pointer to y by vl*8
      2c:  bnez   a0, loop         # repeat if n != 0
      30:  ret                     # return

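As a cross-check for the listing above, here is a minimal Python reference model of the same strip-mined DAXPY loop. It follows the `vl = min(mvl, n)` behaviour stated in the `setvl` comment, not the six rules listed earlier. The `mvl` parameter and the function name are assumptions made for illustration.

```python
def daxpy(n, a, x, y, mvl):
    """Compute y[i] = a * x[i] + y[i] for i < n, at most mvl elements per pass."""
    i = 0                      # element offset; stands in for pointers a1 / a2
    while n != 0:
        vl = min(mvl, n)       # setvl t0, a0
        for k in range(vl):    # vld, vld, vfmadd, vst over vl elements
            y[i + k] = a * x[i + k] + y[i + k]
        i += vl                # add a1, a1, t1 / add a2, a2, t1
        n -= vl                # sub a0, a0, t0
    return y
```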
----

Swizzle needs a MV: see [[mv.x]].
  87