# SV Load and Store

Vectorisation of Load and Store requires the creation, from scalar operations, of a number of different access types:

* fixed (unit) stride (contiguous sequence with no gaps)
* element strided (sequential but regularly offset, with gaps)
* vector indexed (vector of base addresses and vector of offsets)
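
The three access types listed above differ only in how each element's address is generated. As a minimal sketch (the function names and the plain-integer address model are illustrative assumptions, not part of the SV specification), the address sequences could be written as:

```python
def fixed_stride(base, elwidth, vl):
    """Contiguous: element i is at base + i * elwidth, so there are no gaps."""
    return [base + i * elwidth for i in range(vl)]


def element_strided(base, stride, vl):
    """Regularly offset: element i is at base + i * stride.

    Gaps appear whenever stride is larger than the element width.
    """
    return [base + i * stride for i in range(vl)]


def vector_indexed(bases, offsets):
    """Per-element: element i is at bases[i] + offsets[i]."""
    return [b + o for b, o in zip(bases, offsets)]


if __name__ == "__main__":
    print([hex(a) for a in fixed_stride(0x1000, 8, 4)])
    print([hex(a) for a in element_strided(0x1000, 24, 3)])
    print([hex(a) for a in vector_indexed([0x2000, 0x3000], [4, 8])])
```

In this model fixed stride is simply element stride with the stride equal to the element width.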

OpenPOWER Load/Store operations may be seen from the [[isa/fixedload]] and [[isa/fixedstore]] pseudocode to be of the form:

    lbux RT, RA, RB
    EA <- (RA) + (RB)
    RT <- MEM(EA)

and for immediate variants:

    lb RT,D(RA)
    EA <- (RA) + EXTS(D)
    RT <- MEM(EA)

Thus in the first example, each of the two source registers may be independently marked as scalar or vector, and likewise the destination; in the second example only the one source and the one destination may be so marked.

Thus Vector Indexed may be covered and, as demonstrated by the pseudocode below, the immediate can be set to the element width in order to give unit stride.

At the minimum, however, it is possible to provide unit stride and vector mode, as follows:

    function op_ld(RT, RA, immed) # LD not VLD!
      rdv = map_dest_extra(RT);
      rsv = map_src_extra(RA);
      ps = get_pred_val(FALSE, RA); # predication on src
      pd = get_pred_val(FALSE, RT); # ... AND on dest
      for (int i = 0, int j = 0; i < VL && j < VL;):
        # skip non-predicated elements (never stepping past VL)
        if (RA.isvec) while (i < VL && !(ps & 1<<i)) i++;
        if (RT.isvec) while (j < VL && !(pd & 1<<j)) j++;
        if (i >= VL || j >= VL) break # no active elements left
        if (RA.isvec)
          # indirect mode (multi mode)
          EA = ireg[rsv+i] + immed;
        elif (RT.isvec)
          # unit and element stride mode: RA is scalar, so i never
          # advances; the stride is scaled by the dest index j
          EA = ireg[rsv] + j * immed
        else
          # standard scalar mode (but predicated)
          EA = ireg[rsv] + immed
        ireg[rdv+j] <= MEM[EA];
        if (!RA.isvec && !RT.isvec)
          break # scalar-scalar
        if (RA.isvec) i++;
        if (RT.isvec) j++;
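
To make the immediate-load loop easier to experiment with, here is a minimal executable sketch in Python. It is an illustration under stated assumptions, not the reference model: the register file is a plain list, memory is a dict keyed by address, the scalar/vector markings and predicate masks are passed in directly (so `map_dest_extra`, `map_src_extra` and `get_pred_val` are not modelled), element width is ignored, and in unit/element stride mode the stride is assumed to be scaled by the destination element index.

```python
def op_ld(ireg, mem, rt, ra, immed, vl, rt_isvec, ra_isvec, ps=-1, pd=-1):
    """Hypothetical model of a twin-predicated immediate load.

    ps and pd are the source and destination predicate bitmasks;
    the default of -1 means "all elements active".
    """
    i = j = 0
    while i < vl and j < vl:
        # Skip masked-out elements, never stepping past vl.
        if ra_isvec:
            while i < vl and not (ps >> i) & 1:
                i += 1
        if rt_isvec:
            while j < vl and not (pd >> j) & 1:
                j += 1
        if i >= vl or j >= vl:
            break
        if ra_isvec:
            ea = ireg[ra + i] + immed      # indirect mode: vector of bases
        elif rt_isvec:
            ea = ireg[ra] + j * immed      # unit / element stride mode
        else:
            ea = ireg[ra] + immed          # plain scalar load
        ireg[rt + j] = mem[ea]
        if not ra_isvec and not rt_isvec:
            break                          # scalar-scalar: one element only
        if ra_isvec:
            i += 1
        if rt_isvec:
            j += 1


if __name__ == "__main__":
    regs = [0] * 32
    regs[3] = 0x100                        # scalar base address in r3
    memory = {0x100 + 8 * n: 10 + n for n in range(4)}
    # Immediate equal to the element width (8 bytes) gives unit stride.
    op_ld(regs, memory, rt=8, ra=3, immed=8, vl=4,
          rt_isvec=True, ra_isvec=False)
    print(regs[8:12])
```

Passing a larger `immed` with the same markings gives element stride, and marking RA as a vector switches to the indirect mode, where each element supplies its own base address.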

Indexed LD is:

    function op_ldx(RT, RA, RB) # LD not VLD!
      rdv = map_dest_extra(RT);
      rsv = map_src_extra(RA);
      rso = map_src_extra(RB);
      ps = get_pred_val(FALSE, RA); # predication on src
      pd = get_pred_val(FALSE, RT); # ... AND on dest
      for (i=0, j=0, k=0; i < VL && j < VL && k < VL;):
        # skip non-predicated RA, RB and RT (never stepping past VL)
        if (RA.isvec) while (i < VL && !(ps & 1<<i)) i++;
        if (RB.isvec) while (k < VL && !(ps & 1<<k)) k++;
        if (RT.isvec) while (j < VL && !(pd & 1<<j)) j++;
        if (i >= VL || j >= VL || k >= VL) break # no active elements left
        EA = ireg[rsv+i] + ireg[rso+k] # indexed address
        ireg[rdv+j] <= MEM[EA];
        if (!RA.isvec && !RT.isvec && !RB.isvec)
          break # scalar-scalar
        if (RA.isvec) i++;
        if (RB.isvec) k++;
        if (RT.isvec) j++;
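
The indexed form can be modelled in the same way. The following Python sketch is again only an illustration: the register file is a list, memory is a dict, the scalar/vector markings and predicate masks are passed in as arguments, and it assumes that RA and RB are each indexed by their own element counter (`i` and `k`), with both sharing the source predicate.

```python
def op_ldx(ireg, mem, rt, ra, rb, vl, rt_isvec, ra_isvec, rb_isvec,
           ps=-1, pd=-1):
    """Hypothetical model of a twin-predicated indexed load.

    ps covers both sources (RA and RB), pd covers the destination;
    the default of -1 means "all elements active".
    """
    i = j = k = 0
    while i < vl and j < vl and k < vl:
        # Skip masked-out elements, never stepping past vl.
        if ra_isvec:
            while i < vl and not (ps >> i) & 1:
                i += 1
        if rb_isvec:
            while k < vl and not (ps >> k) & 1:
                k += 1
        if rt_isvec:
            while j < vl and not (pd >> j) & 1:
                j += 1
        if i >= vl or j >= vl or k >= vl:
            break
        ea = ireg[ra + i] + ireg[rb + k]   # base element + offset element
        ireg[rt + j] = mem[ea]
        if not (ra_isvec or rb_isvec or rt_isvec):
            break                          # all scalar: one element only
        if ra_isvec:
            i += 1
        if rb_isvec:
            k += 1
        if rt_isvec:
            j += 1


if __name__ == "__main__":
    regs = [0] * 32
    regs[3] = 0x1000                       # scalar base in r3
    regs[4:8] = [0, 24, 8, 16]             # vector of offsets in r4..r7
    memory = {0x1000 + o: o + 1 for o in (0, 8, 16, 24)}
    op_ldx(regs, memory, rt=16, ra=3, rb=4, vl=4,
           rt_isvec=True, ra_isvec=False, rb_isvec=True)
    print(regs[16:20])
```

With RA scalar and RB a vector this behaves as a gather from a single base; marking RA as a vector as well gives the fully vector-indexed case, with one base and one offset per element.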