openpower/sv/ldst/discussion.mdwn

# notes from lxo

This section covers assembly notation for the immediate and indexed LD/ST.
In summary: in immediate mode for LD, when the destination register is
Vectorized (`RT.v`) but the source `imm(RA)` is scalar, it is not obvious
from the notation that the memory being read is *still a vector load*
(a fixed-stride load, with either "unit" or "element" stride).

This anomaly is made clear with the following notation:

    sv.ld RT.v, imm(RA).v

The following notation, although it would implicitly be identical in meaning to the above, is prohibited and is a syntax error:

    sv.ld RT.v, imm(RA)

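The rule above (a Vectorized `RT` with a scalar `RA` must carry an explicit `.v` on the memory operand) could be checked roughly as follows. This is only a sketch: the function name `check_ld_imm`, the regular expression, and the spelling of registers and immediates it accepts are assumptions for illustration, not part of any actual assembler.

```python
import re

# Hypothetical pattern for "sv.ld RT[.v], imm(RA[.v])[.v]".
# The accepted register/immediate spelling is an assumption for this sketch.
_LD_IMM = re.compile(
    r"^sv\.ld(?:/els)?\s+"
    r"(?P<rt>\w+)(?P<rt_v>\.v)?\s*,\s*"
    r"(?P<imm>-?\w+)\((?P<ra>\w+)(?P<ra_v>\.v)?\)(?P<mem_v>\.v)?$"
)

def check_ld_imm(line):
    """Return (rt_is_vector, ra_is_vector, mem_is_vector) or raise SyntaxError."""
    m = _LD_IMM.match(line.strip())
    if m is None:
        raise SyntaxError(f"not an immediate-mode sv.ld: {line!r}")
    rt_v = m["rt_v"] is not None
    ra_v = m["ra_v"] is not None
    mem_v = m["mem_v"] is not None
    # Vector destination, scalar RA, and no ".v" on imm(RA):
    # this is the prohibited "sv.ld RT.v, imm(RA)" form.
    if rt_v and not ra_v and not mem_v:
        raise SyntaxError("vector RT with scalar RA needs imm(RA).v")
    return rt_v, ra_v, mem_v
```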
Notes taken from IRC conversation:

    <lxo> sv.ld r#.v, ofst(r#).v -> the whole vector is at ofst+r#
    <lxo> sv.ld r#.v, ofst(r#.v) -> r# is a vector of addresses
    <lxo> similarly sv.ldx r#.v, r#, r#.v -> whole vector at r#+r#
    <lxo> whereas sv.ldx r#.v, r#.v, r# -> vector of addresses
    <lxo> point being, you take an operand with the "m" constraint (or other memory-operand constraints), append .v to it and you're done addressing the in-memory vector
    <lxo> as in asm ("sv.ld1 %0.v, %1.v" : "=r"(vec_in_reg) : "m"(vec_in_mem));
    <lxo> (and ld%U1 got mangled into underline; %U expands to x if the address is a sum of registers

permutations of vector selection, to identify the above asm syntax:

     imm(RA)  RT.v   RA.v   nonstrided
         sv.ld r#.v, ofst(r#2.v) -> r#2 is a vector of addresses
           mem@     ofst+r#2  ofst+(r#2+1)  ofst+(r#2+2)
           destreg  r#        r#+1          r#+2
     imm(RA)  RT.s   RA.v   nonstrided
         sv.ld r#, ofst(r#2.v) -> r#2 is a vector of addresses
           (dest r# is scalar) -> VSELECT mode
     imm(RA)  RT.v   RA.s   fixed stride: unit or element
         sv.ld r#.v, ofst(r#2).v -> whole vector is at ofst+r#2
           mem@(ofst+r#2)  +0   +1   +2
           destreg         r#   r#+1 r#+2
         sv.ld/els r#.v, ofst(r#2).v -> vector at ofst*elidx+r#2
           mem@r#2  +0 ...   +ofst ...  +ofst*2
           destreg  r#       r#+1       r#+2
     imm(RA)  RT.s   RA.s   not vectorized
         sv.ld r#, ofst(r#2)

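The immediate-mode permutations above can be sketched as a small Python model of the effective addresses. This is an illustrative sketch rather than the specification: the function name `ld_imm_addresses`, the mode strings, the `regs` dictionary standing in for the register file, and the `elwidth` parameter (bytes per element, assumed to be 8 here, scaling the "+0 +1 +2" of the unit-stride case) are all assumptions made for the example.

```python
def ld_imm_addresses(regs, ra, imm, vl, mode, elwidth=8):
    """Return the effective address of each element for an immediate-mode LD.

    regs    -- dict mapping register number to its contents (assumed model)
    ra      -- register number of RA
    imm     -- the immediate offset ("ofst" in the notes)
    vl      -- vector length
    mode    -- one of the hypothetical mode names below
    elwidth -- element width in bytes (assumption; used for unit stride only)
    """
    if mode == "vector_of_addresses":   # sv.ld RT.v, ofst(RA.v)
        # each element takes its address from the next register RA+i
        return [imm + regs[ra + i] for i in range(vl)]
    if mode == "unit_stride":           # sv.ld RT.v, ofst(RA).v
        # whole vector is contiguous, starting at ofst+RA
        return [imm + regs[ra] + i * elwidth for i in range(vl)]
    if mode == "element_stride":        # sv.ld/els RT.v, ofst(RA).v
        # element i is at ofst*elidx + RA
        return [regs[ra] + imm * i for i in range(vl)]
    if mode == "scalar":                # sv.ld RT, ofst(RA)
        return [imm + regs[ra]]
    raise ValueError(f"unknown mode: {mode}")
```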
indexed mode:

     RA,RB    RT.v  RA.v  RB.v
        sv.ldx r#.v, r#2.v, r#3.v -> whole vector at r#2+r#3
     RA,RB    RT.v  RA.s  RB.v
        sv.ldx r#.v, r#2, r#3.v -> whole vector at r#2+r#3
     RA,RB    RT.v  RA.v  RB.s
        sv.ldx r#.v, r#2.v, r#3 -> vector of addresses
     RA,RB    RT.v  RA.s  RB.s
        sv.ldx r#.v, r#2, r#3 -> VSPLAT mode
     RA,RB    RT.s  RA.v  RB.v
     RA,RB    RT.s  RA.s  RB.v
     RA,RB    RT.s  RA.v  RB.s
     RA,RB    RT.s  RA.s  RB.s not vectorized
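
The three indexed forms that the notes actually describe can be modelled in the same spirit. This sketch follows the IRC reading, in which `.v` on the last operand marks the memory operand as a vector and `.v` on `RA` gives a vector of addresses. The function name `ldx_addresses`, its flags, the `elwidth` parameter and the representation of VSPLAT as one address repeated `vl` times are assumptions for illustration. The rows left blank in the table are deliberately not modelled.

```python
def ldx_addresses(regs, ra, rb, vl, ra_vec=False, mem_vec=False, elwidth=8):
    """Return per-element effective addresses for an indexed LD (sketch).

    ra_vec  -- True for "sv.ldx RT.v, RA.v, RB": RA is a vector of addresses
    mem_vec -- True for "sv.ldx RT.v, RA, RB.v": whole vector at RA+RB
    elwidth -- element width in bytes (assumption, used when mem_vec is set)
    """
    if ra_vec and mem_vec:
        # combination not described in the notes: refuse rather than guess
        raise NotImplementedError("RA.v together with a vector memory operand")
    if ra_vec:
        # element i uses register RA+i as its base
        return [regs[ra + i] + regs[rb] for i in range(vl)]
    if mem_vec:
        # contiguous vector starting at RA+RB
        return [regs[ra] + regs[rb] + i * elwidth for i in range(vl)]
    # "sv.ldx RT.v, RA, RB": VSPLAT, one address feeds every destination element
    return [regs[ra] + regs[rb]] * vl
```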