[[!tag standards]]

# SV Vector-assist Operations

Links:

* [[discussion]]
* <https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-register-gather-instructions>
* <https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-May/004884.html>
* <https://bugs.libre-soc.org/show_bug.cgi?id=865> implementation in simulator
* <https://bugs.libre-soc.org/show_bug.cgi?id=213>
* <https://bugs.libre-soc.org/show_bug.cgi?id=142> specialist vector ops
  out of scope for this document [[openpower/sv/3d_vector_ops]]
* [[simple_v_extension/specification/bitmanip]] previous version,
  contains pseudocode for sof, sif, sbf
* <https://en.m.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set#TBM_(Trailing_Bit_Manipulation)>

The core Power ISA was designed as scalar: SV provides a level of
abstraction to add variable-length element-independent parallelism.
Therefore there are not that many cases where *actual* Vector instructions
are needed. Where they are needed, they are more "assistance" functions.
Two traditional Vector instructions were initially considered (conflictd
and vmiota), however they may be synthesised from existing SVP64
instructions: vmiota may use [[svstep]]. Details are in [[discussion]].

Notes:

* Instructions suited to 3D GPU workloads (dotproduct, crossproduct,
  normalise) are out of scope: this document is for more general-purpose
  instructions that underpin and are critical to general-purpose Vector
  workloads (including GPU and VPU).
* Instructions related to the adaptation of CRs for use as
  predicate masks are covered separately, by crweird operations.
  See [[sv/cr_int_predication]].

## Mask-suited Bitmanipulation

BM2-Form

|0..5  |6..10|11..15|16..20|21..25|26|27..31| Form |
|------|-----|------|------|------|--|------|------|
| PO   |  RT |   RA |   RB |bm    |L |   XO | BM2-Form |

* bmask RT,RA,RB,bm,L

Pseudo-code:

```
    # an RB field of zero selects an all-ones mask
    if _RB = 0 then mask <- [1] * XLEN
    else            mask <- (RB)
    ra <- (RA) & mask
    # first operand: ra or its complement
    a1 <- ra
    if bm[4] = 0 then a1 <- ¬ra
    # second operand: increment/decrement variant of ra
    mode2 <- bm[2:3]
    if mode2 = 0 then a2 <- (¬ra)+1
    if mode2 = 1 then a2 <- ra-1
    if mode2 = 2 then a2 <- ra+1
    if mode2 = 3 then a2 <- ¬(ra+1)
    a1 <- a1 & mask
    a2 <- a2 & mask
    # select operator
    mode3 <- bm[0:1]
    if mode3 = 0 then result <- a1 | a2
    if mode3 = 1 then result <- a1 & a2
    if mode3 = 2 then result <- a1 ^ a2
    if mode3 = 3 then result <- undefined([0]*XLEN)
    # mask output
    result <- result & mask
    # optionally restore masked-out bits
    if L = 1 then
        result <- result | ((RA) & ¬mask)
    RT <- result
```

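Each operation has the form `A B C`, where:

* first pattern A (`a1`, selected by `bm[4]`): two options `x` or `~x`
* second pattern B (the operator, selected by `bm[0:1]`): three options `|` `&` or `^`
* third pattern C (`a2`, selected by `bm[2:3]`): four options `x+1`, `x-1`, `~(x+1)` or `(~x)+1`

The two bits `bm[0:1]` (`mode3` in the pseudo-code) set to 0b11 are
`RESERVED`. An illegal instruction trap must be raised.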
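The bmask pseudo-code can be modelled in a few lines of Python, which makes it easier to experiment with `bm` encodings. This is a minimal sketch only, not the simulator implementation. The function name `bmask_model`, the use of `rb=None` to stand in for the `_RB = 0` case, the `ValueError` standing in for the illegal instruction trap, and the reading of `bm` in MSB0 order (`bm[0]` being the most significant of its five bits) are all assumptions made for illustration.

```python
def bmask_model(ra, rb=None, bm=0, L=0, xlen=64):
    """Illustrative model of the bmask pseudo-code (hypothetical helper).

    rb=None stands in for the `_RB = 0` case (all-ones mask).
    bm is a 5-bit value read in MSB0 order, so bm[0:1] are its top two
    bits and bm[4] is its lowest bit (an assumption of this sketch).
    """
    ones = (1 << xlen) - 1
    mask = ones if rb is None else rb & ones
    x = ra & mask

    # pattern A: x, or ~x when bm[4] is 0
    a1 = x if (bm & 0b1) else ~x

    # pattern C: selected by bm[2:3]
    mode2 = (bm >> 1) & 0b11
    if mode2 == 0:
        a2 = (~x) + 1
    elif mode2 == 1:
        a2 = x - 1
    elif mode2 == 2:
        a2 = x + 1
    else:
        a2 = ~(x + 1)
    a1 &= mask
    a2 &= mask

    # pattern B: operator selected by bm[0:1]
    mode3 = (bm >> 3) & 0b11
    if mode3 == 0:
        result = a1 | a2
    elif mode3 == 1:
        result = a1 & a2
    elif mode3 == 2:
        result = a1 ^ a2
    else:
        # stand-in for the illegal instruction trap
        raise ValueError("reserved bm encoding")
    result &= mask

    # L=1 restores the bits of RA that the mask excluded
    if L:
        result |= ra & ones & ~mask
    return result


x = 0b10110100
print(bin(bmask_model(x, bm=0b01011)))  # x & (x-1): clears the lowest set bit
print(bin(bmask_model(x, bm=0b01001)))  # x & ((~x)+1): isolates the lowest set bit
print(bin(bmask_model(x, bm=0b10011)))  # x ^ (x-1): mask up to the lowest set bit
```

Under the bit-order assumption above, these three encodings reproduce familiar trailing-bit idioms from a single instruction, with only `bm` changing between them.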

Special Registers Altered:

```
    None
```

## Carry-lookahead

As a single scalar 32-bit instruction, up to 64 carry-propagation bits
may be computed.  When the output is then used as a Predicate mask it can
be used to selectively perform the "add carry" of biginteger math, with
`sv.addi/sm=rN RT.v, RA.v, 1`.

* cprop RT,RA,RB (Rc=0)
* cprop. RT,RA,RB (Rc=1)

Pseudo-code:

```
    P <- (RA)
    G <- (RB)
    RT <- ((P|G)+G)^P
```

X-Form

| 0:5|6:10|11:15|16:20| 21:30      |31| name      |  Form   |
| -- | -- | --- | --- | ---------  |--| ----      | ------- |
| PO | RT | RA  | RB  | XO         |Rc|     cprop | X-Form  |
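To see why one add and one XOR are enough, the cprop pseudo-code can be compared with a bit-at-a-time carry chain. The sketch below is illustrative only. The names `cprop_model` and `ripple_carry_in` are hypothetical, bit `i` is taken to be `1 << i` (LSB0 order, an assumption of this sketch), and the comparison assumes no bit is set in both P and G.

```python
def cprop_model(p, g, xlen=64):
    """Illustrative model of the cprop pseudo-code: ((P|G)+G)^P, truncated to xlen bits."""
    ones = (1 << xlen) - 1
    return (((p | g) + g) ^ p) & ones


def ripple_carry_in(p, g, xlen=64):
    """Reference model: the carry arriving at each position, rippled one bit at a time."""
    carry = 0
    out = 0
    for i in range(xlen):
        out |= carry << i            # record the carry arriving at position i
        p_i = (p >> i) & 1
        g_i = (g >> i) & 1
        carry = g_i | (p_i & carry)  # generate, or propagate an incoming carry
    return out


# position 0 generates a carry; positions 1 and 2 propagate it onwards
p, g = 0b0110, 0b0001
print(bin(cprop_model(p, g, xlen=4)))
assert cprop_model(p, g, xlen=4) == ripple_carry_in(p, g, xlen=4)
```

When P and G are disjoint, `(P|G)+G` equals `P + 2*G`: each generate bit is injected one position higher and the adder's own carry chain ripples it through the run of propagate bits. The final XOR with P then leaves a 1 at every position that received a carry.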

cprop is used not just for carry lookahead: it is also a special type of
predication mask operation.
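The biginteger use described at the start of this section can be sketched as follows. This is a Python model under stated assumptions, not SVP64 code. The helper names (`cprop_model`, `to_limbs`, `from_limbs`, `bigint_add`) are hypothetical, limbs are 64-bit and stored least significant first, and bit `i` of the mask is assumed to correspond to element `i`. The final list comprehension stands in for the predicated `sv.addi/sm=rN RT.v, RA.v, 1`.

```python
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


def cprop_model(p, g, xlen=64):
    """((P|G)+G)^P, truncated to xlen bits, as in the cprop pseudo-code."""
    return (((p | g) + g) ^ p) & ((1 << xlen) - 1)


def to_limbs(value, count):
    """Split an integer into `count` 64-bit limbs, least significant first."""
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]


def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def bigint_add(a_limbs, b_limbs):
    """Add two equal-length limb vectors (at most 64 limbs), modulo 2**(64*n)."""
    # step 1: element-independent adds, with no carry between limbs
    sums = [(a + b) & LIMB_MASK for a, b in zip(a_limbs, b_limbs)]

    # step 2: one bit per limb. G: the limb produced a carry-out.
    #         P: the limb is all ones, so would pass an incoming carry along.
    g = p = 0
    for i, (a, s) in enumerate(zip(a_limbs, sums)):
        if s < a:
            g |= 1 << i
        if s == LIMB_MASK:
            p |= 1 << i

    # step 3: a single cprop gives the mask of limbs that receive a carry
    carry_mask = cprop_model(p, g)

    # step 4: predicated "add 1" on exactly those limbs
    return [(s + ((carry_mask >> i) & 1)) & LIMB_MASK
            for i, s in enumerate(sums)]


a = (1 << 128) - 1
b = 1
result = from_limbs(bigint_add(to_limbs(a, 3), to_limbs(b, 3)))
assert result == a + b
print(hex(result))
```

In this model the carry out of the topmost limb is discarded, so the result is correct modulo `2**(64*n)` for `n` limbs.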