# SV Vector-assist Operations

* <https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-register-gather-instructions>
* <https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-May/004884.html>
* <https://bugs.libre-soc.org/show_bug.cgi?id=865> implementation in simulator
* <https://bugs.libre-soc.org/show_bug.cgi?id=213>
* <https://bugs.libre-soc.org/show_bug.cgi?id=142> specialist vector ops
  out of scope for this document [[openpower/sv/3d_vector_ops]]
* [[simple_v_extension/specification/bitmanip]] previous version,
  contains pseudocode for sof, sif, sbf
* <https://en.m.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set#TBM_(Trailing_Bit_Manipulation)>

The core Power ISA was designed as scalar: SV provides a level of
abstraction to add variable-length element-independent parallelism.
Therefore there are few cases where *actual* Vector instructions
are needed, and where they are, they act more as "assistance" functions. Two
traditional Vector instructions were initially considered (conflictd and
vmiota), however they may be synthesised from existing SVP64 instructions:
vmiota may use [[svstep]]. Details are in [[discussion]].

* Instructions suited to 3D GPU workloads (dotproduct, crossproduct,
  normalise) are out of scope: this document is for more general-purpose
  instructions that underpin and are critical to general-purpose Vector
  workloads (including GPU and VPU).
* Instructions related to the adaptation of CRs for use as
  predicate masks are covered separately, by crweird operations.
  See [[sv/cr_int_predication]].

## Mask-suited Bitmanipulation

|0..5  |6..10|11..15|16..20|21..25|26|27..31| Form     |
|------|-----|------|------|------|--|------|----------|
| PO   | RS  | RA   | RB   | bm   |L | XO   | BM2-Form |

    if _RB = 0 then mask <- [1] * XLEN
    if bm[4] = 0 then a1 <- ¬ra
    if mode2 = 0 then a2 <- (¬ra)+1
    if mode2 = 1 then a2 <- ra-1
    if mode2 = 2 then a2 <- ra+1
    if mode2 = 3 then a2 <- ¬(ra+1)
    if mode3 = 0 then result <- a1 | a2
    if mode3 = 1 then result <- a1 & a2
    if mode3 = 2 then result <- a1 ^ a2
    if mode3 = 3 then result <- undefined([0]*XLEN)
    result <- result & mask
    # optionally restore masked-out bits
    result <- result | (RA & ¬mask)

* first pattern A: two options `x` or `~x`
* second pattern B: three options `|` `&` or `^`
* third pattern C: four options `x+1`, `x-1`, `~(x+1)` or `(~x)+1`
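
The three patterns and the masking steps in the pseudocode can be tried out with a minimal Python sketch. This is an illustration under stated assumptions, not the reference model: the function name `bmask_sketch` and its parameters are hypothetical, `mode2` and `mode3` are passed in directly because their decode from `bm` is not shown here, lowercase `ra` is assumed to be RA limited to the mask, `RB=None` stands in for the `_RB = 0` case, and the `restore` flag is an assumed stand-in for whatever controls the optional restore step.

```python
XLEN = 64
ALL_ONES = (1 << XLEN) - 1


def bmask_sketch(RA, RB=None, invert_a1=False, mode2=0, mode3=0, restore=False):
    """Hypothetical model of the visible pseudocode steps (not the reference model)."""
    # RB=None models "_RB = 0": every bit takes part
    mask = ALL_ONES if RB is None else RB & ALL_ONES
    # assumption: lowercase `ra` is RA limited to the active mask bits
    ra = RA & mask
    # pattern A: x or ~x (invert_a1=True corresponds to bm[4] = 0)
    a1 = ~ra if invert_a1 else ra
    # pattern C: (~x)+1, x-1, x+1 or ~(x+1)
    if mode2 == 0:
        a2 = (~ra) + 1
    elif mode2 == 1:
        a2 = ra - 1
    elif mode2 == 2:
        a2 = ra + 1
    elif mode2 == 3:
        a2 = ~(ra + 1)
    else:
        raise ValueError("mode2 must be in the range 0..3")
    # pattern B: |, & or ^
    if mode3 == 0:
        result = a1 | a2
    elif mode3 == 1:
        result = a1 & a2
    elif mode3 == 2:
        result = a1 ^ a2
    else:
        raise ValueError("mode3 = 3 is undefined")
    # mask the output (this also folds Python's negative ints back into XLEN bits)
    result &= mask
    # optionally restore masked-out bits from the original RA
    if restore:
        result |= RA & ~mask & ALL_ONES
    return result


# x & -x isolates the lowest set bit (as x86 BMI1 BLSI does)
print(bin(bmask_sketch(0b10100, mode2=0, mode3=1)))  # 0b100
# x & (x-1) clears the lowest set bit (as x86 BMI1 BLSR does)
print(bin(bmask_sketch(0b10100, mode2=1, mode3=1)))  # 0b10000
```

Passing a mask in `RB` confines the operation to the selected bits, and `restore=True` copies the untouched bits of RA back into the result.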

The lower two bits of `bm` set to 0b11 are `RESERVED`. An illegal instruction

Special Registers Altered:

A single scalar 32-bit instruction may compute up to 64 carry-propagation
bits. When the output is then used as a Predicate mask, it selectively
performs the "add carry" step of biginteger math, with
`sv.addi/sm=rN RT.v, RA.v, 1`.
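
The idea of using carry bits as a predicate mask for a later "add 1" can be illustrated with a short Python sketch. It uses one well-known propagate/generate formulation and is not taken from the cprop definition, which is not shown in this section. All names (`carry_mask`, `bigint_add`, `LIMB_BITS`) are hypothetical, and the list comprehension at the end only mimics the effect of a predicated add-immediate over a vector of 64-bit limbs.

```python
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


def carry_mask(propagate, generate):
    """Bit i of the result is set when limb i receives a carry-in."""
    # A generate bit feeds the limb above it, so shift up by one position.
    # The integer add then ripples each carry through runs of propagate
    # bits, and the XOR picks out every position the carry touched.
    return ((generate << 1) + propagate) ^ propagate


def bigint_add(a_limbs, b_limbs):
    """Add two equal-length little-endian limb vectors, dropping the final carry."""
    # step 1: element-independent adds, ignoring carries between limbs
    sums = [(a + b) & LIMB_MASK for a, b in zip(a_limbs, b_limbs)]
    generate = 0
    propagate = 0
    for i, (a, s) in enumerate(zip(a_limbs, sums)):
        if s < a:             # limb overflowed: it produces a carry-out
            generate |= 1 << i
        if s == LIMB_MASK:    # limb is all ones: it passes a carry-in along
            propagate |= 1 << i
    # step 2: one scalar computation yields all carry-in bits at once
    mask = carry_mask(propagate, generate)
    # step 3: predicated "add 1", only limbs selected by the mask change
    return [(s + 1) & LIMB_MASK if (mask >> i) & 1 else s
            for i, s in enumerate(sums)]
```

With 64 mask bits available, a single mask computed this way would cover a vector of up to 64 limbs.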

* cprop RT,RA,RB (Rc=0)
* cprop. RT,RA,RB (Rc=1)

| 0:5|6:10|11:15|16:20| 21:30     |31| name  | Form    |
| -- | -- | ---   | ---   | --------- |--| ----  | ------- |
| PO | RT | RA    | RB    | XO        |Rc| cprop | X-Form  |

Used not just for carry lookahead, but also as a special type of predication mask operation.