[[!tag standards]]

# SV Vector-assist Operations.

Links:

* [[discussion]]
* implementation in simulator
* specialist vector ops out of scope for this document [[openpower/sv/3d_vector_ops]]
* [[simple_v_extension/specification/bitmanip]] previous version, contains pseudocode for sof, sif, sbf

The core Power ISA was designed as scalar: SV provides a level of abstraction to add variable-length element-independent parallelism. Therefore there are not that many cases where *actual* Vector instructions are needed. If they are, they are more "assistance" functions. Two traditional Vector instructions were initially considered (conflictd and vmiota) however they may be synthesised from existing SVP64 instructions: vmiota may use [[svstep]]. Details in [[discussion]].

Notes:

* Instructions suited to 3D GPU workloads (dotproduct, crossproduct, normalise) are out of scope: this document is for more general-purpose instructions that underpin and are critical to general-purpose Vector workloads (including GPU and VPU)
* Instructions related to the adaptation of CRs for use as predicate masks are covered separately, by crweird operations. See [[sv/cr_int_predication]].

## Mask-suited Bitmanipulation

BM2-Form

|0..5  |6..10|11..15|16..20|21..25|26|27..31| Form     |
|------|-----|------|------|------|--|------|----------|
| PO   | RS  | RA   | RB   | bm   |L | XO   | BM2-Form |

* bmask RS,RA,RB,bm,L

Pseudo-code:

```
if _RB = 0 then mask <- [1] * XLEN
else            mask <- (RB)
ra <- (RA) & mask
a1 <- ra
if bm[4] = 0 then a1 <- ¬ra
mode2 <- bm[2:3]
if mode2 = 0 then a2 <- (¬ra)+1
if mode2 = 1 then a2 <- ra-1
if mode2 = 2 then a2 <- ra+1
if mode2 = 3 then a2 <- ¬(ra+1)
a1 <- a1 & mask
a2 <- a2 & mask
# select operator
mode3 <- bm[0:1]
if mode3 = 0 then result <- a1 | a2
if mode3 = 1 then result <- a1 & a2
if mode3 = 2 then result <- a1 ^ a2
if mode3 = 3 then result <- undefined([0]*XLEN)
# mask output
result <- result & mask
# optionally restore masked-out bits
if L = 1 then
    result <- result | ((RA) & ¬mask)
RT <- result
```

* first pattern A: two options `x` or `~x`
* second pattern B: three options `|` `&` or `^`
* third pattern C: four options `x+1`, `x-1`, `~(x+1)` or `(~x)+1`

The lower two bits of `bm` set to 0b11 are `RESERVED`. An illegal instruction trap must be raised.

Special Registers Altered:

```
None
```

## Carry-lookahead

As a single scalar 32-bit instruction, up to 64 carry-propagation bits may be computed. When the output is then used as a Predicate mask it can be used to selectively perform the "add carry" of biginteger math, with `sv.addi/sm=rN RT.v, RA.v, 1`.

* cprop RT,RA,RB (Rc=0)
* cprop. RT,RA,RB (Rc=1)

pseudocode:

```
P = (RA)
G = (RB)
RT = ((P|G)+G)^P
```

X-Form

| 0:5|6:10|11:15|16:20| 21:30 |31| name  | Form   |
| -- | -- | --- | --- | ----- |--| ----- | ------ |
| PO | RT | RA  | RB  | XO    |Rc| cprop | X-Form |

This instruction is used not just for carry lookahead but also as a special type of predication mask operation.
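
The cprop pseudocode above can be modelled in a few lines of Python. This is a minimal sketch, not the simulator implementation. It assumes XLEN=64 with the result truncated to XLEN bits, that bit `i` of `P` means "element `i` is all ones (propagates a carry)", and that bit `i` of `G` means "element `i` produced a carry out (generates)". The helper `bigint_add_limbs` and its 8-bit limb size are hypothetical, chosen only to illustrate the predicated "add carry" step described above.

```python
XLEN = 64  # assumed register width; the result is truncated to XLEN bits


def cprop(p, g, xlen=XLEN):
    """Model of the pseudocode: RT = ((P|G)+G)^P, truncated to xlen bits."""
    mask = (1 << xlen) - 1
    p &= mask
    g &= mask
    return (((p | g) + g) ^ p) & mask


def bigint_add_limbs(a, b, limb_bits=8):
    """Hypothetical illustration: add two little-endian limb lists of equal
    length, resolving inter-limb carries with cprop. The carry out of the
    top limb is discarded (modular arithmetic)."""
    limb_mask = (1 << limb_bits) - 1
    # element-independent adds, carries not yet applied
    sums = [x + y for x, y in zip(a, b)]
    # G: limb i overflowed on its own; P: limb i is all ones after the add
    g = sum(1 << i for i, s in enumerate(sums) if s > limb_mask)
    p = sum(1 << i for i, s in enumerate(sums) if (s & limb_mask) == limb_mask)
    # under the assumptions above, bit i of the result selects the limbs
    # that must be incremented, playing the role of the predicate mask
    carry_in = cprop(p, g)
    return [((s & limb_mask) + ((carry_in >> i) & 1)) & limb_mask
            for i, s in enumerate(sums)]


if __name__ == "__main__":
    # 0xFFFF + 0x0001 with 8-bit limbs, least significant limb first
    print(bigint_add_limbs([0xFF, 0xFF, 0x00], [0x01, 0x00, 0x00]))
```

In this model the final list comprehension stands in for the predicated `sv.addi`: only limbs whose bit is set in the cprop result receive the `+1`.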