These are bit manipulation opcodes that, if provided, augment SimpleV for
the purposes of efficiently accelerating Vector Processing and 3D Graphics.

The justification for their inclusion in BitManip is identical to the
significant justification that went into their inclusion in the
RISC-V Vector Extension (under the "Predicate Mask" opcodes section):

<https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-mask-instructions>

SV uses standard integer scalar registers as a predicate bitmask. Therefore,
the majority of RISC-V RV32I / RV64I bit-level instructions are perfectly
adequate. RVV does, however, contain some mask operations that they do not cover.

## logical bit-wise instructions

These are the available bitwise instructions in RVV:

    vmand.mm    vd, vs2, vs1 # vd[i] = vs2[i].LSB && vs1[i].LSB
    vmnand.mm   vd, vs2, vs1 # vd[i] = !(vs2[i].LSB && vs1[i].LSB)
    vmandnot.mm vd, vs2, vs1 # vd[i] = vs2[i].LSB && !vs1[i].LSB
    vmxor.mm    vd, vs2, vs1 # vd[i] = vs2[i].LSB ^^ vs1[i].LSB
    vmor.mm     vd, vs2, vs1 # vd[i] = vs2[i].LSB || vs1[i].LSB
    vmnor.mm    vd, vs2, vs1 # vd[i] = !(vs2[i].LSB || vs1[i].LSB)
    vmornot.mm  vd, vs2, vs1 # vd[i] = vs2[i].LSB || !vs1[i].LSB
    vmxnor.mm   vd, vs2, vs1 # vd[i] = !(vs2[i].LSB ^^ vs1[i].LSB)

The ones that exist in scalar RISC-V are:

    AND rd, rs1, rs2 # rd = rs1 & rs2
    OR  rd, rs1, rs2 # rd = rs1 | rs2
    XOR rd, rs1, rs2 # rd = rs1 ^ rs2

The ones in Bitmanip are:

    ANDN rd, rs1, rs2 # rd = rs1 & ~rs2
    ORN  rd, rs1, rs2 # rd = rs1 | ~rs2
    XNOR rd, rs1, rs2 # rd = rs1 ^ ~rs2

These are currently listed as "pseudo-ops" in BitManip-Draft (0.91).
They need to be actual opcodes.
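
As a rough illustration of the mapping described above, the eight RVV mask operations can be modelled on scalar predicate registers held as Python integers. This is only a sketch: `XLEN`, `ONES` and the Python function names are assumptions made for the example, `XNOR` stands for the `rs1 ^ ~rs2` Bitmanip operation, and the two-step forms for `vmnand` and `vmnor` simply reflect that neither appears in the two lists above.

```python
XLEN = 64                 # assumed register width
ONES = (1 << XLEN) - 1    # all-ones value, used to truncate results to XLEN bits

# scalar RV32I / RV64I
def AND(rs1, rs2):  return rs1 & rs2
def OR(rs1, rs2):   return rs1 | rs2
def XOR(rs1, rs2):  return rs1 ^ rs2

# Bitmanip
def ANDN(rs1, rs2): return rs1 & ~rs2 & ONES
def ORN(rs1, rs2):  return (rs1 | ~rs2) & ONES
def XNOR(rs1, rs2): return (rs1 ^ ~rs2) & ONES

# RVV mask operations expressed with the above (vs2, vs1 operand order)
def vmand(vs2, vs1):    return AND(vs2, vs1)
def vmor(vs2, vs1):     return OR(vs2, vs1)
def vmxor(vs2, vs1):    return XOR(vs2, vs1)
def vmandnot(vs2, vs1): return ANDN(vs2, vs1)
def vmornot(vs2, vs1):  return ORN(vs2, vs1)
def vmxnor(vs2, vs1):   return XNOR(vs2, vs1)

# no single counterpart in the lists above: invert the AND / OR result
def vmnand(vs2, vs1):   return XNOR(AND(vs2, vs1), 0)
def vmnor(vs2, vs1):    return XNOR(OR(vs2, vs1), 0)

print(bin(vmandnot(0b1100, 0b1010)))  # 0b100
```

In this model six of the eight operations are a single scalar instruction, while `vmnand` and `vmnor` each take two.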

TODO: there is an extensive table in RVV of bit-level operations.
The first four columns give the output bit for each combination of
src1 and src2 input bits:

| src1=0 src2=0 | src1=0 src2=1 | src1=1 src2=0 | src1=1 src2=1 | instruction                | pseudoinstruction |
| ------------- | ------------- | ------------- | ------------- | -------------------------- | ----------------- |
| 0 | 0 | 0 | 0 | vmxor.mm vd, vd, vd        | vmclr.m vd        |
| 1 | 0 | 0 | 0 | vmnor.mm vd, src1, src2    |                   |
| 0 | 1 | 0 | 0 | vmandnot.mm vd, src2, src1 |                   |
| 1 | 1 | 0 | 0 | vmnand.mm vd, src1, src1   | vmnot.m vd, src1  |
| 0 | 0 | 1 | 0 | vmandnot.mm vd, src1, src2 |                   |
| 1 | 0 | 1 | 0 | vmnand.mm vd, src2, src2   | vmnot.m vd, src2  |
| 0 | 1 | 1 | 0 | vmxor.mm vd, src1, src2    |                   |
| 1 | 1 | 1 | 0 | vmnand.mm vd, src1, src2   |                   |
| 0 | 0 | 0 | 1 | vmand.mm vd, src1, src2    |                   |
| 1 | 0 | 0 | 1 | vmxnor.mm vd, src1, src2   |                   |
| 0 | 1 | 0 | 1 | vmand.mm vd, src2, src2    | vmcpy.m vd, src2  |
| 1 | 1 | 0 | 1 | vmornot.mm vd, src2, src1  |                   |
| 0 | 0 | 1 | 1 | vmand.mm vd, src1, src1    | vmcpy.m vd, src1  |
| 1 | 0 | 1 | 1 | vmornot.mm vd, src1, src2  |                   |
| 0 | 1 | 1 | 1 | vmor.mm vd, src1, src2     |                   |
| 1 | 1 | 1 | 1 | vmxnor.mm vd, vd, vd       | vmset.m vd        |
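
The rows of the table can be re-derived mechanically. The sketch below is an illustration only: the `OPS` dictionary, `truth_row` and the `"src1"`/`"src2"` selector strings are names invented for the example, and each lambda models the single-bit behaviour given in the RVV instruction list earlier in this section. It computes the four output bits for a chosen instruction and operand order, and counts how many distinct truth tables the eight instructions can produce.

```python
from itertools import product

# single-bit behaviour of the eight RVV mask instructions, op(a, b)
OPS = {
    "vmand":    lambda a, b: a & b,
    "vmnand":   lambda a, b: 1 ^ (a & b),
    "vmandnot": lambda a, b: a & (1 ^ b),
    "vmxor":    lambda a, b: a ^ b,
    "vmor":     lambda a, b: a | b,
    "vmnor":    lambda a, b: 1 ^ (a | b),
    "vmornot":  lambda a, b: a | (1 ^ b),
    "vmxnor":   lambda a, b: 1 ^ (a ^ b),
}

def truth_row(op, first, second):
    """Outputs for (src1, src2) = (0,0), (0,1), (1,0), (1,1).

    first and second are "src1" or "src2": which input feeds each operand.
    """
    row = []
    for src1, src2 in product((0, 1), repeat=2):
        inputs = {"src1": src1, "src2": src2}
        row.append(OPS[op](inputs[first], inputs[second]))
    return tuple(row)

# every instruction combined with every choice of operands
reachable = {
    truth_row(op, first, second)
    for op in OPS
    for first, second in product(("src1", "src2"), repeat=2)
}

print(truth_row("vmandnot", "src2", "src1"))  # (0, 1, 0, 0)
print(len(reachable))                         # 16
```

Passing the same input twice (for example `truth_row("vmnand", "src1", "src1")`) reproduces the rows that have a pseudoinstruction such as `vmnot.m`.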

## pcnt - population count

Counts the number of bits set in the source:

    unsigned int v; // count the number of bits set in v
    unsigned int c; // c accumulates the total bits set in v
    for (c = 0; v; c++)
        v &= v - 1; // clear the least significant bit set

This instruction is present in BitManip.
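
The same clear-the-lowest-set-bit loop can be tried out directly. The following is a minimal sketch; the function name `popcount` is an assumption for the example, and `v` is assumed to be a non-negative integer.

```python
def popcount(v):
    """Count set bits by clearing the lowest set bit until none remain."""
    c = 0
    while v:
        v &= v - 1   # clear the least significant bit set
        c += 1
    return c

print(popcount(0b10010100))  # 3
```

The loop runs once per set bit rather than once per bit position, which is why it is a common software fallback when no hardware instruction is available.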

## ffirst - find first bit

Finds the first bit set and returns its index.

    uint_xlen_t clz(uint_xlen_t rs1) {
        for (int count = 0; count < XLEN; count++)
            if ((rs1 << count) >> (XLEN - 1))
                return count;
        return XLEN;
    }

This is similar but not identical to BitManip "CLZ". CLZ returns XLEN when no bits are set, whereas RVV returns -1.
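
To make the difference in the "no bits set" case concrete, here is a small sketch that follows the CLZ-style search above. `XLEN = 64` is an assumed width, and `ffirst` is a hypothetical wrapper that differs from `clz` only in what it returns for a zero input; `rs1` is assumed to fit in `xlen` bits.

```python
XLEN = 64  # assumed register width

def clz(rs1, xlen=XLEN):
    """Count leading zeros; returns xlen when rs1 is zero."""
    for count in range(xlen):
        if (rs1 << count) & (1 << (xlen - 1)):
            return count
    return xlen

def ffirst(rs1, xlen=XLEN):
    """Same search as clz, but reports -1 when no bit is set."""
    count = clz(rs1, xlen)
    return -1 if count == xlen else count

print(clz(0), ffirst(0))  # 64 -1
```

Software that only has CLZ available would need the extra comparison against XLEN to obtain the -1 result.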

## sbf - set before first bit

Sets every bit, working up from the LSB, until (but excluding) the first bit
that is set in the src, and writes zeros at that bit and at every bit above it.
If the second operand is non-zero, the search (in the same LSB to MSB order)
is carried out only at the positions where 1s are set in the second operand.

A side-effect of the search is that when src is zero, the output is all ones.
If the second operand is non-zero and the src is zero, the output is a
copy of the second operand.

Examples:

    7 6 5 4 3 2 1 0   Bit number

    1 0 0 1 0 1 0 0   a3 contents
                      sbf a2, a3, x0
    0 0 0 0 0 0 1 1   a2 contents

    1 0 0 1 0 1 0 1   a3 contents
                      sbf a2, a3, x0
    0 0 0 0 0 0 0 0   a2 contents

    0 0 0 0 0 0 0 0   a3 contents
                      sbf a2, a3, x0
    1 1 1 1 1 1 1 1   a2 contents

    1 1 0 0 0 0 1 1   a0 contents
    1 0 0 1 0 1 0 0   a3 contents
                      sbf a2, a3, a0
    0 1 0 0 0 0 1 1   a2 contents

Pseudo-code:

    def sbf(rd, rs1, rs2):
        result = 0
        for i in range(XLEN):
            bit = 1 << i
            # when a predicate is in use, only act on bits it marks valid
            if rs2 != x0 and not (regs[rs2] & bit):
                continue
            if regs[rs1] & bit: # found a bit in rs1: stop setting rd
                break
            result |= bit # always set except when search succeeds
        regs[rd] = result

## sif - set including first bit

Similar to sbf except including the bit which ends a run, i.e.
sets all LSBs leading up to *and including* where an LSB in the src is set,
and sets zeros following the point where the src bit is found.

The side-effect of when the src is zero is also the same as for sbf:
output is all 1s if src2 is zero, and output is equal to src2 if src2
is non-zero.

Examples:

    7 6 5 4 3 2 1 0   Bit number

    1 0 0 1 0 1 0 0   a3 contents
                      sif a2, a3, x0
    0 0 0 0 0 1 1 1   a2 contents

    1 0 0 1 0 1 0 1   a3 contents
                      sif a2, a3, x0
    0 0 0 0 0 0 0 1   a2 contents

    1 1 0 0 0 0 1 1   a0 contents
    1 0 0 1 0 1 0 0   a3 contents
                      sif a2, a3, a0
    1 1 x x x x 1 1   a2 contents

Pseudo-code:

    def sif(rd, rs1, rs2):
        result = 0
        for i in range(XLEN):
            bit = 1 << i
            # when a predicate is in use, only act on bits it marks valid
            if rs2 != x0 and not (regs[rs2] & bit):
                continue
            result |= bit # always set during search
            if regs[rs1] & bit: # found a bit in rs1: stop
                break
        regs[rd] = result

## sof - set only first bit

Similar to sbf and sif except *only* set the bit which ends a run.

Unlike sbf and sif however, if the src is zero then the output is
also guaranteed to be zero, irrespective of src2's contents.

Examples:

    7 6 5 4 3 2 1 0   Bit number

    1 0 0 1 0 1 0 0   a3 contents
                      sof a2, a3, x0
    0 0 0 0 0 1 0 0   a2 contents

    1 0 0 1 0 1 0 1   a3 contents
                      sof a2, a3, x0
    0 0 0 0 0 0 0 1   a2 contents

    1 1 0 0 0 0 1 1   a0 contents
    1 1 0 1 0 1 0 0   a3 contents
                      sof a2, a3, a0
    0 1 x x x x 0 0   a2 contents

Pseudo-code:

    def sof(rd, rs1, rs2):
        result = 0
        for i in range(XLEN):
            bit = 1 << i
            # when a predicate is in use, only act on bits it marks valid
            if rs2 != x0 and not (regs[rs2] & bit):
                continue
            if regs[rs1] & bit: # found a bit in rs1:
                result |= bit # only set when search succeeds
                break
        regs[rd] = result
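
The three operations differ only in which bits they set around the first 1 found, so they can be checked against the worked examples with one shared helper. What follows is a sketch under stated assumptions, not a normative definition: `XLEN = 8` is chosen to match the 8-bit examples, `_scan`, `ALL` and the `pred` parameter are names invented here, a predicate of all ones stands in for `x0`, and bit positions whose predicate bit is clear are assumed to be skipped and left as zero (the examples mark them `x`).

```python
XLEN = 8                 # assumed width, matching the 8-bit worked examples
ALL = (1 << XLEN) - 1    # stands in for "no predicate" (rs2 == x0)

def _scan(src, pred, before, first):
    """Walk bits LSB to MSB, skipping positions where pred is clear.

    before: set output bits ahead of the first 1 found in src
    first:  set the output bit at the first 1 found in src
    """
    out = 0
    for i in range(XLEN):
        bit = 1 << i
        if not pred & bit:
            continue         # predicate bit clear: output bit stays zero
        if src & bit:
            if first:
                out |= bit
            break            # nothing above the first 1 is ever set
        if before:
            out |= bit
    return out

def sbf(src, pred=ALL): return _scan(src, pred, before=True, first=False)
def sif(src, pred=ALL): return _scan(src, pred, before=True, first=True)
def sof(src, pred=ALL): return _scan(src, pred, before=False, first=True)

print(format(sbf(0b10010100), "08b"))              # 00000011
print(format(sif(0b10010100), "08b"))              # 00000111
print(format(sof(0b10010100), "08b"))              # 00000100
print(format(sbf(0b10010100, 0b11000011), "08b"))  # 01000011
```

Under these assumptions the stated side-effects also hold: with a zero src, `sbf` and `sif` return all ones without a predicate and a copy of the predicate with one, while `sof` returns zero in both cases.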