[[!tag standards]]

# SV Vector Operations not added

Links:

* conflictd example

Both of these instructions may be synthesised from SVP64 Vector instructions. conflictd is an O(N^2) instruction based on `sv.cmpi`, and iota is an O(N) instruction based on `sv.addi` with the appropriate predication.

# conflictd

This is based on the AVX512 conflict detection instruction. Internally the logic is used to detect address conflicts in multi-issue LD/ST operations. Two arrays of values are given: the indices are compared and duplicates reported in a triangular fashion. The instruction may be used for histograms (computed in parallel).

    input = [100, 100, 3, 100, 5, 100, 100, 3]
    conflict result = [
        0b00000000, // Note: first element always zero
        0b00000001, // 100 is present on #0
        0b00000000,
        0b00000011, // 100 is present on #0 and #1
        0b00000000,
        0b00001011, // 100 is present on #0, #1, #3
        0b00101011, // .. and #5
        0b00000100  // 3 is present on #2
    ]

Pseudocode:

    for i in range(VL):
        for j in range(1, i):
            if src1[i] == src2[j]:
                result[j] |= 1<

# iota

Based on RVV vmiota. vmiota may be viewed as a cumulative variant of popcount, generating multiple results. Successive iterations include more and more bits of the bitstream being tested. When masked, only the bits not masked out are included in the count process.

    viota RT/v, RA, RB

Note that when RA=0 this indicates to test against all 1s, resulting in the instruction generating a vector sequence [0, 1, 2... VL-1]. This will be equivalent to RVV vid.m, which is a pseudo-op, here (RA=0).

Example:

    7 6 5 4 3 2 1 0    Element number

    1 0 0 1 0 0 0 1    v2 contents
                       viota.m v4, v2 # Unmasked
    2 2 2 1 1 1 1 0    v4 result

    1 1 1 0 1 0 1 1    v0 contents
    1 0 0 1 0 0 0 1    v2 contents
    2 3 4 5 6 7 8 9    v4 contents
                       viota.m v4, v2, v0.t # Masked
    1 1 1 5 1 7 1 0    v4 results

    def iota(RT, RA, RB):
        mask = RB ? iregs[RB] : 0b111111...1
        val = RA ? iregs[RA] : 0b111111...1
        for i in range(VL):
            if RA.scalar:
                testmask = (1<
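
As a rough illustration of the behaviour described on this page, the following Python sketch models both operations on plain lists. It is a behavioural model only, not the SVP64 encoding or the pseudocode above: the names `conflict_detect` and `iota_model` are hypothetical. The conflict model assumes the AVX512-style reading of the worked example (each element of a single input is compared against all earlier elements). The iota model assumes the RVV behaviour shown in the example, where masked-out elements are neither counted nor overwritten. Lists are indexed with element 0 first, the reverse of the tables, which list element 7 first.

```python
def conflict_detect(values):
    """Assumed AVX512-style model: bit j of result[i] is set when
    values[j] == values[i] for some earlier element j < i."""
    result = []
    for i, v in enumerate(values):
        bits = 0
        for j in range(i):  # triangular: only earlier elements
            if values[j] == v:
                bits |= 1 << j
        result.append(bits)
    return result


def iota_model(src_bits, dest, mask=None):
    """Assumed RVV viota-style model: each active element receives the
    number of set source bits among the active elements before it.
    Inactive (masked-out) elements keep their previous dest value."""
    vl = len(src_bits)
    if mask is None:
        mask = [1] * vl  # unmasked: every element is active
    out = list(dest)
    count = 0
    for i in range(vl):
        if not mask[i]:
            continue  # skipped: neither written nor counted
        out[i] = count
        count += src_bits[i]
    return out


if __name__ == "__main__":
    conflicts = conflict_detect([100, 100, 3, 100, 5, 100, 100, 3])
    print([format(c, "08b") for c in conflicts])

    v2 = [1, 0, 0, 0, 1, 0, 0, 1]  # element 0 first
    print(iota_model(v2, [0] * 8))  # unmasked case

    v0 = [1, 1, 0, 1, 0, 1, 1, 1]  # mask, element 0 first
    v4 = [9, 8, 7, 6, 5, 4, 3, 2]  # prior destination contents
    print(iota_model(v2, v4, v0))  # masked case

    print(iota_model([1] * 8, [0] * 8))  # all-ones source
```

With the conflictd input from the example, element 6 compares equal to elements 0, 1, 3 and 5, so its result is `0b00101011`. Passing an all-ones source to `iota_model` yields the sequence 0 to VL-1, which matches the RA=0 case described above.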