signed and unsigned, these are N-to-M (N=64/32/16, M=32/16/8) chop/clamp/sign/zero-extend operations. May be implemented by a clamped move to a smaller elwidth.
The other direction, vec_unpack widening ops, may need some way to tell whether to sign-extend or zero-extend.
+
+*scalar extsw/b/h gives one set, mv gives another. src elwidth override and dest elwidth override provide the pack/unpack*
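
To make the distinction between chop, clamp and sign/zero-extend concrete, here is a minimal per-element sketch in Python. The function names `pack_saturate`, `pack_modulo` and `unpack` are hypothetical, chosen only for illustration; this models the arithmetic on one element and says nothing about how the elwidth overrides would actually be encoded.

```python
def pack_saturate(value: int, dest_bits: int, signed: bool) -> int:
    """Narrow by clamping to the destination element's range."""
    if signed:
        lo = -(1 << (dest_bits - 1))
        hi = (1 << (dest_bits - 1)) - 1
    else:
        lo, hi = 0, (1 << dest_bits) - 1
    return max(lo, min(hi, value))


def pack_modulo(value: int, dest_bits: int) -> int:
    """Narrow by chopping: keep only the low dest_bits bits."""
    return value & ((1 << dest_bits) - 1)


def unpack(raw: int, src_bits: int, signed: bool) -> int:
    """Widen a src_bits-wide bit pattern by sign- or zero-extension."""
    raw &= (1 << src_bits) - 1
    if signed and (raw >> (src_bits - 1)) & 1:
        return raw - (1 << src_bits)  # top bit set: value is negative
    return raw
```

For example, `pack_saturate(300, 8, signed=False)` clamps to 255 whereas `pack_modulo(300, 8)` chops to 44, and `unpack(0xFF, 8, signed=True)` yields -1 where the unsigned form yields 255.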
## vavgs\* (vec_avg)
result = truncate((a + b + 1) >> 1)
+*These do not exist in the scalar ISA and would need to be added. Essentially it is a type of post-processing involving the CA bit, so it could be included in the existing scalar pipeline ALU*
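
A minimal Python sketch of the rounding average, assuming the intermediate sum is held one bit wider than the element (which is where the carry-out comes in). The name `avg_round` and its `bits` parameter are hypothetical; Python integers are unbounded, so the extra bit is kept automatically before the final truncation.

```python
def avg_round(a: int, b: int, bits: int) -> int:
    """Return truncate((a + b + 1) >> 1) as a bits-wide bit pattern."""
    mask = (1 << bits) - 1
    total = a + b + 1          # needs bits+1 bits; the top one is the carry
    return (total >> 1) & mask  # the shift brings the carry back into range
```

With 8-bit elements, `avg_round(255, 255, 8)` gives 255 rather than overflowing, and `avg_round(1, 2, 8)` rounds up to 2. Signed operands can be passed as negative Python integers, in which case the result is the two's complement bit pattern.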
+
## vabsdu\* (vec_abs)
unsigned 8/16/32: these are all of the form:
result = (src1 > src2) ? truncate(src1-src2) :
truncate(src2-src1)
+*These do not exist in the scalar ISA and would need to be added*
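
The unsigned absolute-difference form above can be sketched per element in Python as follows. The name `abs_diff_unsigned` is hypothetical, and the inputs are masked to `bits` first on the assumption that they arrive as raw element bit patterns.

```python
def abs_diff_unsigned(src1: int, src2: int, bits: int) -> int:
    """Unsigned |src1 - src2| on bits-wide elements."""
    mask = (1 << bits) - 1
    src1 &= mask
    src2 &= mask
    # Always subtract the smaller from the larger, so no borrow occurs.
    diff = src1 - src2 if src1 > src2 else src2 - src1
    return diff & mask
```

For instance, `abs_diff_unsigned(5, 250, 8)` and `abs_diff_unsigned(250, 5, 8)` both give 245.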
+
## vmaxs\* / vmaxu\* (and min)
signed and unsigned, 8/16/32: these are all of the form:
result = (src1 > src2) ? src1 : src2 # max
result = (src1 < src2) ? src1 : src2 # min
+*These do not exist in the scalar INTEGER ISA and would need to be added*
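
The signed and unsigned variants differ only in how the same bit patterns are compared. A small Python sketch, with the hypothetical helpers `to_signed`, `elem_max` and `elem_min` operating on raw `bits`-wide patterns:

```python
def to_signed(raw: int, bits: int) -> int:
    """Interpret a bits-wide pattern as a two's complement value."""
    raw &= (1 << bits) - 1
    return raw - (1 << bits) if (raw >> (bits - 1)) & 1 else raw


def elem_max(src1: int, src2: int, bits: int, signed: bool) -> int:
    """result = (src1 > src2) ? src1 : src2, on raw bit patterns."""
    mask = (1 << bits) - 1
    src1 &= mask
    src2 &= mask
    if signed:
        return src1 if to_signed(src1, bits) > to_signed(src2, bits) else src2
    return src1 if src1 > src2 else src2


def elem_min(src1: int, src2: int, bits: int, signed: bool) -> int:
    """result = (src1 < src2) ? src1 : src2, on raw bit patterns."""
    mask = (1 << bits) - 1
    src1 &= mask
    src2 &= mask
    if signed:
        return src1 if to_signed(src1, bits) < to_signed(src2, bits) else src2
    return src1 if src1 < src2 else src2
```

Taking 0xFF and 0x01 as 8-bit elements, the unsigned max is 0xFF, but the signed max is 0x01 because 0xFF is read as -1.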
+
## vmerge operations
Their main point was to work around the odd/even multiplies. SV swizzles and mv.x should handle all cases.
a.x + a.y + a.z ...
a.x * a.y * a.z ...
-*This would realistically need to be done with a loop doing a mapreduce. I looked very early on at doing this type of operation and concluded it would be better done with a series of halvings each time, as separate instructions: VL=16 then VL=8 then 4 then 2 and finally one scalar. An OoO multi-issue engine would be more than capable of dealing with the dependencies.*
+*This would realistically need to be done with a loop doing a mapreduce sequence. I looked very early on at doing this type of operation and concluded it would be better done with a series of halvings each time, as separate instructions: VL=16 then VL=8 then 4 then 2 and finally one scalar. An OoO multi-issue engine would be more than capable of dealing with the dependencies.*
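
The series of halvings can be modelled in Python roughly as below. This is only a sketch of the data flow: `halving_reduce` is a hypothetical name, each loop iteration stands in for one separate vector instruction at the halved VL, and the input length is assumed to be a power of two.

```python
from operator import add, mul


def halving_reduce(vec, op):
    """Reduce vec with op by repeatedly combining its two halves."""
    vl = len(vec)
    if vl == 0 or vl & (vl - 1):
        raise ValueError("length must be a non-zero power of two")
    work = list(vec)
    while vl > 1:
        vl //= 2
        # One vector operation at this VL: lower half op upper half.
        work = [op(work[i], work[i + vl]) for i in range(vl)]
    return work[0]
```

`halving_reduce(list(range(1, 17)), add)` gives 136 in four passes, and `halving_reduce([1, 2, 3, 4], mul)` gives 24. Each pass depends only on the result of the previous one, which is the dependency chain left to the issue engine to resolve.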
## vec_mul*
ctz - count trailing zeroes
clz - count leading zeroes
popcnt - count set bits
+
+*These all exist in the scalar ISA*