|0..5 |6..10|11..15|16..20|21..25|26|27..31| Form |
|------|-----|------|------|-----|--|------|------|
-| PO | RS | RA | RB |mode |L | XO | BM2-Form |
+| PO | RS | RA | RB |bm |L | XO | BM2-Form |
-* bmask RT,RA,RB,mode,L
+* bmask RT,RA,RB,bm,L
The patterns within the pseudocode for AMD TBM and x86 BMI1 are
as follows:
Thus it makes sense to create a single instruction
that covers all of these. A crucial addition that is essential
-for Scalable Vector usage however is the second mask parameter
-(RB).
+for Scalable Vector usage as Predicate Masks is the second mask parameter
+(RB). The additional parameter, L, if set, leaves the bits of RA masked
+by RB unaltered; otherwise those bits are set to zero. Note that when `RB=0`
+the mask is set to all ones instead of being read from the register file.
Executable pseudocode demo: