contains pseudocode for sof, sif, sbf
* <https://en.m.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set#TBM_(Trailing_Bit_Manipulation)>
-The core Power ISA was designed as scalar: SV provides a level of abstraction to add variable-length element-independent parallelism.
-Therefore there are not that many cases where *actual* Vector
-instructions are needed. If they are, they are more "assistance"
-functions. Two traditional Vector instructions were initially
-considered (conflictd and vmiota) however they may be synthesised
-from existing SVP64 instructions: vmiota may use [[svstep]].
-Details in [[discussion]]
+The core Power ISA was designed as scalar: SV provides a level of
+abstraction to add variable-length element-independent parallelism.
+Therefore there are not that many cases where *actual* Vector instructions
+are needed. If they are, they are more "assistance" functions. Two
+traditional Vector instructions were initially considered (conflictd and
+vmiota); however, they may be synthesised from existing SVP64 instructions:
+vmiota may use [[svstep]]. Details in [[discussion]].
Notes:
-* Instructions suited to 3D GPU workloads (dotproduct, crossproduct, normalise) are out of scope: this document is for more general-purpose instructions that underpin and are critical to general-purpose Vector workloads (including GPU and VPU)
-* Instructions related to the adaptation of CRs for use as predicate masks are covered separately, by crweird operations. See [[sv/cr_int_predication]].
+* Instructions suited to 3D GPU workloads (dotproduct, crossproduct,
+ normalise) are out of scope: this document is for more general-purpose
+ instructions that underpin and are critical to general-purpose Vector
+ workloads (including GPU and VPU)
+* Instructions related to the adaptation of CRs for use as
+ predicate masks are covered separately, by crweird operations.
+ See [[sv/cr_int_predication]].
-# Mask-suited Bitmanipulation
+## Mask-suited Bitmanipulation
Based on RVV masked set-before-first, set-after-first etc.
and Intel and AMD Bitmanip instructions made generalised then
[[!inline pages="openpower/sv/bmask.py" quick="yes" raw="yes" ]]
```
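
For orientation only, the RVV-style operations this section is based on (set-before-first, set-including-first, set-only-first) can be sketched on plain integers. This is not the generalised instruction and not the contents of `bmask.py`: it is a minimal unmasked illustration, assuming that "first" means the least-significant set bit and that `width` (a hypothetical parameter) is the number of mask bits.

```python
def sbf(x, width=64):
    """Set-before-first: set every bit below the lowest set bit of x.

    Assumption: if x has no set bit, all `width` bits are set (RVV behaviour).
    """
    mask = (1 << width) - 1
    x &= mask
    return (x - 1) & ~x & mask


def sif(x, width=64):
    """Set-including-first: as sbf, but also set the lowest set bit itself."""
    mask = (1 << width) - 1
    x &= mask
    return (x ^ (x - 1)) & mask


def sof(x, width=64):
    """Set-only-first: isolate the lowest set bit of x (zero if x is zero)."""
    mask = (1 << width) - 1
    x &= mask
    return x & -x & mask


if __name__ == "__main__":
    for name, fn in (("sbf", sbf), ("sif", sif), ("sof", sof)):
        print(name, format(fn(0b10100, 8), "08b"))
```

The three results differ only in whether the first set bit is excluded, included, or isolated, which is why they lend themselves to a single generalised instruction.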
-# Carry-lookahead
+## Carry-lookahead
As a single scalar 32-bit instruction, up to 64 carry-propagation bits
may be computed. When the output is then used as a Predicate mask it can
pseudocode:
+```
P = (RA)
G = (RB)
RT = ((P|G)+G)^P
+```
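
The pseudocode above can be checked against a bit-by-bit carry chain. The sketch below is illustrative only, assuming bit `i` of P means "element `i` propagates an incoming carry", bit `i` of G means "element `i` generates a carry", bit 0 is the least-significant element, and P and G are never both set for the same element. The function names and the `width` parameter are hypothetical.

```python
def carry_lookahead(p, g, width=64):
    """Closed form from the pseudocode: RT = ((P|G)+G)^P, truncated to width."""
    mask = (1 << width) - 1
    return (((p | g) + g) ^ p) & mask


def carry_lookahead_ref(p, g, width=64):
    """Reference: ripple the carry through one element at a time.

    Bit i of the result is the carry *into* element i.
    """
    out = 0
    carry = 0
    for i in range(width):
        if carry:
            out |= 1 << i
        p_i = (p >> i) & 1
        g_i = (g >> i) & 1
        # An element passes on a carry if it generates one itself,
        # or if it propagates the carry it received.
        carry = g_i | (p_i & carry)
    return out


if __name__ == "__main__":
    # Element 0 generates, element 1 propagates: elements 1 and 2 receive a carry.
    print(format(carry_lookahead(0b010, 0b001, 8), "08b"))
```

Under these assumptions each result bit marks an element that must have a carry added to it, which is the form needed when the output is used as a mask.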
X-Form