independent of each other (no Register *or Memory* Hazards).
Elements are considered to be in the same source batch if they have
-the same value of `FLOOR(srcstep/hphint)`. Likewise in the same destination batch.
-Three key observations here:
+the same value of `FLOOR(srcstep/hphint)`. Likewise, elements are in the same
+destination batch if they have the same value of `FLOOR(dststep/hphint)`.
+Four key observations here:
1. predication is **not** involved here. The number of actual elements
involved is considered *before* predicate masks are applied.
batches
3. batch evaluation is done *before* REMAP, making Hazard elimination easier
for Multi-Issue systems.
+4. `hphint` is *not* limited to power-of-two. Hardware implementors may choose
+ to implement any degree of parallelism up to `hphint`, and may find
+ power-of-two more convenient. Actual parallelism (Dependency Hazard
+ relaxation) must **never** exceed `hphint`.
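
As a rough illustration of the batching rule above, the following Python sketch computes the batch number `FLOOR(step/hphint)` for each element step. It is not part of the specification: the names `batch_ids`, `same_batch` and `vl` are hypothetical, and the same arithmetic is assumed to apply equally to `srcstep` and `dststep`. Steps are counted before any predicate mask is applied, so no mask appears here.

```python
def batch_ids(vl, hphint):
    """Return FLOOR(step / hphint) for every step in 0..vl-1.

    vl and this function name are illustrative assumptions; hphint
    need not be a power of two.
    """
    if hphint < 1:
        raise ValueError("hphint must be at least 1")
    # Integer floor division gives the batch each step belongs to.
    return [step // hphint for step in range(vl)]


def same_batch(step_a, step_b, hphint):
    """True if two steps fall in the same batch, i.e. may be treated
    as independent of each other under the hint."""
    return step_a // hphint == step_b // hphint


if __name__ == "__main__":
    print(batch_ids(8, 3))
```

With `vl=8` and a non-power-of-two `hphint=3`, the steps fall into batches `[0, 0, 0, 1, 1, 1, 2, 2]`: the last batch is simply shorter, and no batch ever holds more than `hphint` elements.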
*Hardware Architect note: each element within the same group may be treated as
100% independent from any other element within that group, and therefore