limited to the type of operation, it may perform Warshall Transitive
Closure, Integer Matrix, Bitmanipulation Matrix, Galois Field (carryless
mul) Matrix, and with care potentially Graph Maximum Flow as well. Also
- suited to Convolutions, Matrix Transpose and rotate.
+ suited to Convolutions, Matrix Transpose and rotate, *all* of which are
+ performed in-place.
* General-purpose Indexed REMAP, this option is provided to implement
an equivalent of VSX `vperm`
* Parallel Reduction REMAP, performs an automatic map-reduce using