to keep power consumption down by avoiding register spill as well as L1/L2
cache strip-mining. General-purpose RADIX-2 DCT and complex DFT will be
shown and explained, as well as the in-place Matrix Multiply which does
-not require transposing or register spill for any sized Matrices up to
-128 FMACs. The basics of SVP64, covered in the Overview [1], will also
-be briefly described.
+not require transposing or register spill for any sized Matrices
+(including non-power-of-two) up to 128 FMACs. The basics of SVP64, covered
+in the Overview [1], will also be briefly described.
[1] https://libre-soc.org/openpower/sv/overview/