register-renaming will have an easier time dealing with this than
DSP-style SIMD micro-architectures.
-## REMAP FFT, DFT, NTT
-
-The algorithm from a later section of this Appendix shows how FFT REMAP works,
-and it may be executed as a standalone python3 program.
-The executable code is designed to illustrate how a hardware
-implementation may generate Indices which are completely
-independent of the Execution of element-level operations,
-even for something as complex as a Triple-loop Tukey-Cooley
-Schedule. A comprehensive demo and test suite may be found
-[here](https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_svp64_fft.py;hb=HEAD)
-including Complex Number FFT which deploys Vertical-First Mode
-on top of the REMAP Schedules.
-
-Other uses include more than DFT and NTT: as abstracted RISC-paradigm
-the Schedules are not
-restricted in any way or tied to any particular instructtion.
-If the programmer can find any algorithm
-which has identical triple nesting then the FFT Schedule may be
-used even there.
-# 4x4 Matrix to vec4 Multiply (4x4 by 1x4)
+### 4x4 Matrix to vec4 Multiply (4x4 by 1x4)
The following settings will allow a 4x4 matrix (starting at f8), expressed
as a sequence of 16 numbers first by row then by column, to be multiplied
of some other computation, which is frequently the case, then
clearly the zeroing is not needed.
+## REMAP FFT, DFT, NTT
+
+The algorithm from a later section of this Appendix shows how FFT REMAP works,
+and it may be executed as a standalone python3 program.
+The executable code is designed to illustrate how a hardware
+implementation may generate Indices which are completely
+independent of the Execution of element-level operations,
+even for something as complex as a Triple-loop Cooley-Tukey
+Schedule. A comprehensive demo and test suite may be found
+[here](https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_svp64_fft.py;hb=HEAD)
+including Complex Number FFT which deploys Vertical-First Mode
+on top of the REMAP Schedules.
+
+Uses are not limited to DFT and NTT: as an abstracted RISC-paradigm
+concept, the Schedules are not
+restricted in any way or tied to any particular instruction.
+If the programmer can find any algorithm
+which has identical triple nesting then the FFT Schedule may be
+used even there.
+
[[!tag standards]]
---------