ternlogv is experimental and is the only operation that may be considered "Packed SIMD". It is added as a variant of the already well-justified ternlog operation (which AVX512 provides in immediate-only form) "because it looks fun". Because it is based on the LUT4 concept, it will allow accelerated emulation of FPGAs. Other ISA vendors are buying FPGA companies to achieve similar objectives.
-general-purpose Galois Field operations are added so as to avoid huge custom opcode proliferation across many areas of Computer Science. however for convenience and also to avoid setup costs, some of the more common operations (clmul, crc32) are also added. The expectation is that these operations would all be covered by the same pipeline.
+General-purpose Galois Field 2^M operations are added to avoid a huge proliferation of custom opcodes across many areas of Computer Science. However, for convenience and to avoid setup costs, some of the more common operations (clmul, crc32) are also added. The expectation is that these operations would all be covered by the same pipeline.
Note that there are brownfield spaces below that could incorporate some of the set-before-first and other scalar operations listed in [[sv/vector_ops]] and
[[sv/av_opcodes]], as well as [[sv/setvl]].
To save registers and make operations orthogonal with standard
arithmetic, the modulo is to be set in an SPR.
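
As a rough illustration of how clmul relates to a general GF(2^M) multiply with the modulo held in shared state, here is a minimal Python sketch. The names `GF_MODULO_SPR`, `clmul`, `gf_reduce` and `gf_mul` are hypothetical and are not taken from this specification; the reducing polynomial 0x11B (the AES polynomial) is chosen only as a familiar example.

```python
# Hypothetical stand-in for the SPR that holds the reducing polynomial.
# 0x11B is x^8 + x^4 + x^3 + x + 1, the AES polynomial (degree 8).
GF_MODULO_SPR = 0x11B


def clmul(a, b):
    """Carry-less multiply: polynomial product over GF(2), no reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return result


def gf_reduce(value, modulo):
    """Reduce a GF(2) polynomial modulo the given reducing polynomial."""
    degree = modulo.bit_length() - 1
    while value.bit_length() > degree:
        # Align the modulo with the top set bit of value and cancel it.
        shift = value.bit_length() - 1 - degree
        value ^= modulo << shift
    return value


def gf_mul(a, b):
    """GF(2^M) multiply: clmul, then reduce by the polynomial in the 'SPR'."""
    return gf_reduce(clmul(a, b), GF_MODULO_SPR)
```

In this sketch the multiply takes only its two operands, because the modulo lives in shared state rather than in a third register, which mirrors the operand-saving argument made above.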
+## Twin Butterfly (Cooley-Tukey) Mul-add-sub
+
+Used in combination with SV FFT REMAP to perform
+a full in-place NTT (Number Theoretic Transform).
+
+    gffmadd RT,RA,RC,RB (Rc=0)
+    gffmadd. RT,RA,RC,RB (Rc=1)
+
+Pseudo-code:
+
+    RT <- GFMULADD(RA, RC, RB)
+    RS <- GFMULADD(RA, RC, RB)
+
+
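The following Python sketch shows one possible reading of the twin butterfly, offered as an assumption rather than as the defined semantics. The pseudo-code above gives the same expression for both outputs, while the heading names a mul-add-sub; the sketch assumes that one output adds the product and the other subtracts it, and that the field is a prime field with a hypothetical `modulo` parameter. In GF(2^M) addition and subtraction are both XOR, so the two outputs would coincide there.

```python
def twin_butterfly(ra, rb, rc, modulo):
    """Return (rt, rs) for one Cooley-Tukey butterfly in a prime field.

    Assumed semantics (not confirmed by the spec text):
        rt = rb + ra * rc  (mod modulo)
        rs = rb - ra * rc  (mod modulo)
    """
    product = (ra * rc) % modulo
    rt = (rb + product) % modulo
    rs = (rb - product) % modulo  # Python's % keeps the result non-negative
    return rt, rs
```

For example, `twin_butterfly(3, 5, 4, 17)` computes the product 12, then returns the sum and difference of 5 and 12 reduced modulo 17. An in-place NTT would apply this pair-wise across the vector, with the element ordering supplied by the REMAP schedule.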
## Multiply
this requires 3 parameters and a "degree"