From 361b35474d9c4cbbab09b53c61f91f93812a5ce3 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 5 Mar 2022 21:03:15 +0000
Subject: [PATCH]

---
 openpower/sv/bitmanip.mdwn | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/openpower/sv/bitmanip.mdwn b/openpower/sv/bitmanip.mdwn
index f85e9ce3c..f6c2265b1 100644
--- a/openpower/sv/bitmanip.mdwn
+++ b/openpower/sv/bitmanip.mdwn
@@ -19,7 +19,7 @@ When combined with SV, scalar variants of bitmanip operations found in VSX are a

 ternlogv is experimental and is the only operation that may be considered a "Packed SIMD". It is added as a variant of the already well-justified ternlog operation (done in AVX512 as an immediate only) "because it looks fun". As it is based on the LUT4 concept it will allow accelerated emulation of FPGAs. Other vendors of ISAs are buying FPGA companies to achieve similar objectives.

-general-purpose Galois Field operations are added so as to avoid huge custom opcode proliferation across many areas of Computer Science. however for convenience and also to avoid setup costs, some of the more common operations (clmul, crc32) are also added. The expectation is that these operations would all be covered by the same pipeline.
+general-purpose Galois Field 2^M operations are added so as to avoid huge custom opcode proliferation across many areas of Computer Science. however for convenience and also to avoid setup costs, some of the more common operations (clmul, crc32) are also added. The expectation is that these operations would all be covered by the same pipeline.

 note that there are brownfield spaces below that could incorporate some of the set-before-first and other scalar operations listed in [[sv/vector_ops]], and the [[sv/av_opcodes]] as well as [[sv/setvl]]

@@ -519,6 +519,20 @@ see

 to save registers and make operations orthogonal with standard arithmetic the modulo is to be set in an SPR

+## Twin Butterfly (Tukey-Cooley) Mul-add-sub
+
+used in combination with SV FFT REMAP to perform
+a full NTT in-place
+
+    gffmadd RT,RA,RC,RB (Rc=0)
+    gffmadd. RT,RA,RC,RB (Rc=1)
+
+Pseudo-code:
+
+    RT <- GFMULADD(RA, RC, RB)
+    RS <- GFMULADD(RA, RC, RB)
+
+
 ## Multiply

 this requires 3 parameters and a "degree"

-- 
2.30.2
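
Reviewer note: for readers unfamiliar with the GF(2^M) operations this patch refers to, the sketch below shows carry-less multiply (clmul) and how a GF(2^M) multiply can be built from it by reducing modulo an irreducible polynomial. It is a minimal illustration, not the specification: the function names, the explicit `poly` and `degree` arguments (the patch says the modulo is to be set in an SPR), and the reading of `GFMULADD(RA, RC, RB)` as `RA*RC + RB` are all assumptions that the patch does not confirm.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiply: product of two polynomials over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2) is XOR, so no carries propagate
        a <<= 1
        b >>= 1
    return result


def gf2m_reduce(value: int, poly: int, degree: int) -> int:
    """Reduce value modulo poly; poly must have its top set bit at `degree`."""
    for bit in range(value.bit_length() - 1, degree - 1, -1):
        if value & (1 << bit):
            value ^= poly << (bit - degree)  # cancel the leading term
    return value


def gf2m_mul(a: int, b: int, poly: int, degree: int) -> int:
    """GF(2^M) multiply: carry-less multiply followed by reduction."""
    return gf2m_reduce(clmul(a, b), poly, degree)


def gf2m_muladd(a: int, c: int, b: int, poly: int, degree: int) -> int:
    """Hypothetical reading of GFMULADD(RA, RC, RB): (a * c) + b in GF(2^M)."""
    return gf2m_mul(a, c, poly, degree) ^ b


if __name__ == "__main__":
    AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, used here only as a familiar test field
    print(hex(gf2m_mul(0x57, 0x83, AES_POLY, 8)))
    print(hex(gf2m_muladd(0x57, 0x83, 0x01, AES_POLY, 8)))
```

In GF(2^M) addition and subtraction are both XOR. Under the assumption above, the add and subtract halves of a butterfly would therefore produce the same value in a binary field.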