# OpenPOWER Summit 2021

Links

* <https://cfp.openpower.foundation/summit2021/cfp>
* <https://cfp.openpower.foundation/summit2021/talk/review/CA7XEWT9ZKMJ3D7NRXXEK9SYPXBAHPCD>

# Abstract

*Draft SVP64 in-place Matrix Multiply and FFT / DCT for OpenPOWER*

Advanced Cray-style Vectors, named SVP64, are being developed for the
Power ISA as a Draft Extension for submission to the new OpenPOWER ISA
Working Group. Whilst in-place Matrix Multiply was planned for a much
later, more advanced version of SVP64, an investigation into putting
FFMPEG's MP3 CODEC inner loop into Vectorised Assembler resulted in such
a large drop in code size (over 4x reduction) that it warranted priority
investigation.

Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT)
and Number-Theoretic Transform (NTT) form the basis of more
high-priority algorithms than can be counted. Normal SIMD Processors and
even normal Vector Processors have a hard time dealing with them:
inspecting FFMPEG's source code reveals that heavily optimised inline
assembler (no loops, just hundreds to thousands of lines of assembler)
is not uncommon.
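
To make the difficulty concrete, the following is a minimal sketch in
plain Python (not SVP64 assembler, and not FFMPEG's code) of a textbook
in-place iterative RADIX-2 complex DFT. The function name `fft_inplace`
is an illustrative assumption. The element pairs touched by each
butterfly change with every stage, which is the access pattern that
fixed-width SIMD tends to handle by unrolling.

```python
import cmath


def fft_inplace(x):
    """Textbook in-place radix-2 DFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    # Bit-reversal permutation: reorder the inputs so that the
    # butterflies below can overwrite their own operands.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            x[i], x[j] = x[j], x[i]

    # log2(n) stages; the distance between butterfly operands
    # doubles at every stage.
    size = 2
    while size <= n:
        half = size // 2
        step = -2j * cmath.pi / size
        for start in range(0, n, size):
            for k in range(half):
                w = cmath.exp(step * k)          # twiddle factor
                a = x[start + k]
                b = x[start + k + half] * w
                x[start + k] = a + b             # results overwrite inputs
                x[start + k + half] = a - b
        size *= 2
    return x
```

No second buffer is allocated: every result is written back over the
operands it was computed from.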

The focus of this NLnet-sponsored research is therefore to enhance
SVP64 so that it can cover DFT, DCT, NTT and Matrix-Multiply entirely
in-place. In-place operation is crucially important for many
applications (3D, Video): it keeps power consumption down by avoiding
register spill as well as L1/L2 cache strip-mining. General-purpose
RADIX-2 DCT and complex DFT will be shown and explained, as well as the
in-place Matrix Multiply, which requires neither transposing nor
register spill for Matrices of any size (including non-power-of-two)
up to 128 FMACs. The basics of SVP64, covered in the Overview [1], will
also be briefly described.
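
As a rough illustration of a Matrix Multiply expressed as a single
stream of FMACs with no transpose step, here is a plain Python sketch.
It is not the SVP64 mechanism itself, which this abstract does not
detail, and the names `matmul_schedule` and `matmul_flat` are
hypothetical. The idea shown is only that one multiply-accumulate,
driven by a generated sequence of offsets into flat row-major arrays,
covers any matrix shape.

```python
def matmul_schedule(rows_a, cols_a, cols_b):
    """Yield (a_idx, b_idx, c_idx) offsets into flat row-major arrays."""
    for i in range(rows_a):
        for j in range(cols_b):
            for k in range(cols_a):
                yield i * cols_a + k, k * cols_b + j, i * cols_b + j


def matmul_flat(a, b, rows_a, cols_a, cols_b):
    """Multiply flat row-major A (rows_a x cols_a) by B (cols_a x cols_b)."""
    c = [0.0] * (rows_a * cols_b)
    for a_idx, b_idx, c_idx in matmul_schedule(rows_a, cols_a, cols_b):
        c[c_idx] += a[a_idx] * b[b_idx]  # one FMAC per schedule entry
    return c


# A 2x3 matrix times a 3x2 matrix: non-power-of-two sizes, 12 FMACs.
a = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]
b = [7.0, 8.0,
     9.0, 10.0,
     11.0, 12.0]
c = matmul_flat(a, b, 2, 3, 2)
```

B is read column-wise purely through index arithmetic, so it is never
physically transposed; the number of schedule entries is
`rows_a * cols_a * cols_b`.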

[1] <https://libre-soc.org/openpower/sv/overview/>