* <https://bugs.libre-soc.org/show_bug.cgi?id=1074>
* <https://libre-soc.org/openpower/sv/biginteger/> for format and
  information about implicit RS/FRS
* <https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/test_caller_svp64_dct.py;hb=HEAD>

# [DRAFT] Twin Butterfly DCT Instruction(s)

The goal is to implement instructions that calculate the expression:

```
fdct_round_shift((a +/- b) * c)
```

for the single-coefficient butterfly instruction, and:

```
fdct_round_shift(a * c1 +/- b * c2)
```

for the double-coefficient butterfly instruction.

`fdct_round_shift` is defined as `ROUND_POWER_OF_TWO(x, 14)`, where:

```
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n)-1))) >> (n))
```

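As a rough illustration of the definitions above, the rounding shift and the two butterfly expressions can be sketched in Python. The function names `single_coeff_butterfly` and `double_coeff_butterfly` and the sample coefficient `11585` are assumptions made for this sketch and are not taken from the draft. Python integers are unbounded and `>>` floors negative values, so results may differ from a fixed-width C implementation at the extremes.

```python
def round_power_of_two(value, n):
    """Python equivalent of the ROUND_POWER_OF_TWO macro: add half, then shift."""
    return (value + (1 << (n - 1))) >> n


def fdct_round_shift(x):
    """fdct_round_shift(x) as defined above: ROUND_POWER_OF_TWO(x, 14)."""
    return round_power_of_two(x, 14)


def single_coeff_butterfly(a, b, c):
    """Return the (a + b) * c and (a - b) * c results, each rounded and shifted."""
    return fdct_round_shift((a + b) * c), fdct_round_shift((a - b) * c)


def double_coeff_butterfly(a, b, c1, c2):
    """Return the a*c1 + b*c2 and a*c1 - b*c2 results, each rounded and shifted."""
    return (fdct_round_shift(a * c1 + b * c2),
            fdct_round_shift(a * c1 - b * c2))


# Hypothetical sample coefficient: 11585 is roughly 2**14 * cos(pi / 4).
EXAMPLE_COEFF = 11585

if __name__ == "__main__":
    print(single_coeff_butterfly(100, 50, EXAMPLE_COEFF))
    print(double_coeff_butterfly(100, 50, EXAMPLE_COEFF, EXAMPLE_COEFF))
```
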
These instructions are at the core of **ALL** FDCT calculations in many major video codecs, including, but not limited to, VP8/VP9 and AV1.
Arm provides special instructions to optimize these operations (exposed as the NEON intrinsics `vqrdmulhq_s16`/`vqrdmulhq_s32`), although they are limited in precision.

The suggestion is to have a single instruction that calculates both values, `((a + b) * c) >> N` and `((a - b) * c) >> N`.
The instruction runs in accumulate mode, so the double-coefficient version can be calculated by calling the same instruction again with `a` and `b` in a different order and a different constant `c`.

```
# [DRAFT] Integer Butterfly Multiply Add/Sub FFT/DCT

BF-Form

* maddsubrs RT,RA,RB,RC,SH

Pseudo-code:

    RT2 <- RT + 1
    sum <- (RA) + (RB)
    diff <- (RA) - (RB)
    prod1 <- MUL(RC, sum)
    prod2 <- MUL(RC, diff)
    res1 <- ROTL64(prod1, SH)
    res2 <- ROTL64(prod2, SH)
    RT <- (RT) + res1
    RT2 <- (RT2) + res2

Special Registers Altered:

    None
```

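The draft pseudo-code can be modelled in a few lines of Python to make the data flow easier to follow. This is only a sketch under several assumptions that the draft does not state: registers are 64 bits wide, `MUL` keeps the low 64 bits of the product, and `ROTL64` is a plain 64-bit rotate-left. The names `rotl64` and `maddsubrs_model` are hypothetical, and the register pair RT/RT2 is passed and returned as two plain integers rather than register numbers.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit registers


def rotl64(value, shift):
    """Rotate a 64-bit value left by `shift` bits (shift taken modulo 64)."""
    shift %= 64
    value &= MASK64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def maddsubrs_model(rt, rt2, ra, rb, rc, sh):
    """Model of the draft pseudo-code; returns the new (RT, RT2) contents."""
    total = (ra + rb) & MASK64        # sum  <- (RA) + (RB)
    diff = (ra - rb) & MASK64         # diff <- (RA) - (RB)
    prod1 = (rc * total) & MASK64     # assumption: MUL keeps the low 64 bits
    prod2 = (rc * diff) & MASK64
    res1 = rotl64(prod1, sh)          # res1 <- ROTL64(prod1, SH)
    res2 = rotl64(prod2, sh)          # res2 <- ROTL64(prod2, SH)
    # Accumulate into the previous contents of RT and RT2.
    return (rt + res1) & MASK64, (rt2 + res2) & MASK64


if __name__ == "__main__":
    # Start from zeroed accumulators, then accumulate a second time.
    rt, rt2 = maddsubrs_model(0, 0, 5, 3, 2, 0)
    rt, rt2 = maddsubrs_model(rt, rt2, 5, 3, 2, 0)
    print(rt, rt2)
```
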
The BF-Form is defined in `fields.txt`:

```
# 1.6.39 BF-FORM
    |0    |6    |11   |16   |21   |25   |30  |31  |
    | PO  | RT  | RA  | RB  | RC  | SH  | XO | Rc |

```

The instruction has been added to `minor_22.csv`:

```
------01000,ALU,OP_MADDSUBRS,RA,RB,RC,RT,NONE,CR0,0,0,ZERO,0,NONE,0,0,0,0,1,0,RC_ONLY,0,0,maddsubrs,A,,1,unofficial until submitted and approved/renumbered by the opf isa wg
```