* <https://bugs.libre-soc.org/show_bug.cgi?id=1074>
* <https://libre-soc.org/openpower/sv/biginteger/> for format and
  information about implicit RS/FRS

# [DRAFT] Twin Butterfly DCT Instruction(s)

The goal is to implement instructions that calculate the expression:

```
fdct_round_shift((a +/- b) * c)
```

for the single-coefficient butterfly instruction, and:

```
fdct_round_shift(a * c1 +/- b * c2)
```

for the double-coefficient butterfly instruction.

`fdct_round_shift(x)` is defined as `ROUND_POWER_OF_TWO(x, 14)`, where:

```
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n)-1))) >> (n))
```
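
As an illustration of the rounding behaviour, here is a minimal Python sketch of the same macro. The names `round_power_of_two`, `fdct_round_shift` and `DCT_CONST_BITS` are chosen for this example only. Python's `>>` on integers is an arithmetic (floor) shift, which is assumed here to match the behaviour the C macro has on typical compilers for negative inputs.

```python
DCT_CONST_BITS = 14  # shift amount used by fdct_round_shift in the text


def round_power_of_two(value: int, n: int) -> int:
    """Add half of 2**n, then shift right by n (assumes n >= 1)."""
    return (value + (1 << (n - 1))) >> n


def fdct_round_shift(x: int) -> int:
    """Round x / 2**14 to the nearest integer, ties rounding up."""
    return round_power_of_two(x, DCT_CONST_BITS)


print(fdct_round_shift(8191))  # just below one half of 2**14
print(fdct_round_shift(8192))  # exactly one half of 2**14
```

Adding `1 << (n - 1)` before the shift is what turns a plain truncating shift into round-to-nearest.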

These instructions are at the core of **ALL** FDCT calculations in many major video codecs, including but not limited to VP8/VP9 and AV1.
Arm includes special instructions to optimize these operations (exposed as the NEON intrinsics `vqrdmulhq_s16`/`vqrdmulhq_s32`), although they are limited in precision.

The suggestion is to have a single instruction that calculates both values, `((a + b) * c) >> N` and `((a - b) * c) >> N`.
The instruction will run in accumulate mode, so to calculate the two-coefficient version one would only have to call the same instruction again with `a` and `b` in a different order and a different constant `c`.
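
The pair of values such an instruction would produce can be sketched in Python as follows. This is only an illustration under stated assumptions: `N = 14`, rounding as in `ROUND_POWER_OF_TWO`, and no accumulation. The helper names `round_shift` and `twin_butterfly` are hypothetical and not part of the proposal.

```python
N = 14  # assumed shift amount, matching fdct_round_shift in the text


def round_shift(x: int, n: int = N) -> int:
    """Rounded right shift, as in ROUND_POWER_OF_TWO."""
    return (x + (1 << (n - 1))) >> n


def twin_butterfly(a: int, b: int, c: int):
    """Return both butterfly outputs for a single coefficient c."""
    return round_shift((a + b) * c), round_shift((a - b) * c)


# 11585 is round(2**14 * cos(pi / 4)), used here as an example constant
print(twin_butterfly(100, 50, 11585))  # -> (106, 35)
```

Both outputs share the same inputs and the same coefficient, which is why producing them from one instruction saves work compared with two separate multiply-and-shift sequences.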

```
# [DRAFT] Integer Butterfly Multiply Add/Sub FFT/DCT

BF-Form

* maddsubrs  RT,RA,RB,RC,SH

Pseudo-code:

    RT2 <- RT + 1
    sum <- (RA) + (RB)
    diff <- (RA) - (RB)
    prod1 <- MUL(RC, sum)
    prod2 <- MUL(RC, diff)
    res1 <- ROTL64(prod1, SH)
    res2 <- ROTL64(prod2, SH)
    RT <- (RT) + res1
    RT2 <- (RT2) + res2

Special Registers Altered:

    None
```
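
To make the draft pseudo-code easier to experiment with, the following Python sketch transcribes it line by line. It rests on several assumptions that the draft does not state: registers are modelled as a plain list of 64-bit unsigned values, `MUL` is taken to keep the low 64 bits of the product, and additions wrap modulo 2**64. Note that the sketch follows `ROTL64` literally, which is a rotate rather than the rounded right shift described in the prose above. The function names are illustrative only.

```python
MASK64 = (1 << 64) - 1


def rotl64(value: int, shift: int) -> int:
    """Rotate a 64-bit value left by shift bits."""
    value &= MASK64
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def maddsubrs(regs, rt, ra, rb, rc, sh):
    """Literal model of the draft pseudo-code; updates regs in place."""
    rt2 = rt + 1                          # second result register
    total = regs[ra] + regs[rb]           # sum  <- (RA) + (RB)
    diff = regs[ra] - regs[rb]            # diff <- (RA) - (RB)
    prod1 = (regs[rc] * total) & MASK64   # assumption: low 64 bits kept
    prod2 = (regs[rc] * diff) & MASK64
    regs[rt] = (regs[rt] + rotl64(prod1, sh)) & MASK64    # accumulate
    regs[rt2] = (regs[rt2] + rotl64(prod2, sh)) & MASK64  # accumulate


regs = [0] * 8
regs[3], regs[4], regs[5] = 100, 50, 11585
maddsubrs(regs, rt=0, ra=3, rb=4, rc=5, sh=0)
print(regs[0], regs[1])  # -> 1737750 579250
```

Because the results are added to the existing contents of `RT` and `RT+1`, calling the function a second time with the same operands doubles both accumulators.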

Where BF-Form is defined in `fields.txt`:

```
# 1.6.39 BF-FORM
    |0     | 6   |11   |16   |21   | 25  |30  |31  |
    | PO   | RT  | RA  | RB  | RC  | SH  | XO | Rc |
```

The instruction has been added to `minor_59.csv`:

```
1111011111,ALU,OP_MADDSUBRS,RA,RB,RC,RT,NONE,CR1,0,0,ZERO,0,NONE,0,0,0,0,1,0,RC_ONLY,0,0,maddsubrs,A,,1,unofficial until submitted and approved/renumbered by the opf isa wg
```