[[!tag standards]]

# Scalar OpenPOWER Audio and Video Opcodes

The fundamental principle of SV is a hardware for-loop. Therefore the first (and in nearly 100% of cases the only) place to put Vector operations is in the *scalar* ISA.  However, only by analysing those scalar opcodes *in* an SV Vectorization context does it become clear why they are needed and how they may be designed.

This page therefore has an accompanying discussion at <https://bugs.libre-soc.org/show_bug.cgi?id=230> on the evolution of suitable opcodes.

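
As a rough illustration of the hardware for-loop principle, the following Python sketch applies one scalar operation across consecutive registers. The names `sv_loop`, `regs` and `vl` are hypothetical and chosen for illustration only; predication, element-width overrides and all other [[sv/svp64]] modes are deliberately ignored.

```python
from typing import Callable, List


def sv_loop(op: Callable[[int, int], int], regs: List[int],
            rt: int, ra: int, rb: int, vl: int) -> None:
    """Repeat one scalar operation vl times over consecutive registers."""
    for i in range(vl):
        # element i reads RA+i and RB+i, and writes RT+i
        regs[rt + i] = op(regs[ra + i], regs[rb + i])


# hypothetical register file of 16 entries holding 0..15
regs = list(range(16))
# apply a scalar rounded average to 4 consecutive elements
sv_loop(lambda a, b: (a + b + 1) >> 1, regs, rt=8, ra=0, rb=4, vl=4)
```

The scalar operation itself is unchanged by the loop, which is why the opcodes on this page are specified purely as scalar instructions.
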
Links

* <https://bugs.libre-soc.org/show_bug.cgi?id=915> add overflow to maxmin.
* <https://bugs.libre-soc.org/show_bug.cgi?id=863> add pseudocode etc.
* <https://bugs.libre-soc.org/show_bug.cgi?id=234> hardware implementation
* <https://bugs.libre-soc.org/show_bug.cgi?id=910> mins/maxs zero-option?
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057> move all int/fp min/max to ls013
* [[vpu]]
* [[sv/int_fp_mv]]
* [[openpower/isa/av]] pseudocode
* [[av_opcodes/analysis]]
* TODO review HP 1994-6 PA-RISC MAX <https://en.m.wikipedia.org/wiki/Multimedia_Acceleration_eXtensions>
* <https://en.m.wikipedia.org/wiki/Sum_of_absolute_differences>
* List of MMX instructions <https://cs.fit.edu/~mmahoney/cse3101/mmx.html>

# Summary

In advance, the summary of base scalar operations that need to be added is:

| instruction    | pseudocode                                     |
| -------------- | ---------------------------------------------- |
| average-add    | result = (src1 + src2 + 1) >> 1                |
| abs-diff       | result = abs (src1-src2)                       |
| abs-accumulate | result += abs (src1-src2)                      |
| (un)signed min | result = (src1 < src2) ? src1 : src2 [[ls013]] |
| (un)signed max | result = (src1 > src2) ? src1 : src2 [[ls013]] |
| bitwise sel    | (a ? b : c) - use [[sv/bitmanip]] ternary      |
| int/fp move    | covered by REMAP and Pack/Unpack               |

These operations are implemented at the [[openpower/isa/av]] pseudocode page.

All other capabilities (saturate in particular) are achieved with [[sv/svp64]] modes and swizzle.  Note that minmax and ternary are added in bitmanip.

# Instructions

## Average Add

X-Form

* avgadd  RT,RA,RB (Rc=0)
* avgadd. RT,RA,RB (Rc=1)

Pseudo-code:

    a <- [0] * (XLEN+1)
    b <- [0] * (XLEN+1)
    a[1:XLEN] <- (RA)
    b[1:XLEN] <- (RB)
    r <- (a + b + 1)
    RT <- r[0:XLEN-1]

Special Registers Altered:

    CR0                     (if Rc=1)

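
The avgadd pseudocode can be modelled in Python as a minimal sketch. It assumes the MSB0 bit numbering of the pseudocode, so that `r[0:XLEN-1]` is the upper XLEN bits of the (XLEN+1)-bit sum, which is the sum shifted right by one. The function name and the `xlen` parameter are illustrative only, and the CR0 update for Rc=1 is not modelled.

```python
def avgadd(ra: int, rb: int, xlen: int = 64) -> int:
    """Model of avgadd: rounded-up average with no loss of the carry."""
    mask = (1 << xlen) - 1
    a = ra & mask   # zero-extend (RA) into an (XLEN+1)-bit value
    b = rb & mask   # zero-extend (RB) likewise
    r = a + b + 1   # at most 2**(XLEN+1) - 1, so it fits in XLEN+1 bits
    # r[0:XLEN-1] in MSB0 numbering drops the least significant bit
    return (r >> 1) & mask
```

Because both operands are zero-extended, they are treated as unsigned here. The extra bit matters: `avgadd(0xFF, 0xFF, xlen=8)` gives `0xFF`, whereas an 8-bit addition followed by a shift would have lost the carry.
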
## Absolute Signed Difference

X-Form

* absds  RT,RA,RB (Rc=0)
* absds. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) < (RB) then RT <- ¬(RA) + (RB) + 1
    else                RT <- ¬(RB) + (RA) + 1

Special Registers Altered:

    CR0                     (if Rc=1)

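
A minimal Python sketch of the absds pseudocode follows. It relies on the two's complement identity `¬x + y + 1 = y - x` and assumes the result is truncated to XLEN bits. The function name, the `signed` helper and the `xlen` parameter are illustrative only, and the CR0 update for Rc=1 is not modelled.

```python
def absds(ra: int, rb: int, xlen: int = 64) -> int:
    """Model of absds: absolute difference using a signed comparison."""
    mask = (1 << xlen) - 1
    sign = 1 << (xlen - 1)

    def signed(x: int) -> int:
        # reinterpret an XLEN-bit pattern as a two's complement value
        x &= mask
        return x - (1 << xlen) if x & sign else x

    ra &= mask
    rb &= mask
    if signed(ra) < signed(rb):
        return (~ra + rb + 1) & mask  # (RB) - (RA), modulo 2**XLEN
    return (~rb + ra + 1) & mask      # (RA) - (RB), modulo 2**XLEN
```

With `xlen=8`, `absds(0x80, 0x7F, 8)` returns `0xFF`: the true difference between -128 and 127 is 255, which does not fit a signed 8-bit value, so in this model the result is only meaningful when read as unsigned.
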
## Absolute Unsigned Difference

X-Form

* absdu  RT,RA,RB (Rc=0)
* absdu. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) <u (RB) then RT <- ¬(RA) + (RB) + 1
    else                RT <- ¬(RB) + (RA) + 1

Special Registers Altered:

    CR0                     (if Rc=1)

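
The unsigned form differs from absds only in the comparison, which a short Python sketch makes visible. The function name and the `xlen` parameter are illustrative only, and the CR0 update for Rc=1 is not modelled.

```python
def absdu(ra: int, rb: int, xlen: int = 64) -> int:
    """Model of absdu: absolute difference using an unsigned comparison."""
    mask = (1 << xlen) - 1
    ra &= mask  # operands are plain XLEN-bit unsigned values
    rb &= mask
    if ra < rb:  # the <u comparison of the pseudocode
        return (~ra + rb + 1) & mask  # (RB) - (RA)
    return (~rb + ra + 1) & mask      # (RA) - (RB)
```

For the 8-bit patterns `0x80` and `0x7F` this gives 1, since they are read as 128 and 127; under a signed comparison `0x80` would instead be -128.
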
## Absolute Accumulate Unsigned Difference

X-Form

* absdacu  RT,RA,RB (Rc=0)
* absdacu. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) <u (RB) then r <- ¬(RA) + (RB) + 1
    else                 r <- ¬(RB) + (RA) + 1
    RT <- (RT) + r

Special Registers Altered:

    CR0                     (if Rc=1)

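
Since RT is both a source and the destination, absdacu is the inner step of a Sum of Absolute Differences. The sketch below models the instruction and a hypothetical `sad` helper that chains it over two sequences. It assumes the accumulation wraps modulo 2**XLEN (saturation being left to [[sv/svp64]] modes); all names are illustrative, and the CR0 update for Rc=1 is not modelled.

```python
from typing import Sequence


def absdacu(rt: int, ra: int, rb: int, xlen: int = 64) -> int:
    """Model of absdacu: returns the new RT = RT + |RA - RB| (unsigned)."""
    mask = (1 << xlen) - 1
    ra &= mask
    rb &= mask
    if ra < rb:                    # unsigned comparison
        r = (~ra + rb + 1) & mask  # (RB) - (RA)
    else:
        r = (~rb + ra + 1) & mask  # (RA) - (RB)
    return (rt + r) & mask         # assumed to wrap modulo 2**XLEN


def sad(xs: Sequence[int], ys: Sequence[int], xlen: int = 64) -> int:
    """Hypothetical helper: accumulate absdacu over paired elements."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = absdacu(acc, x, y, xlen)
    return acc
```

For example, `sad([1, 5, 3], [4, 2, 3])` accumulates 3 + 3 + 0.
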
## Absolute Accumulate Signed Difference

X-Form

* absdacs  RT,RA,RB (Rc=0)
* absdacs. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) < (RB) then r <- ¬(RA) + (RB) + 1
    else                r <- ¬(RB) + (RA) + 1
    RT <- (RT) + r

Special Registers Altered:

    CR0                     (if Rc=1)
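
A corresponding Python sketch of absdacs follows, differing from the unsigned form only in that RA and RB are compared as two's complement values. As before, wrapping of the accumulation modulo 2**XLEN is an assumption, the names are illustrative, and the CR0 update for Rc=1 is not modelled.

```python
def absdacs(rt: int, ra: int, rb: int, xlen: int = 64) -> int:
    """Model of absdacs: returns the new RT = RT + |RA - RB| (signed)."""
    mask = (1 << xlen) - 1
    sign = 1 << (xlen - 1)
    ra &= mask
    rb &= mask
    # reinterpret both XLEN-bit patterns as two's complement values
    sa = ra - (1 << xlen) if ra & sign else ra
    sb = rb - (1 << xlen) if rb & sign else rb
    if sa < sb:
        r = (~ra + rb + 1) & mask  # (RB) - (RA)
    else:
        r = (~rb + ra + 1) & mask  # (RA) - (RB)
    return (rt + r) & mask         # assumed to wrap modulo 2**XLEN
```

For instance, `absdacs(10, -3, 4)` adds the distance between -3 and 4 to the accumulator, giving 17.
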