[[!tag standards]]

# Scalar OpenPOWER Audio and Video Opcodes

The fundamental principle of SV is a hardware for-loop. The first (and in nearly 100% of cases the only) place to put Vector operations is therefore the *scalar* ISA. However, only by analysing those scalar opcodes *in* an SV Vectorisation context does it become clear why they are needed and how they may be designed.

This page therefore has accompanying discussion at <https://bugs.libre-soc.org/show_bug.cgi?id=230> for evolution of suitable opcodes.
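
The hardware for-loop idea can be pictured with a small model. The sketch below is only an illustration: `sv_loop`, the flat `regs` list and the `absdiff` helper are hypothetical names, and predication, element-width overrides and the other SVP64 modes are deliberately left out.

```python
def sv_loop(regs, op, rt, ra, rb, vl):
    """Apply one scalar two-operand op to VL consecutive registers."""
    for i in range(vl):
        regs[rt + i] = op(regs[ra + i], regs[rb + i])

def absdiff(a, b):
    """Stand-in scalar operation (hypothetical helper)."""
    return abs(a - b)

regs = [0] * 32                 # toy register file
regs[8:12] = [10, 20, 30, 40]   # first source vector
regs[16:20] = [13, 5, 30, 100]  # second source vector

sv_loop(regs, absdiff, rt=24, ra=8, rb=16, vl=4)
print(regs[24:28])  # [3, 15, 0, 60]
```

The point of the model is that the loop itself carries no arithmetic: everything interesting lives in the scalar `op`, which is why the scalar opcodes are designed first.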

Links

* <https://bugs.libre-soc.org/show_bug.cgi?id=915> add overflow to maxmin
* <https://bugs.libre-soc.org/show_bug.cgi?id=863> add pseudocode etc.
* <https://bugs.libre-soc.org/show_bug.cgi?id=234> hardware implementation
* <https://bugs.libre-soc.org/show_bug.cgi?id=910> mins/maxs zero-option?
* [[vpu]]
* [[sv/int_fp_mv]]
* [[openpower/isa/av]] pseudocode
* [[av_opcodes/analysis]]
* TODO review HP 1994-6 PA-RISC MAX <https://en.m.wikipedia.org/wiki/Multimedia_Acceleration_eXtensions>
* <https://en.m.wikipedia.org/wiki/Sum_of_absolute_differences>
* List of MMX instructions <https://cs.fit.edu/~mmahoney/cse3101/mmx.html>

# Summary

In advance, the base scalar operations that need to be added are:

| instruction    | pseudocode                                |
| -------------- | ----------------------------------------- |
| average-add    | result = (src1 + src2 + 1) >> 1           |
| abs-diff       | result = abs (src1-src2)                  |
| abs-accumulate | result += abs (src1-src2)                 |
| (un)signed min | result = (src1 < src2) ? src1 : src2      |
| (un)signed max | result = (src1 > src2) ? src1 : src2      |
| bitwise sel    | (a ? b : c) - use [[sv/bitmanip]] ternary |
| int/fp move    | covered by REMAP and Pack/Unpack          |

These are implemented at the [[openpower/isa/av]] pseudocode page.

All other capabilities (saturate in particular) are achieved with [[sv/svp64]] modes and swizzle.  Note that minmax and ternary are added in bitmanip.
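
The signed and unsigned min/max rows differ only in how the operands are compared. A minimal sketch, assuming 64-bit registers held as Python integers; `XLEN`, `to_signed` and the four function names are illustrative and are not instruction mnemonics.

```python
XLEN = 64                    # assumed register width in bits
MASK = (1 << XLEN) - 1

def to_signed(x):
    """Read an XLEN-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x

def min_unsigned(src1, src2):
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if src1 < src2 else src2

def max_unsigned(src1, src2):
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if src1 > src2 else src2

def min_signed(src1, src2):
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if to_signed(src1) < to_signed(src2) else src2

def max_signed(src1, src2):
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if to_signed(src1) > to_signed(src2) else src2

all_ones = MASK  # largest unsigned value, but -1 when read as signed
print(hex(min_unsigned(all_ones, 1)))  # 0x1
print(hex(min_signed(all_ones, 1)))    # 0xffffffffffffffff
```

The same bit pattern wins or loses depending on the comparison, which is why the table lists both variants.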

# Instructions

## Average Add

X-Form

* avgadd  RT,RA,RB (Rc=0)
* avgadd. RT,RA,RB (Rc=1)

Pseudo-code:

    a <- [0] * (XLEN+1)
    b <- [0] * (XLEN+1)
    a[1:XLEN] <- (RA)
    b[1:XLEN] <- (RB)
    r <- (a + b + 1)
    RT <- r[0:XLEN-1]

Special Registers Altered:

    CR0                     (if Rc=1)
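
The extra bit in the pseudocode is what preserves the carry out of the addition: with MSB0 bit numbering, `r[0:XLEN-1]` is the upper XLEN bits of the (XLEN+1)-bit sum, i.e. the sum shifted right by one. A minimal Python sketch of the same steps, assuming XLEN=64 and zero-extended operands as the pseudocode implies; `avgadd_truncated` is a hypothetical counter-example that drops the carry and is not part of the proposal.

```python
XLEN = 64                    # assumed register width in bits
MASK = (1 << XLEN) - 1

def avgadd(ra, rb):
    """Rounded average using an (XLEN+1)-bit intermediate."""
    a = ra & MASK                              # a[1:XLEN] <- (RA), bit 0 stays zero
    b = rb & MASK                              # b[1:XLEN] <- (RB)
    r = (a + b + 1) & ((1 << (XLEN + 1)) - 1)  # XLEN+1 bit sum
    return r >> 1                              # r[0:XLEN-1]: drop the lowest bit

def avgadd_truncated(ra, rb):
    """What happens if the sum is cut to XLEN bits before shifting."""
    return (((ra & MASK) + (rb & MASK) + 1) & MASK) >> 1

print(avgadd(1, 2))                      # 2
print(hex(avgadd(MASK, MASK)))           # 0xffffffffffffffff
print(hex(avgadd_truncated(MASK, MASK))) # 0x7fffffffffffffff
```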

## Absolute Signed Difference

X-Form

* absds  RT,RA,RB (Rc=0)
* absds. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) < (RB) then RT <- ¬(RA) + (RB) + 1
    else                RT <- ¬(RB) + (RA) + 1

Special Registers Altered:

    CR0                     (if Rc=1)
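
In the pseudocode, `¬x + y + 1` is two's complement subtraction `y - x`, so each branch subtracts the smaller operand from the larger. A minimal sketch in Python, assuming XLEN=64 and registers held as unsigned bit patterns; `to_signed` is a helper introduced here for illustration.

```python
XLEN = 64                    # assumed register width in bits
MASK = (1 << XLEN) - 1

def to_signed(x):
    """Read an XLEN-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x

def absds(ra, rb):
    """Signed compare, then subtract smaller from larger."""
    ra, rb = ra & MASK, rb & MASK
    if to_signed(ra) < to_signed(rb):
        return (~ra + rb + 1) & MASK   # RB - RA
    return (~rb + ra + 1) & MASK       # RA - RB

minus_four = -4 & MASK
print(absds(minus_four, 3))  # 7
print(absds(3, minus_four))  # 7

# The extreme case no longer fits in a signed XLEN-bit value:
int_min, int_max = 1 << (XLEN - 1), (1 << (XLEN - 1)) - 1
print(hex(absds(int_min, int_max)))  # 0xffffffffffffffff
```

The last line shows that the result is exact only when read as an unsigned quantity: the distance between the most negative and most positive values is 2^XLEN - 1.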

## Absolute Unsigned Difference

X-Form

* absdu  RT,RA,RB (Rc=0)
* absdu. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) <u (RB) then RT <- ¬(RA) + (RB) + 1
    else                RT <- ¬(RB) + (RA) + 1

Special Registers Altered:

    CR0                     (if Rc=1)
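
The only change from the signed form is the `<u` comparison. A minimal sketch, with the width passed as a hypothetical `xlen` parameter on the assumption that XLEN may be narrower than 64 (for example 8-bit pixel data); that parameter is an illustration and not part of the instruction definition.

```python
def absdu(ra, rb, xlen=64):
    """Unsigned compare, then subtract smaller from larger."""
    mask = (1 << xlen) - 1
    ra, rb = ra & mask, rb & mask
    if ra < rb:                        # <u: plain unsigned comparison
        return (~ra + rb + 1) & mask   # RB - RA
    return (~rb + ra + 1) & mask       # RA - RB

# 8-bit pixel values: 200 would be negative if read as signed 8-bit
print(absdu(200, 50, xlen=8))  # 150
print(absdu(50, 200, xlen=8))  # 150

# All-ones is the largest unsigned value, not -1
print(hex(absdu(0xFFFF_FFFF_FFFF_FFFF, 1)))  # 0xfffffffffffffffe
```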

## Absolute Accumulate Unsigned Difference

X-Form

* absdacu  RT,RA,RB (Rc=0)
* absdacu. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) <u (RB) then r <- ¬(RA) + (RB) + 1
    else                 r <- ¬(RB) + (RA) + 1
    RT <- (RT) + r

Special Registers Altered:

    CR0                     (if Rc=1)
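
Because RT is both read and written, repeating this operation over a run of elements yields a sum of absolute differences. A minimal sketch, assuming XLEN=64 with wrap-around on overflow; the `sad` helper and the sample pixel values are made up for illustration.

```python
XLEN = 64                    # assumed register width in bits
MASK = (1 << XLEN) - 1

def absdacu(rt, ra, rb):
    """Return the new RT: old RT plus |RA - RB| (unsigned), modulo 2**XLEN."""
    rt, ra, rb = rt & MASK, ra & MASK, rb & MASK
    if ra < rb:
        r = (~ra + rb + 1) & MASK   # RB - RA
    else:
        r = (~rb + ra + 1) & MASK   # RA - RB
    return (rt + r) & MASK

def sad(block_a, block_b):
    """Sum of absolute differences built from repeated absdacu steps."""
    acc = 0
    for a, b in zip(block_a, block_b):
        acc = absdacu(acc, a, b)
    return acc

current = [12, 200, 34, 0]
reference = [10, 190, 40, 255]
print(sad(current, reference))  # 273
```

The four per-element differences are 2, 10, 6 and 255, which accumulate to 273.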

## Absolute Accumulate Signed Difference

X-Form

* absdacs  RT,RA,RB (Rc=0)
* absdacs. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) < (RB) then r <- ¬(RA) + (RB) + 1
    else                r <- ¬(RB) + (RA) + 1
    RT <- (RT) + r

Special Registers Altered:

    CR0                     (if Rc=1)
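
A rough sketch of the signed accumulate together with its Rc=1 side effect. It assumes XLEN=64 and that CR0 follows the usual Power ISA convention for record forms (LT, GT, EQ from a signed comparison of the result with zero, SO copied from XER); this page does not spell that out, so the `cr0` helper is an assumption.

```python
XLEN = 64                    # assumed register width in bits
MASK = (1 << XLEN) - 1

def to_signed(x):
    """Read an XLEN-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x

def absdacs(rt, ra, rb):
    """Return the new RT: old RT plus |RA - RB| using a signed compare."""
    rt, ra, rb = rt & MASK, ra & MASK, rb & MASK
    if to_signed(ra) < to_signed(rb):
        r = (~ra + rb + 1) & MASK   # RB - RA
    else:
        r = (~rb + ra + 1) & MASK   # RA - RB
    return (rt + r) & MASK

def cr0(result, so=0):
    """Assumed CR0 field as (LT, GT, EQ, SO) for an Rc=1 form."""
    s = to_signed(result)
    return (int(s < 0), int(s > 0), int(s == 0), so)

acc = absdacs(-10 & MASK, 3, 8)   # accumulator starts at -10, adds 5
print(to_signed(acc), cr0(acc))   # -5 (1, 0, 0, 0)
acc = absdacs(acc, 8, 3)          # adds another 5
print(to_signed(acc), cr0(acc))   # 0 (0, 0, 1, 0)
```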