# Scalar OpenPOWER Audio and Video Opcodes

The fundamental principle of SV is a hardware for-loop. The first (and in nearly 100% of cases the only) place to put Vector operations is therefore the *scalar* ISA. However, only by analysing those scalar opcodes *in* an SV Vectorization context does it become clear why they are needed and how they may be designed.

This page therefore has an accompanying discussion at <https://bugs.libre-soc.org/show_bug.cgi?id=230> on the evolution of suitable opcodes.
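
The "hardware for-loop" idea above can be illustrated with a minimal sketch. It assumes a flat integer register file, a fixed 64-bit width, and no predication, element-width overrides or scalar/vector marking; `sv_loop`, `scalar_add` and `MASK64` are hypothetical names used only for illustration and are not part of any specification.

```python
from typing import Callable, List

MASK64 = (1 << 64) - 1  # assumed register width of 64 bits


def scalar_add(a: int, b: int) -> int:
    """A plain scalar operation: 64-bit wrapping add."""
    return (a + b) & MASK64


def sv_loop(op: Callable[[int, int], int], regs: List[int],
            rt: int, ra: int, rb: int, vl: int) -> None:
    """Apply one scalar op to VL consecutive registers.

    This is the whole 'Vector' behaviour in this simplified model:
    the scalar operation is unchanged, only the register numbers
    advance on each iteration.
    """
    for i in range(vl):
        regs[rt + i] = op(regs[ra + i], regs[rb + i])


regs = [0] * 32
regs[8:12] = [1, 2, 3, 4]
regs[16:20] = [10, 20, 30, 40]
sv_loop(scalar_add, regs, rt=0, ra=8, rb=16, vl=4)
```

Because the loop body is just the scalar operation, any opcode added to the scalar ISA becomes usable on vectors in this model without a separate vector opcode.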
* <https://bugs.libre-soc.org/show_bug.cgi?id=915> add overflow to maxmin.
* <https://bugs.libre-soc.org/show_bug.cgi?id=863> add pseudocode etc.
* <https://bugs.libre-soc.org/show_bug.cgi?id=234> hardware implementation
* <https://bugs.libre-soc.org/show_bug.cgi?id=910> mins/maxs zero-option?
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057> move all int/fp min/max to ls013
* [[openpower/isa/av]] pseudocode
* [[av_opcodes/analysis]]
* TODO review HP 1994-6 PA-RISC MAX <https://en.m.wikipedia.org/wiki/Multimedia_Acceleration_eXtensions>
* <https://en.m.wikipedia.org/wiki/Sum_of_absolute_differences>
* List of MMX instructions <https://cs.fit.edu/~mmahoney/cse3101/mmx.html>

In summary, the base scalar operations that need to be added are:

| instruction    | pseudocode                                    |
| -------------- | --------------------------------------------- |
| average-add    | result = (src1 + src2 + 1) >> 1               |
| abs-diff       | result = abs (src1-src2)                      |
| abs-accumulate | result += abs (src1-src2)                     |
| (un)signed min | result = (src1 < src2) ? src1 : src2 [[ls013]] |
| (un)signed max | result = (src1 > src2) ? src1 : src2 [[ls013]] |
| bitwise sel    | (a ? b : c) - use [[sv/bitmanip]] ternary     |
| int/fp move    | covered by REMAP and Pack/Unpack              |

Implemented at the [[openpower/isa/av]] pseudocode page.

All other capabilities (saturate in particular) are achieved with [[sv/svp64]] modes and swizzle. Note that minmax and ternary are added in bitmanip.
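
The min, max and bitwise-select rows of the summary table can be sketched as follows. This is an illustration only: it assumes 64-bit registers, reads "bitwise sel" as a per-bit multiplexer, and uses hypothetical helper names (`to_signed`, `minu`, `mins`, `maxu`, `maxs`, `bitsel`) that are not taken from [[ls013]] or [[sv/bitmanip]].

```python
XLEN = 64                      # assumed register width
MASK = (1 << XLEN) - 1


def to_signed(x: int) -> int:
    """Reinterpret an XLEN-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x


def minu(a: int, b: int) -> int:
    a, b = a & MASK, b & MASK
    return a if a < b else b


def maxu(a: int, b: int) -> int:
    a, b = a & MASK, b & MASK
    return a if a > b else b


def mins(a: int, b: int) -> int:
    a, b = a & MASK, b & MASK
    return a if to_signed(a) < to_signed(b) else b


def maxs(a: int, b: int) -> int:
    a, b = a & MASK, b & MASK
    return a if to_signed(a) > to_signed(b) else b


def bitsel(a: int, b: int, c: int) -> int:
    """Per-bit (a ? b : c): where a bit of a is 1 take b, else take c."""
    return ((a & b) | (~a & c)) & MASK
```

The same bit pattern gives different answers in the two interpretations: with all bits set, `minu(MASK, 1)` returns 1 whereas `mins(MASK, 1)` returns the all-ones pattern, because that pattern is -1 when read as signed. This is why the table lists signed and unsigned variants separately.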
* avgadd RT,RA,RB (Rc=0)
* avgadd. RT,RA,RB (Rc=1)

Special Registers Altered:
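
As a rough illustration of the average-add formula `(src1 + src2 + 1) >> 1` from the summary table, the sketch below assumes 64-bit unsigned operands and an intermediate sum at least one bit wider than the register; neither assumption is stated in the text shown here. The function names are hypothetical. `avgadd_truncating` is included only to show what goes wrong if the sum is truncated to the register width before the shift.

```python
XLEN = 64                      # assumed register width
MASK = (1 << XLEN) - 1


def avgadd(ra: int, rb: int) -> int:
    """Rounded-up average, keeping the carry out of the addition."""
    a = ra & MASK
    b = rb & MASK
    # Python integers are unbounded, so a + b + 1 keeps bit XLEN (the carry).
    return ((a + b + 1) >> 1) & MASK


def avgadd_truncating(ra: int, rb: int) -> int:
    """Counter-example: the carry is discarded before the shift."""
    a = ra & MASK
    b = rb & MASK
    return ((a + b + 1) & MASK) >> 1
```

For small inputs the two agree, for example both give 2 for inputs 1 and 2. For two all-ones inputs the first returns all ones, while the truncating version loses the top bit of the result.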

## Absolute Signed Difference

* absds RT,RA,RB (Rc=0)
* absds. RT,RA,RB (Rc=1)

    if (RA) < (RB) then RT <- ¬(RA) + (RB) + 1
    else                RT <- ¬(RB) + (RA) + 1

Special Registers Altered:

## Absolute Unsigned Difference

* absdu RT,RA,RB (Rc=0)
* absdu. RT,RA,RB (Rc=1)

    if (RA) <u (RB) then RT <- ¬(RA) + (RB) + 1
    else                 RT <- ¬(RB) + (RA) + 1

Special Registers Altered:
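
The signed and unsigned pseudocode differ only in the comparison; both branches use the two's complement identity `¬x + y + 1 = y - x`. The following sketch models that directly, assuming 64-bit registers. The Python function names mirror the mnemonics for readability but are illustrative models, not the reference implementation.

```python
XLEN = 64                      # assumed register width
MASK = (1 << XLEN) - 1


def to_signed(x: int) -> int:
    """Reinterpret an XLEN-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x


def sub_via_complement(x: int, y: int) -> int:
    """Compute y - x as ¬x + y + 1, truncated to XLEN bits."""
    return ((~x & MASK) + (y & MASK) + 1) & MASK


def absds(ra: int, rb: int) -> int:
    ra, rb = ra & MASK, rb & MASK
    if to_signed(ra) < to_signed(rb):      # signed compare: <
        return sub_via_complement(ra, rb)
    return sub_via_complement(rb, ra)


def absdu(ra: int, rb: int) -> int:
    ra, rb = ra & MASK, rb & MASK
    if ra < rb:                            # unsigned compare: <u
        return sub_via_complement(ra, rb)
    return sub_via_complement(rb, ra)
```

With RA holding the all-ones pattern and RB holding 1, `absds` yields 2 (the distance between -1 and 1), while `absdu` yields 2^64 - 2, since the same bits are read as the largest unsigned value.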

## Absolute Accumulate Unsigned Difference

* absdacu RT,RA,RB (Rc=0)
* absdacu. RT,RA,RB (Rc=1)

    if (RA) <u (RB) then r <- ¬(RA) + (RB) + 1
    else                 r <- ¬(RB) + (RA) + 1

Special Registers Altered:
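
The accumulate form is what makes a sum of absolute differences a single repeated instruction. The sketch below takes the accumulate step from the summary table row `result += abs (src1-src2)`, since that step does not appear in the pseudocode as shown here; it further assumes 64-bit registers and that the accumulator wraps modulo 2^64. The names `absdacu` (as a Python function) and `sad` are illustrative.

```python
from typing import Iterable

XLEN = 64                      # assumed register width
MASK = (1 << XLEN) - 1


def absdacu(rt: int, ra: int, rb: int) -> int:
    """Return the new accumulator: rt + |ra - rb|, all unsigned."""
    ra, rb = ra & MASK, rb & MASK
    r = rb - ra if ra < rb else ra - rb
    # Assumption: the accumulator wraps rather than saturates.
    return (rt + r) & MASK


def sad(xs: Iterable[int], ys: Iterable[int]) -> int:
    """Sum of absolute differences as a loop over one scalar operation."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = absdacu(acc, x, y)
    return acc


total = sad([1, 5, 9], [4, 2, 9])
```

Here `total` is 3 + 3 + 0 = 6. In this simplified model the loop in `sad` plays the role of the hardware for-loop, with the accumulator carried from one element to the next.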

## Absolute Accumulate Signed Difference

* absdacs RT,RA,RB (Rc=0)
* absdacs. RT,RA,RB (Rc=1)

    if (RA) < (RB) then r <- ¬(RA) + (RB) + 1
    else                r <- ¬(RB) + (RA) + 1

Special Registers Altered: