[[!tag standards]]

# Scalar OpenPOWER Audio and Video Opcodes

The fundamental principle of SV is a hardware for-loop. The first (and in nearly 100% of cases the only) place to put Vector operations is therefore the *scalar* ISA. However, only by analysing those scalar opcodes *in* an SV Vectorisation context does it become clear why they are needed and how they may be designed.
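The hardware for-loop idea can be sketched in a few lines of Python. This is a simplified illustration, not the SV specification: the function name `sv_loop`, the flat `regs` list, and the vector length parameter `vl` are all assumptions made for the example, and predication, element-width overrides and every other SVP64 mode are left out.

```python
XLEN = 64  # assumed register width in bits

def scalar_add(a, b):
    """A stand-in scalar operation: XLEN-bit wrapping add."""
    return (a + b) & ((1 << XLEN) - 1)

def sv_loop(op, regs, rt, ra, rb, vl):
    """Repeat one scalar operation over vl consecutive registers (hypothetical model)."""
    for i in range(vl):
        regs[rt + i] = op(regs[ra + i], regs[rb + i])
    return regs

regs = list(range(32))  # toy register file: register n initially holds n
sv_loop(scalar_add, regs, rt=0, ra=8, rb=16, vl=4)
print(regs[0:4])        # registers 0..3 now hold r8+r16, r9+r17, ...
```

Under this model, any opcode added to the scalar ISA becomes a Vector operation simply by being placed inside the loop, which is why the scalar definition is where the design effort goes.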

This page therefore has an accompanying discussion at <https://bugs.libre-soc.org/show_bug.cgi?id=230> on the evolution of suitable opcodes.

Links

* <https://bugs.libre-soc.org/show_bug.cgi?id=915> add overflow to maxmin.
* <https://bugs.libre-soc.org/show_bug.cgi?id=863> add pseudocode etc.
* <https://bugs.libre-soc.org/show_bug.cgi?id=234> hardware implementation
* <https://bugs.libre-soc.org/show_bug.cgi?id=910> mins/maxs zero-option?
* <https://bugs.libre-soc.org/show_bug.cgi?id=1057> move all int/fp min/max to ls013
* [[vpu]]
* [[sv/int_fp_mv]]
* [[openpower/isa/av]] pseudocode
* [[av_opcodes/analysis]]
* TODO review HP 1994-6 PA-RISC MAX <https://en.m.wikipedia.org/wiki/Multimedia_Acceleration_eXtensions>
* <https://en.m.wikipedia.org/wiki/Sum_of_absolute_differences>
* List of MMX instructions <https://cs.fit.edu/~mmahoney/cse3101/mmx.html>

# Summary

In advance, the summary of base scalar operations that need to be added is:

| instruction    | pseudocode                                       |
| -------------- | ------------------------------------------------ |
| average-add    | result = (src1 + src2 + 1) >> 1                  |
| abs-diff       | result = abs (src1-src2)                         |
| abs-accumulate | result += abs (src1-src2)                        |
| (un)signed min | result = (src1 < src2) ? src1 : src2 [[ls013]]   |
| (un)signed max | result = (src1 > src2) ? src1 : src2 [[ls013]]   |
| bitwise sel    | (a ? b : c) - use [[sv/bitmanip]] ternary        |
| int/fp move    | covered by REMAP and Pack/Unpack                 |

These are implemented at the [[openpower/isa/av]] pseudocode page.

All other capabilities (saturate in particular) are achieved with [[sv/svp64]] modes and swizzle. Note that minmax and ternary are added in bitmanip.
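The min and max rows of the table have no pseudocode section on this page, so the following Python sketch shows how the same comparison gives different answers for signed and unsigned interpretations. The helper names `minu`, `maxu`, `mins`, `maxs` and `to_signed` are hypothetical, and a 64-bit `XLEN` is assumed; the actual instruction definitions are in [[ls013]].

```python
XLEN = 64                 # assumed register width in bits
MASK = (1 << XLEN) - 1

def to_signed(x):
    """Reinterpret an XLEN-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x

def minu(src1, src2):
    """Unsigned min: result = (src1 < src2) ? src1 : src2."""
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if src1 < src2 else src2

def maxu(src1, src2):
    """Unsigned max: result = (src1 > src2) ? src1 : src2."""
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if src1 > src2 else src2

def mins(src1, src2):
    """Signed min: compare as two's complement, return the raw bit pattern."""
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if to_signed(src1) < to_signed(src2) else src2

def maxs(src1, src2):
    """Signed max: compare as two's complement, return the raw bit pattern."""
    src1, src2 = src1 & MASK, src2 & MASK
    return src1 if to_signed(src1) > to_signed(src2) else src2
```

With the all-ones pattern as one operand and 1 as the other, the unsigned variants treat all-ones as the largest value, while the signed variants treat it as -1.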

# Instructions

## Average Add

X-Form

* avgadd  RT,RA,RB (Rc=0)
* avgadd. RT,RA,RB (Rc=1)

Pseudo-code:

    a <- [0] * (XLEN+1)
    b <- [0] * (XLEN+1)
    a[1:XLEN] <- (RA)
    b[1:XLEN] <- (RB)
    r <- (a + b + 1)
    RT <- r[0:XLEN-1]

Special Registers Altered:

    CR0                     (if Rc=1)
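As a cross-check of the pseudo-code, here is a minimal Python model, assuming a 64-bit `XLEN` by default and ignoring the CR0 update. The key point is that the operands are zero-extended to XLEN+1 bits, so the carry out of the addition is kept, and selecting bits `0:XLEN-1` (MSB0 numbering) of the XLEN+1-bit sum is the same as shifting right by one.

```python
def avgadd(ra, rb, xlen=64):
    """Model of avgadd: rounded-up unsigned average without losing the carry."""
    mask = (1 << xlen) - 1
    # Python integers are unbounded, so the (XLEN+1)-bit sum needs no special care
    r = (ra & mask) + (rb & mask) + 1
    # r[0:XLEN-1] in MSB0 numbering is the top XLEN bits of the (XLEN+1)-bit value
    return (r >> 1) & mask
```

For example, `avgadd(3, 4)` gives 4 because of the `+ 1` rounding term, and averaging two all-ones values returns all-ones rather than wrapping.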

## Absolute Signed Difference

X-Form

* absds  RT,RA,RB (Rc=0)
* absds. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) < (RB) then RT <- ¬(RA) + (RB) + 1
    else                RT <- ¬(RB) + (RA) + 1

Special Registers Altered:

    CR0                     (if Rc=1)
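The pseudo-code computes the subtraction as "invert, add, add one", which is two's complement negation. A minimal Python sketch of that, assuming a 64-bit `XLEN` by default and ignoring CR0, is given below; the helper `to_signed` is hypothetical and exists only to model the signed `<` comparison.

```python
def to_signed(x, xlen=64):
    """Reinterpret an xlen-bit pattern as a two's complement integer."""
    x &= (1 << xlen) - 1
    return x - (1 << xlen) if x >> (xlen - 1) else x

def absds(ra, rb, xlen=64):
    """Model of absds: subtract the smaller (signed) operand from the larger."""
    mask = (1 << xlen) - 1
    ra, rb = ra & mask, rb & mask
    if to_signed(ra, xlen) < to_signed(rb, xlen):
        return (~ra + rb + 1) & mask   # RB - RA
    return (~rb + ra + 1) & mask       # RA - RB
```

Note that the difference between the most negative and most positive values does not fit in a signed XLEN-bit result; in this model it comes out as the all-ones bit pattern.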

## Absolute Unsigned Difference

X-Form

* absdu  RT,RA,RB (Rc=0)
* absdu. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) <u (RB) then RT <- ¬(RA) + (RB) + 1
    else                 RT <- ¬(RB) + (RA) + 1

Special Registers Altered:

    CR0                     (if Rc=1)
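The only change from the signed form is the `<u` comparison. A minimal Python model, assuming a 64-bit `XLEN` by default and ignoring CR0, makes the consequence visible: masked non-negative Python integers already compare as unsigned values.

```python
def absdu(ra, rb, xlen=64):
    """Model of absdu: subtract the smaller (unsigned) operand from the larger."""
    mask = (1 << xlen) - 1
    ra, rb = ra & mask, rb & mask
    if ra < rb:                        # unsigned compare on masked values
        return (~ra + rb + 1) & mask   # RB - RA
    return (~rb + ra + 1) & mask       # RA - RB
```

With an all-ones operand and 1, this model returns all-ones minus 1, because all-ones is the largest unsigned value rather than -1.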

## Absolute Accumulate Unsigned Difference

X-Form

* absdacu  RT,RA,RB (Rc=0)
* absdacu. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) <u (RB) then r <- ¬(RA) + (RB) + 1
    else                 r <- ¬(RB) + (RA) + 1
    RT <- (RT) + r

Special Registers Altered:

    CR0                     (if Rc=1)
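Because RT is both a source and the destination, repeating this instruction over a run of elements accumulates a sum of absolute differences, the operation referenced in the links at the top of this page. The Python sketch below assumes a 64-bit `XLEN` and ignores CR0; the `sad` wrapper and its list arguments are hypothetical, standing in for whatever loop drives the instruction.

```python
XLEN = 64                 # assumed register width in bits
MASK = (1 << XLEN) - 1

def absdacu(rt, ra, rb):
    """Model of absdacu: add the unsigned absolute difference of RA, RB to RT."""
    rt, ra, rb = rt & MASK, ra & MASK, rb & MASK
    if ra < rb:
        r = (~ra + rb + 1) & MASK   # RB - RA
    else:
        r = (~rb + ra + 1) & MASK   # RA - RB
    return (rt + r) & MASK          # accumulate, wrapping at XLEN bits

def sad(xs, ys):
    """Sum of absolute differences built from repeated absdacu steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = absdacu(acc, x, y)
    return acc
```

For instance, `sad([1, 5, 9], [4, 5, 2])` accumulates 3, 0 and 7.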

## Absolute Accumulate Signed Difference

X-Form

* absdacs  RT,RA,RB (Rc=0)
* absdacs. RT,RA,RB (Rc=1)

Pseudo-code:

    if (RA) < (RB) then r <- ¬(RA) + (RB) + 1
    else                r <- ¬(RB) + (RA) + 1
    RT <- (RT) + r

Special Registers Altered:

    CR0                     (if Rc=1)
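A minimal Python model of the signed accumulate form follows, again assuming a 64-bit `XLEN` and ignoring CR0. The hypothetical `to_signed` helper models the signed comparison; the accumulation itself is a plain wrapping add, exactly as in the unsigned variant.

```python
XLEN = 64                 # assumed register width in bits
MASK = (1 << XLEN) - 1

def to_signed(x):
    """Reinterpret an XLEN-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x

def absdacs(rt, ra, rb):
    """Model of absdacs: add the signed absolute difference of RA, RB to RT."""
    rt, ra, rb = rt & MASK, ra & MASK, rb & MASK
    if to_signed(ra) < to_signed(rb):
        r = (~ra + rb + 1) & MASK   # RB - RA
    else:
        r = (~rb + ra + 1) & MASK   # RA - RB
    return (rt + r) & MASK          # accumulate, wrapping at XLEN bits
```

Passing the bit pattern for -3 (`MASK - 2`) together with 4 adds 7 to the accumulator whichever way round the operands are given.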