# Implementation

This page covers and coordinates the implementation of SV. The basic
concept is to go step-by-step through the [[sv/overview]], adding each
feature one at a time. Caveats and notes are included so that other
implementors may avoid some common pitfalls.

Links:

* <http://lists.libre-soc.org/pipermail/libre-soc-dev/2021-January/001865.html>
* <https://bugs.libre-soc.org/show_bug.cgi?id=578> python-based svp64
  assembler translator
* <https://bugs.libre-soc.org/show_bug.cgi?id=579> c/c++ macro svp64
  assembler translator
* <https://bugs.libre-soc.org/show_bug.cgi?id=586> microwatt svp64-decode1.vhdl autogenerator
* <https://bugs.libre-soc.org/show_bug.cgi?id=577> gcc/binutils/svp64
* <https://bugs.libre-soc.org/show_bug.cgi?id=241> gem5 / ISACaller simulator
  - <https://bugs.libre-soc.org/show_bug.cgi?id=581> gem5 upstreaming
* <https://bugs.libre-soc.org/show_bug.cgi?id=583> TestIssuer
* <https://bugs.libre-soc.org/show_bug.cgi?id=588> PowerDecoder2
* <https://bugs.libre-soc.org/show_bug.cgi?id=587> setvl ancillary tasks
  (instruction form SVL-Form, field designations, pseudocode, SPR allocation)
* <https://bugs.libre-soc.org/show_bug.cgi?id=615> agree sv assembly syntax
* <https://bugs.libre-soc.org/show_bug.cgi?id=617> TestIssuer add single/twin Predication
* <https://bugs.libre-soc.org/show_bug.cgi?id=618> ISACaller add single/twin Predication
* <https://bugs.libre-soc.org/show_bug.cgi?id=619> tracking manual augmentation of CSV files
* <https://bugs.libre-soc.org/show_bug.cgi?id=636> add zeroing and exceptions
* <https://bugs.libre-soc.org/show_bug.cgi?id=663> element-width overrides

# Code to convert

There are five projects:

* TestIssuer (the HDL)
* ISACaller (the python-based simulator)
* power-gem5 (a cycle-accurate simulator)
* Microwatt (VHDL)
* gcc and binutils

Each of these needs SV augmentation, and the best way to do it is
to work on all of them at the same time, each implementing the same
incremental feature.

# Critical tasks

These are prerequisite tasks:

* power-gem5 automanagement, similar to pygdbmi for starting qemu
  - see <https://www.gem5.org/documentation/general_docs/debugging_and_testing/debugging/debugging_simulated_code>:
    just use pygdbmi
  - remote gdb should work <https://github.com/power-gem5/gem5/blob/gem5-experimental/src/arch/power/remote_gdb.cc>
* c++, c and python macros for generating [[sv/svp64]] assembler
  (svp64 prefixes)
  - python svp64 underway: minimalist, sufficient for FU unit tests
    <https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/sv/trans/svp64.py;hb=HEAD>
* PowerDecoder2: both TestIssuer and ISACaller are dependent on this
  - <https://bugs.libre-soc.org/show_bug.cgi?id=588> underway
  - INT and CR EXTRA svp64 fields completed.
* SVP64PowerDecoder2, used to identify SVP64 Prefixes. DONE.

People coordinating different tasks. This does not mean exclusive work on these areas: it just means that they are the "coordinator" and lead:

* Lauri:
* Jacob: C/C++ header for using SV through inline assembly
* Cesar: TestIssuer FSM
* Alain: power-gem5
* Cole:
* Luke: ISACaller, python-assembler-generator-class
* Tobias:
* Alexandre: binutils-svp64-assembler and gcc
* Paul: microwatt

# Adding SV

Order: as listed in [[sv/overview]].

## svp64 decoder

An autogenerator that produces CSV files is available so that the task of creating decoders is not burdensome. sv_analysis.py creates the CSV files and the SVP64RM class picks them up.

* ISACaller: part done. svp64 detected, PowerDecoder2 in use.
* power-gem5: TODO
* TestIssuer: part done. svp64 detected, PowerDecoder2 in use.
* Microwatt: TODO, started auto-generated sv_decode.vhdl
* python-based assembler-translator: 40% done (lkcl)
* c++ macros: underway (jacob)

Note, when decoding the RM bits into different modes, that LD/ST interprets the 5 mode bits differently, depending not just on whether the instruction is a LD/ST but also on what *type* of LD/ST it is. Immediate LD/ST is further qualified to indicate whether it operates in element-strided or unit-strided mode. Indexed LD/ST is not.

**IMPORTANT**! When spotting RA=0 in some instructions it is critical to note that the *full **seven** bits* are used (including those from EXTRA2/3), because RA is no longer only five bits.
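
The RA=0 caveat above can be illustrated with a minimal sketch. The function names are hypothetical, and placing the two extra bits above the 5-bit field is an assumption made purely for illustration: the actual EXTRA2/3 bit placement is defined in [[sv/svp64]].

```python
def full_reg_number(ra5: int, extra_hi: int) -> int:
    """Combine the 5-bit RA field with 2 further bits taken from EXTRA2/3.

    Assumption (illustration only): the extra bits extend RA upwards,
    giving a 7-bit register number in the range 0..127.
    """
    assert 0 <= ra5 < 32 and 0 <= extra_hi < 4
    return (extra_hi << 5) | ra5


def ra_is_zero(ra5: int, extra_hi: int) -> bool:
    """RA=0 detection must test all seven bits, not just the 5-bit field."""
    # Testing only ra5 would wrongly treat r32, r64 and r96 as "RA=0".
    return full_reg_number(ra5, extra_hi) == 0
```

With this sketch `ra_is_zero(0, 0)` is true, whereas `ra_is_zero(0, 1)` is false because the full register number is 32.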

Links:

* <https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv_analysis.py;hb=HEAD>
* <https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/decoder/power_svp64.py;hb=HEAD>
* <https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/decoder/power_svp64_rm.py;hb=HEAD>

## SVSTATE SPR needed

SVSTATE is a peer of MSR but is stored in an SPR. It should be considered part of the PC+MSR state because SVSTATE is effectively a Sub-PC.

Chosen values, fitting with v3.1 / v3.0C p12 "Sandbox" guidelines:

    num,name,priv (mtspr),priv (mfspr),width
    704,SVSTATE,no,no,32
    720,SVSRR0,yes,yes,32

Progress:

* ISACaller: done
* power-gem5: TODO
* TestIssuer: TODO
* Microwatt: TODO

* <https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/sv/svstate.py;hb=HEAD>

## Adding SVSTATE "set/get" support for hw/sw debugging

This includes adding DMI get/set support in hardware, as well as (remote) gdb support.

* LibreSOC DMI/JTAG: TODO
* Microwatt DMI: TODO
* power-gem5 remote gdb: TODO
* TestIssuer: DONE (read-only at least) <https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=4d5482810c980ff927ccec62968a40a490ea86eb>

Links:

* <https://bugs.libre-soc.org/show_bug.cgi?id=609>

## sv.setvl

A [[sv/setvl]] instruction is needed, which also implements [[sv/sprs]], i.e. primarily the `SVSTATE` SPR. The dual-access SPRs for VL and MVL, which mirror into the SVSTATE.VL and SVSTATE.MVL fields, are not immediately essential to implement.

* LibreSOC OpenPOWER wiki fields/forms: DONE. pseudocode: TODO
* ISACaller: TODO
* power-gem5: TODO
* TestIssuer: TODO
* Microwatt: TODO

Links:

## SVSRR0 for exceptions

SV's SVSTATE context is effectively a Sub-PC. On exceptions the PC is saved into SRR0, so it should come as no surprise that SVSTATE must be treated in exactly the same way. SVSRR0 is therefore added to the list of registers to be saved/restored in **exactly** the same way and at exactly the same time as SRR0 and SRR1. It is fundamental and absolutely critical to view SVSTATE as a full peer of PC (CIA, NIA).
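
A minimal sketch of the save/restore symmetry described above, assuming a simplified core state. The `CoreState` class and both function names are hypothetical, SRR1 is modelled as holding the whole MSR, and clearing SVSTATE on exception entry is an assumption rather than something this page specifies.

```python
from dataclasses import dataclass, field


@dataclass
class CoreState:
    pc: int = 0
    msr: int = 0
    svstate: int = 0  # the Sub-PC: saved and restored alongside PC and MSR
    spr: dict = field(
        default_factory=lambda: {"SRR0": 0, "SRR1": 0, "SVSRR0": 0})


def take_exception(state: CoreState, vector: int, new_msr: int = 0) -> None:
    """Save PC, MSR and SVSTATE at the same time, then enter the handler."""
    state.spr["SRR0"] = state.pc
    state.spr["SRR1"] = state.msr        # simplification: whole MSR saved
    state.spr["SVSRR0"] = state.svstate  # same moment as SRR0 and SRR1
    state.pc = vector
    state.msr = new_msr
    state.svstate = 0  # assumption: the handler starts with a clean context


def return_from_exception(state: CoreState) -> None:
    """Restore all three together, resuming mid-vector if necessary."""
    state.pc = state.spr["SRR0"]
    state.msr = state.spr["SRR1"]
    state.svstate = state.spr["SVSRR0"]
```

The point of the sketch is only that SVSRR0 is written and read in the same two places as SRR0 and SRR1, never separately.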

* ISACaller: TODO unit test
* power-gem5: TODO
* TestIssuer: TODO
* Microwatt: TODO

* added ISACaller SVSRR0 save <https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=25071d491dba94495657796eb6ff10eb6499257f>

## Illegal instruction exceptions

Anything not listed as SVP64-extended must raise an illegal instruction exception if prefixed: setvl, branch, mtmsr and mfmsr at the minimum.
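
The rule above is an allow-list check, which could be sketched as follows. The contents of `SVP64_EXTENDED` are a hypothetical placeholder (the real list comes from the autogenerated CSV files), and the exception class and function name are likewise illustrative.

```python
class IllegalInstruction(Exception):
    """Stands in for raising an illegal instruction exception."""


# Hypothetical placeholder: the real set is derived from the CSV files.
SVP64_EXTENDED = {"add", "mtspr", "mfspr"}


def check_prefixed(mnemonic: str, is_svp64_prefixed: bool) -> None:
    """Reject any SVP64-prefixed instruction that is not on the allow-list."""
    if is_svp64_prefixed and mnemonic not in SVP64_EXTENDED:
        raise IllegalInstruction(f"sv.{mnemonic} is not SVP64-extended")
```

Using an allow-list rather than a deny-list means that an instruction which nobody has yet analysed is rejected by default when prefixed.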

* ISACaller: TODO
* power-gem5: TODO
* TestIssuer: TODO
* Microwatt: TODO

## VL for-loop

The main SV for-loop, implemented as an FSM, updates `SVSTATE.srcstep` and uses it as the index of the for-loop from 0 to VL-1. Register numbers are incremented by one on each step if marked as vector.

*This loop goes in between the decode and issue phases*. It is as if there were multiple sequential instructions in the instruction stream *and the loop must be treated as such*. Specifically: all register read and write hazards **MUST** be respected, and Program Order must be respected even though, and especially because, this is Sub-PC execution.

This **includes** any exceptions, which is why SVSTATE exists and why SVSRR0 must be used to store SVSTATE when SRR0 and SRR1 store PC and MSR.

Because exceptions may occur in the middle of the loop, it should *not* be implemented as an actual for-loop. Optimised implementations may nevertheless do multi-issue element execution as long as Program Order is preserved, just as it would be for the PC.
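
A minimal sketch of the "FSM, not for-loop" idea, using a vectorised add as the example operation. The `SVState` class and `step_add` function are hypothetical, and predication and any special handling of scalar destinations are omitted. All loop state lives in `srcstep`, so execution can stop after any element and resume later.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass
class SVState:
    vl: int = 0       # vector length
    srcstep: int = 0  # Sub-PC: index of the next element to execute


def step_add(sv, regs, rt, ra, rb, rt_vec=True, ra_vec=True, rb_vec=True):
    """Execute at most ONE element of a vectorised add, then advance srcstep.

    Returns True once all VL elements are done: only then may the PC advance.
    """
    if sv.srcstep < sv.vl:
        i = sv.srcstep
        # Register numbers advance with srcstep only if marked as vector.
        d = rt + (i if rt_vec else 0)
        a = ra + (i if ra_vec else 0)
        b = rb + (i if rb_vec else 0)
        regs[d] = (regs[a] + regs[b]) & MASK64
        sv.srcstep += 1
    if sv.srcstep >= sv.vl:
        sv.srcstep = 0  # loop complete: reset the Sub-PC
        return True
    return False
```

A caller issues `step_add` repeatedly until it returns True. If an exception is taken between two calls, saving and later restoring the `SVState` is enough to resume at the correct element.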

* ISACaller: DONE, first revision <https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=9078b2935beb4ba89dcd2af91bb5e3a0bcffbe71>
* power-gem5: TODO
* TestIssuer:
  - part done <https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=92ba64ea13794dea71816be746a056d52e245651>
  - done <https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=97136d71397f420479d601dcb80f0df4abf73d22>
* Microwatt: TODO

Remember that the following register files need to have for-loops,
plus unit tests:

* GPR
* SPRs (yes, really: mtspr and mfspr are SV Context-extensible)
* Condition Registers (see note below)
* FPR (if present)

When Rc=1 is encountered in an SVP64 Context the destination is different (TODO), i.e. not CR0 or CR1. Implicit Rc=1 Condition Registers are still Vectorised but do **not** have EXTRA2/3 spec adjustments. The only part of the EXTRA2/3 spec that is observed and respected is whether the CR is Vectorised (isvec).

## Increasing register file sizes

TODO. INTs, FPs and CRs all increase to 128. Welcome To Vector ISAs.

At the same time the `Rc=1` CR offsets, normally CR0 and CR1 for fixed-point and FP scalar respectively, may also be adjusted.

## Single and Twin Predication

Both CR and INT predication are needed, as well as zeroing in both.
The work is best done in the following order:

* INT-based single
* CR-based single
* srcstep+dststep
* INT-based twin
* CR-based twin
* Zeroing single
* Zeroing twin

This is best done as an FSM that "advances" srcstep and dststep over
the zeros in their respective predicate masks, *including* when the
src and dest predicate masks are "All 1s".

Bear in mind that srcstep+dststep are a form of back-to-back
VGATHER+VSCATTER.
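
The skipping behaviour described above can be sketched as follows, assuming integer bitmask predicates in which bit *i* enables element *i*. The function name is hypothetical, and a generator is used only for brevity: a real implementation would hold srcstep and dststep in SVSTATE so that the sequence can be interrupted and resumed.

```python
def twin_predicated_pairs(vl, srcmask, dstmask):
    """Yield (srcstep, dststep) pairs for twin predication without zeroing.

    Each step independently skips over the zero bits of its own mask.
    With both masks all 1s this degenerates to (0, 0), (1, 1), ...
    """
    srcstep = dststep = 0
    while True:
        # Advance each index over the zeros in its respective mask.
        while srcstep < vl and not (srcmask >> srcstep) & 1:
            srcstep += 1
        while dststep < vl and not (dstmask >> dststep) & 1:
            dststep += 1
        if srcstep >= vl or dststep >= vl:
            return  # either side running out ends the loop
        yield srcstep, dststep
        srcstep += 1
        dststep += 1
```

For example, with a sparse source mask and an all-1s destination mask the selected source elements are packed into consecutive destinations, which is the VGATHER-like half of the behaviour. Swapping the masks gives the VSCATTER-like half.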

Watch out in zeroing! CR0 will *not* itself be set to zero:
the CR0.eq flag will be set because the *result* is still tested.
Correction: this applies to CR0 and any other Vector of CR fields.
Vector elements each have their corresponding CR field, so the test
for zero needs to be done on the associated *element* result, rather
than jamming absolutely every element's test *into* CR0.
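
A minimal sketch of the zeroing caveat, using a single-predicated add with Rc=1 behaviour. The function name is hypothetical, `cr_eq[i]` stands in for the eq flag of whichever CR field is associated with element *i*, and a plain loop is used only for brevity.

```python
def predicated_add_zeroing(vl, mask, src_a, src_b, dest, cr_eq):
    """Single-predicated add with zeroing and a per-element eq flag."""
    for i in range(vl):
        if (mask >> i) & 1:
            result = src_a[i] + src_b[i]
        else:
            result = 0  # zeroing: a masked-out element is written as zero
        dest[i] = result
        # The *result* is still tested, so a zeroed element sets eq in its
        # own CR field. The CR field itself is not cleared, and the tests
        # of all elements are not merged into CR0.
        cr_eq[i] = (result == 0)
```

Here a masked-out element ends up with its eq flag set, exactly as an enabled element whose computed result happens to be zero would.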

Progress:

* TestIssuer <https://bugs.libre-soc.org/show_bug.cgi?id=617>
  and Zeroing <https://bugs.libre-soc.org/show_bug.cgi?id=636>
* ISACaller <https://bugs.libre-soc.org/show_bug.cgi?id=618>
* power-gem5: TODO
* Microwatt: TODO

## Element width overrides

<https://bugs.libre-soc.org/show_bug.cgi?id=663>

* Pseudocode: TODO
* Simulator: TODO
* TestIssuer: TODO
* unit tests: TODO
* power-gem5: TODO
* cavatools: TODO

## Reduce Mode

TODO

## Saturation Mode

TODO

## REMAP and Context Propagation

* <https://libre-soc.org/openpower/sv/remap/>
* <https://libre-soc.org/openpower/sv/propagation/>
* <https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/svp64.py;hb=HEAD>

## Vectorised Branches

TODO [[sv/branches]]

## Vectorised LD/ST

TODO [[sv/ldst]]