The m5 utility provides a command line and library interface for gem5
operations.

These operations are requested by the simulated software through some special
behavior which is recognized by gem5. gem5 will then perform the requested
operation which is outside the normal behavior of the simulated system.



# Trigger mechanisms

There are a few different ways the simulated software can let gem5 know it
wants to perform an operation. Different CPU models have different constraints
depending on how they're implemented, and may not support all of these
different mechanisms.

Trigger      | Native  | KVM | Fast Model
-------------|---------|-----|------------
Instruction  | Yes     |     |
Address      | ARM/X86 | Yes |
Semihosting  | ARM     |     | Yes

## "Magic" Instructions

This is the oldest trigger mechanism in gem5, and is supported by all of the
CPU models which interpret instructions one at a time using gem5's ISA
definitions. It works by co-opting instructions which normally are undefined,
and redefining them to trigger gem5 operations. Exactly what instructions
these are, how they encode what operation they go with, etc., vary from ISA to
ISA.

When using the KVM CPU models, the instruction stream is executing on actual
physical hardware which won't treat these instructions specially. They will
retain their old behavior and, most likely, raise an undefined instruction
exception if executed.

Other external models, like ARM's Fast Model CPUs, also won't treat these
instructions specially.

## "Magic" Address Range

This mechanism was added for the KVM CPUs so that they could trigger gem5
operations without having to recognize special instructions. This trigger is
based on a specially set aside range of physical addresses. When a read or
write is targeted at that range, instead of a normal device or memory access,
a gem5 operation is triggered.

Depending on the ISA, gem5 native CPUs should support this mechanism (see the
trigger table above and the supported ABIs table below).

When using the KVM CPU, the special range of addresses is not registered as
memory, and so the KVM virtual machine will exit when it is accessed. gem5
will have a chance to recognize the special address, and can trigger the
operation.

When using an external model like ARM's Fast Model CPUs, these external
accesses will leave the CPU complex, and gem5 will be able to recognize them.
Unfortunately, if the CPU has multiple threads of execution, gem5 won't be able
to tell which thread the access came from. Also, the memory access may not
happen at a precise point in the simulated instruction stream due to binary
translation. The architectural state may then not be consistent enough to
extract arguments from or inject a return value into.

### Default address range

Since x86 has a predictable address space layout, the "magic" address range can
be put in a predictable, default location, which is at 0xFFFF0000.

On other architectures, notably ARM, the address space is less predictable, and
it doesn't make sense to set a default location which won't be valid on all
configurations.

## Semihosting

This mechanism was added to support ARM's Fast Model CPUs. It extends ARM's
semihosting support, a mechanism which was already defined to interrupt normal
execution and trigger some sort of behavior in a containing host.

On ISAs which support semihosting (only ARM now, and probably going forward),
gem5 native CPUs can support semihosting instructions, and so should support
the semihosting trigger mechanism.

KVM CPUs use real hardware, and so semihosting instructions will not have
special behavior and will act like their normal counterparts (HLT, etc.).



# Building

## Supported ABIs

To build either the command line utility or one of the versions of the library,
first identify what ABI(s) you're targeting.

ABI      | Description  | Triggers
---------|--------------|----------
aarch64  | 64 bit ARM   | instruction, address, semihosting
arm      | 32 bit ARM   | instruction
thumb    | ARM thumb    | instruction
sparc    | 64 bit SPARC | instruction
x86      | amd64/x86_64 | instruction, address

## SCons

The m5 utility uses an SCons-based build system. gem5 itself also uses SCons,
but these builds are separate and (mostly) unrelated.

The SConscript for this utility is set up to use a build directory called
"build", similar to gem5 itself. The build directory is structured so that you
can ask scons to build a portion of it to help narrow down what you want to
build.

### native

There is a **build/native** directory which holds test binaries that test
generic functionality and are compiled for the host, whatever that happens to
be. These can always be run directly, unlike ABI specific tests, which may or
may not be runnable directly depending on the host's architecture.

### ABI

The first level subdirectories of the build directory (other than "native",
described above) are named after the ABI you're targeting. For instance, build
products for x86 would be in the **build/x86** subdirectory.

Within an ABI subdirectory will be linked copies of all the source files needed
for the build, and also "test" and "out" subdirectories.

#### test

The "test" subdirectory, for instance **build/x86/test**, holds the test
binaries for that ABI in a "bin" subdirectory, and the results of running those
tests (if requested and possible) in a "result" subdirectory.

#### out

The "out" subdirectory, for instance **build/x86/out**, holds the various final
build products. This includes:

- m5: The command line utility.
- libm5.a: C library.
- gem5OpJni.jar, libgem5OpJni.so, jni/gem5Op.class: Java support files.
- libgem5OpLua.so: Lua module/library.

## Build options

There are some variables which set build options which need to be controlled on
a per ABI level. Currently, these are:

- CROSS_COMPILE: The cross compiler prefix.
- QEMU_ARCH: The QEMU architecture suffix.

To set these for a particular ABI, prefix the variable name with the ABI's name
and then a dot. For instance, to set the cross compiler prefix to
"x86_64-linux-gnu-" for x86, you would run scons like this:

```shell
scons x86.CROSS_COMPILE=x86_64-linux-gnu- build/x86/out/m5
```

ABI      | QEMU_ARCH | CROSS_COMPILE
---------|-----------|---------------------
aarch64  | aarch64   | aarch64-linux-gnu-
arm      | arm       | arm-linux-gnueabihf-
thumb    | arm       | arm-linux-gnueabihf-
sparc    | sparc64   | sparc64-linux-gnu-
x86      | x86_64    |

Note that the default setting for the x86 cross compiler prefix is blank,
meaning that the native/host compiler will be used. If building on a non-x86
host, then you'll need to set an appropriate prefix and may be able to clear
some other prefix corresponding to that host.
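As a rough illustration of the per ABI variable syntax, the sketch below prints
a scons command line for a given ABI. The helper name `m5_scons_cmd` is
hypothetical and not part of the build system, and the prefixes are simply the
values from the table above. It only prints the command, so it is safe to run
on a host without any of the toolchains installed.

```shell
# Print the scons command line for building the m5 utility for one ABI.
# m5_scons_cmd is a hypothetical helper; the prefixes are copied from the
# table above and may need adjusting for the toolchains on your host.
m5_scons_cmd() {
    abi=$1
    case "$abi" in
        aarch64)   prefix=aarch64-linux-gnu- ;;
        arm|thumb) prefix=arm-linux-gnueabihf- ;;
        sparc)     prefix=sparc64-linux-gnu- ;;
        x86)       prefix= ;;   # blank: use the native/host compiler
        *) echo "unknown ABI: $abi" >&2; return 1 ;;
    esac
    if [ -n "$prefix" ]; then
        echo "scons $abi.CROSS_COMPILE=$prefix build/$abi/out/m5"
    else
        echo "scons build/$abi/out/m5"
    fi
}

m5_scons_cmd aarch64
```

To actually run the build rather than print it, the output could be passed to
the shell, for instance with `eval "$(m5_scons_cmd aarch64)"`.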

## External dependency detection

In some cases, if an external dependency isn't detected, the build will
gracefully exclude some targets which depend on it. These include:

### Java support

The SConscript will attempt to find the javac and jar programs. If it can't, it
will disable building the Java support files.

### Lua support

The SConscript will attempt to find lua51 support using pkg-config. If it
can't, it will disable building the lua module/library.

### Non-native tests

The SConscript will attempt to find various QEMU binaries so that it can run
non-native tests using QEMU's application level emulation. The name of the
binary it looks for depends on the ABI and is set to qemu-${QEMU_ARCH}. See
above for a description of per ABI build variables, including QEMU_ARCH.

If it can't find a program with that name, it will disable running non-native
test binaries for that ABI.
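To check ahead of time which non-native tests are likely to run, something
like the following sketch can be used. It is not the SConscript's own
detection logic; it only looks for a program named qemu-${QEMU_ARCH} on the
PATH, using the QEMU_ARCH values from the build options table. The function
name `qemu_binary_for` is made up for this example.

```shell
# Map an ABI to the QEMU user-mode binary name, qemu-${QEMU_ARCH}.
# qemu_binary_for is a hypothetical helper, not part of the build system.
qemu_binary_for() {
    case "$1" in
        aarch64)   echo qemu-aarch64 ;;
        arm|thumb) echo qemu-arm ;;
        sparc)     echo qemu-sparc64 ;;
        x86)       echo qemu-x86_64 ;;
        *) return 1 ;;
    esac
}

# Report, for each ABI, whether the matching QEMU binary is on the PATH.
for abi in aarch64 arm thumb sparc x86; do
    qemu=$(qemu_binary_for "$abi")
    if command -v "$qemu" >/dev/null 2>&1; then
        echo "$abi: found $qemu"
    else
        echo "$abi: $qemu not found, non-native tests would not be run"
    fi
done
```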



# Testing

Tests are based on the googletest system. There are native tests which test
mechanisms which are not specific to any ABI and can be run on the host. These
are built using the native toolchain.

There are also tests for ABI specific mechanisms like the various trigger
types. These will be built using the cross compiler configured for a given ABI.
These tests can be run in QEMU in its application emulation mode, and the build
system can run them automatically if requested and if the required dependencies
have been met.

The tests for the trigger mechanisms can't count on those mechanisms actually
working when running under QEMU, and so will try to set up intercepts which
will catch attempts to use them and verify that they were used correctly. When
running these tests under gem5, set the RUNNING_IN_GEM5 environment variable,
which tells the test to expect the trigger mechanism to actually work.
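A minimal sketch of how a test might be launched in each situation is shown
below. The wrapper `run_trigger_test` is hypothetical, and the value `1` is an
assumption: the text above only says that the variable should be set, not what
it should be set to.

```shell
# Hypothetical wrapper which runs a trigger test with or without
# RUNNING_IN_GEM5 set. The value "1" is an assumption for this sketch.
run_trigger_test() {
    test_bin=$1
    where=$2    # "gem5" when running inside gem5, anything else otherwise
    if [ "$where" = gem5 ]; then
        # Inside gem5: the test should expect the triggers to really work.
        RUNNING_IN_GEM5=1 "$test_bin"
    else
        # Elsewhere: make sure the variable is not inherited by the test.
        ( unset RUNNING_IN_GEM5; "$test_bin" )
    fi
}
```

Inside the simulated system this might be called as
`run_trigger_test ./some_trigger_test gem5`, where `./some_trigger_test`
stands in for one of the binaries from the ABI's test directory.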



# Command line utility

The command line utility provides a way of triggering gem5 operations either
interactively through a terminal connection to the simulated system, or from
scripts running within it.

## Calling syntax

Any call to the utility should have the following structure:

```shell
m5 [call type] <command> [arguments]
```

Call type is optional and selects what trigger mechanism should be used. If
it's omitted, the default mechanism will be used. What the default mechanism is
varies based on the ABI.

ABI      | Default call type
---------|-------------------
aarch64  | instruction
arm      | instruction
thumb    | instruction
sparc    | instruction
x86      | address

The default is usually to use a magic instruction, which for most ABIs is the
only mechanism that's supported, and is what the m5 utility would
traditionally have used. On x86, the address based mechanism is the default
since it's supported on all current CPU types which also support x86.

### Call type

To override the default call type, you can use one of these arguments.

```shell
--addr [address override]
```

Selects the magic address call type. On ABIs which don't have a default magic
address range, which is most of them, this argument must be followed by the
address range to use. On x86, if no address is specified, the default
(0xFFFF0000) will be used.

```shell
--inst
```

Selects the magic instruction call type.

```shell
--semi
```

Selects the semihosting based call type.

### Commands and arguments

To see a list of commands and the arguments they support, run the utility with
the --help argument.

```shell
m5 --help
```
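To show how the pieces of the calling syntax fit together, here is a small
sketch which assembles m5 command lines. The `m5_call` wrapper and the `M5`
variable are inventions for this example, and the command names (`resetstats`,
`dumpstats`, `exit`) should be checked against the output of `m5 --help` for
your build. By default the sketch only prints the commands, so it can be tried
outside of a simulated system.

```shell
# M5 defaults to a dry run which prints the command instead of running it.
# Set M5=m5 inside the simulated system to actually trigger the operations.
M5=${M5:-echo m5}

# Hypothetical wrapper: m5_call <call type or ""> <command> [arguments]
m5_call() {
    call_type=$1
    shift
    # $M5 and $call_type are deliberately unquoted so they split into words.
    $M5 $call_type "$@"
}

m5_call "" resetstats               # default call type for the ABI
m5_call --inst dumpstats            # force the magic instruction
m5_call "--addr 0xFFFF0000" exit    # magic address with an explicit address
```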



# C library

The C library provides a set of functions which can trigger gem5 operations
from within compiled programs.

## Building in the library

To use the C library, include the header file located at

```shell
include/gem5/m5ops.h
```

like so:

```c
#include <gem5/m5ops.h>
```

That will declare the various functions which wrap each of the gem5 operations.
It includes another header file located at

```shell
include/gem5/asm/generic/m5ops.h
```

using a path relative to include. Make sure that include path resolves with
your compiler's settings, or move or modify the headers to fit your existing
options.

As part of the linking step of your application, link in the libm5.a static
library archive which provides the definitions of those functions.
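The include path and link step described above might be combined as in the
following sketch, which prints one possible compile command. Everything
specific in it is an assumption made for illustration: the gem5 checkout
location, the source file name `app.c`, the use of `${CROSS_COMPILE}gcc` as the
compiler, and the library living under `util/m5/build/<ABI>/out`.

```shell
# Print a compile-and-link command for a program which uses libm5.a.
# m5_link_cmd is a hypothetical helper; all paths are placeholders.
m5_link_cmd() {
    gem5_root=$1    # top of the gem5 source tree
    abi=$2          # ABI subdirectory which holds libm5.a
    cross=$3        # cross compiler prefix, may be empty
    src=$4          # C source file which includes <gem5/m5ops.h>
    echo "${cross}gcc -I$gem5_root/include $src" \
         "$gem5_root/util/m5/build/$abi/out/libm5.a -o ${src%.c}"
}

m5_link_cmd /path/to/gem5 aarch64 aarch64-linux-gnu- app.c
```

The library is listed after the source file so that the linker sees the
references to the m5 functions before it searches the archive for them.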

## Trigger mechanisms

The bare function name as defined in the header file will use the magic
instruction based trigger mechanism, which was historically the default.

Some macros at the end of the header file will set up other declarations which
mirror all of the other definitions, but with an "_addr" or "_semi" suffix.
These other versions will trigger the same gem5 operations, but using the
"magic" address or semihosting trigger mechanisms. While those functions will
be unconditionally declared in the header file, a definition will exist in the
library only if that trigger mechanism is supported for that ABI.



# Java jar

To use the gem5 Java jar, you will need to load the corresponding .so.

```java
System.loadLibrary("gem5OpJni");
```

In your Java source, import the gem5Op class which will have methods for
calling each of the gem5 operations.

```java
import jni.gem5Op;
```

These methods will all use the magic instruction based trigger mechanism.



# lua module

The lua module is implemented in a file called libgem5OpLua.so, and should be
loaded using typical lua mechanisms. It will be built against lua 5.1.

## Integer values

In lua 5.1, all numeric values are (typically) represented as doubles. That
means that 64 bit integer argument values of any type, but in particular
addresses, can't always be represented exactly. Calls to gem5 operations using
that type of argument or returning that type of value may not work properly.

In lua 5.3, numeric values can be represented by either a double or a proper
integer without having to rebuild the lua interpreter configured for one or the
other. If the module were ported to lua 5.3 then integer values could be passed
safely.
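The precision limit of doubles can be seen without lua at all. The sketch below
uses awk, which also stores numbers as doubles in its usual configuration, to
show that an integer just above 2^53 does not survive a round trip. The
`as_double` helper is only for illustration and has nothing to do with the
module itself.

```shell
# Convert a decimal integer to a double and print it back with no fraction.
# awk is used only because it stores numbers as IEEE 754 doubles.
as_double() {
    awk -v n="$1" 'BEGIN { printf "%.0f\n", n + 0 }'
}

as_double 9007199254740992    # 2^53, exactly representable
as_double 9007199254740993    # 2^53 + 1, rounded to a neighboring value
```

Both calls print the same number, so two distinct 64 bit values, for instance
two neighboring addresses, would be indistinguishable after passing through a
double.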



# Known problems

## Java/lua cross compiling

When building the Java or lua modules, a C cross compiler is used so that any
generated binary will be built for the target ABI. Unfortunately, other tools,
headers, etc., come from the host and may not be usable, or worse may be
subtly broken, when used to target a different ABI. To build these objects
correctly, we would need to use a proper cross build environment for their
corresponding languages. Something like this could likely be set up using a
tool like buildroot.