gdb/doc/agentexpr.texi

   1 @c \input texinfo
   2 @c %**start of header
   3 @c @setfilename agentexpr.info
   4 @c @settitle GDB Agent Expressions
   5 @c @setchapternewpage off
   6 @c %**end of header
   7
   8 @c This file is part of the GDB manual.
   9 @c
  10 @c Copyright (C) 2003, 2004, 2005, 2006, 2009, 2010, 2011
  11 @c Free Software Foundation, Inc.
  12 @c
  13 @c See the file gdb.texinfo for copying conditions.
  14
  15 @node Agent Expressions
  16 @appendix The GDB Agent Expression Mechanism
  17
  18 In some applications, it is not feasible for the debugger to interrupt
  19 the program's execution long enough for the developer to learn anything
  20 helpful about its behavior.  If the program's correctness depends on its
  21 real-time behavior, delays introduced by a debugger might cause the
  22 program to fail, even when the code itself is correct.  It is useful to
  23 be able to observe the program's behavior without interrupting it.
  24
  25 Using GDB's @code{trace} and @code{collect} commands, the user can
  26 specify locations in the program, and arbitrary expressions to evaluate
  27 when those locations are reached.  Later, using the @code{tfind}
  28 command, she can examine the values those expressions had when the
  29 program hit the trace points.  The expressions may also denote objects
  30 in memory --- structures or arrays, for example --- whose values GDB
  31 should record; while visiting a particular tracepoint, the user may
  32 inspect those objects as if they were in memory at that moment.
  33 However, because GDB records these values without interacting with the
  34 user, it can do so quickly and unobtrusively, hopefully not disturbing
  35 the program's behavior.
  36
  37 When GDB is debugging a remote target, the GDB @dfn{agent} code running
  38 on the target computes the values of the expressions itself.  To avoid
  39 having a full symbolic expression evaluator on the agent, GDB translates
  40 expressions in the source language into a simpler bytecode language, and
  41 then sends the bytecode to the agent; the agent then executes the
  42 bytecode, and records the values for GDB to retrieve later.
  43
  44 The bytecode language is simple; there are forty-odd opcodes, the bulk
   45 of which are the usual vocabulary of C operators (addition, subtraction,
  46 shifts, and so on) and various sizes of literals and memory reference
  47 operations.  The bytecode interpreter operates strictly on machine-level
  48 values --- various sizes of integers and floating point numbers --- and
  49 requires no information about types or symbols; thus, the interpreter's
  50 internal data structures are simple, and each bytecode requires only a
  51 few native machine instructions to implement it.  The interpreter is
  52 small, and strict limits on the memory and time required to evaluate an
  53 expression are easy to determine, making it suitable for use by the
  54 debugging agent in real-time applications.
  55
  56 @menu
  57 * General Bytecode Design::     Overview of the interpreter.
  58 * Bytecode Descriptions::       What each one does.
  59 * Using Agent Expressions::     How agent expressions fit into the big picture.
  60 * Varying Target Capabilities:: How to discover what the target can do.
  61 * Rationale::                   Why we did it this way.
  62 @end menu
  63
  64
  65 @c @node Rationale
  66 @c @section Rationale
  67
  68
  69 @node General Bytecode Design
  70 @section General Bytecode Design
  71
  72 The agent represents bytecode expressions as an array of bytes.  Each
  73 instruction is one byte long (thus the term @dfn{bytecode}).  Some
  74 instructions are followed by operand bytes; for example, the @code{goto}
  75 instruction is followed by a destination for the jump.
  76
  77 The bytecode interpreter is a stack-based machine; most instructions pop
  78 their operands off the stack, perform some operation, and push the
  79 result back on the stack for the next instruction to consume.  Each
   80 element of the stack may contain either an integer or a floating point
  81 value; these values are as many bits wide as the largest integer that
  82 can be directly manipulated in the source language.  Stack elements
  83 carry no record of their type; bytecode could push a value as an
  84 integer, then pop it as a floating point value.  However, GDB will not
  85 generate code which does this.  In C, one might define the type of a
  86 stack element as follows:
  87 @example
  88 union agent_val @{
  89   LONGEST l;
  90   DOUBLEST d;
  91 @};
  92 @end example
  93 @noindent
  94 where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for
  95 the largest integer and floating point types on the machine.
  96
  97 By the time the bytecode interpreter reaches the end of the expression,
  98 the value of the expression should be the only value left on the stack.
  99 For tracing applications, @code{trace} bytecodes in the expression will
 100 have recorded the necessary data, and the value on the stack may be
 101 discarded.  For other applications, like conditional breakpoints, the
 102 value may be useful.
 103
 104 Separate from the stack, the interpreter has two registers:
 105 @table @code
 106 @item pc
 107 The address of the next bytecode to execute.
 108
 109 @item start
 110 The address of the start of the bytecode expression, necessary for
 111 interpreting the @code{goto} and @code{if_goto} instructions.
 112
 113 @end table
 114 @noindent
 115 Neither of these registers is directly visible to the bytecode language
 116 itself, but they are useful for defining the meanings of the bytecode
 117 operations.
 118
 119 There are no instructions to perform side effects on the running
 120 program, or call the program's functions; we assume that these
 121 expressions are only used for unobtrusive debugging, not for patching
 122 the running code.
 123
 124 Most bytecode instructions do not distinguish between the various sizes
 125 of values, and operate on full-width values; the upper bits of the
 126 values are simply ignored, since they do not usually make a difference
 127 to the value computed.  The exceptions to this rule are:
 128 @table @asis
 129
 130 @item memory reference instructions (@code{ref}@var{n})
 131 There are distinct instructions to fetch different word sizes from
 132 memory.  Once on the stack, however, the values are treated as full-size
 133 integers.  They may need to be sign-extended; the @code{ext} instruction
 134 exists for this purpose.
 135
 136 @item the sign-extension instruction (@code{ext} @var{n})
  137 This instruction needs to know which portion of its operand is to be
  138 extended to occupy the full length of the word.
 139
 140 @end table
 141
 142 If the interpreter is unable to evaluate an expression completely for
 143 some reason (a memory location is inaccessible, or a divisor is zero,
 144 for example), we say that interpretation ``terminates with an error''.
 145 This means that the problem is reported back to the interpreter's caller
 146 in some helpful way.  In general, code using agent expressions should
 147 assume that they may attempt to divide by zero, fetch arbitrary memory
 148 locations, and misbehave in other ways.
 149
 150 Even complicated C expressions compile to a few bytecode instructions;
 151 for example, the expression @code{x + y * z} would typically produce
 152 code like the following, assuming that @code{x} and @code{y} live in
 153 registers, and @code{z} is a global variable holding a 32-bit
 154 @code{int}:
 155 @example
 156 reg 1
 157 reg 2
 158 const32 @i{address of z}
 159 ref32
 160 ext 32
 161 mul
 162 add
 163 end
 164 @end example
 165
 166 In detail, these mean:
 167 @table @code
 168
 169 @item reg 1
 170 Push the value of register 1 (presumably holding @code{x}) onto the
 171 stack.
 172
 173 @item reg 2
 174 Push the value of register 2 (holding @code{y}).
 175
 176 @item const32 @i{address of z}
 177 Push the address of @code{z} onto the stack.
 178
 179 @item ref32
 180 Fetch a 32-bit word from the address at the top of the stack; replace
 181 the address on the stack with the value.  Thus, we replace the address
 182 of @code{z} with @code{z}'s value.
 183
 184 @item ext 32
 185 Sign-extend the value on the top of the stack from 32 bits to full
 186 length.  This is necessary because @code{z} is a signed integer.
 187
 188 @item mul
 189 Pop the top two numbers on the stack, multiply them, and push their
 190 product.  Now the top of the stack contains the value of the expression
 191 @code{y * z}.
 192
 193 @item add
 194 Pop the top two numbers, add them, and push the sum.  Now the top of the
 195 stack contains the value of @code{x + y * z}.
 196
 197 @item end
 198 Stop executing; the value left on the stack top is the value to be
 199 recorded.
 200
 201 @end table
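
As a rough illustration of how compact this is, the following C sketch
assembles the sequence above into raw bytes.  The opcode values come from
the bytecode descriptions in the next section; the function name, the use
of register numbers 1 and 2, and the assumption that the address of
@code{z} fits in 32 bits are ours, chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as listed in the Bytecode Descriptions section.  */
enum {
    OP_ADD = 0x02, OP_MUL = 0x04, OP_EXT = 0x16, OP_REF32 = 0x19,
    OP_CONST32 = 0x24, OP_REG = 0x26, OP_END = 0x27
};

/* Hypothetical helper: write the bytecode for x + y * z into buf, which
   must hold at least 17 bytes.  Returns the number of bytes written.  */
size_t assemble_x_plus_y_times_z(uint8_t *buf, uint32_t addr_of_z)
{
    size_t i = 0;

    buf[i++] = OP_REG; buf[i++] = 0x00; buf[i++] = 0x01;  /* reg 1 */
    buf[i++] = OP_REG; buf[i++] = 0x00; buf[i++] = 0x02;  /* reg 2 */
    buf[i++] = OP_CONST32;                                /* const32 &z */
    buf[i++] = (uint8_t)(addr_of_z >> 24);                /* operand bytes, */
    buf[i++] = (uint8_t)(addr_of_z >> 16);                /* most significant */
    buf[i++] = (uint8_t)(addr_of_z >> 8);                 /* byte first */
    buf[i++] = (uint8_t)addr_of_z;
    buf[i++] = OP_REF32;                                  /* ref32 */
    buf[i++] = OP_EXT; buf[i++] = 32;                     /* ext 32 */
    buf[i++] = OP_MUL;                                    /* mul */
    buf[i++] = OP_ADD;                                    /* add */
    buf[i++] = OP_END;                                    /* end */
    return i;
}
```

Under these assumptions the whole expression occupies 17 bytes.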
 202
 203
 204 @node Bytecode Descriptions
 205 @section Bytecode Descriptions
 206
 207 Each bytecode description has the following form:
 208
 209 @table @asis
 210
 211 @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
 212
 213 Pop the top two stack items, @var{a} and @var{b}, as integers; push
 214 their sum, as an integer.
 215
 216 @end table
 217
 218 In this example, @code{add} is the name of the bytecode, and
 219 @code{(0x02)} is the one-byte value used to encode the bytecode, in
 220 hexadecimal.  The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows
 221 the stack before and after the bytecode executes.  Beforehand, the stack
 222 must contain at least two values, @var{a} and @var{b}; since the top of
 223 the stack is to the right, @var{b} is on the top of the stack, and
 224 @var{a} is underneath it.  After execution, the bytecode will have
 225 popped @var{a} and @var{b} from the stack, and replaced them with a
 226 single value, @var{a+b}.  There may be other values on the stack below
 227 those shown, but the bytecode affects only those shown.
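
To make the operand order concrete, here is a minimal C sketch of how an
interpreter might implement a two-operand bytecode such as @code{sub}
(@var{a} @var{b} @result{} @var{a-b}).  The @code{ax_stack} type, its
capacity of 16 items, and the function name are hypothetical; a real agent
would size and name things differently.

```c
#include <stdint.h>

/* Hypothetical fixed-size evaluation stack; item[depth - 1] is the top.  */
struct ax_stack {
    uint64_t item[16];
    int depth;
};

/* sub: a b => a-b.  Returns 0 on success, or -1 if fewer than two items
   are available (the caller would then terminate with an error).  */
int ax_sub(struct ax_stack *s)
{
    if (s->depth < 2)
        return -1;
    uint64_t b = s->item[--s->depth];   /* b was on top, so it pops first */
    uint64_t a = s->item[--s->depth];   /* a was underneath it */
    s->item[s->depth++] = a - b;        /* unsigned arithmetic wraps safely */
    return 0;
}
```

Any items below @var{a} are left untouched, which matches the convention
that a bytecode affects only the values shown in its description.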
 228
 229 Here is another example:
 230
 231 @table @asis
 232
 233 @item @code{const8} (0x22) @var{n}: @result{} @var{n}
 234 Push the 8-bit integer constant @var{n} on the stack, without sign
 235 extension.
 236
 237 @end table
 238
 239 In this example, the bytecode @code{const8} takes an operand @var{n}
 240 directly from the bytecode stream; the operand follows the @code{const8}
 241 bytecode itself.  We write any such operands immediately after the name
 242 of the bytecode, before the colon, and describe the exact encoding of
 243 the operand in the bytecode stream in the body of the bytecode
 244 description.
 245
 246 For the @code{const8} bytecode, there are no stack items given before
 247 the @result{}; this simply means that the bytecode consumes no values
 248 from the stack.  If a bytecode consumes no values, or produces no
 249 values, the list on either side of the @result{} may be empty.
 250
 251 If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode
  252 treats it as an integer.  If a value is written as @var{addr}, then the
 253 bytecode treats it as an address.
 254
 255 We do not fully describe the floating point operations here; although
 256 this design can be extended in a clean way to handle floating point
 257 values, they are not of immediate interest to the customer, so we avoid
 258 describing them, to save time.
 259
 260
 261 @table @asis
 262
 263 @item @code{float} (0x01): @result{}
 264
 265 Prefix for floating-point bytecodes.  Not implemented yet.
 266
 267 @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b}
 268 Pop two integers from the stack, and push their sum, as an integer.
 269
 270 @item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b}
 271 Pop two integers from the stack, subtract the top value from the
 272 next-to-top value, and push the difference.
 273
 274 @item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b}
 275 Pop two integers from the stack, multiply them, and push the product on
 276 the stack.  Note that, when one multiplies two @var{n}-bit numbers
 277 yielding another @var{n}-bit number, it is irrelevant whether the
 278 numbers are signed or not; the results are the same.
 279
 280 @item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b}
 281 Pop two signed integers from the stack; divide the next-to-top value by
 282 the top value, and push the quotient.  If the divisor is zero, terminate
 283 with an error.
 284
 285 @item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b}
 286 Pop two unsigned integers from the stack; divide the next-to-top value
 287 by the top value, and push the quotient.  If the divisor is zero,
 288 terminate with an error.
 289
 290 @item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b}
 291 Pop two signed integers from the stack; divide the next-to-top value by
 292 the top value, and push the remainder.  If the divisor is zero,
 293 terminate with an error.
 294
 295 @item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b}
 296 Pop two unsigned integers from the stack; divide the next-to-top value
 297 by the top value, and push the remainder.  If the divisor is zero,
 298 terminate with an error.
 299
 300 @item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b}
 301 Pop two integers from the stack; let @var{a} be the next-to-top value,
 302 and @var{b} be the top value.  Shift @var{a} left by @var{b} bits, and
 303 push the result.
 304
 305 @item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b}
 306 Pop two integers from the stack; let @var{a} be the next-to-top value,
 307 and @var{b} be the top value.  Shift @var{a} right by @var{b} bits,
 308 inserting copies of the top bit at the high end, and push the result.
 309
 310 @item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b}
 311 Pop two integers from the stack; let @var{a} be the next-to-top value,
 312 and @var{b} be the top value.  Shift @var{a} right by @var{b} bits,
 313 inserting zero bits at the high end, and push the result.
 314
 315 @item @code{log_not} (0x0e): @var{a} @result{} @var{!a}
 316 Pop an integer from the stack; if it is zero, push the value one;
 317 otherwise, push the value zero.
 318
 319 @item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b}
 320 Pop two integers from the stack, and push their bitwise @code{and}.
 321
 322 @item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b}
 323 Pop two integers from the stack, and push their bitwise @code{or}.
 324
 325 @item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b}
 326 Pop two integers from the stack, and push their bitwise
 327 exclusive-@code{or}.
 328
 329 @item @code{bit_not} (0x12): @var{a} @result{} @var{~a}
 330 Pop an integer from the stack, and push its bitwise complement.
 331
 332 @item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b}
 333 Pop two integers from the stack; if they are equal, push the value one;
 334 otherwise, push the value zero.
 335
 336 @item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b}
 337 Pop two signed integers from the stack; if the next-to-top value is less
 338 than the top value, push the value one; otherwise, push the value zero.
 339
 340 @item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b}
 341 Pop two unsigned integers from the stack; if the next-to-top value is less
 342 than the top value, push the value one; otherwise, push the value zero.
 343
 344 @item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits
 345 Pop an unsigned value from the stack; treating it as an @var{n}-bit
 346 twos-complement value, extend it to full length.  This means that all
 347 bits to the left of bit @var{n-1} (where the least significant bit is bit
 348 0) are set to the value of bit @var{n-1}.  Note that @var{n} may be
 349 larger than or equal to the width of the stack elements of the bytecode
 350 engine; in this case, the bytecode should have no effect.
 351
 352 The number of source bits to preserve, @var{n}, is encoded as a single
 353 byte unsigned integer following the @code{ext} bytecode.
 354
 355 @item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits
 356 Pop an unsigned value from the stack; zero all but the bottom @var{n}
 357 bits.  This means that all bits to the left of bit @var{n-1} (where the
  358 least significant bit is bit 0) are set to zero.
 359
 360 The number of source bits to preserve, @var{n}, is encoded as a single
 361 byte unsigned integer following the @code{zero_ext} bytecode.
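
The two extension bytecodes can be sketched in C as follows.  This is only
an illustration: it assumes 64-bit stack elements, the function names are
made up, and the treatment of @var{n} equal to zero for @code{ext} and of
@var{n} at least the element width for @code{zero_ext} is a choice made
here, since the descriptions above do not cover those cases.

```c
#include <stdint.h>

/* Width of a stack element in bits; 64 is an assumption of this sketch.  */
#define AX_WORD_BITS 64

/* ext n: sign-extend the low n bits of v.  n == 0 is not defined by the
   description; this sketch returns v unchanged in that case.  */
uint64_t ax_ext(uint64_t v, unsigned n)
{
    if (n == 0 || n >= AX_WORD_BITS)
        return v;
    uint64_t sign = (uint64_t)1 << (n - 1);   /* bit n-1 */
    v &= (sign << 1) - 1;                     /* keep only the low n bits */
    return (v ^ sign) - sign;                 /* copy bit n-1 upward */
}

/* zero_ext n: clear every bit above bit n-1.  Returning v unchanged
   when n covers the whole word is an assumption of this sketch.  */
uint64_t ax_zero_ext(uint64_t v, unsigned n)
{
    if (n >= AX_WORD_BITS)
        return v;
    return v & (((uint64_t)1 << n) - 1);
}
```

For example, @code{ax_ext (0xff, 8)} yields a word with every bit set (the
value @minus{}1), while @code{ax_zero_ext} of the same word with @var{n}
equal to 8 yields 0xff again.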
 362
 363 @item @code{ref8} (0x17): @var{addr} @result{} @var{a}
 364 @itemx @code{ref16} (0x18): @var{addr} @result{} @var{a}
 365 @itemx @code{ref32} (0x19): @var{addr} @result{} @var{a}
 366 @itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a}
 367 Pop an address @var{addr} from the stack.  For bytecode
 368 @code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the
 369 natural target endianness.  Push the fetched value as an unsigned
 370 integer.
 371
 372 Note that @var{addr} may not be aligned in any particular way; the
 373 @code{ref@var{n}} bytecodes should operate correctly for any address.
 374
 375 If attempting to access memory at @var{addr} would cause a processor
 376 exception of some sort, terminate with an error.
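
One way an agent might satisfy the alignment requirement is to copy the
bytes rather than dereference a typed pointer.  The sketch below shows this
for @code{ref32}; it assumes the agent runs in the program's own address
space and that @var{addr} is already known to be readable, so the error
path described above is omitted.  The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* ref32: fetch a 32-bit value from addr, which need not be aligned.
   Fault handling is deliberately left out of this sketch.  */
uint64_t ax_ref32(const void *addr)
{
    uint32_t v;

    memcpy(&v, addr, sizeof v);   /* byte-wise copy: no alignment requirement */
    return v;                     /* pushed as an unsigned, zero-extended value */
}
```

Because the copy preserves byte order, the value is read in the target's
natural endianness; a following @code{ext 32} would sign-extend it if the
source-level type is signed.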
 377
 378 @item @code{ref_float} (0x1b): @var{addr} @result{} @var{d}
 379 @itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d}
 380 @itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d}
 381 @itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d}
 382 @itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a}
 383 Not implemented yet.
 384
  385 @item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a}
  386 Push another copy of the stack's top element.
  387
  388 @item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a}
  389 Exchange the top two items on the stack.
  390
  391 @item @code{pop} (0x29): @var{a} @result{}
  392 Discard the top value on the stack.
  393
  394 @item @code{pick} (0x32) @var{n}: @var{a} @dots{} @var{b} @result{} @var{a} @dots{} @var{b} @var{a}
 395 Duplicate an item from the stack and push it on the top of the stack.
 396 @var{n}, a single byte, indicates the stack item to copy.  If @var{n}
 397 is zero, this is the same as @code{dup}; if @var{n} is one, it copies
 398 the item under the top item, etc.  If @var{n} exceeds the number of
 399 items on the stack, terminate with an error.
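
A minimal C sketch of @code{pick} follows.  The stack type, its capacity,
and the function name are hypothetical.  Since @var{n} counts from zero,
this sketch treats any @var{n} greater than or equal to the current depth
as an error; that reading of the error condition is our interpretation.

```c
#include <stdint.h>

enum { AX_STACK_MAX = 16 };   /* arbitrary capacity for this sketch */

/* Hypothetical evaluation stack; item[depth - 1] is the top.  */
struct ax_stack {
    uint64_t item[AX_STACK_MAX];
    int depth;
};

/* pick n: copy the item n places below the top onto the top.
   Returns 0 on success, -1 if there is no such item or no room.  */
int ax_pick(struct ax_stack *s, unsigned n)
{
    if (n >= (unsigned)s->depth)     /* nothing that far down */
        return -1;
    if (s->depth >= AX_STACK_MAX)    /* pushing would overflow */
        return -1;
    s->item[s->depth] = s->item[s->depth - 1 - (int)n];
    s->depth++;
    return 0;
}
```

With the stack holding 5, 6, 7 (7 on top), @code{pick 2} pushes a copy of
5, and @code{pick 0} behaves like @code{dup}.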
 400
  401 @item @code{rot} (0x33): @var{a} @var{b} @var{c} @result{} @var{c} @var{a} @var{b}
  402 Rotate the top three items on the stack: the top item (@var{c}) moves to third place, and @var{a} and @var{b} each move up one.
 403
 404 @item @code{if_goto} (0x20) @var{offset}: @var{a} @result{}
 405 Pop an integer off the stack; if it is non-zero, branch to the given
 406 offset in the bytecode string.  Otherwise, continue to the next
 407 instruction in the bytecode stream.  In other words, if @var{a} is
 408 non-zero, set the @code{pc} register to @code{start} + @var{offset}.
 409 Thus, an offset of zero denotes the beginning of the expression.
 410
 411 The @var{offset} is stored as a sixteen-bit unsigned value, stored
 412 immediately following the @code{if_goto} bytecode.  It is always stored
 413 most significant byte first, regardless of the target's normal
 414 endianness.  The offset is not guaranteed to fall at any particular
 415 alignment within the bytecode stream; thus, on machines where fetching a
  416 16-bit value at an unaligned address raises an exception, you should fetch the
 417 offset one byte at a time.
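
The following C sketch shows one way to decode the operand and choose the
next bytecode.  It is illustrative only: the function names are made up,
positions are represented as offsets from @code{start} rather than as
addresses, and bounds checking of the bytecode string is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian operand one byte at a time, so that no
   aligned 16-bit access is ever required.  */
static unsigned ax_fetch_u16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | (unsigned)p[1];
}

/* Given the offset of an if_goto opcode within code, and the value a
   popped from the stack, return the offset of the next bytecode to
   execute.  Offsets are relative to the start of the expression.  */
size_t ax_if_goto_next(const uint8_t *code, size_t op_offset, uint64_t a)
{
    if (a != 0)
        return ax_fetch_u16(code + op_offset + 1);   /* pc = start + offset */
    return op_offset + 3;   /* skip the opcode and its two operand bytes */
}
```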
 418
 419 @item @code{goto} (0x21) @var{offset}: @result{}
 420 Branch unconditionally to @var{offset}; in other words, set the
 421 @code{pc} register to @code{start} + @var{offset}.
 422
 423 The offset is stored in the same way as for the @code{if_goto} bytecode.
 424
 425 @item @code{const8} (0x22) @var{n}: @result{} @var{n}
 426 @itemx @code{const16} (0x23) @var{n}: @result{} @var{n}
 427 @itemx @code{const32} (0x24) @var{n}: @result{} @var{n}
 428 @itemx @code{const64} (0x25) @var{n}: @result{} @var{n}
 429 Push the integer constant @var{n} on the stack, without sign extension.
 430 To produce a small negative value, push a small twos-complement value,
 431 and then sign-extend it using the @code{ext} bytecode.
 432
 433 The constant @var{n} is stored in the appropriate number of bytes
 434 following the @code{const}@var{b} bytecode.  The constant @var{n} is
 435 always stored most significant byte first, regardless of the target's
 436 normal endianness.  The constant is not guaranteed to fall at any
 437 particular alignment within the bytecode stream; thus, on machines where
  438 fetching a multi-byte value at an unaligned address raises an exception, you
 439 should fetch @var{n} one byte at a time.
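
A byte-at-a-time fetch that works for all four constant widths might look
like the C sketch below.  The function name and the convention of passing
the width in bytes are assumptions made for this example.

```c
#include <stdint.h>

/* Read an nbytes-wide (1, 2, 4 or 8) big-endian constant starting at p,
   one byte at a time.  The result is not sign-extended.  */
uint64_t ax_fetch_const(const uint8_t *p, int nbytes)
{
    uint64_t n = 0;

    for (int i = 0; i < nbytes; i++)
        n = (n << 8) | p[i];   /* most significant byte comes first */
    return n;
}
```

Note that the operand byte 0xff of a @code{const8} comes back as 255; it
is the @code{ext 8} that follows which turns it into a negative value.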
 440
 441 @item @code{reg} (0x26) @var{n}: @result{} @var{a}
 442 Push the value of register number @var{n}, without sign extension.  The
 443 registers are numbered following GDB's conventions.
 444
 445 The register number @var{n} is encoded as a 16-bit unsigned integer
 446 immediately following the @code{reg} bytecode.  It is always stored most
 447 significant byte first, regardless of the target's normal endianness.
 448 The register number is not guaranteed to fall at any particular
 449 alignment within the bytecode stream; thus, on machines where fetching a
  450 16-bit value at an unaligned address raises an exception, you should fetch the
 451 register number one byte at a time.
 452
 453 @item @code{getv} (0x2c) @var{n}: @result{} @var{v}
 454 Push the value of trace state variable number @var{n}, without sign
 455 extension.
 456
 457 The variable number @var{n} is encoded as a 16-bit unsigned integer
 458 immediately following the @code{getv} bytecode.  It is always stored most
 459 significant byte first, regardless of the target's normal endianness.
 460 The variable number is not guaranteed to fall at any particular
 461 alignment within the bytecode stream; thus, on machines where fetching a
  462 16-bit value at an unaligned address raises an exception, you should fetch the
  463 variable number one byte at a time.
 464
  465 @item @code{setv} (0x2d) @var{n}: @var{v} @result{} @var{v}
 466 Set trace state variable number @var{n} to the value found on the top
 467 of the stack.  The stack is unchanged, so that the value is readily
 468 available if the assignment is part of a larger expression.  The
 469 handling of @var{n} is as described for @code{getv}.
 470
 471 @item @code{trace} (0x0c): @var{addr} @var{size} @result{}
 472 Record the contents of the @var{size} bytes at @var{addr} in a trace
 473 buffer, for later retrieval by GDB.
 474
 475 @item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr}
 476 Record the contents of the @var{size} bytes at @var{addr} in a trace
 477 buffer, for later retrieval by GDB.  @var{size} is a single byte
  478 unsigned integer following the @code{trace_quick} opcode.
 479
 480 This bytecode is equivalent to the sequence @code{dup const8 @var{size}
 481 trace}, but we provide it anyway to save space in bytecode strings.
 482
 483 @item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr}
  484 Identical to @code{trace_quick}, except that @var{size} is a 16-bit big-endian
 485 unsigned integer, not a single byte.  This should probably have been
 486 named @code{trace_quick16}, for consistency.
 487
 488 @item @code{tracev} (0x2e) @var{n}: @result{} @var{a}
 489 Record the value of trace state variable number @var{n} in the trace
 490 buffer.  The handling of @var{n} is as described for @code{getv}.
 491
  492 @item @code{tracenz} (0x2f): @var{addr} @var{size} @result{}
 493 Record the bytes at @var{addr} in a trace buffer, for later retrieval
 494 by GDB.  Stop at either the first zero byte, or when @var{size} bytes
 495 have been recorded, whichever occurs first.
 496
 497 @item @code{end} (0x27): @result{}
 498 Stop executing bytecode; the result should be the top element of the
 499 stack.  If the purpose of the expression was to compute an lvalue or a
 500 range of memory, then the next-to-top of the stack is the lvalue's
 501 address, and the top of the stack is the lvalue's size, in bytes.
 502
 503 @end table
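
Putting several of these descriptions together, the core of an interpreter
might look like the following C sketch.  It handles only a handful of
opcodes (@code{const8}, @code{add}, @code{mul}, @code{equal},
@code{log_not}, @code{if_goto}, @code{goto} and @code{end}).  The status
codes, the stack limit of 32 entries, the step limit, and the requirement
that @code{end} find at least one value on the stack are all assumptions
made for this example, not part of the design.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; the text only asks that errors be reported
   to the caller in some helpful way.  */
enum ax_status { AX_OK, AX_ERR_STACK, AX_ERR_BAD_OP, AX_ERR_BAD_PC, AX_ERR_LIMIT };

enum { AX_STACK_MAX = 32, AX_STEP_MAX = 10000 };   /* arbitrary sketch limits */

/* Evaluate the expression in code[0..len).  On AX_OK, *result receives
   the value on the top of the stack when `end' is reached.  */
enum ax_status ax_eval(const uint8_t *code, size_t len, uint64_t *result)
{
    uint64_t stack[AX_STACK_MAX];
    size_t sp = 0;   /* number of items currently on the stack */
    size_t pc = 0;   /* offset of the next bytecode, relative to start */

    for (int steps = 0; steps < AX_STEP_MAX; steps++) {
        if (pc >= len)
            return AX_ERR_BAD_PC;
        uint8_t op = code[pc++];
        switch (op) {
        case 0x22:   /* const8 n */
            if (pc >= len)
                return AX_ERR_BAD_PC;
            if (sp >= AX_STACK_MAX)
                return AX_ERR_STACK;
            stack[sp++] = code[pc++];
            break;
        case 0x02:   /* add */
        case 0x04:   /* mul */
        case 0x13:   /* equal */
            if (sp < 2)
                return AX_ERR_STACK;
            {
                uint64_t b = stack[--sp];   /* top of stack */
                uint64_t a = stack[--sp];   /* next-to-top */
                stack[sp++] = op == 0x02 ? a + b
                            : op == 0x04 ? a * b
                            : (uint64_t)(a == b);
            }
            break;
        case 0x0e:   /* log_not */
            if (sp < 1)
                return AX_ERR_STACK;
            stack[sp - 1] = (stack[sp - 1] == 0);
            break;
        case 0x20:   /* if_goto offset */
        case 0x21:   /* goto offset */
            if (len - pc < 2)
                return AX_ERR_BAD_PC;
            {
                size_t target = ((size_t)code[pc] << 8) | code[pc + 1];
                pc += 2;
                if (op == 0x20) {
                    if (sp < 1)
                        return AX_ERR_STACK;
                    if (stack[--sp] == 0)
                        break;      /* not taken: continue with next op */
                }
                pc = target;        /* pc = start + offset */
            }
            break;
        case 0x27:   /* end */
            if (sp < 1)
                return AX_ERR_STACK;
            *result = stack[sp - 1];
            return AX_OK;
        default:
            return AX_ERR_BAD_OP;
        }
    }
    return AX_ERR_LIMIT;   /* gave up: too many steps */
}
```

Every path through the loop either consumes a bounded amount of work or
returns a status, which is the property that makes the time and memory
needed by an expression easy to bound.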
 504
 505
 506 @node Using Agent Expressions
 507 @section Using Agent Expressions
 508
 509 Agent expressions can be used in several different ways by @value{GDBN},
 510 and the debugger can generate different bytecode sequences as appropriate.
 511
 512 One possibility is to do expression evaluation on the target rather
 513 than the host, such as for the conditional of a conditional
 514 tracepoint.  In such a case, @value{GDBN} compiles the source
 515 expression into a bytecode sequence that simply gets values from
 516 registers or memory, does arithmetic, and returns a result.
 517
 518 Another way to use agent expressions is for tracepoint data
 519 collection.  @value{GDBN} generates a different bytecode sequence for
 520 collection; in addition to bytecodes that do the calculation,
  521 @value{GDBN} adds @code{trace} bytecodes to save the pieces of
  522 memory that were used.  In outline, collection proceeds as follows:
 523
 524 @itemize @bullet
 525
 526 @item
 527 The user selects trace points in the program's code at which GDB should
 528 collect data.
 529
 530 @item
 531 The user specifies expressions to evaluate at each trace point.  These
 532 expressions may denote objects in memory, in which case those objects'
 533 contents are recorded as the program runs, or computed values, in which
 534 case the values themselves are recorded.
 535
 536 @item
 537 GDB transmits the tracepoints and their associated expressions to the
 538 GDB agent, running on the debugging target.
 539
 540 @item
 541 The agent arranges to be notified when a trace point is hit.
 542
 543 @item
 544 When execution on the target reaches a trace point, the agent evaluates
 545 the expressions associated with that trace point, and records the
 546 resulting values and memory ranges.
 547
 548 @item
 549 Later, when the user selects a given trace event and inspects the
 550 objects and expression values recorded, GDB talks to the agent to
 551 retrieve recorded data as necessary to meet the user's requests.  If the
 552 user asks to see an object whose contents have not been recorded, GDB
 553 reports an error.
 554
 555 @end itemize
 556
 557
 558 @node Varying Target Capabilities
 559 @section Varying Target Capabilities
 560
 561 Some targets don't support floating-point, and some would rather not
 562 have to deal with @code{long long} operations.  Also, different targets
 563 will have different stack sizes, and different bytecode buffer lengths.
 564
 565 Thus, GDB needs a way to ask the target about itself.  We haven't worked
 566 out the details yet, but in general, GDB should be able to send the
 567 target a packet asking it to describe itself.  The reply should be a
 568 packet whose length is explicit, so we can add new information to the
 569 packet in future revisions of the agent, without confusing old versions
 570 of GDB, and it should contain a version number.  It should contain at
 571 least the following information:
 572
 573 @itemize @bullet
 574
 575 @item
 576 whether floating point is supported
 577
 578 @item
 579 whether @code{long long} is supported
 580
 581 @item
 582 maximum acceptable size of bytecode stack
 583
 584 @item
 585 maximum acceptable length of bytecode expressions
 586
 587 @item
 588 which registers are actually available for collection
 589
 590 @item
 591 whether the target supports disabled tracepoints
 592
 593 @end itemize
 594
 595 @node Rationale
 596 @section Rationale
 597
 598 Some of the design decisions apparent above are arguable.
 599
 600 @table @b
 601
 602 @item What about stack overflow/underflow?
 603 GDB should be able to query the target to discover its stack size.
 604 Given that information, GDB can determine at translation time whether a
 605 given expression will overflow the stack.  But this spec isn't about
 606 what kinds of error-checking GDB ought to do.
 607
 608 @item Why are you doing everything in LONGEST?
 609
 610 Speed isn't important, but agent code size is; using LONGEST brings in a
 611 bunch of support code to do things like division, etc.  So this is a
 612 serious concern.
 613
 614 First, note that you don't need different bytecodes for different
 615 operand sizes.  You can generate code without @emph{knowing} how big the
 616 stack elements actually are on the target.  If the target only supports
 617 32-bit ints, and you don't send any 64-bit bytecodes, everything just
 618 works.  The observation here is that the MIPS and the Alpha have only
 619 fixed-size registers, and you can still get C's semantics even though
 620 most instructions only operate on full-sized words.  You just need to
 621 make sure everything is properly sign-extended at the right times.  So
 622 there is no need for 32- and 64-bit variants of the bytecodes.  Just
 623 implement everything using the largest size you support.
 624
 625 GDB should certainly check to see what sizes the target supports, so the
 626 user can get an error earlier, rather than later.  But this information
 627 is not necessary for correctness.
 628
 629
 630 @item Why don't you have @code{>} or @code{<=} operators?
 631 I want to keep the interpreter small, and we don't need them.  We can
 632 combine the @code{less_} opcodes with @code{log_not}, and swap the order
 633 of the operands, yielding all four asymmetrical comparison operators.
 634 For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y <
 635 x)}.
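
The identities involved are easy to check in C.  In the sketch below,
@code{less_signed} and @code{log_not} stand in for the two bytecodes, and
the four @code{cmp_} functions (names invented for this example) are built
from them alone.

```c
#include <stdint.h>

/* Stand-ins for the two primitives the bytecode language provides.  */
static uint64_t less_signed(int64_t a, int64_t b) { return a < b; }
static uint64_t log_not(uint64_t a) { return a == 0; }

/* The four asymmetrical comparisons, built only from the primitives.  */
uint64_t cmp_lt(int64_t x, int64_t y) { return less_signed(x, y); }
uint64_t cmp_gt(int64_t x, int64_t y) { return less_signed(y, x); }           /* operands swapped */
uint64_t cmp_le(int64_t x, int64_t y) { return log_not(less_signed(y, x)); }  /* !(y < x) */
uint64_t cmp_ge(int64_t x, int64_t y) { return log_not(less_signed(x, y)); }  /* !(x < y) */
```

In bytecode terms, with @var{x} and @var{y} already on the stack,
@code{x <= y} could be emitted as @code{swap less_signed log_not}.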
 636
 637 @item Why do you have @code{log_not}?
 638 @itemx Why do you have @code{ext}?
 639 @itemx Why do you have @code{zero_ext}?
 640 These are all easily synthesized from other instructions, but I expect
 641 them to be used frequently, and they're simple, so I include them to
 642 keep bytecode strings short.
 643
 644 @code{log_not} is equivalent to @code{const8 0 equal}; it's used in half
 645 the relational operators.
 646
 647 @code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8
 648 @var{s-n} rsh_signed}, where @var{s} is the size of the stack elements;
  649 it follows @code{ref@var{m}} and @code{reg} bytecodes when the value
  650 should be signed.  See the next item.
 651
 652 @code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask}
  653 bit_and}; it's used whenever we push the value of a register, because we
 654 can't assume the upper bits of the register aren't garbage.
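
As a sanity check on these equivalences, the C sketch below builds sign
extension from a pair of shifts and zero extension from a mask and a
bitwise AND.  It assumes 64-bit stack elements (@var{s} = 64) and
restricts @var{n} to the range 1 to 63; the function names are
hypothetical, and the arithmetic right shift is written out by hand
because C leaves right shifts of negative values implementation-defined.

```c
#include <stdint.h>

/* lsh: a b => a<<b, for b in 0..63.  */
static uint64_t lsh(uint64_t a, unsigned b) { return a << b; }

/* rsh_signed: shift right by b (0..63), filling with copies of the top bit.  */
static uint64_t rsh_signed(uint64_t a, unsigned b)
{
    uint64_t r = a >> b;

    if (b > 0 && (a >> 63) != 0)
        r |= ~(UINT64_MAX >> b);   /* set the b vacated high bits */
    return r;
}

/* ext n as "const8 s-n lsh const8 s-n rsh_signed", with s = 64.  */
uint64_t ext_via_shifts(uint64_t a, unsigned n)
{
    unsigned k = 64 - n;

    return rsh_signed(lsh(a, k), k);
}

/* zero_ext n as a constant mask followed by a bitwise AND.  */
uint64_t zero_ext_via_mask(uint64_t a, unsigned n)
{
    uint64_t mask = ((uint64_t)1 << n) - 1;   /* low n bits set */

    return a & mask;
}
```

Both forms take several bytecodes and two or more operand bytes, which is
the cost the dedicated @code{ext} and @code{zero_ext} opcodes avoid.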
 655
 656 @item Why not have sign-extending variants of the @code{ref} operators?
 657 Because that would double the number of @code{ref} operators, and we
 658 need the @code{ext} bytecode anyway for accessing bitfields.
 659
 660 @item Why not have constant-address variants of the @code{ref} operators?
 661 Because that would double the number of @code{ref} operators again, and
 662 @code{const32 @var{address} ref32} is only one byte longer.
 663
 664 @item Why do the @code{ref@var{n}} operators have to support unaligned fetches?
 665 GDB will generate bytecode that fetches multi-byte values at unaligned
 666 addresses whenever the executable's debugging information tells it to.
 667 Furthermore, GDB does not know the value the pointer will have when GDB
 668 generates the bytecode, so it cannot determine whether a particular
 669 fetch will be aligned or not.
 670
 671 In particular, structure bitfields may be several bytes long, but follow
 672 no alignment rules; members of packed structures are not necessarily
 673 aligned either.
 674
 675 In general, there are many cases where unaligned references occur in
 676 correct C code, either at the programmer's explicit request, or at the
 677 compiler's discretion.  Thus, it is simpler to make the GDB agent
 678 bytecodes work correctly in all circumstances than to make GDB guess in
 679 each case whether the compiler did the usual thing.
 680
 681 @item Why are there no side-effecting operators?
 682 Because our current client doesn't want them?  That's a cheap answer.  I
 683 think the real answer is that I'm afraid of implementing function
 684 calls.  We should re-visit this issue after the present contract is
 685 delivered.
 686
 687 @item Why aren't the @code{goto} ops PC-relative?
 688 The interpreter has the base address around anyway for PC bounds
 689 checking, and it seemed simpler.
 690
 691 @item Why is there only one offset size for the @code{goto} ops?
 692 Offsets are currently sixteen bits.  I'm not happy with this situation
 693 either:
 694
 695 Suppose we have multiple branch ops with different offset sizes.  As I
 696 generate code left-to-right, all my jumps are forward jumps (there are
 697 no loops in expressions), so I never know the target when I emit the
 698 jump opcode.  Thus, I have to either always assume the largest offset
 699 size, or do jump relaxation on the code after I generate it, which seems
 700 like a big waste of time.
 701
 702 I can imagine a reasonable expression being longer than 256 bytes.  I
 703 can't imagine one being longer than 64k.  Thus, we need 16-bit offsets.
 704 This kind of reasoning is so bogus, but relaxation is pathetic.
 705
 706 The other approach would be to generate code right-to-left.  Then I'd
 707 always know my offset size.  That might be fun.
 708
 709 @item Where is the function call bytecode?
 710
 711 When we add side-effects, we should add this.
 712
 713 @item Why does the @code{reg} bytecode take a 16-bit register number?
 714
 715 Intel's IA-64 architecture has 128 general-purpose registers,
 716 and 128 floating-point registers, and I'm sure it has some random
 717 control registers.
 718
 719 @item Why do we need @code{trace} and @code{trace_quick}?
 720 Because GDB needs to record all the memory contents and registers an
 721 expression touches.  If the user wants to evaluate an expression
 722 @code{x->y->z}, the agent must record the values of @code{x} and
 723 @code{x->y} as well as the value of @code{x->y->z}.
 724
 725 @item Don't the @code{trace} bytecodes make the interpreter less general?
 726 They do mean that the interpreter contains special-purpose code, but
 727 that doesn't mean the interpreter can only be used for that purpose.  If
 728 an expression doesn't use the @code{trace} bytecodes, they don't get in
 729 its way.
 730
 731 @item Why doesn't @code{trace_quick} consume its arguments the way everything else does?
 732 In general, you do want your operators to consume their arguments; it's
 733 consistent, and generally reduces the amount of stack rearrangement
 734 necessary.  However, @code{trace_quick} is a kludge to save space; it
 735 only exists so we needn't write @code{dup const8 @var{SIZE} trace}
 736 before every memory reference.  Therefore, it's okay for it not to
 737 consume its arguments; it's meant for a specific context in which we
 738 know exactly what it should do with the stack.  If we're going to have a
 739 kludge, it should be an effective kludge.
 740
 741 @item Why does @code{trace16} exist?
 742 That opcode was added by the customer that contracted Cygnus for the
 743 data tracing work.  I personally think it is unnecessary; objects that
 744 large will be quite rare, so it is okay to use @code{dup const16
 745 @var{size} trace} in those cases.
 746
 747 Whatever we decide to do with @code{trace16}, we should at least leave
 748 opcode 0x30 reserved, to remain compatible with the customer who added
 749 it.
 750
 751 @end table