From: Jim Blandy Date: Tue, 28 Mar 2006 19:19:16 +0000 (+0000) Subject: src/gdb/ChangeLog: X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=7d30c22d4c26dfe28d06602bee825be609a36858;p=binutils-gdb.git src/gdb/ChangeLog: 2006-03-28 Jim Blandy * prologue-value.c, prologue-value.h: New files. * Makefile.in (prologue_value_h): New variable. (HFILES_NO_SRCDIR): List prologue-value.h. (SFILES): List prologue-value.c. (COMMON_OBS): List prologue-value.o. (prologue-value.o): New rule. src/gdb/doc/ChangeLog: 2006-03-28 Jim Blandy * gdbint.texinfo (Prologue Analysis): New section. --- diff --git a/gdb/ChangeLog b/gdb/ChangeLog index d88dce2e56b..b0d3a307b72 100644 --- a/gdb/ChangeLog +++ b/gdb/ChangeLog @@ -1,3 +1,12 @@ +2006-03-28 Jim Blandy + + * prologue-value.c, prologue-value.h: New files. + * Makefile.in (prologue_value_h): New variable. + (HFILES_NO_SRCDIR): List prologue-value.h. + (SFILES): List prologue-value.c. + (COMMON_OBS): List prologue-value.o. + (prologue-value.o): New rule. 
+ 2006-03-27 Michael Snyder * xstormy16-tdep.c (xstormy16_return_value, xstormy16_push_dummy_call, diff --git a/gdb/Makefile.in b/gdb/Makefile.in index 1ad9738eef5..ba418186e46 100644 --- a/gdb/Makefile.in +++ b/gdb/Makefile.in @@ -542,6 +542,7 @@ SFILES = ada-exp.y ada-lang.c ada-typeprint.c ada-valprint.c \ objc-exp.y objc-lang.c \ objfiles.c osabi.c observer.c \ p-exp.y p-lang.c p-typeprint.c p-valprint.c parse.c printcmd.c \ + prologue-value.c \ regcache.c reggroups.c remote.c remote-fileio.c \ scm-exp.c scm-lang.c scm-valprint.c \ sentinel-frame.c \ @@ -757,6 +758,7 @@ ppcnbsd_tdep_h = ppcnbsd-tdep.h ppcobsd_tdep_h = ppcobsd-tdep.h ppc_tdep_h = ppc-tdep.h proc_utils_h = proc-utils.h +prologue_value_h = prologue-value.h regcache_h = regcache.h reggroups_h = reggroups.h regset_h = regset.h @@ -867,6 +869,7 @@ HFILES_NO_SRCDIR = bcache.h buildsym.h call-cmds.h coff-solib.h defs.h \ symfile.h stabsread.h target.h terminal.h typeprint.h \ xcoffsolib.h \ macrotab.h macroexp.h macroscope.h \ + prologue-value.h \ ada-lang.h c-lang.h f-lang.h \ jv-lang.h \ m2-lang.h p-lang.h \ @@ -2437,6 +2440,8 @@ procfs.o: procfs.c $(defs_h) $(inferior_h) $(target_h) $(gdbcore_h) \ proc-service.o: proc-service.c $(defs_h) $(gdb_proc_service_h) $(inferior_h) \ $(symtab_h) $(target_h) $(gregset_h) proc-why.o: proc-why.c $(defs_h) $(proc_utils_h) +prologue-value.o: prologue-value.c $(defs_h) $(gdb_string_h) $(gdb_assert_h) \ + $(prologue_value_h) $(regcache_h) p-typeprint.o: p-typeprint.c $(defs_h) $(gdb_obstack_h) $(bfd_h) $(symtab_h) \ $(gdbtypes_h) $(expression_h) $(value_h) $(gdbcore_h) $(target_h) \ $(language_h) $(p_lang_h) $(typeprint_h) $(gdb_string_h) diff --git a/gdb/doc/ChangeLog b/gdb/doc/ChangeLog index 6345276ed4c..5af047593d9 100644 --- a/gdb/doc/ChangeLog +++ b/gdb/doc/ChangeLog @@ -1,3 +1,7 @@ +2006-03-28 Jim Blandy + + * gdbint.texinfo (Prologue Analysis): New section. + 2006-03-07 Jim Blandy * gdb.texinfo (Connecting): Document 'target remote pipe'. 
diff --git a/gdb/doc/gdbint.texinfo b/gdb/doc/gdbint.texinfo index 8389f8fe9a8..e14aa2cf7f8 100644 --- a/gdb/doc/gdbint.texinfo +++ b/gdb/doc/gdbint.texinfo @@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame struct, and then @code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and @code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame. +@section Prologue Analysis + +@cindex prologue analysis +@cindex call frame information +@cindex CFI (call frame information) +To produce a backtrace and allow the user to manipulate older frames' +variables and arguments, @value{GDBN} needs to find the base addresses +of older frames, and discover where those frames' registers have been +saved. Since a frame's ``callee-saves'' registers get saved by +younger frames if and when they're reused, a frame's registers may be +scattered unpredictably across younger frames. This means that +changing the value of a register-allocated variable in an older frame +may actually entail writing to a save slot in some younger frame. + +Modern versions of GCC emit DWARF call frame information (``CFI''), +which describes how to find frame base addresses and saved registers. +But CFI is not always available, so as a fallback @value{GDBN} uses a +technique called @dfn{prologue analysis} to find frame sizes and saved +registers. A prologue analyzer disassembles the function's machine +code starting from its entry point, and looks for instructions that +allocate frame space, save the stack pointer in a frame pointer +register, save registers, and so on. Obviously, this can't be done +accurately in general, but it's tractable to do well enough to be very +helpful. Prologue analysis predates the GNU toolchain's support for +CFI; at one time, prologue analysis was the only mechanism +@value{GDBN} used for stack unwinding at all, when the function +calling conventions didn't specify a fixed frame layout. 
+ +In the olden days, function prologues were generated by hand-written, +target-specific code in GCC, and treated as opaque and untouchable by +optimizers. Looking at this code, it was usually straightforward to +write a prologue analyzer for @value{GDBN} that would accurately +understand all the prologues GCC would generate. However, over time +GCC became more aggressive about instruction scheduling, and began to +understand more about the semantics of the prologue instructions +themselves; in response, @value{GDBN}'s analyzers became more complex +and fragile. Keeping the prologue analyzers working as GCC (and the +instruction sets themselves) evolved became a substantial task. + +@cindex @file{prologue-value.c} +@cindex abstract interpretation of function prologues +@cindex pseudo-evaluation of function prologues +To try to address this problem, the code in @file{prologue-value.h} +and @file{prologue-value.c} provides a general framework for writing +prologue analyzers that are simpler and more robust than ad-hoc +analyzers. When we analyze a prologue using the prologue-value +framework, we're really doing ``abstract interpretation'' or +``pseudo-evaluation'': running the function's code in simulation, but +using conservative approximations of the values registers and memory +would hold when the code actually runs. For example, if our function +starts with the instruction: + +@example +addi r1, 42 # add 42 to r1 +@end example +@noindent +we don't know exactly what value will be in @code{r1} after executing +this instruction, but we do know it'll be 42 greater than its original +value. + +If we then see an instruction like: + +@example +addi r1, 22 # add 22 to r1 +@end example +@noindent +we still don't know what @code{r1}'s value is, but again, we can say +it is now 64 greater than its original value. 
+ +If the next instruction were: + +@example +mov r2, r1 # set r2 to r1's value +@end example +@noindent +then we can say that @code{r2}'s value is now the original value of +@code{r1} plus 64. + +It's common for prologues to save registers on the stack, so we'll +need to track the values of stack frame slots, as well as the +registers. So after an instruction like this: + +@example +mov (fp+4), r2 +@end example +@noindent +we'd know that the stack slot four bytes above the frame pointer +holds the original value of @code{r1} plus 64. + +And so on. + +Of course, this can only go so far before it gets unreasonable. If we +wanted to be able to say anything about the value of @code{r1} after +the instruction: + +@example +xor r1, r3 # exclusive-or r1 and r3, place result in r1 +@end example +@noindent +then things would get pretty complex. But remember, we're just doing +a conservative approximation; if exclusive-or instructions aren't +relevant to prologues, we can just say @code{r1}'s value is now +``unknown''. We can ignore things that are too complex, if that loss of +information is acceptable for our application. + +So when we say ``conservative approximation'' here, what we mean is an +approximation that is either accurate, or marked ``unknown'', but +never inaccurate. + +Using this framework, a prologue analyzer is simply an interpreter for +machine code, but one that uses conservative approximations for the +contents of registers and memory instead of actual values. Starting +from the function's entry point, you simulate instructions up to the +current PC, or an instruction that you don't know how to simulate. +Now you can examine the state of the registers and stack slots you've +kept track of. + +@itemize @bullet + +@item +To see how large your stack frame is, just check the value of the +stack pointer register; if it's the original value of the SP +minus a constant, then that constant is the stack frame's size. 
+If the SP's value has been marked as ``unknown'', then that means +the prologue has done something too complex for us to track, and +we don't know the frame size. + +@item +To see where we've saved the previous frame's registers, we just +search the values we've tracked --- stack slots, usually, but +registers, too, if you want --- for something equal to the register's +original value. If the calling conventions suggest a standard place +to save a given register, then we can check there first, but really, +anything that will get us back the original value will probably work. +@end itemize + +This does take some work. But prologue analyzers aren't +quick-and-simple pattern matching to recognize a few fixed prologue +forms any more; they're big, hairy functions. Along with inferior +function calls, prologue analysis accounts for a substantial portion +of the time needed to stabilize a @value{GDBN} port. So it's +worthwhile to look for an approach that will be easier to understand +and maintain. In the approach described above: + +@itemize @bullet + +@item +It's easier to see that the analyzer is correct: you just see +whether the analyzer properly (albeit conservatively) simulates +the effect of each instruction. + +@item +It's easier to extend the analyzer: you can add support for new +instructions, and know that you haven't broken anything that +wasn't already broken before. + +@item +It's orthogonal: to gather new information, you don't need to +complicate the code for each instruction. As long as your domain +of conservative values is already detailed enough to tell you +what you need, then all the existing instruction simulations are +already gathering the right data for you. + +@end itemize + +The file @file{prologue-value.h} contains detailed comments explaining +the framework and how to use it. 
+ + @section Breakpoint Handling @cindex breakpoints diff --git a/gdb/prologue-value.c b/gdb/prologue-value.c new file mode 100644 index 00000000000..4ad4d6c828f --- /dev/null +++ b/gdb/prologue-value.c @@ -0,0 +1,591 @@ +/* Prologue value handling for GDB. + Copyright 2003, 2004, 2005 Free Software Foundation, Inc. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to: + + Free Software Foundation, Inc. + 51 Franklin St - Fifth Floor + Boston, MA 02110-1301 + USA */ + +#include "defs.h" +#include "gdb_string.h" +#include "gdb_assert.h" +#include "prologue-value.h" +#include "regcache.h" + + +/* Constructors. */ + +pv_t +pv_unknown (void) +{ + pv_t v = { pvk_unknown, 0, 0 }; + + return v; +} + + +pv_t +pv_constant (CORE_ADDR k) +{ + pv_t v; + + v.kind = pvk_constant; + v.reg = -1; /* for debugging */ + v.k = k; + + return v; +} + + +pv_t +pv_register (int reg, CORE_ADDR k) +{ + pv_t v; + + v.kind = pvk_register; + v.reg = reg; + v.k = k; + + return v; +} + + + +/* Arithmetic operations. */ + +/* If one of *A and *B is a constant, and the other isn't, swap the + values as necessary to ensure that *B is the constant. This can + reduce the number of cases we need to analyze in the functions + below. 
*/ +static void +constant_last (pv_t *a, pv_t *b) +{ + if (a->kind == pvk_constant + && b->kind != pvk_constant) + { + pv_t temp = *a; + *a = *b; + *b = temp; + } +} + + +pv_t +pv_add (pv_t a, pv_t b) +{ + constant_last (&a, &b); + + /* We can add a constant to a register. */ + if (a.kind == pvk_register + && b.kind == pvk_constant) + return pv_register (a.reg, a.k + b.k); + + /* We can add a constant to another constant. */ + else if (a.kind == pvk_constant + && b.kind == pvk_constant) + return pv_constant (a.k + b.k); + + /* Anything else we don't know how to add. We don't have a + representation for, say, the sum of two registers, or a multiple + of a register's value (adding a register to itself). */ + else + return pv_unknown (); +} + + +pv_t +pv_add_constant (pv_t v, CORE_ADDR k) +{ + /* Rather than thinking of all the cases we can and can't handle, + we'll just let pv_add take care of that for us. */ + return pv_add (v, pv_constant (k)); +} + + +pv_t +pv_subtract (pv_t a, pv_t b) +{ + /* This isn't quite the same as negating B and adding it to A, since + we don't have a representation for the negation of anything but a + constant. For example, we can't negate { pvk_register, R1, 10 }, + but we do know that { pvk_register, R1, 10 } minus { pvk_register, + R1, 5 } is { pvk_constant, <ignored>, 5 }. + + This means, for example, that we could subtract two stack + addresses; they're both relative to the original SP. Since the + frame pointer is set based on the SP, its value will be the + original SP plus some constant (probably zero), so we can use its + value just fine, too. */ + + /* No constant_last here: A - B is not B - A. */ + + /* We can subtract two constants. */ + if (a.kind == pvk_constant + && b.kind == pvk_constant) + return pv_constant (a.k - b.k); + + /* We can subtract a constant from a register. */ + else if (a.kind == pvk_register + && b.kind == pvk_constant) + return pv_register (a.reg, a.k - b.k); + + /* We can subtract a register from itself, yielding a constant. 
*/ + else if (a.kind == pvk_register + && b.kind == pvk_register + && a.reg == b.reg) + return pv_constant (a.k - b.k); + + /* We don't know how to subtract anything else. */ + else + return pv_unknown (); +} + + +pv_t +pv_logical_and (pv_t a, pv_t b) +{ + constant_last (&a, &b); + + /* We can 'and' two constants. */ + if (a.kind == pvk_constant + && b.kind == pvk_constant) + return pv_constant (a.k & b.k); + + /* We can 'and' anything with the constant zero. */ + else if (b.kind == pvk_constant + && b.k == 0) + return pv_constant (0); + + /* We can 'and' anything with ~0. */ + else if (b.kind == pvk_constant + && b.k == ~ (CORE_ADDR) 0) + return a; + + /* We can 'and' a register with itself. */ + else if (a.kind == pvk_register + && b.kind == pvk_register + && a.reg == b.reg + && a.k == b.k) + return a; + + /* Otherwise, we don't know. */ + else + return pv_unknown (); +} + + + +/* Examining prologue values. */ + +int +pv_is_identical (pv_t a, pv_t b) +{ + if (a.kind != b.kind) + return 0; + + switch (a.kind) + { + case pvk_unknown: + return 1; + case pvk_constant: + return (a.k == b.k); + case pvk_register: + return (a.reg == b.reg && a.k == b.k); + default: + gdb_assert (0); + } +} + + +int +pv_is_constant (pv_t a) +{ + return (a.kind == pvk_constant); +} + + +int +pv_is_register (pv_t a, int r) +{ + return (a.kind == pvk_register + && a.reg == r); +} + + +int +pv_is_register_k (pv_t a, int r, CORE_ADDR k) +{ + return (a.kind == pvk_register + && a.reg == r + && a.k == k); +} + + +enum pv_boolean +pv_is_array_ref (pv_t addr, CORE_ADDR size, + pv_t array_addr, CORE_ADDR array_len, + CORE_ADDR elt_size, + int *i) +{ + /* Note that, since .k is a CORE_ADDR, and CORE_ADDR is unsigned, if + addr is *before* the start of the array, then this isn't going to + be negative... */ + pv_t offset = pv_subtract (addr, array_addr); + + if (offset.kind == pvk_constant) + { + /* This is a rather odd test. 
We want to know if the SIZE bytes + at ADDR don't overlap the array at all, so you'd expect it to + be an || expression: "if we're completely before || we're + completely after". But with unsigned arithmetic, things are + different: since it's a number circle, not a number line, the + right values for offset.k are actually one contiguous range. */ + if (offset.k <= -size + && offset.k >= array_len * elt_size) + return pv_definite_no; + else if (offset.k % elt_size != 0 + || size != elt_size) + return pv_maybe; + else + { + *i = offset.k / elt_size; + return pv_definite_yes; + } + } + else + return pv_maybe; +} + + + +/* Areas. */ + + +/* A particular value known to be stored in an area. + + Entries form a ring, sorted by unsigned offset from the area's base + register's value. Since entries can straddle the wrap-around point, + unsigned offsets form a circle, not a number line, so the list + itself is structured the same way --- there is no inherent head. + The entry with the lowest offset simply follows the entry with the + highest offset. Entries may abut, but never overlap. The area's + 'entry' pointer points to an arbitrary node in the ring. */ +struct area_entry +{ + /* Links in the doubly-linked ring. */ + struct area_entry *prev, *next; + + /* Offset of this entry's address from the value of the base + register. */ + CORE_ADDR offset; + + /* The size of this entry. Note that an entry may wrap around from + the end of the address space to the beginning. */ + CORE_ADDR size; + + /* The value stored here. */ + pv_t value; +}; + + +struct pv_area +{ + /* This area's base register. */ + int base_reg; + + /* The mask to apply to addresses, to make the wrap-around happen at + the right place. */ + CORE_ADDR addr_mask; + + /* An element of the doubly-linked ring of entries, or zero if we + have none. 
*/ + struct area_entry *entry; +}; + + +struct pv_area * +make_pv_area (int base_reg) +{ + struct pv_area *a = (struct pv_area *) xmalloc (sizeof (*a)); + + memset (a, 0, sizeof (*a)); + + a->base_reg = base_reg; + a->entry = 0; + + /* Remember that shift amounts equal to the type's width are + undefined. */ + a->addr_mask = ((((CORE_ADDR) 1 << (TARGET_ADDR_BIT - 1)) - 1) << 1) | 1; + + return a; +} + + +/* Delete all entries from AREA. */ +static void +clear_entries (struct pv_area *area) +{ + struct area_entry *e = area->entry; + + if (e) + { + /* This needs to be a do-while loop, in order to actually + process the node being checked for in the terminating + condition. */ + do + { + struct area_entry *next = e->next; + xfree (e); + e = next; + } + while (e != area->entry); + + area->entry = 0; + } +} + + +void +free_pv_area (struct pv_area *area) +{ + clear_entries (area); + xfree (area); +} + + +static void +do_free_pv_area_cleanup (void *arg) +{ + free_pv_area ((struct pv_area *) arg); +} + + +struct cleanup * +make_cleanup_free_pv_area (struct pv_area *area) +{ + return make_cleanup (do_free_pv_area_cleanup, (void *) area); +} + + +int +pv_area_store_would_trash (struct pv_area *area, pv_t addr) +{ + /* It may seem odd that pvk_constant appears here --- after all, + that's the case where we know the most about the address! But + pv_areas are always relative to a register, and we don't know the + value of the register, so we can't compare entry addresses to + constants. */ + return (addr.kind == pvk_unknown + || addr.kind == pvk_constant + || (addr.kind == pvk_register && addr.reg != area->base_reg)); +} + + +/* Return a pointer to the first entry we hit in AREA starting at + OFFSET and going forward. + + This may return zero, if AREA has no entries. + + And since the entries are a ring, this may return an entry that + entirely precedes OFFSET. This is the correct behavior: depending + on the sizes involved, we could still overlap such an area, with + wrap-around. 
*/ +static struct area_entry * +find_entry (struct pv_area *area, CORE_ADDR offset) +{ + struct area_entry *e = area->entry; + + if (! e) + return 0; + + /* If the next entry would be better than the current one, then scan + forward. Since we use '<' in this loop, it always terminates. + + Note that, even setting aside the addr_mask stuff, we must not + simplify this, in high school algebra fashion, to + (e->next->offset < e->offset), because of the way < interacts + with wrap-around. We have to subtract offset from both sides to + make sure both things we're comparing are on the same side of the + discontinuity. */ + while (((e->next->offset - offset) & area->addr_mask) + < ((e->offset - offset) & area->addr_mask)) + e = e->next; + + /* If the previous entry would be better than the current one, then + scan backwards. */ + while (((e->prev->offset - offset) & area->addr_mask) + < ((e->offset - offset) & area->addr_mask)) + e = e->prev; + + /* In case there's some locality to the searches, set the area's + pointer to the entry we've found. */ + area->entry = e; + + return e; +} + + +/* Return non-zero if the SIZE bytes at OFFSET would overlap ENTRY; + return zero otherwise. AREA is the area to which ENTRY belongs. */ +static int +overlaps (struct pv_area *area, + struct area_entry *entry, + CORE_ADDR offset, + CORE_ADDR size) +{ + /* Think carefully about wrap-around before simplifying this. */ + return (((entry->offset - offset) & area->addr_mask) < size + || ((offset - entry->offset) & area->addr_mask) < entry->size); +} + + +void +pv_area_store (struct pv_area *area, + pv_t addr, + CORE_ADDR size, + pv_t value) +{ + /* Remove any (potentially) overlapping entries. */ + if (pv_area_store_would_trash (area, addr)) + clear_entries (area); + else + { + CORE_ADDR offset = addr.k; + struct area_entry *e = find_entry (area, offset); + + /* Delete all entries that we would overlap. 
*/ + while (e && overlaps (area, e, offset, size)) + { + struct area_entry *next = (e->next == e) ? 0 : e->next; + e->prev->next = e->next; + e->next->prev = e->prev; + + xfree (e); + e = next; + } + + /* Move the area's pointer to the next remaining entry. This + will also zero the pointer if we've deleted all the entries. */ + area->entry = e; + } + + /* Now, there are no entries overlapping us, and area->entry is + either zero or pointing at the closest entry after us. We can + just insert ourselves before that. + + But if we're storing an unknown value, don't bother --- that's + the default. */ + if (value.kind == pvk_unknown) + return; + else + { + CORE_ADDR offset = addr.k; + struct area_entry *e = (struct area_entry *) xmalloc (sizeof (*e)); + e->offset = offset; + e->size = size; + e->value = value; + + if (area->entry) + { + e->prev = area->entry->prev; + e->next = area->entry; + e->prev->next = e->next->prev = e; + } + else + { + e->prev = e->next = e; + area->entry = e; + } + } +} + + +pv_t +pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size) +{ + /* If we have no entries, or we can't decide how ADDR relates to the + entries we do have, then the value is unknown. */ + if (! area->entry + || pv_area_store_would_trash (area, addr)) + return pv_unknown (); + else + { + CORE_ADDR offset = addr.k; + struct area_entry *e = find_entry (area, offset); + + /* If this entry exactly matches what we're looking for, then + we're set. Otherwise, say it's unknown. 
*/ + if (e->offset == offset && e->size == size) + return e->value; + else + return pv_unknown (); + } +} + + +int +pv_area_find_reg (struct pv_area *area, + struct gdbarch *gdbarch, + int reg, + CORE_ADDR *offset_p) +{ + struct area_entry *e = area->entry; + + if (e) + do + { + if (e->value.kind == pvk_register + && e->value.reg == reg + && e->value.k == 0 + && e->size == register_size (gdbarch, reg)) + { + if (offset_p) + *offset_p = e->offset; + return 1; + } + + e = e->next; + } + while (e != area->entry); + + return 0; +} + + +void +pv_area_scan (struct pv_area *area, + void (*func) (void *closure, + pv_t addr, + CORE_ADDR size, + pv_t value), + void *closure) +{ + struct area_entry *e = area->entry; + pv_t addr; + + addr.kind = pvk_register; + addr.reg = area->base_reg; + + if (e) + do + { + addr.k = e->offset; + func (closure, addr, e->size, e->value); + e = e->next; + } + while (e != area->entry); +} diff --git a/gdb/prologue-value.h b/gdb/prologue-value.h new file mode 100644 index 00000000000..ec44cad021f --- /dev/null +++ b/gdb/prologue-value.h @@ -0,0 +1,302 @@ +/* Interface to prologue value handling for GDB. + Copyright 2003, 2004, 2005 Free Software Foundation, Inc. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to: + + Free Software Foundation, Inc. 
+ 51 Franklin St - Fifth Floor + Boston, MA 02110-1301 + USA */ + +#ifndef PROLOGUE_VALUE_H +#define PROLOGUE_VALUE_H + +/* When we analyze a prologue, we're really doing 'abstract + interpretation' or 'pseudo-evaluation': running the function's code + in simulation, but using conservative approximations of the values + it would have when it actually runs. For example, if our function + starts with the instruction: + + addi r1, 42 # add 42 to r1 + + we don't know exactly what value will be in r1 after executing this + instruction, but we do know it'll be 42 greater than its original + value. + + If we then see an instruction like: + + addi r1, 22 # add 22 to r1 + + we still don't know what r1's value is, but again, we can say it is + now 64 greater than its original value. + + If the next instruction were: + + mov r2, r1 # set r2 to r1's value + + then we can say that r2's value is now the original value of r1 + plus 64. + + It's common for prologues to save registers on the stack, so we'll + need to track the values of stack frame slots, as well as the + registers. So after an instruction like this: + + mov (fp+4), r2 + + then we'd know that the stack slot four bytes above the frame + pointer holds the original value of r1 plus 64. + + And so on. + + Of course, this can only go so far before it gets unreasonable. If + we wanted to be able to say anything about the value of r1 after + the instruction: + + xor r1, r3 # exclusive-or r1 and r3, place result in r1 + + then things would get pretty complex. But remember, we're just + doing a conservative approximation; if exclusive-or instructions + aren't relevant to prologues, we can just say r1's value is now + 'unknown'. We can ignore things that are too complex, if that loss + of information is acceptable for our application. + + So when I say "conservative approximation" here, what I mean is an + approximation that is either accurate, or marked "unknown", but + never inaccurate. 
+ + Once you've reached the current PC, or an instruction that you + don't know how to simulate, you stop. Now you can examine the + state of the registers and stack slots you've kept track of. + + - To see how large your stack frame is, just check the value of the + stack pointer register; if it's the original value of the SP + minus a constant, then that constant is the stack frame's size. + If the SP's value has been marked as 'unknown', then that means + the prologue has done something too complex for us to track, and + we don't know the frame size. + + - To see where we've saved the previous frame's registers, we just + search the values we've tracked --- stack slots, usually, but + registers, too, if you want --- for something equal to the + register's original value. If the ABI suggests a standard place + to save a given register, then we can check there first, but + really, anything that will get us back the original value will + probably work. + + Sure, this takes some work. But prologue analyzers aren't + quick-and-simple pattern matching to recognize a few fixed prologue + forms any more; they're big, hairy functions. Along with inferior + function calls, prologue analysis accounts for a substantial + portion of the time needed to stabilize a GDB port. So I think + it's worthwhile to look for an approach that will be easier to + understand and maintain. In the approach used here: + + - It's easier to see that the analyzer is correct: you just see + whether the analyzer properly (albeit conservatively) simulates + the effect of each instruction. + + - It's easier to extend the analyzer: you can add support for new + instructions, and know that you haven't broken anything that + wasn't already broken before. + + - It's orthogonal: to gather new information, you don't need to + complicate the code for each instruction. 
As long as your domain + of conservative values is already detailed enough to tell you + what you need, then all the existing instruction simulations are + already gathering the right data for you. + + A 'struct prologue_value' is a conservative approximation of the + real value the register or stack slot will have. */ + +struct prologue_value { + + /* What sort of value is this? This determines the interpretation + of subsequent fields. */ + enum { + + /* We don't know anything about the value. This is also used for + values we could have kept track of, when doing so would have + been too complex and we don't want to bother. The bottom of + our lattice. */ + pvk_unknown, + + /* A known constant. K is its value. */ + pvk_constant, + + /* The value that register REG originally had *UPON ENTRY TO THE + FUNCTION*, plus K. If K is zero, this means, obviously, just + the value REG had upon entry to the function. REG is a GDB + register number. Before we start interpreting, we initialize + every register R to { pvk_register, R, 0 }. */ + pvk_register, + + } kind; + + /* The meanings of the following fields depend on 'kind'; see the + comments for the specific 'kind' values. */ + int reg; + CORE_ADDR k; +}; + +typedef struct prologue_value pv_t; + + +/* Return the unknown prologue value --- { pvk_unknown, ?, ? }. */ +pv_t pv_unknown (void); + +/* Return the prologue value representing the constant K. */ +pv_t pv_constant (CORE_ADDR k); + +/* Return the prologue value representing the original value of + register REG, plus the constant K. */ +pv_t pv_register (int reg, CORE_ADDR k); + + +/* Return conservative approximations of the results of the following + operations. */ +pv_t pv_add (pv_t a, pv_t b); /* a + b */ +pv_t pv_add_constant (pv_t v, CORE_ADDR k); /* a + k */ +pv_t pv_subtract (pv_t a, pv_t b); /* a - b */ +pv_t pv_logical_and (pv_t a, pv_t b); /* a & b */ + + +/* Return non-zero iff A and B are identical expressions. 
+ + This is not the same as asking if the two values are equal; the + result of such a comparison would have to be a pv_boolean, and + asking whether two 'unknown' values were equal would give you + pv_maybe. Same for comparing, say, { pvk_register, R1, 0 } and { + pvk_register, R2, 0}. + + Instead, this function asks whether the two representations are the + same. */ +int pv_is_identical (pv_t a, pv_t b); + + +/* Return non-zero if A is known to be a constant. */ +int pv_is_constant (pv_t a); + +/* Return non-zero if A is the original value of register number R + plus some constant, zero otherwise. */ +int pv_is_register (pv_t a, int r); + + +/* Return non-zero if A is the original value of register R plus the + constant K. */ +int pv_is_register_k (pv_t a, int r, CORE_ADDR k); + +/* A conservative boolean type, including "maybe", when we can't + figure out whether something is true or not. */ +enum pv_boolean { + pv_maybe, + pv_definite_yes, + pv_definite_no, +}; + + +/* Decide whether a reference to SIZE bytes at ADDR refers exactly to + an element of an array. The array starts at ARRAY_ADDR, and has + ARRAY_LEN values of ELT_SIZE bytes each. If ADDR definitely does + refer to an array element, set *I to the index of the referenced + element in the array, and return pv_definite_yes. If it definitely + doesn't, return pv_definite_no. If we can't tell, return pv_maybe. + + If the reference does touch the array, but doesn't fall exactly on + an element boundary, or doesn't refer to the whole element, return + pv_maybe. */ +enum pv_boolean pv_is_array_ref (pv_t addr, CORE_ADDR size, + pv_t array_addr, CORE_ADDR array_len, + CORE_ADDR elt_size, + int *i); + + +/* A 'struct pv_area' keeps track of values stored in a particular + region of memory. */ +struct pv_area; + +/* Create a new area, tracking stores relative to the original value + of BASE_REG. 
If BASE_REG is SP, then this effectively records the + contents of the stack frame: the original value of the SP is the + frame's CFA, or some constant offset from it. + + Stores to constant addresses, unknown addresses, or to addresses + relative to registers other than BASE_REG will trash this area; see + pv_area_store_would_trash. */ +struct pv_area *make_pv_area (int base_reg); + +/* Free AREA. */ +void free_pv_area (struct pv_area *area); + + +/* Register a cleanup to free AREA. */ +struct cleanup *make_cleanup_free_pv_area (struct pv_area *area); + + +/* Store the SIZE-byte value VALUE at ADDR in AREA. + + If ADDR is not relative to the same base register we used in + creating AREA, then we can't tell which values here the stored + value might overlap, and we'll have to mark everything as + unknown. */ +void pv_area_store (struct pv_area *area, + pv_t addr, + CORE_ADDR size, + pv_t value); + +/* Return the SIZE-byte value at ADDR in AREA. This may return + pv_unknown (). */ +pv_t pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size); + +/* Return true if storing to address ADDR in AREA would force us to + mark the contents of the entire area as unknown. This could happen + if, say, ADDR is unknown, since we could be storing anywhere. Or, + it could happen if ADDR is relative to a different register than + the other stores' base register, since we don't know the relative + values of the two registers. + + If you've reached such a store, it may be better to simply stop the + prologue analysis, and return the information you've gathered, + instead of losing all that information, most of which is probably + okay. */ +int pv_area_store_would_trash (struct pv_area *area, pv_t addr); + + +/* Search AREA for the original value of REGISTER. If we can't find + it, return zero; if we can find it, return a non-zero value, and if + OFFSET_P is non-zero, set *OFFSET_P to the register's offset within + AREA. 
GDBARCH is the architecture of which REGISTER is a member. + + In the worst case, this takes time proportional to the number of + items stored in AREA. If you plan to gather a lot of information + about registers saved in AREA, consider calling pv_area_scan + instead, and collecting all your information in one pass. */ +int pv_area_find_reg (struct pv_area *area, + struct gdbarch *gdbarch, + int register, + CORE_ADDR *offset_p); + + +/* For every part of AREA whose value we know, apply FUNC to CLOSURE, + the value's address, its size, and the value itself. */ +void pv_area_scan (struct pv_area *area, + void (*func) (void *closure, + pv_t addr, + CORE_ADDR size, + pv_t value), + void *closure); + + +#endif /* PROLOGUE_VALUE_H */
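
Illustration (not part of the patch): the worked example in the comments above (addi, addi, mov, xor) and the frame-size check can be sketched as a small standalone program. This is only a sketch under stated assumptions. The demo_* names, the register numbers, and the leading "sub sp, 16" instruction are all invented here for illustration; the code re-creates the idea of pv_t and pv_add with plain C types so that it compiles without any GDB headers, and it is not the code from prologue-value.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for pv_t and friends; this is NOT GDB code.  */
enum demo_kind { DEMO_UNKNOWN, DEMO_CONSTANT, DEMO_REGISTER };

struct demo_pv
{
  enum demo_kind kind;
  int reg;      /* Meaningful only for DEMO_REGISTER.  */
  uint64_t k;   /* The constant, or the offset from REG's entry value.  */
};

/* Register numbers are invented for this sketch.  */
enum { DEMO_R1, DEMO_R2, DEMO_R3, DEMO_SP, DEMO_NUM_REGS };

static struct demo_pv
demo_make (enum demo_kind kind, int reg, uint64_t k)
{
  struct demo_pv v;

  v.kind = kind;
  v.reg = reg;
  v.k = k;
  return v;
}

/* Conservative a + b, following the same case analysis as pv_add.  */
static struct demo_pv
demo_add (struct demo_pv a, struct demo_pv b)
{
  /* Addition commutes, so put a lone constant last.  */
  if (a.kind == DEMO_CONSTANT && b.kind != DEMO_CONSTANT)
    {
      struct demo_pv temp = a;
      a = b;
      b = temp;
    }

  if (a.kind == DEMO_REGISTER && b.kind == DEMO_CONSTANT)
    return demo_make (DEMO_REGISTER, a.reg, a.k + b.k);
  if (a.kind == DEMO_CONSTANT && b.kind == DEMO_CONSTANT)
    return demo_make (DEMO_CONSTANT, -1, a.k + b.k);
  return demo_make (DEMO_UNKNOWN, -1, 0);
}

/* Simulate: sub sp, 16; addi r1, 42; addi r1, 22; mov r2, r1; xor r1, r3.
   The "sub sp, 16" is an assumed instruction, added to show frame size.  */
static void
demo_simulate (struct demo_pv regs[DEMO_NUM_REGS])
{
  int i;

  assert (regs != 0);

  /* On entry, every register holds its own original value plus zero.  */
  for (i = 0; i < DEMO_NUM_REGS; i++)
    regs[i] = demo_make (DEMO_REGISTER, i, 0);

  regs[DEMO_SP] = demo_add (regs[DEMO_SP],
                            demo_make (DEMO_CONSTANT, -1, (uint64_t) 0 - 16));
  regs[DEMO_R1] = demo_add (regs[DEMO_R1], demo_make (DEMO_CONSTANT, -1, 42));
  regs[DEMO_R1] = demo_add (regs[DEMO_R1], demo_make (DEMO_CONSTANT, -1, 22));
  regs[DEMO_R2] = regs[DEMO_R1];                     /* mov r2, r1  */
  regs[DEMO_R1] = demo_make (DEMO_UNKNOWN, -1, 0);   /* xor: give up.  */
}

/* If SP is "original SP minus a constant", store that constant in *SIZE
   and return 1; otherwise return 0.  */
static int
demo_frame_size (const struct demo_pv regs[DEMO_NUM_REGS], uint64_t *size)
{
  if (regs[DEMO_SP].kind != DEMO_REGISTER || regs[DEMO_SP].reg != DEMO_SP)
    return 0;
  *size = (uint64_t) 0 - regs[DEMO_SP].k;
  return 1;
}
```

A caller would declare an array of DEMO_NUM_REGS values, pass it to demo_simulate, and then inspect it: r2 ends up as "original r1 plus 64", r1 is unknown after the xor, and demo_frame_size reports the 16 bytes that the assumed "sub sp, 16" allocated. The offsets are unsigned and wrap around, which is why the frame size is recovered by negating k, in the same spirit as the CORE_ADDR arithmetic in the patch.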