From 275ee935b336458aea19b63ce745cacee9b68d12 Mon Sep 17 00:00:00 2001 From: Andrew Burgess Date: Wed, 26 May 2021 22:03:23 +0100 Subject: [PATCH] gdb: prevent an assertion when computing the frame_id for an inline frame I ran into this assertion while GDB was trying to unwind the stack: gdb/inline-frame.c:173: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `frame_id_p (*this_id)' failed. That is, when building the frame_id for an inline frame, GDB asks for the frame_id of the previous frame. Unfortunately, no valid frame_id was returned for the previous frame, and so the assertion triggers. What is happening is this, I had a stack that looked something like this (the arrows '->' point from caller to callee): normal_frame -> inline_frame However, for whatever reason (e.g. broken debug information, or corrupted stack contents in the inferior), when GDB tries to unwind "normal_frame", it ends up getting back effectively the same frame, thus the call stack looks like this to GDB: .-> normal_frame -> inline_frame | | '-----' Given such a situation we would expect GDB to terminate the stack with an error like this: Backtrace stopped: previous frame identical to this frame (corrupt stack?) However, the inline_frame causes a problem, and here's why: When unwinding we start from the sentinel frame and call get_prev_frame. We eventually end up in get_prev_frame_if_no_cycle, in here we create a raw frame, and as this is frame #0 we immediately return. However, eventually we will try to unwind the stack further. When we do this we inevitably needing to know the frame_id for frame #0, and so, eventually, we end up in compute_frame_id. In compute_frame_id we first find the right unwinder for this frame, in our case (i.e. for inline_frame) the $pc is within the function normal_frame, but also within a block associated with the inlined function inline_frame, as such the inline frame unwinder claims this frame. Back in compute_frame_id we next compute the frame_id, for our inline_frame this means a call to inline_frame_this_id. The ID of an inline frame is based on the id of the previous frame, so from inline_frame_this_id we call get_prev_frame_always, this eventually calls get_prev_frame_if_no_cycle again, which creates another raw frame and calls compute_frame_id (for frames other than frame 0 we immediately compute the frame_id). In compute_frame_id we again identify the correct unwinder for this frame. Our $pc is unchanged, however, the fact that the next frame is of type INLINE_FRAME prevents the inline frame unwinder from claiming this frame again, and so, the standard DWARF frame unwinder claims normal_frame. We return to compute_frame_id and call the standard DWARF function to build the frame_id for normal_frame. With the frame_id of normal_frame figured out we return to compute_frame_id, and then to get_prev_frame_if_no_cycle, where we add the ID for normal_frame into the frame_id cache, and return the frame back to inline_frame_this_id. From inline_frame_this_id we build a frame_id for inline_frame and return to compute_frame_id, and then to get_prev_frame_if_no_cycle, which adds the frame_id for inline_frame into the frame_id cache. So far, so good. However, as we are trying to unwind the complete stack, we eventually ask for the previous frame of normal_frame, remember, at this point GDB doesn't know the stack is corrupted (with a cycle), GDB still needs to figure that out. So, we eventually end up in get_prev_frame_if_no_cycle where we create a raw frame and call compute_frame_id, remember, this is for the frame before normal_frame. The first task for compute_frame_id is to find the unwinder for this frame, so all of the frame sniffers are tried in order, this includes the inline frame sniffer. The inline frame sniffer asks for the $pc, this request is sent up the stack to normal_frame, which, due to its cyclic behaviour, tells GDB that the $pc in the previous frame was the same as the $pc in normal_frame. GDB spots that this $pc corresponds to both the function normal_frame and also the inline function inline_frame. As the next frame is not an INLINE_FRAME then GDB figures that we have not yet built a frame to cover inline_frame, and so the inline sniffer claims this new frame. Our stack is now looking like this: inline_frame -> normal_frame -> inline_frame But, we have not yet computed the frame id for the outer most (on the left) inline_frame. After the frame sniffer has claimed the inline frame GDB returns to compute_frame_id and calls inline_frame_this_id. In here GDB calls get_prev_frame_always, which eventually ends up in get_prev_frame_if_no_cycle again, where we create a raw frame and call compute_frame_id. Just like before, compute_frame_id tries to find an unwinder for this new frame, it sees that the $pc is within both normal_frame and inline_frame, but the next frame is, again, an INLINE_FRAME, so, just like before the standard DWARF unwinder claims this frame. Back in compute_frame_id we again call the standard DWARF function to build the frame_id for this new copy of normal_frame. At this point the stack looks like this: normal_frame -> inline_frame -> normal_frame -> inline_frame After compute_frame_id we return to get_prev_frame_if_no_cycle, where we try to add the frame_id for the new normal_frame into the frame_id cache, however, unlike before, we fail to add this frame_id as it is a duplicate of the previous normal_frame frame_id. Having found a duplicate get_prev_frame_if_no_cycle unlinks the new frame from the stack, and returns nullptr, the stack now looks like this: inline_frame -> normal_frame -> inline_frame The nullptr result from get_prev_frame_if_no_cycle is fed back to inline_frame_this_id, which forwards this to get_frame_id, which immediately returns null_frame_id. As null_frame_id is not considered a valid frame_id, this is what triggers the assertion. In summary then: - inline_frame_this_id currently assumes that as the inline frame exists, we will always get a valid frame back from get_prev_frame_always, - get_prev_frame_if_no_cycle currently assumes that it is safe to return nullptr when it sees a cycle. Notice that in frame.c:compute_frame_id, this code: fi->this_id.value = outer_frame_id; fi->unwind->this_id (fi, &fi->prologue_cache, &fi->this_id.value); gdb_assert (frame_id_p (fi->this_id.value)); The assertion makes it clear that the this_id function must always return a valid frame_id (e.g. null_frame_id is not a valid return value), and similarly in inline_frame.c:inline_frame_this_id this code: *this_id = get_frame_id (get_prev_frame_always (this_frame)); /* snip comment */ gdb_assert (frame_id_p (*this_id)); Makes it clear that every inline frame expects to be able to get a previous frame, which will have a valid frame_id. As I have discussed above, these assumptions don't currently hold in all cases. One possibility would be to move the call to get_prev_frame_always forward from inline_frame_this_id to inline_frame_sniffer, however, this falls foul of (in frame.c:frame_cleanup_after_sniffer) this assertion: /* No sniffer should extend the frame chain; sniff based on what is already certain. */ gdb_assert (!frame->prev_p); This assert prohibits any sniffer from trying to get the previous frame, as getting the previous frame is likely to depend on the next frame, I can understand why this assertion is a good thing, and I'm in no rush to alter this rule. The solution proposed here takes onboard feedback from both Pedro, and Simon (see the links below). The get_prev_frame_if_no_cycle function is renamed to get_prev_frame_maybe_check_cycle, and will now not do cycle detection for inline frames, even when we spot a duplicate frame it is still returned. This is fine, as, if the normal frame has a duplicate frame-id then the inline frame will also have a duplicate frame-id. And so, when we reject the inline frame, the duplicate normal frame, which is previous to the inline frame, will also be rejected. In inline-frame.c the call to get_prev_frame_always is no longer nested inside the call to get_frame_id. There are reasons why get_prev_frame_always can return nullptr, for example, if there is a memory error while trying to get the previous frame, if this should happen then we now give a more informative error message. Historical Links: Patch v2: https://sourceware.org/pipermail/gdb-patches/2021-June/180208.html Feedback: https://sourceware.org/pipermail/gdb-patches/2021-July/180651.html https://sourceware.org/pipermail/gdb-patches/2021-July/180663.html Patch v3: https://sourceware.org/pipermail/gdb-patches/2021-July/181029.html Feedback: https://sourceware.org/pipermail/gdb-patches/2021-July/181035.html Additional input: https://sourceware.org/pipermail/gdb-patches/2021-September/182040.html --- gdb/frame.c | 60 +++++++- gdb/inline-frame.c | 5 +- .../gdb.base/inline-frame-cycle-unwind.c | 58 +++++++ .../gdb.base/inline-frame-cycle-unwind.exp | 145 ++++++++++++++++++ .../gdb.base/inline-frame-cycle-unwind.py | 85 ++++++++++ 5 files changed, 344 insertions(+), 9 deletions(-) create mode 100644 gdb/testsuite/gdb.base/inline-frame-cycle-unwind.c create mode 100644 gdb/testsuite/gdb.base/inline-frame-cycle-unwind.exp create mode 100644 gdb/testsuite/gdb.base/inline-frame-cycle-unwind.py diff --git a/gdb/frame.c b/gdb/frame.c index d28944075ed..16673258373 100644 --- a/gdb/frame.c +++ b/gdb/frame.c @@ -2044,14 +2044,23 @@ frame_register_unwind_location (struct frame_info *this_frame, int regnum, outermost, with UNWIND_SAME_ID stop reason. Unlike the other validity tests, that compare THIS_FRAME and the next frame, we do this right after creating the previous frame, to avoid ever ending - up with two frames with the same id in the frame chain. */ + up with two frames with the same id in the frame chain. + + There is however, one case where this cycle detection is not desirable, + when asking for the previous frame of an inline frame, in this case, if + the previous frame is a duplicate and we return nullptr then we will be + unable to calculate the frame_id of the inline frame, this in turn + causes inline_frame_this_id() to fail. So for inline frames (and only + for inline frames), the previous frame will always be returned, even when it + has a duplicate frame_id. We're not worried about cycles in the frame + chain as, if the previous frame returned here has a duplicate frame_id, + then the frame_id of the inline frame, calculated based off the frame_id + of the previous frame, should also be a duplicate. */ static struct frame_info * -get_prev_frame_if_no_cycle (struct frame_info *this_frame) +get_prev_frame_maybe_check_cycle (struct frame_info *this_frame) { - struct frame_info *prev_frame; - - prev_frame = get_prev_frame_raw (this_frame); + struct frame_info *prev_frame = get_prev_frame_raw (this_frame); /* Don't compute the frame id of the current frame yet. Unwinding the sentinel frame can fail (e.g., if the thread is gone and we @@ -2070,7 +2079,42 @@ get_prev_frame_if_no_cycle (struct frame_info *this_frame) try { compute_frame_id (prev_frame); - if (!frame_stash_add (prev_frame)) + + bool cycle_detection_p = get_frame_type (this_frame) != INLINE_FRAME; + + /* This assert checks GDB's state with respect to calculating the + frame-id of THIS_FRAME, in the case where THIS_FRAME is an inline + frame. + + If THIS_FRAME is frame #0, and is an inline frame, then we put off + calculating the frame_id until we specifically make a call to + get_frame_id(). As a result we can enter this function in two + possible states. If GDB asked for the previous frame of frame #0 + then THIS_FRAME will be frame #0 (an inline frame), and the + frame_id will be in the NOT_COMPUTED state. However, if GDB asked + for the frame_id of frame #0, then, as getting the frame_id of an + inline frame requires us to get the frame_id of the previous + frame, we will still end up in here, and the frame_id status will + be COMPUTING. + + If, instead, THIS_FRAME is at a level greater than #0 then things + are simpler. For these frames we immediately compute the frame_id + when the frame is initially created, and so, for those frames, we + will always enter this function with the frame_id status of + COMPUTING. */ + gdb_assert (cycle_detection_p + || (this_frame->level > 0 + && (this_frame->this_id.p + == frame_id_status::COMPUTING)) + || (this_frame->level == 0 + && (this_frame->this_id.p + != frame_id_status::COMPUTED))); + + /* We must do the CYCLE_DETECTION_P check after attempting to add + PREV_FRAME into the cache; if PREV_FRAME is unique then we do want + it in the cache, but if it is a duplicate and CYCLE_DETECTION_P is + false, then we don't want to unlink it. */ + if (!frame_stash_add (prev_frame) && cycle_detection_p) { /* Another frame with the same id was already in the stash. We just detected a cycle. */ @@ -2147,7 +2191,7 @@ get_prev_frame_always_1 (struct frame_info *this_frame) until we have unwound all the way down to the previous non-inline frame. */ if (get_frame_type (this_frame) == INLINE_FRAME) - return get_prev_frame_if_no_cycle (this_frame); + return get_prev_frame_maybe_check_cycle (this_frame); /* If this_frame is the current frame, then compute and stash its frame id prior to fetching and computing the frame id of the @@ -2248,7 +2292,7 @@ get_prev_frame_always_1 (struct frame_info *this_frame) } } - return get_prev_frame_if_no_cycle (this_frame); + return get_prev_frame_maybe_check_cycle (this_frame); } /* Return a "struct frame_info" corresponding to the frame that called diff --git a/gdb/inline-frame.c b/gdb/inline-frame.c index c98af1842a6..df7bd826ff5 100644 --- a/gdb/inline-frame.c +++ b/gdb/inline-frame.c @@ -163,7 +163,10 @@ inline_frame_this_id (struct frame_info *this_frame, function, there must be previous frames, so this is safe - as long as we're careful not to create any cycles. See related comments in get_prev_frame_always_1. */ - *this_id = get_frame_id (get_prev_frame_always (this_frame)); + frame_info *prev_frame = get_prev_frame_always (this_frame); + if (prev_frame == nullptr) + error (_("failed to find previous frame when computing inline frame id")); + *this_id = get_frame_id (prev_frame); /* We need a valid frame ID, so we need to be based on a valid frame. FSF submission NOTE: this would be a good assertion to diff --git a/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.c b/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.c new file mode 100644 index 00000000000..183c40928b6 --- /dev/null +++ b/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.c @@ -0,0 +1,58 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2021 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +static void inline_func (void); +static void normal_func (void); + +volatile int global_var; +volatile int level_counter; + +static void __attribute__((noinline)) +normal_func (void) +{ + /* Do some work. */ + ++global_var; + + /* Now the inline function. */ + --level_counter; + inline_func (); + ++level_counter; + + /* Do some work. */ + ++global_var; +} + +static inline void __attribute__((__always_inline__)) +inline_func (void) +{ + if (level_counter > 1) + { + --level_counter; + normal_func (); + ++level_counter; + } + else + ++global_var; /* Break here. */ +} + +int +main () +{ + level_counter = 6; + normal_func (); + return 0; +} diff --git a/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.exp b/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.exp new file mode 100644 index 00000000000..2801b683a03 --- /dev/null +++ b/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.exp @@ -0,0 +1,145 @@ +# Copyright (C) 2021 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +# This test checks for an edge case when unwinding inline frames which +# occur towards the older end of the stack when the stack ends with a +# cycle. Consider this well formed stack: +# +# main -> normal_frame -> inline_frame +# +# Now consider that, for whatever reason, the stack unwinding of +# "normal_frame" becomes corrupted, such that the stack appears to be +# this: +# +# .-> normal_frame -> inline_frame +# | | +# '------' +# +# When confronted with such a situation we would expect GDB to detect +# the stack frame cycle and terminate the backtrace at the first +# instance of "normal_frame" with a message: +# +# Backtrace stopped: previous frame identical to this frame (corrupt stack?) +# +# However, at one point there was a bug in GDB's inline frame +# mechanism such that the fact that "inline_frame" was inlined into +# "normal_frame" would cause GDB to trigger an assertion. +# +# This text makes use of a Python unwinder which can fake the cyclic +# stack cycle, further the test sets up multiple levels of normal and +# inline frames. At the point of testing the stack looks like this: +# +# main -> normal_func -> inline_func -> normal_func -> inline_func -> normal_func -> inline_func +# +# Where "normal_func" is a normal frame, and "inline_func" is an inline frame. +# +# The python unwinder is then used to force a stack cycle at each +# "normal_func" frame in turn, we then check that GDB can successfully unwind +# the stack. + +standard_testfile + +if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile}]} { + return -1 +} + +# Skip this test if Python scripting is not enabled. +if { [skip_python_tests] } { continue } + +if ![runto_main] then { + fail "can't run to main" + return 0 +} + +set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py] + +# Run to the breakpoint where we will carry out the test. +gdb_breakpoint [gdb_get_line_number "Break here"] +gdb_continue_to_breakpoint "stop at test breakpoint" + +# Load the script containing the unwinder, this must be done at the +# testing point as the script will examine the stack as it is loaded. +gdb_test_no_output "source ${pyfile}"\ + "import python scripts" + +# Check the unbroken stack. +gdb_test_sequence "bt" "backtrace when the unwind is left unbroken" { + "\\r\\n#0 \[^\r\n\]* inline_func \\(\\) at " + "\\r\\n#1 \[^\r\n\]* normal_func \\(\\) at " + "\\r\\n#2 \[^\r\n\]* inline_func \\(\\) at " + "\\r\\n#3 \[^\r\n\]* normal_func \\(\\) at " + "\\r\\n#4 \[^\r\n\]* inline_func \\(\\) at " + "\\r\\n#5 \[^\r\n\]* normal_func \\(\\) at " + "\\r\\n#6 \[^\r\n\]* main \\(\\) at " +} + +with_test_prefix "cycle at level 5" { + # Arrange to introduce a stack cycle at frame 5. + gdb_test_no_output "python stop_at_level=5" + gdb_test "maint flush register-cache" \ + "Register cache flushed\\." + gdb_test_lines "bt" "backtrace when the unwind is broken at frame 5" \ + [multi_line \ + "#0 \[^\r\n\]* inline_func \\(\\) at \[^\r\n\]+" \ + "#1 \[^\r\n\]* normal_func \\(\\) at \[^\r\n\]+" \ + "#2 \[^\r\n\]* inline_func \\(\\) at \[^\r\n\]+" \ + "#3 \[^\r\n\]* normal_func \\(\\) at \[^\r\n\]+" \ + "#4 \[^\r\n\]* inline_func \\(\\) at \[^\r\n\]+" \ + "#5 \[^\r\n\]* normal_func \\(\\) at \[^\r\n\]+" \ + "Backtrace stopped: previous frame identical to this frame \\(corrupt stack\\?\\)"] +} + +with_test_prefix "cycle at level 3" { + # Arrange to introduce a stack cycle at frame 3. + gdb_test_no_output "python stop_at_level=3" + gdb_test "maint flush register-cache" \ + "Register cache flushed\\." + gdb_test_lines "bt" "backtrace when the unwind is broken at frame 3" \ + [multi_line \ + "#0 \[^\r\n\]* inline_func \\(\\) at \[^\r\n\]+" \ + "#1 \[^\r\n\]* normal_func \\(\\) at \[^\r\n\]+" \ + "#2 \[^\r\n\]* inline_func \\(\\) at \[^\r\n\]+" \ + "#3 \[^\r\n\]* normal_func \\(\\) at \[^\r\n\]+" \ + "Backtrace stopped: previous frame identical to this frame \\(corrupt stack\\?\\)"] +} + +with_test_prefix "cycle at level 1" { + # Arrange to introduce a stack cycle at frame 1. + gdb_test_no_output "python stop_at_level=1" + gdb_test "maint flush register-cache" \ + "Register cache flushed\\." + gdb_test_lines "bt" "backtrace when the unwind is broken at frame 1" \ + [multi_line \ + "#0 \[^\r\n\]* inline_func \\(\\) at \[^\r\n\]+" \ + "#1 \[^\r\n\]* normal_func \\(\\) at \[^\r\n\]+" \ + "Backtrace stopped: previous frame identical to this frame \\(corrupt stack\\?\\)"] +} + +# Flush the register cache (which also flushes the frame cache) so we +# get a full backtrace again, then switch on frame debugging and try +# to back trace. At one point this triggered an assertion. +gdb_test "maint flush register-cache" \ + "Register cache flushed\\." "" +gdb_test_no_output "set debug frame 1" +gdb_test_multiple "bt" "backtrace with debugging on" { + -re "^$gdb_prompt $" { + pass $gdb_test_name + } + -re "\[^\r\n\]+\r\n" { + exp_continue + } +} +gdb_test "p 1 + 2 + 3" " = 6" \ + "ensure GDB is still alive" diff --git a/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.py b/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.py new file mode 100644 index 00000000000..99c571f973c --- /dev/null +++ b/gdb/testsuite/gdb.base/inline-frame-cycle-unwind.py @@ -0,0 +1,85 @@ +# Copyright (C) 2021 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +import gdb +from gdb.unwinder import Unwinder + +# Set this to the stack level the backtrace should be corrupted at. +# This will only work for frame 1, 3, or 5 in the test this unwinder +# was written for. +stop_at_level = None + +# Set this to the stack frame size of frames 1, 3, and 5. These +# frames will all have the same stack frame size as they are the same +# function called recursively. +stack_adjust = None + + +class FrameId(object): + def __init__(self, sp, pc): + self._sp = sp + self._pc = pc + + @property + def sp(self): + return self._sp + + @property + def pc(self): + return self._pc + + +class TestUnwinder(Unwinder): + def __init__(self): + Unwinder.__init__(self, "stop at level") + + def __call__(self, pending_frame): + global stop_at_level + global stack_adjust + + if stop_at_level is None or pending_frame.level() != stop_at_level: + return None + + if stack_adjust is None: + raise gdb.GdbError("invalid stack_adjust") + + if not stop_at_level in [1, 3, 5]: + raise gdb.GdbError("invalid stop_at_level") + + sp_desc = pending_frame.architecture().registers().find("sp") + sp = pending_frame.read_register(sp_desc) + stack_adjust + pc = (gdb.lookup_symbol("normal_func"))[0].value().address + unwinder = pending_frame.create_unwind_info(FrameId(sp, pc)) + + for reg in pending_frame.architecture().registers("general"): + val = pending_frame.read_register(reg) + unwinder.add_saved_register(reg, val) + return unwinder + + +gdb.unwinder.register_unwinder(None, TestUnwinder(), True) + +# When loaded, it is expected that the stack looks like: +# +# main -> normal_func -> inline_func -> normal_func -> inline_func -> normal_func -> inline_func +# +# Compute the stack frame size of normal_func, which has inline_func +# inlined within it. +f0 = gdb.newest_frame() +f1 = f0.older() +f2 = f1.older() +f0_sp = f0.read_register("sp") +f2_sp = f2.read_register("sp") +stack_adjust = f2_sp - f0_sp -- 2.30.2