From 0a39bb3218ec528236da4953a97d07f0da9313ce Mon Sep 17 00:00:00 2001
From: Pedro Alves
Date: Wed, 5 Aug 2015 20:01:42 +0100
Subject: [PATCH] stepping is disturbed by setjmp/longjmp | try/catch in other threads

At https://sourceware.org/ml/gdb-patches/2015-08/msg00097.html, Joel
observed that trying to next/step a program on GNU/Linux sometimes
results in the following failed assertion:

  % gdb -q .obj/gprof/main
  (gdb) start
  (gdb) n
  (gdb) step
  [...]/infrun.c:2391: internal-error: resume: Assertion `sig != GDB_SIGNAL_0' failed.

What happened is that, during the "next" operation, GDB hit a
longjmp/exception/step-resume breakpoint but failed to see that this
breakpoint was set for a different thread than the one being stepped.
Joel's detailed analysis follows:

More precisely, at the end of the "start" command, we are stopped at
the start of function Main in main.adb; there are 4 threads in total,
and we are in the main thread (which is thread 1):

    (gdb) info thread
      Id   Target Id         Frame
      4    Thread 0xb7a56ba0 (LWP 28379) 0xffffe410 in __kernel_vsyscall ()
      3    Thread 0xb7c5aba0 (LWP 28378) 0xffffe410 in __kernel_vsyscall ()
      2    Thread 0xb7e5eba0 (LWP 28377) 0xffffe410 in __kernel_vsyscall ()
    * 1    Thread 0xb7ea18c0 (LWP 28370) main () at /[...]/main.adb:57

All the logs below reference Thread ID/LWP, but it'll be easier to
talk about the threads by GDB thread number.  For instance, thread 1
is LWP 28370 while thread 3 is LWP 28378.  So, the explanations below
translate the LWPs into thread numbers.

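
As an aside, the summary above (a breakpoint hit by a thread it was
not set for) can be modelled with a small toy program.  This is only
a sketch: the toy_* types and functions are hypothetical and are not
GDB's data structures.  It assumes that a hit records a "stop" flag
which is set only when the event thread is the thread the breakpoint
was set for, and that the action chosen for the hit must consult that
flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not GDB's API.  */

enum toy_action
{
  TOY_SINGLE,             /* Just step the thread past the breakpoint.  */
  TOY_SET_LONGJMP_RESUME  /* Insert a longjmp/exception-resume breakpoint.  */
};

struct toy_breakpoint
{
  int thread;  /* Thread the breakpoint was set for.  */
};

struct toy_hit
{
  const struct toy_breakpoint *bp;
  bool stop;   /* True only if the hit is meant for the event thread.  */
};

/* Record that EVENT_THREAD hit BP.  */
static struct toy_hit
toy_record_hit (const struct toy_breakpoint *bp, int event_thread)
{
  struct toy_hit hit;

  assert (bp != NULL);
  hit.bp = bp;
  hit.stop = (bp->thread == event_thread);
  return hit;
}

/* Choose the action for HIT.  Returning TOY_SET_LONGJMP_RESUME
   without looking at HIT->stop would reproduce the kind of mistake
   described above.  */
static enum toy_action
toy_what (const struct toy_hit *hit)
{
  assert (hit != NULL);
  if (hit->stop)
    return TOY_SET_LONGJMP_RESUME;
  return TOY_SINGLE;
}
```

In the scenario analysed here, the breakpoint belongs to thread 1 and
the event comes from thread 3, so in this model toy_what would return
TOY_SINGLE and no resume breakpoint would be inserted for thread 3.
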
Back to what happens while we are trying to "next" our program:

    (gdb) n
    infrun: clear_proceed_status_thread (Thread 0xb7a56ba0 (LWP 28379))
    infrun: clear_proceed_status_thread (Thread 0xb7c5aba0 (LWP 28378))
    infrun: clear_proceed_status_thread (Thread 0xb7e5eba0 (LWP 28377))
    infrun: clear_proceed_status_thread (Thread 0xb7ea18c0 (LWP 28370))
    infrun: proceed (addr=0xffffffff, signal=GDB_SIGNAL_DEFAULT)
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x805451e
    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28370.0 [Thread 0xb7ea18c0 (LWP 28370)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x8054523

We've resumed thread 1 (LWP 28370), and received in return a signal
that the same thread stopped slightly further.  It's still in the
range of instructions for the line of source we started the "next"
from, as evidenced by the following trace...

    infrun: stepping inside range [0x805451e-0x8054531]

... and thus, we decide to continue stepping the same thread:

    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523
    infrun: prepare_to_wait

That's when we get an event from a different thread (thread 3)...

    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x80782d0
    infrun: context switch
    infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)

... which we find to be at the address where we set a breakpoint on
"the unwinder debug hook" (namely "_Unwind_DebugHook").  But GDB fails
to notice that the breakpoint was inserted for thread 1 only, and so
decides to handle it as...

    infrun: BPSTAT_WHAT_SET_LONGJMP_RESUME

... and inserts a breakpoint at the corresponding resume address, as
evidenced by the next log:

    infrun: exception resume at 80542a2

That breakpoint seems innocent right now, but will play a role fairly
quickly.

But for now, GDB has inserted the exception-resume breakpoint, and
needs to single-step thread 3 past the breakpoint it just hit.  Thus,
it temporarily disables the exception breakpoint, and requests a step
of that thread:

    infrun: skipping breakpoint: stepping past insn at: 0x80782d0
    infrun: skipping breakpoint: stepping past insn at: 0x80782d0
    infrun: skipping breakpoint: stepping past insn at: 0x80782d0
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 0xb7c5aba0 (LWP 28378)] at 0x80782d0
    infrun: prepare_to_wait

We then get a notification, still from thread 3, that it's now past
that breakpoint...

    infrun: prepare_to_wait
    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x8078424

... so we can resume what we were doing before, which is
single-stepping thread 1 until we get to a new line of code:

    infrun: switching back to stepped thread
    infrun: Switching context from Thread 0xb7c5aba0 (LWP 28378) to Thread 0xb7ea18c0 (LWP 28370)
    infrun: expected thread still hasn't advanced
    infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 0xb7ea18c0 (LWP 28370)] at 0x8054523

The "resume" log above shows that we're resuming thread 1 from where
we left off (0x8054523).  We get one more stop at 0x8054529, which is
still inside our stepping range so we go again.

That's when we get the following event, from thread 3:

    infrun: prepare_to_wait
    infrun: target_wait (-1.0.0, status) =
    infrun:   28370.28378.0 [Thread 0xb7c5aba0 (LWP 28378)],
    infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
    infrun: TARGET_WAITKIND_STOPPED
    infrun: stop_pc = 0x80542a2

Now the stop_pc address is interesting, because it's the address of
the "exception resume" breakpoint...

    infrun: context switch
    infrun: Switching context from Thread 0xb7ea18c0 (LWP 28370) to Thread 0xb7c5aba0 (LWP 28378)
    infrun: BPSTAT_WHAT_CLEAR_LONGJMP_RESUME

... and since that location is at a different line of code, this is
where it decides the "next" operation should stop:

    infrun: stop_waiting
    [Switching to Thread 0xb7c5aba0 (LWP 28378)]
    0x080542a2 in inte_tache_rt.ttache_rt ( <_task>=0x80968ec )
        at /[...]/inte_tache_rt.adb:54
    54            end loop;

However, GDB should have noticed earlier that the exception
breakpoint we hit was for a different thread, and thus should have
single-stepped that thread out of the breakpoint _without_ inserting
the exception-resume breakpoint, and then resumed the single-stepping
of the initial thread (thread 1) until that thread stepped out of its
stepping range.

This is what this patch does, and after applying it, GDB now
correctly stops on the next line of code.

The patch adds a C++ test that exercises this, both for setjmp/longjmp
and exception breakpoints.  With an unpatched GDB it shows:

  (gdb) next
  [Switching to Thread 22445.22455]
  thread_try_catch (arg=0x0) at /home/pedro/gdb/mygit/build/../src/gdb/testsuite/gdb.threads/next-other-thr-longjmp.c:59
  59            catch (...)
  (gdb) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 1
  next
  /home/pedro/gdb/mygit/build/../src/gdb/infrun.c:4865: internal-error: process_event_stop_test: Assertion `ecs->event_thread->control.exception_resume_breakpoint != NULL' failed.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n) FAIL: gdb.threads/next-other-thr-longjmp.exp: next to line 2 (GDB internal error)
  Resyncing due to internal error.
  n

Tested on x86_64-linux, no regressions.

gdb/ChangeLog:
2015-08-05  Pedro Alves
            Joel Brobecker

        * breakpoint.c (bpstat_what) : Handle the case where BS->STOP
        is not set.

gdb/testsuite/ChangeLog:
2015-08-05  Pedro Alves

        * gdb.threads/next-while-other-thread-longjmps.c: New file.
        * gdb.threads/next-while-other-thread-longjmps.exp: New file.
---
 gdb/ChangeLog                                 |   7 +
 gdb/breakpoint.c                              |  18 ++-
 gdb/testsuite/ChangeLog                       |   5 +
 .../next-while-other-thread-longjmps.c        | 127 ++++++++++++++++++
 .../next-while-other-thread-longjmps.exp      |  40 ++++++
 5 files changed, 193 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.c
 create mode 100644 gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 5d618239681..7ea7a02f941 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,10 @@
+2015-08-05  Pedro Alves
+            Joel Brobecker
+
+        * breakpoint.c (bpstat_what)
+        : Handle the
+        case where BS->STOP is not set.
+
 2015-08-05  Ulrich Weigand
 
         * nat/gdb_thread_db.h: Add copyright header.
diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index 78a694ec42b..125b22fd5f9 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -5752,13 +5752,23 @@ bpstat_what (bpstat bs_head)
         case bp_longjmp:
         case bp_longjmp_call_dummy:
         case bp_exception:
-          this_action = BPSTAT_WHAT_SET_LONGJMP_RESUME;
-          retval.is_longjmp = bptype != bp_exception;
+          if (bs->stop)
+            {
+              this_action = BPSTAT_WHAT_SET_LONGJMP_RESUME;
+              retval.is_longjmp = bptype != bp_exception;
+            }
+          else
+            this_action = BPSTAT_WHAT_SINGLE;
           break;
         case bp_longjmp_resume:
         case bp_exception_resume:
-          this_action = BPSTAT_WHAT_CLEAR_LONGJMP_RESUME;
-          retval.is_longjmp = bptype == bp_longjmp_resume;
+          if (bs->stop)
+            {
+              this_action = BPSTAT_WHAT_CLEAR_LONGJMP_RESUME;
+              retval.is_longjmp = bptype == bp_longjmp_resume;
+            }
+          else
+            this_action = BPSTAT_WHAT_SINGLE;
           break;
         case bp_step_resume:
           if (bs->stop)
diff --git a/gdb/testsuite/ChangeLog b/gdb/testsuite/ChangeLog
index f633c115a54..59d0be75854 100644
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-08-05  Pedro Alves
+
+        * gdb.threads/next-while-other-thread-longjmps.c: New file.
+        * gdb.threads/next-while-other-thread-longjmps.exp: New file.
+
 2015-08-03  Sandra Loosemore
 
         * gdb.base/bp-permanent.exp: Report test as unsupported if
diff --git a/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.c b/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.c
new file mode 100644
index 00000000000..4733b8f4b73
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.c
@@ -0,0 +1,127 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2015 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <pthread.h>
+#include <setjmp.h>
+#include <unistd.h>
+#include <assert.h>
+
+/* Number of threads.  */
+#define NTHREADS 10
+
+/* When set, threads exit.  */
+volatile int break_out;
+
+pthread_barrier_t barrier;
+
+/* Entry point for threads that setjmp/longjmp.  */
+
+static void *
+thread_longjmp (void *arg)
+{
+  jmp_buf env;
+
+  pthread_barrier_wait (&barrier);
+
+  while (!break_out)
+    {
+      if (setjmp (env) == 0)
+        longjmp (env, 1);
+
+      usleep (1);
+    }
+  return NULL;
+}
+
+/* Entry point for threads that try/catch.  */
+
+static void *
+thread_try_catch (void *arg)
+{
+  volatile unsigned int counter = 0;
+
+  pthread_barrier_wait (&barrier);
+
+  while (!break_out)
+    {
+      try
+        {
+          throw 1;
+        }
+      catch (...)
+        {
+          counter++;
+        }
+
+      usleep (1);
+    }
+  return NULL;
+}
+
+int
+main (void)
+{
+  pthread_t threads[NTHREADS];
+  int i;
+  int ret;
+
+  /* Don't run forever.  */
+  alarm (180);
+
+  pthread_barrier_init (&barrier, NULL, NTHREADS + 1);
+
+  for (i = 0; i < NTHREADS; i++)
+    {
+      /* Half of the threads does setjmp/longjmp, the other half does
+         try/catch.  */
+      if ((i % 2) == 0)
+        ret = pthread_create (&threads[i], NULL, thread_longjmp , NULL);
+      else
+        ret = pthread_create (&threads[i], NULL, thread_try_catch , NULL);
+      assert (ret == 0);
+    }
+
+  /* Wait until all threads are running.  */
+  pthread_barrier_wait (&barrier);
+
+#define LINE usleep (1)
+
+  /* The other thread's setjmp/longjmp/try/catch should not disturb
+     this thread's stepping over these lines.  */
+
+  LINE; /* set break here */
+  LINE; /* line 1 */
+  LINE; /* line 2 */
+  LINE; /* line 3 */
+  LINE; /* line 4 */
+  LINE; /* line 5 */
+  LINE; /* line 6 */
+  LINE; /* line 7 */
+  LINE; /* line 8 */
+  LINE; /* line 9 */
+  LINE; /* line 10 */
+
+  break_out = 1;
+
+  for (i = 0; i < NTHREADS; i++)
+    {
+      ret = pthread_join (threads[i], NULL);
+      assert (ret == 0);
+    }
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp b/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp
new file mode 100644
index 00000000000..72a0617bf0e
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/next-while-other-thread-longjmps.exp
@@ -0,0 +1,40 @@
+# Copyright (C) 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test has the main thread step over a few lines, while a few
+# threads constantly do setjmp/longjmp and others do try/catch.  The
+# "next" commands in the main thread should be able to complete
+# undisturbed.
+
+standard_testfile
+
+set linenum [gdb_get_line_number "set break here"]
+
+if {[prepare_for_testing "failed to prepare" \
+         $testfile $srcfile {c++ debug pthreads}] == -1} {
+    return -1
+}
+
+if ![runto_main] then {
+    fail "Can't run to main"
+    return 0
+}
+
+gdb_breakpoint $linenum
+gdb_continue_to_breakpoint "start line"
+
+for {set i 1} {$i <= 10} {incr i} {
+    gdb_test "next" " line $i .*" "next to line $i"
+}
-- 
2.30.2