Don't ever Quit out of resume
If you have a breakpoint command that re-resumes the target, like:
break foo
commands
> c
> end
and then let the inferior run, hitting the breakpoint, and then press
Ctrl-C at just the right time, between GDB processing the stop at
"foo", and re-resuming the target, you'll hit the QUIT call in
infrun.c:resume.
With this hack, we can reproduce the bad case consistently:
--- a/gdb/inf-loop.c
+++ b/gdb/inf-loop.c
@@ -31,6 +31,8 @@
#include "top.h"
#include "observer.h"
+bool continue_hack;
+
/* General function to handle events in the inferior. */
void
@@ -64,6 +66,8 @@ inferior_event_handler (enum inferior_event_type event_type,
{
check_frame_language_change ();
+ continue_hack = true;
+
/* Don't propagate breakpoint commands errors. Either we're
stopping or some command resumes the inferior. The user will
be informed. */
diff --git a/gdb/infrun.c b/gdb/infrun.c
index
d425664..
c74b14c 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2403,6 +2403,10 @@ resume (enum gdb_signal sig)
gdb_assert (!tp->stop_requested);
gdb_assert (!thread_is_in_step_over_chain (tp));
+ extern bool continue_hack;
+
+ if (continue_hack)
+ set_quit_flag ();
QUIT;
The GDB backtrace looks like this:
(top-gdb) bt
...
#3 0x0000000000612e8b in throw_quit(char const*, ...) (fmt=0xaf84a1 "Quit") at src/gdb/common/common-exceptions.c:408
#4 0x00000000007fc104 in quit() () at src/gdb/utils.c:748
#5 0x00000000006a79d2 in default_quit_handler() () at src/gdb/event-top.c:954
#6 0x00000000007fc134 in maybe_quit() () at src/gdb/utils.c:762
#7 0x00000000006f66a3 in resume(gdb_signal) (sig=GDB_SIGNAL_0) at src/gdb/infrun.c:2406
#8 0x0000000000700c3d in keep_going_pass_signal(execution_control_state*) (ecs=0x7ffcf3744e60) at src/gdb/infrun.c:7793
#9 0x00000000006f5fcd in start_step_over() () at src/gdb/infrun.c:2145
#10 0x00000000006f7b1f in proceed(unsigned long, gdb_signal) (addr=
18446744073709551615, siggnal=GDB_SIGNAL_DEFAULT)
at src/gdb/infrun.c:3135
#11 0x00000000006ebdd4 in continue_1(int) (all_threads=0) at src/gdb/infcmd.c:842
#12 0x00000000006ec097 in continue_command(char*, int) (args=0x0, from_tty=0) at src/gdb/infcmd.c:938
#13 0x00000000004b5140 in do_cfunc(cmd_list_element*, char*, int) (c=0x2d18570, args=0x0, from_tty=0)
at src/gdb/cli/cli-decode.c:106
#14 0x00000000004b8219 in cmd_func(cmd_list_element*, char*, int) (cmd=0x2d18570, args=0x0, from_tty=0)
at src/gdb/cli/cli-decode.c:1952
#15 0x00000000007f1532 in execute_command(char*, int) (p=0x7ffcf37452b1 "", from_tty=0) at src/gdb/top.c:608
#16 0x00000000004bd127 in execute_control_command(command_line*) (cmd=0x3a88ef0) at src/gdb/cli/cli-script.c:485
#17 0x00000000005cae0c in bpstat_do_actions_1(bpstat*) (bsp=0x37edcf0) at src/gdb/breakpoint.c:4513
#18 0x00000000005caf67 in bpstat_do_actions() () at src/gdb/breakpoint.c:4563
#19 0x00000000006e8798 in inferior_event_handler(inferior_event_type, void*) (event_type=INF_EXEC_COMPLETE, client_data=0x0)
at src/gdb/inf-loop.c:72
#20 0x00000000006f9447 in fetch_inferior_event(void*) (client_data=0x0) at src/gdb/infrun.c:3970
#21 0x00000000006e870e in inferior_event_handler(inferior_event_type, void*) (event_type=INF_REG_EVENT, client_data=0x0)
at src/gdb/inf-loop.c:43
#22 0x0000000000494d58 in remote_async_serial_handler(serial*, void*) (scb=0x3585ca0, context=0x2cd1b80)
at src/gdb/remote.c:13820
#23 0x000000000044d682 in run_async_handler_and_reschedule(serial*) (scb=0x3585ca0) at src/gdb/ser-base.c:137
#24 0x000000000044d767 in fd_event(int, void*) (error=0, context=0x3585ca0) at src/gdb/ser-base.c:188
#25 0x00000000006a5686 in handle_file_event(file_handler*, int) (file_ptr=0x45997d0, ready_mask=1)
at src/gdb/event-loop.c:733
#26 0x00000000006a5c29 in gdb_wait_for_event(int) (block=1) at src/gdb/event-loop.c:859
#27 0x00000000006a4aa6 in gdb_do_one_event() () at src/gdb/event-loop.c:347
#28 0x00000000006a4ade in start_event_loop() () at src/gdb/event-loop.c:371
and when that happens, you end up with GDB's run control in quite a
messed up state. Something like this:
thread_function1 (arg=0x1) at threads.c:107
107 usleep (SLEEP); /* Loop increment. */
Quit
(gdb) c
Continuing.
** nothing happens, time passes..., press ctrl-c again **
^CQuit
(gdb) info threads
Id Target Id Frame
1 Thread 1462.1462 "threads" (running)
* 2 Thread 1462.1466 "threads" (running)
3 Thread 1462.1465 "function0" (running)
(gdb) c
Cannot execute this command while the selected thread is running.
(gdb)
The first "Quit" above is thrown from within "resume", and cancels run
control while GDB is in the middle of stepping over a breakpoint.
with step_over_info_valid_p() true. The next "c" didn't actually
resume anything, because GDB throught that the step-over was still in
progress. It wasn't, because the thread that was supposed to be
stepping over the breakpoint wasn't actually resumed.
So at this point, we press Ctrl-C again, and this time, the default
quit handler is called directly from the event loop
(event-top.c:default_quit_handler -> quit()), because gdb was left
owning the terminal (because the previous resume was cancelled before
we reach target_resume -> target_terminal::inferior()).
Note that the exception called from within resume ends up calling
normal_stop via resume_cleanups. That's very borked though, because
normal_stop is going to re-handle whatever was the last reported
event, possibly even re-running a hook stop... I think that the only
sane way to safely cancel the run control state machinery is to push
an event via handle_inferior_event like all other events.
The fix here does two things, and either alone would fix the problem
at hand:
#1 - passes the terminal to the inferior earlier, so that any QUIT
call from the point we declare the target as running goes to the
inferior directly, protecting run control from unsafe QUIT calls.
#2 - gets rid of this QUIT call in resume and of its related unsafe
resume_cleanups.
Aboout #2, the comment describing resume says:
/* Resume the inferior, but allow a QUIT. This is useful if the user
wants to interrupt some lengthy single-stepping operation
(for child processes, the SIGINT goes to the inferior, and so
we get a SIGINT random_signal, but for remote debugging and perhaps
other targets, that's not true).
but that's a really old comment that predates a lot of fixes to Ctrl-C
handling throughout both GDB core and the remote target, that made
sure that a Ctrl-C isn't ever lost. In any case, if some target
depended on this, a much better fix would be to make the target return
a SIGINT stop out of target_wait the next time that is called.
This was exposed by the new gdb.base/bp-cmds-continue-ctrl-c.exp
testcase added later in the series.
gdb/ChangeLog:
2017-11-16 Pedro Alves <palves@redhat.com>
* infrun.c (resume_cleanups): Delete.
(resume): No longer install a resume_cleanups cleanup nor call
QUIT.
(proceed): Pass the terminal to the inferior.
(keep_going_pass_signal): No longer install a resume_cleanups
cleanup.