remote follow fork and spurious child stops in non-stop mode
Running gdb.threads/fork-plus-threads.exp against gdbserver in
extended-remote mode, even though the test passes, we still see broken
behavior:
(gdb) PASS: gdb.threads/fork-plus-threads.exp: set detach-on-fork off
continue &
Continuing.
(gdb) PASS: gdb.threads/fork-plus-threads.exp: continue &
[New Thread 28092.28092]
[Thread 28092.28092] #2 stopped.
[New Thread 28094.28094]
[Inferior 2 (process 28092) exited normally]
[New Thread 28094.28105]
[New Thread 28094.28109]
...
[Thread 28174.28174] #18 stopped.
[New Thread 28185.28185]
[Inferior 10 (process 28174) exited normally]
[New Thread 28185.28196]
[Thread 28185.28185] #20 stopped.
Cannot remove breakpoints because program is no longer writable.
Further execution is probably impossible.
[Inferior 11 (process 28185) exited normally]
[Inferior 1 (process 28091) exited normally]
PASS: gdb.threads/fork-plus-threads.exp: reached breakpoint
info threads
No threads.
(gdb) PASS: gdb.threads/fork-plus-threads.exp: no threads left
info inferiors
Num Description Executable
* 1 <null> /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/fork-plus-threads
(gdb) PASS: gdb.threads/fork-plus-threads.exp: only inferior 1 left
All the "[Thread FOO] #NN stopped." above are bogus, as well as the
"Cannot remove breakpoints because program is no longer writable.",
which is a consequence.
The problem is that when we intercept a fork event, we should report
the event for the parent, only, and leave the child stopped, but not
report its stop event. GDB later decides whether to follow the parent
or the child. But because handle_extended_wait does not set the
child's last_status.kind to TARGET_WAITKIND_STOPPED, a
stop_all_threads/unstop_all_lwps sequence (e.g., from trying to access
memory) by mistake ends up queueing a SIGSTOP on the child, resuming
it, and then when that SIGSTOP is intercepted, because the LWP has
last_resume_kind set to resume_stop, gdbserver reports the stop to
GDB, as GDB_SIGNAL_0:
...
>>>> entering unstop_all_lwps
unstopping all lwps
proceed_one_lwp: lwp 1600
client wants LWP to remain 1600 stopped
proceed_one_lwp: lwp 1828
Client wants LWP 1828 to stop. Making sure it has a SIGSTOP pending
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sending sigstop to lwp 1828
pc is 0x3615ebc7cc
Resuming lwp 1828 (continue, signal 0, stop expected)
continue from pc 0x3615ebc7cc
unstop_all_lwps done
sigchld_handler
<<<< exiting unstop_all_lwps
handling possible target event
>>>> entering linux_wait_1
linux_wait_1: [<all threads>]
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x1): status(137f), 1828
LWFE: waitpid(-1, ...) returned 1828, ERRNO-OK
LLW: waitpid 1828 received Stopped (signal) (stopped)
pc is 0x3615ebc7cc
Expected stop.
LLW: resume_stop SIGSTOP caught for LWP 1828.1828.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
linux_wait_1 ret = LWP 1828.1828, 1, 0
<<<< exiting linux_wait_1
Writing resume reply for LWP 1828.1828:1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Tested on x86_64 Fedora 20, extended-remote.
gdb/gdbserver/ChangeLog:
2015-07-30 Pedro Alves <palves@redhat.com>
* linux-low.c (handle_extended_wait): Set the child's last
reported status to TARGET_WAITKIND_STOPPED.