From: Pedro Alves
Date: Fri, 12 Nov 2021 20:50:29 +0000 (+0000)
Subject: Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=0d36baa9af0d9929c96b89a184a469c432c68b0d;p=binutils-gdb.git

Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED

(A good chunk of the problem statement in the commit log below is
Andrew's, adjusted for a different solution, and for covering
displaced stepping too.  The testcase is mostly Andrew's too.)

This commit addresses bugs gdb/19675 and gdb/27830, which are about
stepping over a breakpoint set at a clone syscall instruction; one is
about displaced stepping, and the other about in-line stepping.

Currently, when a new thread is created through a clone syscall, GDB
sets the new thread running.  With 'continue' this makes sense
(assuming no schedlock):

 - all-stop mode, user issues 'continue', all threads are set running,
   a newly created thread should also be set running.

 - non-stop mode, user issues 'continue', other pre-existing threads
   are not affected, but as the new thread is (sort-of) a child of the
   thread the user asked to run, it makes sense that the new thread
   should be created in the running state.

Similarly, if we are stopped at the clone syscall, and there's no
software breakpoint at this address, then the current behaviour is
fine:

 - all-stop mode, user issues 'stepi', stepping will be done in place
   (as there's no breakpoint to step over).  While stepping the thread
   of interest all the other threads will be allowed to continue.  A
   newly created thread will be set running, and then stopped once the
   thread of interest has completed its step.

 - non-stop mode, user issues 'stepi', stepping will be done in place
   (as there's no breakpoint to step over).  Other threads might be
   running or stopped, but as with the continue case above, the new
   thread will be created running.

The only possible issue here is that the new thread will be left
running after the initial thread has completed its stepi.  The user
would need to manually select the thread and interrupt it, which
might not be what the user expects.  However, this is not something
this commit tries to change.

The problem then is what happens when we try to step over a clone
syscall if there is a breakpoint at the syscall address.

- For both all-stop and non-stop modes, with in-line stepping:

   + user issues 'stepi',
   + [non-stop mode only] GDB stops all threads.  In all-stop mode all
     threads are already stopped.
   + GDB removes s/w breakpoint at syscall address,
   + GDB single steps just the thread of interest, all other threads
     are left stopped,
   + New thread is created running,
   + Initial thread completes its step,
   + [non-stop mode only] GDB resumes all threads that it previously
     stopped.

There are two problems in the in-line stepping scenario above:

  1. The new thread might pass through the same code that the initial
     thread is in (i.e. the clone syscall code), in which case it will
     fail to hit the breakpoint in clone as this was removed so the
     first thread can single step,

  2. The new thread might trigger some other stop event before the
     initial thread reports its step completion.  If this happens we
     end up triggering an assertion as GDB assumes that only the
     thread being stepped should stop.  The assert looks like this:

     infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.

- For both all-stop and non-stop modes, with displaced stepping:

   + user issues 'stepi',
   + GDB starts the displaced step, moves thread's PC to the
     out-of-line scratch pad, maybe adjusts registers,
   + GDB single steps the thread of interest, [non-stop mode only] all
     other threads are left as they were, either running or stopped.
     In all-stop, all other threads are left stopped.
   + New thread is created running,
   + Initial thread completes its step, GDB re-adjusts its PC,
     restores/releases scratchpad,
   + [non-stop mode only] GDB resumes the thread, now past its
     breakpoint.
   + [all-stop mode only] GDB resumes all threads.

There is one problem with the displaced stepping scenario above:

  3. When the parent thread completed its step, GDB adjusted its PC,
     but did not adjust the child's PC, thus that new child thread
     will continue execution in the scratch pad, invoking undefined
     behavior.  If you're lucky, you see a crash.  If unlucky, the
     inferior gets silently corrupted.

What is needed is for GDB to have more control over whether the new
thread is created running or not.  Issue #1 above requires that the
new thread not be allowed to run until the breakpoint has been
reinserted.  The only way to guarantee this is if the new thread is
held in a stopped state until the single step has completed.  Issue #3
above requires that GDB is informed of when a thread clones itself,
and of the child's ptid, so that GDB can fix up both the parent and
the child.

When looking for solutions to this problem I considered how GDB
handles fork/vfork as these have some of the same issues.  The main
difference between fork/vfork and clone is that the clone events are
not reported back to core GDB.  Instead, the clone event is handled
automatically in the target code and the child thread is immediately
set running.

Note we have support for requesting thread creation events out of the
target (TARGET_WAITKIND_THREAD_CREATED).  However, those are reported
for the new/child thread.  That would be sufficient to address in-line
stepping (issue #1), but not for displaced-stepping (issue #3).  To
handle displaced-stepping, we need an event that is reported to the
_parent_ of the clone, as the information about the displaced step is
associated with the clone parent.  TARGET_WAITKIND_THREAD_CREATED
includes no indication of which thread is the parent that spawned the
new child.  In fact, for some targets, like e.g., Windows, it would be
impossible to know which thread that was, as thread creation there
doesn't work by "cloning".

The solution implemented here is to model clone on fork/vfork, and
introduce a new TARGET_WAITKIND_THREAD_CLONED event.  This event is
similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
that we end up with a new thread in the same process, instead of a new
thread of a new process.  Like FORKED and VFORKED, THREAD_CLONED
waitstatuses have a child_ptid property, and the child is held stopped
until GDB explicitly resumes it.  This addresses the in-line stepping
case (issues #1 and #2).

The infrun code that handles displaced stepping fixup for the child
after a fork/vfork event is thus reused for THREAD_CLONED, with some
minimal conditions added, addressing the displaced stepping case
(issue #3).

The native Linux backend is adjusted to unconditionally report
TARGET_WAITKIND_THREAD_CLONED events to the core.

Following the follow_fork model in core GDB, we introduce a
target_follow_clone target method, which is responsible for making the
new clone child visible to the rest of GDB.

Subsequent patches will add clone events support to the remote
protocol and gdbserver.

displaced_step_in_progress_thread becomes unused with this patch, but
a new use will reappear later in the series.  To avoid deleting it and
re-adding it, this patch marks it with ATTRIBUTE_UNUSED, and the later
patch removes the attribute again.  We need to do this because the
function is static, and with no callers, the compiler would warn
(error with -Werror), breaking the build.

This adds a new gdb.threads/stepi-over-clone.exp testcase, which
exercises stepping over a clone syscall, with displaced stepping vs
inline stepping, and all-stop vs non-stop.  We already test stepping
over clone syscalls with gdb.base/step-over-syscall.exp, but this test
uses pthreads, while the other test uses raw clone, and this one is
more thorough.
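The child PC fixup described above for the displaced-stepping case
(issue #3) can be sketched in isolation.  This is only a toy model
under stated assumptions, not GDB's code: wait_kind, wait_status,
has_new_child and toy_debugger are hypothetical names, threads are
plain integers, and the parent's own fixup is reduced to a simple
relocation (in GDB that part is architecture-specific).

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-ins for GDB's types.  None of these
// names are GDB's real API.
enum class wait_kind { stopped, forked, vforked, thread_cloned };

struct wait_status
{
  wait_kind kind;
  int child_id;   // Only meaningful for forked/vforked/thread_cloned.
};

// Same idea as is_new_child_status in the patch: does this event
// carry a new child?
static bool
has_new_child (wait_kind kind)
{
  return (kind == wait_kind::forked
          || kind == wait_kind::vforked
          || kind == wait_kind::thread_cloned);
}

struct toy_debugger
{
  std::map<int, std::uint64_t> pc;   // Thread id -> program counter.
  std::uint64_t scratch_pad = 0;     // Where the copied insn was run.

  // Finish a displaced step of PARENT, whose instruction originally
  // lived at ORIG_PC and was executed at SCRATCH_PAD.
  void
  finish_displaced_step (int parent, std::uint64_t orig_pc,
                         const wait_status &ws)
  {
    // Relocate the parent from the scratch pad back to the original
    // code (assumption: a plain offset is enough for this toy).
    pc[parent] = orig_pc + (pc[parent] - scratch_pad);

    // The child was created while the parent ran in the scratch pad,
    // so its PC points there too.  Give it the parent's fixed-up PC.
    if (has_new_child (ws.kind))
      pc[ws.child_id] = pc[parent];
  }
};
```

With a scratch pad at 0x7000, a parent and child both stopped at
0x7002, and an original PC of 0x401000, both threads end up at
0x401002 after the call.  Without the second assignment the child
would resume inside the scratch pad, which is the failure described
in issue #3.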
The testcase passes on native GNU/Linux, but fails against GDBserver.
GDBserver will be fixed by a later patch in the series.

Co-authored-by: Andrew Burgess
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e
Reviewed-By: Andrew Burgess
---

diff --git a/gdb/infrun.c b/gdb/infrun.c
index c60cfc07aa7..e3157f86aff 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -1606,6 +1606,7 @@ step_over_info_valid_p (void)
 /* Return true if THREAD is doing a displaced step.  */
 
 static bool
+ATTRIBUTE_UNUSED
 displaced_step_in_progress_thread (thread_info *thread)
 {
   gdb_assert (thread != nullptr);
@@ -1967,6 +1968,31 @@ static displaced_step_finish_status
 displaced_step_finish (thread_info *event_thread,
                        const target_waitstatus &event_status)
 {
+  /* Check whether the parent is displaced stepping.  */
+  struct regcache *regcache = get_thread_regcache (event_thread);
+  struct gdbarch *gdbarch = regcache->arch ();
+  inferior *parent_inf = event_thread->inf;
+
+  /* If this was a fork/vfork/clone, this event indicates that the
+     displaced stepping of the syscall instruction has been done, so
+     we perform cleanup for parent here.  Also note that this
+     operation also cleans up the child for vfork, because their pages
+     are shared.  */
+
+  /* If this is a fork (child gets its own address space copy) and
+     some displaced step buffers were in use at the time of the fork,
+     restore the displaced step buffer bytes in the child process.
+
+     Architectures which support displaced stepping and fork events
+     must supply an implementation of
+     gdbarch_displaced_step_restore_all_in_ptid.  This is not enforced
+     during gdbarch validation to support architectures which support
+     displaced stepping but not forks.  */
+  if (event_status.kind () == TARGET_WAITKIND_FORKED
+      && gdbarch_supports_displaced_stepping (gdbarch))
+    gdbarch_displaced_step_restore_all_in_ptid
+      (gdbarch, parent_inf, event_status.child_ptid ());
+
   displaced_step_thread_state *displaced = &event_thread->displaced_step_state;
 
   /* Was this thread performing a displaced step?  */
@@ -1986,8 +2012,39 @@ displaced_step_finish (thread_info *event_thread,
 
   /* Do the fixup, and release the resources acquired to do the displaced
      step. */
-  return gdbarch_displaced_step_finish (displaced->get_original_gdbarch (),
-                                        event_thread, event_status);
+  displaced_step_finish_status status
+    = gdbarch_displaced_step_finish (displaced->get_original_gdbarch (),
+                                     event_thread, event_status);
+
+  if (event_status.kind () == TARGET_WAITKIND_FORKED
+      || event_status.kind () == TARGET_WAITKIND_VFORKED
+      || event_status.kind () == TARGET_WAITKIND_THREAD_CLONED)
+    {
+      /* Since the vfork/fork/clone syscall instruction was executed
+         in the scratchpad, the child's PC is also within the
+         scratchpad.  Set the child's PC to the parent's PC value,
+         which has already been fixed up.  Note: we use the parent's
+         aspace here, although we're touching the child, because the
+         child hasn't been added to the inferior list yet at this
+         point.  */
+
+      struct regcache *child_regcache
+        = get_thread_arch_aspace_regcache (parent_inf,
+                                           event_status.child_ptid (),
+                                           gdbarch,
+                                           parent_inf->aspace);
+      /* Read PC value of parent.  */
+      CORE_ADDR parent_pc = regcache_read_pc (regcache);
+
+      displaced_debug_printf ("write child pc from %s to %s",
+                              paddress (gdbarch,
+                                        regcache_read_pc (child_regcache)),
+                              paddress (gdbarch, parent_pc));
+
+      regcache_write_pc (child_regcache, parent_pc);
+    }
+
+  return status;
 }
 
 /* Data to be passed around while handling an event.  This data is
@@ -5854,67 +5911,13 @@ handle_inferior_event (struct execution_control_state *ecs)
 
     case TARGET_WAITKIND_FORKED:
     case TARGET_WAITKIND_VFORKED:
-      /* Check whether the inferior is displaced stepping.  */
-      {
-        struct regcache *regcache = get_thread_regcache (ecs->event_thread);
-        struct gdbarch *gdbarch = regcache->arch ();
-        inferior *parent_inf = find_inferior_ptid (ecs->target, ecs->ptid);
-
-        /* If this is a fork (child gets its own address space copy)
-           and some displaced step buffers were in use at the time of
-           the fork, restore the displaced step buffer bytes in the
-           child process.
-
-           Architectures which support displaced stepping and fork
-           events must supply an implementation of
-           gdbarch_displaced_step_restore_all_in_ptid.  This is not
-           enforced during gdbarch validation to support architectures
-           which support displaced stepping but not forks.  */
-        if (ecs->ws.kind () == TARGET_WAITKIND_FORKED
-            && gdbarch_supports_displaced_stepping (gdbarch))
-          gdbarch_displaced_step_restore_all_in_ptid
-            (gdbarch, parent_inf, ecs->ws.child_ptid ());
-
-        /* If displaced stepping is supported, and thread ecs->ptid is
-           displaced stepping.  */
-        if (displaced_step_in_progress_thread (ecs->event_thread))
-          {
-            struct regcache *child_regcache;
-            CORE_ADDR parent_pc;
-
-            /* GDB has got TARGET_WAITKIND_FORKED or TARGET_WAITKIND_VFORKED,
-               indicating that the displaced stepping of syscall instruction
-               has been done.  Perform cleanup for parent process here.  Note
-               that this operation also cleans up the child process for vfork,
-               because their pages are shared.  */
-            displaced_step_finish (ecs->event_thread, ecs->ws);
-            /* Start a new step-over in another thread if there's one
-               that needs it.  */
-            start_step_over ();
-
-            /* Since the vfork/fork syscall instruction was executed in the scratchpad,
-               the child's PC is also within the scratchpad.  Set the child's PC
-               to the parent's PC value, which has already been fixed up.
-               FIXME: we use the parent's aspace here, although we're touching
-               the child, because the child hasn't been added to the inferior
-               list yet at this point.  */
-
-            child_regcache
-              = get_thread_arch_aspace_regcache (parent_inf,
-                                                 ecs->ws.child_ptid (),
-                                                 gdbarch,
-                                                 parent_inf->aspace);
-            /* Read PC value of parent process.  */
-            parent_pc = regcache_read_pc (regcache);
-
-            displaced_debug_printf ("write child pc from %s to %s",
-                                    paddress (gdbarch,
-                                              regcache_read_pc (child_regcache)),
-                                    paddress (gdbarch, parent_pc));
-
-            regcache_write_pc (child_regcache, parent_pc);
-          }
-      }
+    case TARGET_WAITKIND_THREAD_CLONED:
+
+      displaced_step_finish (ecs->event_thread, ecs->ws);
+
+      /* Start a new step-over in another thread if there's one that
+         needs it.  */
+      start_step_over ();
 
       context_switch (ecs);
 
@@ -5930,7 +5933,7 @@ handle_inferior_event (struct execution_control_state *ecs)
          need to unpatch at follow/detach time instead to be certain
          that new breakpoints added between catchpoint hit time and
          vfork follow are detached.  */
-      if (ecs->ws.kind () != TARGET_WAITKIND_VFORKED)
+      if (ecs->ws.kind () == TARGET_WAITKIND_FORKED)
         {
           /* This won't actually modify the breakpoint list, but will
              physically remove the breakpoints from the child.  */
@@ -5962,14 +5965,24 @@ handle_inferior_event (struct execution_control_state *ecs)
       if (!bpstat_causes_stop (ecs->event_thread->control.stop_bpstat))
         {
           bool follow_child
-            = (follow_fork_mode_string == follow_fork_mode_child);
+            = (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED
+               && follow_fork_mode_string == follow_fork_mode_child);
 
           ecs->event_thread->set_stop_signal (GDB_SIGNAL_0);
 
           process_stratum_target *targ
             = ecs->event_thread->inf->process_target ();
 
-          bool should_resume = follow_fork ();
+          bool should_resume;
+          if (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED)
+            should_resume = follow_fork ();
+          else
+            {
+              should_resume = true;
+              inferior *inf = ecs->event_thread->inf;
+              inf->top_target ()->follow_clone (ecs->ws.child_ptid ());
+              ecs->event_thread->pending_follow.set_spurious ();
+            }
 
           /* Note that one of these may be an invalid pointer,
              depending on detach_fork.  */
@@ -5980,16 +5993,21 @@ handle_inferior_event (struct execution_control_state *ecs)
              child is marked stopped.  */
 
           /* If not resuming the parent, mark it stopped.  */
-          if (follow_child && !detach_fork && !non_stop && !sched_multi)
+          if (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED
+              && follow_child && !detach_fork && !non_stop && !sched_multi)
             parent->set_running (false);
 
           /* If resuming the child, mark it running.  */
-          if (follow_child || (!detach_fork && (non_stop || sched_multi)))
+          if (ecs->ws.kind () == TARGET_WAITKIND_THREAD_CLONED
+              || (follow_child || (!detach_fork && (non_stop || sched_multi))))
             child->set_running (true);
 
           /* In non-stop mode, also resume the other branch.  */
-          if (!detach_fork && (non_stop
-                               || (sched_multi && target_is_non_stop_p ())))
+          if ((ecs->ws.kind () == TARGET_WAITKIND_THREAD_CLONED
+               && target_is_non_stop_p ())
+              || (!detach_fork && (non_stop
+                                   || (sched_multi
+                                       && target_is_non_stop_p ()))))
             {
               if (follow_child)
                 switch_to_thread (parent);
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 97d80053c6f..da870e84922 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1280,69 +1280,85 @@ get_detach_signal (struct lwp_info *lp)
   return 0;
 }
 
-/* Detach from LP.  If SIGNO_P is non-NULL, then it points to the
-   signal number that should be passed to the LWP when detaching.
-   Otherwise pass any pending signal the LWP may have, if any.  */
+/* If LP has a pending fork/vfork/clone status, return it.  */
 
-static void
-detach_one_lwp (struct lwp_info *lp, int *signo_p)
+static gdb::optional<target_waitstatus>
+get_pending_child_status (lwp_info *lp)
 {
   LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
 
   linux_nat_debug_printf ("lwp %s (stopped = %d)",
                           lp->ptid.to_string ().c_str (), lp->stopped);
 
-  int lwpid = lp->ptid.lwp ();
-  int signo;
-
-  gdb_assert (lp->status == 0 || WIFSTOPPED (lp->status));
-
-  /* If the lwp/thread we are about to detach has a pending fork event,
-     there is a process GDB is attached to that the core of GDB doesn't know
-     about.  Detach from it.  */
-
   /* Check in lwp_info::status.  */
   if (WIFSTOPPED (lp->status) && linux_is_extended_waitstatus (lp->status))
     {
       int event = linux_ptrace_get_extended_event (lp->status);
 
-      if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
+      if (event == PTRACE_EVENT_FORK
+          || event == PTRACE_EVENT_VFORK
+          || event == PTRACE_EVENT_CLONE)
         {
           unsigned long child_pid;
           int ret = ptrace (PTRACE_GETEVENTMSG, lp->ptid.lwp (), 0, &child_pid);
           if (ret == 0)
-            detach_one_pid (child_pid, 0);
+            {
+              target_waitstatus ws;
+
+              if (event == PTRACE_EVENT_FORK)
+                ws.set_forked (ptid_t (child_pid, child_pid));
+              else if (event == PTRACE_EVENT_VFORK)
+                ws.set_vforked (ptid_t (child_pid, child_pid));
+              else if (event == PTRACE_EVENT_CLONE)
+                ws.set_thread_cloned (ptid_t (lp->ptid.pid (), child_pid));
+              else
+                gdb_assert_not_reached ("unhandled");
+
+              return ws;
+            }
           else
-            perror_warning_with_name (_("Failed to detach fork child"));
+            {
+              perror_warning_with_name (_("Failed to retrieve event msg"));
+              return {};
+            }
         }
     }
 
   /* Check in lwp_info::waitstatus.  */
-  if (lp->waitstatus.kind () == TARGET_WAITKIND_VFORKED
-      || lp->waitstatus.kind () == TARGET_WAITKIND_FORKED)
-    detach_one_pid (lp->waitstatus.child_ptid ().pid (), 0);
-
+  if (is_new_child_status (lp->waitstatus.kind ()))
+    return lp->waitstatus;
 
-  /* Check in thread_info::pending_waitstatus.  */
   thread_info *tp = linux_target->find_thread (lp->ptid);
 
-  if (tp->has_pending_waitstatus ())
-    {
-      const target_waitstatus &ws = tp->pending_waitstatus ();
-      if (ws.kind () == TARGET_WAITKIND_VFORKED
-          || ws.kind () == TARGET_WAITKIND_FORKED)
-        detach_one_pid (ws.child_ptid ().pid (), 0);
-    }
+  /* Check in thread_info::pending_waitstatus.  */
+  if (tp->has_pending_waitstatus ()
+      && is_new_child_status (tp->pending_waitstatus ().kind ()))
+    return tp->pending_waitstatus ();
 
   /* Check in thread_info::pending_follow.  */
-  if (tp->pending_follow.kind () == TARGET_WAITKIND_VFORKED
-      || tp->pending_follow.kind () == TARGET_WAITKIND_FORKED)
-    detach_one_pid (tp->pending_follow.child_ptid ().pid (), 0);
+  if (is_new_child_status (tp->pending_follow.kind ()))
+    return tp->pending_follow;
 
-  if (lp->status != 0)
-    linux_nat_debug_printf ("Pending %s for %s on detach.",
-                            strsignal (WSTOPSIG (lp->status)),
-                            lp->ptid.to_string ().c_str ());
+  return {};
+}
+
+/* Detach from LP.  If SIGNO_P is non-NULL, then it points to the
+   signal number that should be passed to the LWP when detaching.
+   Otherwise pass any pending signal the LWP may have, if any.  */
+
+static void
+detach_one_lwp (struct lwp_info *lp, int *signo_p)
+{
+  int lwpid = lp->ptid.lwp ();
+  int signo;
+
+  /* If the lwp/thread we are about to detach has a pending fork/clone
+     event, there is a process/thread GDB is attached to that the core
+     of GDB doesn't know about.  Detach from it.  */
+
+  gdb::optional<target_waitstatus> ws = get_pending_child_status (lp);
+  if (ws.has_value ())
+    detach_one_pid (ws->child_ptid ().lwp (), 0);
 
   /* If there is a pending SIGSTOP, get rid of it.  */
   if (lp->signalled)
@@ -1836,6 +1852,55 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
   return 1;
 }
 
+/* See target.h.  */
+
+void
+linux_nat_target::follow_clone (ptid_t child_ptid)
+{
+  lwp_info *new_lp = add_lwp (child_ptid);
+  new_lp->stopped = 1;
+
+  /* If the thread_db layer is active, let it record the user
+     level thread id and status, and add the thread to GDB's
+     list.  */
+  if (!thread_db_notice_clone (inferior_ptid, new_lp->ptid))
+    {
+      /* The process is not using thread_db.  Add the LWP to
+         GDB's list.  */
+      add_thread (linux_target, new_lp->ptid);
+    }
+
+  /* We just created NEW_LP so it cannot yet contain STATUS.  */
+  gdb_assert (new_lp->status == 0);
+
+  if (!pull_pid_from_list (&stopped_pids, child_ptid.lwp (), &new_lp->status))
+    internal_error (_("no saved status for clone lwp"));
+
+  if (WSTOPSIG (new_lp->status) != SIGSTOP)
+    {
+      /* This can happen if someone starts sending signals to
+         the new thread before it gets a chance to run, which
+         have a lower number than SIGSTOP (e.g. SIGUSR1).
+         This is an unlikely case, and harder to handle for
+         fork / vfork than for clone, so we do not try - but
+         we handle it for clone events here.  */
+
+      new_lp->signalled = 1;
+
+      /* Save the wait status to report later.  */
+      linux_nat_debug_printf
+        ("waitpid of new LWP %ld, saving status %s",
+         (long) new_lp->ptid.lwp (), status_to_str (new_lp->status).c_str ());
+    }
+  else
+    {
+      new_lp->status = 0;
+
+      if (report_thread_events)
+        new_lp->waitstatus.set_thread_created ();
+    }
+}
+
 /* Handle a GNU/Linux extended wait response.  If we see a clone
    event, we need to add the new LWP to our list (and not report the
    trap to higher layers).  This function returns non-zero if the
@@ -1876,11 +1941,9 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
             internal_error (_("wait returned unexpected status 0x%x"), status);
         }
 
-      ptid_t child_ptid (new_pid, new_pid);
-
       if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
         {
-          open_proc_mem_file (child_ptid);
+          open_proc_mem_file (ptid_t (new_pid, new_pid));
 
           /* The arch-specific native code may need to know about new
              forks even if those end up never mapped to an
@@ -1917,66 +1980,18 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
         }
 
       if (event == PTRACE_EVENT_FORK)
-        ourstatus->set_forked (child_ptid);
+        ourstatus->set_forked (ptid_t (new_pid, new_pid));
       else if (event == PTRACE_EVENT_VFORK)
-        ourstatus->set_vforked (child_ptid);
+        ourstatus->set_vforked (ptid_t (new_pid, new_pid));
       else if (event == PTRACE_EVENT_CLONE)
         {
-          struct lwp_info *new_lp;
-
-          ourstatus->set_ignore ();
-
           linux_nat_debug_printf
             ("Got clone event from LWP %d, new child is LWP %ld", pid, new_pid);
 
-          new_lp = add_lwp (ptid_t (lp->ptid.pid (), new_pid));
-          new_lp->stopped = 1;
-          new_lp->resumed = 1;
-
-          /* If the thread_db layer is active, let it record the user
-             level thread id and status, and add the thread to GDB's
-             list.  */
-          if (!thread_db_notice_clone (lp->ptid, new_lp->ptid))
-            {
-              /* The process is not using thread_db.  Add the LWP to
-                 GDB's list.  */
-              add_thread (linux_target, new_lp->ptid);
-            }
-
-          /* Even if we're stopping the thread for some reason
-             internal to this module, from the perspective of infrun
-             and the user/frontend, this new thread is running until
-             it next reports a stop.  */
-          set_running (linux_target, new_lp->ptid, true);
-          set_executing (linux_target, new_lp->ptid, true);
-
-          if (WSTOPSIG (status) != SIGSTOP)
-            {
-              /* This can happen if someone starts sending signals to
-                 the new thread before it gets a chance to run, which
-                 have a lower number than SIGSTOP (e.g. SIGUSR1).
- This is an unlikely case, and harder to handle for - fork / vfork than for clone, so we do not try - but - we handle it for clone events here. */ - - new_lp->signalled = 1; + /* Save the status again, we'll use it in follow_clone. */ + add_to_pid_list (&stopped_pids, new_pid, status); - /* We created NEW_LP so it cannot yet contain STATUS. */ - gdb_assert (new_lp->status == 0); - - /* Save the wait status to report later. */ - linux_nat_debug_printf - ("waitpid of new LWP %ld, saving status %s", - (long) new_lp->ptid.lwp (), status_to_str (status).c_str ()); - new_lp->status = status; - } - else if (report_thread_events) - { - new_lp->waitstatus.set_thread_created (); - new_lp->status = status; - } - - return 1; + ourstatus->set_thread_cloned (ptid_t (lp->ptid.pid (), new_pid)); } return 0; @@ -3562,59 +3577,56 @@ kill_wait_callback (struct lwp_info *lp) return 0; } -/* Kill the fork children of any threads of inferior INF that are - stopped at a fork event. */ +/* Kill the fork/clone child of LP if it has an unfollowed child. */ -static void -kill_unfollowed_fork_children (struct inferior *inf) +static int +kill_unfollowed_child_callback (lwp_info *lp) { - for (thread_info *thread : inf->non_exited_threads ()) + gdb::optional ws = get_pending_child_status (lp); + if (ws.has_value ()) { - struct target_waitstatus *ws = &thread->pending_follow; - - if (ws->kind () == TARGET_WAITKIND_FORKED - || ws->kind () == TARGET_WAITKIND_VFORKED) - { - ptid_t child_ptid = ws->child_ptid (); - int child_pid = child_ptid.pid (); - int child_lwp = child_ptid.lwp (); + ptid_t child_ptid = ws->child_ptid (); + int child_pid = child_ptid.pid (); + int child_lwp = child_ptid.lwp (); - kill_one_lwp (child_lwp); - kill_wait_one_lwp (child_lwp); + kill_one_lwp (child_lwp); + kill_wait_one_lwp (child_lwp); - /* Let the arch-specific native code know this process is - gone. 
*/ - linux_target->low_forget_process (child_pid); - } + /* Let the arch-specific native code know this process is + gone. */ + if (ws->kind () != TARGET_WAITKIND_THREAD_CLONED) + linux_target->low_forget_process (child_pid); } + + return 0; } void linux_nat_target::kill () { - /* If we're stopped while forking and we haven't followed yet, - kill the other task. We need to do this first because the + ptid_t pid_ptid (inferior_ptid.pid ()); + + /* If we're stopped while forking/cloning and we haven't followed + yet, kill the child task. We need to do this first because the parent will be sleeping if this is a vfork. */ - kill_unfollowed_fork_children (current_inferior ()); + iterate_over_lwps (pid_ptid, kill_unfollowed_child_callback); if (forks_exist_p ()) linux_fork_killall (); else { - ptid_t ptid = ptid_t (inferior_ptid.pid ()); - /* Stop all threads before killing them, since ptrace requires that the thread is stopped to successfully PTRACE_KILL. */ - iterate_over_lwps (ptid, stop_callback); + iterate_over_lwps (pid_ptid, stop_callback); /* ... and wait until all of them have reported back that they're no longer running. */ - iterate_over_lwps (ptid, stop_wait_callback); + iterate_over_lwps (pid_ptid, stop_wait_callback); /* Kill all LWP's ... */ - iterate_over_lwps (ptid, kill_callback); + iterate_over_lwps (pid_ptid, kill_callback); /* ... and wait until we've flushed all events. 
*/ - iterate_over_lwps (ptid, kill_wait_callback); + iterate_over_lwps (pid_ptid, kill_wait_callback); } target_mourn_inferior (inferior_ptid); diff --git a/gdb/linux-nat.h b/gdb/linux-nat.h index 770fe924427..1cdbeafd4f3 100644 --- a/gdb/linux-nat.h +++ b/gdb/linux-nat.h @@ -129,6 +129,8 @@ public: void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) override; + void follow_clone (ptid_t) override; + std::vector static_tracepoint_markers_by_strid (const char *id) override; diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c index 580fc768dd1..eae96e2daba 100644 --- a/gdb/target-delegates.c +++ b/gdb/target-delegates.c @@ -76,6 +76,7 @@ struct dummy_target : public target_ops int insert_vfork_catchpoint (int arg0) override; int remove_vfork_catchpoint (int arg0) override; void follow_fork (inferior *arg0, ptid_t arg1, target_waitkind arg2, bool arg3, bool arg4) override; + void follow_clone (ptid_t arg0) override; int insert_exec_catchpoint (int arg0) override; int remove_exec_catchpoint (int arg0) override; void follow_exec (inferior *arg0, ptid_t arg1, const char *arg2) override; @@ -251,6 +252,7 @@ struct debug_target : public target_ops int insert_vfork_catchpoint (int arg0) override; int remove_vfork_catchpoint (int arg0) override; void follow_fork (inferior *arg0, ptid_t arg1, target_waitkind arg2, bool arg3, bool arg4) override; + void follow_clone (ptid_t arg0) override; int insert_exec_catchpoint (int arg0) override; int remove_exec_catchpoint (int arg0) override; void follow_exec (inferior *arg0, ptid_t arg1, const char *arg2) override; @@ -1547,6 +1549,28 @@ debug_target::follow_fork (inferior *arg0, ptid_t arg1, target_waitkind arg2, bo gdb_puts (")\n", gdb_stdlog); } +void +target_ops::follow_clone (ptid_t arg0) +{ + this->beneath ()->follow_clone (arg0); +} + +void +dummy_target::follow_clone (ptid_t arg0) +{ + default_follow_clone (this, arg0); +} + +void +debug_target::follow_clone (ptid_t arg0) +{ + gdb_printf (gdb_stdlog, 
"-> %s->follow_clone (...)\n", this->beneath ()->shortname ()); + this->beneath ()->follow_clone (arg0); + gdb_printf (gdb_stdlog, "<- %s->follow_clone (", this->beneath ()->shortname ()); + target_debug_print_ptid_t (arg0); + gdb_puts (")\n", gdb_stdlog); +} + int target_ops::insert_exec_catchpoint (int arg0) { diff --git a/gdb/target.c b/gdb/target.c index f688ff33e3b..bf82649ed98 100644 --- a/gdb/target.c +++ b/gdb/target.c @@ -2685,6 +2685,13 @@ default_follow_fork (struct target_ops *self, inferior *child_inf, internal_error (_("could not find a target to follow fork")); } +static void +default_follow_clone (struct target_ops *self, ptid_t child_ptid) +{ + /* Some target returned a clone event, but did not know how to follow it. */ + internal_error (_("could not find a target to follow clone")); +} + /* See target.h. */ void diff --git a/gdb/target.h b/gdb/target.h index 68b269fb3e6..d4d81e727e9 100644 --- a/gdb/target.h +++ b/gdb/target.h @@ -642,6 +642,13 @@ struct target_ops TARGET_DEFAULT_RETURN (1); virtual void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) TARGET_DEFAULT_FUNC (default_follow_fork); + + /* Add CHILD_PTID to the thread list, after handling a + TARGET_WAITKIND_THREAD_CLONE event for the clone parent. The + parent is inferior_ptid. 
*/ + virtual void follow_clone (ptid_t child_ptid) + TARGET_DEFAULT_FUNC (default_follow_clone); + virtual int insert_exec_catchpoint (int) TARGET_DEFAULT_RETURN (1); virtual int remove_exec_catchpoint (int) diff --git a/gdb/target/waitstatus.c b/gdb/target/waitstatus.c index 2b8404fb75b..a8edbb17d60 100644 --- a/gdb/target/waitstatus.c +++ b/gdb/target/waitstatus.c @@ -45,6 +45,7 @@ DIAGNOSTIC_ERROR_SWITCH case TARGET_WAITKIND_FORKED: case TARGET_WAITKIND_VFORKED: + case TARGET_WAITKIND_THREAD_CLONED: return string_appendf (str, ", child_ptid = %s", this->child_ptid ().to_string ().c_str ()); diff --git a/gdb/target/waitstatus.h b/gdb/target/waitstatus.h index 4d23f1cbff4..3d3a0cf9d02 100644 --- a/gdb/target/waitstatus.h +++ b/gdb/target/waitstatus.h @@ -95,6 +95,13 @@ enum target_waitkind /* There are no resumed children left in the program. */ TARGET_WAITKIND_NO_RESUMED, + /* The thread was cloned. The event's ptid corresponds to the + cloned parent. The cloned child is held stopped at its entry + point, and its ptid is in the event's m_child_ptid. The target + must not add the cloned child to GDB's thread list until + target_ops::follow_clone() is called. */ + TARGET_WAITKIND_THREAD_CLONED, + /* The thread was created. */ TARGET_WAITKIND_THREAD_CREATED, @@ -102,6 +109,17 @@ enum target_waitkind TARGET_WAITKIND_THREAD_EXITED, }; +/* Determine if KIND represents an event with a new child - a fork, + vfork, or clone. */ + +static inline bool +is_new_child_status (target_waitkind kind) +{ + return (kind == TARGET_WAITKIND_FORKED + || kind == TARGET_WAITKIND_VFORKED + || kind == TARGET_WAITKIND_THREAD_CLONED); +} + /* Return KIND as a string. 
*/ static inline const char * @@ -125,6 +143,8 @@ DIAGNOSTIC_ERROR_SWITCH return "FORKED"; case TARGET_WAITKIND_VFORKED: return "VFORKED"; + case TARGET_WAITKIND_THREAD_CLONED: + return "THREAD_CLONED"; case TARGET_WAITKIND_EXECD: return "EXECD"; case TARGET_WAITKIND_VFORK_DONE: @@ -325,6 +345,14 @@ struct target_waitstatus return *this; } + target_waitstatus &set_thread_cloned (ptid_t child_ptid) + { + this->reset (); + m_kind = TARGET_WAITKIND_THREAD_CLONED; + m_value.child_ptid = child_ptid; + return *this; + } + target_waitstatus &set_thread_created () { this->reset (); @@ -369,8 +397,7 @@ struct target_waitstatus ptid_t child_ptid () const { - gdb_assert (m_kind == TARGET_WAITKIND_FORKED - || m_kind == TARGET_WAITKIND_VFORKED); + gdb_assert (is_new_child_status (m_kind)); return m_value.child_ptid; } diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.c b/gdb/testsuite/gdb.threads/stepi-over-clone.c new file mode 100644 index 00000000000..12909161c4c --- /dev/null +++ b/gdb/testsuite/gdb.threads/stepi-over-clone.c @@ -0,0 +1,90 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2021-2023 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include +#include +#include +#include +#include + +/* Set this to non-zero from GDB to start a third worker thread.
*/ +volatile int start_third_thread = 0; + +void * +thread_worker_2 (void *arg) +{ + int i; + + printf ("Hello from the third thread.\n"); + fflush (stdout); + + for (i = 0; i < 300; ++i) + sleep (1); + + return NULL; +} + +void * +thread_worker_1 (void *arg) +{ + int i; + pthread_t thr; + void *val; + + if (start_third_thread) + pthread_create (&thr, NULL, thread_worker_2, NULL); + + printf ("Hello from the first thread.\n"); + fflush (stdout); + + for (i = 0; i < 300; ++i) + sleep (1); + + if (start_third_thread) + pthread_join (thr, &val); + + return NULL; +} + +void * +thread_idle_loop (void *arg) +{ + int i; + + for (i = 0; i < 300; ++i) + sleep (1); + + return NULL; +} + +int +main () +{ + pthread_t thr, thr_idle; + void *val; + + if (getenv ("MAKE_EXTRA_THREAD") != NULL) + pthread_create (&thr_idle, NULL, thread_idle_loop, NULL); + + pthread_create (&thr, NULL, thread_worker_1, NULL); + pthread_join (thr, &val); + + if (getenv ("MAKE_EXTRA_THREAD") != NULL) + pthread_join (thr_idle, &val); + + return 0; +} diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.exp b/gdb/testsuite/gdb.threads/stepi-over-clone.exp new file mode 100644 index 00000000000..e580f2248ac --- /dev/null +++ b/gdb/testsuite/gdb.threads/stepi-over-clone.exp @@ -0,0 +1,395 @@ +# Copyright 2021-2023 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>.
+ +# Test performing a 'stepi' over a clone syscall instruction. + +# This test relies on us being able to spot syscall instructions in +# disassembly output. For now this is only implemented for x86-64. +require {istarget x86_64-*-*} + +# Test only on native targets, for now. +proc is_native_target {} { + return [expr {[target_info gdb_protocol] == ""}] +} +require is_native_target + +standard_testfile + +if { [prepare_for_testing "failed to prepare" $testfile $srcfile \ + {debug pthreads additional_flags=-static}] } { + return +} + +if {![runto_main]} { + return +} + +# Arrange to catch the 'clone' syscall, run until we catch the +# syscall, and try to figure out the address of the actual syscall +# instruction so we can place a breakpoint at this address. + +gdb_test_multiple "catch syscall group:process" "catch process syscalls" { + -re "The feature \'catch syscall\' is not supported.*\r\n$gdb_prompt $" { + unsupported $gdb_test_name + return + } + -re ".*$gdb_prompt $" { + pass $gdb_test_name + } +} + +gdb_test "continue" \ + "Catchpoint $decimal \\(call to syscall clone\[23\]\\), .*" + +# Return true if INSN is a syscall instruction. + +proc is_syscall_insn { insn } { + if [istarget x86_64-*-* ] { + return { $insn == "syscall" } + } else { + error "port me" + } +} + +# A list of addresses with syscall instructions. +set syscall_addrs {} + +# Get list of addresses with syscall instructions. +gdb_test_multiple "disassemble" "" { + -re "Dump of assembler code for function \[^\r\n\]+:\r\n" { + exp_continue + } + -re "^(?:=>)?\\s+(${hex})\\s+<\\+${decimal}>:\\s+(\[^\r\n\]+)\r\n" { + set addr $expect_out(1,string) + set insn [string trim $expect_out(2,string)] + if [is_syscall_insn $insn] { + verbose -log "Found a syscall at: $addr" + lappend syscall_addrs $addr + } + exp_continue + } + -re "^End of assembler dump\\.\r\n$gdb_prompt $" { + if { [llength $syscall_addrs] == 0 } { + unsupported "no syscalls found" + return -1 + } + } +} + +# The test proc. 
NON_STOP and DISPLACED are either 'on' or 'off', and are +# used to configure how GDB starts up. THIRD_THREAD is either true or false, +# and is used to configure the inferior. +proc test {non_stop displaced third_thread} { + global binfile srcfile + global syscall_addrs + global GDBFLAGS + global gdb_prompt hex decimal + + for { set i 0 } { $i < 3 } { incr i } { + with_test_prefix "i=$i" { + + # Arrange to start GDB in the correct mode. + save_vars { GDBFLAGS } { + append GDBFLAGS " -ex \"set non-stop $non_stop\"" + append GDBFLAGS " -ex \"set displaced $displaced\"" + clean_restart $binfile + } + + runto_main + + # Setup breakpoints at all the syscall instructions we + # might hit. Only issue one pass/fail to make tests more + # comparable between systems. + set test "break at syscall insns" + foreach addr $syscall_addrs { + if {[gdb_test -nopass "break *$addr" \ + ".*" \ + $test] != 0} { + return + } + } + # If we got here, all breakpoints were set successfully. + # We used -nopass above, so issue a pass now. + pass $test + + # Continue until we hit the syscall. + gdb_test "continue" + + if { $third_thread } { + gdb_test_no_output "set start_third_thread=1" + } + + set stepi_error_count 0 + set stepi_new_thread_count 0 + set thread_1_stopped false + set thread_2_stopped false + set seen_prompt false + set hello_first_thread false + + # The program is now stopped at main, but if testing + # against GDBserver, inferior_spawn_id is GDBserver's + # spawn_id, and the GDBserver output emitted before the + # program stopped isn't flushed unless we explicitly do + # so, because it is on a different spawn_id. We could try + # flushing it now, to avoid confusing the following tests, + # but that would have to be done under a timeout, and + # would thus slow down the testcase. 
Instead, if inferior + # output goes to a different spawn id, then we don't need + # to wait for the first message from the inferior with an + # anchor, as we know consuming inferior output won't + # consume GDB output. OTOH, if inferior output is coming + # out on GDB's terminal, then we must use an anchor, + # otherwise matching inferior output without one could + # consume GDB output that we are waiting for in regular + # expressions that are written after the inferior output + # regular expression match. + if {$::inferior_spawn_id != $::gdb_spawn_id} { + set anchor "" + } else { + set anchor "^" + } + + gdb_test_multiple "stepi" "" { + -re "^stepi\r\n" { + verbose -log "XXX: Consume the initial command" + exp_continue + } + -re "^\\\[New Thread\[^\r\n\]+\\\]\r\n" { + verbose -log "XXX: Consume new thread line" + incr stepi_new_thread_count + exp_continue + } + -re "^\\\[Switching to Thread\[^\r\n\]+\\\]\r\n" { + verbose -log "XXX: Consume switching to thread line" + exp_continue + } + -re "^\\s*\r\n" { + verbose -log "XXX: Consume blank line" + exp_continue + } + + -i $::inferior_spawn_id + + -re "${anchor}Hello from the first thread\\.\r\n" { + set hello_first_thread true + + verbose -log "XXX: Consume first worker thread message" + if { $third_thread } { + # If we are going to start a third thread then GDB + # should hit the breakpoint in clone before printing + # this message. + incr stepi_error_count + } + if { !$seen_prompt } { + exp_continue + } + } + -re "^Hello from the third thread\\.\r\n" { + # We should never see this message. + verbose -log "XXX: Consume third worker thread message" + incr stepi_error_count + if { !$seen_prompt } { + exp_continue + } + } + + -i $::gdb_spawn_id + + -re "^$hex in clone\[23\]? 
\\(\\)\r\n" { + verbose -log "XXX: Consume stop location line" + set thread_1_stopped true + if { !$seen_prompt } { + verbose -log "XXX: Continuing to look for the prompt" + exp_continue + } + } + -re "^$gdb_prompt " { + verbose -log "XXX: Consume the final prompt" + gdb_assert { $stepi_error_count == 0 } + gdb_assert { $stepi_new_thread_count == 1 } + set seen_prompt true + if { $third_thread } { + if { $non_stop } { + # In non-stop mode if we are trying to start a + # third thread (from the second thread), then the + # second thread should hit the breakpoint in clone + # before actually starting the third thread. And + # so, at this point both thread 1, and thread 2 + # should now be stopped. + if { !$thread_1_stopped || !$thread_2_stopped } { + verbose -log "XXX: Continue looking for an additional stop event" + exp_continue + } + } else { + # All-stop mode. Something should have stopped + # by now otherwise we shouldn't have a prompt, but + # we can't know which thread will have stopped as + # that is a race condition. + gdb_assert { $thread_1_stopped || $thread_2_stopped } + } + } + + if {$non_stop && !$hello_first_thread} { + exp_continue + } + + } + -re "^Thread 2\[^\r\n\]+ hit Breakpoint $decimal, $hex in clone\[23\]? \\(\\)\r\n" { + verbose -log "XXX: Consume thread 2 hit breakpoint" + set thread_2_stopped true + if { !$seen_prompt } { + verbose -log "XXX: Continuing to look for the prompt" + exp_continue + } + } + -re "^PC register is not available\r\n" { + # This is the error we'd see for remote targets. + verbose -log "XXX: Consume error line" + incr stepi_error_count + exp_continue + } + -re "^Couldn't get registers: No such process\\.\r\n" { + # This is the error we'd see for native linux + # targets. + verbose -log "XXX: Consume error line" + incr stepi_error_count + exp_continue + } + } + + # Ensure we are back at a GDB prompt, resynchronise.
+ verbose -log "XXX: Have completed scanning the 'stepi' output" + gdb_test "p 1 + 2 + 3" " = 6" + + # Check the number of threads we have, it should be exactly two. + set thread_count 0 + set bad_threads 0 + + # Build up our expectations for what the current thread state + # should be. Thread 1 is the easiest, this is the thread we are + # stepping, so this thread should always be stopped, and should + # always still be in clone. + set match_code {} + lappend match_code { + -re "\\*?\\s+1\\s+Thread\[^\r\n\]+clone\[23\]? \\(\\)\r\n" { + incr thread_count + exp_continue + } + } + + # What state should thread 2 be in? + if { $non_stop == "on" } { + if { $third_thread } { + # With non-stop mode on, and creation of a third thread + # having been requested, we expect Thread 2 to exist, and + # be stopped at the breakpoint in clone (just before the + # third thread is actually created). + lappend match_code { + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+$hex in clone\[23\]? \\(\\)\r\n" { + incr thread_count + exp_continue + } + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" { + incr thread_count + incr bad_threads + exp_continue + } + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" { + verbose -log "XXX: thread 2 is bad, unknown state" + incr thread_count + incr bad_threads + exp_continue + } + } + + } else { + # With non-stop mode on, and no third thread having been + # requested, then we expect Thread 2 to exist, and still + # be running. + lappend match_code { + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" { + incr thread_count + exp_continue + } + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" { + verbose -log "XXX: thread 2 is bad, unknown state" + incr thread_count + incr bad_threads + exp_continue + } + } + } + } else { + # With non-stop mode off then we expect Thread 2 to exist, and + # be stopped. We don't have any guarantee about where the + # thread will have stopped though, so we need to be vague. 
+ lappend match_code { + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" { + verbose -log "XXX: thread 2 is bad, unexpectedly running" + incr thread_count + incr bad_threads + exp_continue + } + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+_start\[^\r\n\]+\r\n" { + # We know that the thread shouldn't be stopped + # at _start, though. This is the location of + # the scratch pad on Linux at the time of + # writing. + verbose -log "XXX: thread 2 is bad, stuck in scratchpad" + incr thread_count + incr bad_threads + exp_continue + } + -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" { + incr thread_count + exp_continue + } + } + } + + # We don't expect to ever see a thread 3. Even when we are + # requesting that this third thread be created, thread 2, the + # thread that creates thread 3, should stop before executing the + # clone syscall. So, if we do ever see this then something has + # gone wrong. + lappend match_code { + -re "\\s+3\\s+Thread\[^\r\n\]+\r\n" { + incr thread_count + incr bad_threads + exp_continue + } + } + + lappend match_code { + -re "$gdb_prompt $" { + gdb_assert { $thread_count == 2 } + gdb_assert { $bad_threads == 0 } + } + } + + set match_code [join $match_code] + gdb_test_multiple "info threads" "" $match_code + } + } +} + +# Run the test in all suitable configurations. +foreach_with_prefix third_thread { false true } { + foreach_with_prefix non-stop { "on" "off" } { + foreach_with_prefix displaced { "off" "on" } { + test ${non-stop} ${displaced} ${third_thread} + } + } +}
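
Note for readers of the gdb/target/waitstatus.h hunks above: the new wait-status plumbing can be sketched in isolation roughly as follows. This is a simplified stand-in and not GDB code. The two-field `ptid_t`, the `waitstatus_sketch` struct and the reduced enumerator list are assumptions made for illustration, and `assert` stands in for `gdb_assert`. Only `is_new_child_status`, `set_thread_cloned` and `child_ptid` follow the shape of what the patch adds.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for GDB's ptid_t.
struct ptid_t
{
  int pid;
  long lwp;
};

// Reduced subset of the enumerators touched by the patch.
enum target_waitkind
{
  TARGET_WAITKIND_STOPPED,
  TARGET_WAITKIND_FORKED,
  TARGET_WAITKIND_VFORKED,
  TARGET_WAITKIND_THREAD_CLONED,
  TARGET_WAITKIND_THREAD_CREATED,
};

// Same shape as the helper added to waitstatus.h: true for the
// event kinds that carry a new child ptid.
static inline bool
is_new_child_status (target_waitkind kind)
{
  return (kind == TARGET_WAITKIND_FORKED
	  || kind == TARGET_WAITKIND_VFORKED
	  || kind == TARGET_WAITKIND_THREAD_CLONED);
}

// Hypothetical stand-in for target_waitstatus; the real class keeps
// the child ptid in a union and resets the old value first.
struct waitstatus_sketch
{
  target_waitkind kind = TARGET_WAITKIND_STOPPED;
  ptid_t child {0, 0};

  // Record a clone event for the parent, remembering the child.
  waitstatus_sketch &set_thread_cloned (ptid_t child_ptid)
  {
    kind = TARGET_WAITKIND_THREAD_CLONED;
    child = child_ptid;
    return *this;
  }

  // Only valid for fork, vfork and clone events.
  ptid_t child_ptid () const
  {
    assert (is_new_child_status (kind));
    return child;
  }
};
```

A caller in this sketch would do something like `waitstatus_sketch ws; ws.set_thread_cloned (ptid_t {100, 101});` and then read `ws.child_ptid ()` only after checking `is_new_child_status (ws.kind)`, which is the condition the patch's `gdb_assert` enforces.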