testsuite: tcl exec& -> 'kill -9 $pid' is racy (attach-many-short-lived-thread.exp...
authorPedro Alves <palves@redhat.com>
Fri, 31 Jul 2015 19:06:24 +0000 (20:06 +0100)
committerPedro Alves <palves@redhat.com>
Fri, 31 Jul 2015 19:06:24 +0000 (20:06 +0100)
The buildbots show that attach-many-short-lived-thread.exp is racy.
But after staring at debug logs and playing with SystemTap scripts for
a (long) while, I figured out that neither GDB, nor the kernel nor the
test's program itself are at fault.

The problem is simply that the testsuite machinery is currently
subject to PID-reuse races.  The attach-many-short-lived-threads.c
test program just happens to be much more susceptible to trigger this
race because threads and processes share the same number space on
Linux, and the test spawns many many short lived threads in
succession, thus enlarging the race window a lot.

Part of the problem is that several tests spawn processes with "exec&"
(in order to test the "attach" command) , and then at the end of the
test, to make sure things are cleaned up, issue a 'remote_spawn "kill
-p $testpid"'.  Since with tcl's "exec&", tcl itself is responsible
for reaping the process's exit status, when we go kill the process,
testpid may have already exited _and_ its status may have (and often
has) been reaped already.  Thus it can happen that another process
meanwhile reuses $testpid, and that "kill" command kills the wrong
process...  Frequently, that happens to be
attach-many-short-lived-thread, but this explains other test's races
as well.

In the attach-many-short-lived-threads test, it sometimes manifests
like this:

 (gdb) file /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads
 Reading symbols from /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads...done.
 (gdb)           Loaded /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads into /home/pedro/gdb/mygit/build/gdb/testsuite/../../gdb/gdb
 attach 5940
 Attaching to program: /home/pedro/gdb/mygit/build/gdb/testsuite/gdb.threads/attach-many-short-lived-threads, process 5940
 warning: process 5940 is a zombie - the process has already terminated
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 ptrace: Operation not permitted.
 (gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: attach
 info threads
 No threads.
 (gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: no new threads
 set breakpoint always-inserted on
 (gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: set breakpoint always-inserted on

Other times the process dies while the test is ongoing (the process is
ptrace-stopped):

 (gdb) print again = 1
 Cannot access memory at address 0x6020cc
 (gdb) FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 2: reset timer in the inferior

(Recall that on Linux, SIGKILL is not interceptable)

And other times it dies just while we're detaching:

 $4 = 319
 (gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 2: print seconds_left
 detach
 Can't detach Thread 0x7fb13b7de700 (LWP 1842): No such process
 (gdb) FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 2: detach

GDB mishandles the latter (it should ignore ESRCH while detaching just
like when continuing), but that's another story.

The fix here is to change spawn_wait_for_attach to use Expect's
'spawn' command instead of Tcl's 'exec&' to spawn programs, because
with spawn we control when to wait for/reap the process.  That allows
killing the process by PID without being subject to pid-reuse races,
because even if the process is already dead, the kernel won't reuse
the process's PID until the zombie is reaped.

The other part of the problem lies in DejaGnu itself, unfortunately.
I have occasionally seen tests (attach-many-short-lived-threads
included, but not only that one) die with a random inexplicable
SIGTERM too, and that too is caused by the same reason, except that in
that case, the rogue SIGTERM is sent from this bit in DejaGnu's remote.exp:

    exec sh -c "exec > /dev/null 2>&1 && (kill -2 $pgid || kill -2 $pid) && sleep 5 && (kill $pgid || kill $pid) && sleep 5 && (kill -9 $pgid || kill -9 $pid) &"
    ...
    catch "wait -i $shell_id"

Even if the program exits promptly, that whole cascade of kills
carries on in the background, thus potentially killing the poor
process that manages to reuse $pid...

I sent a fix for that to the DejaGnu list:
 http://lists.gnu.org/archive/html/dejagnu/2015-07/msg00000.html

With both patches in place, I haven't seen
attach-many-short-lived-threads.exp fail again.

Tested on x86_64 Fedora 20, native, gdbserver and extended-gdbserver.

gdb/testsuite/ChangeLog:
2015-07-31  Pedro Alves  <palves@redhat.com>

* gdb.base/attach-pie-misread.exp: Rename $res to $test_spawn_id.
Use spawn_id_get_pid.  Wait for spawn id after eof.  Use
kill_wait_spawned_process instead of explicit "kill -9".
* gdb.base/attach-pie-noexec.exp: Adjust to spawn_wait_for_attach
returning a spawn id instead of a pid.  Use spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/attach-twice.exp: Likewise.
* gdb.base/attach.exp: Likewise.
(do_command_attach_tests): Use gdb_spawn_with_cmdline_opts and
gdb_test_multiple.
* gdb.base/solib-overlap.exp: Adjust to spawn_wait_for_attach
returning a spawn id instead of a pid.  Use spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/valgrind-infcall.exp: Likewise.
* gdb.multi/multi-attach.exp: Likewise.
* gdb.python/py-prompt.exp: Likewise.
* gdb.python/py-sync-interp.exp: Likewise.
* gdb.server/ext-attach.exp: Likewise.
* gdb.threads/attach-into-signal.exp (corefunc): Use
spawn_wait_for_attach, spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.threads/attach-many-short-lived-threads.exp: Adjust to
spawn_wait_for_attach returning a spawn id instead of a pid.  Use
spawn_id_get_pid and kill_wait_spawned_process.
* gdb.threads/attach-stopped.exp (corefunc): Use
spawn_wait_for_attach, spawn_id_get_pid and
kill_wait_spawned_process.
* gdb.base/break-interp.exp: Rename $res to $test_spawn_id.
Use spawn_id_get_pid.  Wait for spawn id after eof.  Use
kill_wait_spawned_process instead of explicit "kill -9".
* lib/gdb.exp (can_spawn_for_attach): Adjust comment.
(kill_wait_spawned_process, spawn_id_get_pid): New procedures.
(spawn_wait_for_attach): Use spawn instead of exec to spawn
processes.  Don't map cygwin/windows pids here.  Now returns a
spawn id list.

16 files changed:
gdb/testsuite/ChangeLog
gdb/testsuite/gdb.base/attach-pie-misread.exp
gdb/testsuite/gdb.base/attach-pie-noexec.exp
gdb/testsuite/gdb.base/attach-twice.exp
gdb/testsuite/gdb.base/attach.exp
gdb/testsuite/gdb.base/break-interp.exp
gdb/testsuite/gdb.base/solib-overlap.exp
gdb/testsuite/gdb.base/valgrind-infcall.exp
gdb/testsuite/gdb.multi/multi-attach.exp
gdb/testsuite/gdb.python/py-prompt.exp
gdb/testsuite/gdb.python/py-sync-interp.exp
gdb/testsuite/gdb.server/ext-attach.exp
gdb/testsuite/gdb.threads/attach-into-signal.exp
gdb/testsuite/gdb.threads/attach-many-short-lived-threads.exp
gdb/testsuite/gdb.threads/attach-stopped.exp
gdb/testsuite/lib/gdb.exp

index fa999c316f35bfc3aeb6ba5b491e1b8d7e4f3e59..e11260c6a617b43b0657e734e94fe2120419cd7b 100644 (file)
@@ -1,3 +1,41 @@
+2015-07-31  Pedro Alves  <palves@redhat.com>
+
+       * gdb.base/attach-pie-misread.exp: Rename $res to $test_spawn_id.
+       Use spawn_id_get_pid.  Wait for spawn id after eof.  Use
+       kill_wait_spawned_process instead of explicit "kill -9".
+       * gdb.base/attach-pie-noexec.exp: Adjust to spawn_wait_for_attach
+       returning a spawn id instead of a pid.  Use spawn_id_get_pid and
+       kill_wait_spawned_process.
+       * gdb.base/attach-twice.exp: Likewise.
+       * gdb.base/attach.exp: Likewise.
+       (do_command_attach_tests): Use gdb_spawn_with_cmdline_opts and
+       gdb_test_multiple.
+       * gdb.base/solib-overlap.exp: Adjust to spawn_wait_for_attach
+       returning a spawn id instead of a pid.  Use spawn_id_get_pid and
+       kill_wait_spawned_process.
+       * gdb.base/valgrind-infcall.exp: Likewise.
+       * gdb.multi/multi-attach.exp: Likewise.
+       * gdb.python/py-prompt.exp: Likewise.
+       * gdb.python/py-sync-interp.exp: Likewise.
+       * gdb.server/ext-attach.exp: Likewise.
+       * gdb.threads/attach-into-signal.exp (corefunc): Use
+       spawn_wait_for_attach, spawn_id_get_pid and
+       kill_wait_spawned_process.
+       * gdb.threads/attach-many-short-lived-threads.exp: Adjust to
+       spawn_wait_for_attach returning a spawn id instead of a pid.  Use
+       spawn_id_get_pid and kill_wait_spawned_process.
+       * gdb.threads/attach-stopped.exp (corefunc): Use
+       spawn_wait_for_attach, spawn_id_get_pid and
+       kill_wait_spawned_process.
+       * gdb.base/break-interp.exp: Rename $res to $test_spawn_id.
+       Use spawn_id_get_pid.  Wait for spawn id after eof.  Use
+       kill_wait_spawned_process instead of explicit "kill -9".
+       * lib/gdb.exp (can_spawn_for_attach): Adjust comment.
+       (kill_wait_spawned_process, spawn_id_get_pid): New procedures.
+       (spawn_wait_for_attach): Use spawn instead of exec to spawn
+       processes.  Don't map cygwin/windows pids here.  Now returns a
+       spawn id list.
+
 2015-07-30  Sandra Loosemore  <sandra@codesourcery.com>
 
        * gdb.cp/var-tag.exp (do_global_tests): Revert broken commit
index 679c05040fe01bc9ccb490f1a09d69422b191ee1..5f09cd94c77a40d3d4ad2dab8f6c8d5d2ceb71d6 100644 (file)
@@ -126,25 +126,25 @@ if {$first_offset == 0} {
 set test "start inferior"
 gdb_exit
 
-set res [remote_spawn host $binfile]
-if { $res < 0 || $res == "" } {
+set test_spawn_id [remote_spawn host $binfile]
+if { $test_spawn_id < 0 || $test_spawn_id == "" } {
     perror "Spawning $binfile failed."
     fail $test
     return
 }
-set pid [exp_pid -i $res]
+set testpid [spawn_id_get_pid $test_spawn_id]
 gdb_expect {
     -re "sleeping\r\n" {
        pass $test
     }
     eof {
        fail "$test (eof)"
-       remote_exec host "kill -9 $pid"
+       wait -i $test_spawn_id
        return
     }
     timeout {
        fail "$test (timeout)"
-       remote_exec host "kill -9 $pid"
+       kill_wait_spawned_process $test_spawn_id
        return
     }
 }
@@ -176,8 +176,8 @@ foreach align_mult {1 2} { with_test_prefix "shift-by-$align_mult" {
     clean_restart $executable
 
     set test "attach"
-    gdb_test_multiple "attach $pid" $test {
-       -re "Attaching to program: .*, process $pid\r\n" {
+    gdb_test_multiple "attach $testpid" $test {
+       -re "Attaching to program: .*, process $testpid\r\n" {
            # Missing "$gdb_prompt $" is intentional.
            pass $test
        }
@@ -196,4 +196,4 @@ foreach align_mult {1 2} { with_test_prefix "shift-by-$align_mult" {
     gdb_test "detach" "Detaching from program: .*"
 }}
 
-remote_exec host "kill -9 $pid"
+kill_wait_spawned_process $test_spawn_id
index 30a2f4d9de9d6b47dcac8da0448f18ec1515a351..fc53b2d8705ecd67c0bdf868d3aec415ef4f1e64 100644 (file)
@@ -55,7 +55,8 @@ if {$arch == ""} {
 # Start the program running and then wait for a bit, to be sure
 # that it can be attached to.
 
-set testpid [spawn_wait_for_attach $binfile]
+set test_spawn_id [spawn_wait_for_attach $binfile]
+set testpid [spawn_id_get_pid $test_spawn_id]
 
 gdb_start
 file delete -- $binfile
@@ -63,4 +64,4 @@ gdb_test "attach $testpid" "Attaching to process $testpid\r\n.*: No such file or
 gdb_test "set architecture $arch" "The target architecture is assumed to be $arch"
 gdb_test "info shared" "From\[ \t\]+To\[ \t\]+Syms Read\[ \t\]+Shared Object Library\r\n0x.*"
 
-eval exec kill -9 $testpid
+kill_wait_spawned_process $test_spawn_id
index f6a9eb6f5ee443596ca17b648f6f101a9b09e779..380c80eff0a7fcc607bd2aac0de550269ad99fef 100644 (file)
@@ -27,7 +27,8 @@ if { [prepare_for_testing ${testfile}.exp $executable] } {
 # Start the program running and then wait for a bit, to be sure
 # that it can be attached to.
 
-set testpid [spawn_wait_for_attach $binfile]
+set test_spawn_id [spawn_wait_for_attach $binfile]
+set testpid [spawn_id_get_pid $test_spawn_id]
 
 set parentpid 0
 
@@ -48,4 +49,4 @@ gdb_test_multiple "attach $testpid" $test {
 if {$parentpid != 0} {
   eval exec kill -9 $parentpid
 }
-eval exec kill -9 $testpid
+kill_wait_spawned_process $test_spawn_id
index f2ebe3a4c562451400f09c83eaadf4c1caa6582e..bf0d84effbcfdc6c8aaab59686d5b7a146c18c50 100644 (file)
@@ -82,7 +82,8 @@ proc do_attach_tests {} {
     # Start the program running and then wait for a bit, to be sure
     # that it can be attached to.
 
-    set testpid [spawn_wait_for_attach $binfile]
+    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
 
     # Verify that we cannot attach to nonsense.
 
@@ -279,10 +280,11 @@ proc do_attach_tests {} {
     # the next test run (and prevent the compile by keeping
     # the text file busy), in case the "set should_exit" didn't
     # work.
-   
-    remote_exec build "kill -9 ${testpid}"
 
-    set testpid [spawn_wait_for_attach $binfile]
+    kill_wait_spawned_process $test_spawn_id
+
+    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
 
     # Verify that we can attach to the process, and find its a.out
     # when we're cd'd to some directory that doesn't contain the
@@ -322,14 +324,15 @@ proc do_attach_tests {} {
        "y"
     
     # Another "don't leave a process around"
-    remote_exec build "kill -9 ${testpid}"
+    kill_wait_spawned_process $test_spawn_id
 }
 
 proc do_call_attach_tests {} {
     global gdb_prompt
     global binfile2
     
-    set testpid [spawn_wait_for_attach $binfile2]
+    set test_spawn_id [spawn_wait_for_attach $binfile2]
+    set testpid [spawn_id_get_pid $test_spawn_id]
 
     # Attach
    
@@ -366,7 +369,7 @@ proc do_call_attach_tests {} {
    
     # Be paranoid
    
-    remote_exec build "kill -9 ${testpid}"
+    kill_wait_spawned_process $test_spawn_id
 }
 
 proc do_command_attach_tests {} {
@@ -382,27 +385,21 @@ proc do_command_attach_tests {} {
        return 0
     }
 
-    set testpid [spawn_wait_for_attach $binfile]
+    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
 
     gdb_exit
-    if $verbose>1 then {
-        send_user "Spawning $GDB $INTERNAL_GDBFLAGS $GDBFLAGS --pid=$testpid\n"
-    }
 
-    eval "spawn $GDB $INTERNAL_GDBFLAGS $GDBFLAGS --pid=$testpid"
+    set res [gdb_spawn_with_cmdline_opts "--pid=$testpid"]
     set test "starting with --pid"
-    expect {
+    gdb_test_multiple "" $test {
        -re "Reading symbols from.*$gdb_prompt $" {
            pass "$test"
        }
-       timeout {
-           fail "$test (timeout)"
-       }
     }
 
     # Get rid of the process
-
-    remote_exec build "kill -9 ${testpid}"
+    kill_wait_spawned_process $test_spawn_id
 }
 
 # Test ' gdb --pid PID -ex "run" '.  GDB used to have a bug where
@@ -418,7 +415,8 @@ proc test_command_line_attach_run {} {
     }
 
     with_test_prefix "cmdline attach run" {
-       set testpid [spawn_wait_for_attach $binfile]
+       set test_spawn_id [spawn_wait_for_attach $binfile]
+       set testpid [spawn_id_get_pid $test_spawn_id]
 
        set test "run to prompt"
        gdb_exit
@@ -427,7 +425,7 @@ proc test_command_line_attach_run {} {
                     "-iex \"set height 0\" -iex \"set width 0\" --pid=$testpid -ex \"start\""]
        if { $res != 0} {
            fail $test
-           remote_exec build "kill -9 ${testpid}"
+           kill_wait_spawned_process $test_spawn_id
            return $res
        }
        gdb_test_multiple "" $test {
@@ -446,7 +444,7 @@ proc test_command_line_attach_run {} {
        }
 
        # Get rid of the process
-       remote_exec build "kill -9 ${testpid}"
+       kill_wait_spawned_process $test_spawn_id
     }
 }
 
index aca2c024dd34b8b30c91698894381e8bf2386852..59411591b1d1b20fc8025425ad59bac0c325f249 100644 (file)
@@ -308,23 +308,25 @@ proc test_attach {file displacement {relink_args ""}} {
     set test "sleep function started"
 
     set command "${file} sleep"
-    set res [remote_spawn host $command]
-    if { $res < 0 || $res == "" } {
+    set test_spawn_id [remote_spawn host $command]
+    if { $test_spawn_id < 0 || $test_spawn_id == "" } {
        perror "Spawning $command failed."
        fail $test
        return
     }
-    set pid [exp_pid -i $res]
+    set pid [spawn_id_get_pid $test_spawn_id]
     gdb_expect {
        -re "sleeping\r\n" {
            pass $test
        }
        eof {
            fail "$test (eof)"
+           wait -i $test_spawn_id
            return
        }
        timeout {
            fail "$test (timeout)"
+           kill_wait_spawned_process $test_spawn_id
            return
        }
     }
@@ -364,7 +366,7 @@ proc test_attach {file displacement {relink_args ""}} {
        file_copy $interp_saved $interp
     }
 
-    remote_exec host "kill -9 $pid"
+    kill_wait_spawned_process $test_spawn_id
 }
 
 proc test_ld {file ifmain trynosym displacement} {
index ad96d02b82e28e93a443659dd4d7c38fe0a9f9f9..95a1849edad5136bddcfb678d7cc49c9cfad7694 100644 (file)
@@ -81,7 +81,8 @@ foreach prelink_lib1 {0x40000000 0x50000000} { with_test_prefix "$prelink_lib1"
        return -1
     }
 
-    set testpid [spawn_wait_for_attach $binfile]
+    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
 
     remote_exec build "mv -f ${binfile_lib1} ${binfile_lib1}-running"
     remote_exec build "mv -f ${binfile_lib2} ${binfile_lib2}-running"
@@ -92,7 +93,7 @@ foreach prelink_lib1 {0x40000000 0x50000000} { with_test_prefix "$prelink_lib1"
     if { [gdb_compile_shlib ${srcfile_lib} ${binfile_lib1} $lib_flags] != ""
         || [gdb_compile_shlib ${srcfile_lib} ${binfile_lib2} $lib_flags] != ""} {
        untested "Could not recompile ${binfile_lib1_test_msg} or ${binfile_lib2_test_msg}."
-       remote_exec build "kill -9 ${testpid}"
+       kill_wait_spawned_process $test_spawn_id
        return -1
     }
 
@@ -123,5 +124,5 @@ foreach prelink_lib1 {0x40000000 0x50000000} { with_test_prefix "$prelink_lib1"
 
     sleep 5
 
-    remote_exec build "kill -9 ${testpid}"
+    kill_wait_spawned_process $test_spawn_id
 }}
index 3d3fdd3c587c7713205b576eafd4b681677285c2..b835c67698d7dca7ddf77a969e846d579de20604 100644 (file)
@@ -76,7 +76,7 @@ gdb_test_multiple "" $test {
 }
 
 # Do not kill valgrind.
-set valgrind_pid [exp_pid -i [board_info host fileid]]
+set valgrind_spawn_id [board_info host fileid]
 unset gdb_spawn_id
 set board [host_info name]
 unset_board_info fileid
@@ -99,13 +99,13 @@ while {$loop && $continue_count < 100} {
        -re "Remote connection closed.*\r\n$gdb_prompt $" {
            fail "$test (remote connection closed)"
            # Only if valgrind got stuck.
-           remote_exec host "kill -9 ${valgrind_pid}"
+           kill_wait_spawned_process $valgrind_spawn_id
            return -1
        }
        -re "The program is not being run\\.\r\n$gdb_prompt $" {
            fail "$test (valgrind vgdb has terminated)"
            # Only if valgrind got stuck.
-           remote_exec host "kill -9 ${valgrind_pid}"
+           kill_wait_spawned_process $valgrind_spawn_id
            return -1
        }
        -re "\r\n$gdb_prompt $" {
@@ -126,4 +126,4 @@ gdb_test_multiple $test $test {
 }
 
 # Only if valgrind got stuck.
-remote_exec host "kill -9 ${valgrind_pid}"
+kill_wait_spawned_process $valgrind_spawn_id
index 8a7bb089fbcb2dcec4cdbfac6904f12352599932..cd89d3eb947ebea3a621870dca9cd7c1b32f59ac 100644 (file)
@@ -30,9 +30,11 @@ if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug additiona
 # Start the programs running and then wait for a bit, to be sure that
 # they can be attached to.
 
-set pid_list [spawn_wait_for_attach [list $binfile $binfile]]
-set testpid1 [lindex $pid_list 0]
-set testpid2 [lindex $pid_list 1]
+set spawn_id_list [spawn_wait_for_attach [list $binfile $binfile]]
+set test_spawn_id1 [lindex $spawn_id_list 0]
+set test_spawn_id2 [lindex $spawn_id_list 1]
+set testpid1 [spawn_id_get_pid $test_spawn_id1]
+set testpid2 [spawn_id_get_pid $test_spawn_id2]
 
 gdb_test "attach $testpid1" \
     "Attaching to program: .*, process $testpid1.*(in|at).*" \
@@ -52,3 +54,6 @@ gdb_test "backtrace" ".*main.*" "backtrace 2"
 gdb_test "kill" "" "kill inferior 2" "Kill the program being debugged.*" "y"
 gdb_test "inferior 1" ".*Switching to inferior 1.*"
 gdb_test "kill" "" "kill inferior 1" "Kill the program being debugged.*" "y"
+
+kill_wait_spawned_process $test_spawn_id1
+kill_wait_spawned_process $test_spawn_id2
index 0f7a3a4bb11c05523de7a845a491b168bd6e8bf2..6f22c5b4f7532e7babbc41703f08aa776fa8b380 100644 (file)
@@ -80,7 +80,8 @@ if {![can_spawn_for_attach]} {
     return 0
 }
 
-set testpid [spawn_wait_for_attach $binfile]
+set test_spawn_id [spawn_wait_for_attach $binfile]
+set testpid [spawn_id_get_pid $test_spawn_id]
 
 set GDBFLAGS [concat $tmp_gdbflags " -ex \"set pagination off\""]
 set GDBFLAGS [concat $GDBFLAGS " -ex \"set editing on\""]
@@ -115,5 +116,5 @@ gdb_test "python print (\"'\" + str(p\[0\]) + \"'\")" "'$gdb_prompt_fail '" \
 gdb_exit
 
 set GDBFLAGS $saved_gdbflags
-exec kill -9 ${testpid}
+kill_wait_spawned_process $test_spawn_id
 return 0
index 0ea31104be9f171047bca8b26d2b02e873b76a5a..46c1562bee9d230c52682ae9d9589f3aa4f8f6d1 100644 (file)
@@ -40,7 +40,8 @@ if { [prepare_for_testing ${testfile}.exp ${testfile} ${srcfile}] } {
     return -1
 }
 
-set testpid [spawn_wait_for_attach $binfile]
+set test_spawn_id [spawn_wait_for_attach $binfile]
+set testpid [spawn_id_get_pid $test_spawn_id]
 
 # Test command 'where' is executed when command 'attach' is done, otherwise
 # function 'sleep' may not show up in backtrace.
@@ -48,3 +49,5 @@ set testpid [spawn_wait_for_attach $binfile]
 gdb_test "python gdb.execute(\"attach $testpid\"); gdb.execute(\"where\")" \
     "in .*sleep \\(.*\\) .* in foo1 \\(\\) at .* in foo2 \\(\\) at .*" \
     "attach and where"
+
+kill_wait_spawned_process $test_spawn_id
index f3cbcf65e76adca64a7cf914b89cda947ff42422..75724e1aa4d10f2b1f9fe9cc84ddde8501234128 100644 (file)
@@ -43,7 +43,8 @@ gdbserver_start_extended
 
 gdb_test_no_output "set remote exec-file $target_exec" "set remote exec-file"
 
-set testpid [spawn_wait_for_attach $binfile]
+set test_spawn_id [spawn_wait_for_attach $binfile]
+set testpid [spawn_id_get_pid $test_spawn_id]
 
 gdb_test "attach $testpid" \
     "Attaching to program: .*, process $testpid.*(in|at).*" \
@@ -69,3 +70,5 @@ gdb_test "backtrace" ".*main.*" "backtrace 2"
 
 gdb_test "kill" "" "kill" "Kill the program being debugged.*" "y"
 gdb_test_no_output "monitor exit"
+
+kill_wait_spawned_process $test_spawn_id
index 17fd8f254a6210e184572a6de350e34cfb3dbf30..d82b9997fd7429c0095a05aaf5b21642740b474f 100644 (file)
@@ -44,12 +44,8 @@ proc corefunc { threadtype executable } {
 
        gdb_test "handle SIGALRM stop print pass" "Yes.*Yes.*Yes.*"
 
-       # Start the program running and then wait for a bit, to be sure
-       # that it can be attached to.
-       # Statistically there is a better chance without giving process a nice.
-
-       set testpid [eval exec $binfile &]
-       exec sleep 2
+       set test_spawn_id [spawn_wait_for_attach $binfile]
+       set testpid [spawn_id_get_pid $test_spawn_id]
 
        # Run 2 passes of the test.
        # The C file inferior stops pending its signals if a single one is lost,
@@ -146,16 +142,11 @@ proc corefunc { threadtype executable } {
        # Exit and detach the process.
        gdb_exit
 
-       # Make sure we don't leave a process around to confuse the
-       # next test run (and prevent the compile by keeping the text
-       # file busy), in case the "set should_exit" didn't work.
-
        # Continue the program - some Linux kernels need it before -9 if the
        # process is stopped.
        remote_exec build "kill -s CONT ${testpid}"
 
-       remote_exec build "kill -9 ${testpid}"
-
+       kill_wait_spawned_process $test_spawn_id
     }
 }
 
index 2c2236da2e9588e53dec6ca752e029d547dd396b..84e005ffcb57d3014ddb5dcc0a00b5fbbba72ef0 100644 (file)
@@ -36,7 +36,8 @@ proc test {} {
 
     clean_restart ${binfile}
 
-    set testpid [spawn_wait_for_attach $binfile]
+    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
 
     set attempts 10
     for {set attempt 1} { $attempt <= $attempts } { incr attempt } {
@@ -127,8 +128,7 @@ proc test {} {
            delete_breakpoints
        }
     }
-
-    remote_exec target "kill -9 ${testpid}"
+    kill_wait_spawned_process $test_spawn_id
 }
 
 # The test program exits after a while, in case GDB crashes.  Make it
index ac35ac5570d6027cdf4f4b7b4dcd964b724aa134..9a7edc54185e2d6ba0f20caf44ff69ee4e2758ae 100644 (file)
@@ -44,13 +44,8 @@ proc corefunc { threadtype } {
        return -1
     }
 
-    # Start the program running and then wait for a bit, to be sure
-    # that it can be attached to.
-
-    set testpid [eval exec $binfile &]
-
-    # Avoid some race:
-    sleep 2
+    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
 
     # Stop the program 
     remote_exec build "kill -s STOP ${testpid}"
@@ -86,7 +81,7 @@ proc corefunc { threadtype } {
     # the Linux kernel version.  The behavior is not tested as it is not
     # dependent on GDB.
 
-    remote_exec build "kill -9 ${testpid}"
+    kill_wait_spawned_process $test_spawn_id
 }
 
 # build the test case first without threads
index e3faf181daa338a78dc6b8ca7486f80a0a4f9d8c..986d92079463165dd7c33e7ffe79a43df42ab6b5 100644 (file)
@@ -3706,7 +3706,9 @@ proc gdb_exit { } {
 # it.
 
 proc can_spawn_for_attach { } {
-    # We use TCL's exec to get the inferior's pid.
+    # We use exp_pid to get the inferior's pid, assuming that gives
+    # back the pid of the program.  On remote boards, that would give
+    # us instead the PID of e.g., the ssh client, etc.
     if [is_remote target] then {
        return 0
     }
@@ -3722,12 +3724,50 @@ proc can_spawn_for_attach { } {
     return 1
 }
 
+# Kill a progress previously started with spawn_wait_for_attach, and
+# reap its wait status.  PROC_SPAWN_ID is the spawn id associated with
+# the process.
+
+proc kill_wait_spawned_process { proc_spawn_id } {
+    set pid [exp_pid -i $proc_spawn_id]
+
+    verbose -log "killing ${pid}"
+    remote_exec build "kill -9 ${pid}"
+
+    verbose -log "closing ${proc_spawn_id}"
+    catch "close -i $proc_spawn_id"
+    verbose -log "waiting for ${proc_spawn_id}"
+
+    # If somehow GDB ends up still attached to the process here, a
+    # blocking wait hangs until gdb is killed (or until gdb / the
+    # ptracer reaps the exit status too, but that won't happen because
+    # something went wrong.)  Passing -nowait makes expect tell Tcl to
+    # wait for the PID in the background.  That's fine because we
+    # don't care about the exit status.  */
+    wait -nowait -i $proc_spawn_id
+}
+
+# Returns the process id corresponding to the given spawn id.
+
+proc spawn_id_get_pid { spawn_id } {
+    set testpid [exp_pid -i $spawn_id]
+
+    if { [istarget "*-*-cygwin*"] } {
+       # testpid is the Cygwin PID, GDB uses the Windows PID, which
+       # might be different due to the way fork/exec works.
+       set testpid [ exec ps -e | gawk "{ if (\$1 == $testpid) print \$4; }" ]
+    }
+
+    return $testpid
+}
+
 # Start a set of programs running and then wait for a bit, to be sure
-# that they can be attached to.  Return a list of the processes' PIDs.
-# It's a test error to call this when [can_spawn_for_attach] is false.
+# that they can be attached to.  Return a list of processes spawn IDs,
+# one element for each process spawned.  It's a test error to call
+# this when [can_spawn_for_attach] is false.
 
 proc spawn_wait_for_attach { executable_list } {
-    set pid_list {}
+    set spawn_id_list {}
 
     if ![can_spawn_for_attach] {
        # The caller should have checked can_spawn_for_attach itself
@@ -3736,22 +3776,16 @@ proc spawn_wait_for_attach { executable_list } {
     }
 
     foreach {executable} $executable_list {
-       lappend pid_list [eval exec $executable &]
+       # Note we use Expect's spawn, not Tcl's exec, because with
+       # spawn we control when to wait for/reap the process.  That
+       # allows killing the process by PID without being subject to
+       # pid-reuse races.
+       lappend spawn_id_list [remote_spawn target $executable]
     }
 
     sleep 2
 
-    if { [istarget "*-*-cygwin*"] } {
-       for {set i 0} {$i < [llength $pid_list]} {incr i} {
-           # testpid is the Cygwin PID, GDB uses the Windows PID,
-           # which might be different due to the way fork/exec works.
-           set testpid [lindex $pid_list $i]
-           set testpid [ exec ps -e | gawk "{ if (\$1 == $testpid) print \$4; }" ]
-           set pid_list [lreplace $pid_list $i $i $testpid]
-       }
-    }
-
-    return $pid_list
+    return $spawn_id_list
 }
 
 #