gdb: infrun: consume multiple events at each pass in stop_all_threads
[Simon: I'm sending this patch on behalf of Laurent Morichetti; I added
the commit message and the performance measurements.
Also, this patch is better viewed with "git show -w".]
stop_all_threads, in infrun.c, is used to stop all running threads on
targets that are always non-stop. It's used, for example, when the
program hits a breakpoint while GDB is set to "non-stop off". It sends
a stop request for each running thread, then collects one wait event for
each.
Since new threads can spawn while we are stopping the threads, the
function makes multiple such "send stop requests to running threads &
collect wait events" passes.  It completes once it has made two
consecutive passes in which it hasn't seen any running thread.
The way it's written right now, it iterates over the thread list,
sending a stop request for each running thread.  It then waits for a
single event, after which it iterates through the thread list again,
sending stop requests for any running threads that have been created
since the last iteration.  It then consumes another single wait event,
and so on.
This makes it so we iterate over O(n^2) threads in total, where n is
the number of threads.  This patch changes the function to reduce that
to O(n).  This starts to have an impact when dealing with several
thousand threads (see the numbers below).  At each pass, we know how
many outstanding stop requests we have sent, for which we still need to
collect a stop event.  We can therefore loop to collect that many stop
events before proceeding to the next pass and iterating over the thread
list again.
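As an illustration of the two loop shapes, here is a small
self-contained model (not GDB code; count_iterations and everything in
it are made up for this example).  It assumes n idle threads plus one
already-stopped main thread, and that every wait event stops exactly
one thread, and it counts how many times the per-thread loop body runs
until two consecutive passes see no running thread:

  #include <cstdio>

  /* Toy model of the two strategies, not GDB code.  If DRAIN_ALL is
     false, consume a single wait event per pass (current behavior); if
     true, consume one event per stop request sent during the pass
     (patched behavior).  Returns the number of per-thread iterations.  */
  static long
  count_iterations (int n_threads, bool drain_all)
  {
    int executing = n_threads;  /* idle threads still running */
    long scans = 0;             /* executions of the per-thread loop body */
    int pass = 0;

    while (pass < 2)
      {
        /* One pass over the thread list (n idle threads plus the stopped
           main thread), sending stop requests to running threads.  */
        for (int t = 0; t < n_threads + 1; t++)
          scans++;

        /* Collect wait events: all outstanding ones, or just one.  */
        int waits = drain_all ? executing : (executing > 0 ? 1 : 0);
        executing -= waits;

        /* A pass in which we waited restarts the count of quiet passes.  */
        pass = waits > 0 ? 0 : pass + 1;
      }

    return scans;
  }

  int
  main ()
  {
    for (int n = 1000; n <= 10000; n += 1000)
      printf ("%6d threads: one event per pass = %ld, "
              "all events per pass = %ld\n",
              n, count_iterations (n, false), count_iterations (n, true));
    return 0;
  }

Under these assumptions, the "one event per pass" counts grow
quadratically while the "all events per pass" counts grow linearly,
which is the same shape as the loop counts measured below.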
To check the performance improvements with this patch, I made an
x86/Linux program with a large number of idle threads (varying from 1000
to 10000). The program's main thread hits a breakpoint once all these
threads have started, which causes stop_all_threads to be called to stop
all these threads. I measured (by patching stop_all_threads):
- the execution time of stop_all_threads
- the total number of threads we iterate on during the complete
execution of the function (the total number of times we execute the
"for (thread_info *t : all_non_exited_threads ())" loop)
These are the execution times, in milliseconds:
# threads    before    after
     1000       226      106
     2000       997      919
     3000      3461     2323
     4000      4330     3570
     5000      8642     6600
     6000      9918     8039
     7000     12662    10930
     8000     16652    11222
     9000     21561    15875
    10000     26613    20019
Note that I very unscientifically executed each case only once.
These are the number of loop executions:
# threads       before    after
     1000      1003002     3003
     2000      4006002     6003
     3000      9009002     9003
     4000     16012002    12003
     5000     25015002    15003
     6000     36018002    18003
     7000     49021002    21003
     8000     64024002    24003
     9000     81027002    27003
    10000    100030002    30003
This last table shows the O(n^2) vs O(n) behaviors pretty well.
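(Incidentally, the "before" counts are consistent with
(n + 1) * (n + 2) thread-list iterations and the "after" counts with
3 * (n + 1): the n idle threads plus the main thread get scanned once
per collected event plus two final quiet passes in the former case,
versus three passes in total in the latter.)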
Reg-tested on x86 GNU/Linux (Ubuntu 16.04).
gdb/ChangeLog:
YYYY-MM-DD Laurent Morichetti <Laurent.Morichetti@amd.com>
YYYY-MM-DD Simon Marchi <simon.marchi@efficios.com>
* infrun.c (stop_all_threads): Collect multiple wait events at
each pass.