Optimize .gdb_index symbol name searching
As mentioned in the previous patch, .gdb_index name lookup got
significantly slower with the previous patch.
This patch addresses that, and in the process makes .gdb_index name
searching faster than what we had before the previous patch, even.
Using the same test:
$ cat script.cmd
set pagination off
set $count = 0
while $count < 400
complete b string_prin
printf "count = %d\n", $count
set $count = $count + 1
end
$ time gdb --batch -q ./gdb-with-index -ex "source script.cmd"
I got, before the previous patch (-O2, x86-64):
real 0m1.773s
user 0m1.737s
sys 0m0.040s
and after this patch:
real 0m1.361s
user 0m1.315s
sys 0m0.040s
The basic idea here is simple: instead of always iterating over all
the symbol names in the index, we build an accelerator/sorted name
table and binary search names in it.
Later in the series, we'll want to support wild matching for C++ too,
so this mechanism already considers that. For example, say that
you're looking up functions/methods named "func", no matter the
containing namespace/class. If we sorted the table by qualified name,
then we obviously wouldn't be able to find those symbols with a binary
search:
func
ns1::a::b::func
ns1::b::func
ns2::func
(function symbol names in .gdb_index have no parameter info, like psymbols)
To address that out, we put an entry for each name component in the
sorted table. something like this:
Table Entry Actual symbol
---------------------------------
func func
func ns1::a::b::func
b::func ns1::a::b::func
a::b::func ns1::a::b::func
ns1::a::b::func ns1::a::b::func
func ns1::b::func
b::func ns1::b::func
ns1::b::func ns1::b::func
func ns2::func
ns2::func ns2::func
Which sorted results in this:
Table Entry Actual symbol
---------------------------------
a::b::func ns1::a::b::func
b::func ns1::a::b::func
b::func ns1::b::func
func func
func ns1::a::b::func
func ns1::b::func
func ns2::func
ns1::a::b::func ns1::a::b::func
ns1::b::func ns1::b::func
ns2::func ns2::func
And we can binary search this.
Note that a binary search approach works for both completion and
regular lookup, while a name hashing approach only works for normal
symbol looking, since obviously "fun" and "func" have different
hashes.
At first I was a bit wary of these tables potentially growing GDB's
memory significantly. But I did an experiment that convinced it's not
a worry at all. I hacked gdb to count the total number of entries in
all the tables, attached that gdb to my system/Fedora's Firefox
(Fedora's debug packages uses .gdb_index), did "set max-completions
unlimited", and then hit "b [TAB]" to cause everything to expand.
That resulted in
1351355 name_components. Each entry takes 8 bytes,
so that's
10810840 bytes (ignoring std::vector overhead), or ~10.3 MB.
That's IMO too small to worry about, given GDB was using over 7400MB
total at that point. I.e., we're talking about 0.1% increase.
dw2_expand_symtabs_matching unit tests covering this will be added in
a follow up patch.
If the size of this table turns out to be a concern, I have an idea to
reduce the size of the table further at the expense of a bit more code
-- the vast majority of the name offsets are either 0 or fit in
8-bits:
total name_component =
1351355, of which,
name_component::name_offset instances need 0 bits = 679531
name_component::name_offset instances need 8 bits = 669526
name_component::name_offset instances need 16 bits = 2298
name_component::name_offset instances need 32 bits = 0
name_component::idx instances need 0 bits = 51
name_component::idx instances need 8 bits = 8361
name_component::idx instances need 16 bits = 280329
name_component::idx instances need 32 bits =
1062614
so we could have separate tables for 0 name_offset, 8-bit name_offset
and 32-bit name_offset. That'd give us roughly:
679531 * 0 + 669526 * 1 + 2298 * 4 +
1062614 * 4 =
4929174, or ~4.7MB
with only 8-bit and 32-bit tables, that'd be:
1349057 * 1 + 2298 * 4 + 4 *
1351355 =
6763669 bytes, or ~6.5MB.
I don't think we need to bother though.
I also timed:
$ time gdb --batch -q -p `pidof firefox`
$ time gdb --batch -q -p `pidof firefox` -ex "b main"
$ time gdb --batch -q -p `pidof firefox` -ex "set max-completion unlimited" -ex "complete b "
and compared before previous patch vs this patch, and I didn't see a
significant difference, seemingly because time to read debug info
dominates. The "complete b " variant of the test takes ~2min
currently... (I have a follow up series that speeds that up
somewhat.)
gdb/ChangeLog:
2017-11-08 Pedro Alves <palves@redhat.com>
* dwarf2read.c (byte_swap, MAYBE_SWAP): Move higher up in file.
(struct name_component): New.
(mapped_index::name_components): New field.
(mapped_index::symbol_name_at): New method.
(dwarf2_read_index): Call mapped_index ctor.
(dw2_map_matching_symbols): Add comment about name_components
table.
(dw2_expand_symtabs_matching): Factor part to...
(dw2_expand_symtabs_matching_symbol): ... this new function.
Build name components table, and lookup symbols in it before
calling the name matcher.
(dw2_expand_marked_cus): New, factored out from
dw2_expand_symtabs_matching.
(dwarf2_per_objfile_free): Call the mapped_index's dtor.