gdb/dwarf: add assertion in maybe_queue_comp_unit
The symptom that leads to this is the crash described in PR 26828:
/home/simark/src/binutils-gdb/gdb/dwarf2/read.c:23478:25: runtime error: member access within null pointer of type 'struct dwarf2_cu'
The line of the crash is the following, in follow_die_offset:
if (target_cu != cu)
target_cu->ancestor = cu; <--- HERE
The line that assign nullptr to `target_cu` is the `per_objfile->get_cu`
call after having called maybe_queue_comp_unit:
/* If necessary, add it to the queue and load its DIEs. */
if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language))
load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
false, cu->language);
target_cu = per_objfile->get_cu (per_cu); <--- HERE
Some background: there is an invariant, documented in
maybe_queue_comp_unit's doc, that if a CU is queued for expansion
(present in dwarf2_per_bfd::queue), then its DIEs are loaded in memory.
"its DIEs are loaded in memory" is a synonym for saying that a dwarf2_cu
object exists for this CU. Yet another way to say it is that
`per_objfile->get_cu (per_cu)` returns something not nullptr for that
CU.
The crash documented in PR 26828 triggers some hard-to-reproduce
sequence that ends up violating the invariant:
- dwarf2_fetch_die_type_sect_off gets called for a DIE in CU A
- The DIE in CU A requires some DIE in CU B
- follow_die_offset calls maybe_queue_comp_unit. maybe_queue_comp_unit
sees CU B is not queued and its DIEs are not loaded, so it enqueues it
and returns 1 to its caller - meaning "the DIEs are not loaded, you
should load them" - prompting follow_die_offset to load the DIEs by
calling load_full_comp_unit
- Note that CU B is enqueued by maybe_queue_comp_unit even if it has
already been expanded. It's a bit useless (and causes trouble, see
next patch), but that's how it works right now.
- Since we entered the dwarf2/read code through
dwarf2_fetch_die_type_sect_off, nothing processes the queue, so we
exit the dwarf2/read code with CU B still lingering in the queue.
- dwarf2_fetch_die_type_sect_off gets called for a DIE in CU A, again
- The DIE in CU A requires some DIE in CU B, again
- This time, maybe_queue_comp_unit sees that CU B is in the queue.
Because of the invariant that if a CU is in the queue, its DIEs are
loaded in the memory, it returns 0 to its caller, meaning "you don't
need to load the DIEs!".
- That happens to be true, so everything is fine for now.
- Time passes, some things call dwarf2_per_objfile::age_comp_units
enough so that CU B's age becomes past the dwarf_max_cache_age
threshold. age_comp_units proceeds to free CU B's DIEs. Remember
that CU B is still lingering in the queue (oops, the invariant just
got violated).
- dwarf2_fetch_die_type_sect_off gets called for a DIE in CU A, again
- The DIE in CU A requires some DIE in CU B, again
- maybe_queue_comp_unit sees that CU B is in the queue, so returns to
its caller "you don't need to load the DIEs!". However, we know at
this point this is false.
- follow_die_offset doesn't load the DIEs and tries to obtain the DIEs for
CU B:
target_cu = per_objfile->get_cu (per_cu);
But since they are not loaded, target_cu is nullptr, and we get the
crash mentioned above a few lines after that.
This patch adds an assertions in maybe_queue_comp_unit to verify the
invariant, to make sure it doesn't return a falsehood to its caller.
The current patch doesn't fix the issue (the next patch does), but it
makes it so we catch the problem earlier and get this assertion failure
instead of a segmentation fault:
/home/simark/src/binutils-gdb/gdb/dwarf2/read.c:9100: internal-error:
int maybe_queue_comp_unit(dwarf2_cu*, dwarf2_per_cu_data*, dwarf2_per_objfile*, language):
Assertion `per_objfile->get_cu (per_cu) != nullptr' failed.
gdb/ChangeLog:
PR gdb/26828
* dwarf2/read.c (maybe_queue_comp_unit): Add assertion.
Change-Id: I4e51bd7bd58773f9fadf480179cbc4bae61508fe