Andrew Burgess [Fri, 10 Feb 2023 23:49:19 +0000 (23:49 +0000)]
GDB: Introduce limited array lengths while printing values
This commit introduces the idea of loading only part of an array in
order to print it, what I call "limited length" arrays.
The motivation behind this work is to make it possible to print slices
of very large arrays, where very large means bigger than
`max-value-size'.
Consider this GDB session with the current GDB:
(gdb) set max-value-size 100
(gdb) p large_1d_array
value requires 400 bytes, which is more than max-value-size
(gdb) p -elements 10 -- large_1d_array
value requires 400 bytes, which is more than max-value-size
notice that the request to print 10 elements still fails, even though 10
elements should be less than the max-value-size. With a patched version
of GDB:
(gdb) p -elements 10 -- large_1d_array
$1 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9...}
So now the print has succeeded. It also has loaded `max-value-size'
worth of data into value history, so the recorded value can be accessed
consistently:
(gdb) p -elements 10 -- $1
$2 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9...}
(gdb) p $1
$3 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, <unavailable> <repeats 75 times>}
(gdb)
Accesses with other languages work similarly, although for Ada only
C-style [] array element/dimension accesses use history. For both Ada
and Fortran () array element/dimension accesses go straight to the
inferior, bypassing the value history just as with C pointers.
Co-Authored-By: Maciej W. Rozycki <macro@embecosm.com>
Maciej W. Rozycki [Fri, 10 Feb 2023 23:49:19 +0000 (23:49 +0000)]
GDB/testsuite: Add `-nonl' option to `gdb_test'
Add a `-nonl' option to `gdb_test' making it possible to match output
from commands such as `output' that do not produce a new line sequence
at the end, e.g.:
(gdb) output 0
0(gdb)
Maciej W. Rozycki [Fri, 10 Feb 2023 23:49:19 +0000 (23:49 +0000)]
GDB: Only make data actually retrieved into value history available
While it makes sense to allow accessing out-of-bounds elements in the
debuggee and see whatever there might happen to be there in memory (we
are a debugger and not a programming rules enforcement facility and we
want to make people's life easier in chasing bugs), e.g.:
(gdb) print one_hundred[-1]
$1 = 0
(gdb) print one_hundred[100]
$2 = 0
(gdb)
we shouldn't really pretend that we have any meaningful data around
values recorded in history (what these commands really retrieve are
current debuggee memory contents outside the original data accessed,
really confusing in my opinion). Mark values recorded in history as
such then and verify accesses to be in-range for them:
(gdb) print one_hundred[-1]
$1 = <unavailable>
(gdb) print one_hundred[100]
$2 = <unavailable>
Add a suitable test case, which also covers integer overflows in data
location calculation.
Approved-By: Tom Tromey <tom@tromey.com>
Maciej W. Rozycki [Fri, 10 Feb 2023 23:49:19 +0000 (23:49 +0000)]
GDB: Fix the mess with value byte/bit range types
Consistently use the LONGEST and ULONGEST types for value byte/bit
offsets and lengths respectively, avoiding silent truncation for ranges
exceeding the 32-bit span, which may cause incorrect matching. Also
report a conversion overflow on byte ranges that cannot be expressed in
terms of bits with these data types, e.g.:
(gdb) print one_hundred[1LL << 58]
Integer overflow in data location calculation
(gdb) print one_hundred[(-1LL << 58) - 1]
Integer overflow in data location calculation
(gdb)
Previously such accesses would be let through with unpredictable results
produced.
Maciej W. Rozycki [Fri, 10 Feb 2023 23:49:19 +0000 (23:49 +0000)]
GDB: Ignore `max-value-size' setting with value history accesses
We have an inconsistency in value history accesses where array element
accesses cause an error for entries exceeding the currently selected
`max-value-size' setting even where such accesses successfully complete
for elements located in the inferior, e.g.:
(gdb) p/d one
$1 = 0
(gdb) p/d one_hundred
$2 = {0 <repeats 100 times>}
(gdb) p/d one_hundred[99]
$3 = 0
(gdb) set max-value-size 25
(gdb) p/d one_hundred
value requires 100 bytes, which is more than max-value-size
(gdb) p/d one_hundred[99]
$7 = 0
(gdb) p/d $2
value requires 100 bytes, which is more than max-value-size
(gdb) p/d $2[99]
value requires 100 bytes, which is more than max-value-size
(gdb)
According to our documentation the `max-value-size' setting is a safety
guard against allocating an overly large amount of memory. Moreover a
statement in documentation says, concerning this setting, that: "Setting
this variable does not affect values that have already been allocated
within GDB, only future allocations." While in the implementer-speak
the sentence may be unambiguous I think the outside user may well infer
that the setting does not apply to values previously printed.
Therefore rather than just fixing this inconsistency it seems reasonable
to lift the setting for value history accesses, under an implication
that by having been retrieved from the debuggee they have already passed
the safety check. Do it then, by suppressing the value size check in
`value_copy' -- under an observation that if the original value has been
already loaded (i.e. it's not lazy), then it must have previously passed
said check -- making the last two commands succeed:
(gdb) p/d $2
$8 = {0 <repeats 100 times>}
(gdb) p/d $2 [99]
$9 = 0
(gdb)
Expand the testsuite accordingly, covering both value history handling
and the use of `value_copy' by `make_cv_value', used by Python code.
Maciej W. Rozycki [Fri, 10 Feb 2023 23:49:19 +0000 (23:49 +0000)]
GDB: Switch to using C++ standard integer type limits
Use <climits> instead of <limits.h> and ditch local fallback definitions
for minimum and maximum value macros provided by C++11. Add LONGEST_MAX
and LONGEST_MIN definitions.
Approved-By: Tom Tromey <tom@tromey.com>
Tom Tromey [Fri, 10 Feb 2023 18:59:03 +0000 (11:59 -0700)]
Ensure all DAP requests are keyword-only
Python functions implementing DAP requests should not use positional
parameters -- it only makes sense to call them with keyword arguments.
This patch changes the few remaining cases to start with the special
"*" parameter, following this rule.
Simon Marchi [Tue, 17 Jan 2023 16:33:39 +0000 (11:33 -0500)]
gdb/testsuite: fix gdb.gdb/selftest.exp for native-extended-gdbserver
Following commit
4e2a80ba606 ("gdb/testsuite: expect SIGSEGV from top
GDB spawn id"), the next failure I get in gdb.gdb/selftest.exp, using
the native-extended-gdbserver, is:
(gdb) PASS: gdb.gdb/selftest.exp: send ^C to child process
signal SIGINT
Continuing with signal SIGINT.
FAIL: gdb.gdb/selftest.exp: send SIGINT signal to child process (timeout)
The problem is that in this gdb_test_multiple:
set description "send SIGINT signal to child process"
gdb_test_multiple "signal SIGINT" "$description" {
-re "^signal SIGINT\r\nContinuing with signal SIGINT.\r\nQuit\r\n.* $" {
pass "$description"
}
}
The "Continuing with signal SIGINT" portion is printed by the top GDB,
while the Quit portion is printed by the bottom GDB. As the
gdb_test_multiple is written, it expects both the the top GDB's spawn
id.
Fix this by splitting the gdb_test_multiple in two. The first one
expects the "Continuing with signal SIGINT" from the top GDB. The
second one expect "Quit" and the "(xgdb)" prompt from
$inferior_spawn_id. When debugging natively, this spawn id will be the
same as the top GDB's spawn id, but it's different when debugging with
GDBserver.
Change-Id: I689bd369a041b48f4dc9858d38bf977d09600da2
Tom Tromey [Fri, 30 Dec 2022 18:23:43 +0000 (11:23 -0700)]
Use std::string in main_info
This changes main_info to use std::string. It removes some manual
memory management.
Tom de Vries [Fri, 10 Feb 2023 14:58:00 +0000 (15:58 +0100)]
[gdb/testsuite] Fix linespec ambiguity in gdb.base/longjmp.exp
PR testsuite/30103 reports the following failure on aarch64-linux
(ubuntu 22.04):
...
(gdb) PASS: gdb.base/longjmp.exp: with_probes=0: pattern 1: next to longjmp
next
warning: Breakpoint address adjusted from 0x83dc305fef755015 to \
0xffdc305fef755015.
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0xffdc305fef755015
__libc_siglongjmp (env=0xaaaaaaab1018 <env>, val=1) at ./setjmp/longjmp.c:30
30 }
(gdb) KFAIL: gdb.base/longjmp.exp: with_probes=0: pattern 1: gdb/26967 \
(PRMS: next over longjmp)
delete breakpoints
Delete all breakpoints? (y or n) y
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) break 63
No line 63 in the current file.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) FAIL: gdb.base/longjmp.exp: with_probes=0: pattern 2: setup: breakpoint \
at pattern start (got interactive prompt)
...
The test-case intends to set the breakpoint on line number 63 in
gdb.base/longjmp.c.
It tries to do so by specifying "break 63", which specifies a line in the
"current source file".
Due to the KFAIL PR, gdb stopped in __libc_siglongjmp, and because of presence
of debug info, the "current source file" becomes glibc's ./setjmp/longjmp.c.
Consequently, setting the breakpoint fails.
Fix this by adding a $subdir/$srcfile: prefix to the breakpoint linespecs.
I've managed to reproduce the FAIL on x86_64/-m32, by installing the
glibc-32bit-debuginfo package. This allowed me to confirm the "current source
file" that is used:
...
(gdb) KFAIL: gdb.base/longjmp.exp: with_probes=0: pattern 1: gdb/26967 \
(PRMS: next over longjmp)
info source^M
Current source file is ../setjmp/longjmp.c^M
...
Tested on x86_64-linux, target boards unix/{-m64,-m32}.
Reported-By: Luis Machado <luis.machado@arm.com>
Reviewed-By: Tom Tromey <tom@tromey.com>
PR testsuite/30103
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30103
Tom de Vries [Fri, 10 Feb 2023 12:07:14 +0000 (13:07 +0100)]
[gdb/cli] Add maint info frame-unwinders
Add a new command "maint info frame-unwinders":
...
(gdb) help maint info frame-unwinders
List the frame unwinders currently in effect, starting with the highest \
priority.
...
Output for i386:
...
$ gdb -q -batch -ex "set arch i386" -ex "maint info frame-unwinders"
The target architecture is set to "i386".
dummy DUMMY_FRAME
dwarf2 tailcall TAILCALL_FRAME
inline INLINE_FRAME
i386 epilogue NORMAL_FRAME
dwarf2 NORMAL_FRAME
dwarf2 signal SIGTRAMP_FRAME
i386 stack tramp NORMAL_FRAME
i386 sigtramp SIGTRAMP_FRAME
i386 prologue NORMAL_FRAME
...
Output for x86_64:
...
$ gdb -q -batch -ex "set arch i386:x86-64" -ex "maint info frame-unwinders"
The target architecture is set to "i386:x86-64".
dummy DUMMY_FRAME
dwarf2 tailcall TAILCALL_FRAME
inline INLINE_FRAME
python NORMAL_FRAME
amd64 epilogue NORMAL_FRAME
i386 epilogue NORMAL_FRAME
dwarf2 NORMAL_FRAME
dwarf2 signal SIGTRAMP_FRAME
amd64 sigtramp SIGTRAMP_FRAME
amd64 prologue NORMAL_FRAME
i386 stack tramp NORMAL_FRAME
i386 sigtramp SIGTRAMP_FRAME
i386 prologue NORMAL_FRAME
...
Tested on x86_64-linux.
Reviewed-By: Tom Tromey <tom@tromey.com>
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Tsukasa OI [Fri, 10 Feb 2023 09:27:28 +0000 (09:27 +0000)]
RISC-V: Reduce effective linker relaxation passses
Commit
43025f01a0c9 ("RISC-V: Improve link time complexity.") reduced the
time complexity of the linker relaxation but some code portions did not
reflect this change.
This commit fixes a comment describing each relaxation pass and reduces
actual number of passes for the RISC-V linker relaxation from 3 to 2.
Though it does not change the functionality, it marginally improves the
performance while linking large programs (with many relocations).
bfd/ChangeLog:
* elfnn-riscv.c (_bfd_riscv_relax_section): Fix a comment to
reflect current roles of each relaxation pass.
ld/ChangeLog:
* emultempl/riscvelf.em: Reduce the number of linker relaxation
passes from 3 to 2.
Alan Modra [Fri, 10 Feb 2023 09:38:40 +0000 (20:08 +1030)]
Fix mmo memory leaks
The main one here is the section buffer, which can be quite large.
By using alloc rather than malloc we can leave tidying memory to the
generic bfd code when the bfd is closed. bfd_check_format also
releases memory when object_p fails, so while it wouldn't be wrong
to bfd_release at bad_format_free in mmo_object_p, it's a little extra
code and work for no gain.
* mmo.c (mmo_object_p): bfd_alloc rather than bfd_malloc
lop_stab_symbol. Don't free/release on error.
(mmo_get_spec_section): bfd_zalloc rather than bfd_zmalloc
section buffer.
(mmo_scan): Free fname on another error path.
Alan Modra [Fri, 10 Feb 2023 07:33:35 +0000 (18:03 +1030)]
Local label checks in integer_constant
"Local labels are never absolute" says the comment. Except when they
are. Testcase
.offset
0:
a=0b
I don't see any particular reason to disallow local labels inside
struct definitions, so delete the comment and assertions.
* expr.c (integer_constant): Delete local label assertions.
Jan Beulich [Fri, 10 Feb 2023 07:15:11 +0000 (08:15 +0100)]
x86: drop use of VEX3SOURCES
The attribute really specifies that the sum of register and memory
operands is 4. Express it like that in most places, while using the 2nd
(apart from XOP) CPU feature flags (FMA4) in reversed operand matching
logic.
With the use in build_modrm_byte() gone, part of an assertion there
also becomes meaningless - simplify that at the same time.
With all uses of the opcode modifier field gone, also drop that.
Jan Beulich [Fri, 10 Feb 2023 07:14:46 +0000 (08:14 +0100)]
x86: drop use of XOP2SOURCES
The few XOP insns which used it wrongly didn't have VexVVVV specified.
With that added, the only further missing piece to use more generic code
elsewhere is SwapSources - see e.g. the BMI2 insns for similar operand
patterns.
With the only users gone, drop the #define as well as the special case
code.
Jan Beulich [Fri, 10 Feb 2023 07:14:27 +0000 (08:14 +0100)]
x86: limit use of XOP2SOURCES
The VPROT* forms with an immediate operand are entirely standard in the
way their ModR/M bytes are built. There's no reason to invoke special
case code. With that the handling of an immediate there can also be
dropped; it was partially bogus anyway, as in its "no memory operands"
portion it ignores the possibility of an immediate operand (which was
okay only because that case was already handled by more generic code).
Jan Beulich [Fri, 10 Feb 2023 07:10:38 +0000 (08:10 +0100)]
x86: move (and rename) opcodespace attribute
This really isn't a "modifier" and rather ought to live next to the base
opcode anyway. Use the bits we presently have available to fit in the
field, renaming it to opcode_space. As an intended side effect this
helps readability at the use sites, by shortening the references quite a
bit.
In generated code arrange for human readable output, by using the
SPACE_* constants there rather than raw numbers. This may aid debugging
down the road.
Jan Beulich [Fri, 10 Feb 2023 07:10:03 +0000 (08:10 +0100)]
x86: simplify a few expressions
Fold adjacent comparisons when, by ORing in a certain mask, the same
effect can be achieved by a single one. In load_insn_p() this extends
to further uses of an already available local variable.
Jan Beulich [Fri, 10 Feb 2023 07:09:35 +0000 (08:09 +0100)]
x86: improve special casing of certain insns
Now that we have identifiers for the mnemonic strings we can avoid
opcode based comparisons, for (in many cases) being more expensive and
(in a few cases) being a little fragile and not self-documenting.
Note that the MOV optimization can be engaged by the earlier LEA one,
and hence LEA also needs checking for there.
Alan Modra [Fri, 10 Feb 2023 00:24:32 +0000 (10:54 +1030)]
objcopy of mach-o indirect symbols
Anti-fuzzer measure. I'm not sure what the correct fix is for
objcopy. Probably the BFD_MACH_O_S_NON_LAZY_SYMBOL_POINTERS,
BFD_MACH_O_S_LAZY_SYMBOL_POINTERS and BFD_MACH_O_S_SYMBOL_STUBS
contents should be read.
* mach-o.c (bfd_mach_o_section_get_nbr_indirect): Omit sections
with NULL sec->indirect_syms.
GDB Administrator [Fri, 10 Feb 2023 00:00:09 +0000 (00:00 +0000)]
Automatic date update in version.in
Tom Tromey [Thu, 9 Feb 2023 20:33:21 +0000 (13:33 -0700)]
Add full display feature to dwarf-mode.el
I've found that I often use dwarf-mode with relatively small test
files. In this situation, it's handy to be able to expand all the
DWARF, rather than moving to each "..." separately and using C-u C-m.
This patch implements this feature. It also makes a couple of other
minor changes:
* I removed a stale FIXME from dwarf-mode. In practice I find I often
use "g" to restore the buffer to a pristine state; checking the file
mtime would work against this.
* I tightened the regexp in dwarf-insert-substructure. This prevents
the C-m binding from trying to re-read a DIE which has already been
expanded.
* Finally, I've bumped the dwarf-mode version number so that this
version can easily be installed using package.el.
2023-02-09 Tom Tromey <tromey@adacore.com>
* dwarf-mode.el: Bump version to 1.8.
(dwarf-insert-substructure): Tighten regexp.
(dwarf-refresh-all): New defun.
(dwarf-mode-map): Bind "A" to dwarf-refresh-all.
(dwarf-mode): Remove old FIXME.
Tom Tromey [Thu, 9 Feb 2023 19:23:08 +0000 (12:23 -0700)]
Fix comment in gdb.rust/fnfield.exp
gdb.rust/fnfield.exp has a comment that, I assume, I copied from some
other test. This patch fixes it.
Tom Tromey [Thu, 9 Feb 2023 19:13:08 +0000 (12:13 -0700)]
Trivially simplify rust_language::print_enum
rust_language::print_enum computes:
int nfields = variant_type->num_fields ();
... but then does not reuse this in one spot. This patch corrects the
oversight.
Roland McGrath [Thu, 9 Feb 2023 18:47:17 +0000 (10:47 -0800)]
[aarch64] Avoid initializers for VLAs
Clang doesn't accept initializer syntax for variable-length
arrays in C. Just use memset instead.
Christina Schimpe [Fri, 21 Oct 2022 16:02:57 +0000 (09:02 -0700)]
gdb, testsuite: Remove unnecessary call of "set print pretty on"
The command has no effect for the loading of GDB pretty printers and is
removed by this patch to avoid confusion.
Documentation for "set print pretty"
"Cause GDB to print structures in an indented format with one member per line"
Tom Tromey [Wed, 11 Jan 2023 19:42:40 +0000 (12:42 -0700)]
Increase size of main_type::nfields
main_type::nfields is a 'short', and has been for many years. PR
c++/29985 points out that 'short' is too narrow for an enum that
contains more than 2^15 constants.
This patch bumps the size of 'nfields'. To verify that the field
isn't directly used, it is also renamed. Note that this does not
affect the size of main_type on x86-64 Fedora 36. And, if it does
have a negative effect somewhere, it's worth considering that types
could be shrunk more drastically by using subclasses for the different
codes.
This is v2 of this patch, which has these changes:
* I changed nfields to 'unsigned', per Simon's request. I looked at
changing all the uses, but this quickly fans out into a very large
patch. (One additional tweak was needed, though.)
* I wrote a test case. I discovered that GCC cannot compile a large
enough C test case, so I resorted to using the DWARF assembler.
This test doesn't reproduce the crash, but it does fail without the
patch.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29985
Tom Tromey [Thu, 9 Feb 2023 14:36:16 +0000 (07:36 -0700)]
Remove mention of cooked_index_vector
I noticed a leftover mention of cooked_index_vector. This updates the
text.
Tom Tromey [Tue, 6 Dec 2022 15:05:28 +0000 (08:05 -0700)]
Let user C-c when waiting for DWARF index finalization
In PR gdb/29854, Simon pointed out that it would be good to be able to
use C-c when the DWARF cooked index is waiting for finalization. The
idea here is to be able to interrupt a command like "break" -- not to
stop the finalization process itself, which runs in a worker thread.
This patch implements this idea, by changing the index wait functions
to, by default, allow a quit. Polling is done, because there doesn't
seem to be a better way to interrupt a wait on a std::future.
For v2, I realized that the thread compatibility code in thread-pool.h
also needed an update.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29854
Alan Modra [Thu, 9 Feb 2023 01:38:10 +0000 (12:08 +1030)]
coff keep_relocs and keep_contents
keep_relocs is set by pe_ILF_save_relocs but not used anywhere in the
coff/pe code. It is tested by the xcoff backend but not set.
keep_contents is only used by the xcoff backend when dealing with
the .loader section, and it's easy enough to dispense with it there.
keep_contents is set in various places but that's fairly useless when
the contents aren't freed anyway until later linker support functions,
add_dynamic_symbols and check_dynamic_ar_symbols. There the contents
were freed if keep_contents wasn't set. I reckon we can free them
unconditionally.
* coff-bfd.h (struct coff_section_tdata): Delete keep_relocs
and keep_contents.
* peicode.h (pe_ILF_save_relocs): Don't set keep_relocs.
* xcofflink.c (xcoff_get_section_contents): Cache contents.
Return the contents. Update callers.
(_bfd_xcoff_canonicalize_dynamic_symtab): Don't set
keep_contents for .loader.
(xcoff_link_add_dynamic_symbols): Free .loader contents
unconditionally.
(xcoff_link_check_dynamic_ar_symbols): Likewise.
GDB Administrator [Thu, 9 Feb 2023 00:00:27 +0000 (00:00 +0000)]
Automatic date update in version.in
Alan Modra [Wed, 8 Feb 2023 13:21:04 +0000 (23:51 +1030)]
coff-sh.c keep_relocs, keep_contents and keep_syms
keep_relocs and keep_contents are unused nowadays except by
xcofflink.c, and I can't see a reason why keep_syms needs to be set.
The external syms are read and used by sh_relax_section and used by
sh_relax_delete_bytes. There doesn't appear to be any way that
freeing them will cause trouble.
* coff-sh.c (sh_relax_section): Don't set keep_relocs,
keep_contents or keep_syms.
(sh_relax_delete_bytes): Don't set keep_contents.
Alan Modra [Wed, 8 Feb 2023 13:19:46 +0000 (23:49 +1030)]
Memory leak in bfd_init_section_compress_status
* compress.c (bfd_init_section_compress_status): Free
uncompressed_buffer on error return.
Alan Modra [Wed, 8 Feb 2023 04:11:58 +0000 (14:41 +1030)]
Clear cached file size when bfd changed to BFD_IN_MEMORY
If file size is calculated by bfd_get_file_size, as it is by
_bfd_alloc_and_read calls in coff_object_p, then it is cached and when
pe_ILF_build_a_bfd converts an archive entry over to BFD_IN_MEMORY,
the file size is no longer valid. Found when attempting objdump -t on
a very small (27 bytes) ILF file and hitting the pr24707 fix (commit
781152ec18f5). So, clear file size when setting BFD_IN_MEMORY on bfds
that may have been read. (It's not necessary in writable bfds,
because caching is ignored by bfd_get_size when bfd_write_p.)
I also think the PR 24707 fix is no longer neeeded. All of the
testcases in that PR and in PR24712 are caught earlier by file size
checks when reading the symbols from file. So I'm reverting that fix,
which just compared the size of an array of symbol pointers against
file size. That's only valid if on-disk symbols are larger than a
host pointer, so the test is better done in format-specific code.
bfd/
* coff-alpha.c (alpha_ecoff_get_elt_at_filepos): Clear cached
file size when making a BFD_IN_MEMORY bfd.
* opncls.c (bfd_make_readable): Likewise.
* peicode.h (pe_ILF_build_a_bfd): Likewise.
binutils/
PR 24707
* objdump.c (slurp_symtab): Revert PR24707 fix. Tidy.
(slurp_dynamic_symtab): Tidy.
Alan Modra [Wed, 8 Feb 2023 02:57:24 +0000 (13:27 +1030)]
Internal error at gas/expr.c:1814
This is the assertion
know (*input_line_pointer != ' ');
after calling operand.
The usual exit from operand calls SKIP_ALL_WHITESPACE.
* expr.c (operand): Call SKIP_ALL_WHITESPACE after call to expr.
Simon Marchi [Mon, 30 Jan 2023 20:02:49 +0000 (15:02 -0500)]
gdb: give sentinel for user frames distinct IDs, register sentinel frames to the frame cache
The test gdb.base/frame-view.exp fails like this on AArch64:
frame^M
#0 baz (z1=hahaha, /home/simark/src/binutils-gdb/gdb/value.c:4056: internal-error: value_fetch_lazy_register: Assertion `next_frame != NULL' failed.^M
A problem internal to GDB has been detected,^M
further debugging may prove unreliable.^M
FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame (GDB internal error)
The sequence of events leading to this is the following:
- When we create the user frame (the "select-frame view" command), we
create a sentinel frame just for our user-created frame, in
create_new_frame. This sentinel frame has the same id as the regular
sentinel frame.
- When printing the frame, after doing the "select-frame view" command,
the argument's pretty printer is invoked, which does an inferior
function call (this is the point of the test). This clears the frame
cache, including the "real" sentinel frame, which sets the
sentinel_frame global to nullptr.
- Later in the frame-printing process (when printing the second
argument), the auto-reinflation mechanism re-creates the user frame
by calling create_new_frame again, creating its own special sentinel
frame again. However, note that the "real" sentinel frame, the
sentinel_frame global, is still nullptr. If the selected frame had
been a regular frame, we would have called get_current_frame at some
point during the reinflation, which would have re-created the "real"
sentinel frame. But it's not the case when reinflating a user frame.
- Deep down the stack, something wants to fill in the unwind stop
reason for frame 0, which requires trying to unwind frame 1. This
leads us to trying to unwind the PC of frame 1:
#0 gdbarch_unwind_pc (gdbarch=0xffff8d010080, next_frame=...) at /home/simark/src/binutils-gdb/gdb/gdbarch.c:2955
#1 0x000000000134569c in dwarf2_tailcall_sniffer_first (this_frame=..., tailcall_cachep=0xffff773fcae0, entry_cfa_sp_offsetp=0xfffff7f7d450)
at /home/simark/src/binutils-gdb/gdb/dwarf2/frame-tailcall.c:390
#2 0x0000000001355d84 in dwarf2_frame_cache (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1089
#3 0x00000000013562b0 in dwarf2_frame_unwind_stop_reason (this_frame=..., this_cache=0xffff773fc928) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1101
#4 0x0000000001990f64 in get_prev_frame_always_1 (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2281
#5 0x0000000001993034 in get_prev_frame_always (this_frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:2376
#6 0x000000000199b814 in get_frame_unwind_stop_reason (frame=...) at /home/simark/src/binutils-gdb/gdb/frame.c:3051
#7 0x0000000001359cd8 in dwarf2_frame_cfa (this_frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1356
#8 0x000000000132122c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d8883ee "\217\002", op_end=0xffff8d8883ee "\217\002")
at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:2110
#9 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d8883ed "\234\217\002", len=1) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
#10 0x000000000131d68c in dwarf_expr_context::execute_stack_op (this=0xfffff7f80170, op_ptr=0xffff8d88840e "", op_end=0xffff8d88840e "") at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1811
#11 0x0000000001317b30 in dwarf_expr_context::eval (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239
#12 0x0000000001314c3c in dwarf_expr_context::evaluate (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2, as_lval=true, per_cu=0xffff90b03700, frame=..., addr_info=0x0,
type=0xffff8f6c8400, subobj_type=0xffff8f6c8400, subobj_offset=0) at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1078
#13 0x000000000149f9e0 in dwarf2_evaluate_loc_desc_full (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980,
subobj_type=0xffff8f6c8400, subobj_byte_offset=0, as_lval=true) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1513
#14 0x00000000014a0100 in dwarf2_evaluate_loc_desc (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, per_cu=0xffff90b03700, per_objfile=0xffff9070b980, as_lval=true)
at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1557
#15 0x00000000014aa584 in locexpr_read_variable (symbol=0xffff8f6cd770, frame=...) at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3052
- AArch64 defines a special "prev register" function,
aarch64_dwarf2_prev_register, to handle unwinding the PC. This
function does
frame_unwind_register_unsigned (this_frame, AARCH64_LR_REGNUM);
- frame_unwind_register_unsigned ultimately creates a lazy register
value, saving the frame id of this_frame->next. this_frame is the
user-created frame, to this_frame->next is the special sentinel frame
we created for it. So the saved ID is the sentinel frame ID.
- When time comes to un-lazify the value, value_fetch_lazy_register
calls frame_find_by_id, to find the frame with the ID we saved.
- frame_find_by_id sees it's the sentinel frame ID, so returns the
sentinel_frame global, which is, if you remember, nullptr.
- We hit the `gdb_assert (next_frame != NULL)` assertion in
value_fetch_lazy_register.
The issues I see here are:
- The ID of the sentinel frame created for the user-created frame is
not distinguishable from the ID of the regular sentinel frame. So
there's no way frame_find_by_id could find the right frame, in
value_fetch_lazy_register.
- Even if they had distinguishable IDs, sentinel frames created for
user frames are not registered anywhere, so there's no easy way
frame_find_by_id could find it.
This patch addresses these two issues:
- Give sentinel frames created for user frames their own distinct IDs
- Register sentinel frames in the frame cache, so they can be found
with frame_find_by_id.
I initially had this split in two patches, but I then found that it was
easier to explain as a single patch.
Rergarding the first part of the change: with this patch, the sentinel
frames created for user frames (in create_new_frame) still have
stack_status == FID_STACK_SENTINEL, but their code_addr and stack_addr
fields are now filled with the addresses used to create the user frame.
This ensures this sentinel frame ID is different from the "target"
sentinel frame ID, as well as any other "user" sentinel frame ID. If
the user tries to create the same frame, with the same addresses,
multiple times, create_sentinel_frame just reuses the existing frame.
So we won't end up with multiple user sentinels with the same ID.
Regular "target" sentinel frames remain with code_addr and stack_addr
unset.
The concrete changes for that part are:
- Remove the sentinel_frame_id constant, since there isn't one
"sentinel frame ID" now. Add the frame_id_build_sentinel function
for building sentinel frame IDs and a is_sentinel_frame_id function
to check if a frame id represents a sentinel frame.
- Replace the sentinel_frame_id check in frame_find_by_id with a
comparison to `frame_id_build_sentinel (0, 0)`. The sentinel_frame
global is meant to contain a reference to the "target" sentinel, so
the one with addresses (0, 0).
- Add stack and code address parameters to create_sentinel_frame, to be
able to create the various types of sentinel frames.
- Adjust get_current_frame to create the regular "target" sentinel.
- Adjust create_new_frame to create a sentinel with the ID specific to
the created user frame.
- Adjust sentinel_frame_prev_register to get the sentinel frame ID from
the frame_info object, since there isn't a single "sentinel frame ID"
now.
- Change get_next_frame_sentinel_okay to check for a
sentinel-frame-id-like frame ID, rather than for sentinel_frame
specifically, since this function could be called with another
sentinel frame (and we would want the assert to catch it).
The rest of the change is about registering the sentinel frame in the
frame cache:
- Change frame_stash_add's assertion to allow sentinel frame levels
(-1).
- Make create_sentinel_frame add the frame to the frame cache.
- Change the "sentinel_frame != NULL" check in reinit_frame_cache for a
check that the frame stash is not empty. The idea is that if we only
have some user-created frames in the cache when reinit_frame_cache is
called, we probably want to emit the frames invalid annotation. The
goal of that check is to avoid unnecessary repeated annotations, I
suppose, so the "frame cache not empty" check should achieve that.
After this change, I think we could theoritically get rid of the
sentienl_frame global. That sentinel frame could always be found by
looking up `frame_id_build_sentinel (0, 0)` in the frame cache.
However, I left the global there to avoid slowing the typical case down
for nothing. I however, noted in its comment that it is an
optimization.
With this fix applied, the gdb.base/frame-view.exp now passes for me on
AArch64. value_of_register_lazy now saves the special sentinel frame ID
in the value, and value_fetch_lazy_register is able to find that
sentinel frame after the frame cache reinit and after the user-created
frame was reinflated.
Tested-By: Alexandra Petlanova Hajkova <ahajkova@redhat.com>
Tested-By: Luis Machado <luis.machado@arm.com>
Change-Id: I8b77b3448822c8aab3e1c3dda76ec434eb62704f
Simon Marchi [Mon, 30 Jan 2023 20:02:48 +0000 (15:02 -0500)]
gdb: call frame unwinders' dealloc_cache methods through destroying the frame cache
Currently, some frame resources are deallocated by iterating on the
frame chain (starting from the sentinel), calling dealloc_cache. The
problem is that user-created frames are not part of that chain, so we
never call dealloc_cache for them.
I propose to make it so the dealloc_cache callbacks are called when the
frames are removed from the frame_stash hash table, by registering a
deletion function to the hash table. This happens when
frame_stash_invalidate is called by reinit_frame_cache. This way, all
frames registered in the cache will get their unwinder's dealloc_cache
callbacks called.
Note that at the moment, the sentinel frames are not registered in the
cache, so we won't call dealloc_cache for them. However, it's just a
theoritical problem, because the sentinel frame unwinder does not
provide this callback. Also, a subsequent patch will change things so
that sentinel frames are registered to the cache.
I moved the obstack_free / obstack_init pair below the
frame_stash_invalidate call in reinit_frame_cache, because I assumed
that some dealloc_cache would need to access some data on that obstack,
so it would be better to free it after clearing the hash table.
Change-Id: If4f9b38266b458c4e2f7eb43e933090177c22190
Tom Tromey [Sat, 21 Jan 2023 21:00:12 +0000 (14:00 -0700)]
Remove block.h includes from some tdep files
A few tdep files include block.h but do not need to. This patch
removes the inclusions. I checked that this worked correctly by
examining the resulting .Po file to make sure that block.h was not
being included by some other route.
Tom Tromey [Sat, 21 Jan 2023 21:00:05 +0000 (14:00 -0700)]
Don't include block.h from expop.h
expop.h needs block.h for a single inline function. However, I don't
think most of the check_objfile functions need to be defined in the
header (just the templates). This patch moves the one offending
function and removes the include.
Pedro Alves [Fri, 27 Jan 2023 18:07:56 +0000 (18:07 +0000)]
Simplify interp::exec / interp_exec - let exceptions propagate
This patch implements a simplication that I suggested here:
https://sourceware.org/pipermail/gdb-patches/2022-March/186320.html
Currently, the interp::exec virtual method interface is such that
subclass implementations must catch exceptions and then return them
via normal function return.
However, higher up the in chain, for the CLI we get to
interpreter_exec_cmd, which does:
for (i = 1; i < nrules; i++)
{
struct gdb_exception e = interp_exec (interp_to_use, prules[i]);
if (e.reason < 0)
{
interp_set (old_interp, 0);
error (_("error in command: \"%s\"."), prules[i]);
}
}
and for MI we get to mi_cmd_interpreter_exec, which has:
void
mi_cmd_interpreter_exec (const char *command, char **argv, int argc)
{
...
for (i = 1; i < argc; i++)
{
struct gdb_exception e = interp_exec (interp_to_use, argv[i]);
if (e.reason < 0)
error ("%s", e.what ());
}
}
Note that if those errors are reached, we lose the original
exception's error code. I can't see why we'd want that.
And, I can't see why we need to have interp_exec catch the exception
and return it via the normal return path. That's normally needed when
we need to handle propagating exceptions across C code, like across
readline or ncurses, but that's not the case here.
It seems to me that we can simplify things by removing some
try/catch-ing and just letting exceptions propagate normally.
Note, the "error in command" error shown above, which only exists in
the CLI interpreter-exec command, is only ever printed AFAICS if you
run "interpreter-exec console" when the top level interpreter is
already the console/tui. Like:
(gdb) interpreter-exec console "foobar"
Undefined command: "foobar". Try "help".
error in command: "foobar".
You won't see it with MI's "-interpreter-exec console" from a top
level MI interpreter:
(gdb)
-interpreter-exec console "foobar"
&"Undefined command: \"foobar\". Try \"help\".\n"
^error,msg="Undefined command: \"foobar\". Try \"help\"."
(gdb)
nor with MI's "-interpreter-exec mi" from a top level MI interpreter:
(gdb)
-interpreter-exec mi "-foobar"
^error,msg="Undefined MI command: foobar",code="undefined-command"
^done
(gdb)
in both these cases because MI's -interpreter-exec just does:
error ("%s", e.what ());
You won't see it either when running an MI command with the CLI's
"interpreter-exec mi":
(gdb) interpreter-exec mi "-foobar"
^error,msg="Undefined MI command: foobar",code="undefined-command"
(gdb)
This last case is because MI's interp::exec implementation never
returns an error:
gdb_exception
mi_interp::exec (const char *command)
{
mi_execute_command_wrapper (command);
return gdb_exception ();
}
Thus I think that "error in command" error is pretty pointless, and
since it simplifies things to not have it, the patch just removes it.
The patch also ends up addressing an old FIXME.
Change-Id: I5a6432a80496934ac7127594c53bf5221622e393
Approved-By: Tom Tromey <tromey@adacore.com>
Approved-By: Kevin Buettner <kevinb@redhat.com>
Tom Tromey [Thu, 19 Jan 2023 21:01:27 +0000 (14:01 -0700)]
Avoid FAILs in gdb.compile
Many gdb.compile C++ tests fail for me on Fedora 36. I think these
are largely bugs in the plugin, though I didn't investigate too
deeply. Once one failure is seen, this often cascades and sometimes
there are many timeouts.
For example, this can happen:
(gdb) compile code var = a->get_var ()
warning: Could not find symbol "_ZZ9_gdb_exprP10__gdb_regsE1a" for compiled module "/tmp/gdbobj-0xdI6U/out2.o".
1 symbols were missing, cannot continue.
I think this is probably a plugin bug because, IIRC, in theory these
symbols should be exempt from a lookup via gdb.
This patch arranges to catch any catastrophic failure and then simply
exit the entire .exp file.
Tom Tromey [Thu, 19 Jan 2023 18:19:32 +0000 (11:19 -0700)]
Don't let .gdb_history file cause failures
I had a .gdb_history file in my testsuite directory in the build tree,
and this provoked a failure in gdbhistsize-history.exp. It seems
simple to prevent this file from causing a failure.
Tom Tromey [Fri, 13 Jan 2023 16:59:29 +0000 (09:59 -0700)]
Merge fixup_section and fixup_symbol_section
fixup_symbol_section delegates all its work to fixup_section, so merge
the two.
Because there is only a single caller to fixup_symbol_section, we can
also remove some of the introductory logic. For example, this will
never be called with a NULL objfile any more.
The LOC_BLOCK case can be removed, because such symbols are handled by
the buildsym code now.
Finally, a symbol can only appear in a SEC_ALLOC section, so the loop
is modified to skip sections that do not have this flag set.
Tom Tromey [Fri, 13 Jan 2023 16:27:54 +0000 (09:27 -0700)]
Remove most calls to fixup_symbol_section
Nearly every call to fixup_symbol_section in gdb is incorrect, and if
any such call has an effect, it's purely by happenstance.
fixup_section has a long comment explaining that the call should only
be made before runtime section offsets are applied. And, the loop in
this code (the fallback loop -- the minsym lookup code is "ok") is
careful to remove these offsets before comparing addresses.
However, aside from a single call in dwarf2/read.c, every call in gdb
is actually done after section offsets have been applied. So, these
calls are incorrect.
Now, these calls could be made when the symbol is created. I
considered this approach, but I reasoned that the code has been this
way for many years, seemingly without ill effect. So, instead I chose
to simply remove the offending calls.
Tom Tromey [Fri, 13 Jan 2023 16:17:27 +0000 (09:17 -0700)]
Set section index when setting a symbol's block
When a symbol's block is set, the block has the runtime section offset
applied. So, it seems to me that the symbol implicitly is in the same
section as the block. Therefore, this patch sets the symbol's section
index at this same spot.
Tom Tromey [Thu, 19 Jan 2023 13:14:49 +0000 (06:14 -0700)]
Remove compunit_symtab::m_block_line_section
The previous patch hard-coded SECT_OFF_TEXT into the buildsym code.
After this, it's clear that there is only one caller of
compunit_symtab::set_block_line_section, and it always passes
SECT_OFF_TEXT. So, remove compunit_symtab::m_block_line_section and
use SECT_OFF_TEXT instead.
Tom Tromey [Fri, 13 Jan 2023 16:08:41 +0000 (09:08 -0700)]
Do not pass section index to end_compunit_symtab
Right now, the section index passed to end_compunit_symtab is always
SECT_OFF_TEXT. Remove this parameter and simply always use
SECT_OFF_TEXT.
Tom Tromey [Fri, 13 Jan 2023 15:57:08 +0000 (08:57 -0700)]
Set section indices when symbols are made
Most places in gdb that create a new symbol will apply a section
offset to the address. It seems to me that the choice of offset here
is also an implicit choice of the section. This is particularly true
if you examine fixup_section, which notes that it must be called
before such offsets are applied -- meaning that if any such call has
an effect, it's purely by accident.
This patch cleans up this area by tracking the section index and
applying it to a symbol when the address is set. This is done for
nearly every case -- the remaining cases will be handled in later
patches.
Tom Tromey [Tue, 10 Jan 2023 18:39:50 +0000 (11:39 -0700)]
Use default section indexes in fixup_symbol_section
If fixup_section does not find a matching section, it arbitrarily
chooses the first one. However, it seems better to make this default
depend on the type of the symbol -- i.e., default data symbols to
.data and text symbols to .text.
I've also made fixup_section static, as it only has one caller.
Tom Tromey [Wed, 1 Feb 2023 18:43:02 +0000 (11:43 -0700)]
Simplify checks of cooked_index
This changes the cooked_index_functions to avoid an extra null check
now that checked_static_cast allows a null argument.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Tom de Vries [Wed, 8 Feb 2023 12:46:17 +0000 (13:46 +0100)]
[gdb/testsuite] Use maint ignore-probes in gdb.base/longjmp.exp
Test-case gdb.base/longjmp.exp handles both the case that there is a libc
longjmp probe, and the case that there isn't.
However, it only tests one of the two cases.
Use maint ignore-probes to test both cases, if possible.
Tested on x86_64-linux.
Tom de Vries [Wed, 8 Feb 2023 10:48:53 +0000 (11:48 +0100)]
[gdb/testsuite] Use maint ignore-probes in gdb.base/solib-corrupted.exp
Test-case gdb.base/solib-corrupted.exp only works for a glibc without probes
interface, otherwise we run into:
...
XFAIL: gdb.base/solib-corrupted.exp: info probes
UNTESTED: gdb.base/solib-corrupted.exp: GDB is using probes
...
Fix this by using maint ignore-probes to simulate the absence of the relevant
probes.
Also, it requires glibc debuginfo, and if not present, it produces an XFAIL:
...
XFAIL: gdb.base/solib-corrupted.exp: make solibs looping
UNTESTED: gdb.base/solib-corrupted.exp: no _r_debug symbol has been found
...
This is incorrect, because an XFAIL indicates a known problem in the
environment. In this case, there is no problem: the environment is
functioning as expected when glibc debuginfo is not installed.
Fix this by using UNSUPPORTED instead, and make the message less cryptic:
...
UNSUPPORTED: gdb.base/solib-corrupted.exp: make solibs looping \
(glibc debuginfo required)
...
Finally, with glibc debuginfo present, we run into:
...
(gdb) PASS: gdb.base/solib-corrupted.exp: make solibs looping
info sharedlibrary^M
warning: Corrupted shared library list: 0x7ffff7ffe750 != 0x0^M
From To Syms Read Shared Object Library^M
0x00007ffff7dd4170 0x00007ffff7df4090 Yes /lib64/ld-linux-x86-64.so.2^M
(gdb) FAIL: gdb.base/solib-corrupted.exp: corrupted list \
(shared library list corrupted)
...
due to commit
44288716537 ("gdb, testsuite: extend gdb_test_multiple checks").
Fix this by rewriting into gdb_test_multiple and using -early.
Tested on x86_64-linux, with and without glibc debuginfo installed.
Vladimir Mezentsev [Tue, 7 Feb 2023 22:58:25 +0000 (14:58 -0800)]
gprofng: fix SIGSEGV when processing unusual dwarf
gprofng/ChangeLog
2023-02-07 Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
PR gprofng/30093
* src/Dwarf.cc: add nullptr check.
* src/DwarfLib.cc: Likewise.
Alan Modra [Wed, 8 Feb 2023 00:23:59 +0000 (10:53 +1030)]
Re: Resetting section vma after _bfd_dwarf2_find_nearest_line
f.bfd_ptr is set too early to be a reliable indicator of good debug
info.
* dwarf2.c (_bfd_dwarf2_slurp_debug_info): Correct test for
debug info being previously found.
GDB Administrator [Wed, 8 Feb 2023 00:00:26 +0000 (00:00 +0000)]
Automatic date update in version.in
Andrew Burgess [Mon, 7 Nov 2022 17:18:55 +0000 (17:18 +0000)]
gdb: fix display of thread condition for multi-location breakpoints
This commit addresses the issue in PR gdb/30087.
If a breakpoint with multiple locations has a thread condition, then
the 'info breakpoints' output is a little messed up, here's an example
of the current output:
(gdb) break foo thread 1
Breakpoint 2 at 0x401114: foo. (3 locations)
(gdb) break bar thread 1
Breakpoint 3 at 0x40110a: file /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c, line 32.
(gdb) info breakpoints
Num Type Disp Enb Address What
2 breakpoint keep y <MULTIPLE> thread 1
stop only in thread 1
2.1 y 0x0000000000401114 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25
2.2 y 0x0000000000401146 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25
2.3 y 0x0000000000401168 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25
3 breakpoint keep y 0x000000000040110a in bar at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:32 thread 1
stop only in thread 1
Notice that, at the end of the location for breakpoint 3, the 'thread
1' condition is printed, but this is then repeated on the next line
with 'stop only in thread 1'.
In contrast, for breakpoint 2, the 'thread 1' appears randomly, in the
"What" column, though slightly offset, non of the separate locations
have the 'thread 1' information. Additionally for breakpoint 2 we
also get a 'stop only in thread 1' line.
There's two things going on here. First the randomly placed 'thread
1' for breakpoint 2 is due to a bug in print_one_breakpoint_location,
where we check the variable part_of_multiple instead of
header_of_multiple.
If I fix this oversight, then the output is now:
(gdb) break foo thread 1
Breakpoint 2 at 0x401114: foo. (3 locations)
(gdb) break bar thread 1
Breakpoint 3 at 0x40110a: file /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c, line 32.
(gdb) info breakpoints
Num Type Disp Enb Address What
2 breakpoint keep y <MULTIPLE>
stop only in thread 1
2.1 y 0x0000000000401114 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 thread 1
2.2 y 0x0000000000401146 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 thread 1
2.3 y 0x0000000000401168 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25 thread 1
3 breakpoint keep y 0x000000000040110a in bar at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:32 thread 1
stop only in thread 1
The 'thread 1' condition is now displayed at the end of each location,
which makes the output the same for single location breakpoints and
multi-location breakpoints.
However, there's still some duplication here. Both breakpoints 2 and
3 include a 'stop only in thread 1' line, and it feels like the
additional 'thread 1' is redundant. In fact, there's a comment to
this very effect in the code:
/* FIXME: This seems to be redundant and lost here; see the
"stop only in" line a little further down. */
So, lets fix this FIXME. The new plan is to remove all the trailing
'thread 1' markers from the CLI output, we now get this:
(gdb) break foo thread 1
Breakpoint 2 at 0x401114: foo. (3 locations)
(gdb) break bar thread 1
Breakpoint 3 at 0x40110a: file /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c, line 32.
(gdb) info breakpoints
Num Type Disp Enb Address What
2 breakpoint keep y <MULTIPLE>
stop only in thread 1
2.1 y 0x0000000000401114 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25
2.2 y 0x0000000000401146 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25
2.3 y 0x0000000000401168 in foo at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:25
3 breakpoint keep y 0x000000000040110a in bar at /tmp/src/gdb/testsuite/gdb.base/thread-bp-multi-loc.c:32
stop only in thread 1
All of the above points are also true for the Ada 'task' breakpoint
condition, and the changes I've made also update how the task
information is printed, though in the case of the Ada task there was
no 'stop only in task XXX' line printed, so I've added one of those.
Obviously it can't be quite that easy. For MI backwards compatibility
I've retained the existing code (but now only for MI like outputs),
which ensures we should generate backwards compatible output.
I've extended an Ada test to cover the new task related output, and
updated all the tests I could find that checked for the old output.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30087
Approved-By: Pedro Alves <pedro@palves.net>
Nick Clifton [Tue, 7 Feb 2023 11:40:46 +0000 (11:40 +0000)]
Fix documentation of the 'n' symbol type displayed by nm.
PR 30080 * doc/binutils.texi (nm): Update description of the 'n' symbol type.
Tom de Vries [Tue, 7 Feb 2023 10:41:44 +0000 (11:41 +0100)]
[gdb/testsuite] Improve untested message in gdb.ada/finish-var-size.exp
I came across:
...
UNTESTED: gdb.ada/finish-var-size.exp: GCC too told for this test
...
The message only tells us that the compiler version too old, not what compiler
version is required.
Fix this by rewriting using required:
...
UNSUPPORTED: gdb.ada/finish-var-size.exp: require failed: \
expr [gcc_major_version] >= 12
...
Tested on x86_64-linux.
GDB Administrator [Tue, 7 Feb 2023 00:00:16 +0000 (00:00 +0000)]
Automatic date update in version.in
Simon Marchi [Mon, 6 Feb 2023 19:12:27 +0000 (14:12 -0500)]
gdb: adjust comment on target_desc_info::from_user_p
Remove the stale reference to INFO, which is now "this target
description info" now.
Change-Id: I35dbdb089048ed7cfffe730d3134ee391b176abf
Andrew Burgess [Thu, 2 Feb 2023 11:45:41 +0000 (11:45 +0000)]
gdb/doc: extend the documentation for the 'handle' command
The documentation for the 'handle' command does not cover all of the
features of the command, and in one case, is just wrong.
The user can specify 'all' as signal name, the documentation implies
that this will change the behaviour of all signals, in reality, this
changes all signals except SIGINT and SIGTRAP (the signals used by
GDB). I've updated the docs to list this limitation.
The 'handle' command also allows the user to specify multiple signals
for a single command, e.g. 'handle SIGFPE SIGILL nostop pass print',
however the documentation doesn't describe this, so I've updated the
docs to describe this feature.
Alan Modra [Mon, 6 Feb 2023 02:16:52 +0000 (12:46 +1030)]
ppc32 and "LOAD segment with RWX permissions"
When using a bss-plt we'll always trigger the RWX warning, which
disturbs gcc test results. On the other hand, there may be reason to
want the warning when gcc is configured with --enable-secureplt.
So turning off the warning entirely for powerpc might not be the best
solution. Instead, we'll turn off the warning whenever a bss-plt is
generated, unless the user explicitly asked for the warning.
bfd/
* elf32-ppc.c (ppc_elf_select_plt_layout): Set
no_warn_rwx_segments on generating a bss plt, unless explicity
enabled by the user. Also show the bss-plt warning when
--warn-rwx-segments is given without --bss-plt.
include/
* bfdlink.h (struct bfd_link_info): Add user_warn_rwx_segments.
ld/
* lexsup.c (parse_args): Set user_warn_rwx_segments.
* testsuite/ld-elf/elf.exp: Pass --secure-plt for powerpc to
the rwx tests.
Tom de Vries [Mon, 6 Feb 2023 11:52:50 +0000 (12:52 +0100)]
[gdb/testsuite] Fix gdb.threads/schedlock.exp on fast cpu
Occasionally, I run into:
...
(gdb) PASS: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \
set scheduler-locking on
continue^M
Continuing.^M
PASS: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \
continue (with lock)
[Thread 0x7ffff746e700 (LWP 1339) exited]^M
No unwaited-for children left.^M
(gdb) Quit^M
(gdb) FAIL: gdb.threads/schedlock.exp: schedlock=on: cmd=continue: \
stop all threads (with lock) (timeout)
...
What happens is that this loop which is supposed to run "just short of forever":
...
/* Don't run forever. Run just short of it :) */
while (*myp > 0)
{
/* schedlock.exp: main loop. */
MAYBE_CALL_SOME_FUNCTION(); (*myp) ++;
}
...
finishes after 0x7fffffff iterations (when a signed wrap occurs), which on my
system takes only about 1.5 seconds.
Fix this by:
- changing the pointed-at type of myp from signed to unsigned, which makes the
wrap defined behaviour (and which also make the loop run twice as long,
which is already enough to make it impossible for me to reproduce the FAIL.
But let's try to solve this more structurally).
- changing the pointed-at type of myp from int to long long, making the wrap
unlikely.
- making sure the loop runs forever, by setting the loop condition to 1.
- making sure the loop still contains different lines (as far as debug info is
concerned) by incrementing a volatile counter in the loop.
- making sure the program doesn't run forever in case of trouble, by adding an
"alarm (30)".
Tested on x86_64-linux.
PR testsuite/30074
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30074
Andrew Burgess [Wed, 9 Nov 2022 12:54:55 +0000 (12:54 +0000)]
gdb: error if 'thread' or 'task' keywords are overused
When creating a breakpoint or watchpoint, the 'thread' and 'task'
keywords can be used to create a thread or task specific breakpoint or
watchpoint.
Currently, a thread or task specific breakpoint can only apply for a
single thread or task, if multiple threads or tasks are specified when
creating the breakpoint (or watchpoint), then the last specified id
will be used.
The exception to the above is that when the 'thread' keyword is used
during the creation of a watchpoint, GDB will give an error if
'thread' is given more than once.
In this commit I propose making this behaviour consistent, if the
'thread' or 'task' keywords are used more than once when creating
either a breakpoint or watchpoint, then GDB will give an error.
I haven't updated the manual, we don't explicitly say that these
keywords can be repeated, and (to me), given the keyword takes a
single id, I don't think it makes much sense to repeat the keyword.
As such, I see this more as adding a missing error to GDB, rather than
making some big change. However, I have added an entry to the NEWS
file as I guess it is possible that some people might hit this new
error with an existing (I claim, badly written) GDB script.
I've added some new tests to check for the new error.
Just one test needed updating, gdb.linespec/keywords.exp, this test
did use the 'thread' keyword twice, and expected the breakpoint to be
created. Looking at what this test was for though, it was checking
the use of '-force-condition', and I don't think that being able to
repeat 'thread' was actually a critical part of this test.
As such, I've updated this test to expect the error when 'thread' is
repeated.
Alan Modra [Sat, 4 Feb 2023 00:59:05 +0000 (11:29 +1030)]
Resetting section vma after _bfd_dwarf2_find_nearest_line
There are failure paths in _bfd_dwarf2_slurp_debug_info that can
result in altered section vmas. Also, when setting ET_REL section
vmas it's not too difficult to handle cases where the original vma was
non-zero, so do that too.
This patch was really in response to an addr2line buffer overflow
processing a fuzzed mips relocatable object file. The file had a
number of .debug_info sections with relocations that included lo16 and
hi16 relocs, and in that order. At least one section VMA was
non-zero. This resulted in processing of DWARF info twice, once via
the call to _bfd_dwarf2_find_nearest_line in
_bfd_mips_elf_find_nearest_line, and because that failed leaving VMAs
altered, the second via the call in _bfd_elf_find_nearest_line. The
first call left entries on mips_hi16_list pointing at buffers
allocated during the first call, the second call processed the
mips_hi16_list after the buffers had been freed. (At least when
running with asan and under valgrind. Under gdb with a non-asan
addr2line the second call allocated exactly the same buffer and the
bug didn't show.) Now I don't really care too much what happens with
fuzzed files, but the logic in _bfd_dwarf2_find_nearest_line is meant
to result in only one read of .debug_info, not multiple reads of the
same info when there are errors. This patch fixes that problem.
* dwarf2.c (struct adjusted_section): Add orig_vma.
(unset_sections): Reset vma to it.
(place_sections): Handle non-zero vma too. Save orig_vma.
(_bfd_dwarf2_slurp_debug_info): Tidy. Correct outdated comment.
On error returns after calling place_sections, call
unset_sections.
(_bfd_dwarf2_find_nearest_line_with_alt): Simplify call to
unset_sections.
Romain Geissler [Sun, 5 Feb 2023 13:56:34 +0000 (13:56 +0000)]
[PR 30082] Pass $JANSSON_LIBS and $ZSTD_LIBS to ld-bootstrap/bootrap.exp
GDB Administrator [Mon, 6 Feb 2023 00:00:11 +0000 (00:00 +0000)]
Automatic date update in version.in
GDB Administrator [Sun, 5 Feb 2023 00:00:08 +0000 (00:00 +0000)]
Automatic date update in version.in
Andrew Burgess [Thu, 24 Nov 2022 19:36:23 +0000 (19:36 +0000)]
gdb/testsuite: don't try to set non-stop mode on a running target
The test gdb.threads/thread-specific-bp.exp tries to set non-stop mode
on a running target, something which the manual makes clear is not
allowed.
This commit restructures the test a little, we now set the non-stop
mode as part of the GDBFLAGS, so the mode will be set before GDB
connects to the target. As a consequence I'm able to move the
with_test_prefix out of the check_thread_specific_breakpoint proc.
The check_thread_specific_breakpoint proc is now called within a loop.
After this commit the gdb.threads/thread-specific-bp.exp test still
has some failures, this is because of an issue GDB currently has
printing "Thread ... exited" messages. This problem should be
addressed by this patch:
https://sourceware.org/pipermail/gdb-patches/2022-December/194694.html
when it is merged.
Dimitar Dimitrov [Sun, 29 Jan 2023 09:52:52 +0000 (11:52 +0200)]
ld: pru: Add optional section alignments
The Texas Instruments SoCs with AARCH64 host processors have stricter
alignment requirements than ones with ARM32 host processors. It's not
only the requirement for resource_table to be aligned to 8. But also
any loadable segment size must be a multiple of 4 [1].
The current PRU default linker script may output a segment size not
aligned to 4, which would cause firmware load failure on AARCH64 hosts.
Fix this by using COMMONPAGESIZE and MAXPAGESIZE to signify respectively
the section memory size requirement and the resource table section's
start address alignment. This would avoid penalizing the ARM32 hosts,
for which the default values (1 and 1) are sufficient.
For AARCH64 hosts, the alignments would be overwritten from GCC spec
files using the linker command line, e.g.:
-z common-page-size=4 -z max-page-size=8
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/remoteproc/pru_rproc.c?h=v6.1#n555
ld/ChangeLog:
* scripttempl/pru.sc (_data_end): Remove the alignment.
(.data): Align output section size to COMMONPAGESIZE.
(.resource_table): Ditto.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Dimitar Dimitrov [Thu, 26 Jan 2023 19:52:45 +0000 (21:52 +0200)]
ld: pru: Merge the bss input sections into data
The popular method to load PRU firmware is through the remoteproc Linux
kernel driver. In order to save a few bytes from the firmware, the PRU
CRT0 is spared from calling memset for the bss segment [1]. Instead the
host loader is supposed to zero out the bss segment. This is important
for PRU, which typically has only 8KB for instruction memory.
The legacy non-mainline PRU host driver relied on the default
behaviour of the kernel core remoteproc [2]. That default is to zero
out the loadable memory regions not backed by file storage (i.e. the
bss sections). This worked for the libgloss' CRT0.
But the PRU loader merged in mainline Linux explicitly changes the
default behaviour [3]. It no longer is zeroing out memory regions.
Hence the bss sections are not initialized - neither by CRT0, nor by the
host loader.
This patch fixes the issue by aligning the GNU LD default linker script
with the mainline Linux kernel expectation. Since the mainline kernel
driver is submitted by the PRU manufacturer itself (Text Instruments),
we can consider that as defining the ABI.
This change has been tested on Beaglebone AI-64 [4]. Static counter
variables in the firmware are now always starting from zero, as
expected. There was only one new toolchain test failure in orphan3.d,
due to reordering of the output sections. I believe this is a harmless
issue. I could not rewrite the PASS criteria to ignore the output
section ordering, so I have disabled that test case for PRU.
[1] https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=libgloss/pru/crt0.S;h=
b3f0d53a93acc372f461007553e7688ca77753c9;hb=HEAD#l40
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/remoteproc/remoteproc_elf_loader.c?h=v6.1#n228
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/remoteproc/pru_rproc.c?h=v6.1#n641
[4] https://beagleboard.org/ai-64
ld/ChangeLog:
* scripttempl/pru.sc (.data): Merge .bss input sections into the
.data output section.
* testsuite/ld-elf/orphan3.d: Disable for PRU.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
GDB Administrator [Sat, 4 Feb 2023 00:00:09 +0000 (00:00 +0000)]
Automatic date update in version.in
Guillermo E. Martinez [Fri, 3 Feb 2023 17:17:49 +0000 (11:17 -0600)]
bpf: fix error conversion from long unsigned int to unsigned int [-Werror=overflow]
Regenerating BPF target using the maintainer mode emits:
.../opcodes/bpf-opc.c:57:11: error: conversion from โlong unsigned intโ to โunsigned intโ changes value from โ
18446744073709486335โ to โ
4294902015โ [-Werror=overflow]
57 | 64, 64, 0xffffffffffff00ff, { { F (F_IMM32) }, { F (F_OFFSET16) }, { F (F_SRCLE) }, { F (F_OP_CODE) }, { F (F_DSTLE) }, { F (F_OP_SRC) }, { F (F_OP_CLASS) }, { 0 } }
The use of a narrow size to handle the mask CGEN in instruction format
is causing this error. Additionally eBPF `call' instructions
constructed by expressions using symbols (BPF_PSEUDO_CALL) emits
annotations in `src' field of the instruction, used to identify BPF
target endianness.
cpu/
* bpf.cpu (define-call-insn): Remove `src' field from
instruction mask.
include/
*opcode/cge.h (CGEN_IFMT): Adjust mask bit width.
opcodes/
* bpf-opc.c: Regenerate.
Simon Marchi [Fri, 3 Feb 2023 14:21:26 +0000 (09:21 -0500)]
gdb: make target_desc_info_from_user_p a method of target_desc_info
Move the implementation over to target_desc_info. Remove the
target_desc_info forward declaration in target-descriptions.h, it's no
longer needed.
Change-Id: Ic95060341685afe0b73af591ca6efe32f5e7e892
Simon Marchi [Fri, 3 Feb 2023 14:21:25 +0000 (09:21 -0500)]
gdb: remove copy_inferior_target_desc_info
This function is now trivial, we can just copy inferior::tdesc_info
where needed.
Change-Id: I25185e2cd4ba1ef24a822d9e0eebec6e611d54d6
Simon Marchi [Fri, 3 Feb 2023 14:21:24 +0000 (09:21 -0500)]
gdb: remove get_tdesc_info
Remove this function, since it's now a trivial access to
inferior::tdesc_info.
Change-Id: I3e88a8214034f1a4163420b434be11f51eef462c
Simon Marchi [Fri, 3 Feb 2023 14:21:23 +0000 (09:21 -0500)]
gdb: change inferior::tdesc_info to non-pointer
I initially made this field a unique pointer, to have automatic memory
management. But I then thought that the field didn't really need to be
allocated separately from struct inferior. So make it a regular
non-pointer field of inferior.
Remove target_desc_info_free, as it's no longer needed.
Change-Id: Ica2b97071226f31c40e86222a2f6922454df1229
Simon Marchi [Fri, 3 Feb 2023 14:21:22 +0000 (09:21 -0500)]
gdb: move target_desc_info to inferior.h
In preparation for the following patch, where struct inferior needs to
"see" struct target_desc_info, move target_desc_info to the header file.
I initially moved the structure to target-descriptions.h, and later made
inferior.h include target-descriptions.h. This worked, but it then
occured to me that target_desc_info is really an inferior property that
involves a target description, so I think it makes sense to have it in
inferior.h.
Change-Id: I3e81d04faafcad431e294357389f3d4c601ee83d
Simon Marchi [Fri, 3 Feb 2023 13:23:32 +0000 (08:23 -0500)]
gdb: use assignment to initialize variable in tdesc_parse_xml
Since allocate_target_description returns a target_desc_up, use
assignment to initialize the description variable.
Change-Id: Iab3311642c09b95648984f305936f4a4cde09440
Jan Beulich [Fri, 3 Feb 2023 07:23:05 +0000 (08:23 +0100)]
x86: drop LOCK from XCHG when optimizing
Like with segment overrides on LEA, optimize away such a redundant
instruction prefix.
Jan Beulich [Fri, 3 Feb 2023 07:22:35 +0000 (08:22 +0100)]
x86-64: respect {nooptimize} when building VEX prefix
Swapping operands for commutative insns occurs outside of
optimize_encoding() and hence needs explicit checking for a request to
avoid any optimizations.
Jan Beulich [Fri, 3 Feb 2023 07:22:12 +0000 (08:22 +0100)]
x86: respect {nooptimize} for LEA
Dropping a meaningless segment prefix occurs outside of
optimize_encoding() and hence needs explicit checking for a request to
avoid any optimizations.
Jan Beulich [Fri, 3 Feb 2023 07:21:11 +0000 (08:21 +0100)]
x86-64: respect MOVABS when choosing alternative encodings
The alternative encoding is valid for MOV, but there's no such thing for
MOVABS.
Jan Beulich [Fri, 3 Feb 2023 07:20:32 +0000 (08:20 +0100)]
RISC-V: don't disassemble unrecognized insns as .byte
Insn width granularity being 16 bits, producing byte granular output
isn't very useful. With there being a way to specific otherwise
unknown insns to the assembler, use that same representation (to be
precise: its <length>,<encoding> flavor) for disassembly.
Alan Modra [Thu, 2 Feb 2023 12:09:31 +0000 (22:39 +1030)]
Add ECOFF Symbolic Header sanity checks
Anti-fuzzer measures. The checks don't ensure the various elements in
the header are distinct, but that isn't important as far as making
sure we don't overrun the buffer containing all the elements. Also,
we now don't care about offsets where the corresponding count is zero.
* ecoff.c (_bfd_ecoff_slurp_symbolic_info): Sanity check offsets
in debug->symbolic_header.
GDB Administrator [Fri, 3 Feb 2023 00:00:08 +0000 (00:00 +0000)]
Automatic date update in version.in
Simon Marchi [Tue, 3 Jan 2023 20:07:07 +0000 (15:07 -0500)]
gdb: initial support for ROCm platform (AMDGPU) debugging
This patch adds the foundation for GDB to be able to debug programs
offloaded to AMD GPUs using the AMD ROCm platform [1]. The latest
public release of the ROCm release at the time of writing is 5.4, so
this is what this patch targets.
The ROCm platform allows host programs to schedule bits of code for
execution on GPUs or similar accelerators. The programs running on GPUs
are typically referred to as `kernels` (not related to operating system
kernels).
Programs offloaded with the AMD ROCm platform can be written in the HIP
language [2], OpenCL and OpenMP, but we're going to focus on HIP here.
The HIP language consists of a C++ Runtime API and kernel language.
Here's an example of a very simple HIP program:
#include "hip/hip_runtime.h"
#include <cassert>
__global__ void
do_an_addition (int a, int b, int *out)
{
*out = a + b;
}
int
main ()
{
int *result_ptr, result;
/* Allocate memory for the device to write the result to. */
hipError_t error = hipMalloc (&result_ptr, sizeof (int));
assert (error == hipSuccess);
/* Run `do_an_addition` on one workgroup containing one work item. */
do_an_addition<<<dim3(1), dim3(1), 0, 0>>> (1, 2, result_ptr);
/* Copy result from device to host. Note that this acts as a synchronization
point, waiting for the kernel dispatch to complete. */
error = hipMemcpyDtoH (&result, result_ptr, sizeof (int));
assert (error == hipSuccess);
printf ("result is %d\n", result);
assert (result == 3);
return 0;
}
This program can be compiled with:
$ hipcc simple.cpp -g -O0 -o simple
... where `hipcc` is the HIP compiler, shipped with ROCm releases. This
generates an ELF binary for the host architecture, containing another
ELF binary with the device code. The ELF for the device can be
inspected with:
$ roc-obj-ls simple
1 host-x86_64-unknown-linux file://simple#offset=8192&size=0
1 hipv4-amdgcn-amd-amdhsa--gfx906 file://simple#offset=8192&size=34216
$ roc-obj-extract 'file://simple#offset=8192&size=34216'
$ file simple-offset8192-size34216.co
simple-offset8192-size34216.co: ELF 64-bit LSB shared object, *unknown arch 0xe0* version 1, dynamically linked, with debug_info, not stripped
^
amcgcn architecture that my `file` doesn't know about ----ยด
Running the program gives the very unimpressive result:
$ ./simple
result is 3
While running, this host program has copied the device program into the
GPU's memory and spawned an execution thread on it. The goal of this
GDB port is to let the user debug host threads and these GPU threads
simultaneously. Here's a sample session using a GDB with this patch
applied:
$ ./gdb -q -nx --data-directory=data-directory ./simple
Reading symbols from ./simple...
(gdb) break do_an_addition
Function "do_an_addition" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (do_an_addition) pending.
(gdb) r
Starting program: /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5db7640 (LWP
1082911)]
[New Thread 0x7ffef53ff640 (LWP
1082913)]
[Thread 0x7ffef53ff640 (LWP
1082913) exited]
[New Thread 0x7ffdecb53640 (LWP
1083185)]
[New Thread 0x7ffff54bf640 (LWP
1083186)]
[Thread 0x7ffdecb53640 (LWP
1083185) exited]
[Switching to AMDGPU Wave 2:2:1:1 (0,0,0)/0]
Thread 6 hit Breakpoint 1, do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24
24 *out = a + b;
(gdb) info inferiors
Num Description Connection Executable
* 1 process
1082907 1 (native) /home/smarchi/build/binutils-gdb-amdgpu/gdb/simple
(gdb) info threads
Id Target Id Frame
1 Thread 0x7ffff5dc9240 (LWP
1082907) "simple" 0x00007ffff5e9410b in ?? () from /opt/rocm-5.4.0/lib/libhsa-runtime64.so.1
2 Thread 0x7ffff5db7640 (LWP
1082911) "simple" __GI___ioctl (fd=3, request=
3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
5 Thread 0x7ffff54bf640 (LWP
1083186) "simple" __GI___ioctl (fd=3, request=
3222817548) at ../sysdeps/unix/sysv/linux/ioctl.c:36
* 6 AMDGPU Wave 2:2:1:1 (0,0,0)/0 do_an_addition (
a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24
(gdb) bt
Python Exception <class 'gdb.error'>: Unhandled dwarf expression opcode 0xe1
#0 do_an_addition (a=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
b=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>,
out=<error reading variable: DWARF-2 expression error: `DW_OP_regx' operations must be used either alone or in conjunction with DW_OP_piece or DW_OP_bit_piece.>) at simple.cpp:24
(gdb) continue
Continuing.
result is 3
warning: Temporarily disabling breakpoints for unloaded shared library "file:///home/smarchi/build/binutils-gdb-amdgpu/gdb/simple#offset=8192&size=67208"
[Thread 0x7ffff54bf640 (LWP
1083186) exited]
[Thread 0x7ffff5db7640 (LWP
1082911) exited]
[Inferior 1 (process
1082907) exited normally]
One thing to notice is the host and GPU threads appearing under
the same inferior. This is a design goal for us, as programmers tend to
think of the threads running on the GPU as part of the same program as
the host threads, so showing them in the same inferior in GDB seems
natural. Also, the host and GPU threads share a global memory space,
which fits the inferior model.
Another thing to notice is the error messages when trying to read
variables or printing a backtrace. This is expected for the moment,
since the AMD GPU compiler produces some DWARF that uses some
non-standard extensions:
https://llvm.org/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html
There were already some patches posted by Zoran Zaric earlier to make
GDB support these extensions:
https://inbox.sourceware.org/gdb-patches/
20211105113849.118800-1-zoran.zaric@amd.com/
We think it's better to get the basic support for AMD GPU in first,
which will then give a better justification for GDB to support these
extensions.
GPU threads are named `AMDGPU Wave`: a wave is essentially a hardware
thread using the SIMT (single-instruction, multiple-threads) [3]
execution model.
GDB uses the amd-dbgapi library [4], included in the ROCm platform, for
a few things related to AMD GPU threads debugging. Different components
talk to the library, as show on the following diagram:
+---------------------------+ +-------------+ +------------------+
| GDB | amd-dbgapi target | <-> | AMD | | Linux kernel |
| +-------------------+ | Debugger | +--------+ |
| | amdgcn gdbarch | <-> | API | <=> | AMDGPU | |
| +-------------------+ | | | driver | |
| | solib-rocm | <-> | (dbgapi.so) | +--------+---------+
+---------------------------+ +-------------+
- The amd-dbgapi target is a target_ops implementation used to control
execution of GPU threads. While the debugging of host threads works
by using the ptrace / wait Linux kernel interface (as usual), control
of GPU threads is done through a special interface (dubbed `kfd`)
exposed by the `amdgpu` Linux kernel module. GDB doesn't interact
directly with `kfd`, but instead goes through the amd-dbgapi library
(AMD Debugger API on the diagram).
Since it provides execution control, the amd-dbgapi target should
normally be a process_stratum_target, not just a target_ops. More
on that later.
- The amdgcn gdbarch (describing the hardware architecture of the GPU
execution units) offloads some requests to the amd-dbgapi library,
so that knowledge about the various architectures doesn't need to be
duplicated and baked in GDB. This is for example for things like
the list of registers.
- The solib-rocm component is an solib provider that fetches the list of
code objects loaded on the device from the amd-dbgapi library, and
makes GDB read their symbols. This is very similar to other solib
providers that handle shared libraries, except that here the shared
libraries are the pieces of code loaded on the device.
Given that Linux host threads are managed by the linux-nat target, and
the GPU threads are managed by the amd-dbgapi target, having all threads
appear in the same inferior requires the two targets to be in that
inferior's target stack. However, there can only be one
process_stratum_target in a given target stack, since there can be only
one target per slot. To achieve it, we therefore resort the hack^W
solution of placing the amd-dbgapi target in the arch_stratum slot of
the target stack, on top of the linux-nat target. Doing so allows the
amd-dbgapi target to intercept target calls and handle them if they
concern GPU threads, and offload to beneath otherwise. See
amd_dbgapi_target::fetch_registers for a simple example:
void
amd_dbgapi_target::fetch_registers (struct regcache *regcache, int regno)
{
if (!ptid_is_gpu (regcache->ptid ()))
{
beneath ()->fetch_registers (regcache, regno);
return;
}
// handle it
}
ptids of GPU threads are crafted with the following pattern:
(pid, 1, wave id)
Where pid is the inferior's pid and "wave id" is the wave handle handed
to us by the amd-dbgapi library (in practice, a monotonically
incrementing integer). The idea is that on Linux systems, the
combination (pid != 1, lwp == 1) is not possible. lwp == 1 would always
belong to the init process, which would also have pid == 1 (and it's
improbable for the init process to offload work to the GPU and much less
for the user to debug it). We can therefore differentiate GPU and
non-GPU ptids this way. See ptid_is_gpu for more details.
Note that we believe that this scheme could break down in the context of
containers, where the initial process executed in a container has pid 1
(in its own pid namespace). For instance, if you were to execute a ROCm
program in a container, then spawn a GDB in that container and attach to
the process, it will likely not work. This is a known limitation. A
workaround for this is to have a dummy process (like a shell) fork and
execute the program of interest.
The amd-dbgapi target watches native inferiors, and "attaches" to them
using amd_dbgapi_process_attach, which gives it a notifier fd that is
registered in the event loop (see enable_amd_dbgapi). Note that this
isn't the same "attach" as in PTRACE_ATTACH, but being ptrace-attached
is a precondition for amd_dbgapi_process_attach to work. When the
debugged process enables the ROCm runtime, the amd-dbgapi target gets
notified through that fd, and pushes itself on the target stack of the
inferior. The amd-dbgapi target is then able to intercept target_ops
calls. If the debugged process disables the ROCm runtime, the
amd-dbgapi target unpushes itself from the target stack.
This way, the amd-dbgapi target's footprint stays minimal when debugging
a process that doesn't use the AMD ROCm platform, it does not intercept
target calls.
The amd-dbgapi library is found using pkg-config. Since enabling
support for the amdgpu architecture (amdgpu-tdep.c) depends on the
amd-dbgapi library being present, we have the following logic for
the interaction with --target and --enable-targets:
- if the user explicitly asks for amdgcn support with
--target=amdgcn-*-* or --enable-targets=amdgcn-*-*, we probe for
the amd-dbgapi and fail if not found
- if the user uses --enable-targets=all, we probe for amd-dbgapi,
enable amdgcn support if found, disable amdgcn support if not found
- if the user uses --enable-targets=all and --with-amd-dbgapi=yes,
we probe for amd-dbgapi, enable amdgcn if found and fail if not found
- if the user uses --enable-targets=all and --with-amd-dbgapi=no,
we do not probe for amd-dbgapi, disable amdgcn support
- otherwise, amd-dbgapi is not probed for and support for amdgcn is not
enabled
Finally, a simple test is included. It only tests hitting a breakpoint
in device code and resuming execution, pretty much like the example
shown above.
[1] https://docs.amd.com/category/ROCm_v5.4
[2] https://docs.amd.com/bundle/HIP-Programming-Guide-v5.4
[3] https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads
[4] https://docs.amd.com/bundle/ROCDebugger-API-Guide-v5.4
Change-Id: I591edca98b8927b1e49e4b0abe4e304765fed9ee
Co-Authored-By: Zoran Zaric <zoran.zaric@amd.com>
Co-Authored-By: Laurent Morichetti <laurent.morichetti@amd.com>
Co-Authored-By: Tony Tye <Tony.Tye@amd.com>
Co-Authored-By: Lancelot SIX <lancelot.six@amd.com>
Co-Authored-By: Pedro Alves <pedro@palves.net>
Simon Marchi [Wed, 30 Nov 2022 14:46:09 +0000 (09:46 -0500)]
gdb: make gdb_printing_disassembler::stream public
In the ROCm port, we need to access the underlying stream of a
gdb_printing_disassembler, so make it public. The reason we need to
access it is to know whether it supports style escape code. We then
pass that information to a temporary string_file we use while
symbolizing addresses.
Change-Id: Ib95755a4a45b8f6478787993e9f904df60dd8dc1
Approved-By: Andrew Burgess <aburgess@redhat.com>
Simon Marchi [Tue, 22 Nov 2022 18:18:43 +0000 (13:18 -0500)]
gdb/solib-svr4: don't disable probes interface if probe not found
In ROCm-GDB, we install an solib provider for the GPU code objects on
top of the svr4 provider for the host, in order to add solibs
representing the GPU code objects to the solib list containing the host
process' shared libraries. We override the target_so_ops::handle_event
function pointer with our own, in which we call svr4_so_ops.handle_event
(which contains svr4_handle_solib_event) manually. When the host
(un)loads a library, the ROCm part of handle_event is a no-op. When the
GPU (un)loads a code object, we want the host side (svr4) to be a no-op.
The problem is that when handle_event is called because of a GPU event,
svr4_handle_solib_event gets called while not stopped at an svr4
probe. It then assumes this means there's a problem with the probes
interface and disables it through the following sequence of events:
- solib_event_probe_at return nullptr
- svr4_handle_solib_event returns early
- the make_scope_exit callback calls disable_probes_interface
We could fix that by making the ROCm handle_event callback check if an
svr4 probe is that the stop address, and only call
svr4_so_ops.handle_event if so. However, it doesn't feel right to
include some svr4 implementation detail in the ROCm event handler.
Instead, this patch changes svr4_handle_solib_event to not assume it is
an error if called while not at an svr4 probe location, and therefore
not disable the probes interface. That just means moving the
make_scope_exit call below where we lookup the probe by pc.
Change-Id: Ie8ddf5beffa2e92b8ebfdd016454546252519244
Co-Authored-By: Lancelot SIX <lancelot.six@amd.com>
Simon Marchi [Mon, 3 Oct 2022 16:56:30 +0000 (12:56 -0400)]
gdb: add gdbarch_up
Add a gdbarch_up unique pointer type, that calls gdbarch_free on
deletion. This is used in the ROCm support patch at the end of this
series.
Change-Id: I4b808892d35d69a590ce83180f41afd91705b2c8
Approved-By: Andrew Burgess <aburgess@redhat.com>
Simon Marchi [Wed, 28 Sep 2022 18:35:26 +0000 (14:35 -0400)]
gdb: add inferior_pre_detach observable
Add an observable notified in target_detach just before calling the
detach method on the inferior's target stack. This allows observer to
do some work on the inferior while it's still ptrace-attached, in the
case of a native Linux inferior. Specifically, the amd-dbgapi target
will need it in order to call amd_dbgapi_process_detach before the
process gets ptrace-detached.
Change-Id: I28b6065e251012a4c2db8a600fe13ba31671e3c9
Approved-By: Andrew Burgess <aburgess@redhat.com>
Simon Marchi [Fri, 23 Sep 2022 15:55:32 +0000 (11:55 -0400)]
gdbsupport: add type definitions for pid, lwp and tid
A following patch will want to declare variables of the same type as
some ptid_t components. To make that easy (and avoid harcoding those
types everywhere), define some type definitions in the ptid_t struct for
each of them. Use them throughout ptid.h.
I initially used pid_t, lwp_t and tid_t, but there is the risk of some
system defining the pid_t type using a macro instead of a typedef, which
would break things. So, use the _type suffix instead.
Change-Id: I820b0bea9dafcb4914f1c9ba4bb96b5c666c8dec
Approved-By: Andrew Burgess <aburgess@redhat.com>
Pedro Alves [Fri, 23 Sep 2022 15:48:11 +0000 (11:48 -0400)]
gdb: make install_breakpoint return a non-owning reference
A following patch will want to install a breakpoint and then keep a
non-owning reference to it. Make install_breakpoint return a non-owning
reference, to make that easy.
Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: I2e8106a784021ff276ce251e24708cbdccc2d479
Approved-By: Andrew Burgess <aburgess@redhat.com>
Lancelot SIX [Fri, 2 Sep 2022 19:09:35 +0000 (15:09 -0400)]
gdb: add supports_arch_info callback to gdbarch_register
In the ROCm GDB port, there are some amdgcn architectures known by BFD
that we don't actually support in GDB. We don't want
gdbarch_printable_names to return these architectures.
gdbarch_printable_names is used for a few things:
- completion of the "set architecture" command
- the gdb.architecture_names function in Python
- foreach-arch selftests
Add an optional callback to gdbarch_register that is a predicate
indicating whether the gdbarch supports the given bfd_arch_info. by
default, it is nullptr, meaning that the gdbarch accepts all "mach"s for
that architecture known by BFD.
Change-Id: I712f94351b0b34ed1f42e5cf7fc7ba051315d860
Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Approved-By: Andrew Burgess <aburgess@redhat.com>
Tom de Vries [Thu, 2 Feb 2023 14:07:44 +0000 (15:07 +0100)]
[gas] Update .loc syntax comment in dwarf2dbg.c
I noticed that a comment in gas/dwarf2dbg.c describing .loc syntax was missing
the "view VALUE" option.
Fix this by adding the missing option.
Enze Li [Wed, 1 Feb 2023 14:35:18 +0000 (22:35 +0800)]
gdb: remove gdb_indent.sh
GDB has been converted to a C++ program for many years[1], and the
gdb_indent.sh will not be used any more. Therefore, remove the script as
obvious.
[1] https://sourceware.org/gdb/wiki/cxx-conversion
Approved-By: Simon Marchi <simark@simark.ca>
Indu Bhagat [Thu, 2 Feb 2023 08:49:44 +0000 (00:49 -0800)]
ld/doc: use "stack trace" instead of "unwind" for SFrame
SFrame format is meant for generating stack traces only.
ld/
* ld.texi: Replace the use of "unwind" with "stack trace".
Indu Bhagat [Thu, 2 Feb 2023 08:49:29 +0000 (00:49 -0800)]
bfd: use "stack trace" instead of "unwind" for SFrame
SFrame format is meant for generating stack traces only.
bfd/
* elf-bfd.h: Replace the use of "unwind" with "stack trace".
* elf-sframe.c: Likewise.
* elf64-x86-64.c: Likewise.
* elfxx-x86.c: Likewise.
include/
* elf/common.h: Likewise.
Indu Bhagat [Thu, 2 Feb 2023 08:48:59 +0000 (00:48 -0800)]
gas: use "stack trace" instead of "unwind" for SFrame
SFrame format is meant for generating stack traces only.
gas/
* as.c: Replace the use of "unwind" with "stack trace".
* config/tc-aarch64.c: Likewise.
* config/tc-aarch64.h: Likewise.
* config/tc-i386.c: Likewise.
* config/tc-i386.h: Likewise.
* gen-sframe.c: Likewise.
* gen-sframe.h: Likewise.
* testsuite/gas/cfi-sframe/cfi-sframe-aarch64-2.s: Likewise.
* testsuite/gas/cfi-sframe/cfi-sframe-common-8.s: Likewise.
* testsuite/gas/cfi-sframe/common-empty-2.s: Likewise.
* testsuite/gas/cfi-sframe/common-empty-3.s: Likewise.