Seva Alekseyev [Mon, 13 Jun 2022 12:44:44 +0000 (08:44 -0400)]
Version 5 lineprogram header (#411)
* Version 5 lineprogram header, take 1
* Readelf/decodedline formatting fix
* DWARF 5 fields None, not missing
* Comment
* Sample binary
* Dump unit type in readelf info
* More languages described
* Describing form_line_strp
* Basic support for GNU_PROPERTY_X86_ISA_1
* Readelf decodedline format fixes to match with DWARF5
* Readelf test shorted out for the file/test where a bug in GNU readelf manifests, see PR #411.
* Newline :)
* Readelf's language names matched against binutils
* Comment about lineprogram files and directories
* GNU binutils bug worked around in a slightly less disturbing way - patched the binary, left a comment in the test script.
* Examples autotest no longer fails on Windows over expected path format
* Autotest fix
* Typo
* Windows compatibility, take 2
* No pathlib on Python 2
Co-authored-by: Seva Alekseyev <sevaa@nih.gov>
Eli Bendersky [Wed, 8 Jun 2022 12:57:46 +0000 (05:57 -0700)]
Reformat some docstrings
Eli Bendersky [Tue, 7 Jun 2022 23:20:05 +0000 (16:20 -0700)]
Minor cosmetic changes
Seva Alekseyev [Tue, 7 Jun 2022 23:17:31 +0000 (19:17 -0400)]
Support for sibling in form DW_FORM_ref_addr (#408)
Co-authored-by: Seva Alekseyev <sevaa@nih.gov>
Seva Alekseyev [Mon, 16 May 2022 13:58:24 +0000 (09:58 -0400)]
Storing the offset of DWARF operations within the expression block (#404)
* Storing the offset of every DWARF operation within the expression block
* Trivial auto test
Ronan Dunklau [Tue, 10 May 2022 13:56:32 +0000 (15:56 +0200)]
Improve DWARF 5 compatibility. (#400)
* Add support for DW_FORM_implicit_const
* Add support for DW_FORM_line_strp
* Add new tests for DW_FORM_implicit_const and DW_FORM_line_strp.
Andreas Ziegler [Mon, 14 Feb 2022 13:44:27 +0000 (14:44 +0100)]
Add support for DT_RELR/SHT_RELR compressed relocation sections (#395)
As more and more tools now support DT_RELR compressed relocations
(most notably, the just released GNU binutils 2.38 [0]), let's add
support for reading these relocations as well.
The original discussion about advantages of packed RELATIVE
relocations can be found at [1]. In a nutshell, the format
exploits the fact that RELATIVE relocations are often placed
next to each other and (for x86_64) stores up to 64 relocations
in two 8-byte words. In a regular .rela.dyn table, these would
take up 24 * 64 = 1536 bytes.
The compressed relocations work as follows:
The first word in the section describes a base address and
contains an offset for a relocation. This offset must always
lie at an even address. Following this entry can be one or
more bitmap(s) which have their least significant bit set to 1.
All other bits describe (in increasing order of significance) if
the following continuous offsets also contain a relocation. The
addends for existing relocations are stored at the corresponding
offsets in the file (that is, they work like REL relocations).
A good description of the history of this feature and its current
adoption is the following blog post [2].
[0]: https://lists.gnu.org/archive/html/info-gnu/2022-02/msg00009.html
[1]: https://groups.google.com/g/generic-abi/c/bX460iggiKg?pli=1
[2]: https://maskray.me/blog/2021-10-31-relative-relocations-and-relr
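The decoding rules described in this commit message can be sketched in a few lines of Python. This is a minimal illustration, not the code added to pyelftools: the function name `decode_relr` and the plain-integer input are assumptions made for the example, and the entries are assumed to be already read from the section with the correct byte order.

```python
def decode_relr(entries, word_size=8):
    """Expand RELR entries (plain integers) into relocation offsets.

    Hypothetical helper for illustration; word_size is 8 for 64-bit
    ELF files and 4 for 32-bit ones.
    """
    bits_per_word = word_size * 8
    offsets = []
    next_base = None  # offset described by the lowest bitmap bit
    for entry in entries:
        if entry & 1 == 0:
            # Even entry: an explicit relocation offset.
            offsets.append(entry)
            next_base = entry + word_size
        else:
            if next_base is None:
                raise ValueError("bitmap entry before any address entry")
            bitmap = entry >> 1  # drop the marker bit
            index = 0
            while bitmap:
                if bitmap & 1:
                    offsets.append(next_base + index * word_size)
                bitmap >>= 1
                index += 1
            # Each bitmap covers (bits_per_word - 1) consecutive words.
            next_base += (bits_per_word - 1) * word_size
    return offsets
```

With one address word followed by one fully set bitmap word, the sketch yields 1 + 63 = 64 offsets, which matches the "64 relocations in two 8-byte words" figure quoted above.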
Eli Bendersky [Thu, 3 Feb 2022 14:45:41 +0000 (06:45 -0800)]
Prepare for releasing version 0.28
Brendan Haines [Thu, 13 Jan 2022 23:07:39 +0000 (16:07 -0700)]
Update structs.py (#392)
Remove unused imports
Adam [Tue, 11 Jan 2022 12:05:21 +0000 (12:05 +0000)]
Add PS3/CellOS OSABI identifier (#389)
* Add PS3/CellOS OSABI identifier.
* Remove "OS" from CELL OS ABI
* Add Missing comma for ELFOSABI_CELL_LV2.
Marco Bonelli [Wed, 15 Dec 2021 19:25:34 +0000 (20:25 +0100)]
Add support for note GNU_PROPERTY_X86_FEATURE_1_AND (#388)
- Implement support for GNU property note type
GNU_PROPERTY_X86_FEATURE_1_AND (which is a feature bitmask) and its
relative flags.
- Fix off-by-one in "Data size" column alignment for readelf.py note
sections dump.
References:
- https://gitlab.com/x86-psABIs/x86-64-ABI
Eli Bendersky [Fri, 10 Dec 2021 14:57:26 +0000 (06:57 -0800)]
Rebuild readelf locally and add more instructions
Eli Bendersky [Fri, 10 Dec 2021 14:49:08 +0000 (06:49 -0800)]
Run readelf tests in parallel by default
Marco Bonelli [Fri, 10 Dec 2021 14:36:18 +0000 (15:36 +0100)]
Update readelf to v2.37, adapt readelf.py output and tests (#387)
Changes to conform the output of readelf.py to binutils readelf v2.37:
- Use singular "entry" when needed instead of "entries".
- Output the last entry for the .debug_line output table when
DW_LNE_end_sequence is encountered, as DWARF standard dictates. It
looks like this was a readelf bug which was fixed in commit
ba8826a82a29a19b78c18ce4f44fe313de279af7 of the GNU binutils-gdb repo.
- Add additional "Stmt" field in the .debug_line output table, and
ignore the new "View" field. The "Stmt" field has been implemented in
readelf.py. The "View" field is not something that the DWARF standard
defines, it's an internal register added to the line number
information state machine by binutils to perform assembler checks (see
commit
ba8826a82a29a19b78c18ce4f44fe313de279af7 of GNU binutils-gdb
repo for more info, in particular gas/doc/as.texinfo). "View" is
unimplemented in pyelftools for now and a special case has been added
in the readelf test suite to ignore it.
- Add support for printing section names when dumping .symtab entries of
st_type STT_SECTION as readelf v2.37 does (see commit
23356397449a8aa65afead0a895a20be53b3c6b0 of GNU binutils-gdb repo).
- Add support for recognizing SOs specifically tagged as PIE (DT_FLAGS_1
dynamic tag with DF_1_PIE set). In such case, describe the file as
"Position-Independent Executable file" instead of "Shared object
file", as readelf v2.37 does.
- Add leading "0x" for version section addresses when dumping version
information (-V) as readelf does.
- Ignore "D (mbind)" in section headers flags legend (pyelftools does
not output this flag).
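As a rough illustration of the PIE item in the list above, the check could look like the following. This is only a sketch: `describe_dyn_file` and the list-of-pairs input are hypothetical and do not reflect how readelf.py is structured; the numeric constants are the standard ELF values.

```python
ET_DYN = 3
DT_FLAGS_1 = 0x6ffffffb
DF_1_PIE = 0x08000000

def describe_dyn_file(e_type, dynamic_tags):
    """Describe an ET_DYN file given (d_tag, d_val) pairs.

    Hypothetical helper; returns None for other file types.
    """
    if e_type != ET_DYN:
        return None
    for d_tag, d_val in dynamic_tags:
        if d_tag == DT_FLAGS_1 and d_val & DF_1_PIE:
            return "Position-Independent Executable file"
    return "Shared object file"
```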
Special cases ADDED for run_readelf_tests.py:
- Ignore "View" column for --debug-dump=decodedline in readelf's output.
- Ignore ellipsis ("[...]") for long names/symbols/paths in readelf's
output.
Special cases REMOVED for run_readelf_tests.py:
- Detection of additional '@' after symbol names (flag_after_symtable)
seems to no longer be needed as all tests pass without this exception.
- Special case for DW_AT_apple_xxx seems to no longer be needed, readelf
now recognizes those.
- Special case for PT_GNU_PROPERTY no longer needed, readelf now
recognizes it.
Other changes:
- Add missing import in elftools/dwarf/lineprogram.py.
References:
- GNU binutils-gdb repo: https://sourceware.org/git/?p=binutils-gdb.git
Marco Bonelli [Tue, 7 Dec 2021 14:08:54 +0000 (15:08 +0100)]
Add support for .note.gnu.property notes section (#386)
* Add support for .note.gnu.property notes section
References:
- Doc: https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf
- Linux: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=
00e19ceec80b03a43f626f891fcc53e57919f1b3
- Glibc: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/dl-prop.h;h=
385548fad3e4ad71dbdcdbfada58585c2f24ea5e;hb=HEAD
- Binutils: https://sourceware.org/git/?p=binutils-gdb.git&a=search&h=HEAD&st=commit&s=NT_GNU_PROPERTY_TYPE_0
* Add descriptions for .note.gnu.property notes
* descriptions: add missing PT_GNU_PROPERTY description
* py3compat: add optional separator for bytes2hex
* readelf: align notes column headers
* elf/descriptions: conform to real readelf's output format
* test: special case some known readelf output quirks
* test: add test ELFs for .note.gnu.property notes
Seva Alekseyev [Wed, 17 Nov 2021 23:51:53 +0000 (18:51 -0500)]
DW_AT_private=0x24 (#382)
* DWARF 5 tags and attributes
* DW_AT_private
Co-authored-by: Seva Alekseyev <sevaa@nih.gov>
Seva Alekseyev [Sat, 6 Nov 2021 20:58:46 +0000 (16:58 -0400)]
DW_AT_virtual (#380)
* DWARF 5 tags and attributes
* DW_AT_virtual
Co-authored-by: Seva Alekseyev <sevaa@nih.gov>
Ulugbek Abdullaev [Fri, 29 Oct 2021 16:03:24 +0000 (18:03 +0200)]
add latest 'e_machine' mappings: EM_BPF, EM_CSKY, EM_FRV (#376)
Andreas Ziegler [Mon, 25 Oct 2021 14:31:14 +0000 (16:31 +0200)]
ELFFile: allow filtering by segment type in iter_segments() (#375)
This is very similar to the filtering implemented for
sections in commit
d71faebcd58e.
Karthikeyan Singaravelan [Sat, 16 Oct 2021 13:18:32 +0000 (18:48 +0530)]
Use assertEqual instead of assertEquals for Python 3.11 compatibility. (#374)
Marco Bonelli [Fri, 17 Sep 2021 20:37:03 +0000 (22:37 +0200)]
Keep raw note descriptors in ELF note sections as raw bytes (#372)
* ELF notes: keep raw note descriptors as bytes
* py3compat: add bytes2hex function
* elf/descriptions: use bytes2hex where needed
* ELF notes: convert to string only for known types
Jangseop Shin [Tue, 31 Aug 2021 13:19:27 +0000 (22:19 +0900)]
[example] bug fixes in dwarf_decode_address example (#361)
* [example] Handle lpe with end_sequence correctly
* [example] exclude highpc in address comparison in decode_funcname
Co-authored-by: Jangseop Shin <j.s.shin@samsung.com>
Lukas Dresel [Mon, 2 Aug 2021 15:30:19 +0000 (11:30 -0400)]
fixed parsing for structures containing uids or gids in core dumps for most architectures (#354)
* fixed parsing for structures containing uids or gids in core dumps for most architectures
* added testcase for mips corefile uid/gid parsing
* better description
* better email
William Woodruff [Thu, 27 May 2021 13:38:35 +0000 (09:38 -0400)]
dwarf: initial DWARFv5 support (#363)
* dwarf: initial DWARFv5 support
* dwarf/structs: use Embed to select header layout
* dwarf/structs: DW_FORM_strx family
Not sure how best to handle 24-bit values yet.
* dwarf/structs: use IfThenElse
`If` alone wraps the else in a `Value`.
* dwarf/structs: DW_FORM_addrx family handling
* dwarf_expr: support DW_OP_addrx
Not complete, but gets readelf.py to the end of a single
binary.
* dwarf/constants: DW_UT_* constants
* dwarf/structs: fix some DW_FORMs
* elftools, test: plumbing for DWARFv5 sections
* dwarf/constants: fix typo
* dwarf/structs: re-add a comment that got squashed
* dwarf/structs: DWARFv5 table header scaffolding
* dwarf/constants: typo
* test: add a basic DWARFv5 test
William Woodruff [Fri, 21 May 2021 13:20:12 +0000 (07:20 -0600)]
dwarf/constants: add DW_LNCT_* constants (#362)
These were introduced with DWARFv5 and are documented in S. 6.2.4.1.
Nick Desaulniers [Sat, 15 May 2021 03:34:21 +0000 (20:34 -0700)]
initial support for PPC64LE (#360)
* initial support for PPC64LE
See also:
https://openpowerfoundation.org/wp-content/uploads/2016/03/ABI64BitOpenPOWERv1.1_16July2015_pub4.pdf
3.4.1 Symbol Values
3.5.3 Relocation Types Table
Fixes #317
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
* remove references to LLVM_ADDR_SIG
LeadroyaL [Sun, 18 Apr 2021 13:30:12 +0000 (21:30 +0800)]
Support Android compressed rel/rela sections (#357)
Ref: https://android.googlesource.com/platform/bionic/+/refs/tags/android-11.0.0_r35/libc/include/elf.h
Peter LaFosse [Tue, 26 Jan 2021 00:11:51 +0000 (19:11 -0500)]
Fix/extend aarch64 register names table (#351)
Kyle Zeng [Wed, 20 Jan 2021 13:58:50 +0000 (06:58 -0700)]
fix wrong prpsinfo in 32bit coredump (#347)
* fix wrong prpsinfo in 32bit coredump
* add a sample coredump
* finish the testcase for 32bit core dump
Andreas Ziegler [Tue, 12 Jan 2021 15:03:47 +0000 (16:03 +0100)]
dynamic.py: move logic around to allow symbol access more easily (#346)
So far, the implementation of num_symbols() and get_symbol()
in the DynamicSegment class depended on iter_symbols().
However, most of iter_symbols() is actually about
determining the number of symbols. Let's move that logic to
the correct method and use it in iter_symbols().
Additionally, in an ELF file without any exported symbols,
the hash table will be empty and will thus return a too low
number of symbols. However, a loader might still need to
access the imported symbols (which also have an entry in
the symbol table, with st_shndx set to SHN_UNDEF). To allow
this, make get_symbol() take any index and simply read the
symbol data from the corresponding index, and use
get_symbol() from iter_symbols(). This way, one can for
example use symbol index information from relocation entries
to directly access the symbol data.
These changes also make the logic in DynamicSegment resemble
the code in SymbolTableSection more closely.
Fixes: #342
Andreas Ziegler [Tue, 12 Jan 2021 00:27:24 +0000 (01:27 +0100)]
ELFFile: allow filtering of sections by type in iter_sections (#345)
As stated in the corresponding issue, we can already filter
the output of Dynamic.iter_tags() by the type of the tag
we're looking for.
Let's adapt the iteration over the sections of the ELF file
so that it only yields sections of a certain type if the
optional type parameter is passed to iter_sections().
By doing this we can also simplify two call sites inside
the ELFFile class.
Fixes: #344
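The filtering idea in this commit can be sketched independently of the library. The function below is a stand-in that works on plain dicts, not the ELFFile method itself; only the optional `type` parameter name is taken from the message above.

```python
def iter_sections(sections, type=None):
    """Yield all sections, or only those whose sh_type equals `type`.

    `sections` is assumed to be an iterable of dict-like headers; the
    parameter name shadows the builtin to mirror the commit message.
    """
    for section in sections:
        if type is None or section['sh_type'] == type:
            yield section
```

A call such as `list(iter_sections(sections, type='SHT_SYMTAB'))` then replaces the usual loop-and-compare pattern at the call site.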
Jonathan Bruchim [Mon, 7 Dec 2020 16:36:47 +0000 (18:36 +0200)]
added a method for returning the index of a section by name (#331)
* added a method for returning the index of a section by name
Signed-off-by: Jonathan <yonbruchim@gmail.com>
* changed naming of init mapping function
Signed-off-by: Jonathan <yonbruchim@gmail.com>
* Fixed CR
Added a test file containing 3 tests
1. test index of existing section
2. test index of missing section
3. test index of existing section when section_map_name is None
Signed-off-by: Jonathan Bruchim <yonbruchim@gmail.com>
Eli Bendersky [Tue, 27 Oct 2020 13:47:42 +0000 (06:47 -0700)]
Update .gitignore to ignore .egg-info dirs
Eli Bendersky [Tue, 27 Oct 2020 13:47:07 +0000 (06:47 -0700)]
Update TODO to mention git tag
Eli Bendersky [Tue, 27 Oct 2020 13:42:31 +0000 (06:42 -0700)]
remove dir that shouldn't be in git
Eli Bendersky [Tue, 27 Oct 2020 13:42:12 +0000 (06:42 -0700)]
Version 0.27 release
Eli Bendersky [Tue, 27 Oct 2020 13:29:22 +0000 (06:29 -0700)]
Add a bit more details to dwarf_pubnames_types example
Fix reference output and make test emit both outputs when they differ
Eli Bendersky [Tue, 27 Oct 2020 12:59:56 +0000 (05:59 -0700)]
Make dwarf_pubnames_types example a bit more general
Eli Bendersky [Tue, 27 Oct 2020 12:39:37 +0000 (05:39 -0700)]
Replace field access with property name access
Seva Alekseyev [Tue, 27 Oct 2020 12:36:46 +0000 (08:36 -0400)]
DebugSectionDescriptor.size initialized with decompressed section size (#339)
Andreas Ziegler [Mon, 26 Oct 2020 13:07:42 +0000 (14:07 +0100)]
hash.py: observe endianness when reading hashes (#338)
Reading the hashes from a GNUHashTable didn't properly use
the endianness of the underlying ELF file, so looking up
hashes would fail if the byte order of the analyzed file
did not match the native byte order of the current machine.
The test file consists of two functions:
int callee(){
return 42;
}
int caller(){
return callee();
}
and was compiled using `aarch64_be-linux-gcc` (version 8.3
on an x86_64 host) with the `-mbig-endian` and `-shared`
command line flags.
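The class of bug fixed here is easy to reproduce with the standard `struct` module: a format string without an explicit byte-order prefix uses the host's order. The sketch below assumes a hypothetical helper `read_words` and is not the code in hash.py.

```python
import struct

def read_words(data, count, little_endian):
    """Read `count` 32-bit words from `data` in the file's byte order.

    Hypothetical helper; the point is the explicit '<' or '>' prefix
    instead of relying on the native order of the analyzing machine.
    """
    fmt = ('<' if little_endian else '>') + '%dI' % count
    return list(struct.unpack_from(fmt, data))
```

Reading the bytes `00 00 00 2a` gives 42 for a big-endian file and 0x2a000000 for a little-endian one, regardless of the machine running the script.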
Seva Alekseyev [Sat, 10 Oct 2020 12:38:13 +0000 (08:38 -0400)]
DWARFv1 constants in enums, DW_FORM_ref parsing (#335)
Andreas Ziegler [Thu, 1 Oct 2020 13:45:19 +0000 (15:45 +0200)]
elf: support for ELF files with a large number of sections (#333)
* elf: implement support for ELF files with a large number of sections
As documented in the ELF specification [0] and reported in #330,
the number of sections (`e_shnum` member of the ELF header)
as well as the section table index of the section name string
table (`e_shstrndx` member) could exceed the SHN_LORESERVE
(0xff00) value. In this case, the members of the ELF header
are set to 0 or SHN_XINDEX (0xffff), respectively, and the
actual values are found in the initial entry of the section
header table (which is otherwise set to zeroes).
So far, the implementation of `elffile.num_sections()`
didn't handle these situations and simply reported that the
file contained 0 sections, and `scripts/readelf.py` presented
invalid values.
Fix it by following the specification more closely and
showing the corresponding correct values in `readelf.py`.
[0]: https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.eheader.html
Closes: #330
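The escape mechanism described above can be summarized in a short sketch. It assumes the file has a section header table and that its first entry is available as a dict; the function name `resolve_section_fields` is hypothetical and is not the pyelftools implementation.

```python
SHN_XINDEX = 0xffff

def resolve_section_fields(e_shnum, e_shstrndx, first_header):
    """Return the real (section count, string table index).

    `first_header` is assumed to be the section header at index 0,
    given as a dict with 'sh_size' and 'sh_link' keys.
    """
    num = e_shnum if e_shnum != 0 else first_header['sh_size']
    if e_shstrndx == SHN_XINDEX:
        strndx = first_header['sh_link']
    else:
        strndx = e_shstrndx
    return num, strndx
```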
* test: add test file with a large number of sections
This file was generated with the following commands:
$ for i in {1..65280}; do
echo "void __attribute__((section(\"s.$i\"), naked)) f$i(void) {}";
done > many_sections.c;
echo "int main(){}" >> many_sections.c
$ gcc-8 -fno-asynchronous-unwind-tables -c -o many_sections.o.elf many_sections.c
$ strip many_sections.o.elf
Eli Bendersky [Wed, 23 Sep 2020 13:25:36 +0000 (06:25 -0700)]
Remove Travis config
Eli Bendersky [Wed, 23 Sep 2020 13:23:47 +0000 (06:23 -0700)]
Change badge image to point to github actions, not Travis
Eli Bendersky [Wed, 23 Sep 2020 13:21:12 +0000 (06:21 -0700)]
Set to run only on ubuntu because of readelf binary
Also fix mentions of Travis
Eli Bendersky [Wed, 23 Sep 2020 13:18:54 +0000 (06:18 -0700)]
Fix typo in ci.yml
Eli Bendersky [Wed, 23 Sep 2020 13:17:02 +0000 (06:17 -0700)]
Add GitHub actions workflow for CI
LeadroyaL [Wed, 19 Aug 2020 16:35:12 +0000 (00:35 +0800)]
Add support for ARM exception handler ABI (#328)
Eli Bendersky [Tue, 18 Aug 2020 00:57:18 +0000 (17:57 -0700)]
Fix python versions for tests that run
On Travis run fewer old Python versions.
Locally, only run the latest Python 2.x and 3.x
Closes #305
Val [Sat, 25 Jul 2020 12:22:10 +0000 (08:22 -0400)]
Update code to work with pickling (#327)
pagabuc [Mon, 20 Jul 2020 21:21:49 +0000 (14:21 -0700)]
Return the correct number of program headers when e_phnum is 0xffff (#326)
* Return the correct number of program headers when e_phnum is 0xffff
* Added link and relevant text of the specification
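For reference, the rule in the ELF specification is that an e_phnum of PN_XNUM (0xffff) moves the real count into the sh_info field of the section header at index 0. A minimal sketch, with a hypothetical function name and a dict standing in for that header:

```python
PN_XNUM = 0xffff

def real_segment_count(e_phnum, first_section_header):
    """Return the actual number of program headers.

    `first_section_header` is assumed to be a dict for section 0; it
    is only consulted when e_phnum holds the PN_XNUM escape value.
    """
    if e_phnum == PN_XNUM:
        return first_section_header['sh_info']
    return e_phnum
```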
Eli Bendersky [Wed, 8 Jul 2020 00:44:33 +0000 (17:44 -0700)]
Fix formatting and add comment in test
Fish [Wed, 8 Jul 2020 00:42:33 +0000 (17:42 -0700)]
Fix the non-determinism in test_dwarf_expr. (#324)
Eli Bendersky [Wed, 8 Jul 2020 00:15:02 +0000 (17:15 -0700)]
Revert "for sibling of form ref_addr, only sibling value should be used (#268)"
This reverts commit
575425338fbab134919cb1206509589a174fb81f.
This breaks the tests:
Test file 'test/testfiles_for_readelf/sibling_ref_addr.elf'
.......................FAIL
....for option "-e"
....Output #1 is readelf, Output #2 is pyelftools
@@ Mismatch on line #13:
>> flags: 0x80000000, emb<<
>> flags: 0x80000000<<
([('equal', 0, 47, 0, 47), ('delete', 47, 52, 47, 47)])
@@ Output #1 dumped to file: /tmp/out1_vn_mmkbu.stdout
@@ Output #2 dumped to file: /tmp/out2_l8_zbr6h.stdout
.......................FAIL
....for option "-n"
....Output #1 is readelf, Output #2 is pyelftools
@@ Mismatch on line #2:
>> apuinfo 0x00000008 nt_arch (architecture)<<
>> apuinfo 0x00000008 nt_gnu_hwcap (dso-supplied software hwcap info)<<
([('equal', 0, 37, 0, 37), ('insert', 37, 37, 37, 66), ('equal', 37, 39, 66, 68), ('insert', 39, 39, 68, 72), ('equal', 39, 40, 72, 73), ('replace', 40, 41, 73, 75), ('equal', 41, 42, 75, 76), ('delete', 42, 47, 76, 76), ('equal', 47, 48, 76, 77), ('replace', 48, 55, 77, 80), ('equal', 55, 56, 80, 81)])
@@ Output #1 dumped to file: /tmp/out1_kla3jq33.stdout
@@ Output #2 dumped to file: /tmp/out2_qzmuu23z.stdout
@@ aborting - 'test/external_tools/readelf -x.text' returned '1'
Conclusion: FAIL
sagiben [Wed, 8 Jul 2020 00:10:07 +0000 (03:10 +0300)]
for sibling of form ref_addr, only sibling value should be used (#268)
* for sibling of form ref_addr, only sibling value should be used
* Add ELF testcase for PR #268
Gabriel-Andrew Pollo Guilbert [Wed, 8 Jul 2020 00:06:59 +0000 (20:06 -0400)]
Fix typo when referencing DW_FORM_ref_addr (#321)
Fish [Wed, 8 Jul 2020 00:05:00 +0000 (17:05 -0700)]
Fix Python 2 support of FDE. (#323)
Fish [Tue, 7 Jul 2020 13:07:12 +0000 (06:07 -0700)]
dwarf.CallFrameInfo: Support parsing LSDA pointers from FDEs. (#308)
* dwarf.CallFrameInfo: Support parsing LSDA pointers from FDEs.
* Add a test case.
* Make 0 explicit. More doc-string.
Nick Desaulniers [Tue, 9 Jun 2020 12:53:08 +0000 (05:53 -0700)]
initial support for aarch64 little endian (#318)
See also:
https://static.docs.arm.com/ihi0056/b/IHI0056B_aaelf64.pdf
for relocation types and
https://developer.arm.com/docs/ihi0057/c/dwarf-for-the-arm-64-bit-architecture-aarch64-abi-2018q4
for DWARF register names.
Issue #317
Link: https://github.com/ClangBuiltLinux/frame-larger-than/issues/4
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Disconnect3d [Mon, 8 Jun 2020 12:14:37 +0000 (14:14 +0200)]
Add PT_GNU_PROPERTY enum (#319)
This commit adds the missing `PT_GNU_PROPERTY` program header enums.
Additional information regarding the `PT_GNU_PROPERTY` can be found at:
* https://reviews.llvm.org/D70959
* https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf (linked in above url)
* https://sourceware.org/pipermail/libc-alpha/2020-May/113841.html (commit in libc adding this value)
This program header can be found, e.g., in a glibc in Ubuntu 20.04 (see `docker run --rm -it ubuntu:20.04 cat /usr/lib/x86_64-linux-gnu/libc-2.31.so > libc-2.31.so`).
William Woodruff [Mon, 1 Jun 2020 13:00:00 +0000 (09:00 -0400)]
dwarf/dwarf_expr: Add support for DW_OP_GNU_push_tls_address (#315)
* dwarf/dwarf_expr: Add support for DW_OP_GNU_push_tls_address
* dwarf/dwarf_expr: Use a single 64-bit operand for const8x
DWARFv4 2.5.1.1: this should be consumed as a single 64-bit operand,
not as two 32-bit operands.
* dwarf/descriptions: Fix descriptions for const8{u,s}
* test: Add tests for changed OPs
Patrick Gordon [Sat, 23 May 2020 14:49:28 +0000 (15:49 +0100)]
fix issue in aranges cu_offset_at_addr (#310)
* fix issue in aranges cu_offset_at_addr
if there are aranges for some parts of the binary but not others, incorrect aranges may be returned
* add test suite for absent/partial/complete aranges
mephi42 [Thu, 21 May 2020 12:18:17 +0000 (14:18 +0200)]
Fix determining PAGESIZE under Jython (#314)
Jython has neither `resource` nor `mmap`, therefore just use a
reasonable default.
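A fallback of this kind might look as follows. This is a sketch of the pattern only; the function name and the 4096 default are assumptions for the example, not necessarily what the commit uses.

```python
def detect_pagesize(default=4096):
    """Best-effort page size lookup with a fixed fallback.

    Tries `resource`, then `mmap`; if neither module can be imported
    (as reported for Jython), returns the assumed default.
    """
    try:
        import resource
        return resource.getpagesize()
    except ImportError:
        pass
    try:
        import mmap
        return mmap.PAGESIZE
    except ImportError:
        return default
```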
Chunbo [Mon, 27 Apr 2020 13:43:03 +0000 (21:43 +0800)]
Fix a typo in adapters.py (#309)
Milton D. Miller II [Wed, 22 Apr 2020 12:57:32 +0000 (12:57 +0000)]
Cached random access to CUs and DIEs (#264)
* dwarf/compileunit: Lookup DIE from a reference
Accept a resolved reference address for a DIE in a compile unit and
parse the DIE at that location. Insert into the _diemap / _dielist
cache shared with iter_DIE_children() for fast repeated lookups.
This can be used to follow attribute references to a DIE that may be
referenced several times (eg for a DW_AT_type reference) or find
a DIE referenced in a lookup table.
* dwarf/dwarfinfo: Cache CUs, direct parse or search from known units
Maintain a cache of compile units parsed and a map of their offsets
similar to the one maintained of DIEs by compile units.
Add the ability to parse a random compile unit when the offset of
the compile unit header is known.
Add the ability to search for a compile unit containing (spanning)
a given refaddr, such as that obtained from a DIE reference class
attribute, starting from the closest previous cached compile unit.
* dwarf/die: search for parents on demand
Add a function to set the _parent link of known children, iterating
down each parent of a target DIE. Walk all children of a given
parent and set each child's ._parent to avoid O(n^2) walking.
A future commit will add other methods to instantiate a DIE that will
not set the _parent link as the DIE is instantiated.
This walk uses the knowledge that in a flattened tree a parent's offset
will always be less than the child's.
The call to die.set_parent in compile_unit iter_DIE_children could be
removed to make the method private, but it is free to set starting
from the top DIE. Alternatively make it an optional argument to
DIE creation.
* dwarf/dwarfinfo: APIs to lookup DIEs
Add APIs to lookup a DIE from: (a) a DIE reference class attribute
taking into account the attribute form, (b) from a lookup table entry
(NameLUTEntry) from a .pub_types or .pub_names section, or (c) directly
from a reference address (.debug_info offset) regardless of how it
was obtained.
Add a test that will lookup dies from pubnames and follow die by ref.
This is a simple test that exercises the new cache lookup
methods and provides a starting point on how to determine a
variable's type.
For now raise NotImplemented exception for type signature lookup
and supplemental dwarf object files.
Eli Bendersky [Sat, 28 Mar 2020 13:21:15 +0000 (06:21 -0700)]
Clean up whitespace
Seva Alekseyev [Mon, 23 Mar 2020 14:16:01 +0000 (10:16 -0400)]
Support for --debug-dump=loc in readelf.py and in the test (#304)
Eli Bendersky [Mon, 23 Mar 2020 13:02:22 +0000 (06:02 -0700)]
Reformat whitespace
Eli Bendersky [Sun, 22 Mar 2020 13:50:09 +0000 (06:50 -0700)]
Add full list of supported debug sections in readelf's help message
Eli Bendersky [Sun, 22 Mar 2020 13:42:27 +0000 (06:42 -0700)]
Fix --parallel readelf test after previous commit
Previous commit broke them because lambdas can't be pickled by multiprocessing
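The underlying limitation is that multiprocessing sends work to child processes through `pickle`, and a lambda has no importable name to pickle by reference. A small check, using a hypothetical helper name, shows the difference between a named callable and a lambda:

```python
import pickle

def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps().

    Several exception types are caught because the exact one raised
    for unpicklable callables varies between Python versions.
    """
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
```

The usual fix is to replace the lambda with a module-level function, which can be pickled by name.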
Seva Alekseyev [Sun, 22 Mar 2020 13:35:19 +0000 (09:35 -0400)]
GNU expressions (#303)
William Woodruff [Sat, 21 Mar 2020 14:29:46 +0000 (10:29 -0400)]
Add support for DW_LNE_set_discriminator (#282)
Pierre-Marie de Rodat [Tue, 17 Mar 2020 12:01:45 +0000 (13:01 +0100)]
Enhance MIPS64 testing and simplify handling code for its peculiar relocations (#300)
* Add handling for SHF_MASKOS section flags
* Add readelf testcases for MIPS64 specificities
* Simplify the decoding of MIPS64 relocations
Instead of using "fake" fields to parse the relocation structure and
then use complex shift/masks to recover the conveyed information (once
for big endian binaries, twice for little endian ones), use fields
actually described in the spec and use straightforward shifts to
synthesize the "fake" fields.
Eli Bendersky [Sat, 14 Mar 2020 12:53:51 +0000 (05:53 -0700)]
Remove unused field
Eli Bendersky [Sat, 14 Mar 2020 12:52:03 +0000 (05:52 -0700)]
Simplify ExprDumper now that the expression parser is simpler.
We no longer need the part-by-part dumping and separate process/get_str.
Also simplify tests.
Fixes #298
Eli Bendersky [Sat, 14 Mar 2020 12:37:53 +0000 (05:37 -0700)]
Cache dispatch table between expr parses
In descriptions, ExprDumper invokes parse_expr many times on small
expressions. Initializing the dispatch table for every parse is
wasteful.
Wrap parse_expr with a simple object that generates and caches the
dispatch table during initialization. parse_expr remains stateless.
Updates #298
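The caching pattern described here, a small object that builds the dispatch table once and a parse step that keeps no state between calls, can be sketched as below. The class and helper names are hypothetical, only a handful of opcodes are covered, and the real parser differs; the opcode numbers are the standard DWARF values.

```python
import struct
from io import BytesIO

class ExprParser(object):
    """Builds the dispatch table once; parse() itself is stateless."""

    def __init__(self, little_endian=True):
        self._table = _build_dispatch_table('<' if little_endian else '>')

    def parse(self, data):
        """Return a list of (offset, op_name, args) tuples."""
        stream = BytesIO(data)
        ops = []
        while True:
            offset = stream.tell()
            byte = stream.read(1)
            if not byte:
                return ops
            # Unknown opcodes raise KeyError in this toy version.
            name, read_args = self._table[ord(byte)]
            ops.append((offset, name, read_args(stream)))

def _build_dispatch_table(endian):
    """Map opcode -> (name, operand reader) for a few DW_OP_* codes."""
    def no_args(stream):
        return []

    def fixed(code):
        fmt = endian + code
        size = struct.calcsize(fmt)
        def read(stream):
            return list(struct.unpack(fmt, stream.read(size)))
        return read

    return {
        0x08: ('DW_OP_const1u', fixed('B')),
        0x0a: ('DW_OP_const2u', fixed('H')),
        0x0c: ('DW_OP_const4u', fixed('I')),
        0x0e: ('DW_OP_const8u', fixed('Q')),
        0x12: ('DW_OP_dup', no_args),
        0x22: ('DW_OP_plus', no_args),
    }
```

Creating one `ExprParser` and calling `parse()` on many small expressions avoids rebuilding the table for each of them.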
Audrey Dutcher [Sat, 14 Mar 2020 12:28:12 +0000 (05:28 -0700)]
Only byteswap the little endian version of mips64 r_raw_info (#297)
Eli Bendersky [Fri, 13 Mar 2020 13:31:41 +0000 (06:31 -0700)]
DWARF expr: clean up more old code and add some comments
Eli Bendersky [Fri, 13 Mar 2020 13:26:02 +0000 (06:26 -0700)]
DWARF expr: removing old GenericExprVisitor
Eli Bendersky [Fri, 13 Mar 2020 13:23:52 +0000 (06:23 -0700)]
DWARF expr: tests passing with new parser
Eli Bendersky [Fri, 13 Mar 2020 13:17:13 +0000 (06:17 -0700)]
Initial commit of new expr parsing function.
Basic unit tests pass, but old code is still in place and descriptions is not
yet converted.
Eli Bendersky [Fri, 13 Mar 2020 13:32:08 +0000 (06:32 -0700)]
Merge branch 'master' of github.com:eliben/pyelftools
Eli Bendersky [Fri, 13 Mar 2020 12:44:33 +0000 (05:44 -0700)]
Clean up whitespace in dwarf/ranges.py
Eli Bendersky [Fri, 13 Mar 2020 12:23:00 +0000 (05:23 -0700)]
Clean up whitespace
Pierre-Marie de Rodat [Tue, 10 Mar 2020 13:12:11 +0000 (14:12 +0100)]
callframe.py: fix DW_EH_PE_absptr decoding (#295)
* Handle type2/type3 relocation fields for ELF64 MIPS binaries
* dwarf/callframe.py: fix field read using the DW_EH_PE_absptr encoding
This encoding represents target addresses, so it is the virtual address
space that determines its size, not the DWARF format.
Fixes #288
Pierre-Marie de Rodat [Tue, 10 Mar 2020 12:46:46 +0000 (13:46 +0100)]
readelf.py: minor enhancements for debugging (#294)
* readelf.py: add an option to show traceback on error
* readelf.py: flush stdout before printing to sys.stderr
This is necessary to make error messages appear after any display that
was emitted before the error actually happened.
Eli Bendersky [Mon, 9 Mar 2020 12:42:34 +0000 (05:42 -0700)]
Remove some unconditional printouts in unit tests
Andreas Ziegler [Mon, 9 Mar 2020 12:39:23 +0000 (13:39 +0100)]
{GNU,}HashSection: Implement symbol lookup (#290)
In super-stripped binaries, symbol tables can not be accessed
directly as we do not have section headers to find them. In
this case, we can already use the mandatory DynamicSegment
which provides methods for individual access and iteration
over symbols via a minimal implementation of symbol hash
sections which only provided the number of symbols so far.
As we can also directly look up symbols via the hash table,
let's implement this functionality as well.
The code is based on @rhelmot's implementation as discussed
in #219, with some changes around reading the hash parameters.
For supporting individual symbol lookup, we also need the
corresponding symbol table to get the Symbol objects if the
matching hash was found in the hash section. In regular ELF
files, the symbol table is denoted by the section index
provided in the sh_link field of the hash section and
automatically created when building the hash section, for
super-stripped binaries we can use the DynamicSegment (which
needs to be present in any case) as the symbol table as it
also provides a get_symbol() method relying on other ways to
determine the list of symbols. Both of these variants can be
seen in the improved test_hash.py file.
The hash tables are implemented in a base class which does not
derive from the Section class in order to allow instantiation
even if no section headers are present in the underlying file.
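For readers unfamiliar with the classic (SysV) hash section, the lookup walks a bucket and then a chain of symbol indices. The sketch below uses plain lists and a hypothetical `lookup` function rather than the classes added in this commit; `symbol_names` stands in for the symbol table.

```python
def elf_hash(name):
    """Standard SysV ELF hash of a bytes name, kept to 32 bits."""
    h = 0
    for byte in bytearray(name):
        h = ((h << 4) + byte) & 0xffffffff
        g = h & 0xf0000000
        if g:
            h ^= g >> 24
        h &= ~g & 0xffffffff
    return h

def lookup(name, buckets, chains, symbol_names):
    """Return the symbol index for `name`, or None if absent.

    Index 0 (STN_UNDEF) terminates every chain.
    """
    if not buckets:
        return None
    index = buckets[elf_hash(name) % len(buckets)]
    while index != 0:
        if symbol_names[index] == name:
            return index
        index = chains[index]
    return None
```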
Pierre-Marie de Rodat [Mon, 9 Mar 2020 12:36:55 +0000 (13:36 +0100)]
Minor enhancements for readelf-based tests (#293)
* Add handling for SHT_MIPS_ABIFLAGS section types
* Add handling for SHF_MASKPROC section flags
* Add handling for DT_MIPS_FLAGS dynamic table entries
* Display DT_MIPS_SYMTABNO and DT_MIPS_LOCAL_GOTNO entries as decimal ints
* Adjust display of NT_GNU_GOLD_VERSION notes
Andreas Ziegler [Sat, 7 Mar 2020 14:32:40 +0000 (15:32 +0100)]
readelf: print addend for RELA relocations without symbol (#292)
* readelf: print addend for RELA relocations without symbol
When processing relocations from a SHT_RELA type section, GNU
readelf displays the value of the 'r_addend' field if no
symbol index is given (that is, 'r_info_sym' is 0).
By also implementing this we can better test the output for
64-bit binaries which commonly use SHT_RELA relocations.
The included test file is the same as tls.elf but compiled
for x86_64. Its code is the following:
__thread int i;
int main(){}
and it is compiled using the following command line:
$ gcc -m64 -o tls64.elf tls.c
* test: add source file for tls{,64}.elf
The comments at the top describe how to compile the source
file into tls.elf and tls64.elf.
Eli Bendersky [Sat, 7 Mar 2020 14:26:31 +0000 (06:26 -0800)]
Fix up README
Andreas Ziegler [Sat, 7 Mar 2020 14:05:42 +0000 (15:05 +0100)]
readelf.py: adapt section mapping output for .tbss sections (#289)
* readelf.py: adapt section mapping output for .tbss sections
GNU readelf does not show the .tbss section as part of the
loaded data segment when listing the section to segment
mappings, using the ELF_TBSS_SPECIAL macro in
include/elf/internal.h to skip printing the section name.
Implement the same logic in readelf.py.
* test: add test file for .tbss output in readelf.py
This test file includes a .tbss section which is not output
by GNU readelf as part of the loaded data segment when
listing the section to segment mappings.
The source code for tls.elf is simply:
__thread int i;
int main(){}
The file was compiled using the following command line:
$ gcc -o tls.elf -m32 tls.c
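The ELF_TBSS_SPECIAL condition mentioned in this commit boils down to three checks. The following is a sketch with a hypothetical function name, operating on raw numeric header fields instead of pyelftools objects; the constants are the standard ELF values.

```python
SHT_NOBITS = 8
SHF_TLS = 0x400
PT_LOAD = 1
PT_TLS = 7

def is_tbss_special(sh_type, sh_flags, p_type):
    """True for a .tbss-like section when listed under a non-TLS segment.

    Such a section occupies no space in the segment, so its name is
    skipped in the section to segment mapping.
    """
    return (bool(sh_flags & SHF_TLS)
            and sh_type == SHT_NOBITS
            and p_type != PT_TLS)
```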
Seva Alekseyev [Sat, 7 Mar 2020 14:05:01 +0000 (09:05 -0500)]
DW_AT_const_value is not a location, take 2 (#277)
Fixes #274
Eli Bendersky [Sat, 7 Mar 2020 13:42:21 +0000 (05:42 -0800)]
Reformat comments and a bit of test logic
Audrey Dutcher [Sat, 7 Mar 2020 13:39:13 +0000 (20:39 +0700)]
Add resilience for degenerate cases present in files with only debug information (#287)
Some ELF files which contain only debug symbols have important sections present in the section table but marked as NOBITS instead of PROGBITS. Attempting to extract the segments can lead to crashes through parsing invalid data.
The first patch modifies the dynamic segment/section specifically to add a flag for this case, since it seems to assume that there will always be at least one entry, DT_NULL.
The second patch modifies the segment code more generally to return a dummy answer for what data it holds. The actual way that this change prevents a crash is while trying to parse .eh_frame when it is in fact NOBITS - originally I had a more targeted patch, but decided that it was important enough to do more generally
Eli Bendersky [Sat, 7 Mar 2020 13:35:44 +0000 (05:35 -0800)]
Reflow comments and clean up whitespace
Eli Bendersky [Sat, 7 Mar 2020 13:34:36 +0000 (05:34 -0800)]
Merge branch 'master' of github.com:eliben/pyelftools
Seva Alekseyev [Sat, 7 Mar 2020 13:34:29 +0000 (08:34 -0500)]
ref_addr size changed between v2 and v3 - take 2 (#273)
In DWARF 2, the DW_FORM_ref_addr format matches the target address size, while in DWARF3+ it matches the bitness of the CU record. Here are the relevant fragments from the spec, part 7:
v2:
The second type of reference is the address of any debugging information entry within the same executable or shared object; it may refer to an entry in a different compilation unit from the unit containing the reference. This type of reference (DW_FORM_ref_addr) is the size of an address on the target architecture; it is relocatable in a relocatable object file and relocated in an executable file or shared object.
v3:
The second type of reference can identify any debugging information entry within a program; in particular, it may refer to an entry in a different compilation unit from the unit containing the reference, and may refer to an entry in a different shared object. This type of reference (DW_FORM_ref_addr) is an offset from the beginning of the .debug_info section of the target executable or shared object; it is relocatable in a relocatable object file and frequently relocated in an executable file or shared object. For references from one shared object or static executable file to another, the relocation and identification of the target object must be performed by the consumer. In the 32-bit DWARF format, this offset is a 4-byte unsigned value; in the 64-bit DWARF format, it is an 8-byte unsigned value (see Section 7.4).
If elftools encounters 32-bit DWARF v2 targeting a 64-bit architecture, it will misparse DW_FORM_ref_addr and crash downstream.
I encountered this in an iOS binary from 2017, built with Xcode several versions ago for ARM64. This probably never came up before because by the time 64 bit code became relevant, most toolchains would generate DWARF 3 or newer.
Co-authored-by: Seva Alekseyev <sevaa@nih.gov>
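The size rule quoted from the two versions of the spec can be written down as a small function. This is a sketch with a hypothetical name and plain integer arguments, not the code in dwarf/structs.py.

```python
def ref_addr_size(dwarf_version, dwarf_format, address_size):
    """Size in bytes of a DW_FORM_ref_addr value.

    dwarf_format is 32 or 64 (the bitness of the CU record);
    address_size is the target address size from the CU header.
    """
    if dwarf_version == 2:
        return address_size
    return 4 if dwarf_format == 32 else 8
```

For the case described above (32-bit DWARF v2 on a 64-bit target) this gives 8 bytes, where the v3+ rule would give 4.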
Eli Bendersky [Sat, 7 Mar 2020 13:25:38 +0000 (05:25 -0800)]
Fix typo in comment