Andreas Ziegler [Thu, 1 Oct 2020 13:45:19 +0000 (15:45 +0200)]
elf: support for ELF files with a large number of sections (#333)
* elf: implement support for ELF files with a large number of sections
As documented in the ELF specification [0] and reported in #330,
the number of sections (`e_shnum` member of the ELF header)
as well as the section table index of the section name string
table (`e_shstrndx` member) could exceed the SHN_LORESERVE
(0xff00) value. In this case, the members of the ELF header
are set to 0 or SHN_XINDEX (0xffff), respectively, and the
actual values are found in the initial entry of the section
header table (which is otherwise set to zeroes).
So far, the implementation of `elffile.num_sections()`
didn't handle these situations and simply reported that the
file contained 0 sections, and `scripts/readelf.py` presented
invalid values.
Fix it by following the specification more closely and
showing the corresponding correct values in `readelf.py`.
[0]: https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.eheader.html
Closes: #330
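The lookup rule described above can be sketched as follows. This is a minimal illustration, not the pyelftools implementation: `resolve_section_counts` and the dict-shaped `first_section_header` are hypothetical names, and the sketch assumes a section header table is actually present. Per the ELF specification, the real section count lives in `sh_size` and the real string table index in `sh_link` of the section header at index 0.

```python
SHN_XINDEX = 0xffff  # escape value stored in e_shstrndx


def resolve_section_counts(e_shnum, e_shstrndx, first_section_header):
    """Return (number of sections, section name string table index).

    first_section_header is a hypothetical dict holding the 'sh_size'
    and 'sh_link' fields of the section header at index 0.
    """
    # e_shnum == 0 signals that the real count did not fit in the ELF header.
    if e_shnum == 0:
        num_sections = first_section_header['sh_size']
    else:
        num_sections = e_shnum

    # e_shstrndx == SHN_XINDEX signals that the real index is in sh_link.
    if e_shstrndx == SHN_XINDEX:
        shstrndx = first_section_header['sh_link']
    else:
        shstrndx = e_shstrndx

    return num_sections, shstrndx
```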
* test: add test file with a large number of sections
This file was generated with the following commands:
$ for i in {1..65280}; do
echo "void __attribute__((section(\"s.$i\"), naked)) f$i(void) {}";
done > many_sections.c;
echo "int main(){}" >> many_sections.c
$ gcc-8 -fno-asynchronous-unwind-tables -c -o many_sections.o.elf many_sections.c
$ strip many_sections.o.elf
Eli Bendersky [Wed, 23 Sep 2020 13:25:36 +0000 (06:25 -0700)]
Remove Travis config
Eli Bendersky [Wed, 23 Sep 2020 13:23:47 +0000 (06:23 -0700)]
Change badge image to point to github actions, not Travis
Eli Bendersky [Wed, 23 Sep 2020 13:21:12 +0000 (06:21 -0700)]
Set to run only on ubuntu because of readelf binary
Also fix mentions of Travis
Eli Bendersky [Wed, 23 Sep 2020 13:18:54 +0000 (06:18 -0700)]
Fix typo in ci.yml
Eli Bendersky [Wed, 23 Sep 2020 13:17:02 +0000 (06:17 -0700)]
Add GitHub actions workflow for CI
LeadroyaL [Wed, 19 Aug 2020 16:35:12 +0000 (00:35 +0800)]
Add support for ARM exception handler ABI (#328)
Eli Bendersky [Tue, 18 Aug 2020 00:57:18 +0000 (17:57 -0700)]
Fix python versions for tests that run
On Travis run fewer old Python versions.
Locally, only run the latest Python 2.x and 3.x
Closes #305
Val [Sat, 25 Jul 2020 12:22:10 +0000 (08:22 -0400)]
Update code to work with pickling (#327)
pagabuc [Mon, 20 Jul 2020 21:21:49 +0000 (14:21 -0700)]
Return the correct number of program headers when e_phnum is 0xffff (#326)
* Return the correct number of program headers when e_phnum is 0xffff
* Added link and relevant text of the specification
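A minimal sketch of the rule from the specification, assuming a hypothetical helper `resolve_num_segments` and a dict holding the fields of the section header at index 0 (neither is the actual pyelftools API). When `e_phnum` holds the escape value 0xffff (PN_XNUM), the real number of program headers is stored in the `sh_info` field of that section header.

```python
PN_XNUM = 0xffff  # escape value stored in e_phnum


def resolve_num_segments(e_phnum, first_section_header):
    """Return the real number of program headers.

    first_section_header is a hypothetical dict with the 'sh_info'
    field of the section header at index 0.
    """
    if e_phnum < PN_XNUM:
        return e_phnum
    return first_section_header['sh_info']
```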
Eli Bendersky [Wed, 8 Jul 2020 00:44:33 +0000 (17:44 -0700)]
Fix formatting and add comment in test
Fish [Wed, 8 Jul 2020 00:42:33 +0000 (17:42 -0700)]
Fix the non-determinism in test_dwarf_expr. (#324)
Eli Bendersky [Wed, 8 Jul 2020 00:15:02 +0000 (17:15 -0700)]
Revert "for sibling of form ref_addr, only sibling value should be used (#268)"
This reverts commit
575425338fbab134919cb1206509589a174fb81f.
This breaks the tests:
Test file 'test/testfiles_for_readelf/sibling_ref_addr.elf'
.......................FAIL
....for option "-e"
....Output #1 is readelf, Output #2 is pyelftools
@@ Mismatch on line #13:
>> flags: 0x80000000, emb<<
>> flags: 0x80000000<<
([('equal', 0, 47, 0, 47), ('delete', 47, 52, 47, 47)])
@@ Output #1 dumped to file: /tmp/out1_vn_mmkbu.stdout
@@ Output #2 dumped to file: /tmp/out2_l8_zbr6h.stdout
.......................FAIL
....for option "-n"
....Output #1 is readelf, Output #2 is pyelftools
@@ Mismatch on line #2:
>> apuinfo 0x00000008 nt_arch (architecture)<<
>> apuinfo 0x00000008 nt_gnu_hwcap (dso-supplied software hwcap info)<<
([('equal', 0, 37, 0, 37), ('insert', 37, 37, 37, 66), ('equal', 37, 39, 66, 68), ('insert', 39, 39, 68, 72), ('equal', 39, 40, 72, 73), ('replace', 40, 41, 73, 75), ('equal', 41, 42, 75, 76), ('delete', 42, 47, 76, 76), ('equal', 47, 48, 76, 77), ('replace', 48, 55, 77, 80), ('equal', 55, 56, 80, 81)])
@@ Output #1 dumped to file: /tmp/out1_kla3jq33.stdout
@@ Output #2 dumped to file: /tmp/out2_qzmuu23z.stdout
@@ aborting - 'test/external_tools/readelf -x.text' returned '1'
Conclusion: FAIL
sagiben [Wed, 8 Jul 2020 00:10:07 +0000 (03:10 +0300)]
for sibling of form ref_addr, only sibling value should be used (#268)
* for sibling of form ref_addr, only sibling value should be used
* Add ELF testcase for PR #268
Gabriel-Andrew Pollo Guilbert [Wed, 8 Jul 2020 00:06:59 +0000 (20:06 -0400)]
Fix typo when referencing DW_FORM_ref_addr (#321)
Fish [Wed, 8 Jul 2020 00:05:00 +0000 (17:05 -0700)]
Fix Python 2 support of FDE. (#323)
Fish [Tue, 7 Jul 2020 13:07:12 +0000 (06:07 -0700)]
dwarf.CallFrameInfo: Support parsing LSDA pointers from FDEs. (#308)
* dwarf.CallFrameInfo: Support parsing LSDA pointers from FDEs.
* Add a test case.
* Make 0 explicit. More doc-string.
Nick Desaulniers [Tue, 9 Jun 2020 12:53:08 +0000 (05:53 -0700)]
initial support for aarch64 little endian (#318)
See also:
https://static.docs.arm.com/ihi0056/b/IHI0056B_aaelf64.pdf
for relocation types and
https://developer.arm.com/docs/ihi0057/c/dwarf-for-the-arm-64-bit-architecture-aarch64-abi-2018q4
for DWARF register names.
Issue #317
Link: https://github.com/ClangBuiltLinux/frame-larger-than/issues/4
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Disconnect3d [Mon, 8 Jun 2020 12:14:37 +0000 (14:14 +0200)]
Add PT_GNU_PROPERTY enum (#319)
This commit adds the missing `PT_GNU_PROPERTY` program header enums.
Additional information regarding the `PT_GNU_PROPERTY` can be found at:
* https://reviews.llvm.org/D70959
* https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf (linked in above url)
* https://sourceware.org/pipermail/libc-alpha/2020-May/113841.html (commit in libc adding this value)
This program header can be found, e.g., in a glibc in Ubuntu 20.04 (see `docker run --rm -it ubuntu:20.04 cat /usr/lib/x86_64-linux-gnu/libc-2.31.so > libc-2.31.so`).
William Woodruff [Mon, 1 Jun 2020 13:00:00 +0000 (09:00 -0400)]
dwarf/dwarf_expr: Add support for DW_OP_GNU_push_tls_address (#315)
* dwarf/dwarf_expr: Add support for DW_OP_GNU_push_tls_address
* dwarf/dwarf_expr: Use a single 64-bit operand for const8x
DWARFv4 2.5.1.1: this should be consumed as a single 64-bit operand,
not as two 32-bit operands.
* dwarf/descriptions: Fix descriptions for const8{u,s}
* test: Add tests for changed OPs
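To illustrate the const8x change: the eight operand bytes are read as one 64-bit integer rather than as two 32-bit halves. The helper names below are hypothetical and this is only a sketch using the standard `struct` module, not the parser code from the commit.

```python
import struct


def decode_const8u(operand_bytes, little_endian=True):
    """Decode the operand of DW_OP_const8u as one unsigned 64-bit value."""
    fmt = '<Q' if little_endian else '>Q'
    (value,) = struct.unpack(fmt, operand_bytes)
    return value


def decode_const8s(operand_bytes, little_endian=True):
    """Decode the operand of DW_OP_const8s as one signed 64-bit value."""
    fmt = '<q' if little_endian else '>q'
    (value,) = struct.unpack(fmt, operand_bytes)
    return value
```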
Patrick Gordon [Sat, 23 May 2020 14:49:28 +0000 (15:49 +0100)]
fix issue in aranges cu_offset_at_addr (#310)
* fix issue in aranges cu_offset_at_addr
if there are aranges for some parts of the binary but not others, incorrect aranges may be returned
* add test suite for absent/partial/complete aranges
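The failure mode can be sketched as follows: a lookup that only finds the nearest preceding range start must also check that the address falls inside that range, otherwise an address in an uncovered gap is attributed to the wrong CU. `lookup_cu_offset` and the tuple layout are assumptions made for illustration, not the actual ARanges code.

```python
import bisect


def lookup_cu_offset(entries, addr):
    """Return the CU offset covering addr, or None if no range covers it.

    entries is a hypothetical list of (begin_addr, length, cu_offset)
    tuples sorted by begin_addr.
    """
    begins = [entry[0] for entry in entries]
    # Index of the last range that starts at or before addr.
    i = bisect.bisect_right(begins, addr) - 1
    if i < 0:
        return None
    begin, length, cu_offset = entries[i]
    # The range check is what handles partial aranges coverage.
    if addr < begin + length:
        return cu_offset
    return None
```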
mephi42 [Thu, 21 May 2020 12:18:17 +0000 (14:18 +0200)]
Fix determining PAGESIZE under Jython (#314)
Jython has neither `resource` nor `mmap`, therefore just use a
reasonable default.
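A fallback chain of this kind might look like the sketch below. The function name and the default of 4096 are assumptions for illustration; the commit message only says that a reasonable default is used.

```python
def get_page_size(default=4096):
    """Return the system page size, falling back to an assumed default."""
    try:
        import resource
        return resource.getpagesize()
    except ImportError:
        pass
    try:
        import mmap
        return mmap.PAGESIZE
    except ImportError:
        # Neither module is available (e.g. under Jython).
        return default


PAGESIZE = get_page_size()
```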
Chunbo [Mon, 27 Apr 2020 13:43:03 +0000 (21:43 +0800)]
Fix a typo in adapters.py (#309)
Milton D. Miller II [Wed, 22 Apr 2020 12:57:32 +0000 (12:57 +0000)]
Cached random access to CUs and DIEs (#264)
* dwarf/compileunit: Lookup DIE from a reference
Accept a resolved reference address for a DIE in a compile unit and
parse the DIE at that location. Insert into the _diemap / _dielist
cache shared with iter_DIE_children() for fast repeated lookups.
This can be used to follow attribute references to a DIE that may be
referenced several times (eg for a DW_AT_type reference) or find
a DIE referenced in a lookup table.
* dwarf/dwarfinfo: Cache CUs, direct parse or search from known units
Maintain a cache of compile units parsed and a map of their offsets
similar to the one maintained for DIEs by compile units.
Add the ability to parse a random compile unit when the offset of
the compile unit header is known.
Add the ability to search for a compile unit containing (spanning)
a given refaddr, such as that obtained from a DIE reference class
attribute, starting from the closest previous cached compile unit.
* dwarf/die: search for parents on demand
Add a function to set the _parent link of known children, iterating
down each parent of a target DIE. Walk all children of a given
parent and set each child's ._parent to avoid O(n^2) walking.
A future commit will add other methods to instantiate a DIE that will
not set the _parent link as the DIE is instantiated.
This walk uses the knowledge that in a flattened tree a parent's offset
will always be less than the child's.
The call to die.set_parent in compile_unit iter_DIE_children could be
removed to make the method private, but it is free to set starting
from the top DIE. Alternatively make it an optional argument to
DIE creation.
* dwarf/dwarfinfo: APIs to lookup DIEs
Add APIs to lookup a DIE from: (a) a DIE reference class attribute
taking into account the attribute form, (b) from a lookup table entry
(NameLUTEntry) from a .pub_types or .pub_names section, or (c) directly
from a reference address (.debug_info offset) regardless of how it
was obtained.
Add a test that will look up DIEs from pubnames and follow a DIE by ref.
This is a simple test that exercises the new cache lookup
methods and provides a starting point on how to determine a
variable's type.
For now raise NotImplemented exception for type signature lookup
and supplemental dwarf object files.
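The idea of keeping a sorted offset map next to the cached objects, and starting a search from the closest previous cached unit, can be sketched with `bisect`. `CUCache` and its methods are hypothetical names chosen for illustration; the real code keeps comparable state inside DWARFInfo and CompileUnit.

```python
import bisect


class CUCache:
    """Cache of parsed compile units, keyed by CU header offset."""

    def __init__(self):
        self._offsets = []  # sorted CU header offsets
        self._cus = []      # parsed CU objects, parallel to _offsets

    def add(self, offset, cu):
        """Insert cu at offset; return the cached CU for that offset."""
        i = bisect.bisect_left(self._offsets, offset)
        if i < len(self._offsets) and self._offsets[i] == offset:
            return self._cus[i]  # already cached
        self._offsets.insert(i, offset)
        self._cus.insert(i, cu)
        return cu

    def closest_previous(self, refaddr):
        """Return (offset, cu) of the last cached CU at or before refaddr."""
        i = bisect.bisect_right(self._offsets, refaddr) - 1
        if i < 0:
            return None
        return self._offsets[i], self._cus[i]
```

A search for the CU spanning a reference address would start parsing forward from the unit returned by `closest_previous()` instead of from the beginning of .debug_info.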
Eli Bendersky [Sat, 28 Mar 2020 13:21:15 +0000 (06:21 -0700)]
Clean up whitespace
Seva Alekseyev [Mon, 23 Mar 2020 14:16:01 +0000 (10:16 -0400)]
Support for --debug-dump=loc in readelf.py and in the test (#304)
Eli Bendersky [Mon, 23 Mar 2020 13:02:22 +0000 (06:02 -0700)]
Reformat whitespace
Eli Bendersky [Sun, 22 Mar 2020 13:50:09 +0000 (06:50 -0700)]
Add full list of supported debug sections in readelf's help message
Eli Bendersky [Sun, 22 Mar 2020 13:42:27 +0000 (06:42 -0700)]
Fix --parallel readelf test after previous commit
Previous commit broke them because lambdas can't be pickled by multiprocessing
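As background, multiprocessing sends callables to worker processes by pickling them, and pickle stores functions by importable name, which a lambda does not have. The sketch below only demonstrates that limitation with the standard `pickle` module; `can_pickle` is a hypothetical helper and not part of the test runner.

```python
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps(), False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so it cannot be pickled by reference.
lambda_ok = can_pickle(lambda x: x)
# A named, importable callable (here a builtin) can.
builtin_ok = can_pickle(len)
```

The usual fix is to replace the lambda with a function defined at module level.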
Seva Alekseyev [Sun, 22 Mar 2020 13:35:19 +0000 (09:35 -0400)]
GNU expressions (#303)
William Woodruff [Sat, 21 Mar 2020 14:29:46 +0000 (10:29 -0400)]
Add support for DW_LNE_set_discriminator (#282)
Pierre-Marie de Rodat [Tue, 17 Mar 2020 12:01:45 +0000 (13:01 +0100)]
Enhance MIPS64 testing and simplify handling code for its peculiar relocations (#300)
* Add handling for SHF_MASKOS section flags
* Add readelf testcases for MIPS64 specificities
* Simplify the decoding of MIPS64 relocations
Instead of using "fake" fields to parse the relocation structure and
then use complex shift/masks to recover the conveyed information (once
for big endian binaries, twice for little endian ones), use fields
actually described in the spec and use straightforward shifts to
synthesize the "fake" fields.
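A rough sketch of the approach, assuming the MIPS64 ELF layout of the 8-byte info field as r_sym (32 bits) followed by r_ssym, r_type3, r_type2 and r_type (8 bits each). The function name and the returned dict are hypothetical; the real code expresses this with construct structs.

```python
import struct


def parse_mips64_r_info(raw, little_endian=True):
    """Split the 8 raw bytes of a MIPS64 relocation info field."""
    fmt = ('<' if little_endian else '>') + 'IBBBB'
    r_sym, r_ssym, r_type3, r_type2, r_type = struct.unpack(fmt, raw)
    # Synthesize a combined value from the fields with plain shifts.
    r_info = ((r_sym << 32) | (r_ssym << 24) | (r_type3 << 16)
              | (r_type2 << 8) | r_type)
    return {
        'r_sym': r_sym,
        'r_ssym': r_ssym,
        'r_type3': r_type3,
        'r_type2': r_type2,
        'r_type': r_type,
        'r_info': r_info,
    }
```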
Eli Bendersky [Sat, 14 Mar 2020 12:53:51 +0000 (05:53 -0700)]
Remove unused field
Eli Bendersky [Sat, 14 Mar 2020 12:52:03 +0000 (05:52 -0700)]
Simplify ExprDumper now that the expression parser is simpler.
We no longer need the part-by-part dumping and separate process/get_str.
Also simplify tests.
Fixes #298
Eli Bendersky [Sat, 14 Mar 2020 12:37:53 +0000 (05:37 -0700)]
Cache dispatch table between expr parses
In descriptions, ExprDumper invokes parse_expr many times on small
expressions. Initializing the dispatch table for every parse is
wasteful.
Wrap parse_expr with a simple object that generates and caches the
dispatch table during initialization. parse_expr remains stateless.
Updates #298
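The pattern can be sketched as below: the dispatch table is built once in the constructor, and parsing itself keeps no state between calls. This is a toy with three opcodes and hypothetical names (`ExprParser`, `_build_dispatch_table`); it is not the pyelftools parser.

```python
class ExprParser:
    """Toy DWARF expression parser with a cached dispatch table."""

    def __init__(self):
        # Built once, reused by every parse_expr() call.
        self._dispatch = self._build_dispatch_table()

    @staticmethod
    def _build_dispatch_table():
        # opcode -> (name, number of 1-byte operands); tiny subset only.
        return {
            0x08: ('DW_OP_const1u', 1),
            0x12: ('DW_OP_dup', 0),
            0x13: ('DW_OP_drop', 0),
        }

    def parse_expr(self, expr):
        """Parse a bytes object into a list of (op_name, args) tuples."""
        ops = []
        i = 0
        while i < len(expr):
            name, nargs = self._dispatch[expr[i]]
            args = list(expr[i + 1:i + 1 + nargs])
            ops.append((name, args))
            i += 1 + nargs
        return ops
```

A caller that dumps many small expressions creates one `ExprParser` and reuses it, so the table is not rebuilt per expression.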
Audrey Dutcher [Sat, 14 Mar 2020 12:28:12 +0000 (05:28 -0700)]
Only byteswap the little endian version of mips64 r_raw_info (#297)
Eli Bendersky [Fri, 13 Mar 2020 13:31:41 +0000 (06:31 -0700)]
DWARF expr: clean up more old code and add some comments
Eli Bendersky [Fri, 13 Mar 2020 13:26:02 +0000 (06:26 -0700)]
DWARF expr: removing old GenericExprVisitor
Eli Bendersky [Fri, 13 Mar 2020 13:23:52 +0000 (06:23 -0700)]
DWARF expr: tests passing with new parser
Eli Bendersky [Fri, 13 Mar 2020 13:17:13 +0000 (06:17 -0700)]
Initial commit of new expr parsing function.
Basic unit tests pass, but old code is still in place and descriptions is not
yet converted.
Eli Bendersky [Fri, 13 Mar 2020 13:32:08 +0000 (06:32 -0700)]
Merge branch 'master' of github.com:eliben/pyelftools
Eli Bendersky [Fri, 13 Mar 2020 12:44:33 +0000 (05:44 -0700)]
Clean up whitespace in dwarf/ranges.py
Eli Bendersky [Fri, 13 Mar 2020 12:23:00 +0000 (05:23 -0700)]
Clean up whitespace
Pierre-Marie de Rodat [Tue, 10 Mar 2020 13:12:11 +0000 (14:12 +0100)]
callframe.py: fix DW_EH_PE_absptr decoding (#295)
* Handle type2/type3 relocation fields for ELF64 MIPS binaries
* dwarf/callframe.py: fix field read using the DW_EH_PE_absptr encoding
This encoding represents target addresses, so it is the virtual address
space that determines its size, not the DWARF format.
Fixes #288
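In other words, the width of a DW_EH_PE_absptr field follows the target address size (4 bytes for ELF32, 8 bytes for ELF64), whatever the DWARF format is. A minimal sketch, with `read_absptr` as a hypothetical helper:

```python
import struct


def read_absptr(data, offset, address_size, little_endian=True):
    """Read a DW_EH_PE_absptr value; return (value, next_offset).

    address_size is the target address size in bytes (4 or 8),
    not the DWARF format size.
    """
    code = {4: 'I', 8: 'Q'}[address_size]
    fmt = ('<' if little_endian else '>') + code
    (value,) = struct.unpack_from(fmt, data, offset)
    return value, offset + address_size
```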
Pierre-Marie de Rodat [Tue, 10 Mar 2020 12:46:46 +0000 (13:46 +0100)]
readelf.py: minor enhancements for debugging (#294)
* readelf.py: add an option to show traceback on error
* readelf.py: flush stdout before printing to sys.stderr
This is necessary to make error messages appear after any display that
was emitted before the error actually happened.
Eli Bendersky [Mon, 9 Mar 2020 12:42:34 +0000 (05:42 -0700)]
Remove some unconditional printouts in unit tests
Andreas Ziegler [Mon, 9 Mar 2020 12:39:23 +0000 (13:39 +0100)]
{GNU,}HashSection: Implement symbol lookup (#290)
In super-stripped binaries, symbol tables can not be accessed
directly as we do not have section headers to find them. In
this case, we can already use the mandatory DynamicSegment
which provides methods for individual access and iteration
over symbols via a minimal implementation of symbol hash
sections which only provided the number of symbols so far.
As we can also directly look up symbols via the hash table,
let's implement this functionality as well.
The code is based on @rhelmot's implementation as discussed
in #219, with some changes around reading the hash parameters.
For supporting individual symbol lookup, we also need the
corresponding symbol table to get the Symbol objects if the
matching hash was found in the hash section. In regular ELF
files, the symbol table is denoted by the section index
provided in the sh_link field of the hash section and
automatically created when building the hash section, for
super-stripped binaries we can use the DynamicSegment (which
needs to be present in any case) as the symbol table as it
also provides a get_symbol() method relying on other ways to
determine the list of symbols. Both of these variants can be
seen in the improved test_hash.py file.
The hash tables are implemented in a base class which does not
derive from the Section class in order to allow instantiation
even if no section headers are present in the underlying file.
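For reference, the two hash functions that such lookups rely on are well known and can be written as below. This is a standalone sketch of the hash computations only (the function names are chosen here, and the bucket and chain walk is omitted); it is not the code added by this commit.

```python
def sysv_hash(name):
    """Classic System V ELF hash of a symbol name given as bytes."""
    h = 0
    for c in bytearray(name):
        h = (h << 4) + c
        g = h & 0xf0000000
        if g:
            h ^= g >> 24
        h &= ~g
    return h


def gnu_hash(name):
    """GNU hash (h = h * 33 + c, starting at 5381) truncated to 32 bits."""
    h = 5381
    for c in bytearray(name):
        h = (h * 33 + c) & 0xffffffff
    return h
```

In both table formats the lookup starts at bucket `hash % nbuckets` and then compares candidate symbol names from the associated symbol table.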
Pierre-Marie de Rodat [Mon, 9 Mar 2020 12:36:55 +0000 (13:36 +0100)]
Minor enhancements for readelf-based tests (#293)
* Add handling for SHT_MIPS_ABIFLAGS section types
* Add handling for SHF_MASKPROC section flags
* Add handling for DT_MIPS_FLAGS dynamic table entries
* Display DT_MIPS_SYMTABNO and DT_MIPS_LOCAL_GOTNO entries as decimal ints
* Adjust display of NT_GNU_GOLD_VERSION notes
Andreas Ziegler [Sat, 7 Mar 2020 14:32:40 +0000 (15:32 +0100)]
readelf: print addend for RELA relocations without symbol (#292)
* readelf: print addend for RELA relocations without symbol
When processing relocations from a SHT_RELA type section, GNU
readelf displays the value of the 'r_addend' field if no
symbol index is given (that is, 'r_info_sym' is 0).
By also implementing this we can better test the output for
64-bit binaries which commonly use SHT_RELA relocations.
The included test file is the same as tls.elf but compiled
for x86_64. Its code is the following:
__thread int i;
int main(){}
and it is compiled using the following command line:
$ gcc -m64 -o tls64.elf tls.c
* test: add source file for tls{,64}.elf
The comments at the top describe how to compile the source
file into tls.elf and tls64.elf.
Eli Bendersky [Sat, 7 Mar 2020 14:26:31 +0000 (06:26 -0800)]
Fix up README
Andreas Ziegler [Sat, 7 Mar 2020 14:05:42 +0000 (15:05 +0100)]
readelf.py: adapt section mapping output for .tbss sections (#289)
* readelf.py: adapt section mapping output for .tbss sections
GNU readelf does not show the .tbss section as part of the
loaded data segment when listing the section to segment
mappings, using the ELF_TBSS_SPECIAL macro in
include/elf/internal.h to skip printing the section name.
Implement the same logic in readelf.py.
* test: add test file for .tbss output in readelf.py
This test file includes a .tbss section which is not output
by GNU readelf as part of the loaded data segment when
listing the section to segment mappings.
The source code for tls.elf is simply:
__thread int i;
int main(){}
The file was compiled using the following command line:
$ gcc -o tls.elf -m32 tls.c
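The condition behind ELF_TBSS_SPECIAL can be sketched as follows: a section is treated as special when it is NOBITS, carries the TLS flag, and the segment being examined is not PT_TLS. The function name is hypothetical and this mirrors the macro only as understood here, so it should be checked against binutils before being relied on.

```python
SHT_NOBITS = 8
SHF_TLS = 0x400
PT_LOAD = 1
PT_TLS = 7


def is_tbss_special(sh_type, sh_flags, p_type):
    """Return True for a .tbss-like section in a non-TLS segment."""
    return (sh_type == SHT_NOBITS
            and (sh_flags & SHF_TLS) != 0
            and p_type != PT_TLS)
```

When this returns True for a section and a PT_LOAD segment, the section name is skipped in the section to segment mapping.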
Seva Alekseyev [Sat, 7 Mar 2020 14:05:01 +0000 (09:05 -0500)]
DW_AT_const_value is not a location, take 2 (#277)
Fixes #274
Eli Bendersky [Sat, 7 Mar 2020 13:42:21 +0000 (05:42 -0800)]
Reformat comments and a bit of test logic
Audrey Dutcher [Sat, 7 Mar 2020 13:39:13 +0000 (20:39 +0700)]
Add resilience for degenerate cases present in files with only debug information (#287)
Some ELF files which contain only debug symbols have important sections present in the section table but marked as NOBITS instead of PROGBITS. Attempting to extract the segments can lead to crashes through parsing invalid data.
The first patch modifies the dynamic segment/section specifically to add a flag for this case, since it seems to assume that there will always be at least one entry, DT_NULL.
The second patch modifies the segment code more generally to return a dummy answer for what data it holds. The actual way that this change prevents a crash is while trying to parse .eh_frame when it is in fact NOBITS - originally I had a more targeted patch, but decided that it was important enough to do more generally
Eli Bendersky [Sat, 7 Mar 2020 13:35:44 +0000 (05:35 -0800)]
Reflow comments and clean up whitespace
Eli Bendersky [Sat, 7 Mar 2020 13:34:36 +0000 (05:34 -0800)]
Merge branch 'master' of github.com:eliben/pyelftools
Seva Alekseyev [Sat, 7 Mar 2020 13:34:29 +0000 (08:34 -0500)]
ref_addr size changed between v2 and v3 - take 2 (#273)
In DWARF 2, the DW_FORM_ref_addr format matches the target address size, while in DWARF3+ it matches the bitness of the CU record. Here are the relevant fragments from the spec, part 7:
v2:
The second type of reference is the address of any debugging information entry within the same executable or shared object; it may refer to an entry in a different compilation unit from the unit containing the reference. This type of reference (DW_FORM_ref_addr) is the size of an address on the target architecture; it is relocatable in a relocatable object file and relocated in an executable file or shared object.
v3:
The second type of reference can identify any debugging information entry within a program; in particular, it may refer to an entry in a different compilation unit from the unit containing the reference, and may refer to an entry in a different shared object. This type of reference (DW_FORM_ref_addr) is an offset from the beginning of the .debug_info section of the target executable or shared object; it is relocatable in a relocatable object file and frequently relocated in an executable file or shared object. For references from one shared object or static executable file to another, the relocation and identification of the target object must be performed by the consumer. In the 32-bit DWARF format, this offset is a 4-byte unsigned value; in the 64-bit DWARF format, it is an 8-byte unsigned value (see Section 7.4).
If elftools encounters 32-bit DWARF v2 targeting a 64-bit architecture, it will misparse DW_FORM_ref_addr and crash downstream.
I encountered this in an iOS binary from 2017, built with Xcode several versions ago for ARM64. This probably never came up before because by the time 64 bit code became relevant, most toolchains would generate DWARF 3 or newer.
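The size rule quoted from the two versions of the spec boils down to the following sketch. `ref_addr_size` is a hypothetical helper, not the function changed in this commit.

```python
def ref_addr_size(dwarf_version, address_size, dwarf_format):
    """Return the size in bytes of a DW_FORM_ref_addr value.

    address_size is the target address size in bytes; dwarf_format is
    32 or 64 (the bitness of the CU record).
    """
    if dwarf_version == 2:
        # DWARF 2: same size as an address on the target.
        return address_size
    # DWARF 3 and later: an offset whose size depends on the DWARF format.
    return 4 if dwarf_format == 32 else 8
```

For 32-bit DWARF v2 on a 64-bit target this yields 8 bytes, which is the case that used to be misparsed as 4.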
Co-authored-by: Seva Alekseyev <sevaa@nih.gov>
Eli Bendersky [Sat, 7 Mar 2020 13:25:38 +0000 (05:25 -0800)]
Fix typo in comment
William Woodruff [Sat, 7 Mar 2020 13:24:43 +0000 (08:24 -0500)]
examples: Add dwarf_lineprogram_filenames.py (#285)
This adds an example of the operation discussed in #283.
Usage:
python3 ./dwarf_lineprogram_filenames.py --test x.elf y.elf z.elf
Andreas Ziegler [Fri, 6 Mar 2020 14:00:26 +0000 (15:00 +0100)]
construct_utils.py: add missing import (#291)
Add a forgotten import of SizeofError.
Fixes: #278
Eli Bendersky [Tue, 4 Feb 2020 14:21:28 +0000 (06:21 -0800)]
Move test file name to be included in distribution
Fixes #260
Eli Bendersky [Tue, 4 Feb 2020 13:42:26 +0000 (05:42 -0800)]
Reformat whitespace in dwarf enums
Seva Alekseyev [Tue, 4 Feb 2020 13:37:04 +0000 (08:37 -0500)]
DWARF 5 tags and attributes (#271)
Eli Bendersky [Tue, 4 Feb 2020 13:33:02 +0000 (05:33 -0800)]
Trim whitespace
Fish [Tue, 4 Feb 2020 13:32:30 +0000 (06:32 -0700)]
Make sure ELFFile.has_dwarf_info() returns a boolean. (#266)
Andreas Ziegler [Tue, 4 Feb 2020 13:30:09 +0000 (14:30 +0100)]
segments.py: fix TLS checks in section_in_segment() (#275)
While the comment in section_in_segment() suggests that the
logic follows the logic inside ELF_SECTION_IN_SEGMENT_1 with
the strict parameter set, all of the checks in the binutils
macro are written so that they must succeed for the section
to be contained in the current segment. In our implementation,
however, the checks were not properly negated.
This showed in the case of .tdata and .tbss which did not
appear in the section to segment mapping (these sections are
found in glibc, for example).
Fix it up by aligning the logic more closely to the binutils
macro by implementing the same logic and returning False only
if the checks fail. Additionally, introduce the third check
from the upstream binutils which checks the combination of
SHF_ALLOC sections and PT_LOAD-like segments.
Furthermore, in the original check, the PT_GNU_RELRO type was
misspelled with a 0 (zero) instead of an O so this check
could never have worked.
Fixes: #263
Eli Bendersky [Tue, 4 Feb 2020 13:20:27 +0000 (05:20 -0800)]
Travis: drop Python 3.4, add 3.8
Tim Gates [Mon, 16 Dec 2019 13:23:09 +0000 (00:23 +1100)]
Fix simple typo: wether -> whether (#259)
Closes #258
Eli Bendersky [Mon, 9 Dec 2019 13:24:36 +0000 (05:24 -0800)]
Tweak release instructions in TODO
Eli Bendersky [Thu, 5 Dec 2019 13:43:23 +0000 (05:43 -0800)]
Update release notes in TODO
Eli Bendersky [Thu, 5 Dec 2019 13:34:21 +0000 (05:34 -0800)]
Prepare for release 0.26
William Woodruff [Fri, 8 Nov 2019 03:07:28 +0000 (22:07 -0500)]
Lazy DIE parsing (#249)
Supersedes/closes #216.
William Woodruff [Sun, 27 Oct 2019 13:10:49 +0000 (09:10 -0400)]
elf/constants: Add SHN_XINDEX (#248)
Olof Johansson [Mon, 21 Oct 2019 12:18:44 +0000 (05:18 -0700)]
Include README.rst instead of README in manifest (#247)
`setup.py bdist_wheel` warns about not finding README.
William Woodruff [Fri, 18 Oct 2019 16:17:27 +0000 (12:17 -0400)]
dwarf/die: Handle DW_FORM_flag_present in value translation (#246)
* dwarf/die: Handle DW_FORM_flag_present in value translation
When an attribute has form DW_FORM_flag_present it is implicitly
indicated as present, with no actual value.
Ref. DWARFv4, section 7.
* test: Add DW_FORM_flag_present value test
* test: Fix iteration
* test: Remove old assert
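A minimal sketch of the value translation, assuming forms are identified by their string names and using a hypothetical `translate_flag_value` helper. DW_FORM_flag_present carries no data, so its value is simply True; DW_FORM_flag is a single byte where zero means false.

```python
def translate_flag_value(form, raw_value):
    """Translate the raw value of a flag-like attribute form."""
    if form == 'DW_FORM_flag_present':
        # No bytes are stored; presence of the attribute is the value.
        return True
    if form == 'DW_FORM_flag':
        return raw_value != 0
    # Other forms are passed through unchanged in this sketch.
    return raw_value
```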
William Woodruff [Fri, 4 Oct 2019 22:24:46 +0000 (18:24 -0400)]
dwarf/constants: More DW_LANG, DW_ATE constants (#245)
Most of these were added in DWARFv5.
Eli Bendersky [Fri, 4 Oct 2019 13:08:39 +0000 (06:08 -0700)]
Realign columns for some constants
William Woodruff [Fri, 4 Oct 2019 13:06:05 +0000 (09:06 -0400)]
dwarf/enums: More attributes, tags, and forms (#244)
William Woodruff [Thu, 3 Oct 2019 14:13:39 +0000 (10:13 -0400)]
dwarf/enums Add GNU parameter tags (#242)
William Woodruff [Wed, 18 Sep 2019 12:21:42 +0000 (08:21 -0400)]
dwarf_expr: Add DW_OP_{lo,hi}_user (#239)
William Woodruff [Tue, 17 Sep 2019 12:17:55 +0000 (08:17 -0400)]
dwarf_expr: Add DWARFv5 OPs (#240)
Eli Bendersky [Wed, 11 Sep 2019 13:02:15 +0000 (06:02 -0700)]
Portable import of collections.Mapping
Tested with Python 3.8
Based on #237 by @Plailect. Closes #237
Anders Dellien [Fri, 2 Aug 2019 13:56:49 +0000 (15:56 +0200)]
Improved handling of location information (#225)
This commit moves some of the location-handling code from the examples
to a new class (LocationParser) in order to make it more reusable.
Also adds two test files containing location information.
Dmitry Koltunov [Tue, 30 Jul 2019 03:11:38 +0000 (06:11 +0300)]
Fix for `CFIEntry.get_decoded()` (#232)
* test: test `CFIEntry.get_decoded()`
This test detects an error in `CFIEntry.get_decoded()` that occurs when
decoding the `DW_CFA_def_cfa_register` instruction without a previous CFA
definition.
Signed-off-by: Koltunov Dmitry <koltunov@ispras.ru>
* add empty `cfa` for fixup decode of the `DW_CFA_def_cfa_register`
Signed-off-by: Koltunov Dmitry <koltunov@ispras.ru>
William Woodruff [Thu, 18 Jul 2019 13:29:00 +0000 (09:29 -0400)]
dwarf/descriptions: Remove DW_LANG_Upc (#234)
The standard defines only DW_LANG_UPC, and this
value also contained a typo.
William Woodruff [Thu, 18 Jul 2019 13:28:23 +0000 (09:28 -0400)]
dwarf_expr: Add DW_OP_{implicit,stack}_value (#233)
Eli Bendersky [Sat, 22 Jun 2019 12:19:54 +0000 (05:19 -0700)]
Bump minimal supported Python 3.x version to 3.4 and add testing with 3.7
Scott Johnson [Sat, 22 Jun 2019 12:16:23 +0000 (05:16 -0700)]
Fix deprecation warning in Python 3.7 (#231)
$SITE_PYTHON/lib/python3.7/site-packages/elftools/construct/lib/container.py:5
Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
This change is compatible with Python 3.3 and up, when the ABCs were
moved to collections.abc. Backward compatibility is retained through
the try/except block.
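The try/except pattern referred to above typically looks like this. `MutableMapping` is used here only as an example of an ABC; which names container.py actually imports is not stated in this message.

```python
try:
    # Python 3.3+: the ABCs live in collections.abc.
    from collections.abc import MutableMapping
except ImportError:
    # Older Pythons: fall back to the original location.
    from collections import MutableMapping


class Container(MutableMapping):
    """Tiny dict-backed mapping, just to show the ABC in use."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
```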
zephyrj [Mon, 22 Apr 2019 12:07:27 +0000 (13:07 +0100)]
Add ability to parse the NT_FILE note found in core files (#220)
Andreas Ziegler [Tue, 19 Mar 2019 01:48:19 +0000 (02:48 +0100)]
Improve symbol table handling in DynamicSegment (#219)
dynamic: parse DT_{GNU_}HASH for number of symbols
In ultra-stripped binaries we can find the symbol table by
parsing the dynamic segment and using the pointer in the
DT_SYMTAB tag as the base address. However, we don't know
anything about the number of symbols in the symbol table.
Earlier, this code relied on finding the closest pointer
value bigger than the base address of the symbol table. In
PIE executables and shared libraries however this method
could break as the pointer value for DT_SYMTAB is in the same
range as things like DT_RELASZ or DT_STRSZ, leading to a too
small number of symbols returned by iter_symbols().
The crashpad project has implemented a different strategy to
find the number of symbols: parsing the symbol lookup hash
tables (see [0]) as every symbol must have a corresponding
entry in the hash table. This commit implements this
behaviour for DynamicSegment, leaving the old code as a
backup if neither DT_HASH nor DT_GNU_HASH tags have been
found.
For DT_HASH type tables, it is quite easy as the header
already contains the number of entries. For DT_GNU_HASH
things are a bit more complicated as we need to work forward
from the highest symbol referenced in the header (a good
explanation of the format can be found at [1]).
[0]: https://github.com/chromium/crashpad/commit/1f1657d573c789aa36b6022440e34d9ec30d894c
[1]: https://flapenguin.me/2017/05/10/elf-lookup-dt-gnu-hash/
* dynamic: provide more functions for symbol access
So far, the DynamicSegment only provided a method to iterate
over all symbols but for some use cases it might be useful to
use the recovered symbol table more like a normal
SymbolTableSection.
To this end, provide get_symbol(index) to fetch a symbol by
its index, num_symbols() to get the total number of symbols
and get_symbol_by_name(name) to look for a list of symbols
with a given name.
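The two counting strategies can be sketched on raw table bytes as follows. This is an illustration under stated assumptions: little-endian data, 32-bit words in the DT_HASH header, and bloom filter words of `word_size` bytes (8 for ELF64). The function names are hypothetical and this is not the DynamicSegment code.

```python
import struct


def num_symbols_from_sysv_hash(table):
    """DT_HASH: the header is (nbucket, nchain); nchain is the count."""
    _nbucket, nchain = struct.unpack_from('<II', table, 0)
    return nchain


def num_symbols_from_gnu_hash(table, word_size=8):
    """DT_GNU_HASH: walk the chain of the highest symbol in any bucket."""
    nbuckets, symoffset, bloom_size, _bloom_shift = struct.unpack_from(
        '<IIII', table, 0)
    buckets_off = 16 + bloom_size * word_size
    buckets = struct.unpack_from('<%dI' % nbuckets, table, buckets_off)
    chains_off = buckets_off + 4 * nbuckets

    max_idx = max(buckets) if buckets else 0
    if max_idx < symoffset:
        # No hashed symbols; only the unhashed ones before symoffset exist.
        return symoffset

    # Chain entries are indexed from symoffset; the lowest bit of an
    # entry marks the end of a chain.
    while True:
        (entry,) = struct.unpack_from(
            '<I', table, chains_off + 4 * (max_idx - symoffset))
        if entry & 1:
            break
        max_idx += 1
    return max_idx + 1
```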
Robert Xiao [Sat, 16 Mar 2019 13:48:47 +0000 (06:48 -0700)]
Enable parsing of relocations pointed to by DYNAMIC. (#135)
Robert Xiao [Mon, 11 Mar 2019 13:28:25 +0000 (06:28 -0700)]
Fix LookupError when testing with tox (#221)
On macOS I'm getting the following error when testing with tox on py27:
```
ERROR: invocation failed (exit code 1), logfile: /devel/pyelftools/.tox/py27/log/py27-33.log
ERROR: actionid: py27
msg: installpkg
cmdargs: ['/devel/pyelftools/.tox/py27/bin/pip', 'install', '-U', '--no-deps', '/devel/pyelftools/.tox/dist/pyelftools-0.25.zip']
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Processing ./.tox/dist/pyelftools-0.25.zip
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/qz/XXX/T/pip-req-build-890d2p/setup.py", line 47, in <module>
scripts=['scripts/readelf.py']
File "/devel/pyelftools/.tox/py27/lib/python2.7/site-packages/setuptools/__init__.py", line 144, in setup
_install_setup_requires(attrs)
File "/devel/pyelftools/.tox/py27/lib/python2.7/site-packages/setuptools/__init__.py", line 137, in _install_setup_requires
dist.parse_config_files(ignore_option_errors=True)
File "/devel/pyelftools/.tox/py27/lib/python2.7/site-packages/setuptools/dist.py", line 704, in parse_config_files
self._parse_config_files(filenames=filenames)
File "/devel/pyelftools/.tox/py27/lib/python2.7/site-packages/setuptools/dist.py", line 600, in _parse_config_files
reader = io.TextIOWrapper(fp, encoding=encoding)
LookupError: unknown encoding:
```
This is due to the specification of LC_ALL as simply `en_US` without an encoding. Python 3.x seems to be fine with this, but Python 2.7 barfs. As a fix, setting `LC_ALL` to `en_US.utf-8` (including an explicit encoding spec) works.
Andreas Ziegler [Sat, 16 Feb 2019 13:25:59 +0000 (14:25 +0100)]
Also decode strings in _DynamicStringTable.get_string() (#217)
StringTableSection.get_string() returns a UTF-8 decoded
string (or '' if fetching the string failed) since #182
but the code in _DynamicStringTable was never updated to
decode anything at all so it just returns a bytes sequence
in Python 3.
Let's convert the string there as well to be able to use
both string tables the same way without having to worry
about decoding. Adapt the test cases accordingly.
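The intended behaviour can be sketched on a plain bytes buffer: read up to the terminating NUL, decode as UTF-8, and return '' when nothing can be fetched. The standalone `get_string` below and its use of `errors='replace'` are assumptions for illustration, not the method from the commit.

```python
def get_string(table, offset):
    """Return the NUL-terminated string at offset, decoded as UTF-8.

    table is the raw bytes of a string table. Returns '' if the
    offset is out of range or no terminator is found.
    """
    end = table.find(b'\x00', offset)
    if offset >= len(table) or end == -1:
        return ''
    return table[offset:end].decode('utf-8', errors='replace')
```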
Eli Bendersky [Thu, 31 Jan 2019 14:24:36 +0000 (06:24 -0800)]
Remove py34 testing target
Eli Bendersky [Thu, 31 Jan 2019 14:24:25 +0000 (06:24 -0800)]
Small stylistic fixes
Vasily E [Thu, 31 Jan 2019 14:17:14 +0000 (17:17 +0300)]
Fixup error on empty .debug_pubtypes section (#215)
* tox: explicitly set locale
Locale affects GNU binutils output translation, which causes
run_readelf_tests.py to fail if system language is not English.
Signed-off-by: Efimov Vasily <real@ispras.ru>
* test: unittest reproducing error with empty ".debug_pubtypes" section
Signed-off-by: Efimov Vasily <real@ispras.ru>
* NameLUT: use `construct.If` to declare "name" field
This patch also fixes problem with empty first entry.
Signed-off-by: Efimov Vasily <real@ispras.ru>
* NameLUT._get_entries: remove unused `bytes_read`
Signed-off-by: Efimov Vasily <real@ispras.ru>
Anders Dellien [Wed, 30 Jan 2019 14:33:03 +0000 (15:33 +0100)]
Support for DWARFv4 location lists in dwarf_location_lists.py (#214)
In DWARFv4 the location lists are referenced with the 'sec_offset'
attribute form instead of 'data4' or 'data8'.
Anders Dellien [Mon, 24 Dec 2018 16:56:52 +0000 (17:56 +0100)]
More efficient AbbrevDecl handling (#212)
Create all the AbbrevDecl objects during parsing and later return
references to them - this gives a small performance gain.
rvijayc [Mon, 24 Dec 2018 14:02:08 +0000 (06:02 -0800)]
Added support for decoding .debug_pubtypes and .debug_pubnames sections (#208)
* Added support for decoding .debug_pubtypes and .debug_pubnames sections
* Added reference output to dwarf_pubnames_types.py example.
* Added readelf support, fixed review comments and documentation updates
* Avoid printing the entire die in pubnames example to work around Python2 vs 3 incompatibilities
Anders Dellien [Thu, 20 Dec 2018 13:21:35 +0000 (14:21 +0100)]
Simplify handling of null DIEs (#209)
The code that is intended to coalesce null DIEs into the DIE that
precedes them does not do that and is actually not needed as the
'unflattening' procedure takes care of any unexpected null DIEs.
Also added a unit test for verifying the DIE size calculation.