Whilst working on improvements to the error handling of the debug line
parsing code, I noticed that if an invalid offset were to be specified
in a call to getOrParseLineTable(), an entry in the LineTableMap would
still be created, even if the offset was not within the section range.
The immediate parsing attempt afterwards would fail (it would end up
getting a version of 0), and thereafter, any subsequent calls to
getOrParseLineTable or getLineTable would return the default-
constructed, invalid line table. In reality, we shouldn't even attempt
to parse this table, and we should always return a nullptr from these
two functions for this situation.
I have tested this via a unit test, which required some new framework
for unit testing debug line. My plan is to add quite a few more unit
tests for the new error reporting mechanism that will follow shortly,
hence the reason why the supporting code for the tests are written the
way they are - I intend to extend the DwarfGenerator class to support
generating debug line. At that point, I'll make sure that there are a
few positive test cases for this and the parsing code too.
Differential Revision: https://reviews.llvm.org/D44200
Reviewers: JDevlieghere, aprantl
llvm-svn: 326995
Summary:
Original change was D43313 (r326932) and reverted by r326953 because it
broke an LLD test and a windows build. The LLD test was already fixed in
lld commit r326944 (thanks maskray). This is the original change with
the windows build fixed.
llvm-svn: 326970
This patch enhances DWARFDebugFrame with the capability of parsing and
printing DWARF expressions in CFI instructions. It also makes FDEs and
CIEs accessible to lib users, so they can process them in client tools
that rely on LLVM. To make it self-contained with a test case, it
teaches llvm-readobj to be able to dump EH frames and checks they are
correct in a unit test. The llvm-readobj code is Maksim Panchenko's work
(maksfb).
Reviewers: JDevlieghere, espindola
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D43313
llvm-svn: 326932
Instead of only printing the CU-relative offset in non-verbose mode, it
makes more sense to only printed the resolved address. In verbose mode
we still print both.
Differential revision: https://reviews.llvm.org/D44148
rdar://33525475
llvm-svn: 326903
Summary:
This patch implements the name lookup functionality of the .debug_names
accelerator table and hooks it up to "llvm-dwarfdump -find". To make the
interface of the two kinds of accelerator tables more consistent, I've
created an abstract "DWARFAcceleratorTable::Entry" class, which provides
a consistent interface to access the common functionality of the table
entries (such as getting the die offset, die tag, etc.). I've also
modified the apple table to vend entries conforming to this interface.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: vleschuk, clayborg, echristo, llvm-commits
Differential Revision: https://reviews.llvm.org/D43067
llvm-svn: 326003
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.
Add new optional attribute for !DIFile IR metadata called source which contains
source text. Use this to output the source to the DWARF line table of code
objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM
to support optional source.
Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.
Differential Revision: https://reviews.llvm.org/D42765
llvm-svn: 325970
Verifying any DWARF file that is optimized and contains at least one tag
with a DW_AT_location with a location list offset as a
DW_AT_form_dataXXX results in dwarfdump spuriously claiming that the
location list is invalid.
Differential revision: https://reviews.llvm.org/D40199
llvm-svn: 325430
Seeing some inlining missing in internal uses of symbolizer. I'll work
on a reproduction, tests, improvements & recommit as soon as possible.
(Chandler would like it to be known that this improvement did make
check-llvm 4x faster... - so there's certainly some fairly good
motivation to push on fixing/figuring this out & getting it back in)
This reverts commit r321345.
llvm-svn: 324981
Emitting the correct (root of compilation) file at index 0 will be
posted for review later; I wanted to get this minor change out of the
way first.
llvm-svn: 324669
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.
The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself. This
matches the format used for dumping strings in the .debug_info
section.
Differential Revision: https://reviews.llvm.org/D42802
llvm-svn: 324270
This change adds support to llvm-dwarfdump for dumping DWARF5
.debug_rnglists sections in regular ELF files.
It is not complete, in that several DW_RLE_* encodings are currently
not supported, but does dump the headert and the basic ranges for
DW_RLE_start_length and DW_RLE_start_end encodings.
Obvious next steps are to add verbose dumping that dumps the raw
encodings, rather than the interpreted contents, to add -verify support
of the section (e.g. to show that the correct number of offsets are
specified), add dumping of .debug_rnglists.dwo, and to add support for
other encodings.
Reviewed by: dblaikie, JDevlieghere
Differential Revision: https://reviews.llvm.org/D42481
llvm-svn: 324077
r323476 added support for DW_FORM_line_strp, and incorrectly made that
depend on having a DWARFUnit available. We shouldn't be tracking
.debug_line_str in DWARFUnit after all. After this patch, I can do an
NFC follow up and undo a bunch of the "plumbing" part of r323476.
Differential Revision: https://reviews.llvm.org/D42609
llvm-svn: 323691
The test was failing because of an incorrect sizeof check in the name
index parsing code. This code was meant to check that we have enough
input to parse the fixed-size part of the dwarf header, which it did by
comparing the input to sizeof(Header). Originally struct Header only
contained the fixed-size part, but during review, we've moved additional
members into it, which rendered the sizeof check invalid.
I resolve this by moving the fixed-size part to a separate struct and
updating the sizeof-expression to use that.
llvm-svn: 323648
The call to ScopedPrinter::printNumber with size_t argument was
ambiguous (I think) on 32-bit builds. Explicitly cast to a 64-bit int to
avoid this.
llvm-svn: 323642
Summary:
This modifies the dwarfdump output to align it with the new .debug_names
dump. It also renames two header fields to match similar fields in the
dwarf5 header.
A couple of tests needed to be updated to match new output. The changes
were fairly straight-forward, although not really automatable.
Reviewers: JDevlieghere, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42415
llvm-svn: 323641
Summary:
This commit renames DWARFAcceleratorTable to AppleAcceleratorTable to free up
the first name as an interface for the different accelerator tables.
Then I add a DWARFDebugNames class for the dwarf5 table.
Presently, the only common functionality of the two classes is the dump()
method, because this is the only method that was necessary to implement
dwarfdump -debug-names; and because the rest of the
AppleAcceleratorTable interface does not directly transfer to the dwarf5
tables (the main reason for that is that the present interface assumes
the tables are homogeneous, but the dwarf5 tables can have different
keys associated with each entry).
I expect to make the common interface richer as I add more functionality
to the new class (and invent a way to represent it in generic way).
In terms of sharing the implementation, I found the format of the two
tables sufficiently different to frustrate any attempts to have common
parsing or dumping code, so presently the implementations share just low
level code for formatting dwarf constants.
Reviewers: vleschuk, JDevlieghere, clayborg, aprantl, probinson, echristo, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42297
llvm-svn: 323638
This patch moves the DJB hash to support. This is consistent with other
hashing algorithms living there. The hash is used by the DWARF
accelerator tables. We're doing this now because the hashing function is
needed by dsymutil and we don't want to link against libBinaryFormat.
Differential revision: https://reviews.llvm.org/D42594
llvm-svn: 323616
Move standard forms from a switch statement to the table of forms;
fill in all the missing ones defined in DWARF v5. I'm guessing at
classifications in a couple of cases where v5 forms aren't actually
supported yet, but whoever adds support for the forms can fix the
classifications as needed.
llvm-svn: 323481
This form is like DW_FORM_strp, but points to .debug_line_str instead
of .debug_str as the string section. It's intended to be used from
the line-table header, and allows string-pooling of directory and
filenames across compilation units.
Differential Revision: https://reviews.llvm.org/D42553
llvm-svn: 323476
This frees up the first name to be used as an base class for the
apple table and the dwarf5 .debug_names accel table. The rename was
split off from D42297 (adding of debug_names support), which is still
under review.
llvm-svn: 323113
The compilation directory has always been #0, but as of DWARF v5 it is
explicitly listed in the line-table section instead of implicitly
being a reference to the compile_unit DIE's DW_AT_comp_dir attribute.
This means the dumper should number the dumped array starting with 0
or 1 depending on the DWARF version of the line table.
References in the generated DWARF are correct, it's just the dumper
that was wrong. Also some assembler-coded tests were similarly
confused about directory numbers.
llvm-svn: 322884
inlined subroutines for a given address.
This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.
This patch changes it two a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case where only a few
address get symbolized during the entire run.
It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.
Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping address past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]
All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.
Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.
Huge thanks to Dave Blaikie who helped walk me through what the various
things I needed to do in DWARF to make this work.
Differential Revision: https://reviews.llvm.org/D40987
llvm-svn: 321345
Reorganizes the DWARF consumer to derive the string offsets table
contribution's format from the contribution header instead of
(incorrectly) from the unit's format.
Reviewers: JDevliegehere, aprantl
Differential Revision: https://reviews.llvm.org/D41146
llvm-svn: 321295
Adds missing support for DW_FORM_data16.
Update of r320852/r320886, fixing the unittest again, this time use a
raw char string for the test data.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 321011
Adds missing support for DW_FORM_data16.
Update of r320852, fixing the unittest to use a hand-coded struct
instead of std::array to guarantee data layout.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 320886
This fixes a bug where the verifier was complaining about empty
accelerator tables. When the table is empty, its size is not a valid
offset as it points after the end of the section.
This patch also makes the extractor return llvm:Error instead of bool
for better error reporting in the verifier.
Differential revision: https://reviews.llvm.org/D41063
rdar://35932007
llvm-svn: 320399
The previous implementation would only look 1 DW_AT_specification or DW_AT_abstract_origin deep. This means DWARFDie::getName() would fail in certain cases. I ran into such a case while creating a tool that used the LLVM DWARF parser to generate a symbolication format so I have seen this in the wild.
Differential Revision: https://reviews.llvm.org/D40156
llvm-svn: 319104
DWARF4 relative DW_AT_high_pc values are now displayed as absolute
addresses. The relative value is only shown when explicitly dumping the
forms, i.e. in show-form or verbose mode.
```
DW_AT_low_pc (0x0000000000000049)
DW_AT_high_pc (0x00000019)
```
becomes
```
DW_AT_low_pc (0x0000000000000049)
DW_AT_high_pc (0x0000000000000062)
```
Differential revision: https://reviews.llvm.org/D40317
rdar://35416943
llvm-svn: 319044
As a side effect, the .debug_line section will be dumped in physical
order, rather than in the order that compile units refer to their
associated portions of the .debug_line section. These are probably
always the same order anyway, and no tests noticed the difference.
Differential Revision: https://reviews.llvm.org/D39854
llvm-svn: 318839
Patch improves next things:
* Fixes assert/crash in getOpDesc when giving it a invalid expression op code.
* DWARFExpression::print() called DWARFExpression::Operation::getEndOffset() which
returned and used uninitialized field EndOffset. Patch fixes that.
* Teaches verifier to verify DW_AT_location and error out on broken expressions.
Differential revision: https://reviews.llvm.org/D39294
llvm-svn: 316756
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.
Differential revision: https://reviews.llvm.org/D38409
llvm-svn: 316619
This fixes possible out of bound access in
DWARFDie::getFirstChild()
which might happen when .debug_info section is corrupted,
like shown in testcase.
Differential revision: https://reviews.llvm.org/D39185
llvm-svn: 316566
Currently llvm-dwarfdump runs into llvm_unreachable when
faces DW_CFA_GNU_args_size. Patch implements the support.
Differential revision: https://reviews.llvm.org/D38879
llvm-svn: 315897
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.
Recommit after being reverted in r315299.
Differential revision: https://reviews.llvm.org/D36993
llvm-svn: 315316
This patch adds printing for DW_AT_type DIEs like it is already the case
for DW_AT_specification DIEs. This is a rather naive approach and only a
start. We should have pretty printers for different languages.
Differential revision: https://reviews.llvm.org/D36993
llvm-svn: 315297
This patch adds two new verifiers:
- It checks that the root DIE of a CU is actually a valid unit DIE.
(based on its tag)
- For DWARF5 which contains a unit type int he CU header, it checks that
this matches the type of the unit DIE.
Differential revision: https://reviews.llvm.org/D38453
llvm-svn: 315121
The test fails on Linux; see follow-up email on the llvm-commits list.
> Add the option to lookup an address in the debug information and print
> out the file, function, block and line table details.
>
> Differential revision: https://reviews.llvm.org/D38409
This also reverts the follow-up r314818:
> [test] Fix llvm-dwarfdump/cmdline.test
>
> Fixes test/tools/llvm-dwarfdump/cmdline.test
llvm-svn: 314825
Add the option to lookup an address in the debug information and print
out the file, function, block and line table details.
Differential revision: https://reviews.llvm.org/D38409
llvm-svn: 314817
This implement the insertion operator for DWARF address ranges so they
are consistently printed as [LowPC, HighPC).
While a dump method might have felt more consistent, it is used
exclusively for printing error messages in the verifier and never used
for actual dumping. Hence this approach is more intuitive and creates
less clutter at the call sites.
Differential revision: https://reviews.llvm.org/D38395
llvm-svn: 314523
This patch introduces 3 helper functions: error(), warn() and note() to
make printing during verification more consistent. When supported, the
respective prefixes are printed in color using the same color scheme as
clang.
Differential revision: https://reviews.llvm.org/D38368
llvm-svn: 314498
This patch implements the dwarfdump option --find=<name>. This option
looks for a DIE in the accelerator tables and dumps it if found. This
initial patch only adds support for .apple_names to keep the review
small, adding the other sections and pubnames support should be
trivial though.
Differential Revision: https://reviews.llvm.org/D38282
llvm-svn: 314439
This patch adds a check to the DWARF verifier to detect CUs without a
unit DIE.
Differential revision: https://reviews.llvm.org/D38363
llvm-svn: 314426
When dsymutil generates the companion file, its strips all unnecessary
sections by omitting their body and setting the offset in their
corresponding load command to zero.
One such section is the .eh_frame section, as it contains runtime
information rather than debug information and is part of the __TEXT
segment. When reading this section, we would just read the number of
bytes specified in the load command, starting from offset 0 (i.e. the
beginning of the file).
Rather than trying to parse this obviously invalid section, dwarfdump
now skips this.
Differential revision: https://reviews.llvm.org/D38135
llvm-svn: 314208
This patch adds dumping of line table instructions as well as the final
state at each specified pc value in verbose mode. This is essentially
the same as the default in Darwin's dwarfdump. Dumping the actual line
table opcodes can be particularly useful for something like debugging a
bad `.debug_line` section.
Differential revision: https://reviews.llvm.org/D37971
llvm-svn: 313910
This is a bit ugly because we can't put Optional into a union. Hide all
of that behind a set of accessors and make accesses safer using asserts.
llvm-svn: 313884
This patch implements the Darwin dwarfdump option --recurse-depth=<N>,
which limits the recursion depth when selectively printing DIEs at an
offset.
Differential Revision: https://reviews.llvm.org/D38064
llvm-svn: 313778
When symbolizing large binaries, parsing every CU in a DWP file is a
significant performance penalty. Instead, use the index to only load the
CUs that are needed.
llvm-svn: 313659
This speeds up dumping specific DIEs by not parsing abbreviations for
units that are not used.
(this is also handy to have in eventually to speed up llvm-symbolizer
for .dwp files, where parsing most of the DWP file can be avoided by
using the index)
llvm-svn: 313635
This patch makes the `.eh_frame` extension an alias for `.debug_frame`.
Up till now it was only possible to dump the section using objdump, but
not with dwarfdump. Since the two are essentially interchangeable, we
dump whichever of the two is present.
As a workaround, this patch also adds parsing for 3 currently
unimplemented CFA instructions: `DW_CFA_def_cfa_expression`,
`DW_CFA_expression`, and `DW_CFA_val_expression`. Because I lack the
required knowledge, I just parse the fields without actually creating
the instructions.
Finally, this also fixes the typo in the `.debug_frame` section name
which incorrectly contained a trailing `s`.
Differential revision: https://reviews.llvm.org/D37852
llvm-svn: 313530
This is the first of many commits that enable selectively dumping just
one record from the debug info.
This reapplies r313412 with some extra qualification to appease GCC and MSVC.
llvm-svn: 313419
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:
- Verify that all address ranges in a DIE are valid.
- Verify that no ranges within the DIE overlap.
- Verify that no ranges overlap with the ranges of a sibling.
- Verify that children are completely contained in its (direct)
parent's address range. (unless both are subprograms)
Differential revision: https://reviews.llvm.org/D37696
llvm-svn: 313255
This patch started as an attempt to rebase Greg's differential (D32821).
The result is both quite similar and different at the same time. It adds
the following checks:
- Verify that all address ranges in a DIE are valid.
- Verify that no ranges within the DIE overlap.
- Verify that no ranges overlap with the ranges of a sibling.
- Verify that children are completely contained in its (direct)
parent's address range. (unless both are subprograms)
Differential revision: https://reviews.llvm.org/D37696
llvm-svn: 313250
Since users typically don't really care about the .dwo / non.dwo
distinction, this patch makes it so dwarfdump --debug-<info,...> dumps
.debug_info and (if available) also .debug_info.dwo. This simplifies
the command line interface (I've removed all dwo-specific dump
options) and makes the tool friendlier to use.
Differential Revision: https://reviews.llvm.org/D37771
llvm-svn: 313207
This patches renames "brief" to "verbose" in de DIDumpOptions and
inverts the logic to match the new behavior where brief is the default.
Changing the default value uncovered some bugs related to the
DIDumpOptions not being propagated and have been fixed as well.
Differential revision: https://reviews.llvm.org/D37745
llvm-svn: 313139
As discussed on llvm-dev in
http://lists.llvm.org/pipermail/llvm-dev/2017-September/117301.html
this changes the command line interface of llvm-dwarfdump to match the
one used by the dwarfdump utility shipping on macOS. In addition to
being shorter to type this format also has the advantage of allowing
more than one section to be specified at the same time.
In a nutshell, with this change
$ llvm-dwarfdump --debug-dump=info
$ llvm-dwarfdump --debug-dump=apple-objc
becomes
$ dwarfdump --debug-info --apple-objc
Differential Revision: https://reviews.llvm.org/D37714
llvm-svn: 312970
This patch adds prologue verification, which is already present in
Apple's dwarfdump. It checks for invalid directory indices and warns
about duplicate file paths.
Differential revision: https://reviews.llvm.org/D37511
llvm-svn: 312782
It solves issue of wrong section index evaluating for ranges when
base address is used.
Based on David Blaikie's patch D36097.
Differential revision: https://reviews.llvm.org/D37214
llvm-svn: 312477
Summary:
Based on Fred's patch here: https://reviews.llvm.org/D6771
I can't seem to commandeer the old review, so I'm creating a new one.
With that change the locations exrpessions are pretty printed inline in the
DIE tree. The output looks like this for debug_loc entries:
DW_AT_location [DW_FORM_data4] (0x00000000
0x0000000000000001 - 0x000000000000000b: DW_OP_consts +3
0x000000000000000b - 0x0000000000000012: DW_OP_consts +7
0x0000000000000012 - 0x000000000000001b: DW_OP_reg0 RAX, DW_OP_piece 0x4
0x000000000000001b - 0x0000000000000024: DW_OP_breg5 RDI+0)
And like this for debug_loc.dwo entries:
DW_AT_location [DW_FORM_sec_offset] (0x00000000
Addr idx 2 (w/ length 190): DW_OP_consts +0, DW_OP_stack_value
Addr idx 3 (w/ length 23): DW_OP_reg0 RAX, DW_OP_piece 0x4)
Simple locations without ranges are printed inline:
DW_AT_location [DW_FORM_block1] (DW_OP_reg4 RSI, DW_OP_piece 0x4, DW_OP_bit_piece 0x20 0x0)
The debug_loc(.dwo) dumping in changed accordingly to factor the code.
Reviewers: dblaikie, aprantl, friss
Subscribers: mgorny, javed.absar, hiraditya, llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D37123
llvm-svn: 312042
This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.
Differential Revision: https://reviews.llvm.org/D36835
llvm-svn: 311201
As was requested in D36313 thread,
with this patch section names and uniqueness calculated once,
and not every time when a range is dumped.
Differential revision: https://reviews.llvm.org/D36740
llvm-svn: 310923
Teaches llvm-dwarfdump to print section index and name of range
when it dumps .debug_info.
Differential revision: https://reviews.llvm.org/D36313
llvm-svn: 310915
These lead to tests failing spuriously as the values after being rendered to a
string were incorrect.
Reviewers: clayborg
Differential Revision: https://reviews.llvm.org/D36319
llvm-svn: 310262
Followup to r309570, fixing it slightly differently (ranges_base and
addr_base should never be read from a DWO file - so there shouldn't be
any issue with 'overriding' the values - conditionalize the code and
assert that the values aren't being overriden).
llvm-svn: 309879
DIEs are lazily deserialized so it's possible that the DWO CU is created
before the DIE is parsed. DWO shares .debug_addr and .debug_ranges with the
object file so overwriting the offset with 0 will make the CU unusable.
No test case because I couldn't get clang to emit a non-zero range base.
llvm-svn: 309570
I was a bit lazy when I first implemented this & skipped the index
lookup - obviously for large files this becomes pretty crucial, so here
we go, do the index lookup. Speeds up large DWP symbolizing by... lots.
(20m -> 20s, actually, maybe more in a release build (that was a release
build without index lookup, compared to a debug/non-release build with
the index usage))
llvm-svn: 309507
If you've archived the DWP file somewhere it's probably useful to be
able to just tell llvm-symbolizer where it is when you're symbolizing
stack traces from the binary.
This only provides a mechanism for specifying a single DWP file, good if
you're symbolizing a program with a single DWP file, but it's likely if
the program is dynamically linked that you might have a DWP for each
dynamic library - in which case this feature won't help (at least as
it's surfaced in llvm-symbolizer for now) - in theory it could be
extended to specify a collection of DWP files that could all be
consulted for split CU hash resolution.
llvm-svn: 309498
SUMMARY
This patch adds a verification check on the abbreviation declarations in the .debug_abbrev section.
The check makes sure that no abbreviation declaration has more than one attributes with the same name.
Differential Revision: https://reviews.llvm.org/D35643
llvm-svn: 308579
This changes DwarfContext to delegate to DwarfObject instead of having
pure virtual methods.
With this DwarfContextInMemory is replaced with an implementation of
DwarfObject that is local to a .cpp file.
llvm-svn: 308543
Summary:
This patch modifies the handleDebugInfo() function so that we verify the contents of each unit
in the .debug_info section only if its header has been successfully verified.
This change will allow for more/different verification checks depending on the type of the unit since from
dwarf5, the .debug_info section may consist of different types of units.
Subscribers: aprantl
Differential Revision: https://reviews.llvm.org/D35521
llvm-svn: 308245
This patch adds verification checks for the unit header chain in the .debug_info section.
Specifically, for each unit in the .debug_info section, the verifier checks that:
The unit length is valid (i.e. the unit can actually fit in the .debug_info section)
The dwarf version of the unit is valid
The address size is valid (4 or 8)
The unit type (if the unit is in dwarf5) is valid
The debug_abbrev_offset is valid
llvm-svn: 307975
Code to convert MachO - specific section debug section names to standard DWARF v5
section names was in the wrong place.
Differential Revision: https://reviews.llvm.org/D35321
llvm-svn: 307872
Doing so is leaking an implementation detail.
I have an implementation that uses the lld infrastructure and doesn't
use a map or object::SectionRef.
llvm-svn: 307846
Variable was called 'Name' and contained text
name of relocation type. Problem was that
outside of this error handling scope we already
have different 'Name' variable that contains
section name.
Change helps to avoid confusion.
llvm-svn: 307530
This patch verifies the number of atoms, the validity of the form for each atom, as well as the validity of the
hashdata. For hashdata, we're verifying that the hashdata offset is correct and that the offset in the .debug_info for
each DIE in the hashdata is also valid.
llvm-svn: 306735
Requires callers to directly associate relocations with a DataExtractor
used to read data from a DWARF section, which helps a callee not make
assumptions about which section it is reading.
This is the next step in reducing DWARFFormValue's dependence on DWARFUnit.
Differential Revision: https://reviews.llvm.org/D34704
llvm-svn: 306699
Because of mistake introduced in r306517,
wrong variable ("name" instead of "Name") was used
in error message.
As a result it reported section name instead of
relocation name.
This file still needs cleanup to match LLVM coding style
and more tests I think.
llvm-svn: 306677
With fix in include folder character case:
#include "llvm/Codegen/AsmPrinter.h" -> #include "llvm/CodeGen/AsmPrinter.h"
Original commit message:
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.
That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.
Differential revision: https://reviews.llvm.org/D34328
llvm-svn: 306517
Change introduces error reporting policy for DWARFContextInMemory.
New callback provided by client is able to handle error on it's
side and return Halt or Continue.
That allows to either keep current behavior when parser prints all errors
but continues parsing object or implement something very different, like
stop parsing on a first error and report an error in a client style.
Differential revision: https://reviews.llvm.org/D34328
llvm-svn: 306512
Some forms have sizes that depend on the DWARF version, DWARF format
(32/64-bit), or the size of an address. Collect these into a struct
to simplify passing them around. Require callers to provide one when
they query a form's size.
Differential Revision: http://reviews.llvm.org/D34570
llvm-svn: 306315
The verifier should not output any message in such a case.
Added test case with no .apple_name section in the file to verify new functionality.
Made existing test case more specific.
llvm-svn: 305597
This patch adds code which verifies that each bucket in the .apple_names
accelerator table is either empty or has a valid hash index.
Differential Revision: https://reviews.llvm.org/D34177
llvm-svn: 305344
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
This patch introduces a new command line option, called brief, to
llvm-dwarfdump. When -brief is used, the attribute forms for the
.debug_info section will not be emitted to output.
Patch by Spyridoula Gravani!
rdar://problem/21474365
Differential Revision: https://reviews.llvm.org/D33867
llvm-svn: 304844
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
This commit introduces a structure that holds all the flags that
control the pretty printing of dwarf output.
Patch by Spyridoula Gravani!
Differential Revision: https://reviews.llvm.org/D33749
llvm-svn: 304446
With fix of uninitialized variable.
Original commit message:
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 304078
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses with use of llvm::LoadedObjectInfo
interface. We assigned file offsets as addressed. Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well. That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 304002
With fix of test compilation.
Initial commit message:
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section
which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason
of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well.
That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 303983
This change is intended to use for LLD in D33183.
Problem we have in LLD when building .gdb_index is that we need to know section
which address range belongs to.
Previously it was solved on LLD side by providing fake section addresses
with use of llvm::LoadedObjectInfo interface. We assigned file offsets as addressed.
Then after obtaining ranges lists, for each range we had to find section ID's.
That not only was slow, but also complicated implementation and was the reason
of incorrect behavior when
sections share the same offsets, like D33176 shows.
This patch makes DWARF parsers to return section index as well.
That solves problem mentioned above.
Differential revision: https://reviews.llvm.org/D33184
llvm-svn: 303978
llvm-symbolizer would fail to symbolize addresses in unlinked object
files when handling .dwo file data because the addresses would not be
relocated in the same way as the ranges in the skeleton CU in the object
file.
Fix that so object files can be symbolized the same as executables.
llvm-svn: 303532
We do not need to store relocation width field.
Patch removes relative code, that simplifies implementation.
Differential revision: https://reviews.llvm.org/D33274
llvm-svn: 303335
I revisited Decompressor API (issue with it was triggered during D32865 review)
and found it is probably provides more then we really need.
Issue was about next method's signature:
Error decompress(SmallString<32> &Out);
It is too strict. At first I wanted to change it to decompress(SmallVectorImpl<char> &Out),
but then found it is still not flexible because sticks to SmallVector.
During reviews was suggested to use templating to simplify code. Patch do that.
Differential revision: https://reviews.llvm.org/D33200
llvm-svn: 303331
RelocAddrMap was a pair of <width, address>, where width is relocation size (4/8/x, x < 8),
and width field was never used in code.
Relocations proccessing loop had checks for width field. Does not look like DWARF parser
should do that. There is probably no much sense to validate relocations during proccessing
them in parser.
Patch removes relocation's width relative code from DWARFContext.
Differential revision: https://reviews.llvm.org/D33194
llvm-svn: 303251
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places were shitched to use DWARFAddressRange now.
Suggested during review of D33184.
llvm-svn: 303163
I am working on a speedup of building .gdb_index in LLD and
noticed that relocations that are proccessed in DWARFContextInMemory often uses
the same symbol in a row. This patch introduces caching to reduce the relocations
proccessing time.
For benchmark,
I took debug LLC binary objects configured with -ggnu-pubnames and linked it using LLD.
Link time without --gdb-index is about 4,45s.
Link time with --gdb-index: a) Without patch: 19,16s b) With patch: 15,52s
That means time spent on --gdb-index in this configuration is
19,16s - 4,45s = 14,71s (without patch) vs 15,52s - 4,45s = 11,07s (with patch).
Differential revision: https://reviews.llvm.org/D31136
llvm-svn: 303051
There is no other explanation about why this only started happening
now, even though it crashes on old code (supposedly reachable from
here).
The only common factor between the failing bots is that they use GCC
(4.9 and 5.3) to compile Clang, while the others use Clang 3.8, but the
failure is while building the tests, as an assertion, on Clang.
Commenting it out for now in hope the bots will go back green, but we
should keep looking for the real cause, and update bugzilla.
llvm-svn: 302520
llvm-dwarfdump currently prints no message if decompression fails
for some reason. I noticed that during work on one of LLD patches
where LLD produced an broken output. It was a bit confusing to see
no output for section dumped and no any error message at all.
Patch adds error message for such cases.
Differential revision: https://reviews.llvm.org/D32865
llvm-svn: 302221
Adrian requested that we break things down to make things clean in the DWARFVerifier. This patch breaks everything down into nice individual functions and cleans up the code quite a bit and prepares us for the next round of verifiers.
Differential Revision: https://reviews.llvm.org/D32812
llvm-svn: 302062
Adrian requested we create a DWARFVerifier.cpp file to contain all of the DWARF verification stuff. This change simply moves the functionality over into DWARFVerifier.h and DWARFVerifier.cpp, renames the DWARFVerifier methods to start with lower case, and switches DWARFContext.cpp over to using the new functionality.
Differential Revision: https://reviews.llvm.org/D32809
llvm-svn: 302044
Check to make sure no compile units have the same DW_AT_stmt_list values. Report a verification error if they do.
Differential Revision: https://reviews.llvm.org/D32771
llvm-svn: 302039
This patch verifies the .debug_line:
- verify all addresses in a line table sequence have ascending addresses
- verify that all line table file indexes are valid
Unit tests added for both cases.
Differential Revision: https://reviews.llvm.org/D32765
llvm-svn: 301984
The directory and file tables now have form-based content descriptors.
Parse these and extract the per-directory/file records based on the
descriptors. For now we support only DW_FORM_string (inline) for the
path names; follow-up work will add support for indirect forms (i.e.,
DW_FORM_strp, strx<N>, and line_strp).
Differential Revision: http://reviews.llvm.org/D32713
llvm-svn: 301978
LTO and other fancy linking previously led to DWARF that contained invalid references. We already validate that CU relative references fall into the CU, and the DW_FORM_ref_addr references fall inside the .debug_info section, but we didn't validate that the references pointed to correct DIE offsets. This new verification will ensure that all references refer to actual DIEs and not an offset in between.
This caught a bug in DWARFUnit::getDIEForOffset() where if you gave it any offset, it would match the DIE that mathes the offset _or_ the next DIE. This has been fixed.
Differential Revision: https://reviews.llvm.org/D32722
llvm-svn: 301971
lldb-dwarfdump gets a new "--verify" option that will verify a single file's DWARF debug info and will print out any errors that it finds. It will return an non-zero exit status if verification fails, and a zero exit status if verification succeeds. Adding the --quiet option will suppress any output the STDOUT or STDERR.
The first part of the verify does the following:
- verifies that all CU relative references (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata) have valid CU offsets
- verifies that all DW_FORM_ref_addr references have valid .debug_info offsets
- verifies that all DW_AT_ranges attributes have valid .debug_ranges offsets
- verifies that all DW_AT_stmt_list attributes have valid .debug_line offsets
- verifies that all DW_FORM_strp attributes have valid .debug_str offsets
Unit tests were added for each of the above cases.
Differential Revision: https://reviews.llvm.org/D32707
llvm-svn: 301844
There was a garbage character in output introduced by myself in
r290040 "[DWARF] - Introduce DWARFDebugPubTable class for dumping pub* sections."
llvm-svn: 301631
It is useful to output size of ranges when address ranges
section of .gdb_index is dumped.
It helps to compare outputs produced by different linkers,
for example. In that case address ranges can look very different,
when they are the same at fact. Difference comes from different
low address because of different address of .text.
Differential revision: https://reviews.llvm.org/D32492
llvm-svn: 301527
I found this when investigated "Bug 32319 - .gdb_index is broken/incomplete" for LLD.
When we have object file with .debug_ranges section it may be filled with zeroes.
Relocations are exist in file to relocate this zeroes into real values later, but until that
a pair of zeroes is treated as terminator. And DWARF parser thinks there is no ranges at all
when I am trying to collect address ranges for building .gdb_index.
Solution implemented in this patch is to take relocations in account when parsing ranges.
Differential revision: https://reviews.llvm.org/D32228
llvm-svn: 301170
This is splitted from D32228,
currently DWARF parsers code has few places that applied relocations values manually.
These places has similar duplicated code. Patch introduces separate method that can be
used to obtain relocated value. That helps to reduce code and simplifies things.
Differential revision: https://reviews.llvm.org/D32284
llvm-svn: 300956
Summary:
In the current implementation, to find inline stack for an address incurs expensive linear search in 2 places:
* linear search for the top-level DIE
* recursive linear traverse the DIE tree to find the path to the leaf DIE
In this patch, a map is built from address to its corresponding leaf DIE. The inline stack is built by traversing from the leaf DIE up to the root DIE. This speeds up batch symbolization by ~10X without noticible memory overhead.
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32177
llvm-svn: 300742
Summary:
In the current implementation, to find inline stack for an address incurs expensive linear search in 2 places:
* linear search for the top-level DIE
* recursive linear traverse the DIE tree to find the path to the leaf DIE
In this patch, a map is built from address to its corresponding leaf DIE. The inline stack is built by traversing from the leaf DIE up to the root DIE. This speeds up batch symbolization by ~10X without noticible memory overhead.
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32177
llvm-svn: 300697
This change is basically relative to D31136, where I initially wanted to
implement some relocations handling optimization which shows it can give
significant boost. Though even without any caching algorithm looks
code can have some cleanup at first.
Refactoring separates out code for taking symbol address, used in relocations
computation.
Differential revision: https://reviews.llvm.org/D31747
llvm-svn: 300039
Some late additions to DWARF v5 were not in Dwarf.def; also one form
was redefined. Add the new cases to relevant switches in different
parts of LLVM. Replace DW_FORM_ref_sup with DW_FORM_ref_sup[4,8].
I did not add support for DW_FORM_strx3/addrx3 other that defining the
constants. We don't have any infrastructure to support these.
Differential Revision: http://reviews.llvm.org/D30664
llvm-svn: 297085
Requesting DWARF v5 will now get you the new compile-unit and
type-unit headers. llvm-dwarfdump will also recognize them.
Differential Revision: http://reviews.llvm.org/D30206
llvm-svn: 296514
When dumping .debug_info section we loop through all attributes mentioned in
.debug_abbrev section and dump values using DWARFFormValue::extractValue().
We need to skip implicit_const attributes here as their values are not
really located in .debug_info but directly in .debug_abbrev. This patch fixes
triggered assert() in DWARFFormValue::extractValue() caused by trying to
access implicit_const values from .debug_info.
llvm-svn: 296253
DWARF info contains info about the line number at which a function starts (DW_AT_decl_line).
This patch creates a function to look up the start line number for a function, and returns it in
DILineInfo when looking up debug info for a particular address.
Patch by Simon Que!
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D27962
llvm-svn: 294231
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html
For reference:
- Public headers should just declare the dump() method but not use
LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD void MyClass::dump() {
// print stuff to dbgs()...
}
#endif
llvm-svn: 293359
Summary: This patch adds some new APIs to enable using the YAML DWARF representation in unit tests. The most basic new API is DWARFYAML::EmitDebugSections which converts a YAML string into a series of owned MemoryBuffer objects stored in a StringMap. The string map can then be used to construct a DWARFContext for parsing in place of an ObjectFile.
Reviewers: dblaikie, clayborg
Subscribers: mgorny, fhahn, jgosnell, aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D28828
llvm-svn: 292634
This allows us efficiently look for more than one attribute, something that is quite common in DWARF consumption.
Differential Revision: https://reviews.llvm.org/D28704
llvm-svn: 291967
Removed all DWARFDie::getAttributeValueAs*() calls.
Renamed:
Optional<DWARFFormValue> DWARFDie::getAttributeValue(dwarf::Attribute);
To:
Optional<DWARFFormValue> DWARFDie::find(dwarf::Attribute);
Added:
Optional<DWARFFormValue> DWARFDie::findRecursively(dwarf::Attribute);
All decoding of Optional<DWARFFormValue> values are now done using the dwarf::to*() functions from DWARFFormValue.h:
Old code:
auto DeclLine = DWARFDie.getAttributeValueAsSignedConstant(DW_AT_decl_line).getValueOr(0);
New code:
auto DeclLine = toUnsigned(DWARFDie.find(DW_AT_decl_line), 0);
This composition helps us since we can now easily do:
auto DeclLine = toUnsigned(DWARFDie.findRecursively(DW_AT_decl_line), 0);
This allows us to easily find attribute values in the current DIE only (the first new code above) or in any DW_AT_abstract_origin or DW_AT_specification Dies using the line above. Note that the code line length is shorter and more concise.
Differential Revision: https://reviews.llvm.org/D28581
llvm-svn: 291959
Now we only support returning Optional<> values and have changed all clients over to use Optional::getValueOr().
Differential Revision: https://reviews.llvm.org/D28569
llvm-svn: 291686
Decompressor intention is to reduce duplication of code.
Currently LLD has own implementation of decompressor
for compressed debug sections.
This class helps to avoid it and share the code.
LLD patch for reusing it is D28106
Differential revision: https://reviews.llvm.org/D28105
llvm-svn: 291675
Support for DW_FORM_implicit_const DWARFv5 feature.
When this form is used attribute value goes to .debug_abbrev section (as SLEB).
As this form would break any debug tool which doesn't support DWARFv5
it is guarded by dwarf version check. Attempt to use this form with
dwarf version <= 4 is considered a fatal error.
Differential Revision: https://reviews.llvm.org/D28456
llvm-svn: 291599
This patch adds support for YAML<->DWARF for debug_info sections.
This re-lands r290147, reverted in 290148, re-landed in r290204 after fixing the issue that caused bots to fail (thank you UBSan!), and reverted again in r290209 due to failures on big endian systems.
After adding support for preserving endianness, this should be good now.
llvm-svn: 290386
In order for the llvm DWARF parser to be used in LLDB we will need to be able to get the parent of a DIE. This patch adds that functionality by changing the DWARFDebugInfoEntry class to store a depth field instead of a sibling index. Using a depth field allows us to easily calculate the sibling and the parent without increasing the size of DWARFDebugInfoEntry.
I tested llvm-dsymutil on a debug version of clang where this fully parses DWARF in over 1200 .o files to verify there was no serious regression in performance.
Added a full suite of unit tests to test this functionality.
Differential Revision: https://reviews.llvm.org/D27995
llvm-svn: 290274
This patch adds support for YAML<->DWARF for debug_info sections.
This re-lands r290147, after fixing the issue that caused bots to fail (thank you UBSan!).
llvm-svn: 290204
DWARF 4 and later supports encoding the PC as an address or as as offset from the low PC. Clients using DWARFDie should be insulated from how to extract the high PC value. This function takes care of extracting the form value and looking for the correct form.
Differential Revision: https://reviews.llvm.org/D27885
llvm-svn: 290131
Patch implements parser of pubnames/pubtypes tables instead of static
function used before. It is now should be possible to reuse it
in LLD or other projects and clean up the duplication code.
Differential revision: https://reviews.llvm.org/D27851
llvm-svn: 290040