Putting addresses in the address pool, even with non-fission, can reduce
relocations - reusing the addresses from debug_info and debug_rnglists
(the latter coming soon)
llvm-svn: 344834
Considers the index when extracting location lists from a .dwp file.
Majority of the patch by David Blaikie.
Reviewers: dblaikie
Differential revision: https://reviews.llvm.org/D53155
llvm-svn: 344807
NFC-ish (the parsing of the units is not a functional change - no
errors/warnings are emitted during the shallow parsing - though without
parsing them here, the "max version" would be wrong (still zero) later
on, so in those cases the units do need to be parsed)
llvm-svn: 343884
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.
Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:
https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf
This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.
Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.
rdar://42001377
Differential Revision: https://reviews.llvm.org/D49887
llvm-svn: 343883
- Add fix so that all code paths that create DWARFContext
with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method
llvm-svn: 343317
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.
> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
> state the is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
> i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
> (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
> https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136
llvm-svn: 343103
Functions that have signed return addresses need additional dwarf support:
- After signing the LR, and before authenticating it, the LR register is in a
state the is unusable by a debugger or unwinder
- To account for this a new directive, .cfi_negate_ra_state, is added
- This directive says the signed state of the LR register has now changed,
i.e. unsigned -> signed or signed -> unsigned
- This directive has the same CFA code as the SPARC directive GNU_window_save
(0x2d), adding a macro to account for multiply defined codes
- This patch matches the gcc implementation of this support:
https://patchwork.ozlabs.org/patch/800271/
Differential Revision: https://reviews.llvm.org/D50136
llvm-svn: 343089
This extends the verifier to catch three new errors:
* Missing DW_AT_type attributes for DW_TAG_formal_parameter,
DW_TAG_variable and DW_TAG_array_type.
* Valid references for DW_AT_type pointing to a non-type tag.
Differential revision: https://reviews.llvm.org/D52223
llvm-svn: 342713
Verify that DW_AT_specification and DW_AT_abstract_origin reference a
DIE with a compatible tag.
Differential revision: https://reviews.llvm.org/D38719
llvm-svn: 342712
It's pretty common for the verifier to dump the relevant DIE when it
finds an issue. This tends to be relatively verbose and error prone
because we have to pass the DIDumpOptions to the DIE's dump method. This
patch adds a helper function to the verifier to make this easier.
llvm-svn: 342526
Eliminating some duplication of rangelist dumping code at the expense of
some version-dependent code in dump and extract routines.
Reviewer: dblaikie, JDevlieghere, vleschuk
Differential revision: https://reviews.llvm.org/D51081
llvm-svn: 342048
With the merge of TUs and CUs into a single container, some code that
relied on the CU range having an ordered range of contiguous addresses
(for locating a CU at a given offset) broke. But the units from
debug_info (currently only CUs, but CUs and TUs in DWARFv5) are in a
contiguous sub-range of that container - searching only through that
subrange is still valid & so do that.
llvm-svn: 341889
The -diff option makes it easy to diff dwarf by hiding addresses and
offsets. However not all of them were hidden, which should be fixed by
this patch.
Differential revision: https://reviews.llvm.org/D51593
llvm-svn: 341377
According to the standard, for the .debug_names (the "dwarf accelerator
tables"):
> If a subprogram or inlined subroutine is included, and has a
> DW_AT_linkage_name attribute, there will be an additional index entry
> for the linkage name.
For Swift we generate DW_structure_types with a linkage name and the
verifier was incorrectly rejecting this. This patch fixes that by only
considering the linkage name in those particular cases. The test is the
"reduced" debug info of the failing swift test on swift.org.
Differential revision: https://reviews.llvm.org/D51420
llvm-svn: 341311
Both DWARFDebugLine and DWARFDebugAddr used the same callback mechanism
for handling recoverable errors. They both implemented similar warn() function
to be used as such callbacks.
In this revision we get rid of code duplication and move this warn() function
to DWARFContext as DWARFContext::dumpWarning().
Reviewers: lhames, jhenderson, aprantl, probinson, dblaikie, JDevlieghere
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D51033
llvm-svn: 340528
DWARF-related classes in lib/DebugInfo/DWARF contained
duplicating code for creating StringError instances, like:
template <typename... Ts>
static Error createError(char const *Fmt, const Ts &... Vals) {
std::string Buffer;
raw_string_ostream Stream(Buffer);
Stream << format(Fmt, Vals...);
return make_error<StringError>(Stream.str(), inconvertibleErrorCode());
}
Similar function was placed in Support lib in https://reviews.llvm.org/D49824
This revision makes DWARF classes use this function
instead of their local implementation of it.
Reviewers: aprantl, dblaikie, probinson, wolfgangp, JDevlieghere, jhenderson
Reviewed By: JDevlieghere, jhenderson
Differential Revision: https://reviews.llvm.org/D49964
llvm-svn: 340163
We don't expect module names to be present in the index. This patch adds
DW_TAG_module to the blacklist.
Differential revision: https://reviews.llvm.org/D50237
llvm-svn: 338878
This is patch 4 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.
Patch 4 combines separate DWARFUnitVectors for compile and type units
into a single DWARFUnitVector that contains both. For now the
implementation distinguishes compile units from type units by putting
all compile units at the front of the vector, reflecting the DWARF v4
distinction between .debug_info and .debug_types sections. A future
patch will change this to allow the free mixing of unit kinds, as is
specified by DWARF v5.
Differential Revision: https://reviews.llvm.org/D49744
llvm-svn: 338633
This is patch 3 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.
Patch 3 simply renames DWARFUnitSection to DWARFUnitVector, as the
object-file section of a unit is nearly irrelevant now.
Differential Revision: https://reviews.llvm.org/D49743
llvm-svn: 338632
This is patch 2 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.
Patch 2 takes the existing std::deque<DWARFUnitSection> for type units
and makes it a simple DWARFUnitSection, simplifying the handling of
type units and making it more consistent with compile units.
Differential Revision: https://reviews.llvm.org/D49742
llvm-svn: 338629
This is patch 1 of 4 NFC refactorings to handle type units and compile
units more consistently and with less concern about the object-file
section that they came from.
Patch 1 replaces the templated DWARFUnitSection with a non-templated
version. That is, instead of being a SmallVector of pointers to a
specific unit kind, it is not a SmallVector of pointers to the base
class for both type and compile units. Virtual methods are magic.
Differential Revision: https://reviews.llvm.org/D49741
llvm-svn: 338628
Summary: SmallVector's elements are moved when resizing and cause use-after-free.
Reviewers: probinson, dblaikie
Subscribers: JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D49702
llvm-svn: 337772
The intent is to use it for location list tables as well. Change is almost NFC with the exception
of the spelling of some strings used during dumping (all lowercase now).
Reviewer: JDevlieghere
Differential Revision: https://reviews.llvm.org/D49500
llvm-svn: 337763
For instance, When dumping .apple_types, the second atom represents the
DW_TAG. In addition to printing the raw value, we now also pretty print
the value if the ATOM tells us how.
llvm-svn: 337026
Make the DIE iterator bidirectional so we can move to the previous
sibling of a DIE.
Differential revision: https://reviews.llvm.org/D49173
llvm-svn: 336823
I don't think there's a need to use `const char *`. In most (probably all?)
cases, we need a length of a name later, so discarding a length will
lead to a wasted effort.
Differential Revision: https://reviews.llvm.org/D49046
llvm-svn: 336612
Summary:
If the encoding is not specified in CIE augmentation string, then it
should be DW_EH_PE_absptr instead of DW_EH_PE_omit.
Reviewers: ruiu, MaskRay, plotfi, rafauler
Reviewed By: MaskRay
Subscribers: rafauler, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D49000
llvm-svn: 336577
Errors found processing the DW_AT_ranges attribute are propagated by lower level
routines and reported by their callers.
Reviewer: JDevlieghere
Differential Revision: https://reviews.llvm.org/D48344
llvm-svn: 335188
Summary:
This method was not correct for entries in DWO files as it assumed it
could just add up the CU and DIE offsets to get the absolute DIE offset.
This is not correct for the DWO files, as here the CU offset will
reference the skeleton unit, whereas the DIE offset will be the offset
in the full unit in the DWO file.
Unfortunately, this means that we are not able to determine the absolute
DIE offset using the information in the .debug_names section alone,
which means we have to offload some of this work to the users of this
class.
To demonstrate how this can be done, I've added/fixed the ability to
lookup entries using accelerator tables in DWO files in llvm-dwarfdump.
To make this happen, I've needed to make two extra changes in other
classes:
- made the DWARFContext method to lookup a CU based on the section
offset public. I've needed this functionality to lookup a CU, and this
seems like a useful thing in general.
- made DWARFUnit::getDWOId call extractDIEsIfNeeded. Before this, the
DWOId was filled in only if the root DIE happened to be parsed
before we called the accessor. Since the lazy parsing is supposed to
happen under the hood, calling extractDIEsIfNeeded seems appropriate.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D48009
llvm-svn: 334578
Summary:
Back when we were introducing the DWARF v5 name index, there was a
short discussion whether we shouldn't have a nicer api for iterating
over the index. At that time, I did not find it necessary since the
iteration over names was done only from within the index itself (and I
figured the internal implementation can deal with a slightly rough
interface).
However, now I ran into a use for this kind of API in LLDB (for finding
all names matching a regular expression), so it looked like a nice
opportunity to introduce one. To make the API more useful, I've made the
NameTableEntry class a bit smarter: it now stores the string section
reference (so it can return its name) and its position in the name index
(mainly useful for dumping/logging).
I also convert the internal users to use the new API, which also gives
test coverage for the added code.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47590
llvm-svn: 333738
Summary:
Both (Apple and DWARF5) implementations of the iterators had bugs which
resulted in crashes if one attempted to iterate through the accelerator
tables all the way.
For the Apple tables, the issue was that we did not clear the DataOffset
field when we reached the end, which made our iterator compare unequal
to the "end" iterator. For the Dwarf5 tables, the problem was that we
incremented the CurrentIndex pointer and then used the incremented
(possibly invalid) pointer to check whether we have reached the end of
the index list.
The reason these bugs went undetected is because their only user
(dwarfdump) only ever searched for the first match. Besides allowing us
to test this fix, changing llvm-dwarfdump --find to display all matches
seems like a good improvement (it makes the behavior consistent with the
--name option), so I change llvm-dwarfdump to do that.
The existing tests would be sufficient to test this fix with the new
llvm-dwarfdump behavior, but I add a special test that demonstrates that
the tool indeed displays multiple results. The find.test test needed to
be tweaked a bit as the tool now does not print the ".debug_info
contents" header (also consistent with how --name works).
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D47543
llvm-svn: 333635
When requesting to dump both the parent chain and children, we used to
print the DIE more than once because we propagated the dump options to
the parent without clearing the respective flags. This commit fixes this
oversight and adds a test.
rdar://39415292
Differential revision: https://reviews.llvm.org/D47263
llvm-svn: 333350
When printing an error for an invalid address range in a DIE, we used to
print the child above the parent, which is counter intuitive. This patch
reverses the order and indents the child to mimic the way we print the
debug info section.
llvm-svn: 333006
In DWARF v5, the DWO ID is in the (split/skeleton) CU header, not an
attribute on the CU DIE.
This changes the size of those headers, so use the parsed size whenever
we have one, for simplicitly.
Differential Revision: https://reviews.llvm.org/D47158
llvm-svn: 333004
Rather than relying on the user to do the address calculating in
DW_AT_location we should just dump the absolute address.
rdar://problem/38513870
Differential revision: https://reviews.llvm.org/D47152
llvm-svn: 332873
Change the "recoverable" error callback to take an Error instaed of a
string.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D46831
llvm-svn: 332845
This is a resubmit of r331868 (D46583), which was reverted due to
failures on the PS4 bot.
These have been resolved with r332246/D46748.
llvm-svn: 332349
Extract information related to a "unit header" from DWARFUnit into a
new DWARFUnitHeader class, and add a DWARFUnit member for the header.
This is one step in the direction of allowing type units in the
.debug_info section for DWARF v5.
Differential Revision: https://reviews.llvm.org/D46707
llvm-svn: 332289
Summary:
If we are not emitting a linkage name in the .debug_info sections, we
should not add it into the index either. This makes sure our index is
consistent with the actual debug info.
I am also explicitly setting the --dwarf-linkage-names=All in the
name-collsions test as that one would now fail on targets where this
defaults to "Abstract" (in fact, it would have failed already if there
wasn't a bug in the DWARF verifier, which I fix as well).
Reviewers: probinson, aprantl, JDevlieghere
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46748
llvm-svn: 332246
length excluding the table header. Instead it must encode the contribution length minus the length
field itself.
Reviewer: JDevliegehere
Differential Revision: https://reviews.llvm.org/D45922
llvm-svn: 332030
The print format was causing at least 2 unit-test failures from r331971.
The signed/unsigned comparison warnings only appeared to affect two lines but
it was unclear whether it might just pop up on other lines, so I have been
explicit in all the literals in the tests.
There were other bot unit-test failures that I am still investigating.
llvm-svn: 331978
Reviewed by: dblaikie, JDevlieghere, espindola
Differential Revision: https://reviews.llvm.org/D44560
Summary:
The .debug_line parser previously reported errors by printing to stderr and
return false. This is not particularly helpful for clients of the library code,
as it prevents them from handling the errors in a manner based on the calling
context. This change switches to using llvm::Error and callbacks to indicate
what problems were detected during parsing, and has updated clients to handle
the errors in a location-specific manner. In general, this means that they
continue to do the same thing to external users. Below, I have outlined what
the known behaviour changes are, relating to this change.
There are two levels of "errors" in the new error mechanism, to broadly
distinguish between different fail states of the parser, since not every
failure will prevent parsing of the unit, or of subsequent unit. Malformed
table errors that prevent reading the remainder of the table (reported by
returning them) and other minor issues representing problems with parsing that
do not prevent attempting to continue reading the table (reported by calling a
specified callback funciton). The only example of this currently is when the
last sequence of a unit is unterminated. However, I think it would be good to
change the handling of unrecognised opcodes to report as minor issues as well,
rather than just printing to the stream if --verbose is used (this would be a
subsequent change however).
I have substantially extended the DwarfGenerator to be able to handle
custom-crafted .debug_line sections, allowing for comprehensive unit-testing
of the parser code. For now, I am just adding unit tests to cover the basic
error reporting, and positive cases, and do not currently intend to test every
part of the parser, although the framework should be sufficient to do so at a
later point.
Known behaviour changes:
- The dump function in DWARFContext now does not attempt to read subsequent
tables when searching for a specific offset, if the unit length field of a
table before the specified offset is a reserved value.
- getOrParseLineTable now returns a useful Error if an invalid offset is
encountered, rather than simply a nullptr.
- The parse functions no longer use `WithColor::warning` directly to report
errors, allowing LLD to call its own warning function.
- The existing parse error messages have been updated to not specifically
include "warning" in their message, allowing consumers to determine what
severity the problem is.
- If the line table version field appears to have a value less than 2, an
informative error is returned, instead of just false.
- If the line table unit length field uses a reserved value, an informative
error is returned, instead of just false.
- Dumping of .debug_line.dwo sections is now implemented the same as regular
.debug_line sections.
- Verbose dumping of .debug_line[.dwo] sections now prints the prologue, if
there is a prologue error, just like non-verbose dumping.
As a helper for the generator code, I have re-added emitInt64 to the
AsmPrinter code. This previously existed, but was removed way back in r100296,
presumably because it was dead at the time.
This change also requires a change to LLD, which will be committed separately.
llvm-svn: 331971
The new verifier check has found an error in the
debug-names-name-collisions.ll test on the PS4 bot:
error: Name Index @ 0x0: Entry @ 0xdc: mismatched Name of DIE @ 0x23: index - _ZN3foo3fooE; debug_info - foo.
Reverting while I investigate whether this is a bug in the verifier or
the generator.
This reverts commit r331868.
llvm-svn: 331869
Summary:
This patch implements a check which makes sure all entries required by
the DWARF v5 specification are present in the Name Index. The algorithm
tries to follow the wording of Section 6.1.1.1 of the spec as closely as
possible.
The main deviation from it is that instead of a whitelist-based approach
in the spec "The name index must contain an entry for each debugging
information entry that defines a named subprogram, label, variable,
type, or namespace" I chose a blacklist-based one, where I consider
everything to be "in" and then remove the entries that don't make sense.
I did this because it has more potential for catching interesting cases
and the above is a bit vague (it uses plain words like "variable" and
"subprogram", but the rest of the section speaks about specific TAGs).
This approach has raised some interesting questions, the main one being
whether enumerator values should be indexed. The consensus seems to be
that they should, although it does not follow from section 6.1.1.1.
For the time being I made the verifier ignore these, as LLVM does not do
this yet, and I wanted to get a clean run when verifying generated debug
info.
Another interesting case was the DW_TAG_imported_declaration. It was not
immediately clear to me whether this should go in or not, but currently
it is not indexed, and (unlike the enumerators) in does not seem to cause
problems for LLDB, so I've also ignored it.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46583
llvm-svn: 331868
LLVM always puts function definition DIEs at the top level, but under
some circumstances GCC does not (at least in this case with member
functions of a function-local type).
To ensure that doesn't appear as though the local type's member function
is unduly inlined within the outer function - ensure the inline
discovery DIE parent walk stops at the first DW_TAG_subprogram.
llvm-svn: 331291
This prevents infinite recursion in DWARFDie::findRecursively for
malformed DWARF where a DIE references itself.
This fixes PR36257.
Differential revision: https://reviews.llvm.org/D43092
llvm-svn: 331200
Updated two more debug line related warnings to use WithColor. This was
necessary to ensure consistent output order of the warnings on Windows
for debug line tests.
Differential Revision: https://reviews.llvm.org/D45871
llvm-svn: 330440
Summary:
This patch add checks to verify that the information in the name index
entries is consistent with the debug_info section. Specifically, we
check that entries point to valid DIEs, and their names, tags, and
compile units match the information in the debug_info sections.
These checks are only run if the previous checks did not find any errors
in the name index headers. Attempting to proceed with the checks anyway
would likely produce a lot of spurious errors and the verification code
would need to be very careful to avoid crashing.
I also add a couple of more checks to the abbreviation-validation code
to verify that some attributes are always present (an index without a
DW_IDX_die_offset attribute is fairly useless).
The entry verification works only on indexes without any type units - I
haven't attempted to extend it to type units, as we don't even have a
DWARF v5-compatible type unit generator at the moment.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45323
llvm-svn: 329392
Summary:
The positions of the DwarfVersion and AddressSize arguments were
reversed, which caused parsing for dwarf opcodes which contained
address-size-dependent operands (such as DW_OP_addr). Amusingly enough,
none of the address-size asserts fired, as dwarf version was always 4,
which is a valid address size.
I ran into this when constructing weird inputs for the DWARF verifier. I
I add a test case as hand-written dwarf -- I am not sure how to trigger
this differently, as having a DW_OP_addr inside a location list is a
fairly non-standard thing to do.
Fixing this error exposed a bug in the debug_loc.dwo parser, which was
always being constructed with an address size of 0. I fix that as well
by following the pattern in the non-dwo parser of picking up the address
size from the first compile unit (which is technically not correct, but
probably good enough in practice).
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45324
llvm-svn: 329381
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: echristo, zturner, samsonov
Reviewed By: echristo
Subscribers: JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D45134
llvm-svn: 328935
We should align the value of the field, not the overall section offset.
This distinction matters if one of the debug_names contributions is not
of size which is a multiple of four. The dwarf producers may choose to
emit rounded contributions, but they are not required to do so. In the
latter case, without this patch we would corrupt the parsing state, as
we would adjust the offset even if subsequent contributions contained
correctly rounded augmentation strings.
llvm-svn: 328796
Before this patch we were parsing the attributes as section offsets, as
that is what apple_names is doing. However, this is not correct as DWARF
v5 specifies that this attribute should use the Reference form class.
This also updates all the testcases (except the ones that deliberately
pass a different form) to use the correct form class.
llvm-svn: 328773
Summary:
This commit adds checks of the abbreviation table in a DWARF v5 Name
Index. The most interesting/useful check is the one which checks that
each index attributes is encoded using the correct form class, but it
also checks for the more obvious errors like unknown
forms/tags/attributes and duplicated attributes.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44736
llvm-svn: 328202
This is mostly just plumbing to get a DWARFDataExtractor where we
compute abbr_offset so we can use getRelocatedValue.
This is part of PR36793.
llvm-svn: 328154
Summary:
We have had at least three pieces of code (in DWARFAbbreviationDeclaration,
DWARFAcceleratorTable and DWARFDie) that have hand-rolled support for
dumping unknown dwarf enum values. While not terrible, they are a bit
distracting and enable small differences to creep in (Unknown_ffff vs.
Unknown_0xffff). I ended up needing to add a fourth place
(DWARFVerifier), so it seems it would be a good time to centralize.
This patch creates an alternative to the XXXString dumping functions in
the BinaryFormat library, which formats an unknown value as
DW_TYPE_unknown_1234, instead of just an empty string. It is based on
the formatv function, as that allows us to avoid materializing the
string for unknown values (and because this way I don't have to invent a
name for the new functions :P).
In this patch I add formatters for dwarf attributes, forms, tags, and
index attributes as these are the ones in use currently, but adding
other enums is straight-forward.
Reviewers: dblaikie, JDevlieghere, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44570
llvm-svn: 328090
Summary:
This patch adds more checks to the .debug_names validator. Specifically,
they check for:
- buckets claiming to be non-empty but pointing to mismatched hashes
(most consumers would interpret this as an empty bucket, but it
questionable whether the generator meant that)
- hashes that are not reachable from any bucket
- names with incorrect hashes
Together, these checks ensure that any name in the index can be reached
through the hash table using the regular lookup algorithm. We also warn
if we encounter a name index without a hash table.
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44433
llvm-svn: 327699
Summary:
This patch replaces the two switches which are deducing the size of
various forms with a single implementation. I have put the new
implementation into BinaryFormat, to avoid introducing dependencies
between the two independent libraries (DebugInfo and CodeGen) that need
this functionality.
Reviewers: aprantl, JDevlieghere, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44418
llvm-svn: 327486
Make sure that DWARF line information generated by Windows can be properly read by Posix OS and vice versa.
Differential Revision: https://reviews.llvm.org/D44290
llvm-svn: 327430
Summary:
Even though the getDIEOffset offset function was common for the two
accelerator table implementations, it was doing two different things:
for the Apple tables, it was returning the die offset relative to the
start of the section, whereas for DWARF v5 tables, it was relative to
the start of the CU.
I resolve this by renaming the function to getDIESectionOffset to make
it obvious what the function returns, and change the DWARF
implementation to return the section offset. I also keep the CU-relative
accessor, but only in the DWARF implementation (there is no way to get
this information for the Apple tables). This was not caught by existing
tests because the hand-written inputs also erroneously used section
offsets instead of CU-relative ones.
While looking at this, I noticed that the Apple implementation was not
fully correct either -- the header contains a DIEOffsetBase field, which
should be added to offsets encoded with the DW_FORM_ref*** family, but
this was not being used. This went unnoticed because all current writers
set this field to zero anyway. I fix this as well and add a hand-written
test which demonstrates the issue.
Reviewers: JDevlieghere, dblaikie
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D44202
llvm-svn: 327116
Move the DWARF syntax highlighting into support. This has several
advantages, most notably that this makes the WithColor RAII wrapper
available outside libDebugInfo. Furthermore, several projects all have
their own code for handling colored output. This provides a place to
centralize it.
Differential revision: https://reviews.llvm.org/D44215
llvm-svn: 327108
Adding verbose dumping to the recent implementation of dumping of v5 range list entries.
We're capturing the entries as is as they come in during extraction, including their file offset,
so we can dump them in more detail.
The offset table entries which are table-relative are shown as is (as in non-verbose mode)
and with the actual file offset they map to.
Reviewers: dblaikie, aprantl, jdevlieghere, jhenderson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43366
llvm-svn: 327059
Summary:
This patch adds basic .debug_names verification capabilities to the
DWARF verifier. Right now, it checks that the headers and abbreviation
tables of the individual name indexes can be parsed correctly, it
verifies the buckets table and the cross-checks the CU lists for
consistency. I intend to add further checks in follow-up patches.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: vleschuk, echristo, clayborg, llvm-commits
Differential Revision: https://reviews.llvm.org/D44211
llvm-svn: 327011
Whilst working on improvements to the error handling of the debug line
parsing code, I noticed that if an invalid offset were to be specified
in a call to getOrParseLineTable(), an entry in the LineTableMap would
still be created, even if the offset was not within the section range.
The immediate parsing attempt afterwards would fail (it would end up
getting a version of 0), and thereafter, any subsequent calls to
getOrParseLineTable or getLineTable would return the default-
constructed, invalid line table. In reality, we shouldn't even attempt
to parse this table, and we should always return a nullptr from these
two functions for this situation.
I have tested this via a unit test, which required some new framework
for unit testing debug line. My plan is to add quite a few more unit
tests for the new error reporting mechanism that will follow shortly,
hence the reason why the supporting code for the tests are written the
way they are - I intend to extend the DwarfGenerator class to support
generating debug line. At that point, I'll make sure that there are a
few positive test cases for this and the parsing code too.
Differential Revision: https://reviews.llvm.org/D44200
Reviewers: JDevlieghere, aprantl
llvm-svn: 326995
Summary:
Original change was D43313 (r326932) and reverted by r326953 because it
broke an LLD test and a windows build. The LLD test was already fixed in
lld commit r326944 (thanks maskray). This is the original change with
the windows build fixed.
llvm-svn: 326970
This patch enhances DWARFDebugFrame with the capability of parsing and
printing DWARF expressions in CFI instructions. It also makes FDEs and
CIEs accessible to lib users, so they can process them in client tools
that rely on LLVM. To make it self-contained with a test case, it
teaches llvm-readobj to be able to dump EH frames and checks they are
correct in a unit test. The llvm-readobj code is Maksim Panchenko's work
(maksfb).
Reviewers: JDevlieghere, espindola
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D43313
llvm-svn: 326932
Instead of only printing the CU-relative offset in non-verbose mode, it
makes more sense to only printed the resolved address. In verbose mode
we still print both.
Differential revision: https://reviews.llvm.org/D44148
rdar://33525475
llvm-svn: 326903
Summary:
This patch implements the name lookup functionality of the .debug_names
accelerator table and hooks it up to "llvm-dwarfdump -find". To make the
interface of the two kinds of accelerator tables more consistent, I've
created an abstract "DWARFAcceleratorTable::Entry" class, which provides
a consistent interface to access the common functionality of the table
entries (such as getting the die offset, die tag, etc.). I've also
modified the apple table to vend entries conforming to this interface.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: vleschuk, clayborg, echristo, llvm-commits
Differential Revision: https://reviews.llvm.org/D43067
llvm-svn: 326003
In DWARF v5 the Line Number Program Header is extensible, allowing values with
new content types. In this extension a content type is added,
DW_LNCT_LLVM_source, which contains the embedded source code of the file.
Add new optional attribute for !DIFile IR metadata called source which contains
source text. Use this to output the source to the DWARF line table of code
objects. Analogously extend METADATA_FILE in Bitcode and .file directive in ASM
to support optional source.
Teach llvm-dwarfdump and llvm-objdump about the new values. Update the output
format of llvm-dwarfdump to make room for the new attribute on file_names
entries, and support embedded sources for the -source option in llvm-objdump.
Differential Revision: https://reviews.llvm.org/D42765
llvm-svn: 325970
Verifying any DWARF file that is optimized and contains at least one tag
with a DW_AT_location with a location list offset as a
DW_AT_form_dataXXX results in dwarfdump spuriously claiming that the
location list is invalid.
Differential revision: https://reviews.llvm.org/D40199
llvm-svn: 325430
Seeing some inlining missing in internal uses of symbolizer. I'll work
on a reproduction, tests, improvements & recommit as soon as possible.
(Chandler would like it to be known that this improvement did make
check-llvm 4x faster... - so there's certainly some fairly good
motivation to push on fixing/figuring this out & getting it back in)
This reverts commit r321345.
llvm-svn: 324981
Emitting the correct (root of compilation) file at index 0 will be
posted for review later; I wanted to get this minor change out of the
way first.
llvm-svn: 324669
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.
The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself. This
matches the format used for dumping strings in the .debug_info
section.
Differential Revision: https://reviews.llvm.org/D42802
llvm-svn: 324270
This change adds support to llvm-dwarfdump for dumping DWARF5
.debug_rnglists sections in regular ELF files.
It is not complete, in that several DW_RLE_* encodings are currently
not supported, but does dump the headert and the basic ranges for
DW_RLE_start_length and DW_RLE_start_end encodings.
Obvious next steps are to add verbose dumping that dumps the raw
encodings, rather than the interpreted contents, to add -verify support
of the section (e.g. to show that the correct number of offsets are
specified), add dumping of .debug_rnglists.dwo, and to add support for
other encodings.
Reviewed by: dblaikie, JDevlieghere
Differential Revision: https://reviews.llvm.org/D42481
llvm-svn: 324077
r323476 added support for DW_FORM_line_strp, and incorrectly made that
depend on having a DWARFUnit available. We shouldn't be tracking
.debug_line_str in DWARFUnit after all. After this patch, I can do an
NFC follow up and undo a bunch of the "plumbing" part of r323476.
Differential Revision: https://reviews.llvm.org/D42609
llvm-svn: 323691
The test was failing because of an incorrect sizeof check in the name
index parsing code. This code was meant to check that we have enough
input to parse the fixed-size part of the dwarf header, which it did by
comparing the input to sizeof(Header). Originally struct Header only
contained the fixed-size part, but during review, we've moved additional
members into it, which rendered the sizeof check invalid.
I resolve this by moving the fixed-size part to a separate struct and
updating the sizeof-expression to use that.
llvm-svn: 323648
The call to ScopedPrinter::printNumber with size_t argument was
ambiguous (I think) on 32-bit builds. Explicitly cast to a 64-bit int to
avoid this.
llvm-svn: 323642
Summary:
This modifies the dwarfdump output to align it with the new .debug_names
dump. It also renames two header fields to match similar fields in the
dwarf5 header.
A couple of tests needed to be updated to match new output. The changes
were fairly straight-forward, although not really automatable.
Reviewers: JDevlieghere, aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42415
llvm-svn: 323641
Summary:
This commit renames DWARFAcceleratorTable to AppleAcceleratorTable to free up
the first name as an interface for the different accelerator tables.
Then I add a DWARFDebugNames class for the dwarf5 table.
Presently, the only common functionality of the two classes is the dump()
method, because this is the only method that was necessary to implement
dwarfdump -debug-names; and because the rest of the
AppleAcceleratorTable interface does not directly transfer to the dwarf5
tables (the main reason for that is that the present interface assumes
the tables are homogeneous, but the dwarf5 tables can have different
keys associated with each entry).
I expect to make the common interface richer as I add more functionality
to the new class (and invent a way to represent it in generic way).
In terms of sharing the implementation, I found the format of the two
tables sufficiently different to frustrate any attempts to have common
parsing or dumping code, so presently the implementations share just low
level code for formatting dwarf constants.
Reviewers: vleschuk, JDevlieghere, clayborg, aprantl, probinson, echristo, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42297
llvm-svn: 323638
This patch moves the DJB hash to support. This is consistent with other
hashing algorithms living there. The hash is used by the DWARF
accelerator tables. We're doing this now because the hashing function is
needed by dsymutil and we don't want to link against libBinaryFormat.
Differential revision: https://reviews.llvm.org/D42594
llvm-svn: 323616
Move standard forms from a switch statement to the table of forms;
fill in all the missing ones defined in DWARF v5. I'm guessing at
classifications in a couple of cases where v5 forms aren't actually
supported yet, but whoever adds support for the forms can fix the
classifications as needed.
llvm-svn: 323481
This form is like DW_FORM_strp, but points to .debug_line_str instead
of .debug_str as the string section. It's intended to be used from
the line-table header, and allows string-pooling of directory and
filenames across compilation units.
Differential Revision: https://reviews.llvm.org/D42553
llvm-svn: 323476
This frees up the first name to be used as an base class for the
apple table and the dwarf5 .debug_names accel table. The rename was
split off from D42297 (adding of debug_names support), which is still
under review.
llvm-svn: 323113
The compilation directory has always been #0, but as of DWARF v5 it is
explicitly listed in the line-table section instead of implicitly
being a reference to the compile_unit DIE's DW_AT_comp_dir attribute.
This means the dumper should number the dumped array starting with 0
or 1 depending on the DWARF version of the line table.
References in the generated DWARF are correct, it's just the dumper
that was wrong. Also some assembler-coded tests were similarly
confused about directory numbers.
llvm-svn: 322884
inlined subroutines for a given address.
This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.
This patch changes it two a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case where only a few
address get symbolized during the entire run.
It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.
Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping address past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]
All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.
Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.
Huge thanks to Dave Blaikie who helped walk me through what the various
things I needed to do in DWARF to make this work.
Differential Revision: https://reviews.llvm.org/D40987
llvm-svn: 321345
Reorganizes the DWARF consumer to derive the string offsets table
contribution's format from the contribution header instead of
(incorrectly) from the unit's format.
Reviewers: JDevliegehere, aprantl
Differential Revision: https://reviews.llvm.org/D41146
llvm-svn: 321295
Adds missing support for DW_FORM_data16.
Update of r320852/r320886, fixing the unittest again, this time use a
raw char string for the test data.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 321011
Adds missing support for DW_FORM_data16.
Update of r320852, fixing the unittest to use a hand-coded struct
instead of std::array to guarantee data layout.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 320886
This fixes a bug where the verifier was complaining about empty
accelerator tables. When the table is empty, its size is not a valid
offset as it points after the end of the section.
This patch also makes the extractor return llvm:Error instead of bool
for better error reporting in the verifier.
Differential revision: https://reviews.llvm.org/D41063
rdar://35932007
llvm-svn: 320399
The previous implementation would only look 1 DW_AT_specification or DW_AT_abstract_origin deep. This means DWARFDie::getName() would fail in certain cases. I ran into such a case while creating a tool that used the LLVM DWARF parser to generate a symbolication format so I have seen this in the wild.
Differential Revision: https://reviews.llvm.org/D40156
llvm-svn: 319104
DWARF4 relative DW_AT_high_pc values are now displayed as absolute
addresses. The relative value is only shown when explicitly dumping the
forms, i.e. in show-form or verbose mode.
```
DW_AT_low_pc (0x0000000000000049)
DW_AT_high_pc (0x00000019)
```
becomes
```
DW_AT_low_pc (0x0000000000000049)
DW_AT_high_pc (0x0000000000000062)
```
Differential revision: https://reviews.llvm.org/D40317
rdar://35416943
llvm-svn: 319044