3d1d8c767b made
DWARFExpression::iterator's Operation member `mutable`. After a few prep
commits, the iterator can instead be made a `const` iterator since no
caller can change the Operation.
Differential Revision: https://reviews.llvm.org/D113958
The only caller of Operation::verify() is DWARFExpression::verify(),
which iterates past the (ephemeral) Operation immediately after.
- Stop setting Operation::Error because the mutation will never be
observed.
- Change verify() to a static function to be sure all callers are
updated.
Differential Revision: https://reviews.llvm.org/D113957
Add `const` to DWARFExpression::Operation's accessors and make
Operation::extract() private, since it's only used by the friend class
DWARFExpression::iterator.
It is often useful to know which die is the parent of the current die.
This patch adds information about parent offset into the dump:
0x0000000b: DW_TAG_compile_unit
DW_AT_producer ("by_hand")
0x00000014: DW_TAG_base_type (0x0000000b) <<<<<<<<<<<<<<
DW_AT_name ("int")
Now it is easy to see which die is the parent of the current die.
This patch makes that behaviour to be default.
We can make it to be opt-in if neccessary.
This functionality differs from already existed "--show-parents"
in that sence that parent information is shown for all dies and
only link to the immediate parent is shown.
Differential Revision: https://reviews.llvm.org/D113406
Matching a recent clang change I've made, now 'int[3]' is formatted
without the space between the type and array bound. This commit updates
libDebugInfoDWARF/llvm-dwarfdump to match that formatting.
Actually we can, for now, remove the explicit "operator" handling
entirely - since clang currently won't try to flag any of these as
rebuildable. That seems like a reasonable state for now, but it could be
narrowed down to only apply to conversion operators, most likely - but
would need more nuance for op> and op>> since they would be incorrectly
flagged as already having their template arguments (due to the trailing
'>').
Specifically in DWARFv5 the unit for the line table entry was correct
but the context was incorrect - leading to looking up .debug_line_str in
the dwp instead of the executable.
(perhaps we could/should remove the context pointer entirely, and rely
on the one in the unit... I might try that as a separate follow-up
commit)
Some dwarf loaders in LLVM are hard-coded to only accept 4-byte and 8-byte address sizes. This patch generalizes acceptance into `DWARFContext::isAddressSizeSupported` and provides a common way to generate rejection errors.
The MSP430 target has been given new tests to cover dwarf loading cases that previously failed due to 2-byte addresses.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111953
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
This patch implements suggestion done while reviewing D102634. It adds two fields:
ParentIdx and SiblingIdx. These fields allow fast navigation to die parent and
die sibling. These fields are set at the moment when dies are loaded.
dsymutil works 2% faster with this patch(run on clang binary).
Differential Revision: https://reviews.llvm.org/D110363
Clang will encode names that should be able to be simplified as
"_STNname|<template, args>" (eg: "_STNt1|<int>") - this verification
mode will detect these names, decode them, create the original name
("t1<int>") and the simple name ("t1") - letting the simple name run
through the usual rebuilding logic - then compare the two sources of the
full name - the rebuilt and the _STN encoding.
This helps ensure that -gsimple-template-names is lossless.
This should probably be rendered as "std::nullptr_t" but for now clang
uses the unqualified name (which is ambiguous with possible user defined
name in the global namespace), so match that here.
Laying more foundation for full template name rebuilding - more complex
type printing benefits from an object to carry some state rather than
passing it around as parameters to every function.
DWARFUnit::clearDIEs() uses std::vector::shrink_to_fit() to make
capacity of DieArray matched with its size(). The shrink_to_fit()
is not binding request to make capacity match with size().
Thus the memory could still be reserved after DWARFUnit::clearDIEs()
is called. This patch erases capacity when DWARFUnit::clearDIEs() is requested.
So the memory occupied by dies would be freed.
Differential Revision: https://reviews.llvm.org/D109499
This does add some extra superfluous whitespace (eg: "int *") intended
to make the Simplified Template Names work easier - this makes the
DIE-based names match more exactly the clang-generated names, so it's
easier to identify cases that don't generate matching names.
(arguably we could change clang to skip that whitespace or add some
fuzzy matching to accommodate differences in certain whitespace - but
this seemed easier and fairly low-impact)
This is used by BOLT to do patching of DebugInfo section, and Line Table. Directly by using find, and through getAttrFieldOffsetForUnit.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D107874
DWARFDie::getDeclFile(...) previously only supported getting the DW_AT_decl_file if the DIE itself contained the DW_AT_decl_file attribute, or if the DIE had a DW_AT_abstract_origin that pointed to another DIE that had a DW_AT_decl_file. This patch allows the function to get the right attribute value if there is a DW_AT_specification that points to another DIE. We also test that if a DW_AT_abtract_origin or DW_AT_specification points to a DIE in another CU with a DW_FORM_ref_addr, that the right line table is used to extract the file index.
Full tests were added for the following cases:
- DIE has a DW_AT_decl_file attribute
- DIE has a DW_AT_abtract_origin that points to another die in the same CU
- DIE has a DW_AT_abtract_origin that points to another die in another CU
- DIE has a DW_AT_specification that points to another die in the same CU
- DIE has a DW_AT_specification that points to another die in another CU
Differential Revision: https://reviews.llvm.org/D108480
verifyDieRanges function checks for the intersected address ranges.
It adds child DieRangeInfo into parent DieRangeInfo to check
whether children have overlapping address ranges. It is safe to not add
DieRangeInfo with empty address range into parent's children list.
This decreases the number of children which should be navigated and as a result
decreases execution time(parents having a lot of children with empty ranges
spend much time navigating them). For this command: "llvm-dwarfdump --verify clang-repl"
execution time decreased from 220 sec till 75 sec.
Differential Revision: https://reviews.llvm.org/D107554
This ensures that debug_types references aren't looked for in
debug_info section.
Behavior is still going to be questionable in an unlinked object file -
since cross-cu references could refer to symbols in another .debug_info
(or, in theory, .debug_types) chunk - but if a producer only uses
ref_addr to refer to things within the same .debug_info chunk in an
object file (eg: whole program optimization/LTO - producing two CUs into
a single .debug_info section in an object file - the ref_addrs there
could be resolved relative to that .debug_info chunk, not needing to
consider comdat (DWARFv5 type units or other creatures) chunks of
.debug_info, etc)
When we build with split dwarf in single mode the .o files that contain both "normal" debug sections and dwo sections, along with relocaiton sections for "normal" debug sections.
When we create DWARF context in DWARFObjInMemory we process relocations and store them in the map for .debug_info, etc section.
For DWO Context we also do it for non dwo dwarf sections. Which I believe is not necessary. This leads to a lot of memory being wasted. We observed 70GB extra memory being used.
I went with context sensitive approach, flag is passed in. I am not sure if it's always safe not to process relocations for regular debug sections if Obj contains .dwo sections.
If it is alternatvie might be just to scan, in constructor, sections and if there are .dwo sections not to process regular debug ones.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D106624
list for attributes that don't have the loclist class.
Summary: The overflow error occurs when we try to dump
location list for those attributes that do not have the
loclist class, like DW_AT_count and DW_AT_byte_size.
After re-reviewed the entire list, I sorted those
attributes into two parts, one for dumping location list
and one for dumping the location expression.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D105613
This call would incorrectly overwrite (with the .debug_rnglists.dwo from
the executable, if there was one) the rnglists section instead of the
correct value (from the .debug_rnglists.dwo in the .dwo file) that's
applied in DWARFUnit::tryExtractDIEsIfNeeded
Originally committed as 04c203e310
Reverted in 768510632c due to the test
failing when encountering windows directory separators.
Fix the path separator platform issue with a FileCheck pattern {{[/\\]}}
Original commit message:
A followup to the feature added in 69da27c749
that added the optional "start file name" to match "start line" - but this
didn't work with Split DWARF because of the need for the decl file number
resolution code to refer back to the skeleton unit to find its .debug_line
contribution. So this patch adds the necessary infrastructure to track the
skeleton unit corresponding to a split full unit for the purpose of this
lookup.
A followup to the feature added in
69da27c749 that added the optional "start
file name" to match "start line" - but this didn't work with Split DWARF
because of the need for the decl file number resolution code to refer
back to the skeleton unit to find its .debug_line contribution. So this
patch adds the necessary infrastructure to track the skeleton unit
corresponding to a split full unit for the purpose of this lookup.
llvm-dwarfdump was silent even when the format of DWARF was invalid
and/or llvm-dwarfdump did not understand/support some of the constructs.
This can be pretty confusing as llvm-dwarfdump is a tool for DWARF
producers+consumers development.
Review comments also by @dblaikie.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D104271
getRelocatedSection interface should not check that the object file is
relocatable, as executable files may have relocations preserved with
`--emit-relocs` linker flag. The relocations are useful in context of post-link
binary analysis for function reference identification. For example, BOLT relies
on relocations to perform function reordering.
Reviewed By: MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D102296
In many cases it is helpful to know at what address the resolved function starts.
This patch adds a new StartAddress member to the DILineInfo structure.
Reviewed By: jhenderson, dblaikie
Differential Revision: https://reviews.llvm.org/D102316
UnwindTable::parseRows() may return successfully if the CFIProgram has either
no CFI instructions or only DW_CFA_nop instructions and the UnwindRow return
argument will be empty. But currently, the callers are not checking for this case
which is leading to incorrect dumps in the unwind tables in such cases i.e.
CFA=unspecified
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D101892
When DIE is extracted manually, the DieArray is empty. When dump is invoked on aforementioned DIE it tries to extract child, even if Dump options say otherwise. Resulting in crash.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D99698
In some cases a broken or invalid debug info could cause a crash in DWARFUnit::getInlinedChainForAddress during parsing a chain of in-lined functions. This patch fixes this issue.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D98119
`AttributeSpec` does not contain values while `DWARFAttribute` already
does. Therefore one no longer needs to pass `uint64_t *OffsetPtr`.
Differential Revision: https://reviews.llvm.org/D98194
D81469 introduced a check to error on CIE version different
than 1 for eh_frame, but older compilers mistakenly create binaries
with this version set to 3 for DWARF4 or 4 to DWARF5. Move the check
to dump time instead of eh_frame parse time, so we can be tolerant
with older binaries.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D97830
GCC warning:
```
/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp: In member function ‘llvm::Expected<long unsigned int> llvm::dwarf::CFIProgram::Instruction::getOperandAsUnsigned(const llvm::dwarf::CFIProgram&, uint32_t) const’:
/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp:425:1: warning: control reaches end of non-void function [-Wreturn-type]
425 | }
| ^
/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp: In member function ‘llvm::Expected<long int> llvm::dwarf::CFIProgram::Instruction::getOperandAsSigned(const llvm::dwarf::CFIProgram&, uint32_t) const’:
/llvm-project/llvm/lib/DebugInfo/DWARF/DWARFDebugFrame.cpp:477:1: warning: control reaches end of non-void function [-Wreturn-type]
477 | }
| ^
```
This patch adds the ability to evaluate the state machine for CIE and FDE unwind objects and produce a UnwindTable with all UnwindRow objects needed to unwind registers. It will also dump the UnwindTable for each CIE and FDE when dumping DWARF .debug_frame or .eh_frame sections in llvm-dwarfdump or llvm-objdump. This allows users to see what the unwind rows actually look like for a given CIE or FDE instead of just seeing a list of opcodes.
This patch adds new classes: UnwindLocation, RegisterLocations, UnwindRow, and UnwindTable.
UnwindLocation is a class that describes how to unwind a register or Call Frame Address (CFA).
RegisterLocations is a class that tracks registers and their UnwindLocations. It gets populated when parsing the DWARF call frame instruction opcodes for a unwind row. The registers are mapped from their register numbers to the UnwindLocation in a map.
UnwindRow contains the result of evaluating a row of DWARF call frame instructions for the CIE, or a row from a FDE. The CIE can produce a set of initial instructions that each FDE that points to that CIE will use as the seed for the state machine when parsing FDE opcodes. A UnwindRow for a CIE will not have a valid address, whille a UnwindRow for a FDE will have a valid address.
The UnwindTable is a class that contains a sorted (by address) vector of UnwindRow objects and is the result of parsing all opcodes in a CIE, or FDE. Parsing a CIE should produce a UnwindTable with a single row. Parsing a FDE will produce a UnwindTable with one or more UnwindRow objects where all UnwindRow objects have valid addresses. The rows in the UnwindTable will be sorted from lowest Address to highest after parsing the state machine, or an error will be returned if the table isn't sorted. To parse a UnwindTable clients can use the following methods:
static Expected<UnwindTable> UnwindTable::create(const CIE *Cie);
static Expected<UnwindTable> UnwindTable::create(const FDE *Fde);
A valid table will be returned if the DWARF call frame instruction opcodes have no encoding errors. There are a few things that can go wrong during the evaluation of the state machine and these create functions will catch and return them.
Differential Revision: https://reviews.llvm.org/D89845
This allows to reuse the RelocationResolver from the code
that doesn't want to deal with `RelocationRef` class.
I am going to use it in llvm-readobj. See the description
of D91530 for more details.
Differential revision: https://reviews.llvm.org/D91533
In the current state, if getFromHash(0) is called and there's no CU with
dwo_id=0, the lookup will stop at an empty slot, then the check
`Rows[H].getSignature() != S` won't cause the lookup to fail and return
a nullptr (as it should), because the empty slot has a 0 in the
signature field, and a pointer to the empty slot will be incorrectly
returned.
This patch fixes this by using the index field in the hash entry to
check for empty slots: signature = 0 can match a valid hash but
according to the spec the index for an occupied slot will always be
non-zero.
Differential Revision: https://reviews.llvm.org/D91670
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
There's no way to know whether there's a loclist contribution to parse
if there's no loclistx encoding - and if there is one, there's no need
to walk back from the loclist_base (or, uin the case of
info.dwo/loclist.dwo - starting at 0 in the contribution) to parse the
header, instead rely on the DWARF32/64 and address size in the CU
that's already available.
This would come up in split DWARF (non-split wouldn't try to read a
loclist header in the absence of a loclist_base) when one unit had
location lists and another does not (because the loclists.dwo section
would be non-empty in that case - in the case where it's empty the
parsing would silently skip).
Simplify the testing a bit, rather than needing a whole dwp, etc - by
creating a malformed loclists.dwo section (and use single file Split
DWARF) that would trip up any attempt to parse it - but no attempt
should be made.
Register context information was already being passed into the DWARFDebugFrame code that dumps unwind information but it wasn't being used. This change adds the ability to dump registers names of a valid MC register context was passed in and if it knows about the register. Updated the tests to use the newly returned register names.
Differential Revision: https://reviews.llvm.org/D88767
It's not possible to do this in complete generality - a CU using a
sec_offset DW_AT_ranges has no way of knowing where its rnglists
contribution starts, so should not attempt to parse any full rnglist
table/header to do so. And even using FORM_rnglistx there's no need to
parse the header - the offset can be computed using the CU's DWARF
format (32 or 64) to compute offset entry sizes, and then the list
parsed at that offset without ever trying to find a rnglist contribution
header immediately prior to the rnglists_base.
Since DWARFv5 places TUs in debug_info, some of DWARFContext's APIs have
become a bit erroneous, including TUs in the CU list by accident.
Correct that by providing compile_units (& dwo_compile_units) that
filter out the type units from the debug_info units.
Differential Revision: https://reviews.llvm.org/D87935
Flag DIEs that have DW_CHILDREN_yes set in their abbreviation but don't
actually have any children.
rdar://59809554
Differential revision: https://reviews.llvm.org/D88048
When concatenating directory with filename in getFilenameByIndex, we
might end up with a path that contains extra dots. For example, if the
input is /path and ./example, we would return /path/./example. Run
sys::path::remove_dots on the output to eliminate unnecessary dots.
Differential Revision: https://reviews.llvm.org/D87657
Since a function might have portions of its code coming from multiple
different files, "start line" is ambiguous (it can't just be resolved
relative to the file/line specified). Add start file to disambiguate it.
When llvm-dwarfdump encounters no null terminated strings, we should
warn user about it rather than ignore it and print nothing.
Before this patch, when llvm-dwarfdump dumps a .debug_str section whose
content is "abc", it prints:
```
.debug_str contents:
```
After this patch:
```
.debug_str contents:
warning: no null terminated string at offset 0x0
```
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D86998
This patch adds a helper function DumpStrSection to simplify codes.
Besides, nonprintable chars in debug_str and debug_str.dwo sections
are printed as escaped chars.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D86918
Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing
down memory usage of symbolizers that don't need to parse this data at
all. There's not much benefit to caching these anyway - since they are
O(1) lookup and reading once you know where the offset list starts (and
can do bounds checking with the offset list size too).
In general, I think it might be time to start paying down some of the
technical debt of loc/loclist/range/rnglist parsing to try to unify it a
bit more.
eg:
* Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection,
LocSectionBase, LocTable, RngListTable, LoclistTableHeader (be nice if
these were all wrapped up in two variables - one for loclists, one for
rnglists)
* rnglists and loclists are handled differently (see:
LoclistTableHeader, but no RnglistTableHeader)
* maybe all these types could be less stateful - lazily parse what they
need to, even reparsing rather than caching because it doesn't seem
too expensive, for instance. (though admittedly so long as it's
constantcost/overead per compilatiton that's probably adequate)
* Maybe implementing and using a DWARFDataExtractor that can be
sub-ranged (so we could slice it up to just the single contribution) -
though maybe that's not so useful because loc/ranges need to refer to
it by absolute, not contribution-relative mechanisms
Differential Revision: https://reviews.llvm.org/D86110
dumpStringOffsetsSection() expects the size of a contribution to be
correctly aligned. The patch adds the corresponding verifications for
pre-v5 cases.
Differential Revision: https://reviews.llvm.org/D85739
Note that DWARFUnit::getAbbreviations() returns nullptr if the
abbreviations could not be read, but callers used the returned
pointer without checking.
Differential Revision: https://reviews.llvm.org/D85738
Allow the GNU .debug_macro extension to be parsed and printed by
llvm-dwarfdump. In an upcoming patch support will be added for emitting
that format also.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D82974
Although the DWARF specification states that .debug_aranges entries
can't have length zero, these can occur in the wild. There's no
particular reason to enforce this part of the spec, since functionally
they have no impact. The patch removes the error and introduces a new
warning for premature terminator entries which does not stop parsing.
This is a relanding of cb3a598c87, adding the missing obj2yaml part
that was needed.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46805. See also
https://reviews.llvm.org/D71932 which originally introduced the error.
Reviewed by: ikudrin, dblaikie, Higuoxing
Differential Revision: https://reviews.llvm.org/D85313