When reading location lists in dwo files the addresses cannot be
resolved, but that's not a problem.
Long term this probably should be fixed with a different API that
exposes location expressions without the need to resolve the address
ranges, since that's all the verifier (in its current state) requires.
(though the verifier should probably also eventually verify the address
ranges in location lists are a subset of the enclosing scope's address
range)
When verifying dwo files address ranges won't be able to be resolved due
to missing debug_addr (or missing debug_ranges in the case of DWARFv4
Split DWARF).
Since they refer to the debug_line in the skeleton unit, they can't be
resolved from the dwo CU.
But they can be resolved for split TUs, since those refer to
.debug_line.dwo, which is available in the dwo file.
Add a test to `llvm-dwarfdump` to simply test that the error messages
make sense when passing bad `.dSYM`s.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D115889
This produced a few verifier warnings that came up while I was
investigating something else here. Change the assembly to match the
comment so it's warning free. Doesn't seem necessary to change the
CHECKs for the test since it's just a bug in the test, not in the code
under test.
Avoid duplicating the string decoding - improve the error messages down
in form parsing (& produce an Expected<const char*> instead of
Optional<const char*> to communicate the extra error details)
Seems helpful when you're dealing with invalid/problematic DWARF. Some
diagnostic messages are probably redundant with the verbose dumping and
could be simplified with this.
Introduced/discussed in https://reviews.llvm.org/D38719
The header validation logic was also explicitly building the DWARFUnits
to validate. But then other calls, like "Units.getUnitForOffset" creates
the DWARFUnits again in the DWARFContext proper - so, let's avoid
creating the DWARFUnits twice by walking the DWARFContext's units rather
than building a new list explicitly.
This does reduce some verifier power - it means that any unit with a
header parsing failure won't get further validation, whereas the
verifier-created units were getting some further validation despite
invalid headers. I don't think this is a great loss/seems "right" in
some ways to me that if the header's invalid we should stop there.
Exposing the raw DWARFUnitVectors from DWARFContext feels a bit
sub-optimal, but gave simple access to the getUnitForOffset to keep the
rest of the code fairly similar.
With link-time optimizations enabled, resulting DWARF mayend up containing
cross CU references (through the DW_AT_abstract_origin attribute).
Consider the following example:
// sum.c
__attribute__((always_inline)) int sum(int a, int b)
{
return a + b;
}
// main.c
extern int sum(int, int);
int main()
{
int a = 5, b = 10, c = sum(a, b);
return 0;
}
Compiled as follows:
$ clang -g -flto -fuse-ld=lld main.c sum.c -o main
Results in the following DWARF:
-- sum.c CU: abstract instance tree
...
0x000000b0: DW_TAG_subprogram
DW_AT_name ("sum")
DW_AT_decl_file ("sum.c")
DW_AT_decl_line (1)
DW_AT_prototyped (true)
DW_AT_type (0x000000d3 "int")
DW_AT_external (true)
DW_AT_inline (DW_INL_inlined)
0x000000bc: DW_TAG_formal_parameter
DW_AT_name ("a")
DW_AT_decl_file ("sum.c")
DW_AT_decl_line (1)
DW_AT_type (0x000000d3 "int")
0x000000c7: DW_TAG_formal_parameter
DW_AT_name ("b")
DW_AT_decl_file ("sum.c")
DW_AT_decl_line (1)
DW_AT_type (0x000000d3 "int")
...
-- main.c CU: concrete inlined instance tree
...
0x0000006d: DW_TAG_inlined_subroutine
DW_AT_abstract_origin (0x00000000000000b0 "sum")
DW_AT_low_pc (0x00000000002016ef)
DW_AT_high_pc (0x00000000002016f1)
DW_AT_call_file ("main.c")
DW_AT_call_line (5)
DW_AT_call_column (0x19)
0x00000081: DW_TAG_formal_parameter
DW_AT_location (DW_OP_reg0 RAX)
DW_AT_abstract_origin (0x00000000000000bc "a")
0x00000088: DW_TAG_formal_parameter
DW_AT_location (DW_OP_reg2 RCX)
DW_AT_abstract_origin (0x00000000000000c7 "b")
...
Note that each entry within the concrete inlined instance tree in
the main.c CU has a DW_AT_abstract_origin attribute which
refers to a corresponding entry within the abstract instance
tree in the sum.c CU.
llvm-dwarfdump --statistics did not properly report
DW_TAG_formal_parameters/DW_TAG_variables from concrete inlined
instance trees which had 0% location coverage and which
referred to a different CU, mainly because information about abstract
instance trees and their parameters/variables was stored
locally - just for the currently processed CU,
rather than globally - for all CUs.
In particular, if the concrete inlined instance tree from
the example above was to look like this
(i.e. parameter b has 0% location coverage, hence why it's missing):
0x0000006d: DW_TAG_inlined_subroutine
DW_AT_abstract_origin (0x00000000000000b0 "sum")
DW_AT_low_pc (0x00000000002016ef)
DW_AT_high_pc (0x00000000002016f1)
DW_AT_call_file ("main.c")
DW_AT_call_line (5)
DW_AT_call_column (0x19)
0x00000081: DW_TAG_formal_parameter
DW_AT_location (DW_OP_reg0 RAX)
DW_AT_abstract_origin (0x00000000000000bc "a")
llvm-dwarfdump --statistics would have not reported b as such.
Patch by Dimitrije Milosevic.
Differential revision: https://reviews.llvm.org/D113465
It is often useful to know which die is the parent of the current die.
This patch adds information about parent offset into the dump:
0x0000000b: DW_TAG_compile_unit
DW_AT_producer ("by_hand")
0x00000014: DW_TAG_base_type (0x0000000b) <<<<<<<<<<<<<<
DW_AT_name ("int")
Now it is easy to see which die is the parent of the current die.
This patch makes that behaviour to be default.
We can make it to be opt-in if neccessary.
This functionality differs from already existed "--show-parents"
in that sence that parent information is shown for all dies and
only link to the immediate parent is shown.
Differential Revision: https://reviews.llvm.org/D113406
Matching a recent clang change I've made, now 'int[3]' is formatted
without the space between the type and array bound. This commit updates
libDebugInfoDWARF/llvm-dwarfdump to match that formatting.
Actually we can, for now, remove the explicit "operator" handling
entirely - since clang currently won't try to flag any of these as
rebuildable. That seems like a reasonable state for now, but it could be
narrowed down to only apply to conversion operators, most likely - but
would need more nuance for op> and op>> since they would be incorrectly
flagged as already having their template arguments (due to the trailing
'>').
Functions in different sections (common in object files - inline
functions, -ffunction-sections, etc) can't overlap, so factor in the
section when diagnosing overlapping address ranges.
This removes a major false-positive when running llvm-dwarfdump on
unlinked code.
Some dwarf loaders in LLVM are hard-coded to only accept 4-byte and 8-byte address sizes. This patch generalizes acceptance into `DWARFContext::isAddressSizeSupported` and provides a common way to generate rejection errors.
The MSP430 target has been given new tests to cover dwarf loading cases that previously failed due to 2-byte addresses.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111953
Clang will encode names that should be able to be simplified as
"_STNname|<template, args>" (eg: "_STNt1|<int>") - this verification
mode will detect these names, decode them, create the original name
("t1<int>") and the simple name ("t1") - letting the simple name run
through the usual rebuilding logic - then compare the two sources of the
full name - the rebuilt and the _STN encoding.
This helps ensure that -gsimple-template-names is lossless.
This should probably be rendered as "std::nullptr_t" but for now clang
uses the unqualified name (which is ambiguous with possible user defined
name in the global namespace), so match that here.
Move most type tests to a pre-generated assembly file to make it easier
to add more weird cases without having to hand craft more DWARF.
Move the novel array types that aren't reachable via clang-generated
DWARF to a separate file for easy maintenance.
This does add some extra superfluous whitespace (eg: "int *") intended
to make the Simplified Template Names work easier - this makes the
DIE-based names match more exactly the clang-generated names, so it's
easier to identify cases that don't generate matching names.
(arguably we could change clang to skip that whitespace or add some
fuzzy matching to accommodate differences in certain whitespace - but
this seemed easier and fairly low-impact)
verifyDieRanges function checks for the intersected address ranges.
It adds child DieRangeInfo into parent DieRangeInfo to check
whether children have overlapping address ranges. It is safe to not add
DieRangeInfo with empty address range into parent's children list.
This decreases the number of children which should be navigated and as a result
decreases execution time(parents having a lot of children with empty ranges
spend much time navigating them). For this command: "llvm-dwarfdump --verify clang-repl"
execution time decreased from 220 sec till 75 sec.
Differential Revision: https://reviews.llvm.org/D107554
This ensures that debug_types references aren't looked for in
debug_info section.
Behavior is still going to be questionable in an unlinked object file -
since cross-cu references could refer to symbols in another .debug_info
(or, in theory, .debug_types) chunk - but if a producer only uses
ref_addr to refer to things within the same .debug_info chunk in an
object file (eg: whole program optimization/LTO - producing two CUs into
a single .debug_info section in an object file - the ref_addrs there
could be resolved relative to that .debug_info chunk, not needing to
consider comdat (DWARFv5 type units or other creatures) chunks of
.debug_info, etc)
Improves maintainability (edit/modify the tests without recompiling) and
error messages (previously the failure would be a gtest failure
mentioning nothing of the input or desired text) and the option to
improve tests with more checks.
(maybe these tests shouldn't all be in separate files - we could
probably have DWARF yaml that contains multiple errors while still being
fairly maintainable - the various invalid offsets (ref_addr, rnglists,
ranges, etc) could probably be all in one test, but for the simple sake
of the migration I just did the mechanical thing here)
list for attributes that don't have the loclist class.
Summary: The overflow error occurs when we try to dump
location list for those attributes that do not have the
loclist class, like DW_AT_count and DW_AT_byte_size.
After re-reviewed the entire list, I sorted those
attributes into two parts, one for dumping location list
and one for dumping the location expression.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D105613
This call would incorrectly overwrite (with the .debug_rnglists.dwo from
the executable, if there was one) the rnglists section instead of the
correct value (from the .debug_rnglists.dwo in the .dwo file) that's
applied in DWARFUnit::tryExtractDIEsIfNeeded
llvm-dwarfdump was silent even when the format of DWARF was invalid
and/or llvm-dwarfdump did not understand/support some of the constructs.
This can be pretty confusing as llvm-dwarfdump is a tool for DWARF
producers+consumers development.
Review comments also by @dblaikie.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D104271