This is a natural extension of the previous changes to use the Cursor
class independently in the standard and extended opcode paths, and in
turn allows delaying error handling until the entire line has been
printed in verbose mode, removing interleaved output in some cases.
Reviewed by: MaskRay, JDevlieghere
Differential Revision: https://reviews.llvm.org/D81562
Standard opcodes usually have ULEB128 arguments, so it is generally not
possible to recover from such errors. This patch causes the parser to
stop parsing the table in such situations.
Also don't emit the operands or add data to the table if there is an
error reading these opcodes.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D81470
Summary:
This makes the code easier to reason about, as it will behave the same
way regardless of whether there is any more data coming after the
presumed end of the prologue.
Reviewers: jhenderson, dblaikie, probinson, ikudrin
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77557
The verbose printing of unrecognised standard opcodes was broken in
multiple ways (additional blank lines, a closing parenthesis without
opening parenthesis and so on). This patch fixes it, and makes the
output more consistent with other opcodes.
The new line printing for debug line verbose output was inconsistent.
For new rows in the matrix, a blank line followed, whilst the
DW_LNS_copy opcode actually resulted in two blank lines. There was also
potential inconsistency in the blank lines at the end of the table. This
patch mostly resolves these issues - no blank lines appear in the output
except for a single line after the prologue and at table end to separate
it from any subsquent table, plus some instances after error messages.
Also add a unit test for verbose output to test the fine details of new
line placement and other aspects of verbose output.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D81102
Verbose and non-verbose parsing of .debug_line produced their output at
different points in the program. The most obvious impact of this was
that error messages were produced at different times, but it also
potentially reduced what clients could do by customising the stream or
warning/error handlers.
This change makes the two variants consistent by printing non-verbose
output inline, the same as verbose output.
Testing of the error messages has been modified to check the messages
always appear in the same location to illustrate the behaviour.
Reviewed by: JDevlieghere, dblaikie, MaskRay, labath
Differential Revision: https://reviews.llvm.org/D80989
The flushes previously existed to help ensure consistent error message
output when stdout and stderr were passed to the same location. This is
no longer necessary as errs() is now tied to outs().
Reviewed by: dblaikie, MaskRay, JDevlieghere, labath
Differential Revision: https://reviews.llvm.org/D80803
Previously, if an extended opcode was truncated, it would manifest as an
"unexpected line op length error" which wasn't quite accurate. This
change checks for errors any time data is read whilst parsing an
extended opcode, and reports any errors detected.
Reviewed by: MaskRay, labath, aprantl
Differential Revision: https://reviews.llvm.org/D80797
Like non-verbose output, so that it is easy to recognize the `Line,Column,File,ISA,Discriminator` column values.
Reviewed By: JDevlieghere, jhenderson
Differential Revision: https://reviews.llvm.org/D80874
Update for upstream comments. Improve test by writing all the debug
info by hand.
Reviewers: dblaikie, jhenderson
Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80168
This will ensure that nothing can ever start parsing data from a future
sequence and part-read data will be returned as 0 instead.
Reviewed by: aprantl, labath
Differential Revision: https://reviews.llvm.org/D80796
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.
Differential Revision: https://reviews.llvm.org/D80806
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.
Differential Revision: https://reviews.llvm.org/D80806
For most tables, we already use commas in headers. This set of patches
unifies dumping the remaining ones.
Differential Revision: https://reviews.llvm.org/D80806
This patch extends the parsing and dumping support of llvm-dwarfdump
for debug_macro.dwo section.
Following forms are supported:
- DW_MACRO_define
- DW_MACRO_undef
- DW_MACRO_start_file
- DW_MACRO_end_file
- DW_MACRO_define_strx
- DW_MACRO_undef_strx
- DW_MACRO_define_strp
- DW_MACRO_undef_strp
Reviewed by: ikudrin, dblaikie
Differential Revision: https://reviews.llvm.org/D78500
A CIE with the Length == 0 is a terminator:
https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html
And GNU objdump recognizes them and prints the following for such entries:
"00000000 ZERO terminator"
This patch teaches llvm-objdump to do the same. I had to update tests to use
"CHECK-NEXT" too.
(Note: it looks perhaps not right that printing is done inside the DebugInfo library,
I'd expect to see the change in the llvm-objdump's code somewhere instead,
but that is how it done atm).
Differential revision: https://reviews.llvm.org/D80476
I've noticed an issue with "Data.getRelocatedValue(...)" call.
it might silently ignore an error when a content is truncated.
That leads to an infinite loop in the code (e.g. llvm-readobj hangs).
After fixing the issue I've found that actually we always tried
to read past the end of a section, even when a content was valid.
It happened because the terminator CIE (a CIE with the length == 0)
was never handled. At first I've tried just to stop adding the terminator
entry (and return), but it does not seem to be correct, because tools like
llvm-objdump might want to print something for such entries
(see comments in the code and test cases).
This patch fixes issues mentioned, provides new test cases for
both llvm-readobj and lib/DebugInfo and adds FIXMEs to existent
test cases related.
Differential revision: https://reviews.llvm.org/D80299
Demangling Itanium symbols either consumes the whole input or fails,
but Microsoft symbols can be successfully demangled with just some
of the input.
Add an outparam that enables clients to know how much of the input was
consumed, and use this flag to give llvm-undname an opt-in warning
on partially consumed symbols.
Differential Revision: https://reviews.llvm.org/D80173
The patch changes dumping of offsets in .debug_str_offsets sections so
that they are printed as 16-digit hex values if the contribution is in
the DWARF64 format.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of unit_length, debug_info_offset, and
debug_info_length fields in headers in .debug_pubname and
.debug_pubtypes sections so that they are printed as 16-digit hex values
if the contribution is in the DWARF64 format. Dumping of offsets in the
tables is changed in the same way.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of a unit_length field and offsets in headers
in .debug_loclists and .debug_rnglists sections so that they are printed
as 16-digit hex values if the contribution is in the DWARF64 format.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of unit_length and header_length fields in
headers in .debug_line sections so that they are printed as 16-digit hex
values if the contribution is in the DWARF64 format.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of the unit_length field in a unit header so
that it is printed as a 16-digit hex value if the unit is in the DWARF64
format.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of DWARF form values which sizes depend on
the DWARF format so that they are printed as 16-digit hex values for
DWARF64.
Differential Revision: https://reviews.llvm.org/D79997
The patch changes dumping of unit_length and debug_info_offset fields in
an address range header so that they are printed as 16-digit hex values
if the contribution is in the DWARF64 format.
Differential Revision: https://reviews.llvm.org/D79997
Imagine we have a broken .eh_frame.
Below is a possible sample output of llvm-readelf:
```
...
entry 2 {
initial_location: 0x10f5
address: 0x2080
}
}
}
.eh_frame section at offset 0x2028 address 0x2028:
LLVM ERROR: Parsing entry instructions at 0 failed
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/umb/LLVM/LLVM/llvm-project/build/bin/llvm-readelf -a 1
#0 0x000055f4a2ff5a1a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/home/umb/LLVM/LLVM/llvm-project/build/bin/llvm-readelf+0x2b9a1a)
...
#15 0x00007fdae5dc209b __libc_start_main /build/glibc-B9XfQf/glibc-2.28/csu/../csu/libc-start.c:342:3
#16 0x000055f4a2db746a _start (/home/umb/LLVM/LLVM/llvm-project/build/bin/llvm-readelf+0x7b46a)
Aborted
```
I.e. it calls abort(), suggests to submit a bug report and exits with the code 134.
This patch changes the logic to propagate errors to callers.
This fixes the behavior for llvm-dwarfdump, llvm-readobj and other possible tools.
Differential revision: https://reviews.llvm.org/D79165
Summary: This implements searching for function symbols and public symbols by address.
More specifically,
-Implements NativeSession::findSymbolByAddress for function symbols and
public symbols. I think data symbols are also searched for, but isn't
implemented in this patch.
-Adds classes for NativeFunctionSymbol and NativePublicSymbol
-Adds a '-use-native-pdb-reader' option to llvm-symbolizer, for testing
purposes.
Reviewers: rnk, amccarth, labath
Subscribers: mgorny, hiraditya, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79269
This patch adds support for dumping DW_MACRO_define_strx,
DW_MACRO_undef_strx in llvm-dwarfdump. These forms are currently
supported only in debug_macro section.
Reviewed By: ikudrin, dblaikie
Differential Revision: https://reviews.llvm.org/D78736
It looks like that was an initial intention, but some code paths in
`DWARFExpression::Operation::extract()` did not initialize `EndOffset`
properly.
Differential Revision: https://reviews.llvm.org/D79622
Reduces time to link PGO instrumented net_unittets.exe by 11% (9.766s ->
8.672s, best of three). Reduces peak memory by 65.7MB (2142.71MB ->
2076.95MB).
Use a more compact struct, BulkPublic, for faster sorting. Sort in
parallel. Construct the hash buckets in parallel. Try to use one vector
to hold all the publics instead of copying them from one to another.
Allocate all the memory needed to serialize publics up front, and then
serialize them in place in parallel.
Reviewed By: aganea, hans
Differential Revision: https://reviews.llvm.org/D79467
With a fix to uninitialized EndOffset.
DW_OP_call_ref is the only operation that has an operand which depends
on the DWARF format. The patch fixes handling that operation in DWARF64
units.
Differential Revision: https://reviews.llvm.org/D79501
DW_OP_call_ref is the only operation that has an operand which depends
on the DWARF format. The patch fixes handling that operation in DWARF64
units.
Differential Revision: https://reviews.llvm.org/D79501
Summary:
Current implementation of DWARFDie::getName(DINameKind Kind) could
lead to double call to DWARFDie::find(DW_AT_name) in following
scenario:
getName(LinkageName);
getName(ShortName);
getName(LinkageName) calls find(DW_AT_name) if linkage name is not
found. Then, it is called again in getName(ShortName). This patch
alows to request LinkageName and ShortName separately
to avoid extra call to find(DW_AT_name).
It helps D74169 to parse clang debuginfo faster(~1%).
Reviewers: clayborg, dblaikie
Differential Revision: https://reviews.llvm.org/D79173
The number of public symbols is very large, and each deserialization
does a few heap allocations. The public symbols are serialized by the
linker, so we can assume they have the expected layout and use it
directly.
Saves O(#publics) temporary heap allocations and shrinks some data
structures.
This accounts for a large portion of the memory allocations in LLD.
This DebugSubsectionRecordBuilder object can be stored directly in
C13Builders, it mostly wraps other subsections.
Remove the container kind field from the object. It is always the same
for all elements in the vector, and we can pass it in during writing.
Summary:
In D77860, we have changed `getSymbolFlags()` return type to `Expected<uint32_t>`.
This change helps bubble the error further up the stack.
Reviewers: jhenderson, grimar, JDevlieghere, MaskRay
Reviewed By: jhenderson
Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79075
Summary:
Change std::vector to SmallVector to prevent re-allocations and to
have small pre-allocated storage.
Reviewers: clayborg, dblaikie
Differential Revision: https://reviews.llvm.org/D79123
We unconditionally compared the DW_AT_ranges offset to the length of the
.debug_ranges section. For DWARF5 we should look at the debug_rnglists
section instead.
Differential revision: https://reviews.llvm.org/D78971
We were passing the AppleObjCSection instead of the AddrSection. Maybe
the API changed and this remained unnoticed because the types are the
same, or maybe it's just a typo.
The sizes of offsets in the `.debug_str_offsets.dwo` section depend on
the format of compilation or type units referencing them: 4 bytes for
DWARF32 units and 8 bytes for DWARF64 ones. The fix uses parsed units
to determine the actual size of offsets in the corresponding part of
the `.debug_str_offsets.dwo` section.
Differential Revision: https://reviews.llvm.org/D78555
Summary: AttrIndex could be removed from DWARFAbbreviationDeclaration::getAttributeValue.
Reviewers: clayborg, dblaikie
Differential Revision: https://reviews.llvm.org/D78672
The method is called from only one place and the call is already guarded
by a condition which checks that IsDWO is false.
Differential Revision: https://reviews.llvm.org/D78482
the tests pass on Linux.
Summary:
This change implements readFromExe, and calculating VA and RVA, which
are some of the functionalities that will be used for native PDB reading
for llvm symbolizer.
bug: https://bugs.llvm.org/show_bug.cgi?id=41795
Summary:
This change implements readFromExe, and calculating VA and RVA, which
are some of the functionalities that will be used for native PDB reading
for llvm symbolizer.
bug: https://bugs.llvm.org/show_bug.cgi?id=41795
Reviewers: hans, amccarth, rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78128
Summary:
Without this we could silently accept an invalid prologue because the
default DataExtractor behavior is to return an empty string when
reaching the end of file. And empty string is also used to terminate
these lists.
This makes the parsing code slightly more complicated, but this
complexity will go away once the parser starts working with truncating
data extractors. The reason I am doing it this way is because without
this, the truncation would regress the quality of error messages (right
now, we produce bad error messages only near EOF, but truncation would
make everything behave as if it was near EOF).
Reviewers: dblaikie, probinson, jhenderson
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77555
Summary:
If we have an (invalid) relocation which relocates bytes which partially
lie outside the range of the relocated section, the getRelocatedValue
would return confusing results. It would first read zero (because that's
what the underlying DataExtractor api does for out-of-bounds reads), and
then relocate that zero anyway.
A more appropriate behavior is to return zero straight away. This is
what this patch does.
Reviewers: dblaikie, jhenderson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78113
Originally committed as 416fa7720e
Reverted (due to buildbot failure - breaking lldb) in 7a45aeacf3.
I still can't seem to build lldb locally, but Pavel Labath has kindly
provided a potential fix to preserve the old behavior in lldb by
registering a simple recoverable error handler there that prints to the
desired stream in lldb, rather than stderr.
GCC emits this new form along with others forms(supported in llvm-dwardump)
and since it's support was missing in llvm-dwarfdump, it was not
able to correctly dump the content a debug_macro section for GCC
generated binaries.
This patch extends llvm-dwarfdump to support this form,
now GCC generated debug_macro section can be correctly dumped
using llvm-dwarfdump.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D78006
Summary:
Without that we could be silently reading zeroes, as that's the default
DataExtractor behavior. The entire parse would still most likely fail,
but it would do that with a seemingly unrelated/nonsensical error
message.
Reviewers: dblaikie, probinson, jhenderson
Subscribers: hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77554
This probably isn't ideal - the error was being printed specifically
inline with the dumping that was more legible - but then the error
wasn't reported to stderr and didn't produce a non-zero exit code.
Probably the error message could be improved by adding more context now
that it isn't printed in-situ of the DIE dumping as much.
Makes it easier to test "this doesn't produce an error" (& indeed makes
that the implied default so we don't accidentally write tests that have
silent/sneaky errors as well as the positive behavior we're testing for)
Though the support for applying relocations is patchy enough that a
bunch of tests treat lack of relocation application as more of a warning
than an error - so rather than me trying to figure out how to add
support for a bunch of relocation types, let's degrade that to a warning
to match the usage (& indeed, it's sort of more of a tool warning anyway
- it's not that the DWARF is wrong, just that the tool can't fully cope
with it - and it's not like the tool won't dump the DWARF, it just won't
follow/render certain relocations - I guess in the most general case it
might try to render an unrelocated value & instead render something
bogus... but mostly seems to be about interesting relocations used in
eh_frame (& honestly it might be nice if we were lazier about doing this
relocation resolution anyway - if you're not dumping eh_frame, should we
really be erroring about the relocations in it?))
Summary:
Although the function had a bool return value, it was always returning
true. Presumably this is because the main type of errors one can
encounter here is running off the end of the stream, and until very
recently, the DataExtractor class made it very difficult to detect that.
The situation has changed now, and we can easily detect errors here,
which this patch does.
Reviewers: dblaikie, aprantl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77308
In DWARFv5, type units are stored in .debug_info sections, along with
compilation units, and they are distinguished by the unit_type field
in the header, not by the name of the section. It is impossible to
associate the correct index section of a DWP file with the unit before
the unit's header is read. This patch fixes reading DWARFv5 type units
by parsing the header first and then applying the index entry according
to the actual unit type.
Differential Revision: https://reviews.llvm.org/D77552
In package files, the base offset provided by index sections should be
used to find the contribution of a unit. The patch adds that base
offset when reading range list tables.
Differential revision: https://reviews.llvm.org/D77401
This fixes the reading of location lists headers for compilation units
in package files by adjusting the reading offset according to the
corresponding record in the unit index. This is required for
DW_FORM_loclistx to work.
Differential revision: https://reviews.llvm.org/D77146
Without the patch, all version 5 compile units in a DWP file read
location tables from the beginning of a .debug_loclists.dwo section.
The patch fixes that by adjusting the reading offset the same way as
for pre-v5 units. The section identifier to find the contribution
entry corresponds to the version of the unit.
Differential revision: https://reviews.llvm.org/D77145
DWARFv5 defines index sections in package files in a slightly different
way than the pre-standard GNU proposal, see Section 7.3.5 in the DWARF
standard and https://gcc.gnu.org/wiki/DebugFissionDWP for GNU proposal.
The main concern here is values for section identifiers, which are
partially overlapped with changed meanings. The patch adds support for
v5 index sections and resolves that difficulty by defining a set of
identifiers for internal use which can represent and distinct values
of both standards.
Differential Revision: https://reviews.llvm.org/D75929
This is a preparation for an upcoming patch which adds support for
DWARFv5 unit index sections. The patch adds tag "_EXT_" to identifiers
which reference sections that are deprecated in the DWARFv5 standard.
See D75929 for the discussion.
Differential Revision: https://reviews.llvm.org/D77141
The old name was a bit misleading because the functions actually return
contributions to the corresponding sections.
Differential revision: https://reviews.llvm.org/D77302
Summary:
This patch adds parsing and dumping DWARFv5 .debug_macro section in llvm-dwarfdump,
it does not introduce any new switch. Existing switch "--debug-macro"
should be used to dump macinfo or macro section.
Reviewed By: dblaikie, ikudrin, jhenderson
Differential Revision: https://reviews.llvm.org/D73086
Summary:
The directory_count and file_name_count fields are (section 6.2.4 of
DWARF5 spec) supposed to be uleb128s, not bytes. This bug meant that it
was not possible to correctly parse headers with more than 128 files or
directories.
I've found this bug by code inspection, though the limit is so small
someone would have run into it for real sooner or later. I've verified
that the producer side handles many files correctly, and that we are
able to parse such files after this fix.
Reviewers: dblaikie, jhenderson
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76498
Summary:
1. FileLineInfoSpecifier::Default isn't the default for anything.
Rename to RawValue, which accurately reflects its role.
2. Most functions that take a part of a FileLineInfoSpecifier end up
constructing a full one later or plumb two values through. Make them
all just take a complete FileLineInfoSpecifier.
3. Printing basenames only was handled differently from all other
variants, make it parallel to all the other variants.
Reviewers: jhenderson
Subscribers: hiraditya, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76394
This adds the --debug-vars option to llvm-objdump, which prints
locations (registers/memory) of source-level variables alongside the
disassembly based on DWARF info. A vertical line is printed for each
live-range, with a label at the top giving the variable name and
location, and the position and length of the line indicating the program
counter range in which it is valid.
Currently, this only works for object files, not executables or shared
libraries.
Differential revision: https://reviews.llvm.org/D70720
Summary:
This is a preparatory change for allowing LLVM to emit DW_OP_convert
operations converting to the generic type.
If DW_OP_convert's operand is 0, it converts the top of stack to the
generic type, as specified by DWARFv5 section 2.5.1.6:
"[...] takes one operand, which is an unsigned LEB128 integer that
represents the offset of a debugging information entry in the current
compilation unit, or value 0 which represents the generic type."
This adds support for such operations to llvm-dwarfdump.
Reviewers: aprantl, markus, jdoerfert, jhenderson
Reviewed By: aprantl
Subscribers: hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D76141
When emitting PDBs, the TypeStreamMerger class is used to merge .debug$T records from the input .OBJ files into the output .PDB stream.
Records in .OBJs are not required to be aligned on 4-bytes, and "The Netwide Assembler 2.14" generates non-aligned records.
When compiling with -DLLVM_ENABLE_ASSERTIONS=ON, an assert was triggered in MergingTypeTableBuilder when non-ghash merging was used.
With ghash merging there was no assert.
As a result, LLD could potentially generate a non-aligned TPI stream.
We now align records on 4-bytes when record indices are remapped, in TypeStreamMerger::remapIndices().
Differential Revision: https://reviews.llvm.org/D75081
If the minimum_instruction_length of a debug line program is 0, no
address advancing via special opcodes, DW_LNS_const_add_pc, and
DW_LNS_advance_pc can occur, since the minimum_instruction_length is
used in a multiplication. This patch adds a warning reporting when this
issue occurs.
Reviewed by: probinson
Differential Revision: https://reviews.llvm.org/D75189
The line_range value of a debug line program header is used in divisions
related to special opcodes and DW_LNS_const_add_pc opcodes. As such, a
value of 0 cannot be used. This change introduces a new warning, if such
a situation is identified, and does not perform the relevant
calculations.
Reviewed by: probinson, aprantl
Differential Revision: https://reviews.llvm.org/D43470
This patch adds a check which reports an unsupported value of the
maximum_operations_per_instruction field in a debug line table header.
This is reported once per line table, at most, and only if the tablet
would otherwise need to use it (i.e. never for tables with version 3 or
less, or for tables which don't use DW_LNS_const_add_pc or special
opcodes). Unsupported values are currently any apart from 1.
Reviewed by: probinson, MaskRay
Differential Revision: https://reviews.llvm.org/D74819
This is a follow-up for D75609. As @dblaikie suggested, it prints
the actual number for an unknown section identifier when dumping
unit index sections.
Differential Revision: https://reviews.llvm.org/D75668
This fixes printing long values that might reside in CIE and FDE,
including offsets, lengths, and addresses.
Differential Revision: https://reviews.llvm.org/D73887
The condition was not accurate enough and could interpret some FDEs in
.eh_frame or 64-bit DWARF .debug_frame sections as CIEs. Even though
such FDEs are unlikely in a normal situation, the wrong interpretation
could hide an issue in a buggy generator.
Differential Revision: https://reviews.llvm.org/D73886
A DWARFSectionKind is read from input. It is not validated on parsing,
so an unexpected value may result in reaching llvm_unreachable() in
DWARFUnitIndex::getColumnHeader() when dumping the index section.
Differential Revision: https://reviews.llvm.org/D75609
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Fixed memory sanitizer bot bugs as well.
Differential Revision: https://reviews.llvm.org/D75390
Summary:
getInitialLength is a *DWARF*DataExtractor method so I had to "upgrade"
some DataExtractors to be able to make use of it.
Reviewers: ikudrin, jhenderson, probinson
Subscribers: aprantl, hiraditya, llvm-commits, dblaikie
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75535
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Also fixed an issue where strings for files in the file table could be added in opposite order due to parameters to function calls not having a strong ordering, which caused tests to fail. Added new arch specfic directories so when targets are not enabled, we continue to function just fine.
Differential Revision: https://reviews.llvm.org/D75390
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Also fixed an issue where strings for files in the file table could be added in opposite order due to parameters to function calls not having a strong ordering, which caused tests to fail.
Differential Revision: https://reviews.llvm.org/D75390
Summary:
In this patch I've done a slightly bigger rewrite to also remove the
hardcoded header lengths.
Reviewers: jhenderson, dblaikie, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75119
Summary:
This could be considered obvious, but I am putting it up to illustrate
the usefulness/impact of the getInitialLength change.
Reviewers: dblaikie, jhenderson, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75117
The integrity checks for index entries in DWARFUnitHeader::extract()
might cause the function to return before checking the state of an
Error object, which leads to a crash in runtime. The patch fixes the
issue by moving the checks in a safe place.
Differential Revision: https://reviews.llvm.org/D75177
Summary:
Error reporting in DebugInfoDWARF library currently done in three ways :
1. Direct calls to WithColor::error()/WithColor::warning()
2. ErrorPolicy defaultErrorHandler(Error E);
3. void dumpWarning(Error Warning);
additionally, other locations could have more variations:
lld/ELF/SyntheticSection.cpp
if (Error e = cu->tryExtractDIEsIfNeeded(false)) {
error(toString(sec) + ": " + toString(std::move(e)));
DebugInfo/DWARF/DWARFUnit.cpp
if (Error e = tryExtractDIEsIfNeeded(CUDieOnly))
WithColor::error() << toString(std::move(e));
Thus error reporting could look inconsistent. To have a consistent error
messages it is necessary to have a possibility to redefine error
reporting functions. This patch creates two handlers and allows to
redefine them. It also patches all places inside DebugInfoDWARF
to use these handlers.
The intention is always to use following handlers for error reporting
purposes inside DebugInfoDWARF:
DebugInfo/DWARF/DWARFContext.h
std::function<void(Error E)> RecoverableErrorHandler = WithColor::defaultErrorHandler;
std::function<void(Error E)> WarningHandler = WithColor::defaultWarningHandler;
This is last patch from series of patches: D74481, D74635, D75118.
Reviewers: jhenderson, dblaikie, probinson, aprantl, JDevlieghere
Reviewed By: jhenderson
Subscribers: grimar, hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D74308
Summary:
Current LLVM code base does not use error handler with ErrorPolicy.
This patch removes ErrorPolicy from DWARFContext.
This patch is extracted from the D74308.
Reviewers: jhenderson, dblaikie, grimar, aprantl, JDevlieghere
Reviewed By: grimar
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D75118
Summary:
This patch introduces a function to house the code needed to do the
DWARF64 detection dance. The function decodes the initial length field
and returns it as a pair containing the actual length, and the DWARF
encoding.
This patch does _not_ attempt to handle the problem of detecting lengths
which extend past the size of the section, or cases when reads of a
single contribution accidentally escape beyond its specified length, but
I think it's useful in its own right.
Reviewers: dblaikie, jhenderson, ikudrin
Subscribers: hiraditya, probinson, aprantl, JDevlieghere, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74560
The patch was reverted in 69da40033 because of test failures on windows.
The problem was the unpredictable order of some of the error messages,
which I've tried to strenghten in that patch.
It turns out this is not possible to do in verbose mode because there
the data is being writted as it is being parsed. No amount of flushing
(as I've done in the non-verbose mode) will help that. Indeed, even
without any buffering the warning messages can end in the middle of a
line in non-verbose mode.
In this patch, I have reverted the changes which tested the relative
position of the warning message, except for the messages about
unsupported initial length, which are the ones I really wanted to test,
and which do come out reasonably.
The original commit message was:
This patch if motivated by D74560, specifically the subthread about what
to print upon encountering reserved initial length values.
If the debug_line prologue has an unsupported version, we skip parsing
the rest of the data. If we encounter an reserved initial length field,
we don't even parse the version. However, we still print out all members
(with value 0) in the dump function.
This patch introduces early exits in the Prologue::dump function so that
we print only the fields that were parsed successfully. In case of an
unsupported version, we skip printing all subsequent prologue fields --
because we don't even know if this version has those fields. In case of a
reserved unit length, we don't print anything -- if the very first field
of the prologue is invalid, it's hard to say if we even have a prologue
to begin with.
Note that the user will still be able to see the invalid/reserved
initial length value in the error message. I've modified (reordered)
debug_line_invalid.test to show that the error message comes straight
after the debug_line offset. I've also added some flush() calls to the
dumping code to ensure this is the case in all situations (without that,
the warnings could get out of sync if the output was not a terminal -- I
guess this is why std::iostreams have the tie() function).
Reviewers: jhenderson, ikudrin, dblaikie
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75043
Summary:
This patch creates the llvm-gsymutil binary that can convert object files to GSYM using the --convert <path> option. It can also dump and lookup addresses within GSYM files that have been saved to disk.
To dump a file:
llvm-gsymutil /path/to/a.gsym
To perform address lookups, like with atos, on GSYM files:
llvm-gsymutil --address 0x1000 --address 0x1100 /path/to/a.gsym
To convert a mach-o or ELF file, including any DWARF debug info contained within the object files:
llvm-gsymutil --convert /path/to/a.out --out-file /path/to/a.out.gsym
Conversion highlights:
- convert DWARF debug info in mach-o or ELF files to GSYM
- convert symbols in symbol table to GSYM and don't convert symbols that overlap with DWARF debug info
- extract UUID from object files
- extract .text (read + execute) section address ranges and filter out any DWARF or symbols that don't fall in those ranges.
- if .text sections are extracted, and if the last gsym::FunctionInfo object has no size, cap the size to the end of the section the function was contained in
Dumping GSYM files will dump all sections of the GSYM file in textual format.
Reviewers: labath, aadsm, serhiy.redko, jankratochvil, xiaobai, wallace, aprantl, JDevlieghere, jdoerfert
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74883
Summary:
This patch if motivated by D74560, specifically the subthread about what
to print upon encountering reserved initial length values.
If the debug_line prologue has an unsupported version, we skip parsing
the rest of the data. If we encounter an reserved initial length field,
we don't even parse the version. However, we still print out all members
(with value 0) in the dump function.
This patch introduces early exits in the Prologue::dump function so that
we print only the fields that were parsed successfully. In case of an
unsupported version, we skip printing all subsequent prologue fields --
because we don't even know if this version has those fields. In case of a
reserved unit length, we don't print anything -- if the very first field
of the prologue is invalid, it's hard to say if we even have a prologue
to begin with.
Note that the user will still be able to see the invalid/reserved
initial length value in the error message. I've modified (reordered)
debug_line_invalid.test to show that the error message comes straight
after the debug_line offset. I've also added some flush() calls to the
dumping code to ensure this is the case in all situations (without that,
the warnings could get out of sync if the output was not a terminal -- I
guess this is why std::iostreams have the tie() function).
Reviewers: jhenderson, ikudrin, dblaikie
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75043
While the value of the CIE pointer field in a DWARF FDE record is
an offset to the corresponding CIE record from the beginning of
the section, for EH FDE records it is relative to the current offset.
Previously, we did not make that distinction when dumped both kinds
of FDE records and just printed the same value for the CIE pointer
field and the CIE offset; that was acceptable for DWARF FDEs but was
wrong for EH FDEs.
This patch fixes the issue by explicitly printing the offset of the
linked CIE object.
Differential Revision: https://reviews.llvm.org/D74613
macro section dumping.
Summary: Previously macinfo infrastructure was using functions
names that were ambiguous i.e `getMacro/getMacroDWO` in a sense
of conveying stated intentions. This patch refactored them into more
reasonable `getDebugMacinfo/getDebugMacinfoDWO` names thus making
room for macro implementation.
Reviewers: aprantl, probinson, jini.susan.george, dblaikie
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D75037
The CIE pointer field of an FDE record contains an offset to
a corresponding CIE record. In object files, this value comes with
relocation because the value has to be fixed when a linker combines
the final section from multiple sources. In most object files there is
only one CIE record at offset 0 of the .debug_frame section, so reading
a relocated or a raw value makes no difference. However, in partially
linked object files there are multiple CIE records and the relocations
should be applied to recover the right offset value.
Differential Revision: https://reviews.llvm.org/D74612
Summary:
The Offset provides the offset within the function in a SourceLocation struct. This allows us to show the byte offset within a function. We also track offsets within inline functions as well. Updated the lookup tests to verify the offset for functions and inline functions.
0x1000: main + 32 @ /tmp/main.cpp:45
Reviewers: labath, aadsm, serhiy.redko, jankratochvil, xiaobai, wallace, aprantl, JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74680
Summary:
This patch is extracted from D74308.
It patches all usages of WithColor::error() and WithColor::warning
in DebugInfoDWARF library.
Depends on D74481
Reviewers: jhenderson, dblaikie, probinson, aprantl, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74635
Summary:
this review is extracted from D74308.
It creates two error handlers which allow to redefine error
reporting routine and should be used for all places
where errors are reported:
std::function<void(Error)> RecoverableErrorHandler = defaultErrorHandler;
std::function<void(Error)> WarningHandler = defaultWarningHandler;
It also creates accessors to above handlers which should be used to
report errors.
function_ref<void(Error)> getRecoverableErrorHandler() {
return RecoverableErrorHandler;
}
function_ref<void(Error)> getWarningHandler() { return WarningHandler; }
It patches all error reporting places inside DWARFContext and DWARLinker.
Reviewers: jhenderson, dblaikie, probinson, aprantl, JDevlieghere
Reviewed By: jhenderson, JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D74481
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
Prior to this patch, if a DW_LNE_set_address opcode was parsed with an
address size (i.e. with a length after the opcode) of anything other 1,
2, 4, or 8, an llvm_unreachable would be hit, as the data extractor does
not support other values. This patch introduces a new error check that
verifies the address size is one of the supported sizes, in common with
other places within the DWARF parsing.
This patch also fixes calculation of a generated line table's size in
unit tests. One of the tests in this patch highlighted a bug introduced
in 1271cde474, when non-byte operands were used as arguments for
extended or standard opcodes.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D73962
Summary:
The DWARF transformer is added as a class so it can be unit tested fully.
The DWARF is converted to GSYM format and handles many special cases for functions:
- omit functions in compile units with 4 byte addresses whose address is UINT32_MAX (dead stripped)
- omit functions in compile units with 8 byte addresses whose address is UINT64_MAX (dead stripped)
- omit any functions whose high PC is <= low PC (dead stripped)
- StringTable builder doesn't copy strings, so we need to make backing copies of strings but only when needed. Many strings come from sections in object files and won't need to have backing copies, but some do.
- When a function doesn't have a mangled name, store the fully qualified name by creating a string by traversing the parent decl context DIEs and then. If we don't do this, we end up having cases where some function might appear in the GSYM as "erase" instead of "std::vector<int>::erase".
- omit any functions whose address isn't in the optional TextRanges member variable of DwarfTransformer. This allows object file to register address ranges that are known valid code ranges and can help omit functions that should have been dead stripped, but just had their low PC values set to zero. In this case we have many functions that all appear at address zero and can omit these functions by making sure they fall into good address ranges on the object file. Many compilers do this when the DWARF has a DW_AT_low_pc with a DW_FORM_addr, and a DW_AT_high_pc with a DW_FORM_data4 as the offset from the low PC. In this case the linker can't write the same address to both the high and low PC since there is only a relocation for the DW_AT_low_pc, so many linkers tend to just zero it out.
Reviewers: aprantl, dblaikie, probinson
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74450
We do not keep the actual value of the CIE ID field, because it is
predefined, and use a constant when dumping a CIE record. The issue
was that the predefined value is different for .debug_frame and
.eh_frame sections, but we always printed the one which corresponds
to .debug_frame. The patch fixes that by choosing an appropriate
constant to print.
See the following for more information about .eh_frame sections:
https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html
Differential Revision: https://reviews.llvm.org/D73627
The DWARFv2-4 specification for the line table header states that the
include directories and file name tables both end with a single null
byte. Prior to this change, the parser did not detect if this byte was
missing, because it also stopped reading the tables once it reached the
prologue end, as claimed by the header_length field. This change adds a
check that the terminator has been seen at the end of each table.
Reviewed by: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D74413
The number of standard opcodes is defined to be opcode_base - 1, so a
value of 0 for the opcode_base caused a crash as an attempt was made to
reserve many entries in a vector. This change fixes the crash, by
issuing a warning and skipping reading of standard opcode lengths in the
event of an opcode_base of 0.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D74309
Also remove some test duplication and add a test case that shows the
maximum version is rejected (this also shows that the value in the error
message is actually in decimal, and not just missing an 0x prefix).
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D74403
The patch removes unnecessary members of DWARFDebugAddr and further
simplifies the implementation by separating parsing methods of tables
in the DWARFv5 and pre-standard formats.
Differential Revision: https://reviews.llvm.org/D74197
As a preparation for the subsequent patches, this updates the wordings
of some error messages in DWARFDebugAddr.
Differential Revision: https://reviews.llvm.org/D74196
This replaces a collocation "a .debug_addr table" with "an address table"
because the latter sounds more accurate.
Differential Revision: https://reviews.llvm.org/D74407
As there is no header in pre-DWARFv5 address tables, and we fill
the class data members with some artificial values, we should not
dump them as that might be misleading.
Differential Revision: https://reviews.llvm.org/D74195
As addresses in the address tables may have relocations, thus,
the relocations should be resolved to read the correct address.
That is especially important for targets that use RELA relocations
because in that case addends are stored in relocation sections.
Differential Revision: https://reviews.llvm.org/D74404
Summary:
Dwarf stores source-file names the three parts:
<compilation_directory><include_directory><filename>
Prior to this change, the code only allowed retrieving either all
three as the absolute path, or just the filename. But many
compile-command lines--especially those in hermetic build systems
don't specify an absolute path, nor just the filename, but rather the
path relative to the compilation directory. This features allows
retrieving them in that style.
Add tests for path printing styles.
Modify createBasicPrologue to handle include directories.
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73383
Summary:
That patch is extracted from https://reviews.llvm.org/D74308.
Currently there are two patterns to name error handling functions:
using "Callback" and "Handler". This patch uses "Handler" for all
usage places.
Reviewers: jhenderson, dblaikie, probinson, aprantl
Reviewed By: jhenderson, dblaikie
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D74354
If a debug line section with version of greater than 5 is encountered,
prior to this change the parser would accept it and treat it as version
5. This might work to some extent, but then it might not at all, as it
really depends on the format of the unspecified future version, which
will be different (otherwise there would be no point in changing the
version number). Any information we could provide has a good chance of
being invalid, so we should just refuse to parse such tables.
Reviewed by: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D74204
If dumping an Split DWARF file that hasn't been split into separate
files (such as from llc - that includes the plain and .dwo sections in
the same file) allow both macinfo and macinfo.dwo sections to be dumped.
The function a) returned 32-bits when in DWARF64, the PrologueLength
field is 64-bits in size, and b) didn't work for DWARF version 5.
Also deleted some related dead code. With this deletion, getLength is
itself dead, but another change is about to make use of it.
Reviewed by: probinson
Differential Revision: https://reviews.llvm.org/D73626
Summary:
gnu addr2line prints DWARF line table discriminators like so:
<file>:<line> (discriminator <Number>)
This matches that behavior.
Document how and when --output-style=GNU prints discriminators
Add test for new GNU-style discriminator printing.
Reviewers: rupprecht, labath, jhenderson
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73318
Summary:
Add test case for the same. This test case will also serve as a
starting point for later symbolizer tests.
Reviewers: dblaikie, jdoerfert
Subscribers: hiraditya, llvm-commits, jhenderson
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73583
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "assume
stated length is correct" is taken which means the offset might need
adjusting.
This is a relanding of b94191fe, fixing an LLD test and the LLDB build.
Reviewed by: dblaikie, labath
Differential Revision: https://reviews.llvm.org/D72158
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Many of the debug line prologue errors are not inherently fatal. In most
cases, we can make reasonable assumptions and carry on. This patch does
exactly that. In the case of length problems, the approach of "the
claimed length is correct" is taken to be consistent with other
instances such as the SectionParser, which ignores the read length.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72158
It is possible to try to keep parsing a debug line program even when the
length of an extended opcode does not match what is expected for that
opcode. This patch changes what was previously a fatal error to be
non-fatal. The parser now continues by assuming the the claimed length
is correct, even if it means moving the offset backwards.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72155
The Version was used only to determine the size of an operand of
DW_OP_call_ref. The size was 4 for all versions apart from 2, but
the DW_OP_call_ref operation was introduced only in DWARF3. Thus,
the code may be simplified and using of Version may be eliminated.
Differential Revision: https://reviews.llvm.org/D73264
As DataExtractor already has a method to extract an unsigned value of
a specified size, there is no need to duplicate that.
Differential Revision: https://reviews.llvm.org/D73263
The padding field is reserved for DWARF and does not contain any useful
information. No need to read, store and report it.
Differential Revision: https://reviews.llvm.org/D73042
This structure was used to get the size of the fixed-size part of a Name
Index header for 32-bit DWARF. It is unsuitable for 64-bit DWARF because
the size of the unit length field is different.
Differential Revision: https://reviews.llvm.org/D73040
This helps to detect and report parsing errors better.
The patch follows the ideas of LLDB's patches D59370 and D59381.
It adds tests for valid and some invalid cases. More checks and
tests to come. Note that the patch fixes validation of the Length
field because the value does not include the field itself.
The existing users are updated to show the error messages.
Differential Revision: https://reviews.llvm.org/D71875
Summary:
This patch implements `formatv()` formatting for `dwarf::LineNumberOps`
and makes use of it for the `llvm-dwarfdump --debug-line` dump.
Previously, unknown line number standard opcodes would lead to undefined
behaviour. The code would attempt to format the data pointer of an empty
`StringRef` (a null pointer) using `%s`. According to the description
for `format()`, use of that interface carries the "risk of `printf`".
Passing a null pointer in place of an array to a C library function
results in undefined behaviour.
Reviewers: jhenderson, daltenty, stevewan
Reviewed By: jhenderson
Subscribers: aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72369
Reasonable assumptions can be made when a parsed address length does not
match the expected length, so there's no need for this to be fatal.
Reviewed by: ikudrin
Differential Revision: https://reviews.llvm.org/D72154
Unlike most of our errors in the debug line parser, the "no end of
sequence" message was missing any reference to which line table it
refererred to. This change adds the offset to this message.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D72443
The previous message mentioned DW_LLE_offset_pair, but this is
incorrect/confusing because we can get this message even with DWARF4
(which does not use DW_LLE encodings). This happens because DWARF<=4
location entries are "upgraded" to DWARF v5 during parsing.
The new error message refrains from referencing specific constants.
Fixes pr44482.
If the claimed unit length of a debug line program is such that the line
table would finish past the end of the .debug_line section, an infinite
loop occurs because the data extractor will continue to "read" zeroes
without changing the offset. This previously didn't hit an error because
the line table program handles a series of zeroes as a bad extended
opcode.
This patch fixes the inifinite loop and adds a warning if the program
doesn't fit in the available data.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D72279
When getting the file name form the line table prologue we assume that a
valid string form value can always be extracted as a string. If you look
at the implementation of DWARFormValue this is not necessarily true. I
hit this assertion from LLDB when I create a "dummy" DWARFContext that
was missing the string section.
The V5 directory and filename tables had checks in to make sure we
hadn't read past the end of the line table prologue. Since previous
changes to the data extractor class ensure we never read past the end,
these checks are now redundant, so this patch removes them.
There is still a check to show that the whole prologue remains within
the prologue length.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D71768
Summary:
I used this information to motivate splitting up the Intrinsic::ID enum
(5d986953c8) and adding a key method to
clang::Sema (586f65d31f) which saved a
fair amount of object file size.
Example output for clang.pdb:
Top 10 types responsible for the most TPI input bytes:
index total bytes count size
0x3890: 8,671,220 = 1,805 * 4,804
0xE13BE: 5,634,720 = 252 * 22,360
0x6874C: 5,181,600 = 408 * 12,700
0x2A1F: 4,520,528 = 1,574 * 2,872
0x64BFF: 4,024,020 = 469 * 8,580
0x1123: 4,012,020 = 2,157 * 1,860
0x6952: 3,753,792 = 912 * 4,116
0xC16F: 3,630,888 = 633 * 5,736
0x69DD: 3,601,160 = 985 * 3,656
0x678D: 3,577,904 = 319 * 11,216
In this case, we can see that record 0x3890 is responsible for ~8MB of
total object file size for objects in clang.
The user can then use llvm-pdbutil to find out what the record is:
$ llvm-pdbutil dump -types -type-index 0x3890
Types (TPI Stream)
============================================================
Showing 1 records.
0x3890 | LF_FIELDLIST [size = 4804]
- LF_STMEMBER [name = `WORDTYPE_MAX`, type = 0x1001, attrs = public]
- LF_MEMBER [name = `U`, Type = 0x37F0, offset = 0, attrs = private]
- LF_MEMBER [name = `BitWidth`, Type = 0x0075 (unsigned), offset = 8, attrs = private]
- LF_METHOD [name = `APInt`, # overloads = 8, overload list = 0x3805]
...
In this case, we can see that these are members of the APInt class,
which is emitted in 1805 object files.
The next largest type is ASTContext:
$ llvm-pdbutil dump -types -type-index 0xE13BE bin/clang.pdb
0xE13BE | LF_FIELDLIST [size = 22360]
- LF_BCLASS
type = 0x653EA, offset = 0, attrs = public
- LF_MEMBER [name = `Types`, Type = 0x653EB, offset = 8, attrs = private]
- LF_MEMBER [name = `ExtQualNodes`, Type = 0x653EC, offset = 24, attrs = private]
- LF_MEMBER [name = `ComplexTypes`, Type = 0x653ED, offset = 48, attrs = private]
- LF_MEMBER [name = `PointerTypes`, Type = 0x653EE, offset = 72, attrs = private]
...
ASTContext only appears 252 times, but the list of members is long, and
must be repeated everywhere it is used.
This was the output before I split Intrinsic::ID:
Top 10 types responsible for the most TPI input:
0x686C: 69,823,920 = 1,070 * 65,256
0x686D: 69,819,640 = 1,070 * 65,252
0x686E: 69,819,640 = 1,070 * 65,252
0x686B: 16,371,000 = 1,070 * 15,300
...
These records were all lists of intrinsic enums.
Reviewers: MaskRay, ruiu
Subscribers: mgrang, zturner, thakis, hans, akhuang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71437
This patch fixes an inconsistency where we were using std::function in
some places and function_ref in others to pass around the error handling
callback.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D71762
Now that DWARFv5 provides a way to identify DWARF expressions based on
form, rather than only by attribute - use it to always provide pretty
printing for any exprloc attribute, not only the attributes known to
contain expressions.
Tests "dwarfdump-rnglists-dwarf64.s" and "dwarfdump-rnglists.s" were
malformed because they had missing required DWO ID fields in split
compilation unit headers. The patch fixes the tests and checks
the reading of a unit header more thoroughly.
Differential Revision: https://reviews.llvm.org/D71704
Extends DWARF expression language to express locals/globals locations. (via
target-index operands atm) (possible variants are: non-virtual registers
or address spaces)
The WebAssemblyExplicitLocals can replace virtual registers to targertindex
operand type at the time when WebAssembly backend introduces
{get,set,tee}_local instead of corresponding virtual registers.
Reviewed By: aprantl, dschuff
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D52634
Summary:
With -gdwarf-5 local variable locations are emitted as DW_FORM_loclistx
form instead of the regular DW_FORM_sec_offset. Teach
DWARFDie::getLocations to understand the new format and use it in
llvm-symbolizer "FRAME" command.
Reviewers: pcc, jdoerfert
Subscribers: srhines, aprantl, hiraditya, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70756
as it causes a layering violation/dependency cycle:
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp -> llvm/DebugInfo/DWARF/DWARFExpression.h
llvm/include/llvm/DebugInfo/DWARF/DWARFOptimizer.h -> llvm/CodeGen/NonRelocatableStringpool.h
This reverts commit abc7f6800d.
The debug line verbose printing was printing the wrong values for rows
added via DW_LNE_end_sequence, because the row was being printed AFTER
its state had been reset following it being appended to the line table.
This patch fixes this issue by printing the row before appending it.
Reviewers: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D71664
That patch is extracted from the D70709. It moves CompileUnit, DeclContext
into llvm/DebugInfo/DWARF. It also adds new file DWARFOptimizer with
AddressesMap class. AddressesMap generalizes functionality
from RelocationManager.
Differential Revision: https://reviews.llvm.org/D71271
Commit 84a9756 added an extra blank line at the end of any line table.
However, a blank line is also printed after the line table header, which
meant that two blank lines in a row were being printed after a header,
if there were no rows. This patch defers the post-header blank line
printing until it has been determined that there are rows to print.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D71540
This helps delineate it in the output from later tables or other output.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D71344
That patch adds checking into DWARFVerifier that the Skeleton
compilation unit does not have children.
Differential Revision: https://reviews.llvm.org/D71244
The Isa register is a uint8_t, but at least on Windows this is
internally an unsigned char, which meant that prior to this patch it got
formatted as an ASCII character, rather than a decimal number. This
patch fixes this by casting it to a uint64_t before printing. I did it
this way instead of using a uint8_t formatter because a) it is simpler,
and b) it allows us to change the internal type of Isa in the future
without this code breaking.
I also took the opportunity to test the printing of the other standard
opcodes.
Reviewed by: probinson
Differential Revision: https://reviews.llvm.org/D71274
GCC says:
.../llvm/lib/DebugInfo/GSYM/FunctionInfo.cpp:195:12:
error: ‘InfoType’ is not a class, namespace, or enumeration
case InfoType::EndOfList:
^
Presumably, GCC thinks InfoType is a variable here. Work around it by
using the name IT as is done above.
Summary:
Lookup functions are designed to not fully decode a FunctionInfo, LineTable or InlineInfo, they decode only what is needed into a LookupResult object. This allows lookups to avoid costly memory allocations and avoid parsing large amounts of information one a suitable match is found.
LookupResult objects contain the address that was looked up, the concrete function address range, the name of the concrete function, and a list of source locations. One for each inline function, and one for the concrete function. This allows one address to turn into multiple frames and improves the signal you get when symbolicating addresses in GSYM files.
Reviewers: labath, aprantl
Subscribers: mgorny, hiraditya, llvm-commits, lldb-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70993
Summary:
Currently these function return the raw content of the appropriate table
header, which means they are relative to the DW_AT_{loc,rng}list_base,
and one has to relocate them in order to do anything.
This changes the functions to perform the relocation themselves, which
seems more clearer, particularly as they are sitting right next to the
find{Rng,Loc}listFromOffset functions, but one *cannot* simply take the
result of these functions and take pass them there.
The only effect of this patch is to change what value is dumped for the
DW_AT_ranges attribute, which I think is for the better, as previously
the values appeared to point into thin air.
(The main reason I am looking at this is because I was trying to
implement equivalent functionality in lldb's DWARFUnit, and was stumped
by this behavior.
Reviewers: dblaikie, JDevlieghere, aprantl
Subscribers: hiraditya, llvm-commits, SouraVX
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71006
Build ID is a protocol for looking up debug files that's already
supported by various tools including debuggers. For example, when
locating debug files, gdb would check the following directories:
- /usr/lib/debug/.build-id/ab/cdef1234.debug
- /usr/bin/ls.debug
- /usr/bin/.debug/ls.debug
- /usr/lib/debug/usr/bin/ls.debug
llvm-symbolizer currently consults all of these except for build ID
based one. This patch implements support for build ID lookup. The
set of debug directories to search is specified by the new option:
--debug-file-directory, whose name matches the debug-file-directory
variable used by gdb for the same purpose.
Differential Revision: https://reviews.llvm.org/D70759
Summary:
lldb's loclists parser has support for DW_LLE_start_end(x) encodings. To
avoid regressing when switching the implementation to llvm's, I add
parsing support for all previously unsupported location list encodings.
Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, probinson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70949
Summary:
The dump() function already accepts a callback. This makes
getAbsoluteRanges do the same. The existing DWARFUnit overload is
implemented on top of the new function.
This enables usage of the debug_rnglists parser from within lldb (which
has it's own dwarf parser).
Reviewers: dblaikie, JDevlieghere, aprantl
Subscribers: hiraditya, probinson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70952
Summary:
This does exactly what it says on the box. The only small gotcha is the
section index computation for offset_pair entries, which can use either
the base address section, or the section from the offset_pair entry.
This is to support both the cases where the base address is relocated
(points to the base of the CU, typically), and the case where the base
address is a constant (typically zero) and relocations are on the
offsets themselves.
Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, llvm-commits, probinson
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70540
This patch adds support for debug_macinfo.dwo section[pre-standardized]
to llvm and llvm-dwarfdump.
Reviewers: probinson, dblaikie, aprantl, jini.susan.george, alok
Differential Revision: https://reviews.llvm.org/D70705
Tags: #debug-info #llvm
Summary:
Use getSubroutineName() to the the subrouting name; this function knows
how to handle cases when DW_TAG_subprogram refers to an earlier
declaration:
0x00000050: DW_TAG_subprogram
DW_AT_linkage_name ("_ZN1A1fEv")
DW_AT_name ("f")
...
0x00000067: DW_TAG_subprogram
DW_AT_low_pc (0x0000000000000000)
DW_AT_high_pc (0x0000000000000020)
DW_AT_specification (0x00000050 "_ZN1A1fEv")
...
0x0000008c: DW_TAG_variable
Reviewers: pcc, vitalybuka, jdoerfert
Subscribers: srhines, hiraditya, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70630
Summary:
Support location lists in FRAME command.
These are used for the majority of local variables in optimized code.
Also support DW_OP_breg in addition to DW_OP_fbreg when it refers to the
same register as DW_AT_frame_base.
Reviewers: pcc, jdoerfert
Subscribers: srhines, hiraditya, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70629
Summary:
llvm-symbolizer protocol is empty string means end-of-output.
Do not emit empty string when a function or a variable do not have a
name for any reason. Emit "??".
Reviewers: pcc, vitalybuka, jdoerfert
Subscribers: srhines, hiraditya, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70626
The original commit message follows.
This patch adds support for debug_loclists.dwo section in llvm and llvm-dwarfdump.
Also Fixes PR43622, PR43623.
Reviewers: dblaikie, probinson, labath, aprantl, jini.susan.george
Differential Revision: https://reviews.llvm.org/D69462