If the minimum_instruction_length of a debug line program is 0, no
address advancing via special opcodes, DW_LNS_const_add_pc, and
DW_LNS_advance_pc can occur, since the minimum_instruction_length is
used in a multiplication. This patch adds a warning reporting when this
issue occurs.
Reviewed by: probinson
Differential Revision: https://reviews.llvm.org/D75189
The line_range value of a debug line program header is used in divisions
related to special opcodes and DW_LNS_const_add_pc opcodes. As such, a
value of 0 cannot be used. This change introduces a new warning, if such
a situation is identified, and does not perform the relevant
calculations.
Reviewed by: probinson, aprantl
Differential Revision: https://reviews.llvm.org/D43470
This patch adds a check which reports an unsupported value of the
maximum_operations_per_instruction field in a debug line table header.
This is reported once per line table, at most, and only if the tablet
would otherwise need to use it (i.e. never for tables with version 3 or
less, or for tables which don't use DW_LNS_const_add_pc or special
opcodes). Unsupported values are currently any apart from 1.
Reviewed by: probinson, MaskRay
Differential Revision: https://reviews.llvm.org/D74819
This is a follow-up for D75609. As @dblaikie suggested, it prints
the actual number for an unknown section identifier when dumping
unit index sections.
Differential Revision: https://reviews.llvm.org/D75668
This fixes printing long values that might reside in CIE and FDE,
including offsets, lengths, and addresses.
Differential Revision: https://reviews.llvm.org/D73887
The condition was not accurate enough and could interpret some FDEs in
.eh_frame or 64-bit DWARF .debug_frame sections as CIEs. Even though
such FDEs are unlikely in a normal situation, the wrong interpretation
could hide an issue in a buggy generator.
Differential Revision: https://reviews.llvm.org/D73886
A DWARFSectionKind is read from input. It is not validated on parsing,
so an unexpected value may result in reaching llvm_unreachable() in
DWARFUnitIndex::getColumnHeader() when dumping the index section.
Differential Revision: https://reviews.llvm.org/D75609
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Fixed memory sanitizer bot bugs as well.
Differential Revision: https://reviews.llvm.org/D75390
Summary:
getInitialLength is a *DWARF*DataExtractor method so I had to "upgrade"
some DataExtractors to be able to make use of it.
Reviewers: ikudrin, jhenderson, probinson
Subscribers: aprantl, hiraditya, llvm-commits, dblaikie
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75535
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Also fixed an issue where strings for files in the file table could be added in opposite order due to parameters to function calls not having a strong ordering, which caused tests to fail. Added new arch specfic directories so when targets are not enabled, we continue to function just fine.
Differential Revision: https://reviews.llvm.org/D75390
YAML files were not being run during lit testing as there was no lit.local.cfg file. Once this was fixed, some buildbots would fail due to a StringRef that pointed to a std::string inside of a temporary llvm::Triple object. These issues are fixed here by making a local triple object that stays around long enough so the StringRef points to valid data. Also fixed an issue where strings for files in the file table could be added in opposite order due to parameters to function calls not having a strong ordering, which caused tests to fail.
Differential Revision: https://reviews.llvm.org/D75390
Summary:
In this patch I've done a slightly bigger rewrite to also remove the
hardcoded header lengths.
Reviewers: jhenderson, dblaikie, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75119
Summary:
This could be considered obvious, but I am putting it up to illustrate
the usefulness/impact of the getInitialLength change.
Reviewers: dblaikie, jhenderson, ikudrin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75117
The integrity checks for index entries in DWARFUnitHeader::extract()
might cause the function to return before checking the state of an
Error object, which leads to a crash in runtime. The patch fixes the
issue by moving the checks in a safe place.
Differential Revision: https://reviews.llvm.org/D75177
Summary:
Error reporting in DebugInfoDWARF library currently done in three ways :
1. Direct calls to WithColor::error()/WithColor::warning()
2. ErrorPolicy defaultErrorHandler(Error E);
3. void dumpWarning(Error Warning);
additionally, other locations could have more variations:
lld/ELF/SyntheticSection.cpp
if (Error e = cu->tryExtractDIEsIfNeeded(false)) {
error(toString(sec) + ": " + toString(std::move(e)));
DebugInfo/DWARF/DWARFUnit.cpp
if (Error e = tryExtractDIEsIfNeeded(CUDieOnly))
WithColor::error() << toString(std::move(e));
Thus error reporting could look inconsistent. To have a consistent error
messages it is necessary to have a possibility to redefine error
reporting functions. This patch creates two handlers and allows to
redefine them. It also patches all places inside DebugInfoDWARF
to use these handlers.
The intention is always to use following handlers for error reporting
purposes inside DebugInfoDWARF:
DebugInfo/DWARF/DWARFContext.h
std::function<void(Error E)> RecoverableErrorHandler = WithColor::defaultErrorHandler;
std::function<void(Error E)> WarningHandler = WithColor::defaultWarningHandler;
This is last patch from series of patches: D74481, D74635, D75118.
Reviewers: jhenderson, dblaikie, probinson, aprantl, JDevlieghere
Reviewed By: jhenderson
Subscribers: grimar, hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D74308
Summary:
Current LLVM code base does not use error handler with ErrorPolicy.
This patch removes ErrorPolicy from DWARFContext.
This patch is extracted from the D74308.
Reviewers: jhenderson, dblaikie, grimar, aprantl, JDevlieghere
Reviewed By: grimar
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D75118
Summary:
This patch introduces a function to house the code needed to do the
DWARF64 detection dance. The function decodes the initial length field
and returns it as a pair containing the actual length, and the DWARF
encoding.
This patch does _not_ attempt to handle the problem of detecting lengths
which extend past the size of the section, or cases when reads of a
single contribution accidentally escape beyond its specified length, but
I think it's useful in its own right.
Reviewers: dblaikie, jhenderson, ikudrin
Subscribers: hiraditya, probinson, aprantl, JDevlieghere, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74560
The patch was reverted in 69da40033 because of test failures on windows.
The problem was the unpredictable order of some of the error messages,
which I've tried to strenghten in that patch.
It turns out this is not possible to do in verbose mode because there
the data is being writted as it is being parsed. No amount of flushing
(as I've done in the non-verbose mode) will help that. Indeed, even
without any buffering the warning messages can end in the middle of a
line in non-verbose mode.
In this patch, I have reverted the changes which tested the relative
position of the warning message, except for the messages about
unsupported initial length, which are the ones I really wanted to test,
and which do come out reasonably.
The original commit message was:
This patch if motivated by D74560, specifically the subthread about what
to print upon encountering reserved initial length values.
If the debug_line prologue has an unsupported version, we skip parsing
the rest of the data. If we encounter an reserved initial length field,
we don't even parse the version. However, we still print out all members
(with value 0) in the dump function.
This patch introduces early exits in the Prologue::dump function so that
we print only the fields that were parsed successfully. In case of an
unsupported version, we skip printing all subsequent prologue fields --
because we don't even know if this version has those fields. In case of a
reserved unit length, we don't print anything -- if the very first field
of the prologue is invalid, it's hard to say if we even have a prologue
to begin with.
Note that the user will still be able to see the invalid/reserved
initial length value in the error message. I've modified (reordered)
debug_line_invalid.test to show that the error message comes straight
after the debug_line offset. I've also added some flush() calls to the
dumping code to ensure this is the case in all situations (without that,
the warnings could get out of sync if the output was not a terminal -- I
guess this is why std::iostreams have the tie() function).
Reviewers: jhenderson, ikudrin, dblaikie
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75043
Summary:
This patch creates the llvm-gsymutil binary that can convert object files to GSYM using the --convert <path> option. It can also dump and lookup addresses within GSYM files that have been saved to disk.
To dump a file:
llvm-gsymutil /path/to/a.gsym
To perform address lookups, like with atos, on GSYM files:
llvm-gsymutil --address 0x1000 --address 0x1100 /path/to/a.gsym
To convert a mach-o or ELF file, including any DWARF debug info contained within the object files:
llvm-gsymutil --convert /path/to/a.out --out-file /path/to/a.out.gsym
Conversion highlights:
- convert DWARF debug info in mach-o or ELF files to GSYM
- convert symbols in symbol table to GSYM and don't convert symbols that overlap with DWARF debug info
- extract UUID from object files
- extract .text (read + execute) section address ranges and filter out any DWARF or symbols that don't fall in those ranges.
- if .text sections are extracted, and if the last gsym::FunctionInfo object has no size, cap the size to the end of the section the function was contained in
Dumping GSYM files will dump all sections of the GSYM file in textual format.
Reviewers: labath, aadsm, serhiy.redko, jankratochvil, xiaobai, wallace, aprantl, JDevlieghere, jdoerfert
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74883
Summary:
This patch if motivated by D74560, specifically the subthread about what
to print upon encountering reserved initial length values.
If the debug_line prologue has an unsupported version, we skip parsing
the rest of the data. If we encounter an reserved initial length field,
we don't even parse the version. However, we still print out all members
(with value 0) in the dump function.
This patch introduces early exits in the Prologue::dump function so that
we print only the fields that were parsed successfully. In case of an
unsupported version, we skip printing all subsequent prologue fields --
because we don't even know if this version has those fields. In case of a
reserved unit length, we don't print anything -- if the very first field
of the prologue is invalid, it's hard to say if we even have a prologue
to begin with.
Note that the user will still be able to see the invalid/reserved
initial length value in the error message. I've modified (reordered)
debug_line_invalid.test to show that the error message comes straight
after the debug_line offset. I've also added some flush() calls to the
dumping code to ensure this is the case in all situations (without that,
the warnings could get out of sync if the output was not a terminal -- I
guess this is why std::iostreams have the tie() function).
Reviewers: jhenderson, ikudrin, dblaikie
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75043
While the value of the CIE pointer field in a DWARF FDE record is
an offset to the corresponding CIE record from the beginning of
the section, for EH FDE records it is relative to the current offset.
Previously, we did not make that distinction when dumped both kinds
of FDE records and just printed the same value for the CIE pointer
field and the CIE offset; that was acceptable for DWARF FDEs but was
wrong for EH FDEs.
This patch fixes the issue by explicitly printing the offset of the
linked CIE object.
Differential Revision: https://reviews.llvm.org/D74613
macro section dumping.
Summary: Previously macinfo infrastructure was using functions
names that were ambiguous i.e `getMacro/getMacroDWO` in a sense
of conveying stated intentions. This patch refactored them into more
reasonable `getDebugMacinfo/getDebugMacinfoDWO` names thus making
room for macro implementation.
Reviewers: aprantl, probinson, jini.susan.george, dblaikie
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D75037
The CIE pointer field of an FDE record contains an offset to
a corresponding CIE record. In object files, this value comes with
relocation because the value has to be fixed when a linker combines
the final section from multiple sources. In most object files there is
only one CIE record at offset 0 of the .debug_frame section, so reading
a relocated or a raw value makes no difference. However, in partially
linked object files there are multiple CIE records and the relocations
should be applied to recover the right offset value.
Differential Revision: https://reviews.llvm.org/D74612
Summary:
The Offset provides the offset within the function in a SourceLocation struct. This allows us to show the byte offset within a function. We also track offsets within inline functions as well. Updated the lookup tests to verify the offset for functions and inline functions.
0x1000: main + 32 @ /tmp/main.cpp:45
Reviewers: labath, aadsm, serhiy.redko, jankratochvil, xiaobai, wallace, aprantl, JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74680
Summary:
This patch is extracted from D74308.
It patches all usages of WithColor::error() and WithColor::warning
in DebugInfoDWARF library.
Depends on D74481
Reviewers: jhenderson, dblaikie, probinson, aprantl, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74635
Summary:
this review is extracted from D74308.
It creates two error handlers which allow to redefine error
reporting routine and should be used for all places
where errors are reported:
std::function<void(Error)> RecoverableErrorHandler = defaultErrorHandler;
std::function<void(Error)> WarningHandler = defaultWarningHandler;
It also creates accessors to above handlers which should be used to
report errors.
function_ref<void(Error)> getRecoverableErrorHandler() {
return RecoverableErrorHandler;
}
function_ref<void(Error)> getWarningHandler() { return WarningHandler; }
It patches all error reporting places inside DWARFContext and DWARLinker.
Reviewers: jhenderson, dblaikie, probinson, aprantl, JDevlieghere
Reviewed By: jhenderson, JDevlieghere
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D74481
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
Prior to this patch, if a DW_LNE_set_address opcode was parsed with an
address size (i.e. with a length after the opcode) of anything other 1,
2, 4, or 8, an llvm_unreachable would be hit, as the data extractor does
not support other values. This patch introduces a new error check that
verifies the address size is one of the supported sizes, in common with
other places within the DWARF parsing.
This patch also fixes calculation of a generated line table's size in
unit tests. One of the tests in this patch highlighted a bug introduced
in 1271cde474, when non-byte operands were used as arguments for
extended or standard opcodes.
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D73962