This is difficult/not possible to test in LLVM, but is visible as a
crash in LLD when parsing DWARF to generate gdb-index.
This function is called by llvm-dwarfdump when parsing high_pc for
non-verbose output (to print the actual high_pc rather than the low_pc
relative value), but in that case llvm-dwarfdump doesn't print section
names (if it did, it would hit this problem).
We could add some other features to llvm-dwarfdump to expose this, but
nothing really springs to my mind. I will add a test to lld, though.
llvm-svn: 350010
Currently the section name (& possibly number) is only printed on
addresses in ranges - but no reason it couldn't also be displayed on
other addresses (like low/high PC).
Refactor in that direction by pulling out the section lookup and name
ambiguity dumping logic into a reusable helper.
llvm-svn: 349995
Propagate the llvm::Error a little further up. This is NFC for
llvm-dwarfdump in this change, but allows ld.lld to emit more precise
error messages about which object and archive the erroneous DWARF is in.
llvm-svn: 349978
Originally committed in r349333, reverted in r349353.
GCC emitted these unconditionally on/before 4.4/March 2012
Clang emitted these unconditionally on/before 3.5/March 2014
This improves performance when parsing CUs (especially those using split
DWARF) that contain no code ranges (such as the mini CUs that may be
created by ThinLTO importing - though generally they should be/are
avoided, especially for Split DWARF because it produces a lot of very
small CUs, which don't scale well in a bunch of other ways too
(including size)).
The revert was due to a (Google internal) test that had some checked in old
object files missing DW_AT_ranges. That's since been fixed.
llvm-svn: 349968
- When signing return addresses with -msign-return-address=<scope>{+<key>},
either the A key instructions or the B key instructions can be used. To
correctly authenticate the return address, the unwinder/debugger must know
which key was used to sign the return address.
- When and exception is thrown or a break point reached, it may be necessary to
unwind the stack. To accomplish this, the unwinder/debugger must be able to
first authenticate an the return address if it has been signed.
- To enable this, the augmentation string of CIEs has been extended to allow
inclusion of a 'B' character. Functions that are signed using the B key
variant of the instructions should have and FDE whose associated CIE has a 'B'
in the augmentation string.
- One must also be able to preserve these semantics when first stepping from a
high level language into assembly and then, as a second step, into an object
file. To achieve this, I have introduced a new assembly directive
'.cfi_b_key_frame ', that tells the assembler the current frame uses return
address signing with the B key.
- This ensures that the FDE is associated with a CIE that has 'B' in the
augmentation string.
Differential Revision: https://reviews.llvm.org/D51798
llvm-svn: 349895
This is to address post-commit feedback from Paul Robinson on r348954.
The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)).
I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)" and an unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)").
Reviewers: aprantl, probinson, JDevlieghere
Differential Revision: https://reviews.llvm.org/D55721
llvm-svn: 349670
- Reapply changes intially introduced in r343089
- The archtecture info is no longer loaded whenever a DWARFContext is created
- The runtimes libraries (santiziers) make use of the dwarf context classes but
do not intialise the target info
- The architecture of the object can be obtained without loading the target info
- Adding a method to the dwarf context to get this information and multiplex the
string printing later on
Differential Revision: https://reviews.llvm.org/D55774
llvm-svn: 349472
GCC emitted these unconditionally on/before 4.4/March 2012
Clang emitted these unconditionally on/before 3.5/March 2014
This improves performance when parsing CUs (especially those using split
DWARF) that contain no code ranges (such as the mini CUs that may be
created by ThinLTO importing - though generally they should be/are
avoided, especially for Split DWARF because it produces a lot of very
small CUs, which don't scale well in a bunch of other ways too
(including size)).
llvm-svn: 349333
Doesn't handle varargs and other fun things, but it's a start. (also
doesn't print these strictly as valid C++ when it's a pointer to
function, it'll print as "void(int)*" instead of "void (*)(int)")
llvm-svn: 348965
This lays the foundation for dumping types not referenced by DW_AT_type
attributes (in the near-term, that'll be DW_AT_containing_type for a
DW_TAG_ptr_to_member_type - in the future, potentially dumping the
pretty printed name next to the DW_TAG for the type, rather than only
when the type is referenced from elsewhere)
llvm-svn: 348961
Previously we would create an lldb::Function object for each function
parsed, but we would not add these to the clang AST. This is a first
step towards getting local variable support working, as we first need an
AST decl so that when we create local variable entries, they have the
proper DeclContext.
Differential Revision: https://reviews.llvm.org/D55384
llvm-svn: 348631
VarStreamArray was built on the assumption that it is backed by a
StreamRef, and offset 0 of that StreamRef is the first byte of the first
record in the array.
This is a logical and intuitive assumption, but unfortunately we have
use cases where it doesn't hold. Specifically, a PDB module's symbol
stream is prefixed by 4 bytes containing a magic value, and the first
byte of record data in the array is actually at offset 4 of this byte
sequence.
Previously, we would just truncate the first 4 bytes and then construct
the VarStreamArray with the resulting StreamRef, so that offset 0 of the
underlying stream did correspond to the first byte of the first record,
but this is problematic, because symbol records reference other symbol
records by the absolute offset including that initial magic 4 bytes. So
if another record wants to refer to the first record in the array, it
would say "the record at offset 4".
This led to extremely confusing hacks and semantics in loading code, and
after spending 30 minutes trying to get some math right and failing, I
decided to fix this in the underlying implementation of VarStreamArray.
Now, we can say that a stream is skewed by a particular amount. This
way, when we access a record by absolute offset, we can use the same
values that the records themselves contain, instead of having to do
fixups.
Differential Revision: https://reviews.llvm.org/D55344
llvm-svn: 348499
Previously these were dropped. We now understand them sufficiently
well to start emitting them. From the debugger's perspective, this
now enables us to have debug info about typedefs (both global and
function-locally scoped)
Differential Revision: https://reviews.llvm.org/D55228
llvm-svn: 348306
Part of the patch to not build the hash map eagerly was omitted
due to a merge conflict. Add it back, which should fix the failing
tests.
llvm-svn: 348166
When there is no .debug_addr section for some reason,
llvm-dwarfdump would print the bogus empty section name when dumping ranges
in .debug_info:
DW_AT_ranges [DW_FORM_rnglistx] (indexed (0x0) rangelist = 0x00000004
[0x0000000000000000, 0x0000000000000001) ""
[0x0000000000000000, 0x0000000000000002) "")
That happens because of the code which uses 0 (zero) as a section index as a default value.
The code should use -1ULL instead because technically 0 is a valid zero section index
in ELF and -1ULL is a special constant used that means "no section available".
This is mostly a fix for the overall correctness/safety of the code,
but a test case is provided too.
Differential revision: https://reviews.llvm.org/D55113
llvm-svn: 348115
Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.
Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.
With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).
Some stats on symbol sizes for the curious:
PDB size: 507M
sym bytes: 316,508,016
sym count: 18,954,971
sym byte avg: 16.7
As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.
Reviewers: zturner, aganea, ruiu
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D54554
llvm-svn: 347687
When you have a member function with a ref-qualifier, for example:
struct Foo {
void Func() &;
void Func2() &&;
};
clang-cl was not emitting this information. Doing so is a bit
awkward, because it's not a property of the LF_MFUNCTION type, which
is what you'd expect. Instead, it's a property of the this pointer
which is actually an LF_POINTER. This record has an attributes
bitmask on it, and our handling of this bitmask was all wrong. We
had some parts of the bitmask defined incorrectly, but importantly
for this bug, we didn't know about these extra 2 bits that represent
the ref qualifier at all.
Differential Revision: https://reviews.llvm.org/D54667
llvm-svn: 347354
PointerAttributes is a bitwise-or of several other fields, each of
which is already printed on its own line with a better explanation.
So this doesn't really help much.
llvm-svn: 347275
Especially for symbolizer it can be efficient to have to search through
the entire index when it isn't needed - llvm-symbolizer looks up only a
few CUs & already has an index available in getUnitForEntry, once it's
passed down to DWARFUnitHeader::extract then there's no need for it to
call getFromOffset.
llvm-svn: 347134
This is a follow-up to r346715. Use PRIx64 to formatted print of 64-bit
value in the `DWARFDebugLoclists::LocationList::dump` to escape problem
on big-endian hosts.
llvm-svn: 347049
In a previous patch, we pre-processed the TPI stream in order to build
the reverse mapping from nested type -> parent type so that we could
accurately reconstruct a DeclContext hierarchy.
However, there were some issues. An LF_NESTTYPE record is really just a
typedef, so although it happens to be used to indicate the name of the
nested type and referring to the global record which defines the type,
it is also used for every other kind of nested typedef. When we rebuild
the DeclContext hierarchy, we want it to be as accurate as possible,
which means that if we have something like:
struct A {
struct B {};
using C = B;
};
We don't want to create two CXXRecordDecls in the AST each with the
exact same definition. We just want to create one for B and then
define C as an alias to B. Previously, however, it would not be able
to distinguish between the two cases and it would treat A::B and
A::C as being two classes each with separate definitions. We address
the first half of improving the pre-processing logic so that only
actual definitions are treated this way.
Later, in a followup patch, we can handle the case of nested
typedefs since we're already going to be enumerating the field list
anyway and this patch introduces the general framework for
distinguishing between the two cases.
Differential Revision: https://reviews.llvm.org/D54357
llvm-svn: 346786
The `DWARFDebugAddrTable::dump` routine prints 32/64-bits addresses.
These values are stored in a vector of `uint64_t` independently of their
original sizes. But `format` function gets format string with PRIx32
suffix in case of 32-bit address size. At least on MIPS 32-bit targets
that leads to incorrect output.
This patch changes formats strings and always use PRIx64 to print
`uint64_t` values.
Differential Revision: http://reviews.llvm.org/D54424
llvm-svn: 346715
This was being used as a sort of indirect out parameter from shouldDump
- seems simpler to use it as the actual result of the call. (this does
mean using a pointer to an Optional & actually using all 3 states (null,
None, and present) which is, admittedly, a tad subtle - but given the
limited scope, seems OK to me - open to discussion though, if others
feel strongly about it)
llvm-svn: 346691
Summary: The debug_info_offset values in .debug_{,gnu_}pub{name,types} may be relocated. Change it to DWARFSection so that we can get relocated values.
Reviewers: ruiu, dblaikie, grimar, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D54375
llvm-svn: 346615
This change allows for link-time merging of debugging information from
Microsoft precompiled types OBJs compiled with cl.exe /Z7 /Yc and /Yu.
This fixes llvm.org/PR34278
Differential Revision: https://reviews.llvm.org/D45213
llvm-svn: 346154
Adding functionality to the DWARF verifier for DWARF v5 strx* forms which
index into the string offsets table.
Differential Revision: https://reviews.llvm.org/D54049
llvm-svn: 346061
This is a minor bug fix. Previously, if you tried to encode the RSP
register on the x86 platform, that might have succeeded and been encoded
incorrectly. However, no existing producer or consumer passes the x86_64
registers when targeting x86_32.
llvm-svn: 345879