Commit Graph

1521 Commits

Author SHA1 Message Date
Alexandre Ganea 90f4b94da3 [CodeView] More appropriate name and type for a Microsoft precompiled headers parameter. NFC
llvm-svn: 350520
2019-01-07 13:53:16 +00:00
David Blaikie b917c3a41a llvm-dwarfdump: Skip address index info (and dump only the address, if found) when non-verbose dumping addrx forms
There's a few bugs here still - demonstrated with FIXITs in the test.

llvm-svn: 350046
2018-12-24 06:52:31 +00:00
David Blaikie 2a38c17b34 DebugInfo: Accurately propagate the section used by a relocation when accessing ranges defined by low/high_pc
This is difficult/not possible to test in LLVM, but is visible as a
crash in LLD when parsing DWARF to generate gdb-index.

This function is called by llvm-dwarfdump when parsing high_pc for
non-verbose output (to print the actual high_pc rather than the low_pc
relative value), but in that case llvm-dwarfdump doesn't print section
names (if it did, it would hit this problem).

We could add some other features to llvm-dwarfdump to expose this, but
nothing really springs to my mind. I will add a test to lld, though.

llvm-svn: 350010
2018-12-22 22:20:40 +00:00
David Blaikie 25179613f6 llvm-dwarfdump: Dump the section name/number for addr attributes
llvm-svn: 350009
2018-12-22 20:34:58 +00:00
David Blaikie 9efb0153f0 llvm-dwarfdump: Remove extraneous space between '(' and 'indexed'
When dumping string or address indexes

llvm-svn: 349997
2018-12-22 08:43:08 +00:00
David Blaikie c04d2bf22a llvm-dwarfdump: Print the section name/number for addr_index attributes
(addr attributes coming shortly)

llvm-svn: 349996
2018-12-22 08:33:55 +00:00
David Blaikie 87ae80fb2f DebugInfo: Refactor named section dumping into a reusable helper
Currently the section name (& possibly number) is only printed on
addresses in ranges - but no reason it couldn't also be displayed on
other addresses (like low/high PC).

Refactor in that direction by pulling out the section lookup and name
ambiguity dumping logic into a reusable helper.

llvm-svn: 349995
2018-12-22 08:23:10 +00:00
David Blaikie e4e0b9f48f DebugInfo: Remove extra attribute lookup
llvm-svn: 349985
2018-12-22 02:24:13 +00:00
David Blaikie 219c6bd388 libDebugInfo: Refactor error handling in range list parsing
Propagate the llvm::Error a little further up. This is NFC for
llvm-dwarfdump in this change, but allows ld.lld to emit more precise
error messages about which object and archive the erroneous DWARF is in.

llvm-svn: 349978
2018-12-22 00:31:02 +00:00
David Blaikie c3f30a7fc6 Reapply: DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)
Originally committed in r349333, reverted in r349353.

GCC emitted these unconditionally on/before 4.4/March 2012
Clang emitted these unconditionally on/before 3.5/March 2014

This improves performance when parsing CUs (especially those using split
DWARF) that contain no code ranges (such as the mini CUs that may be
created by ThinLTO importing - though generally they should be/are
avoided, especially for Split DWARF because it produces a lot of very
small CUs, which don't scale well in a bunch of other ways too
(including size)).

The revert was due to a (Google internal) test that had some checked in old
object files missing DW_AT_ranges. That's since been fixed.

llvm-svn: 349968
2018-12-21 22:25:01 +00:00
Luke Cheeseman 41a9e53500 [Dwarf/AArch64] Return address signing B key dwarf support
- When signing return addresses with -msign-return-address=<scope>{+<key>},
  either the A key instructions or the B key instructions can be used. To
  correctly authenticate the return address, the unwinder/debugger must know
  which key was used to sign the return address.
- When and exception is thrown or a break point reached, it may be necessary to
  unwind the stack. To accomplish this, the unwinder/debugger must be able to
  first authenticate an the return address if it has been signed.
- To enable this, the augmentation string of CIEs has been extended to allow
  inclusion of a 'B' character. Functions that are signed using the B key
  variant of the instructions should have and FDE whose associated CIE has a 'B'
  in the augmentation string.
- One must also be able to preserve these semantics when first stepping from a
  high level language into assembly and then, as a second step, into an object
  file. To achieve this, I have introduced a new assembly directive
  '.cfi_b_key_frame ', that tells the assembler the current frame uses return
  address signing with the B key.
- This ensures that the FDE is associated with a CIE that has 'B' in the
  augmentation string.

Differential Revision: https://reviews.llvm.org/D51798

llvm-svn: 349895
2018-12-21 10:45:08 +00:00
David Blaikie ac69af7ad6 llvm-dwarfdump: Improve/fix pretty printing of array dimensions
This is to address post-commit feedback from Paul Robinson on r348954.

The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)).

I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)" and an unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)").

Reviewers: aprantl, probinson, JDevlieghere

Differential Revision: https://reviews.llvm.org/D55721

llvm-svn: 349670
2018-12-19 19:34:24 +00:00
Luke Cheeseman f57d7d8237 [AArch64] - Return address signing dwarf support
- Reapply changes intially introduced in r343089
- The archtecture info is no longer loaded whenever a DWARFContext is created
- The runtimes libraries (santiziers) make use of the dwarf context classes but
  do not intialise the target info
- The architecture of the object can be obtained without loading the target info
- Adding a method to the dwarf context to get this information and multiplex the
  string printing later on

Differential Revision: https://reviews.llvm.org/D55774

llvm-svn: 349472
2018-12-18 10:37:42 +00:00
Zachary Turner bb3d7e565f [PDB] Add some helper functions for working with scopes.
llvm-svn: 349361
2018-12-17 16:15:36 +00:00
Eric Liu 6c933a2bed Revert "DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)"
This reverts commit r349333. It caused internal test to fail. I have
sent more information to the author.

llvm-svn: 349353
2018-12-17 14:14:40 +00:00
David Blaikie 884deed1b3 DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)
GCC emitted these unconditionally on/before 4.4/March 2012
Clang emitted these unconditionally on/before 3.5/March 2014

This improves performance when parsing CUs (especially those using split
DWARF) that contain no code ranges (such as the mini CUs that may be
created by ThinLTO importing - though generally they should be/are
avoided, especially for Split DWARF because it produces a lot of very
small CUs, which don't scale well in a bunch of other ways too
(including size)).

llvm-svn: 349333
2018-12-17 08:27:19 +00:00
David Blaikie 023674a9e4 DebugInfo/DWARF: Pretty print subroutine types
Doesn't handle varargs and other fun things, but it's a start. (also
doesn't print these strictly as valid C++ when it's a pointer to
function, it'll print as "void(int)*" instead of "void (*)(int)")

llvm-svn: 348965
2018-12-12 19:53:03 +00:00
David Blaikie 3f8f004daf DebugInfo/DWARF: Improve dumping of pointers to members ('int foo::*' rather than 'int*')
llvm-svn: 348962
2018-12-12 19:34:02 +00:00
David Blaikie 815cffaad8 DebugInfo/DWARF: Refactor type dumping to dump types, rather than DIEs that reference types
This lays the foundation for dumping types not referenced by DW_AT_type
attributes (in the near-term, that'll be DW_AT_containing_type for a
DW_TAG_ptr_to_member_type - in the future, potentially dumping the
pretty printed name next to the DW_TAG for the type, rather than only
when the type is referenced from elsewhere)

llvm-svn: 348961
2018-12-12 19:33:08 +00:00
David Blaikie 92b5493a14 DebugInfo/DWARF: Refactor getAttributeValueAsReferencedDie to accept a DWARFFormValue
Save searching for the attribute again when you already have the
DWARFFormValue at hand.

llvm-svn: 348960
2018-12-12 19:23:55 +00:00
David Blaikie 73066d60f1 llvm-dwarfdump: Dump array dimensions in stringified type names
llvm-svn: 348954
2018-12-12 18:46:25 +00:00
Zachary Turner a42bbe3981 [NativePDB] Reconstruct function declarations from debug info.
Previously we would create an lldb::Function object for each function
parsed, but we would not add these to the clang AST. This is a first
step towards getting local variable support working, as we first need an
AST decl so that when we create local variable entries, they have the
proper DeclContext.

Differential Revision: https://reviews.llvm.org/D55384

llvm-svn: 348631
2018-12-07 19:34:02 +00:00
Zachary Turner a93458b050 [PDB] Move some code around. NFC.
llvm-svn: 348505
2018-12-06 17:49:15 +00:00
Zachary Turner 579264bd59 Support skewed stream arrays.
VarStreamArray was built on the assumption that it is backed by a
StreamRef, and offset 0 of that StreamRef is the first byte of the first
record in the array.

This is a logical and intuitive assumption, but unfortunately we have
use cases where it doesn't hold. Specifically, a PDB module's symbol
stream is prefixed by 4 bytes containing a magic value, and the first
byte of record data in the array is actually at offset 4 of this byte
sequence.

Previously, we would just truncate the first 4 bytes and then construct
the VarStreamArray with the resulting StreamRef, so that offset 0 of the
underlying stream did correspond to the first byte of the first record,
but this is problematic, because symbol records reference other symbol
records by the absolute offset including that initial magic 4 bytes. So
if another record wants to refer to the first record in the array, it
would say "the record at offset 4".

This led to extremely confusing hacks and semantics in loading code, and
after spending 30 minutes trying to get some math right and failing, I
decided to fix this in the underlying implementation of VarStreamArray.
Now, we can say that a stream is skewed by a particular amount. This
way, when we access a record by absolute offset, we can use the same
values that the records themselves contain, instead of having to do
fixups.

Differential Revision: https://reviews.llvm.org/D55344

llvm-svn: 348499
2018-12-06 16:55:00 +00:00
Zachary Turner 7c6b19f49b [PDB] Emit S_UDT records in LLD.
Previously these were dropped.  We now understand them sufficiently
well to start emitting them.  From the debugger's perspective, this
now enables us to have debug info about typedefs (both global and
function-locally scoped)

Differential Revision: https://reviews.llvm.org/D55228

llvm-svn: 348306
2018-12-04 21:48:46 +00:00
George Rimar 7e981f330b [llvm-dwarfdump] - Dump the older versions of .eh_frame/.debug_frame correctly.
The issue is the following.

DWARF 2 used version 1 for .debug_frame.
(Appendix G, p. 416 http://dwarfstd.org/doc/DWARF5.pdf)

lib/MC now always sets version 1 for .eh_frame (and sets 1-4 versions for .debug_frame correctly):
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCDwarf.cpp#L1530
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCDwarf.cpp#L1562
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCDwarf.cpp#L1602

In version 1, return_address_register was defined as ubyte, while other versions
switched to uleb128.
(p 62, http://www.dwarfstd.org/doc/dwarf-2.0.0.pdf)

Patch teaches llvm-dwarfdump about this difference.

Differential revision: https://reviews.llvm.org/D54860

llvm-svn: 348242
2018-12-04 10:01:39 +00:00
Zachary Turner 1e0cce796c Fix issue with Tpi Stream hash map.
Part of the patch to not build the hash map eagerly was omitted
due to a merge conflict.  Add it back, which should fix the failing
tests.

llvm-svn: 348166
2018-12-03 19:05:12 +00:00
Zachary Turner f861e291d6 Don't build the Tpi Hash map by default.
This is very slow and should be done for specific cases where
lookups will need to happen.

llvm-svn: 348160
2018-12-03 18:32:05 +00:00
George Rimar 6d85c58328 [llvm-dwarfdump] - Stop printing the bogus empty section name on invalid dwarf.
When there is no .debug_addr section for some reason,
llvm-dwarfdump would print the bogus empty section name when dumping ranges
in .debug_info:

DW_AT_ranges [DW_FORM_rnglistx]   (indexed (0x0) rangelist = 0x00000004
    [0x0000000000000000, 0x0000000000000001) ""
    [0x0000000000000000, 0x0000000000000002) "")

That happens because of the code which uses 0 (zero) as a section index as a default value.
The code should use -1ULL instead because technically 0 is a valid zero section index
in ELF and -1ULL is a special constant used that means "no section available".

This is mostly a fix for the overall correctness/safety of the code,
but a test case is provided too.

Differential revision: https://reviews.llvm.org/D55113

llvm-svn: 348115
2018-12-03 10:33:40 +00:00
Reid Kleckner ffba54493f Add missing error checking code intended for r347687
llvm-svn: 347690
2018-11-27 19:14:11 +00:00
Reid Kleckner 291d015de4 [PDB] Add symbol records in bulk
Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.

Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.

With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).

Some stats on symbol sizes for the curious:
  PDB size: 507M
  sym bytes: 316,508,016
  sym count:  18,954,971
  sym byte avg: 16.7

As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.

Reviewers: zturner, aganea, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54554

llvm-svn: 347687
2018-11-27 19:00:23 +00:00
Luke Cheeseman 6db3a6a4a7 Revert r347490 as it breaks address sanitizer builds
llvm-svn: 347499
2018-11-23 17:13:06 +00:00
Luke Cheeseman d6dbd64104 Revert r343341
- Cannot reproduce the build failure locally and the build logs have
  been deleted.

llvm-svn: 347490
2018-11-23 11:01:47 +00:00
Zachary Turner c68f895702 [CodeView] Add support for ref-qualified member functions.
When you have a member function with a ref-qualifier, for example:

struct Foo {
  void Func() &;
  void Func2() &&;
};

clang-cl was not emitting this information. Doing so is a bit
awkward, because it's not a property of the LF_MFUNCTION type, which
is what you'd expect. Instead, it's a property of the this pointer
which is actually an LF_POINTER. This record has an attributes
bitmask on it, and our handling of this bitmask was all wrong. We
had some parts of the bitmask defined incorrectly, but importantly
for this bug, we didn't know about these extra 2 bits that represent
the ref qualifier at all.

Differential Revision: https://reviews.llvm.org/D54667

llvm-svn: 347354
2018-11-20 22:13:43 +00:00
Zachary Turner b35e1d7dc3 [CodeView] Don't print PointerAttributes when dumping.
PointerAttributes is a bitwise-or of several other fields, each of
which is already printed on its own line with a better explanation.
So this doesn't really help much.

llvm-svn: 347275
2018-11-20 00:10:27 +00:00
David Blaikie 81959a2730 llvm-symbolizer: Avoid calling getFromOffset when the index entry is already available
Especially for symbolizer it can be efficient to have to search through
the entire index when it isn't needed - llvm-symbolizer looks up only a
few CUs & already has an index available in getUnitForEntry, once it's
passed down to DWARFUnitHeader::extract then there's no need for it to
call getFromOffset.

llvm-svn: 347134
2018-11-17 05:57:58 +00:00
Fangrui Song 7570932977 Use llvm::copy. NFC
llvm-svn: 347126
2018-11-17 01:44:25 +00:00
Simon Atanasyan 705fbd5d4f [DWARF] Use PRIx64 instead of 'x' to format 64-bit values
This is a follow-up to r346715. Use PRIx64 to formatted print of 64-bit
value in the `DWARFDebugLoclists::LocationList::dump` to escape problem
on big-endian hosts.

llvm-svn: 347049
2018-11-16 13:14:26 +00:00
Zachary Turner 03a24052f3 [NativePDB] Improved support for nested type reconstruction.
In a previous patch, we pre-processed the TPI stream in order to build
the reverse mapping from nested type -> parent type so that we could
accurately reconstruct a DeclContext hierarchy.

However, there were some issues. An LF_NESTTYPE record is really just a
typedef, so although it happens to be used to indicate the name of the
nested type and referring to the global record which defines the type,
it is also used for every other kind of nested typedef. When we rebuild
the DeclContext hierarchy, we want it to be as accurate as possible,
which means that if we have something like:

  struct A {
    struct B {};
    using C = B;
  };

We don't want to create two CXXRecordDecls in the AST each with the
exact same definition. We just want to create one for B and then
define C as an alias to B. Previously, however, it would not be able
to distinguish between the two cases and it would treat A::B and
A::C as being two classes each with separate definitions. We address
the first half of improving the pre-processing logic so that only
actual definitions are treated this way.

Later, in a followup patch, we can handle the case of nested
typedefs since we're already going to be enumerating the field list
anyway and this patch introduces the general framework for
distinguishing between the two cases.

Differential Revision: https://reviews.llvm.org/D54357

llvm-svn: 346786
2018-11-13 20:07:32 +00:00
Simon Atanasyan 22dc538618 [DWARF] Do not use PRIx32 for printing uint64_t values
The `DWARFDebugAddrTable::dump` routine prints 32/64-bits addresses.
These values are stored in a vector of `uint64_t` independently of their
original sizes. But `format` function gets format string with PRIx32
suffix in case of 32-bit address size. At least on MIPS 32-bit targets
that leads to incorrect output.

This patch changes formats strings and always use PRIx64 to print
`uint64_t` values.

Differential Revision: http://reviews.llvm.org/D54424

llvm-svn: 346715
2018-11-12 22:43:17 +00:00
David Blaikie 582a5ebce0 NFC: DebugInfo: Reduce scope of DebugOffset to simplify code
This was being used as a sort of indirect out parameter from shouldDump
- seems simpler to use it as the actual result of the call. (this does
mean using a pointer to an Optional & actually using all 3 states (null,
None, and present) which is, admittedly, a tad subtle - but given the
limited scope, seems OK to me - open to discussion though, if others
feel strongly about it)

llvm-svn: 346691
2018-11-12 18:53:28 +00:00
Fangrui Song 158b26213f [DWARF] Change pubnames to use DWARFSection instead of StringRef
Summary: The debug_info_offset values in .debug_{,gnu_}pub{name,types} may be relocated. Change it to DWARFSection so that we can get relocated values.

Reviewers: ruiu, dblaikie, grimar, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D54375

llvm-svn: 346615
2018-11-11 18:57:28 +00:00
Alexandre Ganea 4b2957243b [LLD] Fix Microsoft precompiled headers cross-compile on Linux
Differential revision: https://reviews.llvm.org/D54122

llvm-svn: 346403
2018-11-08 14:42:37 +00:00
Jorge Gorbe Moya bf1badb6bb Add parentheses to silence warning.
DWARFContext.cpp:356:20: error: using the result of an assignment as a condition without parentheses [-Werror,-Wparentheses]

llvm-svn: 346365
2018-11-07 22:30:01 +00:00
Paul Robinson 746c22389c [DWARFv5] Read and dump multiple .debug_info sections.
Type units go in .debug_info comdats, not .debug_types, in v5.

Differential Revision: https://reviews.llvm.org/D53907

llvm-svn: 346360
2018-11-07 21:39:09 +00:00
Fangrui Song 54d23a8eb7 [DWARF] Support types CU list in .gdb_index dumping
Some executables have non-empty types CU list and -gdb-index would report "<error reporting>" before.

llvm-svn: 346181
2018-11-05 23:27:53 +00:00
Alexandre Ganea 71c43ceaf8 [COFF][LLD] Add link support for Microsoft precompiled headers OBJs
This change allows for link-time merging of debugging information from
Microsoft precompiled types OBJs compiled with cl.exe /Z7 /Yc and /Yu.

This fixes llvm.org/PR34278

Differential Revision: https://reviews.llvm.org/D45213

llvm-svn: 346154
2018-11-05 19:20:47 +00:00
Wolfgang Pieb 5253cccbd5 [DWARF v5] Verifier: Add checks for DW_FORM_strx* forms.
Adding functionality to the DWARF verifier for DWARF v5 strx* forms which 
index into the string offsets table.

Differential Revision: https://reviews.llvm.org/D54049

llvm-svn: 346061
2018-11-03 00:27:35 +00:00
Fangrui Song 999570a2f4 [DWARF] Fix typo, .gnu_index -> .gdb_index
llvm-svn: 346039
2018-11-02 20:34:40 +00:00
Reid Kleckner 46ff186b29 [codeview] Add breaks to fix -Wimplicit-fallthrough
This is a minor bug fix. Previously, if you tried to encode the RSP
register on the x86 platform, that might have succeeded and been encoded
incorrectly. However, no existing producer or consumer passes the x86_64
registers when targeting x86_32.

llvm-svn: 345879
2018-11-01 19:36:29 +00:00