Commit Graph

896 Commits

Author SHA1 Message Date
Zachary Turner 0c60f269fc [CodeView] Provide a common interface for type collections.
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).

But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.

This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.

This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.

The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.

Differential Revision: https://reviews.llvm.org/D33293

llvm-svn: 303388
2017-05-18 23:03:06 +00:00
Zachary Turner 5a83fb153f Fix some minor issues in PDB parsing library.
1) Until now I'd never seen a valid PDB where the DBI stream and
   the PDB Stream disagreed on the "Age" field.  Because of that,
   we had code to assert that they matched.  Recently though I was
   given a PDB where they disagreed, so this assumption has proven
   to be incorrect.  Remove this check.

2) We were walking the entire list of hash values for types up front
   and then throwing away the values.  For large PDBs this was a
   significant slow down.  Remove this.

With this patch, I can dump the list of all compilands from a
1.5GB PDB file in just a few seconds.

llvm-svn: 303351
2017-05-18 15:14:44 +00:00
George Rimar 47f84b1a3c [DWARF] - Simplify RelocVisitor implementation.
We do not need to store relocation width field.
Patch removes relative code, that simplifies implementation.

Differential revision: https://reviews.llvm.org/D33274

llvm-svn: 303335
2017-05-18 08:25:11 +00:00
George Rimar f98b9ac5da [lib/Object] - Minor API update for llvm::Decompressor.
I revisited Decompressor API (issue with it was triggered during D32865 review)
and found it is probably provides more then we really need.

Issue was about next method's signature:

Error decompress(SmallString<32> &Out);
It is too strict. At first I wanted to change it to decompress(SmallVectorImpl<char> &Out),
but then found it is still not flexible because sticks to SmallVector.

During reviews was suggested to use templating to simplify code. Patch do that.

Differential revision: https://reviews.llvm.org/D33200

llvm-svn: 303331
2017-05-18 08:00:01 +00:00
Bob Haarman de33a63784 [llvm-pdbdump] in yaml2pdb, generate default output filename if none given
Summary:
llvm-pdbdump yaml2pdb used to fail with a misleading error
message ("An I/O error occurred on the file system") if no output file
was specified. This change adds an assert to PDBFileBuilder to check
that an output file name is specified, and makes llvm-pdbdump generate
an output file name based on the input file name if no output file
name is explicitly specified.

Reviewers: amccarth, zturner

Reviewed By: zturner

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D33296

llvm-svn: 303299
2017-05-17 20:46:48 +00:00
Zachary Turner 1d795c451e [CodeView] Simplify the use of visiting type records & streams.
There is often a lot of boilerplate code required to visit a type
record or type stream.  The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine.  Currently this
requires at least 6 lines of code:

  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);

  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));

With this patch, it becomes one line of code:

  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));

This is done by having the deserialization happen internally inside
of the visitTypeRecord function.  Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.

Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.

Differential Revision: https://reviews.llvm.org/D33245

llvm-svn: 303271
2017-05-17 16:39:06 +00:00
George Rimar fed9f09f48 [DWARF] - Cleanup relocations proccessing.
RelocAddrMap was a pair of <width, address>, where width is relocation size (4/8/x, x < 8), 
and width field was never used in code.

Relocations proccessing loop had checks for width field. Does not look like DWARF parser
should do that. There is probably no much sense to validate relocations during proccessing 
them in parser.

Patch removes relocation's width relative code from DWARFContext.

Differential revision: https://reviews.llvm.org/D33194

llvm-svn: 303251
2017-05-17 12:10:51 +00:00
George Rimar 41e656768d [DWARF] - Add RelocAddrEntry for cleanup. NFCi.
Was mentioned as possible cleanup during review of D33184.

llvm-svn: 303171
2017-05-16 14:05:45 +00:00
George Rimar 4671f2e08c [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places were shitched to use DWARFAddressRange now.

Suggested during review of D33184.

llvm-svn: 303163
2017-05-16 12:30:59 +00:00
George Rimar 3824cca7b3 Revert r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector."
Something went wrong, it broke BB.
http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental_build/38477/consoleFull#-200034420049ba4694-19c4-4d7e-bec5-911270d8a58c

llvm-svn: 303162
2017-05-16 12:05:03 +00:00
George Rimar 8680b6ee9c [DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Suggested during review of D33184.

llvm-svn: 303159
2017-05-16 11:54:19 +00:00
George Rimar 958b01aa69 [DWARF] - Speedup handling of relocations in DWARFContextInMemory.
I am working on a speedup of building .gdb_index in LLD and 
noticed that relocations that are proccessed in DWARFContextInMemory often uses
the same symbol in a row. This patch introduces caching to reduce the relocations
proccessing time.

For benchmark,
I took debug LLC binary objects configured with -ggnu-pubnames and linked it using LLD.

Link time without --gdb-index is about 4,45s.
Link time with --gdb-index: a) Without patch: 19,16s b) With patch: 15,52s
That means time spent on --gdb-index in this configuration is 
19,16s - 4,45s = 14,71s (without patch) vs 15,52s - 4,45s = 11,07s (with patch).

Differential revision: https://reviews.llvm.org/D31136

llvm-svn: 303051
2017-05-15 11:45:28 +00:00
Zachary Turner dd3a739d52 [CodeView] Add a random access type visitor.
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans.  This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.

Differential Revision: https://reviews.llvm.org/D33009

llvm-svn: 302936
2017-05-12 19:18:12 +00:00
Wolfgang Pieb 15fa44698c [DWARF] Fix a parsing issue with type unit headers.
Reviewers: dblaikie

Differential Revision: https://reviews.llvm.org/D32987

llvm-svn: 302574
2017-05-09 19:38:38 +00:00
Aaron Ballman f22f885b66 Removing a file that is not necessary (and was causing link diagnostics with MSVC 2015); NFC.
llvm-svn: 302531
2017-05-09 14:22:48 +00:00
Diana Picus e8da53f4e0 Revert "[Dwarf] Disable reference verification for now (PR32972)"
This reverts commit r302520 because it break the unit tests.

llvm-svn: 302524
2017-05-09 13:05:43 +00:00
Renato Golin 94d6c8fb36 [Dwarf] Disable reference verification for now (PR32972)
There is no other explanation about why this only started happening
now, even though it crashes on old code (supposedly reachable from
here).

The only common factor between the failing bots is that they use GCC
(4.9 and 5.3) to compile Clang, while the others use Clang 3.8, but the
failure is while building the tests, as an assertion, on Clang.

Commenting it out for now in hope the bots will go back green, but we
should keep looking for the real cause, and update bugzilla.

llvm-svn: 302520
2017-05-09 12:36:50 +00:00
Greg Clayton 58a2e0d90b Add const to "DWARFDie &Die" in a few functions as they can't change the DWARFDie.
llvm-svn: 302471
2017-05-08 21:29:17 +00:00
Eugene Zemtsov 3b52dbd934 Fix typo
llvm-svn: 302470
2017-05-08 21:20:53 +00:00
Greg Clayton 5404f114d3 Fix typo "veify" to "verify".
llvm-svn: 302466
2017-05-08 20:53:00 +00:00
Zachary Turner 1dacb24222 [CodeView] Add support for random access type visitors.
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record.  This works fine for situations like dumping,
but not when you want to visit types in random order.  For example,
in a debug session someone might lookup a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.

In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline.  This
patch adds such a mode.  In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.

Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.

Differential Revision: https://reviews.llvm.org/D32928

llvm-svn: 302454
2017-05-08 18:38:43 +00:00
Zachary Turner 8c74673388 [CodeView] Reserve TypeDatabase records up front.
Most of the time we know exactly how many type records we
have in a list, and we want to use the visitor to deserialize
them into actual records in a database.  Previously we were
just using push_back() every time without reserving the space
up front in the vector.  This is obviously terrible from a
performance standpoint, and it's not uncommon to have PDB
files with half a million type records, where the performance
degredation was quite noticeable.

llvm-svn: 302302
2017-05-05 22:02:37 +00:00
George Rimar 2122ff64c6 [llvm-dwarfdump] - Print an error message if section decompression failed.
llvm-dwarfdump currently prints no message if decompression fails 
for some reason. I noticed that during work on one of LLD patches 
where LLD produced an broken output. It was a bit confusing to see
no output for section dumped and no any error message at all.

Patch adds error message for such cases.

Differential revision: https://reviews.llvm.org/D32865

llvm-svn: 302221
2017-05-05 10:52:39 +00:00
Zachary Turner bedc85fb4b [pdb] Don't verify TPI hash values up front.
Verifying the hash values as we are currently doing
results in iterating every type record before the user
even tries to access the first one, and the API user
has no control over, or ability to hook into this
process.

As a result, when the user wants to iterate over types
to print them or index them, this results in a second
iteration over the same list of types.  When there's
upwards of 1,000,000 type records, this is obviously
quite undesirable.

This patch raises the verification outside of TpiStream
, and llvm-pdbdump hooks a hash verification visitor
into the normal dumping process.  So we still verify
the hash records, but we can do it while not requiring
a second iteration over the type stream.

Differential Revision: https://reviews.llvm.org/D32873

llvm-svn: 302206
2017-05-04 23:53:54 +00:00
Zachary Turner 1eb9a0297c [PDB] Don't build the entire source file list up front.
I tried to run llvm-pdbdump on a very large (~1.5GB) PDB to
try and identify show-stopping performance problems.  This
patch addresses the first such problem.

When loading the DBI stream, before anyone has even tried to
access a single record, we build an in memory map of every
source file for every module.  In the particular PDB I was
using, this was over 85 million files.  Specifically, the
complexity is O(m*n) where m is the number of modules and
n is the average number of source files (including headers)
per module.

The whole reason for doing this was so that we could have
constant time access to any module and any of its source
file lists.  However, we can still get O(1) access to the
source file list for a given module with a simple O(m)
precomputation, and access to the list of modules is
already O(1) anyway.

So this patches reduces the O(m*n) up-front precomputation
to an O(m) one, where n is ~6,500 and n*m is about 85 million
in my pathological test case.

Differential Revision: https://reviews.llvm.org/D32870

llvm-svn: 302205
2017-05-04 23:53:29 +00:00
Greg Clayton 48ff66a280 Don't return an invalid line table if the DW_AT_stmt_list value is not in the .debug_line section.
llvm-svn: 302180
2017-05-04 18:29:44 +00:00
Paul Robinson ae2e6f37f3 clang-format and restyle DWARFFormValue before working on it. NFC
llvm-svn: 302086
2017-05-03 21:53:21 +00:00
Zachary Turner 4f145b2a59 Remove unused private field.
llvm-svn: 302069
2017-05-03 19:42:06 +00:00
Greg Clayton c5b2d561e8 Break verification down into smaller functions to keep code clean.
Adrian requested that we break things down to make things clean in the DWARFVerifier. This patch breaks everything down into nice individual functions and cleans up the code quite a bit and prepares us for the next round of verifiers.

Differential Revision: https://reviews.llvm.org/D32812

llvm-svn: 302062
2017-05-03 18:25:46 +00:00
Davide Italiano 2e23ce4cad [CodeView] Remove constructor initialization of a removed field.
I should've staged this with my last commit.

llvm-svn: 302059
2017-05-03 18:02:46 +00:00
Zachary Turner cf468d86f3 [CodeView] Use actual strings for dealing with checksums and lines.
The raw CodeView format references strings by "offsets", but it's
confusing what table the offset refers to.  In the case of line
number information, it's an offset into a buffer of records,
and an indirection is required to get another offset into a
different table to find the final string.  And in the case of
checksum information, there is no indirection, and the offset
refers directly to the location of the string in another buffer.

This would be less confusing if we always just referred to the
strings by their value, and have the library be smart enough
to correctly resolve the offsets on its own from the right
location.

This patch makes that possible.  When either reading or writing,
all the user deals with are strings, and the library does the
appropriate translations behind the scenes.

llvm-svn: 302053
2017-05-03 17:11:40 +00:00
Zachary Turner 2d5c2cd3ce [llvm-readobj] Update readobj to re-use parsing code.
llvm-readobj hand rolls some CodeView parsing code for string
tables, so this patch updates it to re-use some of the newly
introduced parsing code in LLVMDebugInfoCodeView.

Differential Revision: https://reviews.llvm.org/D32772

llvm-svn: 302052
2017-05-03 17:11:11 +00:00
Greg Clayton b8c162b53c Create DWARFVerifier.cpp and .h and move all DWARF verification code over into it.
Adrian requested we create a DWARFVerifier.cpp file to contain all of the DWARF verification stuff. This change simply moves the functionality over into DWARFVerifier.h and DWARFVerifier.cpp, renames the DWARFVerifier methods to start with lower case, and switches DWARFContext.cpp over to using the new functionality.

Differential Revision: https://reviews.llvm.org/D32809

llvm-svn: 302044
2017-05-03 16:02:29 +00:00
Zachary Turner c504ae3cef Resubmit r301986 and r301987 "Add codeview::StringTable"
This was reverted due to a "missing" file, but in reality
what happened was that I renamed a file, and then due to
a merge conflict both the old file and the new file got
added to the repository.  This led to an unused cpp file
being in the repo and not referenced by any CMakeLists.txt
but #including a .h file that wasn't in the repo.  In an
even more unfortunate coincidence, CMake didn't report the
unused cpp file because it was in a subdirectory of the
folder with the CMakeLists.txt, and not in the same directory
as any CMakeLists.txt.

The presence of the unused file was then breaking certain
tools that determine file lists by globbing rather than
by what's specified in CMakeLists.txt

In any case, the fix is to just remove the unused file from
the patch set.

llvm-svn: 302042
2017-05-03 15:58:37 +00:00
Greg Clayton 8df55b43e1 Verify that no compile units share the same line table in "llvm-dwarfdump --verify"
Check to make sure no compile units have the same DW_AT_stmt_list values. Report a verification error if they do.

Differential Revision: https://reviews.llvm.org/D32771

llvm-svn: 302039
2017-05-03 15:45:31 +00:00
Daniel Jasper dff096f217 Revert r301986 (and subsequent r301987).
The patch is failing to add StringTableStreamBuilder.h, but that isn't
even discovered because the corresponding StringTableStreamBuilder.cpp
isn't added to any CMakeLists.txt file and thus never built. I think
this patch is just incomplete.

llvm-svn: 302002
2017-05-03 07:29:25 +00:00
Zachary Turner 59e83892e0 Fix use after free in BinaryStream library.
This was reported by the ASAN bot, and it turned out to be
a fairly fundamental problem with the design of VarStreamArray
and the way it passes context information to the extractor.

The fix was cumbersome, and I'm not entirely pleased with it,
so I plan to revisit this design in the future when I'm not
pressed to get the bots green again.  For now, this fixes
the issue by storing the context information by value instead
of by reference, and introduces some impossibly-confusing
template magic to make things "work".

llvm-svn: 301999
2017-05-03 05:34:00 +00:00
Zachary Turner 67736594f7 Fix type conversion error.
llvm-svn: 301987
2017-05-02 23:41:51 +00:00
Zachary Turner 7dba20bd2b Make codeview::StringTable.
Previously we had knowledge of how to serialize and deserialize
a string table inside of DebugInfo/PDB, but the string table
that it serializes contains a piece that is actually considered
CodeView and can appear outside of a PDB.  We already have logic
in llvm-readobj and MCCodeView to read and write this format,
so it doesn't make sense to duplicate the logic in DebugInfoPDB
as well.

This patch makes codeview::StringTable (for writing) and
codeview::StringTableRef (for reading), updates DebugInfoPDB
to use these classes for its own writing, and updates llvm-readobj
to additionally use StringTableRef for reading.

It's a bit more difficult to get MCCodeView to use this for
writing, but it's a logical next step.

llvm-svn: 301986
2017-05-02 23:36:17 +00:00
Greg Clayton 6707046f90 Add line table verification to lldb-dwarfdump --verify
This patch verifies the .debug_line:
- verify all addresses in a line table sequence have ascending addresses
- verify that all line table file indexes are valid

Unit tests added for both cases.

Differential Revision: https://reviews.llvm.org/D32765

llvm-svn: 301984
2017-05-02 22:48:52 +00:00
Paul Robinson 2bc3873fe6 [DWARFv5] Parse new line-table header format.
The directory and file tables now have form-based content descriptors.
Parse these and extract the per-directory/file records based on the
descriptors.  For now we support only DW_FORM_string (inline) for the
path names; follow-up work will add support for indirect forms (i.e.,
DW_FORM_strp, strx<N>, and line_strp).

Differential Revision: http://reviews.llvm.org/D32713

llvm-svn: 301978
2017-05-02 21:40:47 +00:00
Greg Clayton c7695a8e45 Verify that all references point to actual DIEs in "llvm-dwarfdump --verify"
LTO and other fancy linking previously led to DWARF that contained invalid references. We already validate that CU relative references fall into the CU, and the DW_FORM_ref_addr references fall inside the .debug_info section, but we didn't validate that the references pointed to correct DIE offsets. This new verification will ensure that all references refer to actual DIEs and not an offset in between.

This caught a bug in DWARFUnit::getDIEForOffset() where if you gave it any offset, it would match the DIE that mathes the offset _or_ the next DIE. This has been fixed.

Differential Revision: https://reviews.llvm.org/D32722

llvm-svn: 301971
2017-05-02 20:28:33 +00:00
Zachary Turner e204a6c9a3 Rename pdb::StringTable -> pdb::PDBStringTable.
With the forthcoming codeview::StringTable which a pdb::StringTable
would hold an instance of as one member, this ambiguity becomes
confusing.  Rename to PDBStringTable to avoid this.

llvm-svn: 301948
2017-05-02 18:00:13 +00:00
Paul Robinson ba1c91564b Make DWARFDebugLine use StringRef for directory/file tables. NFC
Differential Revision: http://reviews.llvm.org/D32728

llvm-svn: 301940
2017-05-02 17:37:32 +00:00
Zachary Turner edef14510e [PDB/CodeView] Read/write codeview inlinee line information.
Previously we wrote line information and file checksum
information, but we did not write information about inlinee
lines and functions.  This patch adds support for that.

llvm-svn: 301936
2017-05-02 16:56:09 +00:00
Paul Robinson 9d4eb6922e Stylistic makeover of DWARFDebugLine before working on it. NFC
Rename parameters and locals to CamelCase, doxygenize the header, and
run clang-format on the whole thing.

llvm-svn: 301883
2017-05-01 23:27:55 +00:00
Zachary Turner 8a2ebfb1cd [CodeView] Write CodeView line information.
Differential Revision: https://reviews.llvm.org/D32716

llvm-svn: 301882
2017-05-01 23:27:42 +00:00
Greg Clayton 48432cfbeb Adds initial llvm-dwarfdump --verify support with unit tests.
lldb-dwarfdump gets a new "--verify" option that will verify a single file's DWARF debug info and will print out any errors that it finds. It will return an non-zero exit status if verification fails, and a zero exit status if verification succeeds. Adding the --quiet option will suppress any output the STDOUT or STDERR.

The first part of the verify does the following:

- verifies that all CU relative references (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata) have valid CU offsets
- verifies that all DW_FORM_ref_addr references have valid .debug_info offsets
- verifies that all DW_AT_ranges attributes have valid .debug_ranges offsets
- verifies that all DW_AT_stmt_list attributes have valid .debug_line offsets
- verifies that all DW_FORM_strp attributes have valid .debug_str offsets

Unit tests were added for each of the above cases.

Differential Revision: https://reviews.llvm.org/D32707

llvm-svn: 301844
2017-05-01 22:07:02 +00:00
Zachary Turner 7cc13e557c [PDB/CodeView] Rename some classes.
In preparation for introducing writing capabilities for each of
these classes, I would like to adopt a Foo / FooRef naming
convention, where Foo indicates that the class can manipulate and
serialize Foos, and FooRef indicates that it is an immutable view of
an existing Foo.  In other words, Foo is a writer and FooRef is a
reader.  This patch names some existing readers to conform to the
FooRef convention, while offering no functional change.

llvm-svn: 301810
2017-05-01 16:46:39 +00:00
Zachary Turner 5b6e4e0aed [llvm-pdbdump] Abstract some of the YAML/Raw printing code.
There is a lot of duplicate code for printing line info between
YAML and the raw output printer.  This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.

llvm-svn: 301728
2017-04-29 01:13:21 +00:00