This is a stub for a new concrete implementation of IPDBRawSymbol.
Nothing uses this uses this implementation yet. My plan is to
locally switch lldb-pdbdump from the DIA reader to the Native one
and flesh out the implementations of these method stubs in the order
they're needed.
llvm-svn: 294633
This is not a list of pairs, it is a hash table data structure. We now
correctly parse this out and dump it from llvm-pdbdump.
We still need to understand the conditions that lead to a type
getting an entry in the hash adjuster table. That will be done
in a followup investigation / patch.
Differential Revision: https://reviews.llvm.org/D29090
llvm-svn: 293090
While the builder pattern has proven useful for certain other
larger types, in this case it was hampering the ability to use
the data structure, as for runtime access we need a map that
we can efficiently read from and write to. So the two are merged
into a single data structure that can efficiently be read to,
written from, deserialized from bytes, and serialized to bytes.
llvm-svn: 292664
This was being parsed / serialized ad-hoc inside the code
for a specific PDB stream. But this data structure is used
in multiple ways / places within the PDB format. To be able
to re-use it we need to raise this code out and make it more
generic. In doing so, a number of bugs are fixed in the
original implementation, and support is added for growing
the hash table and deleting items from the hash table,
which had either been omitted or incorrect implemented in
the initial version.
Differential Revision: https://reviews.llvm.org/D28715
llvm-svn: 292535
This patch adds a new class NameHashTableBuilder which creates /names streams.
This patch contains a test to confirm that a stream created by
NameHashTableBuilder can be read by NameHashTable reader class.
Differential Revision: https://reviews.llvm.org/D28707
llvm-svn: 292040
Add an explicit LLVM_ENABLE_DIA_SDK option to control building support
for DIA SDK-based debugging. Control its value to match whether DIA SDK
support was found and expose it in LLVMConfig (alike LLVM_ENABLE_ZLIB).
Its value is needed for LLDB to determine whether to run tests requiring
DIA support. Currently it is obtained from llvm/Config/config.h;
however, this file is not available for standalone builds. Following
this change, LLDB will be modified to use the value from LLVMConfig.
Differential Revision: https://reviews.llvm.org/D26255
llvm-svn: 290818
Long is not the same size across a number of the platforms we support.
Use unsigned int here instead, it is more appropriate because
overflow/wrap-around is possible and, in this case, expected.
llvm-svn: 290068
Summary: The code we use to read PDBs assumed that streams we ask it to read exist, and would read memory outside a vector and crash if this wasn't the case. This would, for example, cause llvm-pdbdump to crash on PDBs generated by lld. This patch handles such cases more gracefully: the PDB reading code in LLVM now reports errors when asked to get a stream that is not present, and llvm-pdbdump will report missing streams and continue processing streams that are present.
Reviewers: ruiu, zturner
Subscribers: thakis, amccarth
Differential Revision: https://reviews.llvm.org/D27325
llvm-svn: 288722
PDBFileBuilder supports two different ways to create files.
One is PDBFileBuilder::commit. That function takes a filename
and write a result to the file. The other is PDBFileBuilder::build.
That returns a new PDBFile object.
This patch removes the latter because no one is using it and
in a real life situation we are very unlikely to need it.
Even if you need it, it'd be easy to write a new PDB to a memory
buffer and read it back.
Removing PDBFileBuilder::build enables us to remove other classes
build transitively.
Differential Revision: https://reviews.llvm.org/D26987
llvm-svn: 287697
This is required by DbiStream, but DbiStreamBuilder didn't align
these substreams, so the output of DbiSTreamBuilder couldn't be
read by DbiStream.
Test will be added to LLD.
llvm-svn: 287067
These numbers are intended to be capped at 65535, but
`std::max<uint16_t>(UINT16_MAX, N)` always returns N for any N because
the expression is the same as `std::max((uint16_t)UINT16_MAX, (uint16_t)N)`.
llvm-svn: 287060
This patch defines a new function to add a SectionContribs stream
to a PDB file. Unlike SectionMap, SectionContribs contains a list
of input sections as opposed to output sections.
Note that this patch needs improving because currently we do not
set Module field in SectionContribs entries. In a follow-up patch,
I'll add Modules and then fix it after that.
Differential Revision: https://reviews.llvm.org/D26210
llvm-svn: 286677
This change enables LLD to construct a Section Map stream in a PDB file.
I do not understand all these fields in the Section Map yet, but it seems
like a copy of a COFF section header in another format.
With this patch, DbiStreamBuilder can emit a Section Map which
llvm-pdbdump can dump.
Differential Revision: https://reviews.llvm.org/D26112
llvm-svn: 285606
Summary: This adds support for dumping the globals stream from PDB files using llvm-pdbdump, similar to the support we have for the publics stream.
Reviewers: ruiu, zturner
Subscribers: beanz, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D25801
llvm-svn: 284861
This is just a quick utility handy for getting rough summaries of types
in a given object or dwo file. I've been using it to investigate the
amount of type info redundancy across a project build, for example.
llvm-svn: 284537
The previous commit was failing because we filled empty slots of
the debug stream index with kInvalidStreamIndex. It should've been 0.
llvm-svn: 283925
Previously, there is no way to create a stream other than pre-defined
special stream such as DBI or IPI. This patch adds a new method,
addDbgStream, to add a debug stream to a PDB file.
Differential Revision: https://reviews.llvm.org/D25356
llvm-svn: 283823
This is the first step towards round-tripping symbol information,
and thusly being able to write symbol information to a PDB.
This patch writes the symbol information for each compiland to
the Yaml when running in pdb2yaml mode. There's still some loose
ends, such as what to do about relocations (necessary in order to
print linkage names), how to print enums with friendly names, and
how to give the dumper access to the StringTable, but this is a
good first start.
llvm-svn: 283641
When we create a PDB file using PDBFileBuilder, the information
in the superblock, such as the size of the resulting file, is not
available.
Previously, PDBFileBuilder::initialize took a superblock assuming
that all the members of the struct are correct. That is useful when
you want to restore the exact information from a YAML file, but
that's probably the only use case in which that is useful.
When we are creating a PDB file on the fly, we have to backfill the
members.
This patch redefines PDBFileBuilder::initialize to take only a
block size. Now all the other members are left as default values,
so that they'll be updated when commit() is called.
Differential Revision: https://reviews.llvm.org/D25108
llvm-svn: 282944
WritableStream needs the exact file size to open a file, but
until we fix the final layout of a PDB file, we don't know the
size of the file.
This patch changes the parameter type of PDBFileBuilder::commit
to solve that chiecken-and-egg problem. Now the function opens
a file after fixing the layout, so it can create a file with the
exact size.
Differential Revision: https://reviews.llvm.org/D25107
llvm-svn: 282940
The IPI stream is structurally identical to the TPI stream, but it
contains different record types. So we just re-use the TPI writing
code.
llvm-svn: 281638
We were inadvertently adding the size of the hash value stream to
the size of the TPI stream, even though the hash value stream is
an entirely separate stream.
llvm-svn: 281636
The `CVType` had two redundant fields which were confusing and
error-prone to fill out. By treating member records as a distinct
type from leaf records, we are able to simplify this quite a bit.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24432
llvm-svn: 281556
This completes being able to write all the interesting
values of a PDB TPI stream.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24370
llvm-svn: 281555
This simplifies a lot of code, and will actually be necessary for
an upcoming patch to serialize TPI record hash values.
The idea before was that visitors should be examining records, not
modifying them. But this is no longer true with a visitor that
constructs a CVRecord from Yaml. To handle this until now, we
were doing some fixups on CVRecord objects at a higher level, but
the code is really awkward, and it makes sense to just have the
visitor write the bytes into the CVRecord. In doing so I uncovered
a few bugs related to `Data` and `RawData` and fixed those.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24362
llvm-svn: 281067
This writes the full sequence of type records described in
Yaml to the TPI stream of the PDB file.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24316
llvm-svn: 281063
Previously we were assuming that any visitation of types would
necessarily be against a type we had binary data for. Reasonable
assumption when were just reading PDBs and dumping them, but once
we start writing PDBs from Yaml this breaks down, because we have
no binary data yet, only Yaml, and from that we need to read the
record kind and perform the switch based on that.
So this patch does that. Instead of having the visitor switch
on the kind that is already in the CVType record, we change the
visitTypeBegin() method to return the Kind, and switch on the
returned value. This way, the default implementation can still
return the value from the CVType, but the implementation which
visits Yaml records and serializes binary PDB type records can
use the field in the Yaml as the source of the switch.
llvm-svn: 280307
We were kind of hacking this together before by embedding the
ability to forward requests into the TypeDeserializer. When
we want to start adding more different kinds of visitor callback
interfaces though, this doesn't scale well and is very inflexible.
So introduce the notion of a pipeline, which itself implements
the TypeVisitorCallbacks interface, but which contains an internal
list of other callbacks to invoke in sequence.
Also update the existing uses of CVTypeVisitor to use this new
pipeline class for deserializing records before visiting them
with another visitor.
llvm-svn: 280293
Until now, our use case for the visitor has been to take a stream of bytes
representing a type stream, deserialize the records in sequence, and do
something with them, where "something" is determined by how the user
implements a particular set of callbacks on an abstract class.
For actually writing PDBs, however, we want to do the reverse. We have
some kind of description of the list of records in their in-memory format,
and we want to process each one. Perhaps by serializing them to a byte
stream, or perhaps by converting them from one description format (Yaml)
to another (in-memory representation).
This was difficult in the current model because deserialization and
invoking the callbacks were tightly coupled.
With this patch we change this so that TypeDeserializer is itself an
implementation of the particular set of callbacks. This decouples
deserialization from the iteration over a list of records and invocation
of the callbacks. TypeDeserializer is initialized with another
implementation of the callback interface, so that upon deserialization it
can pass the deserialized record through to the next set of callbacks. In
a sense this is like an implementation of the Decorator design pattern,
where the Deserializer is a decorator.
This will be useful for writing Pdbs from yaml, where we have a
description of the type records in Yaml format. In this case, the visitor
implementation would have each visitation callback method implemented in
such a way as to extract the proper set of fields from the Yaml, and it
could maintain state that builds up a list of these records. Finally at
the end we can pass this information through to another set of callbacks
which serializes them into a byte stream.
Reviewed By: majnemer, ruiu, rnk
Differential Revision: https://reviews.llvm.org/D23177
llvm-svn: 277871
pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit
without calling DbiStreamBuilder::finalize. Because `finalize` initializes
`Header` member, `Header` remained nullptr which caused a crash bug.
Differential Revision: https://reviews.llvm.org/D23143
llvm-svn: 277681
MappedBlockSTream can work with any sequence of block data where
the ordering is specified by a list of block numbers. So rather
than manually stitch them together in the case of the FPM, reuse
this functionality so that we can treat the FPM as if it were
contiguous.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23066
llvm-svn: 277609
The FPM is split at regular intervals across the MSF file, as the MS code
suggests. It turns out that the value of the interval is precisely the
block size. If the block size is 4096, then there are two Fpm pages every
4096 blocks.
So here we teach the PDBFile class to parse a split FPM, and also add more
options when dumping the FPM to display some additional information such
as orphaned pages (pages which the FPM says are allocated, but which
nothing appears to use), use after free pages (pages which the FPM says
are not allocated, but which are referenced by a stream), and multiple use
pages (pages which the FPM says are allocated but are used more than
once).
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23022
llvm-svn: 277388
Previously this change was submitted from a Windows machine, so
changes made to the case of filenames and directory names did
not survive the commit, and as a result the CMake source file
names and the on-disk file names did not match on case-sensitive
file systems.
I'm resubmitting this patch from a Linux system, which hopefully
allows the case changes to make it through unfettered.
llvm-svn: 277213
In a previous patch, it was suggested to use all caps instead of
rolling caps for initialisms, so this patch changes everything
to do this.
llvm-svn: 277190
This was a pure virtual base class whose purpose was to abstract
away the notion of how you retrieve the layout of a discontiguous
stream of blocks in an Msf file. This led to too many layers of
abstraction making it difficult to figure out what was going on
and extend things. Ultimately, a stream's layout is decided by
its length and the array of block numbers that it lives on. So
rather than have an abstract base class which can return this in
any number of ways, it's more straightforward to simply store them
as fields of a trivial struct, and also to give a more appropriate
name.
This patch does that. It renames IMsfStreamData to MsfStreamLayout,
and deletes the 2 concrete implementations, DirectoryStreamData
and IndexedStreamData. MsfStreamLayout is a trivial struct
with the necessary data.
llvm-svn: 277018
Previously it was storing all the fields of an msf::Layout as
separate members. This is a trivial cleanup to make it store
an msf::Layout directly. This makes the code more readable
since it becomes clear which fields of PDBFile are actually the
msf specific layout information in a sea of other bookkeeping
fields.
llvm-svn: 276460
This makes it easier to have the writable and readable PDB
interfaces share code since the read/write and write-only
interfaces now share a single allocator, you don't have to worry
about a builder building a read only interface and then having
the read-only interface's data become corrupt when the builder
goes out of scope. Now the allocator is specified explicitly
to all constructors, so all interfaces can share a single allocator
that is scoped appropriately.
llvm-svn: 276459
This provides a better layering of responsibilities among different
aspects of PDB writing code. Some of the MSF related code was
contained in CodeView, and some was in PDB prior to this. Further,
we were often saying PDB when we meant MSF, and the two are
actually independent of each other since in theory you can have
other types of data besides PDB data in an MSF. So, this patch
separates the MSF specific code into its own library, with no
dependencies on anything else, and DebugInfoCodeView and
DebugInfoPDB take dependencies on DebugInfoMsf.
llvm-svn: 276458
This facilitates code reuse between the builder classes and the
"frozen" read only versions of the classes used for parsing
existing PDB files.
llvm-svn: 276427
This implements support for writing compiland and compiland source
file info to a binary PDB. This is tested by adding support for
dumping these fields from an existing PDB to yaml, reading them
back in, and dumping them again and verifying the values are as
expected.
llvm-svn: 276426
Block 1 and 2 of an MSF file are bit vectors that represent the
list of blocks allocated and free in the file. We had been using
these blocks to write stream data and other data, so we mark them
as the free page map now. We don't yet serialize these pages to
the disk, but at least we make a note of what it is, and avoid
writing random data to them.
Doing this also necessitated cleaning up some of the tests to be
more general and hardcode fewer values, which is nice.
llvm-svn: 275629
Previously we would read a PDB, then write some of it back out,
but write the directory, super block, and other pertinent metadata
back out unchanged. This generates incorrect PDBs since the amount
of data written was not always the same as the amount of data read.
This patch changes things to use the newly introduced `MsfBuilder`
class to write out a correct and accurate set of Msf metadata for
the data *actually* written, which opens up the door for adding and
removing type records, symbol records, and other types of data to
an existing PDB.
llvm-svn: 275627
Some abstractions in LLVM "know" that they are reading in-bounds,
FixedStreamArray, and provide a simple result. This breaks down if the
stream map is bogus.
llvm-svn: 275010
This issue was encountered on libcmt.pdb, which has a type record that
looks like this:
Struct (0x1094) {
TypeLeafKind: LF_STRUCTURE (0x1505)
MemberCount: 3
Properties [ (0x200)
HasUniqueName (0x200)
]
FieldList: <field list> (0x1093)
DerivedFrom: 0x0
VShape: 0x0
SizeOf: 4
Name: <unnamed-tag>
LinkageName: .?AU<unnamed-tag>@@
}
The checks for startswith/endswith "<unnamed-tag>" should look at the
display name, not the linkage name.
llvm-svn: 274376
Somehow all the functionality to write PDB files got removed,
probably accidentally when uploading the patch perhaps the wrong
one got uploaded. This re-adds all the code, as well as the
corresponding test.
llvm-svn: 274248
This allows better catching of compiler errors since we can use
the override keyword to verify that methods are actually
overridden.
Also in this patch I've changed from storing a boolean Error
code everywhere to returning an llvm::Error, to propagate richer
error information up the call stack.
Reviewed By: ruiu, rnk
Differential Revision: http://reviews.llvm.org/D21410
llvm-svn: 272926
Both parameters to visitTypeBegin are actually members of CVRecord,
so we can just pass CVRecord instead of destructuring it.
Differential Revision: http://reviews.llvm.org/D21435
llvm-svn: 272899
This reverts commit 879139e1c6577b09df52de56a6bab856a19ed185.
This was committed accidentally when I blindly typed git svn
dcommit instead of the command to generate a patch.
llvm-svn: 272693
This fixes an alignment issue by forcing all cached allocations
to be 8 byte aligned, and also fixes an issue arising on big
endian systems by writing ulittle32_t's instead of uint32_t's
in the test.
llvm-svn: 272437
This is the next step towards being able to write PDBs.
MemoryBuffer is immutable, and StreamInterface is our replacement
which can be any combination of read-only, read-write, or write-only
depending on the particular implementation.
The one place where we were creating a PDBFile (in RawSession) is
updated to subclass ByteStream with a simple adapter that holds
a MemoryBuffer, and initializes the superclass with the buffer's
array, so that all the functionality of ByteStream works
transparently.
llvm-svn: 272370
This adds method and tests for writing to a PDB stream. With
this, even a PDB stream which is discontiguous can be treated
as a sequential stream of bytes for the purposes of writing.
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21157
llvm-svn: 272369
TPI hash table contains a parallel array for the type records.
For each type record R, a hash value is calculated by `H(R) % NumBuckets`
where H is a hash function, and the result is stored to a bucket element.
H is TPI1::hashPrec function in microsoft-pdb repository.
Our hash function does not support all type record types yet.
Currently it supports only records for line number.
I'll extend it in a follow up patch.
The aim of verify the hash table is not only detect corrupted files.
It ensures that our understanding of how the hash values are calculated
is correct.
llvm-svn: 272229
In order to efficiently write PDBs, we need to be able to make a
StreamWriter class similar to a StreamReader, which can transparently deal
with writing to discontiguous streams, and we need to use this for all
writing, similar to how we use StreamReader for all reading.
Most discontiguous streams are the typical numbered streams that appear in
a PDB file and are described by the directory, but the exception to this,
that until now has been parsed by hand, is the directory itself.
MappedBlockStream works by querying the directory to find out which blocks
a stream occupies and various other things, so naturally the same logic
could not possibly work to describe the blocks that the directory itself
resided on.
To solve this, I've introduced an abstraction IPDBStreamData, which allows
the client to query for the list of blocks occupied by the stream, as well
as the stream length. I provide two implementations of this: one which
queries the directory (for indexed streams), and one which queries the
super block (for the directory stream).
This has the side benefit of vastly simplifying the code to parse the
directory. Whereas before a mini state machine was rolled by hand, now we
simply use FixedStreamArray to read out the stream sizes, then build a
vector of FixedStreamArrays for the stream map, all in just a few lines of
code.
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21046
llvm-svn: 271982
The data strucutre in the new FPO stream is described in the
PE/COFF spec. There is one record per function if frame pointer
is omitted.
Differential Revision: http://reviews.llvm.org/D20999
llvm-svn: 271926
When printing line information and file checksums, we were printing
the file offset field from the struct header. This teaches
llvm-pdbdump how to turn those numbers into the filename. In the
case of file checksums, this is done by looking in the global
string table. In the case of line contributions, this is done
by indexing into the file names buffer of the DBI stream. Why
they use a different technique I don't know.
llvm-svn: 271630
To facilitate this, a couple of changes had to be made:
1. `ModuleSubstream` got moved from `DebugInfo/PDB` to
`DebugInfo/CodeView`, and various codeview related types are defined
there. It turns out `DebugInfo/CodeView/Line.h` already defines many of
these structures, but this is really old code that is not endian aware,
doesn't interact well with `StreamInterface` and not very helpful for
getting stuff out of a PDB. Eventually we should migrate the old readobj
`COFFDumper` code to these new structures, or at least merge their
functionality somehow.
2. A `ModuleSubstream` visitor is introduced. Depending on where your
module substream array comes from, different subsets of record types can
be expected. We are already hand parsing these substream arrays in many
places especially in `COFFDumper.cpp`. In the future we can migrate these
paths to the visitor as well, which should reduce a lot of code in
`COFFDumper.cpp`.
Differential Revision: http://reviews.llvm.org/D20936
Reviewed By: ruiu, majnemer
llvm-svn: 271621
This first pass only splits apart the records and dumps the line
info kinds and binary data. Subsequent patches will parse out
the binary data into more useful information and dump it in
detail.
llvm-svn: 271576
Unlike other sections that can grow to any size, the COFF section header
stream has maximum length because each record is fixed size and the COFF
file format limits the maximum number of sections. So I decided to not
create a specific stream class for it. Instead, I added a member function
to DbiStream class which returns a vector of COFF headers.
Differential Revision: http://reviews.llvm.org/D20717
llvm-svn: 271557
This converts remaining uses of ByteStream, which was still
left in the symbol stream and type stream, to using the new
StreamInterface zero-copy classes.
RecordIterator is finally deleted, so this is the only way left
now. Additionally, more error checking is added when iterating
the various streams.
With this, the transition to zero copy pdb access is complete.
llvm-svn: 271101
Due to differences in template instantiation rules, it is not
portable to static_assert(false) inside of an invalid specialization
of a template. Instead I just =delete the method so that it can't
be used, and leave a comment that it must be explicitly specialized.
llvm-svn: 271027
This reverts commit r271024 due to error: static_assert failed
"You must either provide a specialization of VarStreamArrayExtractor
or a custom extractor"
llvm-svn: 271026
This reduces the amount of memory used by llvm-pdbdump by roughly
1/3 of the size of the PDB file.
Differential Revision: http://reviews.llvm.org/D20724
Reviewed By: ruiu
llvm-svn: 271025
Since we want to move toward zero-copy access to stream data, we
want to remove all instances of copying operations. So get rid
of some of those here.
Differential Revision: http://reviews.llvm.org/D20720
Reviewed By: ruiu
llvm-svn: 270960
PDBs can be extremely large. We're already mapping the entire
PDB into the process's address space, but to make matters worse
the blocks of the PDB are not arranged contiguously. So, when
we have something like an array or a string embedded into the
stream, we have to make a copy. Since it's convenient to use
traditional data structures to iterate and manipulate these
records, we need the memory to be contiguous.
As a result of this, we were using roughly twice as much memory
as the file size of the PDB, because every stream was copied
out and re-stitched together contiguously.
This patch addresses this by improving the MappedBlockStream
to allocate from a BumpPtrAllocator only when a read requires
a discontiguous read. Furthermore, it introduces some data
structures backed by a stream which can iterate over both
fixed and variable length records of a PDB. Since everything
is backed by a stream and not a buffer, we can read almost
everything from the PDB with zero copies.
Differential Revision: http://reviews.llvm.org/D20654
Reviewed By: ruiu
llvm-svn: 270951
We have need to reuse this functionality, including making
additional generic stream types that are smarter about how and
when they copy memory versus referencing the original memory.
So all of these structures belong in the common library
rather than being pdb specific.
llvm-svn: 270751
name_ids() did not return all IDs but only the first NameCount items.
The number of non-zero entries in IDs vector is NameCount, but it
does not mean that all non-zero entries are at the beginning of IDs
vector.
Differential Revision: http://reviews.llvm.org/D20611
llvm-svn: 270656
This adds support for parsing and dumping the following
symbol types:
S_LPROCREF
S_ENVBLOCK
S_COMPILE2
S_REGISTER
S_COFFGROUP
S_SECTION
S_THUNK32
S_TRAMPOLINE
As of this patch, the test PDB files no longer have any unknown
symbol types.
llvm-svn: 270628
When dumping huge PDB files, too many of the options were grouped
together so you would get neverending spew of output. This patch
introduces more granular display options so you can only dump the
fields you actually care about.
llvm-svn: 270607
DBI stream contains a stream number of the symbol record stream.
Symbol record streams is an array of length-type-value members.
Each member represents one symbol.
Publics stream contains offsets to the symbol record stream.
This patch is to print out all symbols that are referenced by
the publics stream.
Note that even with this patch, llvm-pdbdump cannot dump all the
information in a publics stream since it contains more information
than symbol names. I'll improve it in followup patches.
Differential Revision: http://reviews.llvm.org/D20480
llvm-svn: 270262
I don't yet fully understand the meaning of these data strcutures,
but at least it seems that their sizes and types are correct.
With this change, we can read publics streams till end.
Differential Revision: http://reviews.llvm.org/D20343
llvm-svn: 269861
Publics stream seems to contain information as to public symbols.
It actually contains a serialized hash table along with fixed-sized
headers. This patch is not complete. It scans only till the end of
the stream and dump the header information. I'll write code to
de-serialize the hash table later.
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20256
llvm-svn: 269484
This reuses the CVTypeDumper from libcodeview to dump full
information about type records within a PDB file.
Differential Revision: http://reviews.llvm.org/D20022
Reviewed By: rnk
llvm-svn: 268808
Summary:
Port the dumper in llvm-readobj over to it.
I'm planning to use this visitor to power type stream merging.
While we're at it, try to switch from StringRef to ArrayRef<uint8_t> in some
places.
Reviewers: zturner, amccarth
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19899
llvm-svn: 268535
Ability to parse codeview type streams is also needed by
DebugInfoPDB for parsing PDBs, so moving this into a library
gives us this option. Since DebugInfoPDB had already hand
rolled some code to do this, that code is now convereted over
to using this common abstraction.
Differential Revision: http://reviews.llvm.org/D19887
Reviewed By: dblaikie, amccarth
llvm-svn: 268454
This parses the TPI stream (stream 2) from the PDB file. This stream
contains some header information followed by a series of codeview records.
There is some additional complexity here in that alongside this stream of
codeview records is a serialized hash table in order to efficiently query
the types. We parse the necessary bookkeeping information to allow us to
reconstruct the hash table, but we do not actually construct it yet as
there are still a few things that need to be understood first.
Differential Revision: http://reviews.llvm.org/D19840
Reviewed By: ruiu, rnk
llvm-svn: 268343
PDB has a lot of similar data structures. We already have code
for parsing a Name Map, but PDB seems to have a different but
very similar structure that is a hash table. This is the
beginning of code needed in order to parse the name hash table,
but it is not yet complete. It parses the basic metadata of
the hash table, the bucket array, and the names buffer, but
doesn't use any of these fields yet as the data structure
requires a non-trivial amount of work to understand.
llvm-svn: 268268
There are probably hundreds of crashers we can find by fuzzing
more. For now we do the simplest possible validation of the
block size. Later, more complicated validations can verify that
other fields of the super block such as directory size, number
of blocks, agree with the size of the file etc.
llvm-svn: 268084
The motivation for this change is that PDB has the notion of
streams and substreams. Substreams often consist of variable
length structures that are convenient to be able to treat as
guaranteed, contiguous byte arrays, whereas the streams they
are contained in are not necessarily so, as a single stream
could be spread across many discontiguous blocks.
So, when processing data from a substream, we want to be able
to assume that we have a contiguous byte array so that we can
cast pointers to variable length arrays and such.
This leads to the question of how to be able to read the same
data structure from either a stream or a substream using the
same interface, which is where this patch comes in.
We separate out the stream's read state from the underlying
representation, and introduce a `StreamReader` class. Then
we change the name of `PDBStream` to `MappedBlockStream`, and
introduce a second kind of stream called a `ByteStream` which is
simply a sequence of contiguous bytes. Finally, we update all
of the std::vectors in `PDBDbiStream` to use `ByteStream` instead
as a proof of concept.
llvm-svn: 268071
We now read out the rest of the substreams from the DBI streams. One of
these substreams, the FileInfo substream, contains information about which
source files contribute to each module (aka compiland). This patch
additionally parses out the file information from that substream, and
dumps it in llvm-pdbdump.
Differential Revision: http://reviews.llvm.org/D19634
Reviewed by: ruiu
llvm-svn: 267928
This gets more data out of the DBI strema of the PDB. In
particular it extracts the metadata for the list of modules
(compilands) that this PDB contains info about, and adds support
for dumping these fields to llvm-pdbdump.
Differential Revision: http://reviews.llvm.org/D19570
Reviewed By: ruiu
llvm-svn: 267818
Summary:
llvm-symbolizer wants to get linkage names of functions for historical
reasons. Linkage names are only recorded in the PDB for public symbols,
and the linkage name is apparently stored separately in some "public
symbol" record. We had a workaround in PDBContext which would look for
such symbols when the user requested linkage names.
However, when given an address that was truly in a private function and
public funciton, we would accidentally find nearby public symbols and
return those function names. The fix is to look for both function
symbols and public symbols and only prefer the public symbol name if the
addresses of the symbols agree.
Fixes PR27492
Reviewers: zturner
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19571
llvm-svn: 267732
The DBI stream contains a lot of bookkeeping information for other
streams. In particular it contains information about section contributions
and linked modules. This patch is a first attempt at parsing some of the
information out of the DBI stream. It currently only parses and dumps the
headers of the DBI stream, so none of the module data or section
contribution data is pulled out.
This is just a proof of concept that we understand the basic properties of
the DBI stream's metadata, and followup patches will try to extract more
detailed information out.
Differential Revision: http://reviews.llvm.org/D19500
Reviewed By: majnemer, ruiu
llvm-svn: 267585
r267049 broke multiple buildbots (e.g. clang-cmake-mips, and clang-x86_64-linux-selfhost-modules) which the follow-ups have not yet resolved and this is preventing subsequent committers from being notified about additional failures on the affected buildbots.
llvm-svn: 267148