Summary:
This patch fixes access to fpo streams in native pdb from DbiStream and makes
code consistent with DbiStreamBuilder.
Patch By: leonid.mashinskiy
Reviewers: zturner, aleksandr.urakov
Reviewed By: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56725
llvm-svn: 352615
PDBs contain several serialized hash tables. In the microsoft-pdb
repo published to support LLVM implementing PDB support, the
provided initializes the bucket count for the TPI and IPI streams
to the maximum size. This occurs in tpi.cpp L33 and tpi.cpp L398.
In the LLVM code for generating PDBs, these streams are created with
minimum number of buckets. This difference makes LLVM generated
PDBs slower for when used for debugging.
Patch by C.J. Hebert
Differential Revision: https://reviews.llvm.org/D56942
llvm-svn: 352117
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Running "llvm-pdbutil dump -all" on linux (using the native PDB reader),
over a few PDBs pulled from the Microsoft public symbol store uncovered
a few small issues:
- stripped PDBs might not have the strings stream (/names)
- stripped PDBs might not have the "module info" stream
Differential Revision: https://reviews.llvm.org/D54006
llvm-svn: 346010
We changed an ArrayRef<uint8_t> to an ArrayRef<uint32_t>, but
it needs to be an ArrayRef<support::ulittle32_t>.
We also change ArrayRef<> to FixedStreamArray<>. Technically
an ArrayRef<> will work, but it can cause a copy in the underlying
implementation if the memory is not contiguous, and there's no
reason not to use a FixedStreamArray<>.
Thanks to nemanjai@ and thakis@ for helping me track this down
and confirm the fix.
llvm-svn: 344063
The Globals table is a hash table keyed on symbol name, so
it's possible to lookup symbols by name in O(1) time. Add
a function to the globals stream to do this, and add an option
to llvm-pdbutil to exercise this, then use it to write some
tests to verify correctness.
llvm-svn: 343951
These work a little differently because they are actually in
the globals stream and are treated as symbol records, even though
DIA presents them as types. So this also adds the necessary
infrastructure to cache records that live somewhere other than
the TPI stream as well.
llvm-svn: 343507
We didn't properly detect when a pointer was a member
pointer, and when that was the case we were not
properly returning class parent info. This caused
member pointers to render incorrectly in pretty mode.
However, we didn't even have pretty tests for pointers
in native mode, so those are also added now to ensure
this.
llvm-svn: 343393
This allows the native reader to find records of class/struct/
union type and dump them. This behavior is tested by using the
diadump subcommand against golden output produced by actual DIA
SDK on the same PDB file, and again using pretty -native to
confirm that we actually dump the classes. We don't find class
members or anything like that yet, for now it's just the class
itself.
llvm-svn: 342779
Some records point to an LF_CLASS, LF_UNION, LF_STRUCTURE, or LF_ENUM
which is a forward reference and doesn't contain complete debug
information. In these cases, we'd like to be able to quickly locate the
full record. The TPI stream stores an array of pre-computed record hash
values, one for each type record. If we pre-process this on startup, we
can build a mapping from hash value -> {list of possible matching type
indices}. Since hashes of full records are only based on the name and or
unique name and not the full record contents, we can then use forward
ref record to compute the hash of what *would* be the full record by
just hashing the name, use this to get the list of possible matches, and
iterate those looking for a match on name or unique name.
llvm-pdbutil is updated to resolve forward references for the purposes
of testing (plus it's just useful).
Differential Revision: https://reviews.llvm.org/D52283
llvm-svn: 342656
There were several issues with the previous implementation.
1) There were no tests.
2) We didn't support creating PDBSymbolTypePointer records for
builtin types since those aren't described by LF_POINTER
records.
3) We didn't support a wide enough variety of builtin types even
ignoring pointers.
This patch fixes all of these issues. In order to add tests,
it's helpful to be able to ignore the symbol index id hierarchy
because it makes the golden output from the DIA version not match
our output, so I've extended the dumper to disable dumping of id
fields.
llvm-svn: 342493
Previously we would dump the names of enum types, but not their
enumerator values. This adds support for enumerator values. In
doing so, we have to introduce a general purpose mechanism for
caching symbol indices of field list members. Unlike global
types, FieldList members do not have a TypeIndex. So instead,
we identify them by the pair {TypeIndexOfFieldList, IndexInFieldList}.
llvm-svn: 342415
Naively computing the hash after the PDB data has been generated is in practice
as fast as other approaches I tried. I also tried online-computing the hash as
parts of the PDB were written out (https://reviews.llvm.org/D51887; that's also
where all the measuring data is) and computing the hash in parallel
(https://reviews.llvm.org/D51957). This approach here is simplest, without
being slower.
Differential Revision: https://reviews.llvm.org/D51956
llvm-svn: 342333
Currently if we got something like `const Foo` we'd ignore it and
just rely on printing the unmodified `Foo` later on. However,
for testing the native reading code we really would like to be able
to see these so that we can verify that the native reader can
actually handle them. Instead of printing out the full type though,
just print out the header.
llvm-svn: 342295
r342003 added support for emitting FPO data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the PDB
file. However, that is not the end of the story. FPO can end
up in two different destinations in a PDB, each corresponding to
a different FPO data source.
The case handled by r342003 involves copying data from the
DEBUG_S_FRAMEDATA subsection of the .debug$S section to the
"New FPO" stream in the PDB, which is then referred to by the
DBI stream. The case handled by this patch involves copying
records from the .debug$F section of an object file to the "FPO"
stream (or perhaps more aptly, the "Old FPO" stream) in the PDB
file, which is also referred to by the DBI stream.
The formats are largely similar, and the difference is mostly
only visible in masm generated object files, such as some of the
low-level CRT object files like memcpy. MASM doesn't appear to
support writing the DEBUG_S_FRAMEDATA subsection, and instead
just writes these records to the .debug$F section.
Although clang-cl does not emit a .debug$F section ever, lld still
needs to support it so we have good debugging for CRT functions.
Differential Revision: https://reviews.llvm.org/D51958
llvm-svn: 342080
Summary:
There are two registers encoded in the S_FRAMEPROC flags: one for locals
and one for parameters. The encoding is described by the
ExpandEncodedBasePointerReg function in cvinfo.h. Two bits are used to
indicate one of four possible values:
0: no register - Used when there are no variables.
1: SP / standard - Variables are stored relative to the standard SP
for the ISA.
2: FP - Variables are addressed relative to the ISA frame
pointer, i.e. EBP on x86. If realignment is required, parameters
use this. If a dynamic alloca is used, locals will be EBP relative.
3: Alternative - Variables are stored relative to some alternative
third callee-saved register. This is required to address highly
aligned locals when there are dynamic stack adjustments. In this
case, both the incoming SP saved in the standard FP and the current
SP are at some dynamic offset from the locals. LLVM uses ESI in
this case, MSVC uses EBX.
Most of the changes in this patch are to pass around the CPU so that we
can decode these into real, named architectural registers.
Subscribers: hiraditya
Differential Revision: https://reviews.llvm.org/D51894
llvm-svn: 341999
In order to start testing this, I've added a new mode to
llvm-pdbutil which is only really useful for writing tests.
It just dumps the value of raw fields in record format.
This isn't really ideal and it won't allow us to test some
important cases, but it's better than nothing for now.
llvm-svn: 341729
The way DIA SDK works is that when you request a symbol, it
gets assigned an internal identifier that is unique for the
life of the session. You can then use this identifier to
get back the same symbol, with all of the same internal state
that it had before, even if you "destroyed" the original
copy of the object you had.
This didn't work properly in our native implementation, and
if you destroyed an object for a particular symbol, then
requested the same symbol again, it would get assigned a new
ID and you'd get a fresh copy of the object. In order to fix
this some refactoring had to happen to properly reuse cached
objects. Some unittests are added to verify that symbol
reuse is taking place, making use of the new unittest input
feature.
llvm-svn: 341503
Following D50807, and heading towards D50664, this intermediary change does the following:
1. Upgrade all custom Error types in llvm/trunk/lib/DebugInfo/ to use the new StringError behavior (D50807).
2. Implement std::is_error_code_enum and make_error_code() for DebugInfo error enumerations.
3. Rename GenericError -> PDBError (the file will be renamed in a subsequent commit)
4. Update custom error messages to follow the same formatting: (\w\s*)+\.
5. Keep generic "file not found" (ENOENT) errors as they are in PDB code. Previously, there used to be a custom enumeration for that purpose.
6. Remove a few extraneous LF in log() implementations. Printing LF is a responsability at a higher level, not at the error level.
Differential Revision: https://reviews.llvm.org/D51499
llvm-svn: 341228
The reference implementation uses a case-insensitive string
comparison for strings of equal length. This will cause the
string "tEo" to compare less than "VUo". However we were using
a case sensitive comparison, which would generate the opposite
outcome. Switch to a case insensitive comparison. Also, when
one of the strings contains non-ascii characters, fallback to
a straight memcmp.
The only way to really test this is with a DIA test. Before this
patch, the test will fail (but succeed if link.exe is used instead
of lld-link). After the patch, it succeeds even with lld-link.
llvm-svn: 336464
We add an option to dump the entire global / public symbol record
stream. Previously we would dump globals or publics, but not both.
And when we did dump them, we would always dump them in the order
they were referenced by the corresponding hash streams, not in
the order they were serialized in. This patch adds a lower level
mode that just dumps the whole stream in serialization order.
Additionally, when dumping global-extras, we now dump the hash
bitmap as well as the record offset instead of dumping all zeros
for the offsets.
llvm-svn: 336407
The DBI stream contains a list of module descriptors. At the
beginning of each descriptor is a structure representing the first
section contribution in the output file for that module. LLD
currently doesn't fill out this structure at all, but link.exe
does. So as a precursor to emitting this data in LLD, we first
need a way to dump it so that it can be checked.
This patch adds support for the dumping, and verifies via a test
that LLD emits bogus information.
llvm-svn: 330208
If a class's first data member is an instance of an empty class, then an
assertion in the PrettyClassLayoutGraphicalDumper would fail. The
storage is reserved, but it's not marked as in use.
As far as I understand, it's the assertion that's faulty, so I removed it
and updated the nearby comment.
Found by running llvm-pdbutil against its own PDB, and this assertion would
fail on HashAdjusters, which is a HashTable whose first data member is a
TraitsT, which is a PdbHashTraits<T>, which is an empty struct. (The struct
has a specialization for uint32_t, but that specialization doesn't apply
here because the T is actually ulittle32_t.)
Differential Revision: https://reviews.llvm.org/D45645
llvm-svn: 330135
We have a few functions that virtually all command wants to run on
process startup/shutdown. This patch adds InitLLVM class to do that
all at once, so that we don't need to copy-n-paste boilerplate code
to each llvm command's main() function.
Differential Revision: https://reviews.llvm.org/D45602
llvm-svn: 330046
These appear in a .debug$P section, which is exactly the same in
format as a .debug$T section. So we shouldn't ignore these when
dumping types.
llvm-svn: 329326
Using this, you can use llvm-pdbutil to export the contents of a
stream to a binary file, then run explain on the binary file so
that it treats the offset as an offset into the stream instead
of an offset into a file. This makes it easy to compare the
contents of the same stream from two different files.
llvm-svn: 329207
This command can dump the binary contents of a stream to a file.
This is useful when you want to do side-by-side comparisons of
a specific stream from two PDBs to examine the differences between
them. You can export both of them to a file, then open them up
side by side in a hex editor (for example), so as to eliminate any
differences that might arise from the contents being on different
blocks in the PDB.
In subsequent patches I plan to improve the "explain" subcommand
so that you can explain the contents of a binary file that isn't
necessarily a full PDB, but one of these dumped streams, by telling
the subcommand how to interpret the contents.
llvm-svn: 329002
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.
Reviewers: JDevlieghere, zturner, echristo, dberris, friss
Reviewed By: echristo
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D45141
llvm-svn: 328943
This will show more detail when using `llvm-pdbutil explain` on an
offset in the DBI or PDB streams. Specifically, it will dig into
individual header fields and substreams to give a more precise
description of what the byte represents.
llvm-svn: 328878
When we determine that a field belongs to an MSF super block or
the free page map, we wouldn't print any additional information.
With this patch, we now print the value of the field (for super
block fields) or the allocation status of the specified byte (in
the case of offsets in the FPM).
llvm-svn: 328808
We were trying to dig into the super block fields and print a
description of the field at the specified offset, but we were
printing the wrong field due to an off-by-one-field-error.
llvm-svn: 328804
When investigating various things, we often have a file offset
and what to know what's in the PDB at that address. For example
we may be doing a binary comparison of two LLD-generated PDBs
to look for sources of non-determinism, or we may wish to compare
an LLD-generated PDB with a Microsoft generated PDB for sources
of byte-for-byte incompatibility. In these cases, we can do a
binary diff of the two files, and once we find a mismatched byte
we can use explain to figure out what that byte is, immediately
honining in on the problem.
This patch implements this by trying to narrow the meaning of
a particular file offset down as much as possible.
Differential Revision: https://reviews.llvm.org/D44959
llvm-svn: 328799
This has been made obsolete by the fact that almost all of the
things it previously checked for are no longer relevant since
we can just compare bytes in a lot of places.
llvm-svn: 328562
When investigating bugs in PDB generation, the first step is
often to do the same link with link.exe and then compare PDBs.
But comparing PDBs is hard because two completely different byte
sequences can both be correct, so it hampers the investigation when
you also have to spend time figuring out not just which bytes are
different, but also if the difference is meaningful.
This patch fixes a couple of cases related to string table emission,
hash table emission, and the order in which we emit strings that
makes more of our bytes the same as the bytes generated by MS PDBs.
Differential Revision: https://reviews.llvm.org/D44810
llvm-svn: 328348
This is still failing on a different bot this time due to some
issue related to hashing absolute paths. Reverting until I can
figure it out.
llvm-svn: 328014
The issue causing this to fail in certain configurations
should be fixed.
It was due to the fact that DIA apparently expects there to be
a null string at ID 1 in the string table. I'm not sure why this
is important but it seems to make a difference, so set it.
llvm-svn: 328002
Natvis is a debug language supported by Visual Studio for
specifying custom visualizers. The /NATVIS option is an
undocumented link.exe flag which will take a .natvis file
and "inject" it into the PDB. This way, you can ship the
debug visualizers for a program along with the PDB, which
is very useful for postmortem debugging.
This is implemented by adding a new "named stream" to the
PDB with a special name of /src/files/<natvis file name>
and simply copying the contents of the xml into this file.
Additionally, we need to emit a single stream named
/src/headerblock which contains a hash table of embedded
files to records describing them.
This patch adds this functionality, including the /NATVIS
option to lld-link.
Differential Revision: https://reviews.llvm.org/D44328
llvm-svn: 327895
It previously only worked when the key and value types were
both 4 byte integers. We now have a use case for a non trivial
value type, so we need to extend it to support arbitrary value
types, which means templatizing it.
llvm-svn: 327647
Injected sources are basically a way to add actual source file content
to your PDB. Presumably you could use this for shipping your source code
with your debug information, but in practice I can only find this being
used for embedding natvis files inside of PDBs.
In order to effectively test LLVM's natvis file injection, we need a way
to dump the injected sources of a PDB in a way that is authoritative
(i.e. based on Microsoft's understanding of the PDB format, and not
LLVM's). To this end, I've added support for dumping injected sources
via DIA. I made a PDB file that used the /natvis option to generate a
test case.
Differential Revision: https://reviews.llvm.org/D44405
llvm-svn: 327428
Summary: This avoids crashing when a user tries to dump a pdb with the `-native` option.
Reviewers: zturner, llvm-commits, rnk
Reviewed By: zturner
Subscribers: mgrang
Differential Revision: https://reviews.llvm.org/D44117
llvm-svn: 326863
Summary:
The built-in PDB types enum has been extended to include char16_t and char32_t.
llvm-pdbutil was hitting an llvm_unreachable because it didn't know about these
new values. The new values are not yet in the DIA documentation, but are
listed in the cvconst.h header that comes as part of the DIA SDK.
Reviewers: asmith, zturner, rnk
Subscribers: stella.stamenova, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D43646
llvm-svn: 325838
This was originally reported as a bug with the symptom being "cvdump
crashes when printing an LLD-linked PDB that has an S_FILESTATIC record
in it". After some additional investigation, I determined that this was
a symptom of a larger problem, and in fact the real problem was in the
way we emitted the global PDB string table. As evidence of this, you can
take any lld-generated PDB, run cvdump -stringtable on it, and it would
return no results.
My hypothesis was that cvdump could not *find* the string table to begin
with. Normally it would do this by looking in the "named stream map",
finding the string /names, and using its value as the stream index. If
this lookup fails, then cvdump would fail to load the string table.
To test this hypothesis, I looked at the name stream map generated by a
link.exe PDB, and I emitted exactly those bytes into an LLD-generated
PDB. Suddenly, cvdump could read our string table!
This code has always been hacky and we knew there was something we
didn't understand. After all, there were some comments to the effect of
"we have to emit strings in a specific order, otherwise things don't
work". The key to fixing this was finally understanding this.
The way it works is that it makes use of a generic serializable hash map
that maps integers to other integers. In this case, the "key" is the
offset into a buffer, and the value is the stream number. If you index
into the buffer at the offset specified by a given key, you find the
name. The underlying cause of all these problems is that we were using
the identity function for the hash. i.e. if a string's offset in the
buffer was 12, the hash value was 12. Instead, we need to hash the
string *at that offset*. There is an additional catch, in that we have
to compute the hash as a uint32 and then truncate it to uint16.
Making this work is a little bit annoying, because we use the same hash
table in other places as well, and normally just using the identity
function for the hash function is actually what's desired. I'm not
totally happy with the template goo I came up with, but it works in any
case.
The reason we never found this bug through our own testing is because we
were building a /parallel/ hash table (in the form of an
llvm::StringMap<>) and doing all of our lookups and "real" hash table
work against that. I deleted all of that code and now everything goes
through the real hash table. Then, to test it, I added a unit test which
adds 7 strings and queries the associated values. I test every possible
insertion order permutation of these 7 strings, to verify that it really
does work as expected.
Differential Revision: https://reviews.llvm.org/D43326
llvm-svn: 325386
Summary:
- Fix a bug in PrettyBuiltinDumper that returns "void" as the name for
an unspecified builtin type. Since the unspecified param of a variadic
function is considered a builtin of unspecified type in PDBs, we set
"..." for its name.
- Provide a method to determine if a PDBSymbolFunc is variadic in
PrettyFunctionDumper since PDBSymbolFunc::getArgument() doesn't return the
last unspecified-type param.
- Add a pretty-func-dumper.test to test pretty dumping of variadic
functions.
Reviewers: zturner, llvm-commits
Reviewed By: zturner
Differential Revision: https://reviews.llvm.org/D41801
llvm-svn: 322608
This is not a record type that clang currently generates,
but it is a record that is encountered in object files generated
by cl. This record is unusual in that it refers directly to
the string table instead of indirectly to the string table via
the FileChecksums table. Because of this, it was previously
overlooked and we weren't remapping the string indices at all.
This would lead to crashes in MSVC when trying to display a
variable whose debug info involved an S_FILESTATIC.
Original bug report by Alexander Ganea
Differential Revision: https://reviews.llvm.org/D41718
llvm-svn: 321883
This is a special code that indicates that it's a function id.
While I'm still not certain how to interpret these, we definitely
should *not* be using these values as indices into an array directly.
For now, when we encounter one of these, just print the numeric value.
llvm-svn: 320775
A couple of places in LLD were passing references to
TypeTableCollections around, which makes it hard to change the
implementation at runtime. However, these cases only needed to
iterate over the types in the collection, and TypeCollection
already provides a handy abstract interface for this purpose.
By implementing this interface, we can get rid of the need to
pass TypeTableBuilder references around, which should allow us
to swap the implementation at runtime in subsequent patches.
llvm-svn: 319345
The motivation behind this patch is that future directions require us to
be able to compute the hash value of records independently of actually
using them for de-duplication.
The current structure of TypeSerializer / TypeTableBuilder being a
single entry point that takes an unserialized type record, and then
hashes and de-duplicates it is not flexible enough to allow this.
At the same time, the existing TypeSerializer is already extremely
complex for this very reason -- it tries to be too many things. In
addition to serializing, hashing, and de-duplicating, ti also supports
splitting up field list records and adding continuations. All of this
functionality crammed into this one class makes it very complicated to
work with and hard to maintain.
To solve all of these problems, I've re-written everything from scratch
and split the functionality into separate pieces that can easily be
reused. The end result is that one class TypeSerializer is turned into 3
new classes SimpleTypeSerializer, ContinuationRecordBuilder, and
TypeTableBuilder, each of which in isolation is simple and
straightforward.
A quick summary of these new classes and their responsibilities are:
- SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of
bytes. Does not do any hashing. Every time you call it, it will
re-serialize and return bytes again. The same instance can be re-used
over and over to avoid re-allocations, and in exchange for this
optimization the bytes returned by the serializer only live until the
caller attempts to serialize a new record.
- ContinuationRecordBuilder : Turns a FieldList-like record into a series
of fragments. Does not do any hashing. Like SimpleTypeSerializer,
returns references to privately owned bytes, so the storage is
invalidated as soon as the caller tries to re-use the instance. Works
equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a
long-standing theoretical limitation of the previous implementation.
- TypeTableBuilder : Accepts sequences of bytes that the user has already
serialized, and inserts them by de-duplicating with a hash table. For
the sake of convenience and efficiency, this class internally stores a
SimpleTypeSerializer so that it can accept unserialized records. The
same is not true of ContinuationRecordBuilder. The user is required to
create their own instance of ContinuationRecordBuilder.
Differential Revision: https://reviews.llvm.org/D40518
llvm-svn: 319198
The type index is from the TPI stream, not the IPI stream. Fix the
dumper, fix type index discovery, and add a test in LLD.
Also improve the log message we emit when we fail to rewrite type
indices in LLD. That's how I found this bug.
llvm-svn: 316461
The list of register ids was previously written out in a couple of dirrent
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.
X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.
Differential revision: https://reviews.llvm.org/D38480
llvm-svn: 314821
It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").
Differential Revision: https://reviews.llvm.org/D37589
llvm-svn: 312744
We have llvm-readobj for dumping CodeView from object files, and
llvm-pdbutil has always been more focused on PDB. However,
llvm-pdbutil has a lot of useful options for summarizing debug
information in aggregate and presenting high level statistical
views. Furthermore, it's arguably better as a testing tool since
we don't have to write tests to conform to a state-machine like
structure where you match multiple lines in succession, each
depending on a previous match. llvm-pdbutil dumps much more
concisely, so it's possible to use single-line matches in many
cases where as with readobj tests you have to use multi-line
matches with an implicit state machine.
Because of this, I'm adding object file support to llvm-pdbutil.
In fact, this mirrors the cvdump tool from Microsoft, which also
supports both object files and pdb files. In the future we could
perhaps rename this tool llvm-cvutil.
In the meantime, this allows us to deep dive into object files
the same way we already can with PDB files.
llvm-svn: 312358
This adds a new command line option, -udt-stats, which breaks
down the stats of S_UDT records. These are one of the biggest
contributors to the size of /DEBUG:FASTLINK PDBs, so they need
some additional tools to be able to analyze their usage. This
option will dig into each S_UDT record and determine what kind
of record it points to, and then break down the statistics by
the target type. The goal here is to identify how our object
files differ from MSVC object files in S_UDT records, so that
we can output fewer of them and reach size parity.
llvm-svn: 312276
This adds support for dumping a summary of module symbols
and CodeView debug chunks. This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size. Then
at the end it prints the totals for the entire file.
Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.
llvm-svn: 311338
When dumping, we were treating the S_INLINESITESYM as referring
to a type record, when it actually refers to an id record. We
had this correct in TypeIndexDiscovery, so our merging algorithm
should be fine, but we had it wrong in the dumper, which means it
would appear to work most of the time, unless the index was out
of bounds in the type stream, when it would fail. Fixed this, and
audited a few other cases to make them match the behavior in
TypeIndexDiscovery.
Also, I've now observed a new symbol record with kind 0x1168 which
I have no clue what it is, so to avoid crashing we have to just
print "Unknown Symbol Kind".
llvm-svn: 311117
1) We weren't handling symbol types that weren't able to parse,
even if we knew what the leaf type was. This was triggering
when trying to dump /DEBUG:FASTLINK PDBs, where we expect a
certain symbol to show up, but we just don't know how to parse
it.
2) We lost the code for dumping record bytes, so this was added
back.
llvm-svn: 311116
PDBs need to contain 1 module for each object file/compiland,
and a special one synthesized by the linker. This one contains
a symbol record for each output section in the executable with
its address information. This patch adds such symbols to the
linker module. Note that we also are supposed to add an
S_COFFGROUP symbol for what appears to be each input section that
contributes to each output section, but it's not entirely clear
how to generate these yet, so I'm leaving that for a separate
patch.
llvm-svn: 310754
In the refactor to merge the publics and globals stream, a bug
was introduced that wrote the wrong value for one of the fields
of the PublicsStreamHeader. This caused debugging in WinDbg
to break.
We had no way of dumping any of these fields, so in addition to
fixing the bug I've added dumping support for them along with a
test that verifies the correct value is written.
llvm-svn: 310439