We don't have the right algorithm for copying S_UDT symbols
from object files to the globals stream, and having it wrong
is worse than not having it at all, since it breaks display
of local variables of UDT types (for example, "dv Foo" fails
in our current implementation, but succeeds if the S_UDT records
are omitted). Omit them until we fix the algorithm.
llvm-svn: 310867
The linker module contains a symbol of type S_COMPILE3 which
contains various information about the compiler and linker used
to create the PDB, such as the name of the linker, the target
machine, and the linker version. Interestingly, if we set the
version string to 0.0.0.0, then when trying to view local
variables WinDbg emits an error that private symbols are not
present. By setting this to a valid MSVC linker version string,
local variables can display.
As such, even though it is not representative of LLVM's version
information, we need this for compatibility.
llvm-svn: 310755
PDBs need to contain 1 module for each object file/compiland,
and a special one synthesized by the linker. This one contains
a symbol record for each output section in the executable with
its address information. This patch adds such symbols to the
linker module. Note that we also are supposed to add an
S_COFFGROUP symbol for what appears to be each input section that
contributes to each output section, but it's not entirely clear
how to generate these yet, so I'm leaving that for a separate
patch.
llvm-svn: 310754
Previously we were writing an empty globals stream. Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this. Regardless,
without it we don't have information about global variables, so
we need to fix it anyway. This patch does that.
With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.
Differential Revision: https://reviews.llvm.org/D36535
llvm-svn: 310743
The publics stream and globals stream are very similar. They both
contain a list of hash buckets that refer into a single shared stream,
the symbol record stream. Because of the need for each builder to manage
both an independent hash stream as well as a single shared record
stream, making the two builders be independent entities is not the right
design. This patch merges them into a single class, of which only a
single instance is needed to create all 3 streams. PublicsStreamBuilder
and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder
class, which writes all 3 streams at once.
Note that this patch does not contain any functionality change. So we're
still not yet writing any records to the globals stream. All we're doing
is making it so that when we do start writing records to the globals,
this refactor won't have to be part of that patch.
Differential Revision: https://reviews.llvm.org/D36489
llvm-svn: 310438
The compiler outputs PROC32_ID symbols into the object files
for functions, and these symbols have an embedded type index
which, when copied to the PDB, refer to the IPI stream. However,
the symbols themselves are also converted into regular symbols
(e.g. S_GPROC32_ID -> S_GPROC32), and type indices in the regular
symbol records refer to the TPI stream. So this patch applies
two fixes to function records.
1. It converts ID symbols to the proper non-ID record type.
2. After remapping the type index from the object file's index
space to the PDB file/IPI stream's index space, it then
remaps that index to the TPI stream's index space by.
Besides functions, during the remapping process we were also
discarding symbol record types which we did not recognize.
In particular, we were discarding S_BPREL32 records, which is
what MSVC uses to describe local variables on the stack. So
this patch fixes that as well by copying them to the PDB.
Differential Revision: https://reviews.llvm.org/D36426
llvm-svn: 310394
Summary:
PDB section contributions are supposed to use output section indices and
offsets, not input section indices and offsets.
This allows the debugger to look up the index of the module that it
should look up in the modules stream for symbol information. With this
change, windbg can now find line tables, but it still cannot print local
variables.
Fixes PR34048
Reviewers: zturner
Subscribers: hiraditya, ruiu, llvm-commits
Differential Revision: https://reviews.llvm.org/D36285
llvm-svn: 309987
We don't write any actual symbols to this stream yet, but for
now we just create the stream and hook it up to the appropriate
places and give it a valid header.
Differential Revision: https://reviews.llvm.org/D35290
llvm-svn: 309608
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.
Reviewers: zturner, ruiu
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D35871
llvm-svn: 309303
Assume that the LF_TYPESERVER2 record contains Windows-style paths. In
any case, 'sys::path::filename(Path, Style::windows)' will work on
Unix-style paths.
llvm-svn: 308241
Summary:
Object files compiled with /Zi emit type information into a type server
PDB. The .debug$S section will contain a single TypeServer2Record with
the absolute path and GUID of the type server. LLD needs to load the
type server PDB and merge all types and items it finds in it into the
destination PDB.
Depends on D35495
Reviewers: ruiu, inglorion
Subscribers: zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D35504
llvm-svn: 308235
Summary:
We were treating the GUIDs in TypeServer2Record as strings, and the
non-ASCII bytes in the GUID would not round-trip through YAML.
We already had the PDB_UniqueId type portably represent a Windows GUID,
but we need to hoist that up to the DebugInfo/CodeView library so that
we can use it in the TypeServer2Record as well as in PDB parsing code.
Reviewers: inglorion, amccarth
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D35495
llvm-svn: 308234
Summary:
Instead of wiring these through the CVTypeVisitor interface, clients
should inspect the CVTypeArray before visiting it and potentially load
up the type server's TPI stream if they need it.
No tests relied on this functionality because LLD was the only client.
Reviewers: ruiu
Subscribers: mgorny, hiraditya, zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D35394
llvm-svn: 308212
Summary:
We've accumulated about five or so data structures that are widely
referenced:
- PDBBuilder
- Type table
- Id table
- PDB string table
- Type server handler
I'm about to rewrite type server handling, and I need a new class in LLD
where I can put its state. By creating a new PDBLinker class, I hope to
put it there next.
Reviewers: ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35392
llvm-svn: 307979
Summary:
This fixes type indices for SDK or CRT static archives. Previously we'd
try to look next to the archive object file path, which would not exist
on the local machine.
Also error out if we can't resolve a type server record. Hypothetically
we can recover from this error by discarding debug info for this object,
but that is not yet implemented.
Reviewers: ruiu, amccarth
Subscribers: aprantl, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D35369
llvm-svn: 307946
Revert "[PDB] Tweak bad type index error handling"
check-lld with asan detects use-after-poison.
This reverts commits r307733 and r307726.
llvm-svn: 307752
Translate invalid type indices to a sentinel value instead of skipping
the record. Skipping records isn't a good recovery method, because we
can skip a scope open or close record, which will confuse the scope
management code.
We currently have lots of invalid type indices on Microsoft-provided
standard libraries, because the LF_TYPESERVER2 records contain absolute
paths that are only valid on their build servers. Our type server
handlers need to look at other things (GUIDs) to find these type server
PDBs.
llvm-svn: 307726
This is part of the continuing effort to increase parity between
LLD and MSVC PDBs. link still doesn't like our PDBs, so the most
obvious thing to check was whether adding an empty publics stream
would get it to do something else. It still fails in the same way
but at least this removes one more variable from the equation.
The next logical step would be to try creating an empty globals
stream.
Differential Revision: https://reviews.llvm.org/D35224
llvm-svn: 307598
lld-link is apparently a symlink to lld. Our pdb writing code
uses llvm::sys::fs::getMainExecutable() and writes that value
into the PDB, which will resolve symlinks etc. But what we
really want is the exact value of argv[0], which matches what
the MS linker does. Fix this by just writing argv[0] instead
of calling getMainExecutable.
llvm-svn: 307592
1) Don't write a /src/headerblock stream. This appears to be
written conditionally by MSVC, but it's not clear what the
condition is. For now, just remove it since we dont' know
what it is anyway and the particular pdb we've checked in
for the test doesn't have one.
2) Write a valid timestamp for the PDB file signature. This
leads to non-reproducible builds, but it matches the default
behavior of link, so it should be out default as well. If
we need reproducibility, we should add a separate command
line option for it that is off by default.
3) Write an empty FPO stream. MSVC seems to always write an
FPO stream. This change makes the stream directory match
up, although we still need to make the contents of the FPO
stream match.
llvm-svn: 307436
Without this we would just append whatever the user
wrote on the command line, so if we're in C:\foo
and we run lld-link bar/baz.obj, we would write
C:\foo\bar/baz.obj in various places in the PDB.
MSVC linker does not do this, so we shouldn't either.
This fixes some differences in the diff test, so we
update the test as well.
Differential Revision: https://reviews.llvm.org/D35092
llvm-svn: 307423
A couple of things were different about our generated PDBs.
1) We were outputting the wrong Version on the PDB Stream.
The version we were setting was newer than what MSVC is setting.
It's not clear what the implications are, but we change LLD
to use PdbImplVC70, as MSVC does.
2) For the optional debug stream indices in the DBI Stream, we
were outputting 0 to mean "the stream is not present". MSVC
outputs uint16_t(-1), which is the "correct" way to specify
that a stream is not present. So we fix that as well.
3) We were setting the PDB Stream signature to 0. This is supposed
to be the result of calling time(nullptr). Although this leads
to non-deterministic builds, a better way to solve that is by
having a command line option explicitly for generating a
reproducible build, and have the default behavior of lld-link
match the default behavior of link.
To test this, I'm making use of the new and improved `pdb diff`
sub command. To make it suitable for writing tests against, I had
to modify the diff subcommand slightly to print less verbose output.
Previously it would always print | <column> | <value1> | <value2> |
which is quite verbose, and the values are fragile. All we really
want to know is "did we produce the same value as link?" So I added
command line options to print a single character representing the
result status (different, identical, equivalent), and another to
hide the value display. Note that just inspecting the diff output
used to write the test, you can see some things that are obviously
wrong. That is just reflective of the fact that this is the state
of affairs today, not that we're asserting that this is "correct".
We can use this as a starting point to discover differences, fix
them, and update the test.
Differential Revision: https://reviews.llvm.org/D35086
llvm-svn: 307422
Based strictly on the name, this seems to have something to do
width edit & continue. The goal of this patch has nothing to do
with supporting edit and continue though. msvc link.exe writes
very basic information into this area even when *not* compiling
with support for E&C, and so the goal here is to bring lld-link
to parity. Since we cannot know what assumptions standard tools
make about the content of PDB files, we need to be as close as
possible.
This ECNames data structure is a standard PDB string hash table.
link.exe puts a single string into this hash table, which is the
full path to the PDB file on disk. It then references this string
from the module descriptor for the compiler generated `* Linker *`
module.
With this patch, lld-link will generate the exact same sequence of
bytes as MSVC link for this subsection for a given object file
input (as reported by `llvm-pdbutil bytes -ec`).
llvm-svn: 307356
Summary:
There are a variety of records that open scopes: function scopes, block
scopes, and inlined call site scopes. These symbol records contain
Parent and End fields with the offsets of other symbol records. The End
field contains the offset of the matching S_END or S_INLINESITE_END
record. The Parent field contains the offset of the parent record, or 0
if this is a top-level scope (i.e. a function).
With this change, `llvm-pdbutil pretty -all` no longer crashes on PDBs
produced by LLD. I haven't tried a real debugger yet.
Reviewers: zturner, ruiu
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34898
llvm-svn: 307278
Summary:
The main complexity in adding symbol records is that we need to
"relocate" all the type indices. Type indices do not have anything like
relocations, an opaque data structure describing where to find existing
type indices for fixups. The linker just has to "know" where the type
references are in the symbol records. I added an overload of
`discoverTypeIndices` that works on symbol records, and it seems to be
able to link the standard library.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34432
llvm-svn: 305933
Summary:
Previously we didn't add debug info chunks to the SparseChunks array, so
they didn't participate in section GC. Now we do.
Reviewers: ruiu
Subscribers: aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D34356
llvm-svn: 305811
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.
This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.
File checksums and string tables, on the other hand, need to be re-done.
Reviewers: zturner, ruiu
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D34257
llvm-svn: 305713
Summary:
Expose the module descriptor index and fill it in for section
contributions.
Reviewers: zturner
Subscribers: llvm-commits, ruiu, hiraditya
Differential Revision: https://reviews.llvm.org/D34126
llvm-svn: 305296
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries. Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.
Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.
Differential Revision: https://reviews.llvm.org/D33785
llvm-svn: 304484
Originally this was intended to be set up so that when linking
a PDB which refers to a type server, it would only visit the
PDB once, and on subsequent visitations it would just skip it
since all the records had already been added.
Due to some C++ scoping issues, this was not occurring and it
was revisiting the type server every time, which caused every
record to end up being thrown away on all subsequent visitations.
This doesn't affect the performance of linking clang-cl generated
object files because we don't use type servers, but when linking
object files and libraries generated with /Zi via MSVC, this means
only 1 object file has to be linked instead of N object files, so
the speedup is quite large.
llvm-svn: 303920
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types. In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted. Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.
llvm-svn: 303707
Previous algotirhm assumed that types and ids are in a single
unified stream. For inputs that come from object files, this
is the case. But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.
Differential Revision: https://reviews.llvm.org/D33417
llvm-svn: 303577
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows. After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling. Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.
llvm-svn: 303446
This is a squash of ~5 reverts of, well, pretty much everything
I did today. Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures. I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!
At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.
llvm-svn: 303409
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).
But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.
This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.
This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.
The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.
Differential Revision: https://reviews.llvm.org/D33293
llvm-svn: 303388