Commit Graph

99 Commits

Author SHA1 Message Date
Martin Storsjo 8cc0f71261 [COFF] Add and use a Wordsize field in Config. NFCI.
Differential Revision: https://reviews.llvm.org/D53143

llvm-svn: 344265
2018-10-11 17:45:58 +00:00
Martin Storsjo 21eb363302 [COFF] Set proper pointer size alignment for LocalImportChunk
When these are accessed with load/store instructions on ARM64,
it becomes strictly necessary to have them properly aligned.

This fixes PR39228.

Differential Revision: https://reviews.llvm.org/D53128

llvm-svn: 344264
2018-10-11 17:45:51 +00:00
Alexandre Ganea 149de8de19 [LLD][COFF] Fix ordering of CRT global initializers in COMDAT sections
(patch by Benoit Rousseau)

This patch fixes a bug where the global variable initializers were sometimes not invoked in the correct order when it involved a C++ template instantiation.

Differential Revision: https://reviews.llvm.org/D52749

llvm-svn: 343847
2018-10-05 12:56:46 +00:00
Martin Storsjo 57ddec0dd1 [COFF] Add support for creating range extension thunks for ARM
This is a feature that MS link.exe lacks; it currently errors out on
such relocations, just like lld did before.

This allows linking clang.exe for ARM - practically, any image over
16 MB will likely run into the issue.

Differential Revision: https://reviews.llvm.org/D52156

llvm-svn: 342962
2018-09-25 10:59:29 +00:00
Martin Storsjo 32d21d6a2d [COFF] Add support for delay loading DLLs for ARM64
Differential Revision: https://reviews.llvm.org/D52190

llvm-svn: 342447
2018-09-18 07:22:01 +00:00
Martin Storsjo 7a41693898 [COFF] Provide __CTOR_LIST__ and __DTOR_LIST__ symbols for MinGW
MinGW uses these kind of list terminator symbols for traversing
the constructor/destructor lists. These list terminators are
actual pointers entries in the lists, with the values 0 and
(uintptr_t)-1 (instead of just symbols pointing to the start/end
of the list).

(This mechanism exists in both the mingw-w64 crt startup code and
in libgcc; normally the mingw-w64 one is used, but a DLL build of
libgcc uses the libgcc one. Therefore it's not trivial to change
the mechanism without lots of cross-project synchronization and
potentially invalidating some combinations of old/new versions
of them.)

When mingw-w64 has been used with lld so far, the CRT startup object
files have so far provided these symbols, ending up with different,
incompatible builds of the CRT startup object files depending on
whether binutils or lld are going to be used.

In order to avoid the need of different configuration of the CRT startup
object files depending on what linker to be used, provide these symbols
in lld instead. (Mingw-w64 checks at build time whether the linker
provides these symbols or not.) This unifies this particular detail
between the two linkers.

This does disallow the use of the very latest lld with older versions
of mingw-w64 (the configure check for the list was added recently;
earlier it simply checked whether the CRT was built with gcc or clang),
and requires rebuilding the mingw-w64 CRT. But the number of users of
lld+mingw still is low enough that such a change should be tolerable,
and unifies this aspect of the toolchains, easing interoperability
between the toolchains for the future.

The actual test for this feature is added in ctors_dtors_priority.s,
but a number of other tests that checked absolute output addresses
are updated.

Differential Revision: https://reviews.llvm.org/D52053

llvm-svn: 342294
2018-09-14 22:26:59 +00:00
Martin Storsjo 802fcb4167 [COFF] When doing automatic dll imports, replace whole .refptr.<var> chunks with __imp_<var>
After fixing up the runtime pseudo relocation, the .refptr.<var>
will be a plain pointer with the same value as the IAT entry itself.
To save a little binary size and reduce the number of runtime pseudo
relocations, redirect references to the IAT entry (via the __imp_<var>
symbol) itself and discard the .refptr.<var> chunk (as long as the
same section chunk doesn't contain anything else than the single
pointer).

As there are now cases for both setting the Live variable to true
and false externally, remove the accessors and setters and just make
the variable public instead.

Differential Revision: https://reviews.llvm.org/D51456

llvm-svn: 341175
2018-08-31 07:45:20 +00:00
Martin Storsjo eac1b05f1d [COFF] Support MinGW automatic dllimport of data
Normally, in order to reference exported data symbols from a different
DLL, the declarations need to have the dllimport attribute, in order to
use the __imp_<var> symbol (which contains an address to the actual
variable) instead of the variable itself directly. This isn't an issue
in the same way for functions, since any reference to the function without
the dllimport attribute will end up as a reference to a thunk which loads
the actual target function from the import address table (IAT).

GNU ld, in MinGW environments, supports automatically importing data
symbols from DLLs, even if the references didn't have the appropriate
dllimport attribute. Since the PE/COFF format doesn't support the kind
of relocations that this would require, the MinGW's CRT startup code
has an custom framework of their own for manually fixing the missing
relocations once module is loaded and the target addresses in the IAT
are known.

For this to work, the linker (originall in GNU ld) creates a list of
remaining references needing fixup, which the runtime processes on
startup before handing over control to user code.

While this feature is rather controversial, it's one of the main features
allowing unix style libraries to be used on windows without any extra
porting effort.

Some sort of automatic fixing of data imports is also necessary for the
itanium C++ ABI on windows (as clang implements it right now) for importing
vtable pointers in certain cases, see D43184 for some discussion on that.

The runtime pseudo relocation handler supports 8/16/32/64 bit addresses,
either PC relative references (like IMAGE_REL_*_REL32*) or absolute
references (IMAGE_REL_AMD64_ADDR32, IMAGE_REL_AMD64_ADDR32,
IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a
relocation against the corresponding IAT slot. For the absolute references,
a normal base relocation is created, to update the embedded address
in case the image is loaded at a different address.

The list of runtime pseudo relocations contains the RVA of the
imported symbol (the IAT slot), the RVA of the location the relocation
should be applied to, and a size of the memory location. When the
relocations are fixed at runtime, the difference between the actual
IAT slot value and the IAT slot address is added to the reference,
doing the right thing for both absolute and relative references.

With this patch alone, things work fine for i386 binaries, and mostly
for x86_64 binaries, with feature parity with GNU ld. Despite this,
there are a few gotchas:
- References to data from within code works fine on both x86 architectures,
  since their relocations consist of plain 32 or 64 bit absolute/relative
  references. On ARM and AArch64, references to data doesn't consist of
  a plain 32 or 64 bit embedded address or offset in the code. On ARMNT,
  it's usually a MOVW+MOVT instruction pair represented by a
  IMAGE_REL_ARM_MOV32T relocation, each instruction containing 16 bit of
  the target address), on AArch64, it's usually an ADRP+ADD/LDR/STR
  instruction pair with an even more complex encoding, storing a PC
  relative address (with a range of +/- 4 GB). This could theoretically
  be remedied by extending the runtime pseudo relocation handler with new
  relocation types, to support these instruction encodings. This isn't an
  issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64.
- For x86_64, if references in code are encoded as 32 bit PC relative
  offsets, the runtime relocation will fail if the target turns out to be
  out of range for a 32 bit offset.
- Fixing up the relocations at runtime requires making sections writable
  if necessary, with the VirtualProtect function. In Windows Store/UWP apps,
  this function is forbidden.

These limitations are addressed by a few later patches in lld and
llvm.

Differential Revision: https://reviews.llvm.org/D50917

llvm-svn: 340726
2018-08-27 08:43:31 +00:00
Peter Collingbourne ab038025a5 COFF: Implement safe ICF on rodata using address-significance tables.
Differential Revision: https://reviews.llvm.org/D51050

llvm-svn: 340555
2018-08-23 17:44:42 +00:00
Peter Collingbourne 71c7de5b77 COFF: Preserve section type when processing /section flag.
It turns out that we were dropping this before.

Differential Revision: https://reviews.llvm.org/D45802

llvm-svn: 330481
2018-04-20 21:23:16 +00:00
Peter Collingbourne fa322abee9 COFF: Rename Chunk::getPermissions to getOutputCharacteristics.
In an upcoming change I will need to make a distinction between section
type (code, data, bss) and permissions. The term that I use for both
of these things is "output characteristics".

Differential Revision: https://reviews.llvm.org/D45799

llvm-svn: 330361
2018-04-19 20:03:24 +00:00
Peter Collingbourne 3108802f16 COFF: Friendlier undefined symbol errors.
Summary:
This change does three things:
- Try to find the file and line number of an undefined symbol
  reference by reading codeview debug info.
- Try to find the name of the function or global variable with the
  undefined symbol reference by searching the object file's symbol
  table.
- Prints the information in the same style as the ELF linker.

Differential Revision: https://reviews.llvm.org/D45467

llvm-svn: 330235
2018-04-17 23:32:33 +00:00
Peter Collingbourne 2f6d00612d COFF: Make SectionChunk::Relocs field an ArrayRef. NFCI.
Differential Revision: https://reviews.llvm.org/D45714

llvm-svn: 330172
2018-04-17 01:54:34 +00:00
Peter Collingbourne f1a11f87a0 COFF: Implement string tail merging.
In COFF, duplicate string literals are merged by placing them in a
comdat whose leader symbol name contains a specific prefix followed
by the hash and partial contents of the string literal. This gives
us an easy way to identify sections containing string literals in
the linker: check for leader symbol names with the given prefix.

Any sections that are identified in this way as containing string
literals may be tail merged. We do so using the StringTableBuilder
class, which is also used to tail merge string literals in the ELF
linker. Tail merging is enabled only if ICF is enabled, as this
provides a signal as to whether the user cares about binary size.

Differential Revision: https://reviews.llvm.org/D44504

llvm-svn: 327668
2018-03-15 21:14:02 +00:00
Reid Kleckner af2f7da74c [COFF] Add minimal support for /guard:cf
Summary:
This patch adds some initial support for Windows control flow guard. At
the end of the day, the linker needs to synthesize a table of RVAs very
similar to the structured exception handler table (/safeseh).

Both /safeseh and /guard:cf take sections of symbol table indices
(.sxdata and .gfids$y) and turn them into RVA tables referenced by the
load config struct in the CRT through special symbols.

Reviewers: ruiu, amccarth

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42592

llvm-svn: 324306
2018-02-06 01:58:26 +00:00
Peter Collingbourne 1621c20ffc Reland r319090, "COFF: Do not create SectionChunks for discarded comdat sections." with a fix for debug sections.
If /debug was not specified, readSection will return a null
pointer for debug sections. If the debug section is associative with
another section, we need to make sure that the section returned from
readSection is not a null pointer before adding it as an associative
section.

Differential Revision: https://reviews.llvm.org/D40533

llvm-svn: 319133
2017-11-28 01:30:07 +00:00
Peter Collingbourne c8477b8234 Revert r319090, "COFF: Do not create SectionChunks for discarded comdat sections."
Caused test failures in check-cfi on Windows.
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/20284

llvm-svn: 319100
2017-11-27 21:37:51 +00:00
Peter Collingbourne 3f2921f5ec COFF: Do not create SectionChunks for discarded comdat sections.
With this change, instead of creating a SectionChunk for each section
in the object file, we only create them when we encounter a prevailing
comdat section.

Also change how symbol resolution occurs between comdat symbols. Now
only the comdat leader participates in comdat resolution, and not any
other external associated symbols. This is more in line with how COFF
semantics are defined, and should allow for a more straightforward
implementation of non-ANY comdat types.

On my machine, this change reduces our runtime linking a release
build of chrome_child.dll with /nopdb from 5.65s to 4.54s (median of
50 runs).

Differential Revision: https://reviews.llvm.org/D40238

llvm-svn: 319090
2017-11-27 20:42:34 +00:00
Rui Ueyama f52496e1e0 Rename SymbolBody -> Symbol
Now that we have only SymbolBody as the symbol class. So, "SymbolBody"
is a bit strange name now. This is a mechanical change generated by

  perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)

nd clang-format-diff.

Differential Revision: https://reviews.llvm.org/D39459

llvm-svn: 317370
2017-11-03 21:21:47 +00:00
Martin Storsjo 67dd3415c0 [COFF] Don't error out on relocations to discarded sections in .eh_frame
This allows linking code with dwarf exception handling.

Differential Revision: https://reviews.llvm.org/D38681

llvm-svn: 315273
2017-10-10 06:05:29 +00:00
Rui Ueyama 3f851704c1 Move new lld's code to Common subdirectory.
New lld's files are spread under lib subdirectory, and it isn't easy
to find which files are actually maintained. This patch moves maintained
files to Common subdirectory.

Differential Revision: https://reviews.llvm.org/D37645

llvm-svn: 314719
2017-10-02 21:00:41 +00:00
Rui Ueyama cfc2f80df6 Remove {get,set}Align accessor functions and use Alignment member variable instead.
llvm-svn: 313204
2017-09-13 21:54:55 +00:00
Martin Storsjo d2752aa9ec [COFF] Add support for aligncomm directives
These are emitted for comm symbols in object files, when targeting
a GNU environment.

Alternatively, just ignore them since we already align CommonChunk
to the natural size of the content (up to 32 bytes). That would only
trade away the possibility to overalign small symbols, which doesn't
sound like something that might not need to be handled?

Differential Revision: https://reviews.llvm.org/D36304

llvm-svn: 310871
2017-08-14 19:07:27 +00:00
Reid Kleckner 8d2cbf2e9b [PDB] Improve our PDB OMF debug directory entry
In order to get dbghelp to load our pdb, we have to fill in the
PointerToRawData field as well as the AddressOfRawData field. One is the
file offset and the other is the RVA.

llvm-svn: 309900
2017-08-02 23:19:54 +00:00
Rui Ueyama e1b48e099c Rename ObjectFile ObjFile for COFF as well.
llvm-svn: 309228
2017-07-26 23:05:24 +00:00
Martin Storsjo 82eaf6cb50 [COFF] Add support for delay loading DLLs on ARM
Differential Revision: https://reviews.llvm.org/D35768

llvm-svn: 309017
2017-07-25 20:00:37 +00:00
Shoaib Meenai 9a61a791ac [COFF] Accept discarded relocations in DWARF debug sections
DWARF debug sections can also contain relocations against symbols in
discared segments. LLD should accept such relocations.

Differential Revision: https://reviews.llvm.org/D35526

llvm-svn: 308315
2017-07-18 15:11:05 +00:00
Reid Kleckner 3b8acb2c5a [COFF] Bounds check relocations
Summary:
This would have caught the invalid object file I used in my test case in
r307726. The OOB was only caught by ASan later, which is slow and
doesn't work on some platforms. LLD should do some basic input
validation itself. This check isn't perfect, so relocations can reach
OOB by up to seven bytes, but it's better than what we had and probably
cheap.

Reviewers: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35371

llvm-svn: 307948
2017-07-13 20:29:59 +00:00
Martin Storsjo 2779165d8c [COFF] Add initial support for some ARM64 relocations and import thunks
This is enough to link a working hello world executable, with
a call to an imported function, a string constant passed to
the imported function, and loads from a global variable.

Differential Revision: https://reviews.llvm.org/D34964

llvm-svn: 307629
2017-07-11 07:22:44 +00:00
Reid Kleckner a1001b8f38 [COFF] Allow debug info to relocate against discarded symbols
Summary:
In order to do this without switching on the symbol kind multiple times,
I created Defined::getChunkAndOffset and use that instead of
SymbolBody::getRVA in the inner relocation loop.

Now we get the symbol's chunk before switching over relocation types, so
we can test if it has been discarded outside the inner relocation type
switch. This also simplifies application of section relative
relocations. Previously we would switch on symbol kind to compute the
RVA, then the relocation type, and then the symbol kind again to get the
output section so we could subtract that from the symbol RVA. Now we
*always* have an OutputSection, so applying SECREL and SECTION
relocations isn't as much of a special case.

I'm still not quite happy with the cleanliness of this code. I'm not
sure what offsets and bases we should be using during the relocation
processing loop: VA, RVA, or OutputSectionOffset.

Reviewers: ruiu, pcc

Reviewed By: ruiu

Subscribers: majnemer, inglorion, llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D34650

llvm-svn: 306566
2017-06-28 17:06:35 +00:00
Reid Kleckner f5bb738f75 [PDB] Don't emit debug info associated with dead chunks
Summary:
Previously we didn't add debug info chunks to the SparseChunks array, so
they didn't participate in section GC. Now we do.

Reviewers: ruiu

Subscribers: aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D34356

llvm-svn: 305811
2017-06-20 17:14:09 +00:00
Reid Kleckner 44cdb10964 [PDB] Start emitting source file and line information
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.

This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.

File checksums and string tables, on the other hand, need to be re-done.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34257

llvm-svn: 305713
2017-06-19 17:21:45 +00:00
Reid Kleckner 79ac99b3e8 [COFF] Drop unused comdat sections when GC is turned off
Summary:
Adds a "Discarded" bool to SectionChunk to indicate if the section was
discarded by COMDAT deduplication. The Writer still just checks
`isLive()`.

Fixes PR33446

Reviewers: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34288

llvm-svn: 305582
2017-06-16 20:47:19 +00:00
Rui Ueyama 88172fb603 Use the same terminology as ELF.
This patch do s/color/class/g.

llvm-svn: 302326
2017-05-05 23:52:24 +00:00
Peter Collingbourne 6f24fdb6a0 COFF: Change the /lldmap output format to be more like the ELF linker.
Differential Revision: https://reviews.llvm.org/D28717

llvm-svn: 291990
2017-01-14 03:14:46 +00:00
Peter Collingbourne 79a5e6b1b7 COFF: New symbol table design.
This ports the ELF linker's symbol table design, introduced in r268178,
to the COFF linker.

Differential Revision: http://reviews.llvm.org/D21166

llvm-svn: 289280
2016-12-09 21:55:24 +00:00
Rui Ueyama 3a618e5606 Port parallel ICF to COFF.
LLD used to take 11.73 seconds to link Clang. Now it is 6.94 seconds.
MSVC link takes 83.02 seconds. Note that ICF is enabled by default on
Windows, so a low latency ICF is more important than in ELF.

llvm-svn: 288487
2016-12-02 08:03:58 +00:00
Rui Ueyama 09e0b5f2c9 Emit Section Contributions.
Differential Revision: https://reviews.llvm.org/D26211

llvm-svn: 286684
2016-11-12 00:00:51 +00:00
Benjamin Kramer bd521201b7 Apply clang-tidy's misc-move-constructor-init to lld.
No functionality change intended.

llvm-svn: 271686
2016-06-03 16:57:13 +00:00
David Majnemer 22dff0aafc [COFF] Don't hard-code the load configuration size
The load configuration directory is a structure whose size varies as the
OS gains additional functionality.  To account for this, the structure's
layout begins with a size field; this allows loaders to know which
fields are available.

However, LLD hard-coded the sizes (112 bytes for 64-bit and 64 for
32-bit).  This means that we might not inform the loader of all the
pertinent fields or we might claim that there are more fields than are
actually present.

To correctly account for this, the size field must be loaded from the
_load_config_used symbol.

N.B.  The COFF spec is either wrong or out of date, the load
configuration directory is not correctly documented in the
specification: it omits the size field.

llvm-svn: 263543
2016-03-15 09:48:27 +00:00
Rui Ueyama 489a806965 Update for LLVM function name change.
llvm-svn: 257801
2016-01-14 20:53:50 +00:00
Rui Ueyama dba6b576cf COFF: Rename RoundUpToAlignment -> align.
llvm-svn: 257220
2016-01-08 22:24:26 +00:00
Rui Ueyama 43e12900d9 COFF: Non-external COMDAT sections sholud not be merged by ICF.
If a section symbol is not external, that COMDAT section should never
be merge with other sections in other compilation unit. Previously,
we didn't take visibility into account.

Note that COMDAT sections with non-external visibility makes sense
because they can be removed by dead-stripping.

Fixes https://llvm.org/bugs/show_bug.cgi?id=25686

llvm-svn: 254578
2015-12-03 02:23:33 +00:00
Rui Ueyama 548d22c073 COFF: ICF should not merge sectinos if their alignments are not the same.
There's actually a room to improve this patch. Instead of not merging
sections that have different alignements, we can choose the section that
has the largest alignment requirement among all sections that are otherwise
considered the same. Then all section alignments are satisfied, so we can
merge them.

I don't know if that improvement could make any difference for real-world
input, so I'll leave it alone. Would be interesting to revisit later.

llvm-svn: 248581
2015-09-25 16:50:12 +00:00
Rui Ueyama de88072a00 COFF: Rename Ptr -> Repl.
This pointer points to a replacement for this chunk. Ptr was not a good name.

llvm-svn: 248579
2015-09-25 16:20:24 +00:00
Rui Ueyama 3cb1f5c860 COFF: Rename A.replaceWith(B) -> B.replace(A). NFC.
llvm-svn: 248197
2015-09-21 19:36:51 +00:00
Rui Ueyama 3cfd2bff1e Remove dead code.
llvm-svn: 248105
2015-09-20 01:19:36 +00:00
Rui Ueyama 63bbe84b27 COFF: Make Chunk::writeTo() const. NFC.
This should improve code readability especially because this function
is called inside parallel_for_each.

llvm-svn: 248103
2015-09-19 23:28:57 +00:00
Rui Ueyama 27e9e6540c Remove unused #includes.
llvm-svn: 248081
2015-09-19 02:28:32 +00:00
Rui Ueyama aa95e5a4cc COFF: Parallelize ICF.
The LLD's ICF algorithm is highly parallelizable. This patch does that
using parallel_for_each.

ICF accounted for about one third of total execution time. Previously,
it took 324 ms when self-hosting. Now it takes only 62 ms.

Of course your mileage may vary. My machine is a beefy 24-core Xeon machine,
so you may not see this much speedup. But this optimization should be
effective even for 2-core machine, since I saw speedup (324 ms -> 189 ms)
when setting parallelism parameter to 2.

llvm-svn: 248038
2015-09-18 21:06:34 +00:00