Commit Graph

14488 Commits

Author SHA1 Message Date
Fangrui Song 2bf06d9345 [ELF] Support symbol names with space in linker script expressions
Fix PR51961

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D110490
2021-09-27 09:50:42 -07:00
Fangrui Song db6a00daa0 [ELF] Remove unneeded binding parameter from addOptionalRegular. NFC
__rela_iplt_start uses spurious STB_WEAK, but it doesn't matter because STV_HIDDEN overrides the binding.
2021-09-25 15:47:27 -07:00
Fangrui Song d23fd8ae89 [ELF] Replace noneRel = R_*_NONE with static constexpr. NFC
All architectures define R_*_NONE to 0.
2021-09-25 15:16:44 -07:00
Fangrui Song 40cd4db442 [ELF] Default gotBaseSymInGotPlt to false (NFC for most architectures)
Most architectures use .got instead of .got.plt, so switching the default can
minimize customization.

This fixes an issue for SPARC V9 which uses .got .
AVR, AMDGPU, and MSP430 don't seem to use _GLOBAL_OFFSET_TABLE_.
2021-09-25 15:06:09 -07:00
Fangrui Song a892c0e49e [ELF][test] Improve test coverage 2021-09-25 11:57:54 -07:00
Mike Hommey 08ef24f6ab Wrap xar/xar.h include in extern "C" block
Without such wrapping, linking lld fails with missing symbols because of
C++ symbol mangling with older versions of the MacOSX SDK, in which
xar.h doesn't have an extern "C" block itself.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D110224
2021-09-23 09:37:30 +02:00
Fangrui Song 19d53d45f2 [ELF][AArch64] Refine and fix the condition when BTI/PAC PLT needs bti c
(As I mentioned in https://reviews.llvm.org/D62609#1534158 ,
the condition for using bti c for executable can be loosened.)

In two cases the address of a PLT may escape:

* canonical PLT entry for a STT_FUNC
* non-preemptible STT_GNU_IFUNC which is converted to STT_FUNC

The first case can be detected with `needsPltAddr`.

The second case is not straightforward to detect because for the Relocations.cpp
created `directSym`, it's difficult to know whether the associated `sym` has
exercised the `!needsPlt(expr)` code path. Just use the conservative `isInIplt`
condition. A non-preemptible ifunc not referenced by non-GOT-generating
non-PLT-generating relocations will have an unneeded `bti c`, but the cost is acceptable.

The second case fixes a bug as well: a -shared link may have non-preemptible ifunc.
Before the patch we did not emit `bti c` and could be wrong if the PLT address escaped.
GNU ld doesn't handle the case: `relocation R_AARCH64_ADR_PREL_PG_HI21 against STT_GNU_IFUNC symbol 'ifunc2' isn't handled by elf64_aarch64_final_link_relocate` (https://sourceware.org/bugzilla/show_bug.cgi?id=28370)

For -shared, if BTI is enabled but PAC is disabled, the PLT entry size increases
from 16 to 24 because we have to select the PLT scheme early, but the cost is
acceptable.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D110217
2021-09-22 11:51:09 -07:00
Hongtao Yu d9b511d8e8 [CSSPGO] Set PseudoProbeInserter as a default pass.
Currenlty PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e, through the linker or llc), where user has always to pass -pseudo-probe-for-profiling explictly. I'm making the pass a default pass that requires no command line arg to trigger, but will be actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D110209
2021-09-22 09:09:48 -07:00
Andrew Ng 05b1303421 [ELF][test] Restore important part of ICF alignment test
Restore the checking of addresses in ICF test which was testing the
behaviour of ICF with regards to different alignments of otherwise
identical sections. Also make the test more robust to layout changes.

Differential Revision: https://reviews.llvm.org/D110090
2021-09-22 14:15:33 +01:00
Amy Huang 6e994a833e [lld] Remove timers.ll because inconsistent timers behavior causes the test to fail sometimes
See https://reviews.llvm.org/D109904
2021-09-20 09:57:18 -07:00
Fangrui Song a954bb18b1 [ELF] Add --why-extract= to query why archive members/lazy object files are extracted
Similar to D69607 but for archive member extraction unrelated to GC. This patch adds --why-extract=.

Prior art:

GNU ld -M prints
```
Archive member included to satisfy reference by file (symbol)

a.a(a.o)                      main.o (a)
b.a(b.o)                      (b())
```

-M is mainly for input section/symbol assignment <-> output section mapping
(often huge output) and the information may appear ad-hoc.

Apple ld64
```
__Z1bv forced load of b.a(b.o)
_a forced load of a.a(a.o)
```

It doesn't say the reference file.

Arm's proprietary linker
```
Selecting member vsnprintf.o(c_wfu.l) to define vsnprintf.
...
Loading member vsnprintf.o from c_wfu.l.
              definition:  vsnprintf
              reference :  _printf_a
```

---

--why-extract= gives the user the full data (which is much shorter than GNU ld
-Map). It is easy to track a chain of references to one archive member with a
one-liner, e.g.

```
% ld.lld main.o a_b.a b_c.a c.a -o /dev/null --why-extract=- | tee stdout
reference       extracted       symbol
main.o  a_b.a(a_b.o)    a
a_b.a(a_b.o)    b_c.a(b_c.o)    b()
b_c.a(b_c.o)    c.a(c.o)        c()

% ruby -ane 'BEGIN{p={}}; p[$F[1]]=[$F[0],$F[2]] if $.>1; END{x="c.a(c.o)"; while y=p[x]; puts "#{y[0]} extracts #{x} to resolve #{y[1]}"; x=y[0] end}' stdout
b_c.a(b_c.o) extracts c.a(c.o) to resolve c()
a_b.a(a_b.o) extracts b_c.a(b_c.o) to resolve b()
main.o extracts a_b.a(a_b.o) to resolve a
```

Archive member extraction happens before --gc-sections, so this may not be a live path
under --gc-sections, but I think it is a good approximation in practice.

* Specifying a file avoids output interleaving with --verbose.
* Required `=` prevents accidental overwrite of an input if the user forgets `=`. (Most of compiler drivers' long options accept `=` but not ` `)

Differential Revision: https://reviews.llvm.org/D109572
2021-09-20 09:52:30 -07:00
Fangrui Song d001ab82e4 [ELF] Don't fall back to .text for e_entry
We have the rule to simulate
(https://sourceware.org/binutils/docs/ld/Entry-Point.html),
but the behavior is questionable
(https://sourceware.org/pipermail/binutils/2021-September/117929.html).

gold doesn't fall back to .text.
The behavior is unlikely relied by projects (there is even a warning for
executable links), so let's just delete this fallback path.

Reviewed By: jhenderson, peter.smith

Differential Revision: https://reviews.llvm.org/D110014
2021-09-20 09:35:12 -07:00
Nico Weber 1b2c36aa5f [lld/mac] Fix comment typo to cycle bots 2021-09-18 11:15:21 -04:00
Amy Huang 724a1dff8a [lld] Fix small error in previous commit
6f7483b1ec.
2021-09-17 17:47:21 -07:00
Amy Huang 6f7483b1ec Reland "[LLD] Remove global state in lld/COFF" after fixing asan and msan test failures
Original commit description:

  [LLD] Remove global state in lld/COFF

  This patch removes globals from the lldCOFF library, by moving globals
  into a context class (COFFLinkingContext) and passing it around wherever
  it's needed.

  See https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html for
  context about removing globals from LLD.

  I also haven't moved the `driver` or `config` variables yet.

  Differential Revision: https://reviews.llvm.org/D109634

This reverts commit a2fd05ada9.

Original commits were b4fa71eed3
and e03c7e367a.
2021-09-17 17:18:42 -07:00
Jez Ng 91ace9f062 [lld-macho] Construct CFString literals by copying the ConcatInputSection
... instead of constructing a new one each time. This allows us
to take advantage of {D105305}.

I didn't see a substantial difference when linking chromium_framework,
but this paves the way for reusing similar logic for splitting compact
unwind entries into sections. There are a lot more of those, so the
performance impact is significant.

Differential Revision: https://reviews.llvm.org/D109895
2021-09-17 19:46:20 -04:00
Vy Nguyen b428c3e8c1 [lld-macho] Ignore local personality symbols if non-local with the same name exisst, to avoid "too many personalities" error.
Sometimes people intentionally re-define a dylib personlity symbol as a local defined symbol as a workaround to a ld -r bug.
    As a result, we could see "too many personalities" to encode. This patch tries to handle this case by ignoring the local symbols entirely.

    Differential Revision: https://reviews.llvm.org/D107533
2021-09-17 12:59:42 -04:00
Nuri Amari aaf00f3f19 Add MachO signature verification test
Add a test to ensure that MachO files including
a LC_CODE_SIGNATURE load command produced by lld
are signed correctly.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D109840
2021-09-16 17:55:32 -07:00
Nuri Amari cc8229603b Extract LC_CODE_SIGNATURE related implementation out of LLD
Move the functionality in lld that handles writing of the LC_CODE_SIGNATURE load command and associated data section to a central reusable location.

This change is in preparation for another change that modifies llvm-objcopy to reproduce the LC_CODE_SIGNATURE load command and corresponding
data section to maintain the validity of signed macho object files passed through llvm-objcopy.

Reviewed By: #lld-macho, int3, oontvoo

Differential Revision: https://reviews.llvm.org/D109803
2021-09-16 17:43:39 -07:00
Fangrui Song 1d08a19a38 [ELF] Clarify --export-dynamic-symbol/--dynamic-list. NFC 2021-09-16 17:13:08 -07:00
Amy Huang a2fd05ada9 Temporarily revert "[LLD] Remove global state in lld/COFF" and "[lld] Add test to
check for timer output"

Seems to be causing a number of asan test failures.

This reverts commit b4fa71eed3
and e03c7e367a.
2021-09-16 11:58:11 -07:00
Amy Huang e03c7e367a [lld] Add test to check for timer output
This test checks that timers are working and printing as expected.

I also seem to have changed the order of the timers in my globals refactoring
patch, so I fixed it here.

Differential Revision: https://reviews.llvm.org/D109904
2021-09-16 11:36:46 -07:00
Amy Huang b4fa71eed3 [LLD] Remove global state in lld/COFF
This patch removes globals from the lldCOFF library, by moving globals
into a context class (COFFLinkingContext) and passing it around wherever
it's needed.

See https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html for
context about removing globals from LLD.

I also haven't moved the `driver` or `config` variables yet.

Differential Revision: https://reviews.llvm.org/D109634
2021-09-16 11:00:23 -07:00
Alfonso Gregory a2c319fdc6 [LLVM][CMake][NFC] Resolve FIXME: Rename LLVM_CMAKE_PATH to LLVM_CMAKE_DIR throughout the project
This way, we do not need to set LLVM_CMAKE_PATH to LLVM_CMAKE_DIR when (NOT LLVM_CONFIG_FOUND)

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D107717
2021-09-16 18:29:57 +02:00
Thomas Lively 962acf0a27 [lld][WebAssembly] Use llvm-objdump to test __wasm_init_memory
Rather than depending on the hex dump from obj2yaml. Now the test shows the
expected function body in a human readable format.

Differential Revision: https://reviews.llvm.org/D109730
2021-09-14 18:07:59 -07:00
Nico Weber ed2f0ad307 [lld/mac] Search .tbd before binary for framework files too
This matters for example for the iPhoneSimulator14.0.sdk, which has
a System/Library/Frameworks/UIKit.framework/UIKit that has
LC_BUILD_VERSION with minos of 14.0, so linking against that file
will produce warnings like:

   .../iPhoneSimulator14.0.sdk/System/Library/Frameworks/UIKit.framework/UIKit
   has version 14.0.0, which is newer than target minimum of 12.0.0

when targeting x86_64-apple-ios12.0-simulator. That doens't happen when
linking against UIKit.tbd instead, obviously.

Linking with RC_TRACE_DYLIB_SEARCHING=1 shows that ld64 also searches
the tbd file first, and we already get that right for non-framework
dylibs.

Fixes crbug.com/1249456.

Differential Revision: https://reviews.llvm.org/D109768
2021-09-14 15:26:45 -04:00
Sam Clegg 6ee55f9ab5 Fix test failure created by ef8c9135ef
Followup to https://reviews.llvm.org/D108877 to fix test
failure.
2021-09-14 07:35:05 -07:00
Sam Clegg ef8c9135ef [WebAssembly] Allow import and export of TLS symbols between DSOs
We previously had a limitation that TLS variables could not
be exported (and therefore could also not be imported).  This
change removed that limitation.

Differential Revision: https://reviews.llvm.org/D108877
2021-09-14 06:47:37 -07:00
Thomas Lively b2032f18c9 [lld][WebAssembly] Relax limitations on multithreaded instantiation
For multithreaded modules (i.e. modules with a shared memory), lld injects a
synthetic Wasm start function that is automatically called during instantiation
to initialize memory from passive data segments. Even though the module will be
instantiated separately on each thread, memory initialization should happen only
once. Furthermore, memory initialization should be finished by the time each
thread finishes instantiation. Since multiple threads may be instantiating their
modules at the same time, the synthetic function must synchronize them.

The current synchronization tries to atomically increment a flag from 0 to 1 in
memory then enters one of two cases. First, if the increment was successful, the
current thread is responsible for initializing memory. It does so, increments
the flag to 2 to signify that memory has been initialized, then notifies all
threads waiting on the flag. Otherwise, the thread atomically waits on the flag
with an expected value of 1 until memory has been initialized. Either the
initializer thread finishes initializing memory (i.e. sets the flag to 2) first
and the waiter threads do not end up blocking, or the waiter threads succesfully
start waiting before memory is initialized so they will be woken by the
initializer thread once it has finished.

One complication with this scheme is that there are various contexts on the Web,
most notably on the main browser thread, that cannot successfully execute a
wait. Executing a wait in these contexts causes a trap, and in this case would
cause instantiation to fail. The embedder must therefore ensure that these
contexts win the race and become responsible for initializing memory, since that
is the only code path that does not execute a wait.

Unfortunately, since only one thread can win the race and initialize memory,
this scheme makes it impossible to have multiple threads in contexts that cannot
wait. For example, it is not currently possible to instantiate the module on
both the main browser thread as well as in an AudioWorklet. To loosen this
restriction, this commit inserts an extra check so that the wait will not be
executed at all when memory has already been initialized, i.e. when the flag
value is 2. After this change, the module can be instantiated on threads in
non-waiting contexts as long as the embedder can guarantee either that the
thread will win the race and initialize memory (as before) or that memory has
already been initialized when instantiation begins. Threads in contexts that can
wait can continue racing to initialize memory.

Fixes (or at least improves) PR51702.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D109722
2021-09-13 15:03:51 -07:00
Sam Clegg b78c85a44a [WebAssembly] Convert to new "dylink.0" section format
This format is based on sub-sections (like the "linking" and "name"
sections) and is therefore easier to extend going forward.

spec change: https://github.com/WebAssembly/tool-conventions/pull/170
binaryen change: https://github.com/WebAssembly/binaryen/pull/4141
wabt change:  https://github.com/WebAssembly/wabt/pull/1707
emscripten change: https://github.com/emscripten-core/emscripten/pull/15019

Differential Revision: https://reviews.llvm.org/D109595
2021-09-12 05:30:38 -07:00
Sam Clegg 3a7bcba34b [lld][WebAssembly] Cleanup output of --verbose
Remove some unnecessary logging from wasm-ld when running under
`--verbose`.  Unlike `-debug` this logging is available in release
builds.  This change makes it little more minimal/readable.

Also, avoid compiling the `debugWrite` function in releaase builds
where it does nothing.  This should remove a lot debug strings from
the binary, and avoid having to construct unused debug strings at
runtime.

Differential Revision: https://reviews.llvm.org/D109583
2021-09-10 11:35:50 -04:00
Fangrui Song bcc34ab6c8 [lld] Enable ANSI escape code for Windows
Buffered diagnostics need ENABLE_VIRTUAL_TERMINAL_PROCESSING after D87272.
Do it unconditionally like FileCheck.
2021-09-09 16:51:11 -07:00
Sam Clegg 6355234660 [lld][WebAssembly] Fix crash on un-used __tls_base symbol
In the case that TLS is used in the single-threaded program, and
therefore effectively lowered away, we still optionally create a
`__tls_base` symbols, but the code for setting it was assuming it was
always created.

Differential Revision: https://reviews.llvm.org/D109518
2021-09-09 12:45:58 -04:00
Fangrui Song 0db402c5b4 [lld] Buffer writes when composing a single diagnostic
llvm::errs() is unbuffered. On a POSIX platform, composing a diagnostic
string may invoke the ::write syscall multiple times, which can be slow.
Buffer writes to a temporary SmallString when composing a single diagnostic to
reduce the number of ::write syscalls to one (also easier to read under
strace/truss).

For an invocation of ld.lld with 62000+ lines of
`ld.lld: warning: symbol ordering file: no such symbol: ` warnings (D87121),
the buffering decreases the write time from 1s to 0.4s (for /dev/tty) and
from 0.4s to 0.1s (for a tmpfs file). This can speed up
`relocation R_X86_64_PC32 out of range` diagnostic printing as well
with `--noinhibit-exec --no-fatal-warnings`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D87272
2021-09-09 09:27:14 -07:00
Sam Clegg 44177e5fb2 [WebAssembly] Add explict TLS symbol flag
As before we maintain backwards compat with older object files
by also infering the TLS flag based on the name of the segment.

This change is was split out from https://reviews.llvm.org/D108877.

Differential Revision: https://reviews.llvm.org/D109426
2021-09-09 10:03:30 -04:00
Fangrui Song aa4dfba522 [ELF] Infer EM_HEXAGON in getBitcodeMachineKind 2021-09-07 20:46:37 -07:00
Fangrui Song abd80ecf6e [ELF][test] Improve gitBitcodeMachineKind tests 2021-09-07 11:38:43 -07:00
Jez Ng d9ab62ca3d [lld-macho] Initialize LTO backend with diagnostic handler
Failing to do so results in `std::bad_function_call` being
thrown when a pass tries to emit a diagnostic.

I've copied the relevant test over from LLD-ELF's test suite.

Reviewed By: #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D109274
2021-09-04 17:40:07 -04:00
David Blaikie bc066e26c9 DebugInfo: Fix a few bot failures for type dumping fixes 2021-09-03 14:08:58 -07:00
Nico Weber c15b588852 [lld/mac] Don't assert during thunk insertion if there are undefined symbols
We end up calling resolveBranchVA(), which asserts for Undefineds.

As fix, just return early in Writer::run() if there are any diagnostics
after processing relocations (which is where undefined symbol errors are
emitted). This matches what the ELF port does.

Differential Revision: https://reviews.llvm.org/D109079
2021-09-03 12:22:41 -04:00
Nico Weber 9d22754389 Fix lld build after 5881dcff7e 2021-09-02 15:07:10 -04:00
Sid Manning 0d7e5daedc [lld][Hexagon] Add checks for instructions that can have TLS relocations
Several instructions with potential TLS relocations were missing.  This
issue was found when building the Canadian LLVM toolchain.
2021-09-01 13:15:18 -07:00
Alexandre Ganea 7f0664f193 [LLD][COFF] Clean paths in PDB even when /pdbsourcepath is omitted
Differential Revision: https://reviews.llvm.org/D109030
2021-08-31 19:05:10 -04:00
Fangrui Song f9277caffc [ELF][test] Fix R_AARCH64_ADR_PREL_PG_HI21 typo
Found by redfast00
2021-08-31 13:09:55 -07:00
Nico Weber 86c8f395ae [lld/mac] Leave more room for thunks in thunk placement code
Fixes PR51578 in practice.

Currently there's only enough room for a single thunk, which for real-life code
isn't enough. The error case only happens when there are many branch statements
very close to each other (0 or 1 instructions apart), with the function at the
finalization barrier small.

There's a FIXME on what to do if we hit this case, but that suggestion sounds
complicated to me (see end of PR51578 comment 5 for why).

Instead, just leave more room for thunks. Chromium's unit_tests links fine with
room for 3 thunks. Leave room for 100, which should fix this for most cases in
practice.

There's little cost for leaving lots of room: This slop value only determines
when we finalize sections, and we insert thunks for forward jumps into
unfinalized sections. So leaving room means we'll need a few more thunks, but
the thunk jump range is 128 MiB while a single thunk is just 12 bytes.

For Chromium's unit_tests:
With a slop of   3: thunk calls = 355418, thunks = 10903
With a slop of 100: thunk calls = 355426, thunks = 10904

Chances are 100 is enough for all use cases we'll hit in practice, but even
bumping it to 1000 would probably be fine.

Differential Revision: https://reviews.llvm.org/D108930
2021-08-30 22:09:05 -04:00
Nico Weber 83df94067d [lld/mac] Tweak estimateStubsInRangeVA a bit
- Move a few variables closer to their uses, remove some completely
  (no behavior change)
- Add some comments
- Make maxPotentialThunks include calls to stubs. It's possible that
  an earlier call to a stub late in the stub table will need a thunk,
  and that inserted thunk could push a stub earlier in the stub table
  out of range. This is unlikely to happen, but usually there are
  way fewer stub calls than non-stub calls, so if we're doing a
  conservative approximation here we might as well do it correctly.
  (For chromium's unit_tests target, 134421/242639 stub calls are
  direct calls without this change, compared to 134408/242639 with
  this change)

No real, meaningful behavior difference.

Differential Revision: https://reviews.llvm.org/D108924
2021-08-30 13:56:45 -04:00
Nico Weber 9721197520 [lld/mac] Set branchRange a bit more carefully
- Don't subtract thunkSize from branchRange. Most places care about
  the actual maximal branch range. Subtract thunkSize in the one place
  that wants to leave room for a thunk.
- Set it to 0x800_0000 instead of 0xFF_FFFF
- Subtract 4 for the positive branch direction since it's a
  two's complement 24bit number sign-extended mutiplied by 4,
  so its range is -0x800_0000..+0x7FF_FFFC
- Make boundary checks include the boundary values

This doesn't make a huge difference in practice. It's preparation
for a "real" fix for PR51578 -- but it also lets the repro in comment 0
in that bug place one more thunk before hitting the TODO.

Differential Revision: https://reviews.llvm.org/D108897
2021-08-30 12:36:06 -04:00
Fangrui Song 3726039561 [ELF] Simplify addGotEntry. NFC 2021-08-29 13:40:08 -07:00
Fangrui Song d3fdc312b2 [ELF] Untangle TLS IE and regular GOT from addGotEntry for non-mips. NFC 2021-08-29 13:21:06 -07:00
Fangrui Song 1861160697 [ELF] Move handleTlsRelocations. NFC
Prepare for addGotEntry simplification.
2021-08-29 13:11:35 -07:00