Commit Graph

242 Commits

Author SHA1 Message Date
Fangrui Song e9262edf0d [ELF] SymbolTable:🔣 don't filter out PlaceholderKind
Placeholders (-y and redirectSymbols removed versioned symbols) are very rare and
the check just makes symbol table iteration slower. Most iterations filter out
placeholders anyway, so this change just drops the filter behavior.

For "Add symbols to symtabs", we need to ensure that redirectSymbols sets
isUsedInRegularObj to false when making a symbol placeholder, to avoid an
assertion failure in SymbolTableSection<ELFT>::writeTo.

My .text is 2KiB smaller. The speed-up linking chrome is 0.x%.
2021-12-26 18:11:45 -08:00
Fangrui Song 417cd2e5c5 [ELF] SymbolTable: change some vector<Symbol *> to SmallVector
The generated assembly for Symbol::insert is much shorter (std::vector resize is
inefficient) and enables some inlining.
2021-12-23 16:49:38 -08:00
Fangrui Song 5fc4323eda [ELF] Change some global pointers to unique_ptr
Currently the singleton `config` is assigned by `config = make<Configuration>()`
and (if `canExitEarly` is false) destroyed by `lld::freeArena`.

`make<Configuration>` allocates a stab with `malloc(4096)`. This both wastes
memory and bloats the executable (every type instantiates `BumpPtrAllocator`
which costs more than 1KiB code on x86-64).

(No need to worry about `clang::no_destroy`. Regular invocations (`canExitEarly`
is true) call `_Exit` via llvm::sys::Process::ExitNoCleanup.)

Reviewed By: lichray

Differential Revision: https://reviews.llvm.org/D116143
2021-12-22 14:36:14 -08:00
Fangrui Song 00809c8889 [ELF] Apply version script patterns to non-default version symbols
Currently version script patterns are ignored for .symver produced
non-default version (single @) symbols. This makes such symbols
not localizable by `local:`, e.g.

```
.symver foo3_v1,foo3@v1
.globl foo_v1
foo3_v1:

ld.lld --version-script=a.ver -shared a.o
```

This patch adds the support:

* Move `config->versionDefinitions[VER_NDX_LOCAL].patterns` to `config->versionDefinitions[versionId].localPatterns`
* Rename `config->versionDefinitions[versionId].patterns` to `config->versionDefinitions[versionId].nonLocalPatterns`
* Allow `findAllByVersion` to find non-default version symbols when `includeNonDefault` is true. (Note: `symtab` keys do not have `@@`)
* Make each pattern check both the unversioned `pat.name` and the versioned `${pat.name}@${v.name}`
* `localPatterns` can localize `${pat.name}@${v.name}`. `nonLocalPatterns` can prevent localization by assigning `verdefIndex` (before `parseSymbolVersion`).

---

If a user notices new `undefined symbol` errors with a version script containing
`local: *;`, the issue is likely due to a missing `global:` pattern.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D107234
2021-08-04 23:52:56 -07:00
Fangrui Song a533eb7423 Revert "[ELF] Apply version script patterns to non-default version symbols"
This reverts commit 7ed22a6fa9.

buf is not cleared so the commit misses some cases.
2021-08-04 23:52:55 -07:00
Fangrui Song 7ed22a6fa9 [ELF] Apply version script patterns to non-default version symbols
Currently version script patterns are ignored for .symver produced
non-default version (single @) symbols. This makes such symbols
not localizable by `local:`, e.g.

```
.symver foo3_v1,foo3@v1
.globl foo_v1
foo3_v1:

ld.lld --version-script=a.ver -shared a.o
# In a.out, foo3@v1 is incorrectly exported.
```

This patch adds the support:

* Move `config->versionDefinitions[VER_NDX_LOCAL].patterns` to `config->versionDefinitions[versionId].localPatterns`
* Rename `config->versionDefinitions[versionId].patterns` to `config->versionDefinitions[versionId].nonLocalPatterns`
* Allow `findAllByVersion` to find non-default version symbols when `includeNonDefault` is true. (Note: `symtab` keys do not have `@@`)
* Make each pattern check both the unversioned `pat.name` and the versioned `${pat.name}@${v.name}`
* `localPatterns` can localize `${pat.name}@${v.name}`. `nonLocalPatterns` can prevent localization by assigning `verdefIndex` (before `parseSymbolVersion`).

---

If a user notices new `undefined symbol` errors with a version script containing
`local: *;`, the issue is likely due to a missing `global:` pattern.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D107234
2021-08-04 09:02:11 -07:00
Fangrui Song a2fc964417 [ELF] Replace SymbolTable::forEachSymbol with iterator_range symbols()
D62381 introduced forEachSymbol(). It seems that many call sites cannot
be parallelized because the body shared some states. Replace
forEachSymbol with iterator_range<filter_iterator<...>> symbols() to
simplify code and improve debuggability (std::function calls take some
frames).

It also allows us to use early return to simplify code added in D69650.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D70505
2019-11-26 09:09:32 -08:00
Fangrui Song ab04ad6af7 [ELF] Rename odd variable names "New" after r365730. NFC
New -> newSym or newFlags

Reviewed By: atanasyan

Differential Revision: https://reviews.llvm.org/D66127

llvm-svn: 368651
2019-08-13 06:19:39 +00:00
Fangrui Song e1ee3837ac [ELF] Handle non-glob patterns before glob patterns in version scripts & fix a corner case of --dynamic-list
This fixes PR38549, which is silently accepted by ld.bfd.
This seems correct because it makes sense to let non-glob patterns take
precedence over glob patterns.

lld issues an error because
`assignWildcardVersion(ver, VER_NDX_LOCAL);` is processed before `assignExactVersion(ver, v.id, v.name);`.

Move all assignWildcardVersion() calls after assignExactVersion() calls
to fix this.

Also, move handleDynamicList() to the bottom. computeBinding() called by
includeInDynsym() has this cryptic rule:

    if (versionId == VER_NDX_LOCAL && isDefined() && !isPreemptible)
      return STB_LOCAL;

Before the change:

* foo's version is set to VER_NDX_LOCAL due to `local: *`
* handleDynamicList() is called
  - foo.computeBinding() is STB_LOCAL
  - foo.includeInDynsym() is false
  - foo.isPreemptible is not set (wrong)
* foo's version is set to V1

After the change:

* foo's version is set to VER_NDX_LOCAL due to `local: *`
* foo's version is set to V1
* handleDynamicList() is called
  - foo.computeBinding() is STB_GLOBAL
  - foo.includeInDynsym() is true
  - foo.isPreemptible is set (correct)

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D64550

llvm-svn: 365760
2019-07-11 11:16:51 +00:00
Rui Ueyama 3837f4273f [Coding style change] Rename variables so that they start with a lowercase letter
This patch is mechanically generated by clang-llvm-rename tool that I wrote
using Clang Refactoring Engine just for creating this patch. You can see the
source code of the tool at https://reviews.llvm.org/D64123. There's no manual
post-processing; you can generate the same patch by re-running the tool against
lld's code base.

Here is the main discussion thread to change the LLVM coding style:
https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
In the discussion thread, I proposed we use lld as a testbed for variable
naming scheme change, and this patch does that.

I chose to rename variables so that they are in camelCase, just because that
is a minimal change to make variables to start with a lowercase letter.

Note to downstream patch maintainers: if you are maintaining a downstream lld
repo, just rebasing ahead of this commit would cause massive merge conflicts
because this patch essentially changes every line in the lld subdirectory. But
there's a remedy.

clang-llvm-rename tool is a batch tool, so you can rename variables in your
downstream repo with the tool. Given that, here is how to rebase your repo to
a commit after the mass renaming:

1. rebase to the commit just before the mass variable renaming,
2. apply the tool to your downstream repo to mass-rename variables locally, and
3. rebase again to the head.

Most changes made by the tool should be identical for a downstream repo and
for the head, so at the step 3, almost all changes should be merged and
disappear. I'd expect that there would be some lines that you need to merge by
hand, but that shouldn't be too many.

Differential Revision: https://reviews.llvm.org/D64121

llvm-svn: 365595
2019-07-10 05:00:37 +00:00
Rui Ueyama d8f8abbd4a Use SymbolTable::insert() to implement --trace.
Differential Revision: https://reviews.llvm.org/D62381

llvm-svn: 361791
2019-05-28 06:33:06 +00:00
Fangrui Song 7f1ff68a16 [ELF] Deleted unused forward declarations. NFC
llvm-svn: 361614
2019-05-24 09:25:47 +00:00
Rui Ueyama 7f7d2b2e62 Move code for symbol resolution from SymbolTable.cpp to Symbols.cpp.
My recent commits separated symbol resolution from the symbol table,
so the functions to resolve symbols are now in a somewhat wrong file.
This patch moves it to Symbols.cpp.

The functions are now member functions of the symbol.

This is code move change. I modified function names so that they are
appropriate as member functions, though. No functionality change
intended.

Differential Revision: https://reviews.llvm.org/D62290

llvm-svn: 361474
2019-05-23 09:58:08 +00:00
Rui Ueyama 0baaf45be7 Move SymbolTable::addCombinedLTOObject() to LinkerDriver.
Also renames it LinkerDriver::compileBitcodeFiles.

The function doesn't logically belong to SymbolTable. We added this
function to the symbol table because symbol table used to be a
container of input files. This is no longer the case.

Differential Revision: https://reviews.llvm.org/D62291

llvm-svn: 361469
2019-05-23 09:26:27 +00:00
Fangrui Song b72b091389 [ELF] Improve error message for relocations to symbols defined in discarded sections
Rather than report "undefined symbol: ", give more informative message
about the object file that defines the discarded section.

In particular, PR41133, if the section is a discarded COMDAT, print the
section group signature and the object file with the prevailing
definition. This is useful to track down some ODR issues.

We need to
* add `uint32_t DiscardedSecIdx` to Undefined for this feature.
* make ComdatGroups public and change its type to DenseMap<CachedHashStringRef, const InputFile *>

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D59649

llvm-svn: 361359
2019-05-22 09:06:42 +00:00
Fangrui Song 5ea0d06e81 [ELF] Deleted unused ComdatGroups member variable left by D61854
llvm-svn: 361266
2019-05-21 14:40:38 +00:00
Rui Ueyama bbf154cf9c Move symbol resolution code out of SymbolTable class.
This is the last patch of the series of patches to make it possible to
resolve symbols without asking SymbolTable to do so.

The main point of this patch is the introduction of
`elf::resolveSymbol(Symbol *Old, Symbol *New)`. That function resolves
or merges given symbols by examining symbol types and call
replaceSymbol (which memcpy's New to Old) if necessary.

With the new function, we have now separated symbol resolution from
symbol lookup. If you already have a Symbol pointer, you can directly
resolve the symbol without asking SymbolTable to do that.

Now that the nice abstraction become available, I can start working on
performance improvement of the linker. As a starter, I'm thinking of
making --{start,end}-lib faster.

--{start,end}-lib is currently unnecessarily slow because it looks up
the symbol table twice for each symbol.

 - The first hash table lookup/insertion occurs when we instantiate a
   LazyObject file to insert LazyObject symbols.

 - The second hash table lookup/insertion occurs when we create an
   ObjFile from LazyObject file. That overwrites LazyObject symbols
   with Defined symbols.

I think it is not too hard to see how we can now eliminate the second
hash table lookup. We can keep LazyObject symbols in Step 1, and then
call elf::resolveSymbol() to do Step 2.

Differential Revision: https://reviews.llvm.org/D61898

llvm-svn: 360975
2019-05-17 01:55:20 +00:00
Rui Ueyama 54ee6df247 Pemove SymbolTable::addBitcode as it is redundant.
Differential Revision: https://reviews.llvm.org/D61897

llvm-svn: 360846
2019-05-16 03:54:50 +00:00
Rui Ueyama d668873bfe Consistently return `Symbol *` from SymbolTable's add-family functions.
llvm-svn: 360845
2019-05-16 03:54:41 +00:00
Rui Ueyama 943cd00580 De-template parseFile() and SymbolTable's add-family functions.
Differential Revision: https://reviews.llvm.org/D61896

llvm-svn: 360844
2019-05-16 03:45:13 +00:00
Rui Ueyama 5c073a94f9 Introduce CommonSymbol.
Previously, we handled common symbols as a kind of Defined symbol,
but what we were doing for common symbols is pretty different from
regular defined symbols.

Common symbol and defined symbol are probably as different as shared
symbol and defined symbols are different.

This patch introduces CommonSymbol to represent common symbols.
After symbols are resolved, they are converted to Defined symbols
residing in a .bss section.

Differential Revision: https://reviews.llvm.org/D61895

llvm-svn: 360841
2019-05-16 03:29:03 +00:00
Rui Ueyama 7d4761928e Simplify SymbolTable::add{Defined,Undefined,...} functions.
SymbolTable's add-family functions have lots of parameters because
when they have to create a new symbol, they forward given arguments
to Symbol's constructors. Therefore, the functions take at least as
many arguments as their corresponding constructors.

This patch simplifies the add-family functions. Now, the functions
take a symbol instead of arguments to construct a symbol. If there's
no existing symbol, a given symbol is memcpy'ed to the symbol table.
Otherwise, the functions attempt to merge the existing and a given
new symbol.

I also eliminated `CanOmitFromDynSym` parameter, so that the functions
take really one argument.

Symbol classes are trivially constructible, so looks like constructing
them to pass to add-family functions is as cheap as passing a lot of
arguments to the functions. A quick benchmark showed that this patch
seems performance-neutral.

This is a preparation for
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html

Differential Revision: https://reviews.llvm.org/D61855

llvm-svn: 360838
2019-05-16 02:14:00 +00:00
Rui Ueyama 2dd5283d2a Move SymbolTable::addFile to InputFiles.cpp.
The symbol table used to be a container of vectors of input files,
but that's no longer the case because the vectors are moved out of
SymbolTable and are now global variables.

Therefore, addFile doesn't have to belong to any class. This patch
moves the function out of the class.

This patch is a preparation for my RFC [1].

[1] http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html

Differential Revision: https://reviews.llvm.org/D61854

llvm-svn: 360666
2019-05-14 12:03:13 +00:00
Rui Ueyama f432fa6eee De-template SymbolTable::addShared.
Because of r357925, this member function doesn't have to be a
template of ELFT.

llvm-svn: 357982
2019-04-09 08:52:00 +00:00
Peter Collingbourne cc1618e668 ELF: De-template SharedFile. NFCI.
Differential Revision: https://reviews.llvm.org/D60305

llvm-svn: 357925
2019-04-08 17:35:55 +00:00
Fangrui Song b4744d306c [ELF] Support --{,no-}allow-shlib-undefined
Summary:
In ld.bfd/gold, --no-allow-shlib-undefined is the default when linking
an executable. This patch implements a check to error on undefined
symbols in a shared object, if all of its DT_NEEDED entries are seen.

Our approach resembles the one used in gold, achieves a good balance to
be useful but not too smart (ld.bfd traces all DSOs and emulates the
behavior of a dynamic linker to catch more cases).

The error is issued based on the symbol table, different from undefined
reference errors issued for relocations. It is most effective when there
are DSOs that were not linked with -z defs (e.g. when static sanitizers
runtime is used).

gold has a comment that some system libraries on GNU/Linux may have
spurious undefined references and thus system libraries should be
excluded (https://sourceware.org/bugzilla/show_bug.cgi?id=6811). The
story may have changed now but we make --allow-shlib-undefined the
default for now. Its interaction with -shared can be discussed in the
future.

Reviewers: ruiu, grimar, pcc, espindola

Reviewed By: ruiu

Subscribers: joerg, emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D57385

llvm-svn: 352826
2019-02-01 02:25:05 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Fangrui Song 50394f6e01 [ELF] A shared object is needed if any of its occurrences is needed
Summary:
If a DSO appears more than once with and without --as-needed, ld.bfd and gold consider --no-as-needed to takes precedence over --as-needed. lld didn't and this patch makes it do so.

This makes it a bit away from the position-dependent behavior (how
different occurrences of the same DSO interact) and protects us from
some mysterious runtime errors: if some interceptor libraries add their
own --no-as-needed dependencies (e.g. librt.so), and the user
application specifies -Wl,--as-needed -lrt , the absence of the
DT_NEEDED entry would make dlsym(RTLD_NEXT, "clock_gettime") return NULL
and would break at runtime.

Reviewers: ruiu, espindola

Reviewed By: ruiu

Subscribers: emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D56089

llvm-svn: 350105
2018-12-27 22:24:45 +00:00
George Rimar 94a16cb611 [ELF] - Make SymbolTable::addDefined return Defined.
Now it returns Symbol. This should be NFC that
avoids doing cast at the caller's sides.

Differential revision: https://reviews.llvm.org/D54627

llvm-svn: 347455
2018-11-22 11:40:08 +00:00
Rui Ueyama f3fad55787 Remove `Type` parameter from SymbolTable::insert(). NFC.
`Type` parameter was used only to check for TLS attribute mismatch,
but we can do that when we actually replace symbols, so we don't need
to type as an argument. This change should simplify the interface of
the symbol table a bit.

llvm-svn: 344394
2018-10-12 18:29:18 +00:00
Rui Ueyama b0d8502049 Remove SymbolTable::addAbsolute().
addAbsolute() could be implemented as a non-member function.

llvm-svn: 344305
2018-10-11 22:15:41 +00:00
Rui Ueyama 2600e0946b Rename SymbolTable::addRegular -> SymbolTable::addDefined.
We have addAbsolute, addBitcode, addCommon, etc. addRegular looked a
bit inconsistent.

llvm-svn: 344294
2018-10-11 20:43:01 +00:00
Rui Ueyama cf41adaaa0 Remove unused default arguments.
llvm-svn: 344292
2018-10-11 20:38:34 +00:00
Rui Ueyama c7497d3ac5 Remove SymbolTable::addUndefined<ELF32LE>(StringRef).
Because we can implement the function as a non-member function.

llvm-svn: 344290
2018-10-11 20:34:29 +00:00
Rui Ueyama 3c4344810a Make a member function private and rename it to avoid function overloading.
llvm-svn: 344196
2018-10-10 22:49:29 +00:00
Rui Ueyama 07b4536bb7 Change how we handle -wrap.
We have an issue with -wrap that the option doesn't work well when
renamed symbols get PLT entries. I'll explain what is the issue and
how this patch solves it.

For one -wrap option, we have three symbols: foo, wrap_foo and real_foo.
Currently, we use memcpy to overwrite wrapped symbols so that they get
the same contents. This works in most cases but doesn't when the relocation
processor sets some flags in the symbol. memcpy'ed symbols are just
aliases, so they always have to have the same contents, but the
relocation processor breaks that assumption.

r336609 is an attempt to fix the issue by memcpy'ing again after
processing relocations, so that symbols that are out of sync get the
same contents again. That works in most cases as well, but it breaks
ASan build in a mysterious way.

We could probably fix the issue by choosing symbol attributes that need
to be copied after they are updated. But it feels too complicated to me.

So, in this patch, I fixed it once and for all. With this patch, we no
longer memcpy symbols. All references to renamed symbols point to new
symbols after wrapSymbols() is done.

Differential Revision: https://reviews.llvm.org/D50569

llvm-svn: 340387
2018-08-22 07:02:26 +00:00
Rui Ueyama f43fba739c Revert r336609: Fix direct calls to __wrap_sym when it is relocated.
This reverts commit r336609 as it doesn't seem to work with AArch64
thunk creation when used with ASan.

llvm-svn: 337413
2018-07-18 18:24:46 +00:00
Rui Ueyama 3e730b8ae6 Fix direct calls to __wrap_sym when it is relocated.
Patch by Matthew Koontz!

Before, direct calls to __wrap_sym would not map to valid PLT entries,
so they would crash at runtime. This change maps such calls to the same
PLT entry as calls to sym that are then wrapped.

Differential Revision: https://reviews.llvm.org/D48502

llvm-svn: 336609
2018-07-09 22:03:05 +00:00
Rui Ueyama cc013f62c1 Make fetchIfLazy only fetch an object file. NFC.
Previously, fetchIfLazy did more than the name says. Now, setting
to UsedInRegularObj is moved to another function.

llvm-svn: 329092
2018-04-03 18:01:18 +00:00
George Rimar 1ef746ba21 [ELF] - Eliminate Lazy class.
Patch removes Lazy class which
is just an excessive layer.

Differential revision: https://reviews.llvm.org/D45083

llvm-svn: 329086
2018-04-03 17:16:52 +00:00
Rui Ueyama ee17371897 Merge {COFF,ELF}/Strings.cpp to Common/Strings.cpp.
This should resolve the issue that lld build fails in some hosts
that uses case-insensitive file system.

Differential Revision: https://reviews.llvm.org/D43788

llvm-svn: 326339
2018-02-28 17:38:19 +00:00
Rafael Espindola 3f4c673d38 Put undefined symbols from shared libraries in the symbol table.
With the recent fixes these symbols have more in common than not with
regular undefined symbols.

llvm-svn: 326242
2018-02-27 20:31:22 +00:00
Peter Collingbourne 09e04af42f ELF: Stop collecting a list of symbols in ArchiveFile.
There seems to be no reason to collect this list of symbols.

Also fix a bug where --exclude-libs would apply to all symbols that
appear in an archive's symbol table, even if the relevant archive
member was not added to the link.

Differential Revision: https://reviews.llvm.org/D43369

llvm-svn: 325380
2018-02-16 20:23:54 +00:00
George Rimar cd141ce3e3 [ELF] - Remove dead declaration. NFC.
llvm-svn: 323747
2018-01-30 11:03:27 +00:00
Rafael Espindola 9a84f6b954 Detemplate reportDuplicate.
We normally avoid "switch (Config->EKind)", but in this case I think
it is worth it.

It is only executed when there is an error and it allows detemplating
a lot of code.

llvm-svn: 321404
2017-12-23 17:21:39 +00:00
Rafael Espindola 8cd6674f5b Use a reference in addLazyArchive. NFC.
llvm-svn: 321194
2017-12-20 17:48:28 +00:00
Rafael Espindola a32ddc4639 Use a reference for the shared symbol file.
Every shared symbol has a file, so we can use a reference.

llvm-svn: 321187
2017-12-20 16:28:19 +00:00
Rafael Espindola 7b5cc6c5dc Use a reference for a value that is never null. NFC.
llvm-svn: 321186
2017-12-20 16:19:48 +00:00
Rafael Espindola f1687125ba Use a reference for a value that is never null. NFC.
llvm-svn: 321185
2017-12-20 16:16:40 +00:00
Rafael Espindola 8f619ab826 Compact symbols from 96 to 88 bytes.
By using an index instead of a pointer for verdef we can put the index
next to the alignment field. This uses the otherwise wasted area and
reduces the shared symbol size.

By itself the performance change of this is in the noise, but I have a
followup patch to remove another 8 bytes that improves performance
when combined with this.

llvm-svn: 320449
2017-12-12 01:45:49 +00:00