-r --gc-sections is usually not useful because it just makes intermediate output
smaller. https://bugs.llvm.org/show_bug.cgi?id=46700#c7 mentions a use case:
validating the absence of undefined symbols ealier than in the final link.
After D84129 (SHT_GROUP support in -r links), we can support -r
--gc-sections without extra code. So let's allow it.
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D84131
add_compile_options is more sensitive to its location in the file than add_definitions--it only takes effect for sources that are added after it. This updated patch ensures that the add_compile_options is done before adding any source files that depend on it.
Using add_definitions caused the flag to be passed to rc.exe on Windows and thus broke Windows builds.
After lots of follow-up fixes, there are still problems, such as
-Wno-suggest-override getting passed to the Windows Resource Compiler
because it was added with add_definitions in the CMake file.
Rather than piling on another fix, let's revert so this can be re-landed
when there's a proper fix.
This reverts commit 21c0b4c1e8.
This reverts commit 81d68ad27b.
This reverts commit a361aa5249.
This reverts commit fa42b7cf29.
This reverts commit 955f87f947.
This reverts commit 8b16e45f66.
This reverts commit 308a127a38.
This reverts commit 274b6b0c7a.
This reverts commit 1c7037a2a5.
* If two group members are combined, we should leave just one index in the SHT_GROUP content.
* If a group member is discarded (/DISCARD/ or upcoming -r --gc-sections combination),
we should drop its index in the SHT_GROUP content. LLD currently crashes (`getOutputSection()` is null).
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D84129
The PC Relative code now allows for calls that are marked with the relocation
R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC
pointer in R2 and does not require R2 to be restored after the call.
This patch is added to support local calls to callees that require a TOC
Reviewed By: sfertile, MaskRay, nemanjai, stefanp
Differential Revision: https://reviews.llvm.org/D83504
The "undefined symbol" error message from lld-link displays up to 3 references to that symbol, and the number of extra references not shown.
This patch removes the computation of the strings for those extra references.
It fixes a freeze of lld-link we accidentally encountered when activating asan on a large project, without linking with the asan library.
In that case, __asan_report_load8 was referenced more than 2 million times, causing the computation of that many display strings, of which only 3 were used.
Differential Revision: https://reviews.llvm.org/D83510
The test fails in 32-bit Windows builds for unclear reasons:
ld.lld: error: failed to open
C:\src\llvm_package_1100-rc1\build32_stage0\tools\lld\test\ELF\Output\arm-exidx-range.s.tmp:
The parameter is incorrect.
The `intrinsics_gen` target exists in the CMake exports since r309389
(see LLVMConfig.cmake.in), hence projects can depend on `intrinsics_gen`
even it they are built separately from LLVM.
Reviewed By: MaskRay, JDevlieghere
Differential Revision: https://reviews.llvm.org/D83454
Accounting for the fact that Wasm function indices are 32-bit, but in wasm64 we want uniform 64-bit pointers.
Includes reloc types for 64-bit table indices.
Differential Revision: https://reviews.llvm.org/D83729
Rather than handling zlib handling manually, use find_package from CMake
to find zlib properly. Use this to normalize the LLVM_ENABLE_ZLIB,
HAVE_ZLIB, HAVE_ZLIB_H. Furthermore, require zlib if LLVM_ENABLE_ZLIB is
set to YES, which requires the distributor to explicitly select whether
zlib is enabled or not. This simplifies the CMake handling and usage in
the rest of the tooling.
This is a reland of abb0075 with all followup changes and fixes that
should address issues that were reported in PR44780.
Differential Revision: https://reviews.llvm.org/D79219
After D69985, symbols for "-init" and "-fini" were unconditionally
marked as used even if they were just lazy symbols seen when scanning
archives. That resulted in exposing them in the symbol table of an
output file, as Undefined, which added unwanted dependencies. The patch
fixes the issue by checking the kind of the symbols before the marking.
Differential Revision: https://reviews.llvm.org/D83549
Previously, lld would crash if the .pdata size was not an even multiple
of the expected .pdata entry size. This makes it error gracefully instead.
(We hit this in Chromium due to an assembler problem: https://crbug.com/1101577)
Differential revision: https://reviews.llvm.org/D83479
It allows handling cases when we have SHT_REL[A] sections before target
sections in objects.
This fixes https://bugs.llvm.org/show_bug.cgi?id=46632
which says: "Normally it is not what compilers would emit. We have to support it,
because some custom tools might want to use this feature, which is not restricted by ELF gABI"
Differential revision: https://reviews.llvm.org/D83469
Implements the missing relocation types for AVR target.
The results have been cross-checked with binutils.
Original patch by LemonBoy. Some changes by me.
Differential Revision: https://reviews.llvm.org/D78741
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable).
Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.
For more information see PR36198 and D43002.
Differential Revision: https://reviews.llvm.org/D80833
The PC Relative code allows for calls that are marked with the relocation
R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC
pointer in R2 and does not require R2 to be restored after the call.
This patch is added to support local calls to callees tha also do not have a TOC.
Reviewed By: sfertile, MaskRay, stefanp
Differential Revision: https://reviews.llvm.org/D82816
The R_PPC64_REL24 is used in function calls when the caller requires a
valid TOC pointer. If the callee shares the same TOC or does not clobber
the TOC pointer then a direct call can be made. If the callee does not
share the TOC a thunk must be added to save the TOC pointer for the caller.
Up until PC Relative was introduced all local calls on medium and large code
models were assumed to share a TOC. This is no longer the case because
if the caller requires a TOC and the callee is PC Relative then the callee
can clobber the TOC even if it is in the same DSO.
This patch is to add support for a TOC caller calling a PC Relative callee that
clobbers the TOC.
Reviewed By: sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D82950
The patch adds checking for various potential issues in parsing name
lookup tables and reporting them as recoverable errors, similarly as we
do for other tables.
Differential Revision: https://reviews.llvm.org/D83050
The parsing method did not check reading errors and might easily fall
into an infinite loop on an invalid input because of that.
Differential Revision: https://reviews.llvm.org/D83049
In the absence of TLS relaxation (rewrite of code sequences),
there is still an applicable optimization:
[gd]: General Dynamic: resolve DTPMOD to 1 and/or resolve DTPOFF statically
All the other relaxations are only performed when transiting to
executable (`!config->shared`).
Since [gd] is handled differently, we can fold `!config->shared` into canRelax
and simplify its use sites. Rename the variable to reflect to new semantics.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D83243
... to customize the tombstone value we use for an absolute relocation
referencing a discarded symbol. This can be used as a workaround when
some debug processing tool has trouble with current -1 tombstone value
(https://bugs.chromium.org/p/chromium/issues/detail?id=1102223#c11 )
For example, to get the current built-in rules (not considering the .debug_line special case for ICF):
```
-z dead-reloc-in-nonalloc='.debug_*=0xffffffffffffffff'
-z dead-reloc-in-nonalloc=.debug_loc=0xfffffffffffffffe
-z dead-reloc-in-nonalloc=.debug_ranges=0xfffffffffffffffe
```
To get GNU ld (as of binutils 2.35)'s behavior:
```
-z dead-reloc-in-nonalloc='*=0'
-z dead-reloc-in-nonalloc=.debug_ranges=1
```
This option has other use cases. For example, if we want to check
whether a non-SHF_ALLOC section has dead relocations.
With this patch, we can run a regular LLD and run another with a special
-z dead-reloc-in-nonalloc=, then compare their output.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D83264
In GNU ld, --no-relax can disable x86-64 GOTPCRELX relaxation.
It is not useful, so we don't implement it.
For RISC-V, --no-relax disables linker relaxations which have larger
impact.
Linux kernel specifies --no-relax when CONFIG_DYNAMIC_FTRACE is specified
(since http://git.kernel.org/linus/a1d2a6b4cee858a2f27eebce731fbf1dfd72cb4e ).
LLD has not implemented the relaxations, so this option is a no-op.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81359
The Symbol Table in LLD references the global object to add a symbol rather than adding it to itself.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D83184
Follow-up to D82899. Note, we need to disable R_DTPREL relaxation
because ARM psABI does not define TLS relaxation.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D83138
The location of a TLS variable is encoded as a DW_OP_const4u/DW_OP_const8u
followed by a DW_OP_push_tls_address (or DW_OP_GNU_push_tls_address https://sourceware.org/bugzilla/show_bug.cgi?id=11616 ).
This change follows up to D81784 and makes relocations types generalized as
R_DTPREL (e.g. R_X86_64_DTPOFF{32,64}, R_PPC64_DTPREL64) use -1 as the
tombstone value as well. This works for both TLS Variant I and Variant II
architectures.
* arm: .long tls(tlsldo) # not working currently (R_ARM_TLS_LDO32 is R_ABS)
* mips64: .dtpreldword tls+32768
* ppc64: .quad tls@DTPREL+0x8000
* riscv: neither GCC nor clang has implemented DW_AT_location. It is likely .long/.quad tls@dtprel+0x800
* x86-32: .long tls@DTPOFF
* x86-64: .long tls@DTPOFF; .quad tls@DTPOFF
tls has a non-negative st_value, so such relocations (st_value+addend)
never resolve to -1 in a normal (not discarded) case.
```
// clang -fuse-ld=lld -g -ffunction-sections a.c -Wl,--gc-sections
// foo and tls will be discarded by --gc-sections.
// DW_AT_location [DW_FORM_exprloc] (DW_OP_const8u 0xffffffffffffffff, DW_OP_GNU_push_tls_address)
thread_local int tls;
int foo() { return ++tls; }
int main() {}
```
Also, drop logic added in D26201 intended to address PR30793. It added a test
(gc-debuginfo-tls.s) using a non-SHF_ALLOC section and a local symbol, which
does not reflect the intended scenario: a relocation in a SHF_ALLOC section
referencing a discarded non-local symbol. For such a non .debug_* section, just
emit an error.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82899
On Windows co-operative programs can be expected to open LLD's
output in FILE_SHARE_DELETE mode. This allows us to delete the
file (by moving it to a temporary filename and then deleting
it) so that we can link another output file that overwrites
the existing file, even if the current file is in use.
A similar strategy is documented here:
https://boostgsoc13.github.io/boost.afio/doc/html/afio/FAQ/deleting_open_files.html
Differential Revision: https://reviews.llvm.org/D82567
Previously, we only supported binding dysyms to the GOT. This
diff adds support for binding them to any arbitrary section. C++
programs appear to use this, I believe for vtables and type_info.
This diff also makes our bind opcode encoding a bit smarter -- we now
encode just the differences between bindings, which will make things
more compact.
I was initially concerned about the performance overhead of iterating
over these relocations, but it turns out that the number of such
relocations is small. A quick analysis of my llvm-project build
directory showed that < 1.3% out of ~7M relocations are RELOC_UNSIGNED
bindings to symbols (including both dynamic and static symbols).
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D83103
This patch adds a few extra cases to the existing testing for eh_frame
and eh_frame_hdr behaviour in LLD. They all come from a private
testsuite we are trying to migrate to lit.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D82852
With this, a simple hello world links against libSystem.tbd and the
old ld64.lld linker kind of works again with newer SDKs.
The motivation here is to have an arm64 cross linker that's good
enough to be able to run simple configure link checks on non-mac
systems for generating config.h files. Once -flavor darwinnew can
link arm64, we'll switch to that.
Summary:
ld64 does this, and references an internal rdar:// number as an explanation. No
idea what that rdar issue is, but in practice, it seems that not putting a BSS
section at the end can cause subsequent sections in the same segment to be
overwritten with zeroes.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81888
After D81784, we resolve a relocation in .debug_* referencing an ICF folded
section symbol to a tombstone value.
Doing this for .debug_line has a problem (https://reviews.llvm.org/D81784#2116925 ):
.debug_line may describe folded lines as having addresses UINT64_MAX or
some wraparound small addresses.
```
int foo(int x) {
return x; // line 2
}
int bar(int x) {
return x; // line 6
}
```
```
Address Line Column File ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x00000000002016c0 1 0 1 0 0 is_stmt
0x00000000002016c7 2 9 1 0 0 is_stmt
prologue_end
0x00000000002016ca 2 2 1 0 0
0x00000000002016cc 2 2 1 0 0 end_sequence
// UINT64_MAX and wraparound small addresses
0xffffffffffffffff 5 0 1 0 0 is_stmt
0x0000000000000006 6 9 1 0 0 is_stmt
prologue_end
0x0000000000000009 6 2 1 0 0
0x000000000000000b 6 2 1 0 0 end_sequence
0x00000000002016d0 9 0 1 0 0 is_stmt
0x00000000002016df 10 6 1 0 0 is_stmt prologue_end
0x00000000002016e6 11 11 1 0 0 is_stmt
...
```
These entries can confuse debuggers:
gdb before 2020-07-01 (binutils-gdb a8caed5d7faa639a1e6769eba551d15d8ddd9510 "Recognize -1 as a tombstone value in .debug_line")
(can't continue due to a breakpoint in an invalid region of memory):
```
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x6
```
lldb (breakpoint has no effect):
```
(lldb) b 6
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
```
This patch special cases .debug_line to not use the tombstone value,
restoring the previous behavior: .debug_line will have entries with the
same addresses (ICF) but different line numbers. A breakpoint on line 2
or 6 will trigger on both functions.
Reviewed By: dblaikie, jhenderson
Differential Revision: https://reviews.llvm.org/D82828
Include the archive name as well as the member name when an error
is encountered parsing bitcode archives.
Differential Revision: https://reviews.llvm.org/D82884
This reduces peak memory on my test case from 1960.14MB to 1700.63MB
(-260MB, -13.2%) with no measurable impact on CPU time. I'm currently
working with a publics stream that is about 277MB. Before this change,
we would allocate 277MB of heap memory, serialize publics into them,
hold onto that heap memory, open the PDB, and commit into it. After
this change, we defer the serialization until commit time.
In the last change I made to public writing, I re-sorted the list of
publics multiple times in place to avoid allocating new temporary data
structures. Deferring serialization until later requires that we don't
reorder the publics. Instead of sorting the publics, I partially
construct the hash table data structures, store a publics index in them,
and then sort the hash table data structures. Later, I replace the index
with the symbol record offset.
This change also addresses a FIXME and moves the list of global and
public records from GSIHashStreamBuilder to GSIStreamBuilder. Now that
publics aren't being serialized, it makes even less sense to store them
as a list of CVSymbol records. The hash table used to deduplicate
globals is moved as well, since that is specific to globals, and not
publics.
Reviewed By: aganea, hans
Differential Revision: https://reviews.llvm.org/D81296
D79300 forgot to change `getBuffer().empty()` in LazyObjFile::parse to
`fetched`. This caused incorrect iterating after the current LazyObjFile was
fetched. This issue is benign and can just cause loss of "undefined symbols"
and "backward reference" diagnostics.
Before D79300 `mb = {}` caused --warn-backrefs-exclude to be useless for
a fetched LazyObjFile.
Add two test cases.
The meaning of -shared and -pie are expected to be changed in the
future when Module Linking-style libraries are implemented. Begin
issuing warnings to give people a heads-up that they will be changing.
For compatibility with Emscripten, add a --experimental-pic flag which
disables these warnings.
Differential Revision: https://reviews.llvm.org/D81760
Fixes PR46420
Similar to D43307 for non-LTO.
Module-level inline assembly can use .symver to create a symbol with `@` in the name.
For relocatable output, @ should be retained in the symbol name. `@ver` should
not be parsed and dropped.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D82433
Add support for the 34bit relocation R_PPC64_GOT_PCREL34 for
PC Relative in LLD.
Reviewers: sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D81948
This is the followup to D77647 which implements handling for the new
R_AARCH64_PLT32 relocation type in lld. This relocation would benefit the
PIC-friendly vtables feature described in D72959.
Differential Revision: https://reviews.llvm.org/D81184
See D59553, https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html and
https://sourceware.org/pipermail/binutils/2020-May/111357.html for
extensive discussions on a tombstone value.
See http://www.dwarfstd.org/ShowIssue.php?issue=200609.1
(Reserve an address value for "not present") for a DWARF enhancement proposal.
We resolve such relocations to a tombstone value to indicate that the address is invalid.
This solves several problems (the normal behavior is to resolve the relocation to the addend):
* For an empty function in a collected section, a pair of (0,0) can
terminate .debug_loc and .debug_ranges (as of binutils 2.34, GNU ld
resolves such a relocation to 1 to avoid the .debug_ranges issue)
* If DW_AT_high_pc is sufficiently large, the address range can collide
with a regular code range of low address (https://bugs.llvm.org/show_bug.cgi?id=41124 )
* If a text section is folded into another by ICF, we may leave entries
in multiple CUs claiming ownership of the same range of code, which can
confuse consumers.
* Debug information associated with COMDAT sections can have problems
similar to ICF, but is more complex - thus not addressed by this patch.
For pre-DWARF-v5 .debug_loc and .debug_ranges, a pair of 0 can terminate
entries (invalidating subsequent ranges).
-1 is a reserved value with special meaning (base address selection entry) which can't be used either.
Use -2 instead.
For all other .debug_*, use UINT32_MAX for 32-bit targets and UINT64_MAX
for 64-bit targets. In the code, we intentionally use
`uint64_t tombstone = UINT64_MAX` for 32-bit targets as well: this matches
SignExtend64 as used in `relocateAlloc`. (Actually UINT32_MAX does not work for R_386_32)
Note 0, we only special case `target->symbolicRel` (R_X86_64_64, R_AARCH64_ABS64, R_PPC64_ADDR64), not
short-range absolute relocations (e.g. R_X86_64_32). Only forms like DW_FORM_addr need to be special cased.
They can hold an arbitrary address (must be 64-bit on a 64-bit target). (In theory,
producers can make use of small code model to emit 32-bit relocations. This doesn't seem to be leveraged.)
Note 1, we have to ignore the addend, because we don't want to resolve
DW_AT_low_pc (which may have a non-zero addend) to -1+addend (wrap
around to a low address):
__attribute__((section(".text.x"))) void f1() { }
__attribute__((section(".text.x"))) void f2() { } // DW_AT_low_pc has a non-zero addend
Note 2, if the prevailing copy does not have debugging information while
a non-prevailing copy has (partial debug build), we don't do extra work
to attach debugging information to the prevailing definition. (clang
has a lot of debug info optimizations that are on-by-default that assume
the whole program is built with debug info).
clang -c -ffunction-sections a.cc # prevailing copy has no debug info
clang -c -ffunction-sections -g b.cc
Reviewed By: dblaikie, avl, jhenderson
Differential Revision: https://reviews.llvm.org/D81784
Summary:
There were a few issues with the previous setup:
1. The section sorting comparator used a declarative map of section names to
determine the correct order, but it turns out we need to match on more than
just names -- in particular, an upcoming diff will sort based on whether the
S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but
flexible form.
2. We were sorting OutputSections stored in a MapVector, which left the
MapVector in an inconsistent state -- the wrong keys map to the wrong values!
In practice, we weren't doing key lookups (only container iteration) after the
sort, so this was fine, but it was still a dubious state of affairs. This diff
copies the OutputSections to a vector before sorting them.
3. We were adding unneeded OutputSections to OutputSegments and then filtering
them out later, which meant that we had to remember whether an OutputSegment
was in a pre- or post-filtered state. This diff only adds the sections to the
segments if they are needed.
In addition to those major changes, two minor ones worth noting:
1. I renamed all OutputSection variable names to `osec`, to parallel `isec`.
Previously we were using some inconsistent combination of `osec`, `os`, and
`section`.
2. I added a check (and a test) for InputSections with names that clashed with
those of our synthetic OutputSections.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81887
If neither AT(lma) nor AT>lma_region is specified,
D76995 keeps `lmaOffset` (LMA - VMA) if the previous section is in the
default LMA region.
This patch additionally checks that the two sections are in the same
memory region.
Add a test case derived from https://bugs.llvm.org/show_bug.cgi?id=45313
.mdata : AT(0xfb01000) { *(.data); } > TCM
// It is odd to make .bss inherit lmaOffset, because the two sections
// are in different memory regions.
.bss : { *(.bss) } > DDR
With this patch, section VMA/LMA match GNU ld. Note, GNU ld supports
out-of-order (w.r.t sh_offset) sections and places .text and .bss in the
same PT_LOAD. We don't have that behavior.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81986
Fixes PR46348.
ObjFile<ELFT>::initializeSymbols contains two symbol iteration loops:
```
for each symbol
if non-inheriting && non-local
fill in this->symbols[i]
for each symbol
if local
fill in this->symbols[i]
else
symbol resolution
```
Symbol resolution can trigger a duplicate symbol error which will call
InputSectionBase::getObjMsg to iterate over InputFile::symbols. If a
non-local symbol appears after the non-local symbol being resolved
(violating ELF spec), its `this->symbols[i]` entry has not been filled
in, InputSectionBase::getObjMsg will crash due to
`dyn_cast<Defined>(nullptr)`.
To fix the bug, reorganize the two loops to ensure this->symbols is
complete before symbol resolution. This enforces the invariant:
InputFile::symbols has none null entry when InputFile::getSymbols() is called.
```
for each symbol
if non-inheriting
fill in this->symbols[i]
for each symbol starting from firstGlobal
if non-local
symbol resolution
```
Additionally, move the (non-local symbol in local part of .symtab)
diagnostic from Writer<ELFT>::copyLocalSymbols() to initializeSymbols().
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D81988
Some projects use the constructor attribute on functions that also
return values. In this case we just ignore them.
The error was reported in the libgpg-error project that marks
gpg_err_init with the `__constructor__` attribute.
Differential Revision: https://reviews.llvm.org/D81962
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable).
Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.
For more information see PR36198 and D43002.
Differential Revision: https://reviews.llvm.org/D80833
Summary:
llvm-mc emits `__bss` sections with an offset of zero, but we weren't expecting
that in our input, so we were copying non-zero data from the start of the file and
putting it in `__bss`, with obviously undesirable runtime results. (It appears that
the kernel will copy those nonzero bytes as long as the offset is nonzero, regardless
of whether S_ZERO_FILL is set.)
I debated on whether to make a special ZeroFillSection -- separate from a
regular InputSection -- but it seemed like too much work for now. But I'm happy
to refactor if anyone feels strongly about having it as a separate class.
Depends on D80857.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80859
Summary:
Turns out this case is actually really common -- it happens whenever there's
a reference to an `extern` variable that ends up statically linked.
Depends on D80856.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80857
Summary:
As far as I can tell, it's identical to _GOT_LOAD. llvm-mc has the following
comment explaining why _GOT exists:
```
// x86_64 distinguishes movq foo@GOTPCREL so that the linker can
// rewrite the movq to an leaq at link time if the symbol ends up in
// the same linkage unit.
```
Depends on D80855.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: MaskRay, smeenai
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80856
Summary:
As mentioned in https://reviews.llvm.org/D81326#2093931, I'm not sure it
makes sense to use the default target triple to determine -arch.
Long-term we should probably detect it from the input object files, but
in the meantime it would be nice not to have to add it to all our tests
by using a convenient default.
Reviewers: #lld-macho
Subscribers: arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81983
A hasWildcard pattern iterates over symVector, which can be slow when there
are many --export-dynamic-symbol. In optimistic cases, most patterns don't use
a wildcard character. hasWildcard: false can avoid a symbol table iteration.
While here, add two tests using `[` and `?`, respectively.
Summary:
So things work on 32-bit machines. (@vzakhari reported the
breakage starting from D80177).
Reviewers: #lld-macho, vzakhari
Subscribers: llvm-commits, vzakhari
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81982
This removes the stub library that lld injected to satisfy the
dependency on the libSystem. Now with TBD support, we can provide the
stub library to permit the tests to function properly as they would on a
real system.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D81418
This is a complete Options.td compiled from ld(1) dated 2018-03-07 and
cross checked with ld64 source code version 512.4 dated 2018-03-18.
This is the first in a series of diffs for argument handling. Follow-ups
will include switch cases for all the new instances of `OPT_foo`, and
parsing/validation of arguments attached to options, e.g., more code
akin to `OPT_platform_version` and associated `parsePlatformVersion()`.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80582
This adds 4 new reloc types.
A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) have been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.
A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are indentical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307
Differential Revision: https://reviews.llvm.org/D81704
Context: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md
This is just a first step, adding the new instruction variants while keeping the existing 32-bit functionality working.
Some of the basic load/store tests have new wasm64 versions that show that the basics of the target are working.
Further features need implementation, but these will be added in followups to keep things reviewable.
Differential Revision: https://reviews.llvm.org/D80769
Summary:
We should be reading / writing our addends / relocated addresses based on
r_length, and not just based on the type of the relocation. But since only
some r_length values are valid for a given reloc type, I've also added some
validation.
ld64 has code to allow for r_length = 0 in X86_64_RELOC_BRANCH relocs, but I'm
not sure how to create such a relocation...
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80854
Summary: After {D81326} landed, some tests started failing if they did
not have `-arch` specified. I think one of the reasons happened was due
to the fact that we were taking a reference to a temporary value that
was freed too early. Fixing that got the error to go away on my local
Linux machine.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D81802
This change introduces an LLD switch --thinlto-single-module to allow compiling only a part of the input modules. This is specifically enables:
1. Fast investigating/debugging modules of interest without spending time on compiling unrelated modules.
2. Compiler debug dump with -mllvm -debug-only= for specific modules.
It will be useful for large applications which has 1K+ input modules for thinLTO.
The switch can be combined with `--lto-obj-path=` or `--lto-emit-asm` to obtain intermediate object files or assembly files. So far the module name matching is implemented as a fuzzy name lookup where the modules with name containing the switch value are compiled.
E.g,
Command:
ld.lld main.o thin.a --thinlto-single-module=thin.a --lto-obj-path=single.o
log:
[ThinLTO] Selecting thin.a(thin1.o at 168) to compile
[ThinLTO] Selecting thin.a(thin2.o at 228) to compile
Command:
ld.lld main.o thin.a --thinlto-single-module=thin1.o --lto-obj-path=single.o
log:
[ThinLTO] Selecting thin.a(thin1.o at 168) to compile
Differential Revision: https://reviews.llvm.org/D80406
After D79300, we don't rewrite InputFile::mb to an empty buffer.
In thinLTOCreateEmptyIndexFiles(), we should check LazyObjFile::fetched
as well as checking whether mb is a bitcode, otherwise we would overwrite (path + .thinlto.bc) with an empty index.
Fixes PR45594.
In `ObjFile<ELFT>::initializeSymbols()`, for a defined symbol relative to
a discarded section (due to section group rules), it may have been
inserted as a lazy symbol. We need to demote it to an Undefined to
enable the `discarded section` error happened in a later pass.
Add `LazyObjFile::fetched` (if true) and `ArchiveFile::parsed` (if
false) to represent that there is an ongoing lazy symbol fetch and we
should replace the current lazy symbol with an Undefined, instead of
calling `Symbol::resolve` (`Symbol::resolve` should be called if the lazy
symbol was added by an unrelated archive/lazy object).
As a side result, one small issue in start-lib-comdat.s is now fixed.
The hack motivating D51892 will be unsupported: if
`.gnu.linkonce.t.__i686.get_pc_thunk.bx` in an archive is referenced
by another section, this will likely be errored unless the function is
also defined in a regular object file.
(Bringing back rL330869 would error `undefined symbol` instead of the
more relevant `discarded section`.)
Note, glibc i386's crti.o still works (PR31215), because
`.gnu.linkonce.t.__x86.get_pc_thunk.bx` is in crti.o (one of the first
regular object files in a linker command line).
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D79300
Add support to lld to use Text Based API stubs for linking. This is
support is incomplete not filtering out platforms. It also does not
account for architecture specific API handling and potentially does not
correctly handle trees of re-exports with inlined libraries being
treated as direct children of the top level library.
Use the default target triple configured by the user to determine the
default architecture for `ld64.lld`. Stash the architecture in the
configuration as when linking against TBDs, we will need to filter out
the symbols based upon the architecture. Treat the Haswell slice as it
is equivalent to `x86_64` but with the extra Haswell extensions (e.g.
AVX2, FMA3, BMI1, etc). This will make it easier to add new
architectures in the future.
This change also changes the failure mode where an invalid `-arch`
parameter will result in the linker exiting without further processing.
This merges the static and shared library and behaves as if
`-search_paths_first` was specified which is also the default behaviour
on ld64 (and now lld). Unify the paths, and use `llvm::sys::path` to
deal with the path to be truly agnostic to the host.
If both a.a and b.so define foo
```
ld.bfd -u foo a.a b.so # foo is defined
ld.bfd a.a b.so -u foo # foo is defined
ld.bfd -u foo b.so a.a # foo is undefined (provided at runtime by b.so)
ld.bfd b.so a.a -u foo # foo is undefined (provided at runtime by b.so)
```
In all cases we make foo undefined in the output. I tend to think the
GNU ld behavior makes more sense.
* In their model, they have to treat -u as a fake object file with an
undefined symbol before all input files, otherwise the first archive would not be fetched.
* Following their behavior allows us to drop a --warn-backrefs special case.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D81052
This is a very basic static library search addition. This is the pre-Xcode4
behaviour of searching all paths for the shared version before searching for
the static version of the library. This behaviour is supposed to be inverted
with `-search_paths_first` being the default. This adds the library search
with the intention of providing the setup to merge the paths into one path
and making it controllable by `OPT_search_paths_first`.
The LLVM code base already uses C++14, use std::make_unique
to avoid the explicit constructor invocation via new and to avoid
spelling out the type twice.
ld64 provides the `-search_path_firsts` which will search each path in
the library search path order for both `lib[name].dylib`, `lib[name].a`
before moving on (searching all paths for the dylib and then falling
back to the static library if a shared library was not found).
This option has been the default for a long time, but the command line
flag still exists. Ignore it for compatibility.
--no-allow-shlib-undefined (enabled by default when linking an
executable) rejects unresolved references in shared objects.
Users may be confused by the common diagnostics of unresolved symbols in
object files (LLD: "undefined symbol: foo"; GNU ld/gold: "undefined reference to")
Learn from GCC/clang " [-Wfoo]": append the option name to the
diagnostics. Users can find relevant information by searching
"--no-allow-shlib-undefined". It should also be obvious to them that
the positive form --allow-shlib-undefined can suppress the error.
Also downgrade the error to a warning if --noinhibit-exec is used (compatible
with GNU ld and gold).
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D81028
Previously, the SpecificAllocator was a static local in the `make<T>`
function template. Using static locals is nice because they are only
constructed and registered if they are accessed. However, if there are
multiple calls to make<> with different constructor parameters, we would
get multiple static local variable instances. This is undesirable and
leads to extra memory allocations. I noticed there were two sources of
DefinedRegular allocations while checking heap profiles.
My test refactoring in D80217 seems to have caused yaml2obj to emit
unaligned nlist_64 structs, causing ASAN'd lld to be unhappy. I don't
think this is an issue with yaml2obj though -- llvm-mc also seems to
emit unaligned nlist_64s. This diff makes lld able to safely do aligned
reads under ASAN builds while hopefully creating no overhead for regular
builds on architectures that support unaligned reads.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D80414
For consistency.
The no-id-dylib test was originally referencing the Inputs/ folder via a
relative path. Instead of updating that path, I decided to make the test
self-contained.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D80217
That's what ld64 uses for 64-bit targets. I figured it's best to make
this change sooner rather than later since a bunch of our tests are
relying on hardcoded addresses that depend on this value.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80177
I considered making a `Target::validate()` method, but I wasn't sure how
I felt about the overhead of doing yet another switch-dispatch on the
relocation type, so I put the validation in `relocateOne` instead...
might be a bit of a micro-optimization, but `relocateOne` does assume
certain things about the relocations it gets, and this error handling
makes that explicit, so it's not a totally unreasonable code
organization.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D80049
MIPS 64-bit ABI does not provide special PC-relative relocation like
R_MIPS_PC32 in 32-bit case. But we can use a "chain of relocation"
defined by N64 ABIs. In that case one relocation record might contain up
to three relocations which applied sequentially. Width of a final relocation
mask applied to the result of relocation depends on the last relocation
in the chain. In case of 64-bit PC-relative relocation we need the following
chain: `R_MIPS_PC32 | R_MIPS_64`. The first relocation calculates an
offset, but does not truncate the result. The second relocation just
apply calculated result as a 64-bit value.
The 64-bit PC-relative relocation might be useful in generation of
`.eh_frame` sections to escape passing `-Wl,-z,notext` flags to linker.
Differential Revision: https://reviews.llvm.org/D80390
This patch adds clang options:
-fbasic-block-sections={all,<filename>,labels,none} and
-funique-basic-block-section-names.
LLVM Support for basic block sections is already enabled.
+ -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic
block sections for all or a subset of basic blocks. "labels" only enables
basic block symbols.
+ -funique-basic-block-section-names: Enables unique section names for
basic block sections, disabled by default.
Differential Revision: https://reviews.llvm.org/D68049
First, do not reserve numSections in the Chunks array. In cases where
there are many non-prevailing sections, this will overallocate memory
which will not be used.
Second, free the memory for sparseChunks after initializeSymbols. After
that, it is never used.
This saves 50MB of 627MB for my use case without affecting performance.
The inlinees section contains references to the file checksum table. The
file checksum table in the PDB must have the same layout as the file
checksum table in the object file, so all the existing file id
references should stay valid.
Previously, we would do this:
for all inlined functions:
- lookup filename from checksum and string table
- make that filename absolute
- look up the new file id for that filename up in the new checksum
table
This lead to pdbMakeAbsolute and remove_dots ending up in the hot path.
We should only need to absolutify the source path once, not once every
time we process an inline function from that source file.
This speeds up linking chrome PGO stage 1 net_unittests.exe from 9.203s
to 8.500s (-7.6%). Looking just at time to process symbol records, it
goes from ~2000ms to ~1300ms, which is consistent with the overall
speedup of about 700ms. This will be less noticeable in debug builds,
which have fewer inlined functions records.
GNU ld from binutils 2.35 onwards will likely support
--export-dynamic-symbol but with different semantics.
https://sourceware.org/pipermail/binutils/2020-May/111302.html
Differences:
1. -export-dynamic-symbol is not supported
2. --export-dynamic-symbol takes a glob argument
3. --export-dynamic-symbol can suppress binding the references to the definition within the shared object if (-Bsymbolic or -Bsymbolic-functions)
4. --export-dynamic-symbol does not imply -u
I don't think the first three points can affect any user.
For the fourth point, Not implying -u can lead to some archive members unfetched.
Add -u foo to restore the previous behavior.
Exact semantics:
* -no-pie or -pie: matched non-local defined symbols will be added to the dynamic symbol table.
* -shared: matched non-local STV_DEFAULT symbols will not be bound to definitions within the shared object
even if they would otherwise be due to -Bsymbolic, -Bsymbolic-functions, or --dynamic-list.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D80487
LLD supports both REL and RELA for static relocations, but emits either
of REL and RELA for dynamic relocations. The relocation entry format is
specified by each psABI.
musl ld.so supports both REL and RELA. For such ld.so implementations,
REL (.rel.dyn .rel.plt) has size benefits even if the psABI chooses RELA:
sizeof(Elf64_Rel)=16 < sizeof(Elf64_Rela)=24.
* COPY, GLOB_DAT and J[U]MP_SLOT always have 0 addend. A ld.so
implementation does not need to read the implicit addend.
REL is strictly better.
* A RELATIVE has a non-zero addend. Such relocations can be packed
compactly with the RELR relocation entry format, which is out of scope
of this patch.
* For other dynamic relocation types (e.g. symbolic relocation R_X86_64_64),
a ld.so implementation needs to read the implicit addend. REL may have
minor performance impact, because reading implicit addends forces
random access reads instead of being able to blast out a bunch of
writes while chasing the relocation array.
This patch adds -z rel and -z rela to change the relocation entry format
for dynamic relocations. I have tested that a -z rel produced x86-64
executable works with musl ld.so
-z rela may be useful for debugging purposes on processors whose psABIs
specify REL as the canonical format: addends can be easily read by a tool.
Reviewed By: grimar, mcgrathr
Differential Revision: https://reviews.llvm.org/D80496
Previously in the object format we punted on this and simply wrote
zeros (and didn't include the function in the elem segment). With
this change we write a meaningful value which is the segment
relative table index of the associated function.
This matches the that wasm-ld produces in `-r` mode. This inconsistency
between the output the MC object writer and the wasm-ld object
writer could cause warnings to be emitted when reading back in the
output of `wasm-ld -r`. See:
https://github.com/emscripten-core/emscripten/issues/11217
This only applies to this one relocation type which is only generated
when compiling in PIC mode.
Differential Revision: https://reviews.llvm.org/D80774
When we originally wrote these tests we didn't have a stable and
fleshed out assembly format. Now we do so we should prefer that
over llvm ir for lld tests to avoid including more part of llvm
than necessary in order to run the test.
This change converts just 30 out of about 130 test files. More to
come when I have some more time.
Differential Revision: https://reviews.llvm.org/D80361
Summary:
Count the per-module number of basic blocks when the module summary is computed
and sum them up during Thin LTO indexing.
This is used to estimate the working set size under the partial sample PGO.
This is split off of D79831.
Reviewers: davidxl, espindola
Subscribers: emaste, inglorion, hiraditya, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80403
In D34993, we discussed and concluded that we should drop `__real_
symbol from the symbol table, but I did the opposite in D50569.
This patch is to drop `__real_` symbol.
MaskRay's note: omitting `__real_` is important if it is undefined:
otherwise a subsequent link may error due to the undefined `__real_` in .dynsym
Differential Revision: https://reviews.llvm.org/D51283
Bazel created interface shared objects (.ifso) may be misaligned. We use
llvm::support::detail::packed_endian_specific_integral under the hood
which allows reading of misaligned values, so there is not a need to
diagnose (in LLD we don't intend to support sophisticated parsing for
SHT_GNU_*).
In the 64-bit ELF V2 API Specification: Power Architecture, 2.3.3.1. GPR
Save and Restore Functions defines some special functions which may be
referenced by GCC produced assembly (LLVM does not reference them).
With GCC -Os, when the number of call-saved registers exceeds a certain
threshold, GCC generates `_savegpr0_* _restgpr0_*` calls and expects the
linker to define them. See
https://sourceware.org/pipermail/binutils/2002-February/017444.html and
https://sourceware.org/pipermail/binutils/2004-August/036765.html . This
is weird because libgcc.a would be the natural place. However, the linker
generation approach has the advantage that the linker can generate
multiple copies to avoid long branch thunks. We don't consider the
advantage significant enough to complicate our trunk implementation, so
we take a simple approach.
* Check whether `_savegpr0_{14..31}` are used
* If yes, define needed symbols and add an InputSection with the code sequence.
`_savegpr1_*` `_restgpr0_*` and `_restgpr1_*` are similar.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D79977
An undefined symbol in a shared object can be versioned, like `f@v1`.
We currently insert `f` as an Undefined into the symbol table, but we
should insert `f@v1` instead.
The string `v1` is inferred from SHT_GNU_versym and SHT_GNU_verneed.
This patch implements the functionality.
Failing to do this can cause two issues:
* If a versioned symbol referenced by a shared object is defined in the
executable, we will fail to export it.
* If a versioned symbol referenced by a shared object in another object
file, --no-allow-shlib-undefined may spuriously report an
"undefined reference to " error. See https://bugs.llvm.org/show_bug.cgi?id=44842
(Linking -lfftw3 -lm on Arch Linux can cause
`undefined reference to __log_finite`)
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D80059
Summary:
This patch fixes a bug where initialization code for .bss segments was
emitted in the memory initialization function even though the .bss
segments were discounted in the datacount section and omitted in the
data section. This was producing invalid binaries due to out-of-bounds
segment indices on the memory.init and data.drop instructions that
were trying to operate on the nonexistent .bss segments.
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80354
Summary:
This is a pre-requisite to parallelizing PDB symbol and type merging.
Currently this timer usage would not be thread safe.
Reviewers: aganea, MaskRay
Subscribers: jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80298
Summary:
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.
The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.
We exercise this functionality in our tests by using order files that
rearrange those symbols.
Depends on D79668.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: thakis, llvm-commits, pcc, ruiu
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79926
Note, we still name a preempted SharedSymbol "shared definition",
instead of "reference" as printed by GNU ld. This difference should not matter.
```
// GNU ld
ld.bfd: t: definition of f@v1
ld.bfd: t.so: reference to f@v1
```
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D80143
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.
The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.
We exercise this functionality in our tests by using order files that
rearrange those symbols.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D79926
The order file indicates how input sections should be sorted within each
output section, based on the symbols contained within those sections.
This diff sets the stage for implementing and testing
`.subsections_via_symbols`, where we will break up InputSections by each
symbol and sort them more granularly.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D79668
Under `lld/test/Driver/Inputs/`, all instances of `libtest.a` are
unreferenced. FYI, all of these are empty archives, and the files
contain only a magic number.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D80182
This is fixing a thinLTO module collision issue for thin archives. The problem is that we always use a zero offset to name members in a thin archive and that causes the following build error:
ld.lld: error: Expected at most one ThinLTO module per bitcode file
which happens to a thin archive that has two members with the same object file name (whose paths will be ignored by thinLTO driver)
The fix here is to use real member offset instead as is done for non-thin archives.
Differential Revision: https://reviews.llvm.org/D79880
Announced on https://lists.llvm.org/pipermail/llvm-dev/2020-May/141416.html
Similar to D79371, but for `multiclass B` (convenience helper for defining --foo and --no-foo)
Some changed options are also used by gold, but I haven't seen their
one-dash use cases outside of lld's testsuite.
With this change, basic archive files can be linked together. Input
section discovery has been refactored into a function since archive
files lazily resolve their symbols / the object files containing those
symbols.
Reviewed By: int3, smeenai
Differential Revision: https://reviews.llvm.org/D78342
This paves the way to doing more things in parallel, and allows us to
order type sources in dependency order. PDBs and PCH objects have to be
loaded before object files which use them.
This is a rebase of the unapplied remaining changes in
https://reviews.llvm.org/D59226. I found it very challenging to rebase
this across the LLD variable name style change. I recall there was a
tool for that, but I didn't take the time to use it.
Reviewers: aganea, akhuang
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79672
Install a cmake config file. Copied exactly from how clang exports.
I also wasn't sure whether the canonical capitalization is "lld" or
"LLD". The project() is still calling this lld, but most places seemed
to capitalize it.
Just skip trying to match for the path separator explicitly (instead
of making it match either a forward or backwards slash), simplifying
the test a little.
Allow disabling either the full auto import feature, or just
forbidding the cases that require runtime fixups.
As long as all auto imported variables are referenced from separate
.refptr$<name> sections, we can alias them on top of the IAT entries
and don't actually need any runtime fixups via pseudo relocations.
LLVM generates references to variables in .refptr stubs, if it
isn't known that the variable for sure is defined in the same object
module. Runtime pseudo relocs are needed if the addresses of auto
imported variables are used in constant initializers though.
Fixing up runtime pseudo relocations requires the use of
VirtualProtect (which is disallowed in WinStore/UWP apps) or
VirtualProtectFromApp. To allow any risk of ambiguity, allow
rejecting cases that would require this at the linker stage.
This adds support for the --disable-runtime-pseudo-reloc and
--disable-auto-import options in the MinGW driver (matching GNU ld.bfd)
with corresponding lld private options in the COFF driver.
Differential Revision: https://reviews.llvm.org/D78923
This is a followup to https://reviews.llvm.org/D78779.
When signatures mismatch we create set of variant symbols. Some of
the fields in these symbols were not be initialized correct.
Specifically we were seeing isUsedInRegularObj not being set correctly,
leading to the symbol not getting included in the symbol table
and a crash writing relections in --reloctable mode.
There is larger refactor due here, but this is a minimal change the
fixes the bug at hand.
Differential Revision: https://reviews.llvm.org/D79756
clang passes these flags; this makes it easier to try `clang -v`
output with `ld -flavor darwinnew`.
Differential Revision: https://reviews.llvm.org/D79797
This unblocks the linking of real programs, since many core system
functions are only available as sub-libraries of libSystem.
Differential Revision: https://reviews.llvm.org/D79228
Summary:
Add the vendor macro to "lld" for extended version output support,
such that it's able to print additional version info. This is
consistent with the Clang and LLVM version printer, and the
additional version message can be provided via PACKAGE_VENDOR.
Reviewers: hubert.reinterpretcast, kbarton, cebowleratibm, rzurob, ruiu
Reviewed By: hubert.reinterpretcast
Subscribers: emaste, mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79026
The initial attempt didn't work on Windows; apparently Powershell has a
different syntax for running commands sequentially and concatenating
their outputs. So I've created two temporary files instead.
Differential Revision: https://reviews.llvm.org/D79697
Both the .ARM.exidx and .eh_frame sections have a custom SyntheticSection
that acts as a container for the InputSections. The InputSections are added
to the SyntheticSection prior to /DISCARD/ which limits the affect a
/DISCARD/ can have to the whole SyntheticSection. In the majority of cases
this is sufficient as it is not common to discard subsets of the
InputSections. The Linux kernel has one of these scripts which has something
like:
/DISCARD/ : { *(.ARM.exidx.exit.text) *(.ARM.extab.exit.text) ... }
The .ARM.exidx.exit.text are not discarded because the InputSection has been
transferred to the Synthetic Section. The *(.ARM.extab.exit.text) sections
have not so they are discarded. When we come to write out the .ARM.exidx
sections the dangling references from .ARM.exidx.exit.text to
.ARM.extab.exit.text currently cause relocation out of range errors, but
could as easily cause a fatal error message if we check for dangling
references at relocation time.
This patch attempts to respect the /DISCARD/ command by running it on the
.ARM.exidx InputSections stored in the SyntheticSection.
The .eh_frame is in theory affected by this problem, but I don't think that
there is a dangling reference problem that can happen with these sections.
Fixes remaining part of pr44824
Differential Revision: https://reviews.llvm.org/D79687
This fixes an accidental breakage of exporting symbols using def
files, when the symbol name contains a period, since commit
0ca06f7950, mixing up a symbol name containing a period with
the case of exporting a symbol as a forward to another dll.
Differential Revision: https://reviews.llvm.org/D79619
Summary:
This allows us to link against stripped dylibs. Moreover, it's simply
more correct: The symbol table includes symbols that the dylib uses but
doesn't export.
This temporarily regresses our ability to do lazy symbol binding because
dyld_stub_binder isn't in libSystem's export trie. Rather, it is in one
of the sub-libraries libSystem re-exports. (This doesn't affect our
tests since we are mocking out dyld_stub_binder there.) A follow-up diff
will address this by adding support for sub-libraries.
Depends on D79114.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79226
Summary:
Otherwise we get undefined symbol errors depending on the order of
arguments on the command line.
Depends on D78270.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79114
Summary:
This diff implements lazy symbol binding -- very similar to the PLT
mechanism in ELF.
ELF's .plt section is broken up into two sections in Mach-O:
StubsSection and StubHelperSection. Calls to functions in dylibs will
end up calling into StubsSection, which contains indirect jumps to
addresses stored in the LazyPointerSection (the counterpart to ELF's
.plt.got).
Initially, the LazyPointerSection contains addresses that point into one
of the entry points in the middle of the StubHelperSection. The code in
StubHelperSection will push on the stack an offset into the
LazyBindingSection. The push is followed by a jump to the beginning of
the StubHelperSection (similar to PLT0), which then calls into
dyld_stub_binder. dyld_stub_binder is a non-lazily bound symbol, so this
call looks it up in the GOT.
The stub binder will look up the bind opcodes in the LazyBindingSection
at the given offset. The bind opcodes will tell the binder to update the
address in the LazyPointerSection to point to the symbol, so that
subsequent calls don't have to redo the symbol resolution. The binder
will then jump to the resolved symbol.
Depends on D78269.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78270
Summary:
1. Don't have isHidden() depend on isNeeded(). Whether a section is
hidden is orthogonal from whether it is needed: hidden sections will
never have a header regardless of whether they have a body. (I know we
override this method with return false for synthetic sections, but
regardless I think it's confusing to write it this way for non-synthetic
sections.)
2. Don't call writeTo() on unneeded sections. D78270 assumes that this
is true when implementing the stub helper section.
3. Filter out the unneeded sections early on to avoid having to deal
with them in multiple places.
4. Remove assumption in test that the referenced file has no other symbols.
(We should create separate input files for future tests to avoid such
issues.)
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79460
I noticed that std::error_code() does one-time initialization. Avoid
that overhead with Expected<T> and llvm::Error. Also, it is consistent
with the virtual interface and ELF, and generally cleaner.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D79643
Summary:
The WebAssembly backend automatically lowers atomic operations and TLS
to nonatomic operations and non-TLS data when either are present and
the atomics or bulk-memory features are not present, respectively. The
resulting object is no longer thread-safe, so the linker has to be
told not to allow it to be linked into a module with shared
memory. This was previously done by disallowing the 'atomics' feature,
which prevented any objct with its atomic operations or TLS removed
from being linked with any object containing atomics or TLS, and
therefore preventing it from being linked into a module with shared
memory since shared memory requires atomics.
However, as of https://github.com/WebAssembly/threads/issues/144, the
validation rules are relaxed to allow atomic operations to validate
with unshared memories, which makes it perfectly safe to link an
object with stripped atomics and TLS with another object that still
contains TLS and atomics as long as the resulting module has an
unshared memory. To allow this kind of link, this patch disallows a
pseudo-feature 'shared-mem' rather than 'atomics' to communicate to
the linker that the object is not thread-safe. This means that the
'atomics' feature is available to accurately reflect whether or not an
object has atomics enabled.
As a drive-by tweak, this change also requires that bulk-memory be
enabled in addition to atomics in order to use shared memory. This is
because initializing shared memories requires bulk-memory operations.
Reviewers: aheejin, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79542
For sampleFDO, because the optimized build uses profile generated from previous
release, often we couldn't tell a function without profile was truely cold or
just newly created so we had to treat them conservatively and put them in .text
section instead of .text.unlikely. The result was when we persue the best
performance by locking .text.hot and .text in memory, we wasted a lot of memory
to keep cold functions inside. This problem has been largely solved for regular
sampleFDO using profile-symbol-list (https://reviews.llvm.org/D66374), but for
the case when we use partial profile, we still waste a lot of memory because
of it.
In https://reviews.llvm.org/D62540, we propose to save functions with unknown
hotness information in a special section called ".text.unknown", so that
compiler will treat those functions as luck-warm, but runtime can choose not
to mlock the special section in memory or use other strategy to save memory.
That will solve most of the memory problem even if we use a partial profile.
The patch adds the support in lld for the special section.For sampleFDO,
because the optimized build uses profile generated from previous release,
often we couldn't tell a function without profile was truely cold or just
newly created so we had to treat them conservatively and put them in .text
section instead of .text.unlikely. The result was when we persue the best
performance by locking .text.hot and .text in memory, we wasted a lot of
memory to keep cold functions inside. This problem has been largely solved
for regular sampleFDO using profile-symbol-list
(https://reviews.llvm.org/D66374), but for the case when we use partial
profile, we still waste a lot of memory because of it.
In https://reviews.llvm.org/D62540, we propose to save functions with unknown
hotness information in a special section called ".text.unknown", so that
compiler will treat those functions as luck-warm, but runtime can choose not
to mlock the special section in memory or use other strategy to save memory.
That will solve most of the memory problem even if we use a partial profile.
The patch adds the support in lld for the special section.
Differential Revision: https://reviews.llvm.org/D79590
Reduces time to link PGO instrumented net_unittets.exe by 11% (9.766s ->
8.672s, best of three). Reduces peak memory by 65.7MB (2142.71MB ->
2076.95MB).
Use a more compact struct, BulkPublic, for faster sorting. Sort in
parallel. Construct the hash buckets in parallel. Try to use one vector
to hold all the publics instead of copying them from one to another.
Allocate all the memory needed to serialize publics up front, and then
serialize them in place in parallel.
Reviewed By: aganea, hans
Differential Revision: https://reviews.llvm.org/D79467
Before this patch, the debug record S_GTHREAD32 which represents global thread_local symbols, was emitted by LLD into the respective module stream. This makes Visual Studio unable to display thread_local symbols in the debugger.
After this patch, S_GTHREAD32 is moved into the globals stream. This matches MSVC behavior.
Differential Revision: https://reviews.llvm.org/D79005
Essentially takes the lld/Common/Threads.h wrappers and moves them to
the llvm/Support/Paralle.h algorithm header.
The changes are:
- Remove policy parameter, since all clients use `par`.
- Rename the methods to `parallelSort` etc to match LLVM style, since
they are no longer C++17 pstl compatible.
- Move algorithms from llvm::parallel:: to llvm::, since they have
"parallel" in the name and are no longer overloads of the regular
algorithms.
- Add range overloads
- Use the sequential algorithm directly when 1 thread is requested
(skips task grouping)
- Fix the index type of parallelForEachN to size_t. Nobody in LLVM was
using any other parameter, and it made overload resolution hard for
for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t.
Remove Threads.h and update LLD for that.
This is a prerequisite for parallel public symbol processing in the PDB
library, which is in LLVM.
Reviewed By: MaskRay, aganea
Differential Revision: https://reviews.llvm.org/D79390
A linker will create .ARM.exidx sections for InputSections that don't
have them. This can cause a relocation out of range error If the
InputSection happens to be extremely far away from the other sections.
This is often the case for the vector table on older ARM CPUs as the only
two places that the table can be placed is 0 or 0xffff0000. We fix this
by removing InputSections that need a linker generated .ARM.exidx
section if that would cause an error.
Differential Revision: https://reviews.llvm.org/D79289
Summary:
That unless the user requested an output object (--lto-obj-path), the an
unused empty combined module is not emitted.
This changed is helpful for some target (ex. RISCV-V) which encoded the
ABI info in IR module flags (target-abi). Empty unused module has no ABI
info so the linker would get the linking error during merging
incompatible ABIs.
Reviewers: tejohnson, espindola, MaskRay
Subscribers: emaste, inglorion, arichardson, hiraditya, simoncook, MaskRay, steven_wu, dexonsmith, PkmX, dang, lenary, s.egerton, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78988
We currently only support extern relocations.
`X86_64_RELOC_SIGNED_{1,2,4}` are like X86_64_RELOC_SIGNED, but with the
implicit addend fixed to 1, 2, and 4, respectively.
See the comment in `lib/Target/X86/MCTargetDesc/X86MachObjectWriter.cpp RecordX86_64Relocation`.
Reviewed By: int3
Differential Revision: https://reviews.llvm.org/D79311
Heap profiling with ETW shows that LLD performs 4,053,721 heap
allocations over its lifetime, and ~800,000 of them come from
assocEquals. These vectors are created just to do a comparison, so fuse
the comparison into the loop and avoid the allocation.
ICF is overall a small portion of the time spent linking, and I did not
measure overall throughput improvements from this change above the noise
threshold. However, these show up in the heap profiler, and the work is
done, so we might as well land it if the code is clear enough.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D79297
Sections with the SHF_LINK_ORDER flag must be ordered in the same relative
order as the Sections they have a link to. When using a linker script an
arbitrary expression may be used for the virtual address of the
OutputSection. In some cases the virtual address does not monotonically
increase as the OutputSection index increases, so if we base the ordering
of the SHF_LINK_ORDER sections on the index then we can get the order
wrong. We fix this by moving SHF_LINK_ORDER resolution till after we have
created OutputSection virtual addresses.
Differential Revision: https://reviews.llvm.org/D79286
This particular overload allocates memory, and we do this for every
S_[GL]PROC32_ID record. Instead, hardcode the offset of the typeindex
that we are looking for in the LF_[MEM]FUNC_ID record. We already
assumed that looking up the item index already found a record of this
kind.
Otherwise an ArgumentParser is constructed for every directive section,
and that involves copying the entire table of options into a vector.
There is no need for this, just have one option table.
Summary:
Lld test ELF/linkerscript/thunk-gen-mips.s was accidentally disabled due
to the use of wrong FileCheck directives. As a result the test seems to
have bitrotted as it fails to pass if fixing the directive. To ease
updates to the test in case of change of the __start address the checks
have been changed to use numeric variables to express all the addresses
based on the __start address.
Reviewed By: atanasyan
Differential Revision: https://reviews.llvm.org/D79270
This generalizes the main Windows command line tokenizer to be able to
produce StringRef substrings as well as freshly copied C strings. The
implementation is still shared with the normal tokenizer, which is
important, because we have unit tests for that.
.drective sections can be very long. They can potentially list up to
every symbol in the object file by name. It is worth avoiding these
string copies.
This saves a lot of memory when linking chrome.dll with PGO
instrumentation:
BEFORE AFTER % IMP
peak memory: 6657.76MB 4983.54MB -25%
real: 4m30.875s 2m26.250s -46%
The time improvement may not be real, my machine was noisy while running
this, but that the peak memory usage improvement should be real.
This change may also help apps that heavily use dllexport annotations,
because those also use linker directives in object files. Apps that do
not use many directives are unlikely to be affected.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D79262
Summary: Similar to other formats, input sections in the MachO
implementation are now grouped under output sections. This is primarily
a refactor, although there's some new logic (like resolving the output
section's flags based on its inputs).
Differential Revision: https://reviews.llvm.org/D77893
This change add support for defined wasm globals in the .s format,
the MC layer, and wasm-ld
Currently there is no support custom initialization and all wasm
globals are initialized to zero.
Fixes: PR45742
Differential Revision: https://reviews.llvm.org/D79137
Lld test ELF/linkerscript/input-archive.s fails when path contain a @
because is not accepted in unquoted token in linker scripts which leads
to the path being broken in 2 around the @. This commit quotes the path
used in the linker script created by this and similar testcases allowing
the test to pass even in the presence of an @ sign in the path.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D79103
The current implementation assumes that R_PPC64_TOC16_HA is always followed
by R_PPC64_TOC16_LO_DS. This can break with R_PPC64_TOC16_LO:
// Load the address of the TOC entry, instead of the value stored at that address
addis 3, 2, .LC0@tloc@ha # R_PPC64_TOC16_HA
addi 3, 3, .LC0@tloc@l # R_PPC64_TOC16_LO
blr
which is used by boringssl's util/fipstools/delocate/delocate.go
https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation.
In short, this tool converts an assembly file to avoid any potential relocations.
The distance to an input .toc is not a constant after linking, so it cannot use an `addis;ld` pair.
Instead, it jumps to a stub which loads the TOC entry address with `addis;addi`.
This patch checks the presence of R_PPC64_TOC16_LO and suppresses
toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen.
This approach is conservative and loses some relaxation opportunities but is easy to implement.
addis 3, 2, .LC0@toc@ha # no relaxation
addi 3, 3, .LC0@toc@l # no relaxation
li 9, 0
addis 4, 2, .LC0@toc@ha # can relax but suppressed
ld 4, .LC0@toc@l(4) # can relax but suppressed
Also note that interleaved R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS is
possible and this patch accounts for that.
addis 3, 2, .LC1@toc@ha # can relax
addis 4, 2, .LC2@toc@ha # can relax
ld 3, .LC1@toc@l(3) # can relax
ld 4, .LC2@toc@l(4) # can relax
Reviewed By: #powerpc, sfertile
Differential Revision: https://reviews.llvm.org/D78431
gold has an option --print-symbol-counts= which prints:
// For each archive
archive $archive $members $fetched_members
// For each object file
symbols $object $defined_symbols $used_defined_symbols
In most cases, `$defined_symbols = $used_defined_symbols` unless weak
symbols are present. Strangely `$used_defined_symbols` includes symbols defined relative to --gc-sections discarded sections.
The `symbols` lines do not appear to be useful.
`archive` lines are useful: `$fetched_members=0` lines correspond to
unused archives. The information can be used to trim dependencies.
This patch implements --print-archive-stats= which prints the number of
members and the number of fetched members for each archive.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D78983
Currently, getVA() returns a virtual address with the assumption that
the ImageBase is zero. As I understand, this is what lld-ELF is doing.
However, under our current design, it seems like an awkward setup --
I'm finding that I have to add and subtract ImageBase in several places
to make things work out.
As such, I think it's simpler to have getVA() return a non-relative VA,
but I'm not sure if I'm missing something. Would love to hear more from
folks familiar with lld-ELF.
Differential Revision: https://reviews.llvm.org/D78168
Build the trie by performing a three-way radix quicksort: We start by
sorting the strings by their first characters, then sort the strings
with the same first characters by their second characters, and so on
recursively. Each time the prefixes diverge, we add a node to the trie.
Thanks to @ruiu for the idea.
I used llvm-mc's radix quicksort implementation as a starting point. The
trie offset fixpoint code was taken from
MachONormalizedFileBinaryWriter.cpp.
Differential Revision: https://reviews.llvm.org/D76977
--gdb-index currently crashes when reading a translation unit with
DWARF v5 .debug_loclists . Call stack:
```
SyntheticSections.cpp GdbIndexSection::create
SyntheticSections.cpp readAddressAreas
DWARFUnit.cpp DWARFUnit::tryExtractDIEsIfNeeded
DWARFListTable.cpp DWARFListTableHeader::extract
...
DWARFDataExtractor.cpp DWARFDataExtractor::getRelocatedValue
lld/ELF/DWARF.cpp LLDDwarfObj<ELFT>::find (sec.sec is nullptr)
...
```
This patch adds support for .debug_loclists to make `DWARFUnit::tryExtractDIEsIfNeeded` happy.
Building --gdb-index does not need .debug_loclists
Reviewed By: dblaikie, grimar
Differential Revision: https://reviews.llvm.org/D79061
Drops the behavior from rL217112.
Use the Gnu driver mode by default for all platforms when ld is
invoked. Other names for the program (such as link or ld64) continue
working as before.
Reviewed By: MaskRay, srhines, smeenai, ruiu
Differential Revision: https://reviews.llvm.org/D78837
Summary:
Use the Gnu driver mode by default for all platforms when ld is
invoked. Other names for the program (such as link or ld64) continue
working as before.
Reviewers: MaskRay, int3, srhines, smeenai, ruiu
Reviewed By: MaskRay, srhines, smeenai, ruiu
Subscribers: smeenai, srhines, nickdesaulniers, llvm-commits
Tags: #lld, #llvm
Differential Revision: https://reviews.llvm.org/D78837
This reverts commit 03ffe58605.
Full tile of reverted commit is:
[ELF][PPC64] Don't perform toc-indirect to toc-relative relaxation for
R_PPC64_TOC16_HA not followed by R_PPC64_TOC16_LO_DS
Breaks the multistage lld PowerPC buildbot.
This diff implements basic support for writing a symbol table.
Attributes are loosely supported for extern symbols and not at all for
other types.
Initial version by Kellie Medlin <kelliem@fb.com>
Originally committed in a3d95a50ee and reverted in fbae153ca5 due to
UBSAN erroring over unaligned writes. That has been fixed in the
current diff with the following changes:
```
diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -133,6 +133,9 @@ SymtabSection::SymtabSection(StringTableSection &stringTableSection)
: stringTableSection(stringTableSection) {
segname = segment_names::linkEdit;
name = section_names::symbolTable;
+ // TODO: When we introduce the SyntheticSections superclass, we should make
+ // all synthetic sections aligned to WordSize by default.
+ align = WordSize;
}
size_t SymtabSection::getSize() const {
diff --git a/lld/MachO/Writer.cpp b/lld/MachO/Writer.cpp
--- a/lld/MachO/Writer.cpp
+++ b/lld/MachO/Writer.cpp
@@ -371,6 +371,7 @@ void Writer::assignAddresses(OutputSegment *seg) {
ArrayRef<InputSection *> sections = p.second;
for (InputSection *isec : sections) {
addr = alignTo(addr, isec->align);
+ // We must align the file offsets too to avoid misaligned writes of
+ // structs.
+ fileOff = alignTo(fileOff, isec->align);
isec->addr = addr;
addr += isec->getSize();
fileOff += isec->getFileSize();
@@ -396,6 +397,7 @@ void Writer::writeSections() {
uint64_t fileOff = seg->fileOff;
for (auto § : seg->getSections()) {
for (InputSection *isec : sect.second) {
+ fileOff = alignTo(fileOff, isec->align);
isec->writeTo(buf + fileOff);
fileOff += isec->getFileSize();
}
```
I don't think it's easy to write a test for alignment (that doesn't
involve brittly hard-coding file offsets), so there isn't one... but
UBSAN builds pass now.
Differential Revision: https://reviews.llvm.org/D79050
The current implementation assumes that R_PPC64_TOC16_HA is always followed
by R_PPC64_TOC16_LO_DS. This can break with:
// Load the address of the TOC entry, instead of the value stored at that address
addis 3, 2, .LC0@tloc@ha # R_PPC64_TOC16_HA
addi 3, 3, .LC0@tloc@l # R_PPC64_TOC16_LO
blr
which is used by boringssl's util/fipstools/delocate/delocate.go
https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation.
In short, this tool converts an assembly file to avoid any potential relocations.
The distance to an input .toc is not a constant after linking, so the assembly cannot use an `addis;ld` pair.
Instead, delocate changes the code to jump to a stub (`addis;addi`) which loads the TOC entry address.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D78431
This speeds up linking chrome.dll with PGO instrumentation by 13%
(154271ms -> 134033ms).
LLVM's Option library is very slow. In particular, it allocates at least
one large-ish heap object (Arg) for every argument. When PGO
instrumentation is enabled, all the __profd_* symbols are added to the
@llvm.used list, which compiles down to these /INCLUDE: directives. This
means we have O(#symbols) directives to parse in the section, so we end
up allocating an Arg for every function symbol in the object file. This
is unnecessary.
To address the issue and speed up the link, extend the fast path that we
already have for /EXPORT:, which has similar scaling issues.
I promise that I took a hard look at optimizing the Option library, but
its data structures are very general and would need a lot of cleanup. We
have accumulated lots of optional features (option groups, aliases,
multiple values) over the years, and these are now properties of every
parsed argument, when the vast majority of arguments do not use these
features.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D78845
This is primarily motivated by the desire to move from Python2 to
Python3. `PYTHON_EXECUTABLE` is ambiguous. This explicitly identifies
the python interpreter in use. Since the LLVM build seems to be able to
completed successfully with python3, use that across the build. The old
path aliases `PYTHON_EXECUTABLE` to be treated as Python3.
GNU tools generate mapping symbols "$d" for .ARM.exidx sections. The
symbols are added to the symbol table much earlier than the merging
takes place, and after that, they become dangling. Before the patch,
LLD output those symbols as SHN_ABS with the value of 0. The patch
removes such symbols from the symbol table.
Differential Revision: https://reviews.llvm.org/D78820
Summary:
Add logic for emitting the correct set of load commands and segments
when `-dylib` is passed.
I haven't gotten to implementing a real export trie yet, so we can only
emit a single symbol, but it's enough to replace the YAML test files
introduced in D76252.
Differential Revision: https://reviews.llvm.org/D76908
This diff implements basic support for writing a symbol table.
- Attributes are loosely supported for extern symbols and not at all for
other types
Immediate future work will involve implementing section merging.
Initial version by Kellie Medlin <kelliem@fb.com>
Differential Revision: https://reviews.llvm.org/D76742
Previously, the special segments `__PAGEZERO` and `__LINKEDIT` were
implemented as special LoadCommands. This diff implements them using
special sections instead which have an `isHidden()` attribute. We do not
emit section headers for hidden sections, but we use their addresses and
file offsets to determine that of their containing segments. In addition
to allowing us to share more segment-related code, this refactor is also
important for the next step of emitting dylibs:
1) dylibs don't have segments like __PAGEZERO, so we need an easy way of
omitting them w/o messing up segment indices
2) Unlike the kernel, which is happy to run an executable with
out-of-order segments, dyld requires dylibs to have their segment
load commands arranged in increasing address order. The refactor
makes it easier to implement sorting of sections and segments.
Differential Revision: https://reviews.llvm.org/D76839
Summary: The switch --plugin-opt=emit-asm can be used with the gold linker to dump the final assembly code generated by LTO in a user-friendly way. Unfortunately it doesn't work with lld. I'm hooking it up with lld. With that switch, lld emits assembly code into the output file (specified by -o) and if there are multiple input files, each of their assembly code will be emitted into a separate file named by suffixing the output file name with a unique number, respectively. The linking then stops after generating those assembly files.
Reviewers: espindola, wenlei, tejohnson, MaskRay, grimar
Reviewed By: tejohnson, MaskRay, grimar
Subscribers: pcc, emaste, inglorion, arichardson, hiraditya, MaskRay, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77231
These stub new function were not being added to the symbol table
which in turn meant that we were crashing when trying to output
relocations against them.
Differential Revision: https://reviews.llvm.org/D78779
When discarding local symbols with --discard-all or --discard-locals,
the ones which are used in relocations should be preserved. LLD used
the simplest approach and just ignored those switches when -r or
--emit-relocs was specified.
The patch implements handling the --discard-* switches for the cases
when relocations are kept by identifying used local symbols and allowing
removing only unused ones. This makes the behavior of LLD compatible
with GNU linkers.
Differential Revision: https://reviews.llvm.org/D77807
Fixed error detected by msan. The size field of the .ARM.exidx synthetic
section needs to be initialized to at least estimation level before
calling assignAddresses as that will use the size field.
This was previously reverted in 1ca16fc4f5.
Differential Revision: https://reviews.llvm.org/D78422
This reverts commit f969c2aa65.
There are some msan buildbot failures sanitzer-x86_64-linux-fast that
I need to investigate.
Differential Revision: https://reviews.llvm.org/D78422
The contents of the .ARM.exidx section must be ordered by SHF_LINK_ORDER
rules. We don't need to know the precise address for this order, but we
do need to know the relative order of sections. We have been using the
sectionIndex for this purpose, this works when the OutputSection order
has a monotonically increasing virtual address, but it is possible to
write a linker script with non-monotonically increasing virtual address.
For these cases we need to evaluate the base address of the OutputSection
so that we can order the .ARM.exidx sections properly.
This change moves the finalisation of .ARM.exidx till after the first
call to AssignAddresses. This permits us to sort on virtual address which
is linker script safe. It also permits a fix for part of pr44824 where
we generate .ARM.exidx section for the vector table when that table is so
far away it is out of range of the .ARM.exidx section. This fix will come
in a follow up patch.
Differential Revision: https://reviews.llvm.org/D78422
Time profiler emits relative timestamps for events (the number of
microseconds passed since the start of the current process).
This patch allows combining events from different processes while
preserving their relative timing by emitting a new attribute
"beginningOfTime". This attribute contains the system time that
corresponds to the zero timestamp of the time profiler.
This has at least two use cases:
- Build systems can use this to merge time traces from multiple compiler
invocations and generate statistics for the whole build. Tools like
ClangBuildAnalyzer could also leverage this feature.
- Compilers that use LLVM as their backend by invoking llc/opt in
a child process. If such a compiler supports generating time traces
of its own events, it could merge those events with LLVM-specific
events received from llc/opt, and produce a more complete time trace.
A proof-of-concept script that merges multiple logs that
contain a synchronization point into one log:
https://github.com/broadwaylamb/merge_trace_events
Differential Revision: https://reviews.llvm.org/D78030
For a relative path in INPUT() or GROUP(), this patch changes the search order by adding the directory of the current linker script.
The new search order (consistent with GNU ld >= 2.35 regarding the new test `test/ELF/input-relative.s`):
1. the directory of the current linker script (GNU ld from Binutils 2.35 onwards; https://sourceware.org/bugzilla/show_bug.cgi?id=25806)
2. the current working directory
3. library paths (-L)
This behavior makes it convenient to replace a .so or .a with a linker script with additional input. For example, glibc
```
% cat /usr/lib/x86_64-linux-gnu/libm.a
/* GNU ld script
*/
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /usr/lib/x86_64-linux-gnu/libm-2.29.a /usr/lib/x86_64-linux-gnu/libmvec.a )
```
could be simplified as `GROUP(libm-2.29.a libmvec.a)`.
Another example is to make libc++.a a linker script:
```
INPUT(libc++.a.1 libc++abi.a)
```
Note, -l is not affected.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D77779
After D78301 MC no longer emits a relocation for this case. Change to use
.inst and .reloc to synthesize the same instruction and relocation. One
more test case I missed.
If there is no SHF_TLS section, there will be no PT_TLS and Out::tlsPhdr may be a nullptr.
If the symbol referenced by an R_TLS is lazy, we should treat the symbol as undefined.
Also reorganize tls-in-archive.s and tls-weak-undef.s . They do not test what they intended to test.
This diff implements:
* dylib loading (much of which is being restored from @pcc and @ruiu's
original work)
* The GOT_LOAD relocation, which allows us to load non-lazy dylib
symbols
* Basic bind opcode emission, which tells `dyld` how to populate the GOT
Differential Revision: https://reviews.llvm.org/D76252
This fixes a bug as exposed by D77807.
Add tests for {--emit-relocs,-r} x {--discard-locals,--discard-all}. They add coverage for previously undertested cases:
* STT_SECTION associated to GCed sections (`gc`)
* STT_SECTION associated to retained sections (`text`)
* STT_SECTION associated to non-SHF_ALLOC sections (`.comment`)
* STB_LOCAL in GCed sections (`unused_gc`)
Reviewed By: grimar, ikudrin
Differential Revision: https://reviews.llvm.org/D78389
D13550 added the diagnostic to address/work around a crash.
The rule was refined by D19836 (test/ELF/tls-archive.s) to exclude Lazy symbols.
https://bugs.llvm.org/show_bug.cgi?id=45598 reported another case where the current logic has a false positive:
Bitcode does not record undefined module-level inline assembly symbols
(`IRSymtab.cpp:Builder::addSymbol`). Such an undefined symbol does not
have the FB_tls bit and lld will not consider it STT_TLS. When the symbol is
later replaced by a STT_TLS Defined, lld will error "TLS attribute mismatch".
This patch fixes this false positive by allowing a STT_NOTYPE undefined
symbol to be replaced by a STT_TLS.
Considered alternative:
Moving the diagnostics to scanRelocs() can improve the diagnostics (PR36049)
but that requires a fair amount of refactoring. We will need more
RelExpr members. It requires more thoughts whether it is worthwhile.
See `test/ELF/tls-mismatch.s` for behavior differences. We will fail to
diagnose a likely runtime bug (STT_NOTYPE non-TLS relocation referencing
a TLS definition). This is probably acceptable because compiler
generated code sets symbol types properly.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D78438
D77522 changed --warn-backrefs to not warn for linking sandwich
problems (-ldef1 -lref -ldef2). This removed lots of false positives.
However, glibc still has some problems. libc.a defines some symbols
which are normally in libm.a and libpthread.a, e.g. __isnanl/raise.
For a linking order `-lm -lpthread -lc`, I have seen:
```
// different resolutions: GNU ld/gold select libc.a(s_isnan.o) as the definition
backward reference detected: __isnanl in libc.a(printf_fp.o) refers to libm.a(m_isnanl.o)
// different resolutions: GNU ld/gold select libc.a(raise.o) as the definition
backward reference detected: raise in libc.a(abort.o) refers to libpthread.a(pt-raise.o)
```
To facilitate deployment of --warn-backrefs, add --warn-backrefs-exclude= so that
certain known issues (which may be impractical to fix) can be whitelisted.
Deliberate choices:
* Not a comma-separated list (`--warn-backrefs-exclude=liba.a,libb.a`).
-Wl, splits the argument at commas, so we cannot use commas.
--export-dynamic-symbol is similar.
* Not in the style of `--warn-backrefs='*' --warn-backrefs=-liba.a`.
We just need exclusion, not inclusion. For easier build system
integration, we should avoid order dependency. With the current
scheme, we enable --warn-backrefs, and indivial libraries can add
--warn-backrefs-exclude=<glob> to their LDFLAGS.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D77512
See http://lists.llvm.org/pipermail/llvm-dev/2020-April/140549.html
For the record, GNU ld changed to 64k max page size in 2014
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=7572ca8989ead4c3425a1500bc241eaaeffa2c89
"[RFC] ld/ARM: Increase maximum page size to 64kB"
Android driver forced 4k page size in AArch64 (D55029) and ARM (D77746).
A binary linked with max-page-size=4096 does not run on a system with a
higher page size configured. There are some systems out there that do
this and it leads to the binary getting `Killed!` by the kernel.
In the non-linker-script cases, when linked with -z noseparate-code
(default), the max-page-size increase should not cause any size
difference. There may be some VMA usage differences, though.
Reviewed By: psmith, MaskRay
Differential Revision: https://reviews.llvm.org/D77330
Use the unique filenames that are used when /lldsavetemps is passed.
After this change, module names for LTO blobs in PDBs will be unique.
Visual Studio and probably other debuggers expect module names to be
unique.
Revert some changes from 1e0b158db (2017) that are no longer necessary
after removing MSVC LTO support.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D78221
Implemented a bunch of relocations found in binaries with medium/large code model and the Local-Exec TLS model. The binaries link and run fine in Qemu.
In addition, the emulation `elf64_sparc` is now recognized.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D77672
GCC collect2 passes several options to the linker even if LTO is not used
(note, lld does not support GCC LTO). The lto-wrapper may be a relative
path (especially during development, when gcc is in a build directory), e.g.
-plugin-opt=relative/path/to/lto-wrapper
We need to ignore such options, which are currently interpreted by
cl::ParseCommandLineOptions() and will fail with `error: --plugin-opt: ld.lld: Unknown command line argument 'relative/path/to/lto-wrapper'`
because the path is apparently not an option registered by an `llvm:🆑:opt`.
See lto-plugin-ignore.s for how we interpret various -plugin-opt= options now.
Reviewed By: grimar, tejohnson
Differential Revision: https://reviews.llvm.org/D78158
This should make both static and dynamic NewPM plugins work with LTO.
And as a bonus, it makes static linking of OldPM plugins more reliable
for plugins with both an OldPM and NewPM interface.
I only implemented the command-line flag to specify NewPM plugins in
llvm-lto2, to show it works. Support can be added for other tools later.
Differential Revision: https://reviews.llvm.org/D76866
Summary:
wasm-ld requires --shared-memory to be passed when the atomics feature
is enabled because historically atomic operations were only valid with
shared memories. This change relaxes that requirement for when
building relocatable objects because their memories are not
meaningful. This technically maintains the validity of object files
because the threads spec now allows atomic operations with unshared
memories, although we don't support that elsewhere in the tools yet.
This fixes and Emscripten build issue reported at
https://bugs.chromium.org/p/webp/issues/detail?id=463.
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78072
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.
Differential revision: https://reviews.llvm.org/D78016
The alignment of ARM64 range extension thunks was fixed in
7c81649219, but ARM range extension thunks, and import
and delay import thunks also need aligning (like all code on ARM
platforms).
I'm adding a test for alignment of ARM64 import thunks - not
specifically adding tests for misalignment of all of them though.
Differential Revision: https://reviews.llvm.org/D77796
Summary:
A previous change (53211a) had updated the argument parsing to handle
large max memories, but 4294967296 would still wrap to zero after the
options were parsed. This change updates the configuration to use a
64-bit integer to store the max memory to avoid that overflow.
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77437
This patch changes the reproduce tests so that they no longer extract
the "long" paths of the generated reproduce tar archives. This
extraction prevented them from being run on Windows due to potential
issues relating to the Windows path length limit.
This patch also reduces the use of diff in these tests, as this was
raised as a performance concern in review D77659 and deemed unnecessary.
Differential Revision: https://reviews.llvm.org/D77750
The R_ARM_ALU_PC_G0 and R_ARM_LDR_PC_G0 relocations are used by the
ADR and LDR pseudo instructions, and are the basis of the group
relocations that can load an arbitrary constant via a series of add, sub
and ldr instructions.
The relocations need to be obtained via the .reloc directive.
R_ARM_ALU_PC_G0 is much more complicated as the add/sub instruction uses
a modified immediate encoding of an 8-bit immediate rotated right by an
even 4-bit field. This means that the range of representable immediates
is sparse. We extract the encoding and decoding functions for the modified
immediate from llvm/lib/Target/ARM/MCTargetDesc/ARMAddressingModes.h as
this header file is not accessible from LLD. Duplication of code isn't
ideal, but as these are well-defined mathematical functions they are
unlikely to change.
Differential Revision: https://reviews.llvm.org/D75349
Summary:
/PDBSTREAM:<name>=<file> adds the contents of <file> to stream <name> in the resulting PDB.
This allows native uses with workflows that (for example) add srcsrv streams to PDB files to provide a location for the build's source files.
Results should be equivalent to linking with lld-link, then running Microsoft's pdbstr tool with the command line:
pdbstr.exe -w -p:<PDB LOCATION> -s:<name> -i:<file>
except in cases where the named stream overlaps with a default named stream, such as "/names". In those cases, the added stream will be overridden, making the /pdbstream option a no-op.
Reviewers: thakis, rnk
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D77310
This is an alternative design to D77512.
D45195 added --warn-backrefs to detect
* A. certain input orders which GNU ld either errors ("undefined reference")
or has different resolution semantics
* B. (byproduct) some latent multiple definition problems (-ldef1 -lref -ldef2) which I
call "linking sandwich problems". def2 may or may not be the same as def1.
When an archive appears more than once (-ldef -lref -ldef), lld and GNU
ld may have the same resolution but --warn-backrefs may warn. This is
not uncommon. For example, currently lld itself has such a problem:
```
liblldCommon.a liblldCOFF.a ... liblldCommon.a
_ZN3lld10DWARFCache13getDILineInfoEmm in liblldCOFF.a refers to liblldCommon.a(DWARF.cpp.o)
libLLVMSupport.a also appears twice and has a similar warning
```
glibc has such problems. It is somewhat destined because of its separate
libc/libpthread/... and arbitrary grouping. The situation is getting
improved over time but I have seen:
```
-lc __isnanl references -lm
-lc _IO_funlockfile references -lpthread
```
There are also various issues in interaction with other runtime
libraries such as libgcc_eh and libunwind:
```
-lc __gcc_personality_v0 references -lgcc_eh
-lpthread __gcc_personality_v0 references -lgcc_eh
-lpthread _Unwind_GetCFA references -lunwind
```
These problems are actually benign. We want --warn-backrefs to focus on
its main task A and defer task B (which is also useful) to a more
specific future feature (see gold --detect-odr-violations and
https://bugs.llvm.org/show_bug.cgi?id=43110).
Instead of warning immediately, we store the message and only report it
if no subsequent lazy definition exists.
The use of the static variable `backrefDiags` is similar to `undefs` in
Relocations.cpp
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D77522
SymbolAssignment::addr stores the location counter. The type should be
uint64_t instead of unsigned. The upper half of the address space is
commonly used by operating system kernels.
Similarly, SymbolAssignment::size should be an uint64_t. A kernel linker
script can move the location counter from 0 to the upper half of the
address space.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D77445
This is part of the Propeller framework to do post link code layout
optimizations. Please see the RFC here:
https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the
detailed RFC doc here:
https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf
This patch adds lld support for basic block sections and performs relaxations
after the basic blocks have been reordered.
After the linker has reordered the basic block sections according to the
desired sequence, it runs a relaxation pass to optimize jump instructions.
Currently, the compiler emits the long form of all jump instructions. AMD64 ISA
supports variants of jump instructions with one byte offset or a four byte
offset. The compiler generates jump instructions with R_X86_64 32-bit PC
relative relocations. We would like to use a new relocation type for these jump
instructions as it makes it easy and accurate while relaxing these instructions.
The relaxation pass does two things:
First, it deletes all explicit fall-through direct jump instructions between
adjacent basic blocks. This is done by discarding the tail of the basic block
section.
Second, If there are consecutive jump instructions, it checks if the first
conditional jump can be inverted to convert the second into a fall through and
delete the second.
The jump instructions are relaxed by using jump instruction mods, something
like relocations. These are used to modify the opcode of the jump instruction.
Jump instruction mods contain three values, instruction offset, jump type and
size. While writing this jump instruction out to the final binary, the linker
uses the jump instruction mod to determine the opcode and the size of the
modified jump instruction. These mods are required because the input object
files are memory-mapped without write permissions and directly modifying the
object files requires copying these sections. Copying a large number of basic
block sections significantly bloats memory.
Differential Revision: https://reviews.llvm.org/D68065
Fixes https://bugs.llvm.org/show_bug.cgi?id=45391
The LTO code generator happens after version script scanning and may
create references which will fetch some lazy symbols.
Currently a version script does not assign VER_NDX_LOCAL to lazy symbols
and such symbols will be made global after they are fetched.
Change findByVersion and findAllByVersion to work on lazy symbols.
For unfetched lazy symbols, we should keep them non-local (D35263).
Check isDefined() in computeBinding() as a compensation.
This patch fixes a companion bug that --dynamic-list does not export
libcall fetched symbols.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D77280
finalizeSynthetic(in.symTab) calls sortSymTabSymbols() to order local
symbols before non-local symbols.
The newly added tests ensure that thunk symbols are added before
finalizeSynthetic(in.symTab), otherwise .symtab would be out of order.
A PC-relative relocation referencing a non-preemptible absolute symbol
(due to STT_TLS) is not representable in -pie/-shared mode.
Differential Revision: https://reviews.llvm.org/D77021
In the near future llvm-mc will resolve the fixups that generate
R_ARM_THUMB_PC8 and R_ARM_THUMB_PC12 at assembly time (see comments in
D72892), and forbid inter-section references. Change the LLD tests for
these relocations to use .inst and .reloc to avoid LLD tests failing when
this happens. The tests generate the same instructions, relocations
and symbols.
I will need to make equivalent changes for D75349 Arm equivalent
relocations, but this is still in review so these don't need changing
before llvm-mc.
Differential Revision: https://reviews.llvm.org/D77200
The aliased options in the --help output use double dashes. It is
inconsistent to have single-dashed messages. Additionally, -l and -t are
common short options and single-dashed forms prefixed with them can
cause confusion.
In most cases, LLD prints its multiline diagnostic messages starting
additional lines with ">>> ". That greatly helps external tools to parse
the output, simplifying combining several lines of the log back into one
message. The patch fixes the only message I found that does not follow
the common pattern.
Differential Revision: https://reviews.llvm.org/D77132
When reporting an "undefined symbol" diagnostic:
* We don't print @ for the reference.
* We don't print @ or @@ for the definition. https://bugs.llvm.org/show_bug.cgi?id=45318
This can lead to confusing diagnostics:
```
// foo may be foo@v2
ld.lld: error: undefined symbol: foo
>>> referenced by t1.o:(.text+0x1)
// foo may be foo@v1 or foo@@v1
>>> did you mean: foo
>>> defined in: t.so
```
There are 2 ways a symbol in symtab may get truncated:
* A @@ definition may be truncated *early* by SymbolTable::insert().
The name ends with a '\0'.
* A @ definition/reference may be truncated *later* by Symbol::parseSymbolVersion().
The name ends with a '@'.
This patch detects the second case and improves the diagnostics. The first case is
not improved but the second case is sufficient to make diagnostics not confusing.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D76999
This matches the behaviour of the ELF driver.
Also move the `createFiles` to be `checkConfig` and report `no input
files` there. Again this is mostly to match the structure of the ELF
linker better.
Differential Revision: https://reviews.llvm.org/D76960
Summary:
This is the first commit for the new Mach-O backend, designed to roughly
follow the architecture of the existing ELF and COFF backends, and
building off work that @ruiu and @pcc did in a branch a while back. Note
that this is a very stripped-down commit with the bare minimum of
functionality for ease of review. We'll be following up with more diffs
soon.
Currently, we're able to generate a simple "Hello World!" executable
that runs on OS X Catalina (and possibly on earlier OS X versions; I
haven't tested them). (This executable can be obtained by compiling
`test/MachO/relocations.s`.) We're mocking out a few load commands to
achieve this -- for example, we can't load dynamic libraries, but
Catalina requires binaries to be linked against `dyld`, so we hardcode
the emission of a `LC_LOAD_DYLIB` command. Other mocked out load
commands include LC_SYMTAB and LC_DYSYMTAB.
Differential Revision: https://reviews.llvm.org/D75382
--no-threads is a name copied from gold.
gold has --no-thread, --thread-count and several other --thread-count-*.
There are needs to customize the number of threads (running several lld
processes concurrently or customizing the number of LTO threads).
Having a single --threads=N is a straightforward replacement of gold's
--no-threads + --thread-count.
--no-threads is used rarely. So just delete --no-threads instead of
keeping it for compatibility for a while.
If --threads= is specified (ELF,wasm; COFF /threads: is similar),
--thinlto-jobs= defaults to --threads=,
otherwise all available hardware threads are used.
There is currently no way to override a --threads={1,2,...}. It is still
a debate whether we should use --threads=all.
Reviewed By: rnk, aganea
Differential Revision: https://reviews.llvm.org/D76885
The default GNU linker script uses the following idiom for the array
sections. I'll use .init_array here, but this also applies to
.preinit_array and .fini_array sections.
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(.init_array))
PROVIDE_HIDDEN (__init_array_end = .);
}
The C-library will take references to the _start and _end symbols to
process the array. This will make LLD keep the OutputSection even if there
are no .init_array sections. As the current check for RELRO uses the
section type for .init_array the above example with no .init_array
InputSections fails the checks as there are no .init_array sections to give
the OutputSection a type of SHT_INIT_ARRAY. This often leads to a
non-contiguous RELRO error message.
The simple fix is to a textual section match as well as a section type
match.
Differential Revision: https://reviews.llvm.org/D76915
This test covers the case where --gc-sections is used when there are
many sections. In particular, it ensures that there is no adverse
interaction with common and absolute symbols.
Reviewed by: MaskRay, grimar
Differential Revision: https://reviews.llvm.org/D76003
The test had a few style issues, and I noticed a hole in the coverage
(namely that the search order wasn't tested). Adding cases for the hole
in turn meant other cases weren't important.
The .so test case isn't important, since the code is shared code, so
I've removed it. Additionally, I've modified the usage of the "bar"
directive to show that an unneeded library must still be present, or the
link will fail, even though it isn't linked in.
Reviewed by: MaskRay, grimar
Differential Revision: https://reviews.llvm.org/D76851
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
As of a while ago, lld groups all undefined references to a single
symbol in a single diagnostic. Back then, I made it so that we
print up to 10 references to each undefined symbol.
Having used this for a while, I never wished there were more
references, but I sometimes found that this can print a lot of
output. lld prints up to 10 diagnostics by default, and if
each has 10 references (which I've seen in practice), and each
undefined symbol produces 2 (possibly very long) lines of output,
that's over 200 lines of error output.
Let's try it with just 3 references for a while and see how
that feels in practice.
Differential Revision: https://reviews.llvm.org/D77017
Currently, `error: incompatible section flags for .rodata` is reported
when we mix SHF_LINK_ORDER and non-SHF_LINK_ORDER sections in an output section.
This is overconstrained. This patch allows mixed flags with the
requirement that SHF_LINK_ORDER sections must be contiguous. Mixing
flags is used by Linux aarch64 (https://github.com/ClangBuiltLinux/linux/issues/953)
.init.data : { ... KEEP(*(__patchable_function_entries)) ... }
When the integrated assembler is enabled, clang's -fpatchable-function-entry=N[,M]
implementation sets the SHF_LINK_ORDER flag (D72215) to fix a number of
garbage collection issues.
Strictly speaking, the ELF specification does not require contiguous
SHF_LINK_ORDER sections but for many current uses of SHF_LINK_ORDER like
.ARM.exidx/__patchable_function_entries there has been a requirement for
the sections to be contiguous on top of the requirements of the ELF
specification.
This patch also imposes one restriction: SHF_LINK_ORDER sections cannot
be separated by a symbol assignment or a BYTE command. Not allowing BYTE
is a natural extension that a non-SHF_LINK_ORDER cannot be a separator.
Symbol assignments can delimiter the contents of SHF_LINK_ORDER
sections. Allowing SHF_LINK_ORDER sections across symbol assignments
(especially __start_/__stop_) can make things hard to explain. The
restriction should not be a problem for practical use cases.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D77007
This fixes PR# 45336.
Output sections described in a linker script as NOLOAD with no input sections would be marked as SHT_PROGBITS.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D76981
DWARF sections are typically live and not COMDAT, so they would be
treated as GC roots. Enabling DWARF would essentially keep all code with
debug info alive, preventing any section GC.
Fixes PR45273
Reviewed By: mstorsjo, MaskRay
Differential Revision: https://reviews.llvm.org/D76935
Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled.
One can now say in LLD:
/opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask.
When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows).
Differential Revision: https://reviews.llvm.org/D75153
When the debug info contains a relocation against a dead symbol, wasm-ld
may emit spurious range-list terminator entries (entries with Start==0
and End==0). This change fixes this by emitting the WasmRelocation
Addend as End value for a non-live symbol.
Reviewed by: sbc100, dblaikie
Differential Revision: https://reviews.llvm.org/D74781
Summary:
This adds a test for D76752. Now the global section comes after the
event section, and this change makes sure it is satisfied.
Reviewers: sbc100, tlively
Reviewed By: tlively
Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76823
```
// llvm-objdump -d output (before)
0: bl .-4
4: bl .+0
8: bl .+4
// llvm-objdump -d output (after) ; GNU objdump -d
0: bl 0xfffffffc / bl 0xfffffffffffffffc
4: bl 0x4
8: bl 0xc
```
Many Operand's are not annotated as OPERAND_PCREL.
They are not affected (e.g. `b .+67108860`). I plan to fix them in future patches.
Modified test/tools/llvm-objdump/ELF/PowerPC/branch-offset.s to test
address space wraparound for powerpc32 and powerpc64.
Reviewed By: sfertile, jhenderson
Differential Revision: https://reviews.llvm.org/D76591
The error previously talked about a "section header" but was actually
referring to a program header.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D76846
```
// llvm-objdump -d output (before)
400000: e8 0b 00 00 00 callq 11
400005: e8 0b 00 00 00 callq 11
// llvm-objdump -d output (after)
400000: e8 0b 00 00 00 callq 0x400010
400005: e8 0b 00 00 00 callq 0x400015
// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
400000: e8 0b 00 00 00 callq 400010
400005: e8 0b 00 00 00 callq 400015
```
In llvm-objdump, we pass the address of the next MCInst. Ideally we
should just thread the address of the current address, unfortunately we
cannot call X86MCCodeEmitter::encodeInstruction (X86MCCodeEmitter
requires MCInstrInfo and MCContext) to get the length of the MCInst.
MCInstPrinter::printInst has other callers (e.g llvm-mc -filetype=asm, llvm-mca) which set Address to 0.
They leave MCInstPrinter::PrintBranchImmAsAddress as false and this change is a no-op for them.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76580
Added support for /map and /map:[filepath].
The output was derived from Microsoft's Link.exe output when using that same option.
Note that /MAPINFO support was not added.
The previous implementation of MapFile.cpp/.h was meant for /lldmap, and was renamed to LLDMapFile.cpp/.h
MapFile.cpp/.h is now for /MAP
However, a small fix was added to lldmap, replacing a std::sort with std::stable_sort to enforce reproducibility.
Differential Revision: https://reviews.llvm.org/D70557
Add the relevant magic bits to allow "-mllvm=-load=plugin.so" etc.
This is now using export_executable_symbols_for_plugins, so symbols are
only exported if plugins are enabled.
Differential Revision: https://reviews.llvm.org/D75879
This behavior matches GNU ld and seems reasonable.
```
// If a SECTIONS command is not specified
.text.* -> .text
.rodata.* -> .rodata
.init_array.* -> .init_array
```
A proposed Linux feature CONFIG_FG_KASLR may depend on the GNU ld behavior.
Reword a comment about -z keep-text-section-prefix and a comment about
CommonSection (deleted by rL286234).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75225
This essentially drops the change by r288021 (discussed with Georgii Rymar
and Peter Smith and noted down in the release note of lld 10).
GNU ld>=2.31 enables -z separate-code by default for Linux x86. By
default (in the absence of a PHDRS command) a readonly PT_LOAD is
created, which is different from its traditional behavior.
Not emulating GNU ld's traditional behavior is good for us because it
improves code consistency (we create a readonly PT_LOAD in the absence
of a SECTIONS command).
Users can add --no-rosegment to restore the previous behavior (combined
readonly and read-executable sections in a single RX PT_LOAD).
Fixes https://bugs.llvm.org/show_bug.cgi?id=44903
It is about the following case:
```
SECTIONS {
.foo : { *(.foo) } =0x90909090
/DISCARD/ : { *(.bar) }
}
```
Here while parsing the fill expression we treated the
"/" of "/DISCARD/" as operator.
With this change, suggested by Fangrui Song, we do
not allow expressions with operators (e.g. "0x1100 + 0x22")
that are not wrapped into round brackets. It should not
be an issue for users, but helps to resolve parsing ambiguity.
Differential revision: https://reviews.llvm.org/D74687
MCTargetOptionsCommandFlags.inc and CommandFlags.inc are headers which contain
cl::opt with static storage.
These headers are meant to be incuded by tools to make it easier to parametrize
codegen/mc.
However, these headers are also included in at least two libraries: lldCommon
and handle-llvm. As a result, when creating DYLIB, clang-cpp holds a reference
to the options, and lldCommon holds another reference. Linking the two in a
single executable, as zig does[0], results in a double registration.
This patch explores an other approach: the .inc files are moved to regular
files, and the registration happens on-demand through static declaration of
options in the constructor of a static object.
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1756977#c5
Differential Revision: https://reviews.llvm.org/D75579
Follow-up for D74433
What the function returns are almost standard BFD names, except that "ELF" is
in uppercase instead of lowercase.
This patch changes "ELF" to "elf" and changes ARM/AArch64 to use their BFD names.
MIPS and PPC64 have endianness differences as well, but this patch does not intend to address them.
Advantages:
* llvm-objdump: the "file format " line matches GNU objdump on ARM/AArch64 objects
* "file format " line can be extracted and fed into llvm-objcopy -O literally.
(https://github.com/ClangBuiltLinux/linux/issues/779 has such a use case)
Affected tools: llvm-readobj, llvm-objdump, llvm-dwarfdump, MCJIT (internal implementation detail, not exposed)
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76046
When emitting PDBs, the TypeStreamMerger class is used to merge .debug$T records from the input .OBJ files into the output .PDB stream.
Records in .OBJs are not required to be aligned on 4-bytes, and "The Netwide Assembler 2.14" generates non-aligned records.
When compiling with -DLLVM_ENABLE_ASSERTIONS=ON, an assert was triggered in MergingTypeTableBuilder when non-ghash merging was used.
With ghash merging there was no assert.
As a result, LLD could potentially generate a non-aligned TPI stream.
We now align records on 4-bytes when record indices are remapped, in TypeStreamMerger::remapIndices().
Differential Revision: https://reviews.llvm.org/D75081
Hexagon ABI specifies that call x@gdplt is transformed to call __tls_get_addr.
Example:
call x@gdplt
is changed to
call __tls_get_addr
When x is an external tls variable.
Differential Revision: https://reviews.llvm.org/D74443
Any OUTPUT_FORMAT in a linker script overrides the emulation passed on
the command line, so record the passed bfdname and use that in the error
message about incompatible input files.
This prevents confusing error messages. For example, if you explicitly
pass `-m elf_x86_64` to LLD but accidentally include a linker script
which sets `OUTPUT_FORMAT(elf32-i386)`, LLD would previously complain
about your input files being compatible with elf_x86_64, which isn't the
actual issue, and is confusing because the input files are in fact
x86-64 ELF files.
Interestingly enough, this also prevents a segfault! When we don't pass
`-m` and we have an object file which is incompatible with the
`OUTPUT_FORMAT` set by a linker script, the object file is checked for
compatibility before it's added to the objectFiles vector.
config->emulation, objectFiles, and sharedFiles will all be empty, so
we'll attempt to access bitcodeFiles[0], but bitcodeFiles is also empty,
so we'll segfault. This commit prevents the segfault by adding
OUTPUT_FORMAT as a possible source of machine configuration, and it also
adds an llvm_unreachable to diagnose similar issues in the future.
Differential Revision: https://reviews.llvm.org/D76109
On an internal target,
* Before D74773: time -f '%M' => 18275680
* After D74773: time -f '%M' => 22088964
This patch restores to the status before D74773.
-M output can be useful when diagnosing an "error: output file too large" problem (emitted in openFile()).
I just ran into such a situation where I had to debug an erronerous
Linux kernel linker script. It tried to create a file larger than
INT64_MAX bytes.
This patch could have helped https://bugs.llvm.org/show_bug.cgi?id=44715 as well.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75966
See `docs/ELF/linker_script.rst` for the new computation for sh_addr and sh_addralign.
`ALIGN(section_align)` now means: "increase alignment to section_align"
(like yet another input section requirement).
The "start of section .foo changes from 0x11 to 0x20" warning no longer
makes sense. Change it to warn if sh_addr%sh_addralign!=0.
To decrease the alignment from the default max_input_align,
use `.output ALIGN(8) : {}` instead of `.output : ALIGN(8) {}`
See linkerscript/section-address-align.test as an example.
When both an output section address and ALIGN are set (can be seen as an
"undefined behavior" https://sourceware.org/ml/binutils/2020-03/msg00115.html),
lld may align more than GNU ld, but it makes a linker script working
with GNU ld hard to break with lld.
This patch can be considered as restoring part of the behavior before D74736.
Differential Revision: https://reviews.llvm.org/D75724
LLD implements Linker Scripts as they are described in the GNU ld manual.
This description is far from a specification, with the only true reference
the GNU ld implementation, which has undocumented behaviour that can vary
from release to release.
To make it easy for people to switch between linkers we try to follow GNU
ld implementation details wherever possible. We reserve the right to make
our own decisions where the undocumented GNU ld behaviour is not
appropriate for LLD. We don't have a place to document these decisions and
it can be difficult for users to find out this information.
This file is a statement of the LLD implementation policy and will contain
intentional deviations from GNU ld.
The first patch that will add concrete details to this file is D75724
Differential Revision: https://reviews.llvm.org/D75921
Summary:
Places orphan sections into a unique output section. This prevents the merging of orphan sections of the same name.
Matches behaviour of GNU ld --unique. --unique=pattern is not implemented.
Motivated user case shown in the test has 2 local symbols as they would appear if C++ source has been compiled with -ffunction-sections. The merging of these sections in the case of a partial link (-r) may limit the effectiveness of -gc-sections of a subsequent link.
Reviewers: espindola, jhenderson, bd1976llvm, edd, andrewng, JonChesterfield, MaskRay, grimar, ruiu, psmith
Reviewed By: MaskRay, grimar
Subscribers: emaste, arichardson, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75536
```
createFiles(args)
readDefsym
readerLinkerScript(*mb)
...
readMemory
readMemoryAssignment("ORIGIN", "org", "o") // eagerly evaluated
target = getTarget();
link(args)
writeResult<ELFT>()
...
finalizeSections()
script->processSymbolAssignments()
addSymbol(cmd) // with this patch, evaluated here
```
readMemoryAssignment eagerly evaluates ORIGIN/LENGTH and returns an uint64_t.
This patch postpones the evaluation to make
* --defsym and symbol assignments
* `CONSTANT(COMMONPAGESIZE)` (requires a non-null `lld:🧝:target`)
work. If the expression somehow requires interaction with memory
regions, the circular dependency may cause the expression to evaluate to
a strange value. See the new test added to memory-err.s
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75763
lld/.clang-tidy is almost identical to the top-level .clang-tidy, with the aforementioned customization.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D75809
Added a write method for TimeTrace that takes two strings representing
file names. The first is any file name that may have been provided by the
user via `time-trace-file` flag, and the second is a fallback that should
be configured by the caller. This method makes it cleaner to write the
trace output because there is no longer a need to check file names at the
caller and simplifies future TimeTrace usages.
Reviewed By: modocache
Differential Revision: https://reviews.llvm.org/D74514
llvm::call_once(initDwarfLine, [this]() { initializeDwarf(); });
Though it is not used in all places.
I need that patch for implementing "Remove obsolete debug info" feature
(D74169). But this caching mechanism is useful by itself, and I think it
would be good to use it without connection to "Remove obsolete debug info"
feature. So this patch changes inplace creation of DWARFContext with
its cached version.
Depends on D74308
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D74773
Currently `yaml2obj` require `Offset` field in a relocation description.
There are many cases when `Offset` is insignificant in a context of a test case.
Making `Offset` optional allows to simplify our test cases.
This is what this patch does.
Also, with this patch `obj2yaml` does not dump a zero offset of a relocation.
Differential revision: https://reviews.llvm.org/D75608
The new behavior matches GNU objdump. A pair of angle brackets makes tests slightly easier.
`.foo:` is not unique and thus cannot be used in a `CHECK-LABEL:` directive.
Without `-LABEL`, the CHECK line can match the `Disassembly of section`
line and causes the next `CHECK-NEXT:` to fail.
```
Disassembly of section .foo:
0000000000001634 .foo:
```
Bdragon: <> has metalinguistic connotation. it just "feels right"
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D75713
* Delete boilerplate
* Change functions to return `Error`
* Test parsing errors
* Update callers of ARMAttributeParser::parse() to check the `Error` return value.
Since this patch touches nearly everything in the file, I apply
http://llvm.org/docs/Proposals/VariableNames.html and change variable
names to lower case.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D75015
This fixes several issues. The behavior changes are:
A SHN_COMMON symbol does not have the 'g' flag.
An undefined symbol does not have 'g' or 'l' flag.
A STB_GLOBAL SymbolRef::ST_Unknown symbol has the 'g' flag.
A STB_LOCAL SymbolRef::ST_Unknown symbol has the 'l' flag.
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D75659
This changes the output of `llvm-readelf -n` from:
```
Displaying notes found at file offset 0x<...> with length 0x<...>:
```
to:
```
Displaying notes found in: .note.foo
```
And similarly, adds a `Name:` field to the `llvm-readobj -n` output for notes.
This change not only increases GNU compatibility, it also makes it much easier to read notes. Note that we still fall back to printing the file offset/length in cases where we don't have a section name, such as when printing notes in program headers or printing notes in a partially stripped file (GNU readelf does the same).
Fixes llvm.org/PR41339.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D75647
Summary:
LLD has workaround for the times when SectionIndex was not passed properly:
LT->getFileLineInfoForAddress(
S->getOffsetInFile() + Offset, nullptr,
DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath, Info));
S->getOffsetInFile() was added to differentiate offsets between
various sections. Now SectionIndex is properly specified.
Thus it is not necessary to use getOffsetInFile() workaround.
See https://reviews.llvm.org/D58194, https://reviews.llvm.org/D58357.
This patch removes getOffsetInFile() workaround.
Reviewers: ruiu, grimar, MaskRay, espindola
Reviewed By: grimar, MaskRay
Subscribers: emaste, arichardson, llvm-commits
Tags: #llvm, #lld
Differential Revision: https://reviews.llvm.org/D75636
Summary:
A test is passing `-o -` to lld in the hope of writing the output to
standard out but that is not the case. Instead it creates a file named
`-.lto.o`. This fixes it by creating a temporary file in the work
directory.
Differential Revision: https://reviews.llvm.org/D75605
and follow-ups:
a2ca1c2d "build: disable zlib by default on Windows"
2181bf40 "[CMake] Link against ZLIB::ZLIB"
1079c68a "Attempt to fix ZLIB CMake logic on Windows"
This changed the output of llvm-config --system-libs, and more
importantly it broke stand-alone builds. Instead of piling on more fix
attempts, let's revert this to reduce the risk of more breakages.
Otherwise ld.lld -save-temps will crash when writing to ResolutionFile.
llvm-lto2 -save-temps does not crash because it exits immediately.
Reviewed By: evgeny777
Differential Revision: https://reviews.llvm.org/D75426
Similar to D63182 [ELF][PPC64] Don't report "relocation refers to a discarded section" for .toc
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D75419
Summary:
Extracted from D74773. Currently, errors happened while parsing
debug info are reported as errors. DebugInfoDWARF library treats such
errors as "Recoverable errors". This patch makes debug info errors
to be reported as warnings, to support DebugInfoDWARF approach.
Reviewers: ruiu, grimar, MaskRay, jhenderson, espindola
Reviewed By: MaskRay, jhenderson
Subscribers: emaste, aprantl, arichardson, arphaman, llvm-commits
Tags: #llvm, #debug-info, #lld
Differential Revision: https://reviews.llvm.org/D75234
When there are both strong and weak references to an undefined
symbol ensure that the strong reference prevails in the output symbol
generating the correct error.
Test case copied from lld/test/ELF/weak-and-strong-undef.s
Differential Revision: https://reviews.llvm.org/D75322