Fix PR36272 and PR46835
A .eh_frame FDE references a text section and (optionally) a LSDA (in
.gcc_except_table). Even if two text sections have identical content and
relocations (e.g. a() and b()), we cannot fold them if their LSDA are different.
```
void foo();
void a() {
try { foo(); } catch (int) { }
}
void b() {
try { foo(); } catch (float) { }
}
```
Scan .eh_frame pieces with LSDA and disallow folding of the text sections
they reference. If two .gcc_except_table sections have identical semantics
(usually identical content with PC-relative encoding), we will lose a folding opportunity.
For ClickHouse (an exception-heavy application), this can reduce --icf=all efficiency
from 9% to 5%. There may be some percentage we can reclaim without affecting
correctness, if we analyze .eh_frame and .gcc_except_table sections.
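The rule above can be sketched as follows. This is a simplified model, not the actual LLD code: `TextSection`, `FdePiece`, `excludeLsdaSections` and `canFold` are hypothetical names, and section contents plus relocations are flattened into one string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for linker data structures.
struct TextSection {
  std::string contents; // section bytes plus relocations, flattened
  bool eligible;        // may ICF fold this section?
};

struct FdePiece {
  std::size_t textIndex; // index of the text section the FDE covers
  bool hasLsda;          // true if the FDE points into .gcc_except_table
};

// Text sections covered by an FDE with an LSDA are excluded from folding.
void excludeLsdaSections(std::vector<TextSection> &text,
                         const std::vector<FdePiece> &fdes) {
  for (const FdePiece &fde : fdes) {
    assert(fde.textIndex < text.size());
    if (fde.hasLsda)
      text[fde.textIndex].eligible = false;
  }
}

// Two sections can be folded only if both are eligible and identical.
bool canFold(const TextSection &a, const TextSection &b) {
  return a.eligible && b.eligible && a.contents == b.contents;
}
```

With a() and b() from the example, both sections are referenced by FDEs with an LSDA, so `canFold` rejects them even though their contents match.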
gold 2.24 implemented a more complex fix (resolution to
https://sourceware.org/bugzilla/show_bug.cgi?id=21066) which combines the
checksum of .eh_frame CIE/FDE pieces.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D84610
--oformat=binary is rare (used in a few places in FreeBSD, see `stand/i386/mbr/Makefile` `LDFLAGS_BIN`)
The result should be identical to a normal output transformed by `objcopy -O binary`.
The current implementation ignores addresses and lays out sections by
respecting output section alignments. It can fail when an output section
address is specified, e.g. `.rodata ALIGN(16) :` (PR33651).
Fix PR33651 by respecting LMA. The code is similar to
`tools/llvm-objcopy/ELF/Object.cpp` BinaryWriter::finalize after D71035 and D79229.
Unfortunately, for an output section without PT_LOAD, we assume its LMA is equal
to its VMA. So the result is still incorrect when an output section LMA
(`AT(...)`) is specified.
Also drop `alignTo(off, config->wordsize)`. GNU ld does not round up the file size.
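A minimal sketch of LMA-based layout, assuming the caller has already filtered out sections that occupy no file space. `OutSec` and `layoutBinary` are hypothetical names for illustration, not the LLD implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal view of an output section that occupies file space.
struct OutSec {
  uint64_t lma;    // load address
  uint64_t size;   // size in bytes
  uint64_t offset; // file offset, computed by layoutBinary
};

// Place each section at (LMA - lowest LMA) and return the file size.
// The file size is not rounded up to the word size.
uint64_t layoutBinary(std::vector<OutSec> &secs) {
  if (secs.empty())
    return 0;
  uint64_t minLma = UINT64_MAX;
  for (const OutSec &s : secs)
    minLma = std::min(minLma, s.lma);
  uint64_t fileSize = 0;
  for (OutSec &s : secs) {
    s.offset = s.lma - minLma;
    fileSize = std::max(fileSize, s.offset + s.size);
  }
  return fileSize;
}
```

Gaps between sections fall out of the address arithmetic, which is why an explicit output section address such as `.rodata ALIGN(16) :` is honored in this model.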
Differential Revision: https://reviews.llvm.org/D85086
See https://lists.llvm.org/pipermail/llvm-dev/2020-July/143373.html
"[llvm-dev] Multiple documents in one test file" for some discussions.
This patch has explored several alternatives. The current semantics are similar to
what @dblaikie proposed.
`split-file filename output` splits the input file into multiple parts separated by
regex `^(.|//)--- filename` and writes each part to the file `output/filename`
(`filename` can include path separators).
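The splitting rule can be sketched as below. This is an illustration only: `splitParts` is a hypothetical helper that returns the parts in a map instead of writing files, and dropping the lines before the first separator is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Split `input` into parts keyed by the name that follows the separator.
// Lines before the first separator are dropped (assumption of this sketch).
std::map<std::string, std::string> splitParts(const std::string &input) {
  static const std::regex separator(R"(^(.|//)--- (.+)$)");
  std::map<std::string, std::string> parts;
  std::istringstream in(input);
  std::string line, current;
  bool inPart = false;
  while (std::getline(in, line)) {
    std::smatch m;
    if (std::regex_match(line, m, separator)) {
      current = m[2];
      parts[current]; // create an empty part even if no lines follow
      inPart = true;
    } else if (inPart) {
      parts[current] += line + "\n";
    }
  }
  return parts;
}
```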
Use case A (organizing input of different formats (e.g. linker
script+assembly) in one file).
```
# RUN: split-file %s %t
# RUN: llvm-mc %t/asm -o %t.o
# RUN: ld.lld -T %t/lds %t.o -o %t
#--- asm
...
#--- lds
...
```
This is sometimes better than the %S/Inputs/ approach because the user
can see the auxiliary files immediately and doesn't have to open another file.
Use case B (for utilities which don't have built-in input splitting
feature):
```
// RUN: split-file %s %t
// RUN: llc < %t/1.ll | FileCheck %s --check-prefix=CASE1
// RUN: llc < %t/2.ll | FileCheck %s --check-prefix=CASE2
//--- 1.ll
...
//--- 2.ll
...
```
Combining tests prudently can improve readability.
For example, when testing parsing errors for which recovery isn't possible,
grouping the tests in one file makes the test coverage/strategy easier to see.
Since this is a new utility, there are no git history concerns for
UpperCase variable names. I use lowerCase variable names like mlir/lld.
Reviewed By: jhenderson, lattner
Differential Revision: https://reviews.llvm.org/D83834
Clang and GCC have a feature (-MD flag) to create a dependency file
in a format that build systems such as Make or Ninja can read, which
specifies all the additional inputs such as .h files.
This change introduces the same functionality to lld, bringing it to
feature parity with ld and gold, which gained this feature recently.
See https://sourceware.org/bugzilla/show_bug.cgi?id=22843 for more
details and discussion.
The implementation corresponds to the -MD -MP compiler flags: the
generated dependency file also includes phony targets, which work
around the errors that occur when a dependency is removed. This matches the
format used by ld and gold.
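A minimal sketch of such a dependency file generator. `makeDepFile` is a hypothetical helper; the exact whitespace and escaping emitted by lld, ld or gold are not reproduced here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build a Make-style rule "target: inputs..." followed by one empty
// phony rule per input, in the spirit of -MD -MP.
std::string makeDepFile(const std::string &target,
                        const std::vector<std::string> &inputs) {
  std::string out = target + ":";
  for (const std::string &in : inputs)
    out += " \\\n  " + in;
  out += "\n";
  // A phony rule per input keeps Make from failing with "no rule to make
  // target" after an input file has been removed.
  for (const std::string &in : inputs)
    out += "\n" + in + ":\n";
  return out;
}
```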
Fixes PR42806
Differential Revision: https://reviews.llvm.org/D82437
D68049 created options for basic block sections: -fbasic-block-sections=,
-funique-basic-block-section-names. Rename options in llc and lld (--lto-)
to be consistent. Specifically,
+ Rename basicblock-sections to basic-block-sections
+ Rename unique-bb-section-names to unique-basic-block-section-names
Differential Revision: https://reviews.llvm.org/D84462
This patch supports the situation where the caller does not have a valid TOC and
calls using the R_PPC64_REL24_NOTOC relocation and the callee is not DSO local.
In this case the call cannot be made directly since the callee may or may not
require a valid TOC pointer. As a result, this situation requires a PC-relative
PLT stub to set up r12.
Reviewed By: sfertile, MaskRay, stefanp
Differential Revision: https://reviews.llvm.org/D83669
This patch adds support for the LOG2CEIL builtin function in linker scripts: https://sourceware.org/binutils/docs/ld/Builtin-Functions.html#index-LOG2CEIL_0028exp_0029
As documented for LD, and to keep compatibility, LOG2CEIL(0) returns 0 (not -inf).
The test vectors are somewhat arbitrary. We check minimum values (0-4); middle values (2^32, and 2^32+1); and the maximum value (2^64-1).
The checks for LOG2CEIL explicitly use full 64-bit values (16 hex digits). This is needed to properly verify that -inf and other interesting results aren't returned. (For some reason, all other tests in operators.test use only 14 digits.)
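For reference, the intended semantics can be sketched with plain integer arithmetic. `log2Ceil` is a hypothetical standalone helper, not the code added to the script parser.

```cpp
#include <cassert>
#include <cstdint>

// Ceiling of log2(x) for 64-bit values, with LOG2CEIL(0) defined as 0.
uint64_t log2Ceil(uint64_t x) {
  if (x <= 1)
    return 0;
  // The number of significant bits in x - 1 equals ceil(log2(x)) for x >= 2.
  uint64_t bits = 0;
  for (uint64_t v = x - 1; v != 0; v >>= 1)
    ++bits;
  return bits;
}
```

The same inputs as the test vectors (0-4, 2^32, 2^32+1 and 2^64-1) exercise the zero case, exact powers of two, and the upper bound.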
Differential revision: https://reviews.llvm.org/D84054
See https://lists.llvm.org/pipermail/llvm-dev/2020-July/143373.html
"[llvm-dev] Multiple documents in one test file" for some discussions.
`extract part filename` splits the input file into multiple parts separated by
regex `^(.|//)--- ` and extracts the specified part to stdout or the
output file (if specified).
Use case A (organizing input of different formats (e.g. linker
script+assembly) in one file).
```
// RUN: extract lds %s -o %t.lds
// RUN: extract asm %s -o %t.s
// RUN: llvm-mc %t.s -o %t.o
// RUN: ld.lld -T %t.lds %t.o -o %t
```
This is sometimes better than the %S/Inputs/ approach because the user
can see the auxiliary files immediately and doesn't have to open another file.
Use case B (for utilities which don't have built-in input splitting
feature):
```
// RUN: extract case1 %s | llc | FileCheck %s --check-prefix=CASE1
// RUN: extract case2 %s | llc | FileCheck %s --check-prefix=CASE2
```
Combining tests prudently can improve readability.
This is sometimes better than having multiple test files.
Since this is a new utility, there are no git history concerns for
UpperCase variable names. I use lowerCase variable names like mlir/lld.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D83834
-r --gc-sections is usually not useful because it just makes intermediate output
smaller. https://bugs.llvm.org/show_bug.cgi?id=46700#c7 mentions a use case:
validating the absence of undefined symbols earlier than in the final link.
After D84129 (SHT_GROUP support in -r links), we can support -r
--gc-sections without extra code. So let's allow it.
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D84131
* If two group members are combined, we should leave just one index in the SHT_GROUP content.
* If a group member is discarded (/DISCARD/ or upcoming -r --gc-sections combination),
we should drop its index in the SHT_GROUP content. LLD currently crashes (`getOutputSection()` is null).
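The two rules can be sketched as a rewrite of the member list. `rewriteGroupMembers` and the index map are hypothetical; the leading flag word of the SHT_GROUP content is left out of this model.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// `inputToOutput` maps an input section index to its output section index.
// A discarded member has no entry (hypothetical representation).
std::vector<uint32_t>
rewriteGroupMembers(const std::vector<uint32_t> &members,
                    const std::map<uint32_t, uint32_t> &inputToOutput) {
  std::vector<uint32_t> out;
  std::set<uint32_t> seen;
  for (uint32_t member : members) {
    auto it = inputToOutput.find(member);
    if (it == inputToOutput.end())
      continue; // discarded member: drop its index
    if (seen.insert(it->second).second)
      out.push_back(it->second); // combined members: keep one index
  }
  return out;
}
```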
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D84129
The PC Relative code now allows for calls that are marked with the relocation
R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC
pointer in R2 and does not require R2 to be restored after the call.
This patch is added to support local calls to callees that require a TOC.
Reviewed By: sfertile, MaskRay, nemanjai, stefanp
Differential Revision: https://reviews.llvm.org/D83504
The test fails in 32-bit Windows builds for unclear reasons:
ld.lld: error: failed to open
C:\src\llvm_package_1100-rc1\build32_stage0\tools\lld\test\ELF\Output\arm-exidx-range.s.tmp:
The parameter is incorrect.
After D69985, symbols for "-init" and "-fini" were unconditionally
marked as used even if they were just lazy symbols seen when scanning
archives. That resulted in exposing them in the symbol table of an
output file, as Undefined, which added unwanted dependencies. The patch
fixes the issue by checking the kind of the symbols before the marking.
Differential Revision: https://reviews.llvm.org/D83549
It allows handling cases when we have SHT_REL[A] sections before target
sections in objects.
This fixes https://bugs.llvm.org/show_bug.cgi?id=46632
which says: "Normally it is not what compilers would emit. We have to support it,
because some custom tools might want to use this feature, which is not restricted by ELF gABI"
Differential revision: https://reviews.llvm.org/D83469
Implements the missing relocation types for AVR target.
The results have been cross-checked with binutils.
Original patch by LemonBoy. Some changes by me.
Differential Revision: https://reviews.llvm.org/D78741
The PC Relative code allows for calls that are marked with the relocation
R_PPC64_REL24_NOTOC. This indicates that the caller does not have a valid TOC
pointer in R2 and does not require R2 to be restored after the call.
This patch is added to support local calls to callees that also do not have a TOC.
Reviewed By: sfertile, MaskRay, stefanp
Differential Revision: https://reviews.llvm.org/D82816
The R_PPC64_REL24 is used in function calls when the caller requires a
valid TOC pointer. If the callee shares the same TOC or does not clobber
the TOC pointer then a direct call can be made. If the callee does not
share the TOC a thunk must be added to save the TOC pointer for the caller.
Up until PC Relative was introduced all local calls on medium and large code
models were assumed to share a TOC. This is no longer the case because
if the caller requires a TOC and the callee is PC Relative then the callee
can clobber the TOC even if it is in the same DSO.
This patch is to add support for a TOC caller calling a PC Relative callee that
clobbers the TOC.
Reviewed By: sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D82950
The patch adds checks for various potential issues in parsing name
lookup tables and reports them as recoverable errors, as we
do for other tables.
Differential Revision: https://reviews.llvm.org/D83050
The parsing method did not check reading errors and might easily fall
into an infinite loop on an invalid input because of that.
Differential Revision: https://reviews.llvm.org/D83049
... to customize the tombstone value we use for an absolute relocation
referencing a discarded symbol. This can be used as a workaround when
some debug processing tool has trouble with the current -1 tombstone value
(https://bugs.chromium.org/p/chromium/issues/detail?id=1102223#c11 )
For example, to get the current built-in rules (not considering the .debug_line special case for ICF):
```
-z dead-reloc-in-nonalloc='.debug_*=0xffffffffffffffff'
-z dead-reloc-in-nonalloc=.debug_loc=0xfffffffffffffffe
-z dead-reloc-in-nonalloc=.debug_ranges=0xfffffffffffffffe
```
To get GNU ld (as of binutils 2.35)'s behavior:
```
-z dead-reloc-in-nonalloc='*=0'
-z dead-reloc-in-nonalloc=.debug_ranges=1
```
This option has other use cases. For example, to check
whether a non-SHF_ALLOC section has dead relocations,
we can run a regular LLD link and another with a special
-z dead-reloc-in-nonalloc=, then compare their outputs.
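A sketch of how a list of `glob=value` rules could be evaluated. Letting the last matching rule win is an assumption, suggested by the examples above where a specific pattern follows a broader one. `starMatch` and `lookupTombstone` are hypothetical helpers, and the toy matcher only understands `*`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Tiny glob matcher supporting only '*'; a stand-in for a real glob library.
bool starMatch(const char *pat, const char *s) {
  if (*pat == '\0')
    return *s == '\0';
  if (*pat == '*')
    return starMatch(pat + 1, s) || (*s != '\0' && starMatch(pat, s + 1));
  return *pat == *s && starMatch(pat + 1, s + 1);
}

using Rule = std::pair<std::string, uint64_t>; // (section glob, tombstone)

// Assumption of this sketch: the last matching rule takes precedence.
// Returns false if no rule matches the section name.
bool lookupTombstone(const std::vector<Rule> &rules,
                     const std::string &section, uint64_t &value) {
  for (auto it = rules.rbegin(); it != rules.rend(); ++it) {
    if (starMatch(it->first.c_str(), section.c_str())) {
      value = it->second;
      return true;
    }
  }
  return false;
}
```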
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D83264
In GNU ld, --no-relax can disable x86-64 GOTPCRELX relaxation.
It is not useful, so we don't implement it.
For RISC-V, --no-relax disables linker relaxations which have larger
impact.
Linux kernel specifies --no-relax when CONFIG_DYNAMIC_FTRACE is specified
(since http://git.kernel.org/linus/a1d2a6b4cee858a2f27eebce731fbf1dfd72cb4e ).
LLD has not implemented the relaxations, so this option is a no-op.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81359
Follow-up to D82899. Note, we need to disable R_DTPREL relaxation
because ARM psABI does not define TLS relaxation.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D83138
The location of a TLS variable is encoded as a DW_OP_const4u/DW_OP_const8u
followed by a DW_OP_push_tls_address (or DW_OP_GNU_push_tls_address https://sourceware.org/bugzilla/show_bug.cgi?id=11616 ).
This change follows up on D81784 and makes relocation types generalized as
R_DTPREL (e.g. R_X86_64_DTPOFF{32,64}, R_PPC64_DTPREL64) use -1 as the
tombstone value as well. This works for both TLS Variant I and Variant II
architectures.
* arm: .long tls(tlsldo) # not working currently (R_ARM_TLS_LDO32 is R_ABS)
* mips64: .dtpreldword tls+32768
* ppc64: .quad tls@DTPREL+0x8000
* riscv: neither GCC nor clang has implemented DW_AT_location. It is likely .long/.quad tls@dtprel+0x800
* x86-32: .long tls@DTPOFF
* x86-64: .long tls@DTPOFF; .quad tls@DTPOFF
tls has a non-negative st_value, so such relocations (st_value+addend)
never resolve to -1 in a normal (not discarded) case.
```
// clang -fuse-ld=lld -g -ffunction-sections a.c -Wl,--gc-sections
// foo and tls will be discarded by --gc-sections.
// DW_AT_location [DW_FORM_exprloc] (DW_OP_const8u 0xffffffffffffffff, DW_OP_GNU_push_tls_address)
thread_local int tls;
int foo() { return ++tls; }
int main() {}
```
Also, drop logic added in D26201 intended to address PR30793. It added a test
(gc-debuginfo-tls.s) using a non-SHF_ALLOC section and a local symbol, which
does not reflect the intended scenario: a relocation in a SHF_ALLOC section
referencing a discarded non-local symbol. For such a non .debug_* section, just
emit an error.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82899
On Windows, co-operative programs can be expected to open LLD's
output in FILE_SHARE_DELETE mode. This allows us to delete the
file (by moving it to a temporary filename and then deleting
it) so that we can link another output file that overwrites
the existing file, even if the current file is in use.
A similar strategy is documented here:
https://boostgsoc13.github.io/boost.afio/doc/html/afio/FAQ/deleting_open_files.html
Differential Revision: https://reviews.llvm.org/D82567
This patch adds a few extra cases to the existing testing for eh_frame
and eh_frame_hdr behaviour in LLD. They all come from a private
testsuite we are trying to migrate to lit.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D82852
After D81784, we resolve a relocation in .debug_* referencing an ICF folded
section symbol to a tombstone value.
Doing this for .debug_line has a problem (https://reviews.llvm.org/D81784#2116925 ):
.debug_line may describe folded lines as having addresses UINT64_MAX or
some wraparound small addresses.
```
int foo(int x) {
return x; // line 2
}
int bar(int x) {
return x; // line 6
}
```
```
Address Line Column File ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x00000000002016c0 1 0 1 0 0 is_stmt
0x00000000002016c7 2 9 1 0 0 is_stmt
prologue_end
0x00000000002016ca 2 2 1 0 0
0x00000000002016cc 2 2 1 0 0 end_sequence
// UINT64_MAX and wraparound small addresses
0xffffffffffffffff 5 0 1 0 0 is_stmt
0x0000000000000006 6 9 1 0 0 is_stmt
prologue_end
0x0000000000000009 6 2 1 0 0
0x000000000000000b 6 2 1 0 0 end_sequence
0x00000000002016d0 9 0 1 0 0 is_stmt
0x00000000002016df 10 6 1 0 0 is_stmt prologue_end
0x00000000002016e6 11 11 1 0 0 is_stmt
...
```
These entries can confuse debuggers:
gdb before 2020-07-01 (binutils-gdb a8caed5d7faa639a1e6769eba551d15d8ddd9510 "Recognize -1 as a tombstone value in .debug_line")
(can't continue due to a breakpoint in an invalid region of memory):
```
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x6
```
lldb (breakpoint has no effect):
```
(lldb) b 6
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
```
This patch special cases .debug_line to not use the tombstone value,
restoring the previous behavior: .debug_line will have entries with the
same addresses (ICF) but different line numbers. A breakpoint on line 2
or 6 will trigger on both functions.
Reviewed By: dblaikie, jhenderson
Differential Revision: https://reviews.llvm.org/D82828
D79300 forgot to change `getBuffer().empty()` in LazyObjFile::parse to
`fetched`. This caused incorrect iterating after the current LazyObjFile was
fetched. This issue is benign and can just cause loss of "undefined symbols"
and "backward reference" diagnostics.
Before D79300 `mb = {}` caused --warn-backrefs-exclude to be useless for
a fetched LazyObjFile.
Add two test cases.
Fixes PR46420
Similar to D43307 for non-LTO.
Module-level inline assembly can use .symver to create a symbol with `@` in the name.
For relocatable output, @ should be retained in the symbol name. `@ver` should
not be parsed and dropped.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D82433
Add support for the 34bit relocation R_PPC64_GOT_PCREL34 for
PC Relative in LLD.
Reviewers: sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D81948
This is the followup to D77647 which implements handling for the new
R_AARCH64_PLT32 relocation type in lld. This relocation would benefit the
PIC-friendly vtables feature described in D72959.
Differential Revision: https://reviews.llvm.org/D81184
See D59553, https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html and
https://sourceware.org/pipermail/binutils/2020-May/111357.html for
extensive discussions on a tombstone value.
See http://www.dwarfstd.org/ShowIssue.php?issue=200609.1
(Reserve an address value for "not present") for a DWARF enhancement proposal.
We resolve such relocations to a tombstone value to indicate that the address is invalid.
This solves several problems (the normal behavior is to resolve the relocation to the addend):
* For an empty function in a collected section, a pair of (0,0) can
terminate .debug_loc and .debug_ranges (as of binutils 2.34, GNU ld
resolves such a relocation to 1 to avoid the .debug_ranges issue)
* If DW_AT_high_pc is sufficiently large, the address range can collide
with a regular code range of low address (https://bugs.llvm.org/show_bug.cgi?id=41124 )
* If a text section is folded into another by ICF, we may leave entries
in multiple CUs claiming ownership of the same range of code, which can
confuse consumers.
* Debug information associated with COMDAT sections can have problems
similar to ICF, but is more complex - thus not addressed by this patch.
For pre-DWARF-v5 .debug_loc and .debug_ranges, a pair of 0 can terminate
entries (invalidating subsequent ranges).
-1 is a reserved value with special meaning (base address selection entry) which can't be used either.
Use -2 instead.
For all other .debug_*, use UINT32_MAX for 32-bit targets and UINT64_MAX
for 64-bit targets. In the code, we intentionally use
`uint64_t tombstone = UINT64_MAX` for 32-bit targets as well: this matches
SignExtend64 as used in `relocateAlloc`. (Actually UINT32_MAX does not work for R_386_32)
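The selection described above can be sketched as follows. `debugTombstone` is a hypothetical helper, not the code in `relocateNonAlloc`; it returns a 64-bit value and leaves truncation to the relocation width to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Value written for a symbolic absolute relocation in a .debug_* section
// that references a discarded symbol.
uint64_t debugTombstone(const std::string &section, int64_t addend) {
  // The addend is ignored on purpose: -1 + addend would wrap around to a
  // low address that looks valid.
  (void)addend;
  // 0 terminates pre-DWARF-v5 range lists and -1 is a base address
  // selection entry, so these two sections use -2.
  if (section == ".debug_loc" || section == ".debug_ranges")
    return UINT64_MAX - 1;
  return UINT64_MAX;
}
```

Truncating the 64-bit value for a 32-bit target yields UINT32_MAX, for example `static_cast<uint32_t>(debugTombstone(".debug_info", 0))`.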
Note 0, we only special case `target->symbolicRel` (R_X86_64_64, R_AARCH64_ABS64, R_PPC64_ADDR64), not
short-range absolute relocations (e.g. R_X86_64_32). Only forms like DW_FORM_addr need to be special cased.
They can hold an arbitrary address (must be 64-bit on a 64-bit target). (In theory,
producers can make use of small code model to emit 32-bit relocations. This doesn't seem to be leveraged.)
Note 1, we have to ignore the addend, because we don't want to resolve
DW_AT_low_pc (which may have a non-zero addend) to -1+addend (wrap
around to a low address):
__attribute__((section(".text.x"))) void f1() { }
__attribute__((section(".text.x"))) void f2() { } // DW_AT_low_pc has a non-zero addend
Note 2, if the prevailing copy does not have debugging information while
a non-prevailing copy has (partial debug build), we don't do extra work
to attach debugging information to the prevailing definition. (clang
has a lot of debug info optimizations that are on-by-default that assume
the whole program is built with debug info).
clang -c -ffunction-sections a.cc # prevailing copy has no debug info
clang -c -ffunction-sections -g b.cc
Reviewed By: dblaikie, avl, jhenderson
Differential Revision: https://reviews.llvm.org/D81784
If neither AT(lma) nor AT>lma_region is specified,
D76995 keeps `lmaOffset` (LMA - VMA) if the previous section is in the
default LMA region.
This patch additionally checks that the two sections are in the same
memory region.
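The combined condition can be sketched as a predicate. This is a simplified model under stated assumptions: `Sec` and `inheritsLmaOffset` are hypothetical, memory regions are represented by integer ids, and "default LMA region" is modeled as the previous section having no `AT>lma_region`.

```cpp
#include <cassert>

// Hypothetical summary of the linker script attributes of an output section.
struct Sec {
  bool hasAtLma;       // AT(lma) was specified
  bool hasAtLmaRegion; // AT>lma_region was specified
  int memRegion;       // id of the "> region" memory region
};

// Should `cur` keep the lmaOffset (LMA - VMA) of the previous section?
bool inheritsLmaOffset(const Sec &prev, const Sec &cur) {
  if (cur.hasAtLma || cur.hasAtLmaRegion)
    return false; // an explicit LMA overrides inheritance
  // Previous section in the default LMA region (D76995), and both sections
  // in the same memory region (this patch).
  return !prev.hasAtLmaRegion && prev.memRegion == cur.memRegion;
}
```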
Add a test case derived from https://bugs.llvm.org/show_bug.cgi?id=45313
.mdata : AT(0xfb01000) { *(.data); } > TCM
// It is odd to make .bss inherit lmaOffset, because the two sections
// are in different memory regions.
.bss : { *(.bss) } > DDR
With this patch, section VMA/LMA match GNU ld. Note, GNU ld supports
out-of-order (w.r.t sh_offset) sections and places .text and .bss in the
same PT_LOAD. We don't have that behavior.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81986
Fixes PR46348.
ObjFile<ELFT>::initializeSymbols contains two symbol iteration loops:
```
for each symbol
if non-inheriting && non-local
fill in this->symbols[i]
for each symbol
if local
fill in this->symbols[i]
else
symbol resolution
```
Symbol resolution can trigger a duplicate symbol error which will call
InputSectionBase::getObjMsg to iterate over InputFile::symbols. If a
local symbol appears after the non-local symbol being resolved
(violating ELF spec), its `this->symbols[i]` entry has not been filled
in, and InputSectionBase::getObjMsg will crash due to
`dyn_cast<Defined>(nullptr)`.
To fix the bug, reorganize the two loops to ensure this->symbols is
complete before symbol resolution. This enforces the invariant:
InputFile::symbols has no null entry when InputFile::getSymbols() is called.
```
for each symbol
if non-inheriting
fill in this->symbols[i]
for each symbol starting from firstGlobal
if non-local
symbol resolution
```
Additionally, move the (non-local symbol in local part of .symtab)
diagnostic from Writer<ELFT>::copyLocalSymbols() to initializeSymbols().
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D81988
A hasWildcard pattern iterates over symVector, which can be slow when there
are many --export-dynamic-symbol. In optimistic cases, most patterns don't use
a wildcard character. hasWildcard: false can avoid a symbol table iteration.
While here, add two tests using `[` and `?`, respectively.
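The fast path can be sketched as below. `findSymbols` and `simpleGlob` are hypothetical, the symbol table is modeled as a set of names, and the toy matcher handles only `*` and `?` (bracket sets are left out).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// True if the pattern contains a glob metacharacter.
bool hasWildcard(const std::string &pat) {
  return pat.find_first_of("?*[") != std::string::npos;
}

// Toy matcher for '*' and '?' only; bracket sets are not interpreted.
bool simpleGlob(const char *pat, const char *s) {
  if (*pat == '\0')
    return *s == '\0';
  if (*pat == '*')
    return simpleGlob(pat + 1, s) || (*s != '\0' && simpleGlob(pat, s + 1));
  if (*s == '\0')
    return false;
  return (*pat == '?' || *pat == *s) && simpleGlob(pat + 1, s + 1);
}

// `visited` reports how many table entries were scanned.
std::vector<std::string>
findSymbols(const std::string &pat,
            const std::unordered_set<std::string> &symbols,
            std::size_t &visited) {
  visited = 0;
  std::vector<std::string> found;
  if (!hasWildcard(pat)) {
    // Exact name: a single hash lookup, no iteration over the table.
    if (symbols.count(pat))
      found.push_back(pat);
    return found;
  }
  for (const std::string &name : symbols) {
    ++visited;
    if (simpleGlob(pat.c_str(), name.c_str()))
      found.push_back(name);
  }
  return found;
}
```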
This change introduces an LLD switch --thinlto-single-module to allow compiling only a part of the input modules. This specifically enables:
1. Fast investigation/debugging of modules of interest without spending time on compiling unrelated modules.
2. Compiler debug dump with -mllvm -debug-only= for specific modules.
It will be useful for large applications which have 1K+ input modules for ThinLTO.
The switch can be combined with `--lto-obj-path=` or `--lto-emit-asm` to obtain intermediate object files or assembly files. So far the module name matching is implemented as a fuzzy name lookup: modules whose names contain the switch value are compiled.
E.g.,
Command:
ld.lld main.o thin.a --thinlto-single-module=thin.a --lto-obj-path=single.o
log:
[ThinLTO] Selecting thin.a(thin1.o at 168) to compile
[ThinLTO] Selecting thin.a(thin2.o at 228) to compile
Command:
ld.lld main.o thin.a --thinlto-single-module=thin1.o --lto-obj-path=single.o
log:
[ThinLTO] Selecting thin.a(thin1.o at 168) to compile
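The fuzzy lookup amounts to a substring test on the module name. A minimal sketch, where `selectModules` is a hypothetical helper and the module names are taken from the log lines above:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return the modules whose name contains `value` as a substring.
std::vector<std::string> selectModules(const std::vector<std::string> &names,
                                       const std::string &value) {
  std::vector<std::string> selected;
  for (const std::string &name : names)
    if (name.find(value) != std::string::npos)
      selected.push_back(name);
  return selected;
}
```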
Differential Revision: https://reviews.llvm.org/D80406
After D79300, we don't rewrite InputFile::mb to an empty buffer.
In thinLTOCreateEmptyIndexFiles(), we should check LazyObjFile::fetched
as well as checking whether mb is a bitcode, otherwise we would overwrite (path + .thinlto.bc) with an empty index.
Fixes PR45594.
In `ObjFile<ELFT>::initializeSymbols()`, a defined symbol relative to
a discarded section (due to section group rules) may have been
inserted as a lazy symbol. We need to demote it to an Undefined so that
the `discarded section` error is reported in a later pass.
Add `LazyObjFile::fetched` (if true) and `ArchiveFile::parsed` (if
false) to represent that there is an ongoing lazy symbol fetch and we
should replace the current lazy symbol with an Undefined, instead of
calling `Symbol::resolve` (`Symbol::resolve` should be called if the lazy
symbol was added by an unrelated archive/lazy object).
As a side result, one small issue in start-lib-comdat.s is now fixed.
The hack motivating D51892 will be unsupported: if
`.gnu.linkonce.t.__i686.get_pc_thunk.bx` in an archive is referenced
by another section, this will likely be reported as an error unless the function is
also defined in a regular object file.
(Bringing back rL330869 would error `undefined symbol` instead of the
more relevant `discarded section`.)
Note, glibc i386's crti.o still works (PR31215), because
`.gnu.linkonce.t.__x86.get_pc_thunk.bx` is in crti.o (one of the first
regular object files in a linker command line).
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D79300
If both a.a and b.so define foo
```
ld.bfd -u foo a.a b.so # foo is defined
ld.bfd a.a b.so -u foo # foo is defined
ld.bfd -u foo b.so a.a # foo is undefined (provided at runtime by b.so)
ld.bfd b.so a.a -u foo # foo is undefined (provided at runtime by b.so)
```
In all cases we make foo undefined in the output. I tend to think the
GNU ld behavior makes more sense.
* In their model, they have to treat -u as a fake object file with an
undefined symbol before all input files, otherwise the first archive would not be fetched.
* Following their behavior allows us to drop a --warn-backrefs special case.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D81052
--no-allow-shlib-undefined (enabled by default when linking an
executable) rejects unresolved references in shared objects.
Users may be confused by the common diagnostics of unresolved symbols in
object files (LLD: "undefined symbol: foo"; GNU ld/gold: "undefined reference to").
Learn from GCC/clang " [-Wfoo]": append the option name to the
diagnostics. Users can find relevant information by searching
"--no-allow-shlib-undefined". It should also be obvious to them that
the positive form --allow-shlib-undefined can suppress the error.
Also downgrade the error to a warning if --noinhibit-exec is used (compatible
with GNU ld and gold).
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D81028
The MIPS 64-bit ABI does not provide a special PC-relative relocation like
R_MIPS_PC32 in the 32-bit case. But we can use a "chain of relocations"
defined by the N64 ABI. In that case one relocation record may contain up
to three relocations, which are applied sequentially. The width of the final relocation
mask applied to the result depends on the last relocation
in the chain. For a 64-bit PC-relative relocation we need the following
chain: `R_MIPS_PC32 | R_MIPS_64`. The first relocation calculates an
offset, but does not truncate the result. The second relocation just
applies the calculated result as a 64-bit value.
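The width rule can be illustrated with plain integer arithmetic. This sketch models only the final masking, not the record encoding or the way intermediate results feed the next relocation; `pcRelative` and `maskToWidth` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// First step of the chain: S + A - P, computed at full 64-bit width.
uint64_t pcRelative(uint64_t s, int64_t a, uint64_t p) {
  return s + static_cast<uint64_t>(a) - p;
}

// The last relocation in the chain decides how many bits are kept.
uint64_t maskToWidth(uint64_t value, unsigned bits) {
  return bits >= 64 ? value : value & ((uint64_t{1} << bits) - 1);
}
```

For a backward reference, keeping 64 bits preserves the sign extension that a 32-bit mask would discard.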
The 64-bit PC-relative relocation can be useful when generating
`.eh_frame` sections, to avoid passing `-Wl,-z,notext` to the linker.
Differential Revision: https://reviews.llvm.org/D80390
GNU ld from binutils 2.35 onwards will likely support
--export-dynamic-symbol but with different semantics.
https://sourceware.org/pipermail/binutils/2020-May/111302.html
Differences:
1. -export-dynamic-symbol is not supported
2. --export-dynamic-symbol takes a glob argument
3. --export-dynamic-symbol can suppress binding the references to the definition within the shared object if (-Bsymbolic or -Bsymbolic-functions)
4. --export-dynamic-symbol does not imply -u
I don't think the first three points can affect any user.
For the fourth point, not implying -u can leave some archive members unfetched.
Add -u foo to restore the previous behavior.
Exact semantics:
* -no-pie or -pie: matched non-local defined symbols will be added to the dynamic symbol table.
* -shared: matched non-local STV_DEFAULT symbols will not be bound to definitions within the shared object
even if they would otherwise be due to -Bsymbolic, -Bsymbolic-functions, or --dynamic-list.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D80487
LLD supports both REL and RELA for static relocations, but emits either
REL or RELA for dynamic relocations. The relocation entry format is
specified by each psABI.
musl ld.so supports both REL and RELA. For such ld.so implementations,
REL (.rel.dyn .rel.plt) has size benefits even if the psABI chooses RELA:
sizeof(Elf64_Rel)=16 < sizeof(Elf64_Rela)=24.
* COPY, GLOB_DAT and J[U]MP_SLOT always have 0 addend. A ld.so
implementation does not need to read the implicit addend.
REL is strictly better.
* A RELATIVE has a non-zero addend. Such relocations can be packed
compactly with the RELR relocation entry format, which is out of scope
of this patch.
* For other dynamic relocation types (e.g. symbolic relocation R_X86_64_64),
a ld.so implementation needs to read the implicit addend. REL may have
minor performance impact, because reading implicit addends forces
random access reads instead of being able to blast out a bunch of
writes while chasing the relocation array.
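The size difference can be checked with two structs that mirror the 64-bit entry layouts. `Rel64`, `Rela64` and `bytesSaved` are hypothetical names for this sketch; real code would use the definitions from an ELF header.

```cpp
#include <cassert>
#include <cstdint>

// Layouts mirroring the 64-bit REL and RELA relocation entries.
struct Rel64 {
  uint64_t r_offset;
  uint64_t r_info;
};

struct Rela64 {
  uint64_t r_offset;
  uint64_t r_info;
  int64_t r_addend; // explicit addend, absent in REL
};

static_assert(sizeof(Rel64) == 16, "REL entry is 16 bytes");
static_assert(sizeof(Rela64) == 24, "RELA entry is 24 bytes");

// Bytes saved by emitting `count` dynamic relocations as REL instead of RELA.
constexpr uint64_t bytesSaved(uint64_t count) {
  return count * (sizeof(Rela64) - sizeof(Rel64));
}
```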
This patch adds -z rel and -z rela to change the relocation entry format
for dynamic relocations. I have tested that a -z rel produced x86-64
executable works with musl ld.so.
-z rela may be useful for debugging purposes on processors whose psABIs
specify REL as the canonical format: addends can be easily read by a tool.
Reviewed By: grimar, mcgrathr
Differential Revision: https://reviews.llvm.org/D80496
Summary:
Count the per-module number of basic blocks when the module summary is computed
and sum them up during Thin LTO indexing.
This is used to estimate the working set size under the partial sample PGO.
This is split off of D79831.
Reviewers: davidxl, espindola
Subscribers: emaste, inglorion, hiraditya, MaskRay, steven_wu, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80403
In D34993, we discussed and concluded that we should drop the `__real_`
symbol from the symbol table, but I did the opposite in D50569.
This patch drops the `__real_` symbol.
MaskRay's note: omitting `__real_` is important if it is undefined:
otherwise a subsequent link may error due to the undefined `__real_` in .dynsym
Differential Revision: https://reviews.llvm.org/D51283
Bazel-created interface shared objects (.ifso) may be misaligned. We use
llvm::support::detail::packed_endian_specific_integral under the hood,
which allows reading of misaligned values, so there is no need to
diagnose (in LLD we don't intend to support sophisticated parsing for
SHT_GNU_*).
In the 64-bit ELF V2 ABI Specification: Power Architecture, 2.3.3.1. GPR
Save and Restore Functions defines some special functions which may be
referenced by GCC-produced assembly (LLVM does not reference them).
With GCC -Os, when the number of call-saved registers exceeds a certain
threshold, GCC generates `_savegpr0_* _restgpr0_*` calls and expects the
linker to define them. See
https://sourceware.org/pipermail/binutils/2002-February/017444.html and
https://sourceware.org/pipermail/binutils/2004-August/036765.html . This
is weird because libgcc.a would be the natural place. However, the linker
generation approach has the advantage that the linker can generate
multiple copies to avoid long branch thunks. We don't consider the
advantage significant enough to complicate our trunk implementation, so
we take a simple approach.
* Check whether `_savegpr0_{14..31}` are used
* If yes, define needed symbols and add an InputSection with the code sequence.
`_savegpr1_*` `_restgpr0_*` and `_restgpr1_*` are similar.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D79977
An undefined symbol in a shared object can be versioned, like `f@v1`.
We currently insert `f` as an Undefined into the symbol table, but we
should insert `f@v1` instead.
The string `v1` is inferred from SHT_GNU_versym and SHT_GNU_verneed.
This patch implements the functionality.
Failing to do this can cause two issues:
* If a versioned symbol referenced by a shared object is defined in the
executable, we will fail to export it.
* If a versioned symbol referenced by a shared object is defined in another object
file, --no-allow-shlib-undefined may spuriously report an
"undefined reference to " error. See https://bugs.llvm.org/show_bug.cgi?id=44842
(Linking -lfftw3 -lm on Arch Linux can cause
`undefined reference to __log_finite`)
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D80059
Note, we still name a preempted SharedSymbol "shared definition",
instead of "reference" as printed by GNU ld. This difference should not matter.
```
// GNU ld
ld.bfd: t: definition of f@v1
ld.bfd: t.so: reference to f@v1
```
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D80143
This fixes a ThinLTO module collision issue for thin archives. The problem is that we always use a zero offset to name members in a thin archive, which causes the following build error:
ld.lld: error: Expected at most one ThinLTO module per bitcode file
This happens for a thin archive that has two members with the same object file name (their paths are ignored by the ThinLTO driver).
The fix is to use the real member offset instead, as is done for non-thin archives.
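A minimal sketch of why the offset matters. The identifier format and the helper names below are hypothetical and only illustrate the collision; they are not claimed to match lld's actual naming code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Keep only the file name, mimicking how member paths are ignored.
std::string baseName(const std::string &path) {
  std::string::size_type slash = path.find_last_of('/');
  return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Hypothetical identifier format: "archive(member at offset)".
std::string moduleId(const std::string &archive, const std::string &memberPath,
                     uint64_t offsetInArchive) {
  return archive + "(" + baseName(memberPath) + " at " +
         std::to_string(offsetInArchive) + ")";
}
```

With a zero offset for every member, `a/x.o` and `b/x.o` in the same archive produce the same identifier; with their real offsets the identifiers differ.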
Differential Revision: https://reviews.llvm.org/D79880
Announced on https://lists.llvm.org/pipermail/llvm-dev/2020-May/141416.html
Similar to D79371, but for `multiclass B` (convenience helper for defining --foo and --no-foo)
Some changed options are also used by gold, but I haven't seen their
one-dash use cases outside of lld's testsuite.
Both the .ARM.exidx and .eh_frame sections have a custom SyntheticSection
that acts as a container for the InputSections. The InputSections are added
to the SyntheticSection prior to /DISCARD/ which limits the effect a
/DISCARD/ can have to the whole SyntheticSection. In the majority of cases
this is sufficient as it is not common to discard subsets of the
InputSections. The Linux kernel has one of these scripts which has something
like:
/DISCARD/ : { *(.ARM.exidx.exit.text) *(.ARM.extab.exit.text) ... }
The .ARM.exidx.exit.text are not discarded because the InputSection has been
transferred to the Synthetic Section. The *(.ARM.extab.exit.text) sections
have not so they are discarded. When we come to write out the .ARM.exidx
sections the dangling references from .ARM.exidx.exit.text to
.ARM.extab.exit.text currently cause relocation out of range errors, but
could as easily cause a fatal error message if we check for dangling
references at relocation time.
This patch attempts to respect the /DISCARD/ command by running it on the
.ARM.exidx InputSections stored in the SyntheticSection.
The .eh_frame is in theory affected by this problem, but I don't think that
there is a dangling reference problem that can happen with these sections.
Fixes remaining part of pr44824
Differential Revision: https://reviews.llvm.org/D79687
For sampleFDO, because the optimized build uses profile generated from previous
release, often we couldn't tell a function without profile was truly cold or
just newly created so we had to treat them conservatively and put them in .text
section instead of .text.unlikely. The result was when we pursue the best
performance by locking .text.hot and .text in memory, we wasted a lot of memory
to keep cold functions inside. This problem has been largely solved for regular
sampleFDO using profile-symbol-list (https://reviews.llvm.org/D66374), but for
the case when we use partial profile, we still waste a lot of memory because
of it.
In https://reviews.llvm.org/D62540, we propose to save functions with unknown
hotness information in a special section called ".text.unknown", so that
compiler will treat those functions as lukewarm, but runtime can choose not
to mlock the special section in memory or use other strategy to save memory.
That will solve most of the memory problem even if we use a partial profile.
The patch adds the support in lld for the special section.
Differential Revision: https://reviews.llvm.org/D79590
A linker will create .ARM.exidx sections for InputSections that don't
have them. This can cause a relocation out of range error if the
InputSection happens to be extremely far away from the other sections.
This is often the case for the vector table on older ARM CPUs as the only
two places that the table can be placed are 0 or 0xffff0000. We fix this
by removing InputSections that need a linker generated .ARM.exidx
section if that would cause an error.
Differential Revision: https://reviews.llvm.org/D79289
Summary:
Unless the user requested an output object (--lto-obj-path), an unused
empty combined module is not emitted.
This change is helpful for some targets (e.g. RISC-V) which encode the
ABI info in IR module flags (target-abi). An empty unused module has no ABI
info, so the linker would report an error when merging incompatible
ABIs.
Reviewers: tejohnson, espindola, MaskRay
Subscribers: emaste, inglorion, arichardson, hiraditya, simoncook, MaskRay, steven_wu, dexonsmith, PkmX, dang, lenary, s.egerton, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78988
Sections with the SHF_LINK_ORDER flag must be ordered in the same relative
order as the Sections they have a link to. When using a linker script an
arbitrary expression may be used for the virtual address of the
OutputSection. In some cases the virtual address does not monotonically
increase as the OutputSection index increases, so if we base the ordering
of the SHF_LINK_ORDER sections on the index then we can get the order
wrong. We fix this by moving SHF_LINK_ORDER resolution till after we have
created OutputSection virtual addresses.
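A minimal sketch of the idea, using hypothetical stand-in structs rather than lld's section classes. It assumes addresses have already been assigned, and shows a case where ordering by output section index and ordering by address disagree.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for lld's section classes.
struct LinkedSection {
  std::string name;
  unsigned outputIndex; // position of the OutputSection in the script
  uint64_t address;     // virtual address, known only after assignment
};
struct LinkOrderSection {
  std::string name;
  const LinkedSection *link; // the section this SHF_LINK_ORDER section links to
};

// Order by the address of the linked-to section, not by its index.
void sortByLinkedAddress(std::vector<LinkOrderSection> &secs) {
  std::stable_sort(secs.begin(), secs.end(),
                   [](const LinkOrderSection &a, const LinkOrderSection &b) {
                     assert(a.link && b.link);
                     return a.link->address < b.link->address;
                   });
}
```

If `.text.high` has index 1 but address 0x8000 and `.text.low` has index 2 but address 0x1000, sorting by index would put the metadata for `.text.high` first, which is the wrong order.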
Differential Revision: https://reviews.llvm.org/D79286
Summary:
Lld test ELF/linkerscript/thunk-gen-mips.s was accidentally disabled due
to the use of wrong FileCheck directives. As a result the test seems to
have bitrotted: it fails once the directives are fixed. To ease
updates to the test when the __start address changes, the checks
have been changed to use numeric variables to express all the addresses
based on the __start address.
Reviewed By: atanasyan
Differential Revision: https://reviews.llvm.org/D79270
Lld test ELF/linkerscript/input-archive.s fails when the path contains a @
because @ is not accepted in an unquoted token in linker scripts, which leads
to the path being broken in 2 around the @. This commit quotes the path
used in the linker script created by this and similar testcases, allowing
the test to pass even in the presence of an @ sign in the path.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D79103
The current implementation assumes that R_PPC64_TOC16_HA is always followed
by R_PPC64_TOC16_LO_DS. This can break with R_PPC64_TOC16_LO:
// Load the address of the TOC entry, instead of the value stored at that address
addis 3, 2, .LC0@toc@ha # R_PPC64_TOC16_HA
addi 3, 3, .LC0@toc@l # R_PPC64_TOC16_LO
blr
which is used by boringssl's util/fipstools/delocate/delocate.go
https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation.
In short, this tool converts an assembly file to avoid any potential relocations.
The distance to an input .toc is not a constant after linking, so it cannot use an `addis;ld` pair.
Instead, it jumps to a stub which loads the TOC entry address with `addis;addi`.
This patch checks the presence of R_PPC64_TOC16_LO and suppresses
toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen.
This approach is conservative and loses some relaxation opportunities but is easy to implement.
addis 3, 2, .LC0@toc@ha # no relaxation
addi 3, 3, .LC0@toc@l # no relaxation
li 9, 0
addis 4, 2, .LC0@toc@ha # can relax but suppressed
ld 4, .LC0@toc@l(4) # can relax but suppressed
Also note that interleaved R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS is
possible and this patch accounts for that.
addis 3, 2, .LC1@toc@ha # can relax
addis 4, 2, .LC2@toc@ha # can relax
ld 3, .LC1@toc@l(3) # can relax
ld 4, .LC2@toc@l(4) # can relax
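The suppression rule can be sketched as a two-pass scan. The types below are hypothetical, and tracking the rule per symbol name is an assumption made for illustration; the message above only says that relaxation is suppressed once R_PPC64_TOC16_LO is seen.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class RelType { TOC16_HA, TOC16_LO, TOC16_LO_DS };

// Hypothetical relocation record: type plus the referenced .toc symbol.
struct TocReloc {
  RelType type;
  std::string sym;
};

// Pass 1: remember every symbol that has an R_PPC64_TOC16_LO (addi) use.
std::set<std::string> collectNoRelax(const std::vector<TocReloc> &relocs) {
  std::set<std::string> noRelax;
  for (const TocReloc &r : relocs)
    if (r.type == RelType::TOC16_LO)
      noRelax.insert(r.sym);
  return noRelax;
}

// Pass 2: a relocation may be relaxed only if its symbol was never seen
// with R_PPC64_TOC16_LO.
bool mayRelax(const TocReloc &r, const std::set<std::string> &noRelax) {
  return r.type != RelType::TOC16_LO && noRelax.count(r.sym) == 0;
}
```

Applied to the examples above, the `addis;ld` pair on `.LC0` is suppressed because `.LC0` also has an `addi` use, while the interleaved pairs on `.LC1` and `.LC2` stay relaxable.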
Reviewed By: #powerpc, sfertile
Differential Revision: https://reviews.llvm.org/D78431
gold has an option --print-symbol-counts= which prints:
// For each archive
archive $archive $members $fetched_members
// For each object file
symbols $object $defined_symbols $used_defined_symbols
In most cases, `$defined_symbols = $used_defined_symbols` unless weak
symbols are present. Strangely `$used_defined_symbols` includes symbols defined relative to --gc-sections discarded sections.
The `symbols` lines do not appear to be useful.
`archive` lines are useful: `$fetched_members=0` lines correspond to
unused archives. The information can be used to trim dependencies.
This patch implements --print-archive-stats= which prints the number of
members and the number of fetched members for each archive.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D78983
--gdb-index currently crashes when reading a translation unit with
DWARF v5 .debug_loclists . Call stack:
```
SyntheticSections.cpp GdbIndexSection::create
SyntheticSections.cpp readAddressAreas
DWARFUnit.cpp DWARFUnit::tryExtractDIEsIfNeeded
DWARFListTable.cpp DWARFListTableHeader::extract
...
DWARFDataExtractor.cpp DWARFDataExtractor::getRelocatedValue
lld/ELF/DWARF.cpp LLDDwarfObj<ELFT>::find (sec.sec is nullptr)
...
```
This patch adds support for .debug_loclists to make `DWARFUnit::tryExtractDIEsIfNeeded` happy.
Building --gdb-index does not need .debug_loclists.
Reviewed By: dblaikie, grimar
Differential Revision: https://reviews.llvm.org/D79061
This reverts commit 03ffe58605.
Full title of reverted commit is:
[ELF][PPC64] Don't perform toc-indirect to toc-relative relaxation for
R_PPC64_TOC16_HA not followed by R_PPC64_TOC16_LO_DS
Breaks the multistage lld PowerPC buildbot.
The current implementation assumes that R_PPC64_TOC16_HA is always followed
by R_PPC64_TOC16_LO_DS. This can break with:
// Load the address of the TOC entry, instead of the value stored at that address
addis 3, 2, .LC0@toc@ha # R_PPC64_TOC16_HA
addi 3, 3, .LC0@toc@l # R_PPC64_TOC16_LO
blr
which is used by boringssl's util/fipstools/delocate/delocate.go
https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation.
In short, this tool converts an assembly file to avoid any potential relocations.
The distance to an input .toc is not a constant after linking, so the assembly cannot use an `addis;ld` pair.
Instead, delocate changes the code to jump to a stub (`addis;addi`) which loads the TOC entry address.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D78431
GNU tools generate mapping symbols "$d" for .ARM.exidx sections. The
symbols are added to the symbol table much earlier than the merging
takes place, and after that, they become dangling. Before the patch,
LLD output those symbols as SHN_ABS with the value of 0. The patch
removes such symbols from the symbol table.
Differential Revision: https://reviews.llvm.org/D78820
Summary: The switch --plugin-opt=emit-asm can be used with the gold linker to dump the final assembly code generated by LTO in a user-friendly way. Unfortunately it doesn't work with lld. I'm hooking it up with lld. With that switch, lld emits assembly code into the output file (specified by -o) and if there are multiple input files, each of their assembly code will be emitted into a separate file named by suffixing the output file name with a unique number, respectively. The linking then stops after generating those assembly files.
Reviewers: espindola, wenlei, tejohnson, MaskRay, grimar
Reviewed By: tejohnson, MaskRay, grimar
Subscribers: pcc, emaste, inglorion, arichardson, hiraditya, MaskRay, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77231
When discarding local symbols with --discard-all or --discard-locals,
the ones which are used in relocations should be preserved. LLD used
the simplest approach and just ignored those switches when -r or
--emit-relocs was specified.
The patch implements handling the --discard-* switches for the cases
when relocations are kept by identifying used local symbols and allowing
removing only unused ones. This makes the behavior of LLD compatible
with GNU linkers.
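A minimal sketch of the approach, with hypothetical types. It assumes relocations are given as indices into the local symbol list, and that --discard-locals only targets temporary locals whose names start with `.L` while --discard-all targets every local.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct LocalSymbol {
  std::string name;
  bool used; // set when a retained relocation refers to the symbol
};

// Mark every local symbol that a retained relocation refers to.
void markUsedLocals(std::vector<LocalSymbol> &syms,
                    const std::vector<std::size_t> &relocSymIndices) {
  for (std::size_t idx : relocSymIndices) {
    assert(idx < syms.size());
    syms[idx].used = true;
  }
}

// Return the names that survive. Used symbols are always kept.
std::vector<std::string> keptLocals(const std::vector<LocalSymbol> &syms,
                                    bool discardAll) {
  std::vector<std::string> kept;
  for (const LocalSymbol &s : syms) {
    bool isTemporary = s.name.compare(0, 2, ".L") == 0;
    bool discardable = discardAll || isTemporary;
    if (s.used || !discardable)
      kept.push_back(s.name);
  }
  return kept;
}
```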
Differential Revision: https://reviews.llvm.org/D77807
Fixed error detected by msan. The size field of the .ARM.exidx synthetic
section needs to be initialized to at least an estimated value before
calling assignAddresses as that will use the size field.
This was previously reverted in 1ca16fc4f5.
Differential Revision: https://reviews.llvm.org/D78422
This reverts commit f969c2aa65.
There are some msan buildbot failures on sanitizer-x86_64-linux-fast that
I need to investigate.
Differential Revision: https://reviews.llvm.org/D78422
The contents of the .ARM.exidx section must be ordered by SHF_LINK_ORDER
rules. We don't need to know the precise address for this order, but we
do need to know the relative order of sections. We have been using the
sectionIndex for this purpose, this works when the OutputSection order
has a monotonically increasing virtual address, but it is possible to
write a linker script with non-monotonically increasing virtual address.
For these cases we need to evaluate the base address of the OutputSection
so that we can order the .ARM.exidx sections properly.
This change moves the finalisation of .ARM.exidx till after the first
call to AssignAddresses. This permits us to sort on virtual address which
is linker script safe. It also permits a fix for part of pr44824 where
we generate .ARM.exidx section for the vector table when that table is so
far away it is out of range of the .ARM.exidx section. This fix will come
in a follow up patch.
Differential Revision: https://reviews.llvm.org/D78422
Time profiler emits relative timestamps for events (the number of
microseconds passed since the start of the current process).
This patch allows combining events from different processes while
preserving their relative timing by emitting a new attribute
"beginningOfTime". This attribute contains the system time that
corresponds to the zero timestamp of the time profiler.
This has at least two use cases:
- Build systems can use this to merge time traces from multiple compiler
invocations and generate statistics for the whole build. Tools like
ClangBuildAnalyzer could also leverage this feature.
- Compilers that use LLVM as their backend by invoking llc/opt in
a child process. If such a compiler supports generating time traces
of its own events, it could merge those events with LLVM-specific
events received from llc/opt, and produce a more complete time trace.
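A minimal sketch of how a consumer might merge traces using this attribute. The `Event` and `Trace` structs are hypothetical, and the sketch assumes "beginningOfTime" is expressed in the same unit as the event timestamps (microseconds).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Event {
  std::string name;
  int64_t ts; // microseconds since the start of the emitting process
};

struct Trace {
  int64_t beginningOfTime; // system time of ts == 0 (assumed microseconds)
  std::vector<Event> events;
};

// Rebase every event onto the earliest beginningOfTime and sort by time.
Trace mergeTraces(const std::vector<Trace> &traces) {
  assert(!traces.empty());
  Trace merged{traces.front().beginningOfTime, {}};
  for (const Trace &t : traces)
    merged.beginningOfTime = std::min(merged.beginningOfTime, t.beginningOfTime);
  for (const Trace &t : traces) {
    int64_t shift = t.beginningOfTime - merged.beginningOfTime;
    for (const Event &e : t.events)
      merged.events.push_back(Event{e.name, e.ts + shift});
  }
  std::stable_sort(merged.events.begin(), merged.events.end(),
                   [](const Event &a, const Event &b) { return a.ts < b.ts; });
  return merged;
}
```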
A proof-of-concept script that merges multiple logs that
contain a synchronization point into one log:
https://github.com/broadwaylamb/merge_trace_events
Differential Revision: https://reviews.llvm.org/D78030
For a relative path in INPUT() or GROUP(), this patch changes the search order by adding the directory of the current linker script.
The new search order (consistent with GNU ld >= 2.35 regarding the new test `test/ELF/input-relative.s`):
1. the directory of the current linker script (GNU ld from Binutils 2.35 onwards; https://sourceware.org/bugzilla/show_bug.cgi?id=25806)
2. the current working directory
3. library paths (-L)
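The search order above can be sketched as a function that lists candidate paths in the order they would be tried. The helper names are hypothetical and the sketch only handles '/'-separated relative names; it does not check whether the files exist.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Directory part of a path, or "." if there is none.
std::string dirName(const std::string &path) {
  std::string::size_type slash = path.find_last_of('/');
  if (slash == std::string::npos)
    return ".";
  return slash == 0 ? std::string("/") : path.substr(0, slash);
}

std::string joinPath(const std::string &dir, const std::string &name) {
  if (dir.empty() || dir.back() == '/')
    return dir + name;
  return dir + "/" + name;
}

// Candidates for a relative name found in INPUT() or GROUP(), in search order.
std::vector<std::string>
candidatePaths(const std::string &scriptPath, const std::string &name,
               const std::vector<std::string> &libraryPaths) {
  assert(!name.empty() && name[0] != '/');
  std::vector<std::string> out;
  out.push_back(joinPath(dirName(scriptPath), name)); // 1. script directory
  out.push_back(name);                                // 2. working directory
  for (const std::string &dir : libraryPaths)         // 3. -L paths
    out.push_back(joinPath(dir, name));
  return out;
}
```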
This behavior makes it convenient to replace a .so or .a with a linker script with additional input. For example, glibc
```
% cat /usr/lib/x86_64-linux-gnu/libm.a
/* GNU ld script
*/
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /usr/lib/x86_64-linux-gnu/libm-2.29.a /usr/lib/x86_64-linux-gnu/libmvec.a )
```
could be simplified as `GROUP(libm-2.29.a libmvec.a)`.
Another example is to make libc++.a a linker script:
```
INPUT(libc++.a.1 libc++abi.a)
```
Note, -l is not affected.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D77779
After D78301 MC no longer emits a relocation for this case. Change to use
.inst and .reloc to synthesize the same instruction and relocation. One
more test case I missed.
If there is no SHF_TLS section, there will be no PT_TLS and Out::tlsPhdr may be a nullptr.
If the symbol referenced by an R_TLS is lazy, we should treat the symbol as undefined.
Also reorganize tls-in-archive.s and tls-weak-undef.s . They do not test what they intended to test.
This fixes a bug as exposed by D77807.
Add tests for {--emit-relocs,-r} x {--discard-locals,--discard-all}. They add coverage for previously undertested cases:
* STT_SECTION associated to GCed sections (`gc`)
* STT_SECTION associated to retained sections (`text`)
* STT_SECTION associated to non-SHF_ALLOC sections (`.comment`)
* STB_LOCAL in GCed sections (`unused_gc`)
Reviewed By: grimar, ikudrin
Differential Revision: https://reviews.llvm.org/D78389
D13550 added the diagnostic to address/work around a crash.
The rule was refined by D19836 (test/ELF/tls-archive.s) to exclude Lazy symbols.
https://bugs.llvm.org/show_bug.cgi?id=45598 reported another case where the current logic has a false positive:
Bitcode does not record undefined module-level inline assembly symbols
(`IRSymtab.cpp:Builder::addSymbol`). Such an undefined symbol does not
have the FB_tls bit and lld will not consider it STT_TLS. When the symbol is
later replaced by a STT_TLS Defined, lld will error "TLS attribute mismatch".
This patch fixes this false positive by allowing a STT_NOTYPE undefined
symbol to be replaced by a STT_TLS.
Considered alternative:
Moving the diagnostics to scanRelocs() can improve the diagnostics (PR36049)
but that requires a fair amount of refactoring. We will need more
RelExpr members. Whether it is worthwhile requires more thought.
See `test/ELF/tls-mismatch.s` for behavior differences. We will fail to
diagnose a likely runtime bug (STT_NOTYPE non-TLS relocation referencing
a TLS definition). This is probably acceptable because compiler
generated code sets symbol types properly.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D78438
D77522 changed --warn-backrefs to not warn for linking sandwich
problems (-ldef1 -lref -ldef2). This removed lots of false positives.
However, glibc still has some problems. libc.a defines some symbols
which are normally in libm.a and libpthread.a, e.g. __isnanl/raise.
For a linking order `-lm -lpthread -lc`, I have seen:
```
// different resolutions: GNU ld/gold select libc.a(s_isnan.o) as the definition
backward reference detected: __isnanl in libc.a(printf_fp.o) refers to libm.a(m_isnanl.o)
// different resolutions: GNU ld/gold select libc.a(raise.o) as the definition
backward reference detected: raise in libc.a(abort.o) refers to libpthread.a(pt-raise.o)
```
To facilitate deployment of --warn-backrefs, add --warn-backrefs-exclude= so that
certain known issues (which may be impractical to fix) can be whitelisted.
Deliberate choices:
* Not a comma-separated list (`--warn-backrefs-exclude=liba.a,libb.a`).
-Wl, splits the argument at commas, so we cannot use commas.
--export-dynamic-symbol is similar.
* Not in the style of `--warn-backrefs='*' --warn-backrefs=-liba.a`.
We just need exclusion, not inclusion. For easier build system
integration, we should avoid order dependency. With the current
scheme, we enable --warn-backrefs, and individual libraries can add
--warn-backrefs-exclude=<glob> to their LDFLAGS.
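A minimal sketch of order-independent exclusion. The matcher below is a simplified glob supporting only `*` and `?`, not lld's real glob implementation, and which file name gets matched against the patterns is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified glob: '*' matches any run of characters, '?' matches one.
bool globMatch(const std::string &pat, const std::string &s) {
  const std::string::size_type npos = std::string::npos;
  std::string::size_type p = 0, i = 0, star = npos, mark = 0;
  while (i < s.size()) {
    if (p < pat.size() && pat[p] == '*') {
      star = p++; // remember the star and try matching zero characters
      mark = i;
    } else if (p < pat.size() && (pat[p] == '?' || pat[p] == s[i])) {
      ++p;
      ++i;
    } else if (star != npos) {
      p = star + 1; // backtrack: let the last star absorb one more character
      i = ++mark;
    } else {
      return false;
    }
  }
  while (p < pat.size() && pat[p] == '*')
    ++p;
  return p == pat.size();
}

// A backward reference is reported unless some exclude glob matches the file.
bool shouldWarnBackref(const std::string &file,
                       const std::vector<std::string> &excludes) {
  for (const std::string &pat : excludes)
    if (globMatch(pat, file))
      return false;
  return true;
}
```

Because each pattern only removes warnings, the result does not depend on the order in which the exclude options appear on the command line.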
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D77512
See http://lists.llvm.org/pipermail/llvm-dev/2020-April/140549.html
For the record, GNU ld changed to 64k max page size in 2014
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=7572ca8989ead4c3425a1500bc241eaaeffa2c89
"[RFC] ld/ARM: Increase maximum page size to 64kB"
Android driver forced 4k page size in AArch64 (D55029) and ARM (D77746).
A binary linked with max-page-size=4096 does not run on a system with a
higher page size configured. There are some systems out there that do
this and it leads to the binary getting `Killed!` by the kernel.
In the non-linker-script cases, when linked with -z noseparate-code
(default), the max-page-size increase should not cause any size
difference. There may be some VMA usage differences, though.
Reviewed By: psmith, MaskRay
Differential Revision: https://reviews.llvm.org/D77330
Implemented a bunch of relocations found in binaries with medium/large code model and the Local-Exec TLS model. The binaries link and run fine in Qemu.
In addition, the emulation `elf64_sparc` is now recognized.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D77672
GCC collect2 passes several options to the linker even if LTO is not used
(note, lld does not support GCC LTO). The lto-wrapper may be a relative
path (especially during development, when gcc is in a build directory), e.g.
-plugin-opt=relative/path/to/lto-wrapper
We need to ignore such options, which are currently interpreted by
cl::ParseCommandLineOptions() and will fail with `error: --plugin-opt: ld.lld: Unknown command line argument 'relative/path/to/lto-wrapper'`
because the path is apparently not an option registered by an `llvm::cl::opt`.
See lto-plugin-ignore.s for how we interpret various -plugin-opt= options now.
Reviewed By: grimar, tejohnson
Differential Revision: https://reviews.llvm.org/D78158
This patch changes the reproduce tests so that they no longer extract
the "long" paths of the generated reproduce tar archives. This
extraction prevented them from being run on Windows due to potential
issues relating to the Windows path length limit.
This patch also reduces the use of diff in these tests, as this was
raised as a performance concern in review D77659 and deemed unnecessary.
Differential Revision: https://reviews.llvm.org/D77750
The R_ARM_ALU_PC_G0 and R_ARM_LDR_PC_G0 relocations are used by the
ADR and LDR pseudo instructions, and are the basis of the group
relocations that can load an arbitrary constant via a series of add, sub
and ldr instructions.
The relocations need to be obtained via the .reloc directive.
R_ARM_ALU_PC_G0 is much more complicated as the add/sub instruction uses
a modified immediate encoding: an 8-bit immediate rotated right by an
even amount, which is twice the value of a 4-bit field. This means that
the range of representable immediates is sparse. We extract the encoding
and decoding functions for the modified immediate from
llvm/lib/Target/ARM/MCTargetDesc/ARMAddressingModes.h as
this header file is not accessible from LLD. Duplication of code isn't
ideal, but as these are well-defined mathematical functions they are
unlikely to change.
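For reference, a minimal sketch of the modified immediate arithmetic. These functions are written from the definition of the encoding (a 12-bit field holding a 4-bit rotation and an 8-bit immediate) and are not copies of the helpers in ARMAddressingModes.h; the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

inline uint32_t rotr32(uint32_t v, unsigned n) {
  n &= 31;
  return n ? (v >> n) | (v << (32 - n)) : v;
}

inline uint32_t rotl32(uint32_t v, unsigned n) {
  n &= 31;
  return n ? (v << n) | (v >> (32 - n)) : v;
}

// Returns the 12-bit field (rot << 8) | imm8 with the smallest rotation,
// or -1 if val cannot be written as imm8 rotated right by 2 * rot.
int encodeModImm(uint32_t val) {
  for (unsigned rot = 0; rot < 16; ++rot) {
    uint32_t imm8 = rotl32(val, 2 * rot); // undo the right rotation
    if (imm8 <= 0xff)
      return static_cast<int>((rot << 8) | imm8);
  }
  return -1;
}

// Expands a 12-bit field back into the 32-bit constant it represents.
uint32_t decodeModImm(uint32_t field) {
  assert(field <= 0xfff);
  return rotr32(field & 0xff, 2 * (field >> 8));
}
```

The sparseness shows up quickly: 0x104 is encodable, but 0x102 is not because it would need an odd rotation, and 0x101 is not because its set bits span more than 8 bits.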
Differential Revision: https://reviews.llvm.org/D75349
This is an alternative design to D77512.
D45195 added --warn-backrefs to detect
* A. certain input orders which GNU ld either errors ("undefined reference")
or has different resolution semantics
* B. (byproduct) some latent multiple definition problems (-ldef1 -lref -ldef2) which I
call "linking sandwich problems". def2 may or may not be the same as def1.
When an archive appears more than once (-ldef -lref -ldef), lld and GNU
ld may have the same resolution but --warn-backrefs may warn. This is
not uncommon. For example, currently lld itself has such a problem:
```
liblldCommon.a liblldCOFF.a ... liblldCommon.a
_ZN3lld10DWARFCache13getDILineInfoEmm in liblldCOFF.a refers to liblldCommon.a(DWARF.cpp.o)
libLLVMSupport.a also appears twice and has a similar warning
```
glibc has such problems. This is somewhat inevitable because of its separate
libc/libpthread/... and arbitrary grouping. The situation is
improving over time but I have seen:
```
-lc __isnanl references -lm
-lc _IO_funlockfile references -lpthread
```
There are also various issues in interaction with other runtime
libraries such as libgcc_eh and libunwind:
```
-lc __gcc_personality_v0 references -lgcc_eh
-lpthread __gcc_personality_v0 references -lgcc_eh
-lpthread _Unwind_GetCFA references -lunwind
```
These problems are actually benign. We want --warn-backrefs to focus on
its main task A and defer task B (which is also useful) to a more
specific future feature (see gold --detect-odr-violations and
https://bugs.llvm.org/show_bug.cgi?id=43110).
Instead of warning immediately, we store the message and only report it
if no subsequent lazy definition exists.
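A minimal sketch of the deferred reporting. The class and its method names are hypothetical, and keying the pending messages by symbol name is a simplification for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class DeferredBackrefDiags {
  std::map<std::string, std::string> pending; // symbol name -> message

public:
  // Called when a backward reference is detected; nothing is printed yet.
  void recordBackref(const std::string &sym, const std::string &msg) {
    pending.emplace(sym, msg);
  }

  // Called when a later archive also provides a lazy definition of sym
  // (for example -ldef -lref -ldef), so the reference is not a problem.
  void sawLaterLazyDefinition(const std::string &sym) { pending.erase(sym); }

  // Called once at the end: only the surviving messages are reported.
  std::vector<std::string> report() const {
    std::vector<std::string> out;
    for (const auto &kv : pending)
      out.push_back(kv.second);
    return out;
  }
};
```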
The use of the static variable `backrefDiags` is similar to `undefs` in
Relocations.cpp
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D77522
SymbolAssignment::addr stores the location counter. The type should be
uint64_t instead of unsigned. The upper half of the address space is
commonly used by operating system kernels.
Similarly, SymbolAssignment::size should be a uint64_t. A kernel linker
script can move the location counter from 0 to the upper half of the
address space.
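A small sketch of the truncation, assuming the usual 32-bit `unsigned`. The helper names and the example address are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// What a 32-bit field keeps of a 64-bit location counter.
constexpr uint32_t storeAs32(uint64_t addr) {
  return static_cast<uint32_t>(addr);
}

// Example address in the upper half of a 64-bit address space.
constexpr uint64_t kernelBase = 0xffffffff80000000ULL;
static_assert(storeAs32(kernelBase) == 0x80000000u, "upper 32 bits are lost");

// Size covered by an assignment that moves the location counter.
constexpr uint64_t advance(uint64_t from, uint64_t to) { return to - from; }
static_assert(advance(0, kernelBase) > UINT32_MAX, "size needs 64 bits too");
```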
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D77445
This is part of the Propeller framework to do post link code layout
optimizations. Please see the RFC here:
https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the
detailed RFC doc here:
https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf
This patch adds lld support for basic block sections and performs relaxations
after the basic blocks have been reordered.
After the linker has reordered the basic block sections according to the
desired sequence, it runs a relaxation pass to optimize jump instructions.
Currently, the compiler emits the long form of all jump instructions. The AMD64
ISA supports variants of jump instructions with a one-byte offset or a
four-byte offset. The compiler generates jump instructions with R_X86_64 32-bit
PC-relative relocations. We would like to use a new relocation type for these
jump instructions, as that would make relaxing them easier and more accurate.
The relaxation pass does two things:
First, it deletes all explicit fall-through direct jump instructions between
adjacent basic blocks. This is done by discarding the tail of the basic block
section.
Second, if there are consecutive jump instructions, it checks if the first
conditional jump can be inverted to convert the second into a fall through and
delete the second.
The jump instructions are relaxed by using jump instruction mods, something
like relocations. These are used to modify the opcode of the jump instruction.
Jump instruction mods contain three values, instruction offset, jump type and
size. While writing this jump instruction out to the final binary, the linker
uses the jump instruction mod to determine the opcode and the size of the
modified jump instruction. These mods are required because the input object
files are memory-mapped without write permissions and directly modifying the
object files requires copying these sections. Copying a large number of basic
block sections significantly bloats memory.
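The two rules of the relaxation pass can be sketched on an abstract model. The `Block` struct is hypothetical: it only records the targets of the jumps at the end of each basic block section, not real instruction bytes or jump instruction mods.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int NoJump = -1;

struct Block {
  int condTarget;    // target block of a trailing conditional jump, or NoJump
  bool condInverted; // whether the condition has been inverted by relaxation
  int jmpTarget;     // target block of the final unconditional jump, or NoJump
};

// Blocks are laid out in vector order, so block i falls through to i + 1.
void relaxJumps(std::vector<Block> &blocks) {
  for (std::size_t i = 0; i + 1 < blocks.size(); ++i) {
    Block &b = blocks[i];
    int next = static_cast<int>(i + 1);
    if (b.jmpTarget == next) {
      // Rule 1: an explicit jump to the adjacent block is deleted.
      b.jmpTarget = NoJump;
    } else if (b.condTarget == next && b.jmpTarget != NoJump) {
      // Rule 2: "jcc next; jmp other" becomes "j!cc other" plus fall through.
      b.condTarget = b.jmpTarget;
      b.condInverted = !b.condInverted;
      b.jmpTarget = NoJump;
    }
  }
}
```

In the real pass the deletion is done by discarding the tail of the section, and the inversion is recorded as a jump instruction mod instead of editing the memory-mapped input.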
Differential Revision: https://reviews.llvm.org/D68065
Fixes https://bugs.llvm.org/show_bug.cgi?id=45391
The LTO code generator happens after version script scanning and may
create references which will fetch some lazy symbols.
Currently a version script does not assign VER_NDX_LOCAL to lazy symbols
and such symbols will be made global after they are fetched.
Change findByVersion and findAllByVersion to work on lazy symbols.
For unfetched lazy symbols, we should keep them non-local (D35263).
Check isDefined() in computeBinding() as a compensation.
This patch fixes a companion bug that --dynamic-list does not export
libcall fetched symbols.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D77280
finalizeSynthetic(in.symTab) calls sortSymTabSymbols() to order local
symbols before non-local symbols.
The newly added tests ensure that thunk symbols are added before
finalizeSynthetic(in.symTab), otherwise .symtab would be out of order.
A PC-relative relocation referencing a non-preemptible absolute symbol
(due to STT_TLS) is not representable in -pie/-shared mode.
Differential Revision: https://reviews.llvm.org/D77021
In the near future llvm-mc will resolve the fixups that generate
R_ARM_THUMB_PC8 and R_ARM_THUMB_PC12 at assembly time (see comments in
D72892), and forbid inter-section references. Change the LLD tests for
these relocations to use .inst and .reloc to avoid LLD tests failing when
this happens. The tests generate the same instructions, relocations
and symbols.
I will need to make equivalent changes for the corresponding Arm
relocations in D75349, but that patch is still in review, so those tests
don't need changing before the llvm-mc change lands.
Differential Revision: https://reviews.llvm.org/D77200
In most cases, LLD prints its multiline diagnostic messages starting
additional lines with ">>> ". That greatly helps external tools to parse
the output, simplifying combining several lines of the log back into one
message. The patch fixes the only message I found that does not follow
the common pattern.
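A minimal sketch of how an external tool might rely on this convention to regroup log lines into messages. The function name is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Lines starting with ">>> " continue the previous message; any other line
// starts a new one.
std::vector<std::string>
groupDiagnostics(const std::vector<std::string> &lines) {
  std::vector<std::string> messages;
  for (const std::string &line : lines) {
    bool continuation = line.compare(0, 4, ">>> ") == 0;
    if (continuation && !messages.empty())
      messages.back() += "\n" + line;
    else
      messages.push_back(line);
  }
  return messages;
}
```

A message whose extra lines lack the prefix would be split into several entries by such a tool, which is why the one inconsistent message is worth fixing.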
Differential Revision: https://reviews.llvm.org/D77132
When reporting an "undefined symbol" diagnostic:
* We don't print @ for the reference.
* We don't print @ or @@ for the definition. https://bugs.llvm.org/show_bug.cgi?id=45318
This can lead to confusing diagnostics:
```
// foo may be foo@v2
ld.lld: error: undefined symbol: foo
>>> referenced by t1.o:(.text+0x1)
// foo may be foo@v1 or foo@@v1
>>> did you mean: foo
>>> defined in: t.so
```
There are 2 ways a symbol in symtab may get truncated:
* A @@ definition may be truncated *early* by SymbolTable::insert().
The name ends with a '\0'.
* A @ definition/reference may be truncated *later* by Symbol::parseSymbolVersion().
The name ends with a '@'.
This patch detects the second case and improves the diagnostics. The first case is
not improved but the second case is sufficient to make diagnostics not confusing.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D76999
--no-threads is a name copied from gold.
gold has --no-threads, --thread-count and several other --thread-count-*.
There are needs to customize the number of threads (running several lld
processes concurrently or customizing the number of LTO threads).
Having a single --threads=N is a straightforward replacement of gold's
--no-threads + --thread-count.
--no-threads is used rarely. So just delete --no-threads instead of
keeping it for compatibility for a while.
If --threads= is specified (ELF,wasm; COFF /threads: is similar),
--thinlto-jobs= defaults to --threads=,
otherwise all available hardware threads are used.
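The defaulting rule can be sketched as a small function. The name is hypothetical, and using 0 to mean "option not specified" is an assumption of this sketch.

```cpp
#include <cassert>

// 0 means the corresponding option was not given on the command line.
unsigned effectiveThinLTOJobs(unsigned threadsOpt, unsigned thinltoJobsOpt,
                              unsigned hardwareThreads) {
  assert(hardwareThreads > 0);
  if (thinltoJobsOpt != 0)
    return thinltoJobsOpt; // an explicit --thinlto-jobs= wins
  if (threadsOpt != 0)
    return threadsOpt;     // otherwise follow --threads=
  return hardwareThreads;  // otherwise use all available hardware threads
}
```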
There is currently no way to override a --threads={1,2,...}. It is still
a debate whether we should use --threads=all.
Reviewed By: rnk, aganea
Differential Revision: https://reviews.llvm.org/D76885