This is available in GNU ld 2.35 and can be seen as a shortcut for multiple
--export-dynamic-symbol, or a --dynamic-list variant without the symbolic intention.
In the long term, this option probably should be preferred over --dynamic-list.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D107317
This option is a subset of -Bsymbolic-functions. It applies to STB_GLOBAL
STT_FUNC definitions.
The address of a vague linkage function (STB_WEAK STT_FUNC, e.g. an inline
function, a template instantiation) seen by a -Bsymbolic-functions linked
shared object may be different from the address seen from outside the shared
object. Such cases are uncommon. (ELF/Mach-O programs may use
`-fvisibility-inlines-hidden` to break such pointer equality. On Windows,
correct dllexport and dllimport are needed to make pointer equality work.
Windows link.exe enables /OPT:ICF by default so different inline functions may
have the same address.)
```
// a.cc -> a.o -> a.so (-Bsymbolic-functions)
inline void f() {}
void *g() { return (void *)&f; }
// b.cc -> b.o -> exe
// The address is different!
inline void f() {}
```
-Bsymbolic-non-weak-functions is a safer (C++ conforming) subset of
-Bsymbolic-functions, which can make such programs work.
Implementations usually emit a vague linkage definition in a COMDAT group. We
could detect the group (with more code) but I feel that we should just check
STB_WEAK for simplicity. A weak definition will thus serve as an escape hatch
for rare cases when users want interposition on definitions.
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27871
Longer write-up: https://maskray.me/blog/2021-05-16-elf-interposition-and-bsymbolic
If Linux distributions migrate to protected non-vague-linkage external linkage
functions by default, the linker option can still be handy because it allows
rapid experiment without recompilation. Protected function addresses currently
have deep issues in GNU ld.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D102570
This is somewhat of a repeat of D66658 but for sections in PT_TLS
segments. Although such sections don't need to be aligned such that
address and offset are congruent modulo the page size, they do need
to be congruent modulo the segment alignment, otherwise the
whole PT_TLS will be unaligned. We therefore use the normal calculation
to determine the section's address within the PT_LOAD rather than
bailing out early due to being SHT_NOBITS.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D106987
clang may place dynamic initializations for explicitly specialized class
template static data members in comdat.
Such in-comdat SHT_INIT_ARRAY was an abuse but we have to work around it for a while.
Change removeUnusedSyntheticSections() to actually remove empty
SyntheticSections in inputSections.
In addition to doing what removeUnusedSyntheticSections() was meant
to do, this will also make the shuffle-sections tests, which shuffles
inputSections, less sensitive to empty Synthetic Sections that
will not appear in the final image.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D106427
Change-Id: I589eaf596472161a4395fb658aea0fad73318088
This generalizes D70146 (SHT_NOTE) to more reserved sections and makes our rules
more consistent. Now SHF_GROUP is more similar to SHF_LINK_ORDER.
For SHT_INIT_ARRAY/SHT_FINI_ARRAY, the rule will be closer to PE/COFF link.exe.
Previously sanitizers use llvm.global_ctors to make module_ctor a GC
root, which is considered an abuse.
https://groups.google.com/g/generic-abi/c/TpleUEkNoQI
We can squeak through on compatibility issues because compilers otherwise don't
use SHF_GROUP special sections.
In PGO, a C++ external linkage function `foo` has a private counter
`__profc_foo` and a private `__profd_foo` in a `comdat nodeduplicate`.
A `__attribute__((weak))` function `foo` has a weak hidden counter `__profc_foo`
and a private `__profd_foo` in a `comdat nodeduplicate`.
In `ld.lld a.o b.o`, say a.o defines an external linkage `foo` and b.o
defines a weak `foo`. Currently we treat `comdat nodeduplicate` as `comdat any`,
ld.lld will incorrectly consider `b.o:__profc_foo` non-prevailing. In the worst
case when `b.o:__profd_foo` is retained and `b.o:__profc_foo` isn't, there will
be dangling reference causing an `undefined hidden symbol` error.
Add SelectionKind to `Comdat` in IRSymtab and let linkers ignore nodeduplicate comdat.
Differential Revision: https://reviews.llvm.org/D106228
`clang -fuse-ld=lld -static-pie -fpie` produced executable
currently crashes and this patch makes it work.
See https://sourceware.org/bugzilla/show_bug.cgi?id=27164
and https://sourceware.org/pipermail/libc-alpha/2021-July/128810.html
While it seems unreasonable to keep csu/libc-start.c ARCH_APPLY_IREL unclear in
static-pie mode and have an unneeded diff -u =(ld.bfd --verbose) =(ld.bfd -pie
--verbose) difference, glibc folks don't want to fix their code.
I feel sad about that but this patch can remove an iffy condition for lld/ELF
as well: `needsInterpSection()`.
The ELF specification says "The link editor honors the common definition and
ignores the weak ones." GNU ld and our Symbol::compare follow this, but the
--fortran-common code (D86142) made a mistake on the precedence.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51082
Reviewed By: peter.smith, sfertile
Differential Revision: https://reviews.llvm.org/D105945
This is a follow up to https://reviews.llvm.org/D104080, and ca3bdb57fa (diff-e64a48fabe31db213a631fdc5f2acb51bdddf3f16a8fb2928784f4c579229585). The implementation of call graph profile was changed from a black box section to relocation approach. This was done to be compatible with post processing tools like strip/objcopy, and llvm equivalent. When they are invoked on object file before the final linking step with this new approach the symbol indices correctness is preserved.
The GNU binutils tools change the REL section to RELA section, unlike llvm tools. For example when strip -S is run on the ELF object files, as an intermediate step before linking. To preserve compatibility this patch extends implementation in LLD and ELFDumper to support both REL and RELA sections for call graph profile.
Reviewed By: MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D105217
This patch is a followup patch to https://reviews.llvm.org/D105760 which adds this relocation. This handles the relocation in lld.
The s_branch family of instruction does the following:
PC = PC + signext(simm * 4) + 4
so we we do the opposite on the target address before writing it in the instruction stream.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D105761
Since D100490 this case is diagnosed for -z rel. This commit implements
R_AARCH64_TLSDESC cases for AArch64::getImplicitAddend() and
AArch64::relocate(). However, there are probably further relocation types
that need to be handled for full support of -z rel.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47009
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100544
I found this missing case with the new --check-dynamic-relocation flag
while running the lld tests with --apply-dynamic-relocs enabled by default.
This is the same as D101452 just for RISC-V
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101454
I found this missing case with the new --check-dynamic-relocation flag
while running the lld tests with --apply-dynamic-relocs enabled by default.
This also fixes a broken CHECK in lld/test/ELF/x86-64-gotpc-relax.s:
The test wasn't using CHECK-NEXT, so it was passing despite the output
actually containing relocations. I am not sure when this changed, but I
think this behaviour is correct.
Found with D101450 + enabling --apply-dynamic-relocs by default.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101452
There used to be many cases where addends for Elf_Rel were not emitted in
the final object file (mostly when building for MIPS64 since the input .o
files use RELA but the output uses REL). These cases have been fixed since,
but this patch adds a check to ensure that the written values are correct.
It is based on a previous patch that I added to the CHERI fork of LLD since
we were using MIPS64 as a baseline. The work has now almost entirely
shifted to RISC-V and Arm Morello (which use Elf_Rela), but I thought
it would be useful to upstream our local changes anyway.
This patch adds a (hidden) command line flag --check-dynamic-relocations
that can be used to enable these checks. It is also on by default in
assertions builds for targets that handle all dynamic relocations kinds
that LLD can emit in Target::getImplicitAddend(). Currently this is
enabled for ARM, MIPS, and I386.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101450
This patch changes the DynamicReloc class to store an enum instead
of the overloaded useSymVA member to make it easier to understand
and fix incorrect addends being written in some corner cases. The
change is motivated by a follow-up review that checks the value of
implicit Elf_Rel addends written to the output file.
This patch fixes an incorrect output when using `-z rela` for i386 files
with R_386_GOT32 relocations (not that this really matters since it's an
unsupported configuration).
Storing the relocation expression kind also addresses an incorrect addend
FIXME in ppc64-abs64-dyn.s introduced in D63383.
DynamicReloc now also has a special case for the MIPS TLS relocations
(DynamicReloc::AgainstSymbolWithTargetVA) since the
R_MIPS_TLS_TPREL{32/64} the symbol VA to the GOT for preemptible
symbols. I'm not sure if the symbol value actually should be written
for R_MIPS_TLS_TPREL32, but this patch does not attempt to change
that behaviour.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D100490
For
```
SECTIONS {
text.0 : {}
text.1 : {}
text.2 : {}
} INSERT AFTER .data;
```
the current order is `.data text.2 text.1 text.0`. It makes more sense to
preserve the specified order and thus improve compatibility with GNU ld.
For
```
SECTIONS { text.0 : {} } INSERT AFTER .data;
SECTIONS { text.3 : {} } INSERT AFTER .data;
```
GNU ld somehow collects sections with `INSERT AFTER .data` together (IMO
inconsistent) but I think it makes more sense to execute the commands in order
and get `.data text.3 text.0` instead.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D105158
See the comment for my understanding of -no-pie and -shared expectation.
-no-pie has freedom on choices. We choose dynamic relocations to be consistent
with the handling of GOT-generating relocations.
Note: GNU ld has arch-varying behaviors and its x86 -pie has a very
complex rule:
if there is at least one GOT-generating or PLT-generating relocation and
-z dynamic-undefined-weak (enabled by default) is in effect, generate a
dynamic relocation.
We don't emulate its rule.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D105164
There are a couple of problems with the code to patch
unrelocated BLX instructions:
1. The calculation of the PC needs to take into account
the alignment of the instruction. The Thumb BLX
uses alignDown(PC, 4) for the source address.
2. The calculation of the PC bias is hard-coded to 4
which works for Thumb, but when there is a BLX the
branch will be in Arm state so it needs an 8 byte
PC bias.
No asssembler generates an unrelocated BLX instruction
so these problems do not affect real world programs.
However we should still fix them.
Differential Revision: https://reviews.llvm.org/D104905
Modify the D13209 logic: for a script inside the sysroot, if an absolute path
does not exist, report an error instead of falling back to the path without the
sysroot prefix.
This matches GNU ld, which makes sense to me: we don't want to find an arbitrary
file in the host.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D104894
... even on targets preferring RELA. The section is only consumed by ld.lld
which can handle REL.
Follow-up to D104080 as I explained in the review. There are two advantages:
* The D104080 code only handles RELA, so arm/i386/mips32 etc may warn for -fprofile-use=/-fprofile-sample-use= usage.
* Decrease object file size for RELA targets
While here, change the relocation to relocate weights, instead of 0,1,2,3,..
I failed to catch the issue during review.
Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In non-visible behavior indices point to wrong symbols. The visible behavior indices point outside of Symbol table: "invalid symbol index".
This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in relocation section. In LLD information from both sections is used to reconstruct call graph profile. Relocations themselves will never be applied.
With this approach post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and as long as they handle relocation sections with this approach information stays correct.
Doing a quick experiment with clang-13.
The size went up from 107KB to 322KB, aggregate of all the input sections. Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly, it will not impact final binary size.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D104080
getLineNumber() was counting the number of line feeds from the start of
the buffer to the current token. For large linker scripts this became a
performance bottleneck. For one 4MB linker script over 4 minutes was
spent in getLineNumber's StringRef::count.
Store the line number from the last token, and only count the additional
line feeds since the last token.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D104137
During PHDR creation, the case where an output section does not require a
PT_LOAD header but still occupies memory in the current VMA region was not handled.
If such an output section interleaves two output sections that have the same
VMA and LMA regions set, we would previously re-use the existing PT_LOAD header
for the second output section.
However, since the memory region is not contiguous, we need to start a new PT_LOAD
segment.
This fixes https://bugs.llvm.org/show_bug.cgi?id=50558
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D103815
This implements https://sourceware.org/bugzilla/show_bug.cgi?id=26404
An `OVERWRITE_SECTIONS` command is a `SECTIONS` variant which contains several
output section descriptions. The output sections do not have specify an order.
Similar to `INSERT [BEFORE|AFTER]`, `LinkerScript::hasSectionsCommand` is not
set, so the built-in rules (see `docs/ELF/linker_script.rst`) still apply.
`OVERWRITE_SECTIONS` can be more convenient than `INSERT` because it does not
need an anchor section.
The initial syntax is intentionally narrow to facilitate backward compatible
extensions in the future. Symbol assignments cannot be used.
This feature is versatile. To list a few usage:
* Use `section : { KEEP(...) }` to retain input sections under GC
* Define encapsulation symbols (start/end) for an output section
* Use `section : ALIGN(...) : { ... }` to overalign an output section (similar to ld64 `-sectalign`)
When an output section is specified by both `OVERWRITE_SECTIONS` and
`INSERT`, `INSERT` is processed after overwrite sections. To make this work,
this patch changes `InsertCommand` to use name based matching instead of pointer
based matching. (This may cause a difference when `INSERT` moves one output
section more than once. Such duplicate commands should not be used in practice
(seems that in GNU ld the output sections may just disappear).)
A linker script can be used without -T/--script. The traditional `SECTIONS`
commands are concatenated, so a wrong rule can be more noticeable from the
section order. This feature if misused can be less noticeable, just like
`INSERT`.
Differential Revision: https://reviews.llvm.org/D103303
In a -no-pie link we optimize R_PLT_PC to R_PC. Currently we resolve a branch
relocation to the link-time zero address. However such a choice tends to cause
relocation overflow possibility for RISC architectures.
* aarch64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the next instruction
* mips: GNU ld: branch to the start of the text segment (?); ld.lld: branch to zero
* ppc32: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* ppc64: GNU ld: rewrite the instruction to a NOP; ld.lld: branch to the current instruction
* riscv: GNU ld: branch to the absolute zero address (with instruction rewriting)
* i386/x86_64: GNU ld/ld.lld: branch to the link-time zero address
I think that resolving to the same location is a good choice. The instruction,
if triggered, is clearly an undefined behavior. Resolving to the same location
can cause an infinite loop (making the user aware of the issue) while ensuring
no overflow.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D103001
Given the following scenario:
```
// Cat.cpp
struct Animal { virtual void makeNoise() const = 0; };
struct Cat : Animal { void makeNoise() const override; };
extern "C" int puts(char const *);
void Cat::makeNoise() const { puts("Meow"); }
void doThingWithCat(Animal *a) { static_cast<Cat *>(a)->makeNoise(); }
// CatUser.cpp
struct Animal { virtual void makeNoise() const = 0; };
struct Cat : Animal { void makeNoise() const override; };
void doThingWithCat(Animal *a);
void useDoThingWithCat() {
Cat *d = new Cat;
doThingWithCat(d);
}
// cat.ver
{
global: _Z17useDoThingWithCatv;
local: *;
};
$ clang++ Cat.cpp CatUser.cpp -fpic -flto=thin -fwhole-program-vtables
-shared -O3 -fuse-ld=lld -Wl,--lto-whole-program-visibility
-Wl,--version-script,cat.ver
```
We cannot devirtualize `Cat::makeNoise`. The issue is complex:
Due to `-fsplit-lto-unit` and usage of type metadata, we place the Cat
vtable declaration into module 0 and the Cat vtable definition with type
metadata into module 1, causing duplicate entries (Undefined followed by
Defined) in the `lto::InputFile::symbols()` output.
In `BitcodeFile::parse`, after processing the `Undefined` then the
`Defined`, the final state is `Defined`.
In `BitcodeCompiler::add`, for the first symbol, `computeBinding`
returns `STB_LOCAL`, then we reset it to `Undefined` because it is
prevailing (`versionId` is `preserved`). For the second symbol, because
the state is now `Undefined`, `computeBinding` returns `STB_GLOBAL`,
causing `ExportDynamic` to be true and suppressing devirtualization.
In D77280, the `computeBinding` change used a stricter `isDefined()`
condition to make weak``Lazy` symbol work.
This patch relaxes the condition to weaker `!isLazy()` to keep it
working while making the devirtualization work as well.
Differential Revision: https://reviews.llvm.org/D98686
D62727 removed GotEntrySize and GotPltEntrySize with a comment that they
are always equal to wordsize(), but that is not entirely true: X32 has a
word size of 4, but needs 8-byte GOT entries. This restores gotEntrySize
for both, adjusted for current naming conventions, but defaults it to
config->wordsize to keep things simple for architectures other than
x86_64.
This partially reverts D62727.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D102509
This option will be available in GNU ld 2.27 (https://sourceware.org/bugzilla/show_bug.cgi?id=27834).
This option can cancel previously specified -Bsymbolic and
-Bsymbolic-functions. This is useful for excluding some links when the
default uses -Bsymbolic-functions.
Reviewed By: jhenderson, peter.smith
Differential Revision: https://reviews.llvm.org/D102383
Currently, when reporting unresolved symbols in shared libraries, if an
undefined symbol is firstly seen in a regular object file that shadows
the reference for the same symbol in a shared object. As a result, the
error for the unresolved symbol in the shared library is not reported.
If referencing sections in regular object files are discarded because of
'--gc-sections', no reports about such symbols are generated, and the
linker finishes successfully, generating an output image that fails on
the run.
The patch fixes the issue by keeping symbols, which should be checked,
for each shared library separately.
Differential Revision: https://reviews.llvm.org/D101996
This fixes an issue where mixed TOC / NOTOC calls can call the incorrect
thunks if a previous thunk already exists. The issue appears when a TOC
funciton calls a NOTOC callee and then a different NOTOC function calls the same
NOTOC callee. In this case the linker would sometimes incorrectly call the
same thunk for both cases.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101837
This is a slight improvement to the help text, as I was slightly
surprised when strip-all did more than remove the symbol table.
Currently, we match gold's help text for strip-all and strip-debug.
I think that the GNU documentation for these options is not particularly
clear. However, I have opted to make only a minor change here and keep
the help text similar to gold's as these are mature options that are
well understood.
ld.bfd (https://sourceware.org/binutils/docs/ld/Options.html) has a
similar implication although it defines strip-debug as a subset of
strip-all. However, felt that noting that strip-all implies strip-debug
is better; because, with the ld.bfd approach you have to read both the
--strip-debug and the --strip-all help text to understand the behaviour
of --strip-all (and the --strip-all help text doesn't indicate that he
--strip-debug help text is related).
Differential Revision: https://reviews.llvm.org/D101890
Adopt my suggestion in https://reviews.llvm.org/D91426#2653926 ,
generalizing the ppc64 specific code.
GNU ld and glibc ld.so has a contract about the first few entries of .got .
There are somewhat complex conditions when the header is needed. This patch
switches to a simpler approach: add a header unconditionally if
_GLOBAL_OFFSET_TABLE_ is used or the number of entries is more than just the
header.
Stop using the compatibility spellings of `OF_{None,Text,Append}`
left behind by 1f67a3cba9. A follow-up
will remove them.
Differential Revision: https://reviews.llvm.org/D101650
GNU ld -r can create .rela.eh_frame with unordered r_offset values.
(With LLD, we can craft such a case by reordering sections in .eh_frame.)
This is currently unsupported and will trigger
`assert(pieces[i].inputOff <= off ...` in `OffsetGetter::get`
(the content is corrupted in a -DLLVM_ENABLE_ASSERTIONS=off build).
This patch supports this case.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D101116
The original page no longer works, so use a web.archive.org link instead.
Reviewed By: atanasyan
Differential Revision: https://reviews.llvm.org/D100949
This is the same problem as 127176e59e, but for static TLS rather than
dynamic TLS. Although we know the symbol will be the one in our own TLS
segment, and thus the offset of it within that, we don't know where in
the static TLS block our data will be allocated and thus we must emit a
dynamic relocation for this case.
Reviewed By: MaskRay, atanasyan
Differential Revision: https://reviews.llvm.org/D101381
Whilst not wrong (unless using static PIE where the relocations are
likely not implemented by the runtime), this is inefficient, as the TLS
module indices and offsets are independent of the executable's load
address.
Reviewed By: MaskRay, atanasyan
Differential Revision: https://reviews.llvm.org/D101382
An unfetched lazy symbol (undefined weak) should be considered to have its
original versionId which is VER_NDX_GLOBAL, instead of the lazy symbol's
versionId. (The original versionId cannot be non-VER_NDX_GLOBAL because a
undefined versioned symbol is an error.)
The regression was introduced in D77280 when making version scripts work
with lazy symbols fetched by LTO calls.
Fix PR49915
Differential Revision: https://reviews.llvm.org/D100624
Fix PR49897: if `__real_foo` has the isUsedInRegularObj bit set, we need to
retain `foo` in .symtab, even if `foo` is undefined. The new behavior will match
GNU ld.
Before the patch, we produced an R_X86_64_JUMP_SLOT relocation referencing the
index 0 undefined symbol, which would be erroed by glibc
(see f96ff3c0f8).
While here, fix another bug: if `__wrap_foo` does not exist, its initial binding
should be `foo`'s.
Change the default to facilitate GC for metadata section usage, so that they
don't need SHF_LINK_ORDER or SHF_GROUP just to drop the unhelpful rule (if they
want to be unconditionally retained, use SHF_GNU_RETAIN
(`__attribute__((retain))`) or linker script `KEEP`).
The dropped SHF_GROUP special case makes the behavior of -z start-stop-gc and -z
nostart-stop-gc closer to GNU ld>=2.37 (https://sourceware.org/PR27451).
However, we default to -z start-stop-gc (which actually matches more closely to
GNU ld before 2015-10 https://sourceware.org/PR19167), which is different from
modern GNU ld (which has the unhelpful rule to work around glibc). As a
compensation, we special case `__libc_` sections as a workaround for glibc<2.34
(https://sourceware.org/PR27492).
Since -z start-stop-gc as the default actually matches the traditional GNU ld
behavior, there isn't much to be aware of. There was a systemd usage which has
been fixed by https://github.com/systemd/systemd/pull/19144
The `e_flags` for a ELF file targeting the AVR ISA contains two fields at the time of writing:
- A 7-bit integer field specifying the ISA revision being targeted
- A 1-bit flag specifying whether the object files being linked are suited for applying the relaxations at link time
The linked ELF file is blessed with the arch revision shared among all the files.
The behaviour in case of mismatch is purposefully different than the one implemented in libbfd: LLD will raise a fatal error while libbfd silently picks a default value of `avr2`.
The relaxation-ready flag is handled as done by libbfd, in order for it to appear in the linked object every source object must be tagged with it.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99754
From the PowerPC ELFv2 ABI section 4.2.3. Global Offset Table.
```
The GOT consists of an 8-byte header that contains the TOC base (the first TOC
base when multiple TOCs are present), followed by an array of 8-byte addresses.
```
Due to the introduction of PC Relative code it is now possible to require a GOT
without having a .TOC. symbol in the object that is being linked. Since LLD uses
the .TOC. symbol to determine whether or not a GOT is required the GOT header is
not setup correctly and the 8-byte header is missing.
This patch allows the Power PC GOT setup to happen when an element is added to
the GOT instead of at the very begining. When this header is added a .TOC.
symbol is also added.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D91426
This reuses the approach (and some code) from LLD-ELF.
It's a decent win when linking chromium_framework on a Mac Pro (3.2 GHz 16-Core Intel Xeon W):
N Min Max Median Avg Stddev
x 20 4.58 4.83 4.66 4.6685 0.066591844
+ 20 4.42 4.61 4.5 4.505 0.04751731
Difference at 95.0% confidence
-0.1635 +/- 0.0370242
-3.5022% +/- 0.793064%
(Student's t, pooled s = 0.0578462)
The output binary is 381MB.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D99279
In future patches I will be setting the IsText parameter frequently so I will refactor the args to be in the following order. I have removed the FileSize parameter because it is never used.
```
static ErrorOr<std::unique_ptr<MemoryBuffer>>
getFile(const Twine &Filename, bool IsText = false,
bool RequiresNullTerminator = true, bool IsVolatile = false);
static ErrorOr<std::unique_ptr<MemoryBuffer>>
getFileOrSTDIN(const Twine &Filename, bool IsText = false,
bool RequiresNullTerminator = true);
static ErrorOr<std::unique_ptr<MB>>
getFileAux(const Twine &Filename, uint64_t MapSize, uint64_t Offset,
bool IsText, bool RequiresNullTerminator, bool IsVolatile);
static ErrorOr<std::unique_ptr<WritableMemoryBuffer>>
getFile(const Twine &Filename, bool IsVolatile = false);
```
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D99182
There is a bug when initial exec is relaxed to local exec.
In the following situation:
InitExec.c
```
extern __thread unsigned TGlobal;
unsigned getConst(unsigned*);
unsigned addVal(unsigned, unsigned*);
unsigned GetAddrT() {
return addVal(getConst(&TGlobal), &TGlobal);
}
```
Def.c
```
__thread unsigned TGlobal;
unsigned getConst(unsigned* A) {
return *A + 3;
}
unsigned addVal(unsigned A, unsigned* B) {
return A + *B;
}
```
The problem is in InitExec.c but Def.c is required if you want to link the example and see the problem.
To compile everything:
```
clang -O3 -mcpu=pwr10 -c InitExec.c
clang -O3 -mcpu=pwr10 -c Def.c
ld.lld InitExec.o Def.o -o IeToLe
```
If you objdump the problem object file:
```
$ llvm-objdump -dr --mcpu=pwr10 InitExec.o
```
you will get the following assembly:
```
0000000000000000 <GetAddrT>:
0: a6 02 08 7c mflr 0
4: f0 ff c1 fb std 30, -16(1)
8: 10 00 01 f8 std 0, 16(1)
c: d1 ff 21 f8 stdu 1, -48(1)
10: 00 00 10 04 00 00 60 e4 pld 3, 0(0), 1
0000000000000010: R_PPC64_GOT_TPREL_PCREL34 TGlobal
18: 14 6a c3 7f add 30, 3, 13
0000000000000019: R_PPC64_TLS TGlobal
1c: 78 f3 c3 7f mr 3, 30
20: 01 00 00 48 bl 0x20
0000000000000020: R_PPC64_REL24_NOTOC getConst
24: 78 f3 c4 7f mr 4, 30
28: 30 00 21 38 addi 1, 1, 48
2c: 10 00 01 e8 ld 0, 16(1)
30: f0 ff c1 eb ld 30, -16(1)
34: a6 03 08 7c mtlr 0
38: 00 00 00 48 b 0x38
0000000000000038: R_PPC64_REL24_NOTOC addVal
```
The lines of interest are:
```
10: 00 00 10 04 00 00 60 e4 pld 3, 0(0), 1
0000000000000010: R_PPC64_GOT_TPREL_PCREL34 TGlobal
18: 14 6a c3 7f add 30, 3, 13
0000000000000019: R_PPC64_TLS TGlobal
1c: 78 f3 c3 7f mr 3, 30
```
Which once linked gets turned into:
```
10010210: ff ff 03 06 00 90 6d 38 paddi 3, 13, -28672, 0
10010218: 00 00 00 60 nop
1001021c: 78 f3 c3 7f mr 3, 30
```
The problem is that register 30 is never set after the optimization.
Therefore it is not correct to relax the above instructions by replacing
the add instruction with a nop.
Instead the add instruction should be replaced with a copy (mr) instruction.
If the add uses the same resgiter as input and as ouput then it is safe to
continue to replace the add with a nop.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D95262
`--shuffle-sections=<seed>` applies to all sections. The new
`--shuffle-sections=<section-glob>=<seed>` makes shuffling selective. To the
best of my knowledge, the option is only used as debugging, so just drop the
original form.
`--shuffle-sections '.init_array*=-1'` `--shuffle-sections '.fini_array*=-1'`.
reverses static constructors/destructors of the same priority.
Useful to detect some static initialization order fiasco.
`--shuffle-sections '.data*=-1'`
reverses `.data*` sections. Useful to detect unfunded pointer comparison results
of two unrelated objects.
If certain sections have an intrinsic order, the old form cannot be used.
Differential Revision: https://reviews.llvm.org/D98679
If the number of sections changes, which is common for re-links after
incremental updates, the section order may change drastically.
Special case -1 to reverse input sections. This is a stable transform.
The section order is more resilient to incremental updates. Usually the
code issue (e.g. Static Initialization Order Fiasco, assuming pointer
comparison result of two unrelated objects) is due to the relative order
between two problematic input files A and B. Checking the regular order
and the reversed order is sufficient.
Differential Revision: https://reviews.llvm.org/D98445
GNU ld supports `.` and `$` in symbol names while LLD doesn't support them in
`readPrimary` expressions. Using `.` can result in such an error:
```
https://github.com/ClangBuiltLinux/linux/issues/1318
ld.lld: error: ./arch/powerpc/kernel/vmlinux.lds:255: malformed number: .TOC.
>>> __toc_ptr = (DEFINED (.TOC.) ? .TOC. : ADDR (.got)) + 0x8000;
```
Allow `.` (ppc64 special symbol `.TOC.`) and `$` (RISC-V special symbol `__global_pointer$`).
Change `diag[3-5].test` to use an invalid character `^`.
Note: GNU ld allows `~` in non-leading positions of a symbol name. `~`
is not used in practice, conflicts with the unary operator, and can
cause some parsing difficulty, so this patch does not add it.
Differential Revision: https://reviews.llvm.org/D98306
A `local:` version node in a version script can change the effective symbol binding
to STB_LOCAL. The linker needs to communicate the fact to enable WPD
(otherwise LTO does not know that the `!vcall_visibility` metadata has
effectively changed from VCallVisibilityPublic to VCallVisibilityLinkageUnit).
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D98220
Implemented the option to omit Power10 instructions from save stubs via the
option --no-power10-stubs or --power10-stubs=no on lld. --power10-stubs= will
override the other option. --power10-stubs=auto also exists to use the default
behaviour (ie allow Power10 instructions in stubs).
Differential Revision: https://reviews.llvm.org/D94627
In AArch32 ARM, the PC reads two instructions ahead of the currently
executiing instruction. This evaluates to 8 in ARM state and 4 in
Thumb state. Branch instructions on AArch32 compensate for this by
subtracting the PC bias from the addend. For a branch to symbol this
will result in an addend of -8 in ARM state and -4 in Thumb state.
The existing ARM Target::inBranchRange function accounted for this
implict addend within the function meaning that if the addend were
to be taken into account by the caller then it would be double
counted. This complicates the interface for all Targets as callers
wanting to account for addends had to account for the ARM PC-bias.
In certain situations such as:
https://github.com/ClangBuiltLinux/linux/issues/1305
the PC-bias compensation code didn't match up. In particular
normalizeExistingThunk() didn't put the PC-bias back in as Arm
thunks did not store the addend.
The simplest fix for the problem is to add the PC bias in
normalizeExistingThunk when restoring the addend. However I think
it is worth refactoring the Arm inBranchRange implementation so
that fewer calls to getPCBias are needed for other Targets. I
wasn't able to remove getPCBias completely but hopefully the
Relocations.cpp code is simpler now.
In principle a test could be written to replicate the linux kernel
build failure but I wasn't able to reproduce with a small example
that I could build up from scratch.
Fixes https://github.com/ClangBuiltLinux/linux/issues/1305
Differential Revision: https://reviews.llvm.org/D97550
Also a couple of minor cleanups in merge-string.s:
- fix inconsistent use of tabs
- use `.p2align` rather than `.align` since `.p2align` works the
same on all platforms (the meaning of align seems to differ
between platforms according to `AlignmentIsInBytes`.
I noticed these potential cleanups while porting SHF_STRINGS support to
wasm-ld.
Differential Revision: https://reviews.llvm.org/D97647
For one metadata section usage, each text section references a metadata section.
The metadata sections have a C identifier name to allow the runtime to collect them via `__start_/__stop_` symbols.
Since `__start_`/`__stop_` references are always present from live sections, the
C identifier name sections appear like GC roots, which means they cannot be
discarded by `ld --gc-sections`.
To make such sections GCable, either SHF_LINK_ORDER or a section group is needed.
SHF_LINK_ORDER is not suitable for the references can be inlined into other functions
(See D97430:
Function A (in the section .text.A) references its `__sancov_guard` section.
Function B inlines A (so now .text.B references `__sancov_guard` - this is invalid with the semantics of SHF_LINK_ORDER).
In the linking stage,
if `.text.A` gets discarded, and `__sancov_guard` is retained via the reference from `.text.B`,
the output will be invalid because `__sancov_guard` references the discarded `.text.A`.
LLD errors "sh_link points to discarded section".
)
A section group have size overhead, and is cumbersome when there is just one metadata section.
Add `-z start-stop-gc` to drop the "__start_/__stop_ references retain
non-SHF_LINK_ORDER non-SHF_GROUP C identifier name sections" rule.
We reserve the rights to switch the default in the future.
Reviewed By: phosek, jrtc27
Differential Revision: https://reviews.llvm.org/D96914
The special root semantics for identifier-named sections is meant
specifically for the metadata sections. In the context of group
semantics, where group members are always retained or discarded as a
unit, it's natural not to have this semantics apply to a section in a
group, otherwise we would never discard the group defeating the purpose
of using the group in the first place.
This change modifies the GC behavior so that __start_/__stop_ references
don't retain C identifier named sections in section groups which allows
for these groups to be collected. This matches the behavior of BFD ld.
The only kind of existing case that might break is interdependent
metadata sections that are all in a group together, but that group
doesn't contain any other sections referenced by anything except
implicit inclusion in a `__start_` and/or `__stop_`-referenced
identifier-named section, but such cases should be unlikely.
Differential Revision: https://reviews.llvm.org/D96753
This change introduces support for zero flag ELF section groups to lld.
lld already supports COMDAT sections, which in ELF are a special type of
ELF section groups. These are generally useful to enable linker GC where
you want a group of sections to always travel together, that is to be
either retained or discarded as a whole, but without the COMDAT
semantics. Other ELF linkers already support zero flag ELF section
groups and this change helps us reach feature parity.
Differential Revision: https://reviews.llvm.org/D96636
When parsing an object file, LLD interleaves undefined symbol resolution (which
may recursively fetch other lazy objects) with defined symbol resolution.
This may lead to surprising results, e.g. if an object file defines currently
undefined symbols and references another lazy symbol, we may interleave defined
symbols with the lazy fetch, potentially leading to the defined symbols
resolving to different files.
As an example, if both `a.a(a.o)` and `a.a(b.o)` define `foo` (not in COMDAT
group, or in different COMDAT groups) and `__profd_foo` (in COMDAT group
`__profd_foo`). LLD may resolve `foo` to `a.a(a.o)` and `__profd_foo` to
`b.a(b.o)`, i.e. different files.
```
parse ArchiveFile a.a
entry fetches a.a(a.o)
parse ObjectFile a.o
define entry
define foo
reference b
b fetches a.a(b.o)
parse ObjectFile b.o
define prevailing __profd_foo
define (ignored) non-prevailing __profd_foo
```
Assuming a set of interconnected symbols are defined all or none in several lazy
objects. Arguably making them resolve to the same file is preferable than making
them resolve to different files (some are lazy objects).
The main argument favoring the new behavior is the stability. The relative order
between a defined symbol and an undefined symbol does not change the symbol
resolution behavior. Only the relative order between two undefined symbols can
affect fetching behaviors.
---
The real world case is reduced from a Fuchsia PGO usage: `a.a(a.o)` has a
constructor within COMDAT group C5 while `a.a(b.o)` has a constructor within
COMDAT group C2. Because they use different group signatures, they are not
de-duplicated. It is not entirely whether Clang behavior is entirely conforming.
LLD selects the PGO counter section (`__profd_*`) from `a.a(b.o)` and the
constructor section from `a.a(a.o)`. The `__profd_*` is a SHF_LINK_ORDER section
linking to its own non-prevailing constructor section, so LLD errors
`sh_link points to discarded section`. This patch fixes the error.
Differential Revision: https://reviews.llvm.org/D95985
`extern const bfd_target aarch64_elf64_le_vec;` is a variable in BFD.
It was somehow misused as an emulation by Android.
```
% aarch64-linux-gnu-ld -m aarch64_elf64_le_vec a.o
aarch64-linux-gnu-ld: unrecognised emulation mode: aarch64_elf64_le_vec
Supported emulations: aarch64linux aarch64elf aarch64elf32 aarch64elf32b aarch64elfb armelf armelfb aarch64linuxb aarch64linux32 aarch64linux32b armelfb_linux_eabi armelf_linux_eabi
```
Acked by Stephen Hines, who removed the flag from Android a while back.
Rewritting the path of the sample profile file in response.txt to be relative to the repro tar.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D96193
A SHF_LINK_ORDER .gcc_except_table is similar to a .gcc_except_table in
a section group. The associated text section is responsible for retaining it.
LLD still does not support GC of non-group non-SHF_LINK_ORDER .gcc_except_table -
but that is not necessary because we can teach the compiler to set SHF_LINK_ORDER.
The current diagnostic has confused users. The new wording is adapted from one suggested by Ian Lance Taylor.
Differential Revision: https://reviews.llvm.org/D95917
binutils 2.36 introduced the new section flag SHF_GNU_RETAIN (for ELFOSABI_GNU &
ELFOSABI_FREEBSD) to mark a sections as a GC root. Several LLVM side toolchain
folks (including me) were involved in the design process of SHF_GNU_RETAIN and
were happy with this proposal.
Currently GNU ld only respects SHF_GNU_RETAIN semantics for ELFOSABI_GNU &
ELFOSABI_FREEBSD object files
(https://sourceware.org/bugzilla/show_bug.cgi?id=27282). GNU ld sets EI_OSABI
to ELFOSABI_GNU for relocatable output
(https://sourceware.org/bugzilla/show_bug.cgi?id=27091). In practice the single
value EI_OSABI is neither a good indicator for object file compatibility, nor a
useful mechanism marking used ELF extensions.
For input, we respect SHF_GNU_RETAIN semantics even for ELFOSABI_NONE object
files. This is compatible with how LLD and GNU ld handle (mildly useful) STT_GNU_IFUNC
/ (emitted by GCC, considered misfeature by some folks) STB_GNU_UNIQUE input.
(As of LLVM 12.0.0, the integrated assembler does not set ELFOSABI_GNU for
STT_GNU_IFUNC/STB_GNU_UNIQUE).
Arguably STT_GNU_IFUNC/STB_GNU_UNIQUE probably need indicators in object files
but SHF_GNU_RETAIN is more likely accepted by more OSABI platforms.
For output, we take a step further than GNU ld: we don't promote ELFOSABI_NONE
to ELFOSABI_GNU for all output.
Differential Revision: https://reviews.llvm.org/D95749
In GCC emitted .debug_info sections, R_386_GOTOFF may be used to
relocate DW_AT_GNU_call_site_value values
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98946).
R_386_GOTOFF (`S + A - GOT`) is one of the `isStaticLinkTimeConstant` relocation
type which is not PC-relative, so it can be used from non-SHF_ALLOC sections. We
current allow new relocation types as needs come. The diagnostic has caught some
bugs in the past.
Differential Revision: https://reviews.llvm.org/D95994
The option catches incompatibility between `R_*_IRELATIVE` and DT_TEXTREL/DF_TEXTREL
before glibc 2.29. Newer glibc versions are more common nowadays and I don't
think this option has ever been used. Diagnosing this problem is also
straightforward by reading the stack trace.
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.
Differential Revision: https://reviews.llvm.org/D91583
I noticed that this option was not appearing at all in the `--help`
messages for `wasm-ld` or `ld.lld`.
Add help text and make it consistent across all ports.
Differential Revision: https://reviews.llvm.org/D94925
If foo is referenced in any object file, bitcode file or shared object,
`__wrap_foo` should be retained as the redirection target of sym
(f96ff3c0f8).
If the object file defining foo has foo references, we cannot easily distinguish
the case from cases where foo is not referenced (we haven't scanned
relocations). Retain `__wrap_foo` because we choose to wrap sym references
regardless of whether sym is defined to keep non-LTO/LTO/relocatable links' behaviors similar
https://sourceware.org/bugzilla/show_bug.cgi?id=26358 .
If foo is defined in a shared object, `__wrap_foo` can still be omitted
(`wrap-dynamic-undef.s`).
Reviewed By: andrewng
Differential Revision: https://reviews.llvm.org/D95152
Fixes PR48523. When the linker errors with "output file too large",
one question that comes to mind is how the section sizes differ from
what they were previously. Unfortunately, this information is lost
when the linker exits without writing the output file. This change
makes it so that the error message includes the sizes of the largest
sections.
Reviewed By: MaskRay, grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D94560
```
// a.s
jmp fcntl
// b.s
.globl fcntl
fcntl:
ret
```
`ld.lld -shared --wrap=fcntl a.o b.o` has an `R_X86_64_JUMP_SLOT` referencing
the index 0 undefined symbol, which will cause a glibc `symbol lookup error` at
runtime. This is because `__wrap_fcntl` is not in .dynsym
We use an approximation `!wrap->isUndefined()`, which doesn't set
`isUsedInRegularObj` of `__wrap_fcntl` when `fcntl` is referenced and
`__wrap_fcntl` is undefined.
Fix this by using `sym->referenced`.
R_PPC64_ADDR16_HI represents bits 16-31 of a 32-bit value
R_PPC64_ADDR16_HIGH represents bits 16-31 of a 64-bit value.
In the Linux kernel, `LOAD_REG_IMMEDIATE_SYM` defined in `arch/powerpc/include/asm/ppc_asm.h`
uses @l, @high, @higher, @highest to load the 64-bit value of a symbol.
Fixes https://github.com/ClangBuiltLinux/linux/issues/1260
The commit 18aa0be36e changed the default GotBaseSymInGotPlt to true
for AArch64. This is different than binutils, where
_GLOBAL_OFFSET_TABLE_ points at the start or .got.
It seems to not intefere with current relocations used by LLVM. However
as indicated by PR#40357 [1] gcc generates R_AARCH64_LD64_GOTPAGE_LO15
for -pie (in fact it also generated the relocation for -fpic).
This change is requires to correctly handle R_AARCH64_LD64_GOTPAGE_LO15
by lld from objects generated by gcc.
[1] https://bugs.llvm.org/show_bug.cgi?id=40357
OutputSections.h used to close the lld::elf namespace only to
immediately open it again. This change merges both parts into
one.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D94538
Fixes PR48693: --emit-relocs keeps relocation sections. --gdb-index drops
.debug_gnu_pubnames and .debug_gnu_pubtypes but not their relocation sections.
This can cause a null pointer dereference in `getOutputSectionName`.
Also delete debug-gnu-pubnames.s which is covered by gdb-index.s
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D94354
Fixes PR48681: after LTO, lto.tmp may reference a libcall symbol not in an IR
symbol table of any bitcode file. If such a symbol is defined in an archive
matched by a --exclude-libs, we don't correctly localize the symbol.
Add another `excludeLibs` after `compileBitcodeFiles` to localize such libcall
symbols. Unfortunately we have keep the existing one for D43126.
Using VER_NDX_LOCAL is an implementation detail of `--exclude-libs`, it does not
necessarily tie to the "localize" behavior. `local:` patterns in a version
script can be omitted.
The `symbol ... has undefined version ...` error should not be exempted.
Ideally we should error as GNU ld does. https://issuetracker.google.com/issues/73020933
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D94280
This character indicates that when return pointer authentication is
being used, the function signs the return address using the B key.
Differential Revision: https://reviews.llvm.org/D93954
Add support for linking powerpcle code in LLD.
Rewrite lld/test/ELF/emulation-ppc.s to use a shared check block and add powerpcle tests.
Update tests.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93917
We can reduce the number of "using" declarations.
`LLVM_ELF_IMPORT_TYPES_ELFT` was extended in D93801.
Differential revision: https://reviews.llvm.org/D93856
For x86-64, D33100 added a diagnostic for local-exec TLS relocations referencing a preemptible symbol.
This patch generalizes it to non-preemptible symbols (see `-Bsymbolic` in `tls.s`)
on all targets.
Local-exec TLS relocations resolve to offsets relative to a fixed point within
the static TLS block, which are only meaningful for the executable.
With this change, `clang -fpic -shared -fuse-ld=bfd a.c` on the following example will be flagged for AArch64/ARM/i386/x86-64/RISC-V
```
static __attribute__((tls_model("local-exec"))) __thread long TlsVar = 42;
long bump() { return ++TlsVar; }
```
Note, in GNU ld, at least arm, riscv and x86's ports have the similar
diagnostics, but aarch64 and ppc64 do not error.
Differential Revision: https://reviews.llvm.org/D93331
Alternative to D91611.
The TLS General Dynamic/Local Dynamic code sequences need to mark
`__tls_get_addr` with R_PPC64_TLSGD or R_PPC64_TLSLD, e.g.
```
addis r3, r2, x@got@tlsgd@ha # R_PPC64_GOT_TLSGD16_HA
addi r3, r3, x@got@tlsgd@l # R_PPC64_GOT_TLSGD16_LO
bl __tls_get_addr(x@tlsgd) # R_PPC64_TLSGD followed by R_PPC64_REL24
nop
```
However, there are two deviations form the above:
1. direct call to `__tls_get_addr`. This is essential to implement ld.so in glibc/musl/FreeBSD.
```
bl __tls_get_addr
nop
```
This is only used in a -shared link, and thus not subject to the GD/LD to IE/LE
relaxation issue below.
2. Missing R_PPC64_TLSGD/R_PPC64_TLSGD for compiler generated TLS references
According to Stefan Pintille, "In the early days of the transition from the
ELFv1 ABI that is used for big endian PowerPC Linux distributions to the ELFv2
ABI that is used for little endian PowerPC Linux distributions, there was some
ambiguity in the specification of the relocations for TLS. The GNU linker has
implemented support for correct handling of calls to __tls_get_addr with a
missing relocation. Unfortunately, we didn't notice that the IBM XL compiler
did not handle TLS according to the updated ABI until we tried linking XL
compiled libraries with LLD."
In short, LLD needs to work around the old IBM XL compiler issue.
Otherwise, if the object file is linked in -no-pie or -pie mode,
the result will be incorrect because the 4 instructions are partially
rewritten (the latter 2 are not changed).
Work around the compiler bug by disable General Dynamic/Local Dynamic to
Initial Exec/Local Exec relaxation. Note, we also disable Initial Exec
to Local Exec relaxation for implementation simplicity, though technically it can be kept.
ppc64-tls-missing-gdld.s demonstrates the updated behavior.
Reviewed By: #powerpc, stefanp, grimar
Differential Revision: https://reviews.llvm.org/D92959
The scope of R_TLS (TP offset relocation types (TPREL/TPOFF) used for the
local-exec TLS model) is actually narrower than its name may imply. R_TLS_NEG
is only used by Solaris R_386_TLS_LE_32.
Rename them so that they will be less confusing.
Reviewed By: grimar, psmith, rprichard
Differential Revision: https://reviews.llvm.org/D93467
Libraries linked to the lld elf library exposes a function named main.
When debugging code linked to such libraries and intending to set a
breakpoint at main, the debugger also sets breakpoint at the main
function at lld elf driver. The possible choice was to rename it to
link but that would again clash with lld::*::link. This patch tries
to consistently rename them to linkerMain.
Differential Revision: https://reviews.llvm.org/D91418
As indicated by AArch64 ELF specification, symbols with st_other
marked with STO_AARCH64_VARIANT_PCS indicates it may follow a variant
procedure call standard with different register usage convention
(for instance SVE calls).
Static linkers must preserve the marking and propagate it to the dynamic
symbol table if any reference or definition of the symbol is marked with
STO_AARCH64_VARIANT_PCS, and add a DT_AARCH64_VARIANT_PCS dynamic tag if
there are R_<CLS>_JUMP_SLOT relocations that reference that symbols.
It implements https://bugs.llvm.org/show_bug.cgi?id=48368.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93045
Fix PR48357: If .rela.dyn appears as an output section description, its type may
be SHT_RELA (due to the empty synthetic .rela.plt) while there is no input
section. The empty .rela.dyn may be retained due to a reference in a linker
script. Don't crash.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D93367
Follow the naming set by TI's own GCC-based toolchain.
Also, force the `osabi` field to `ELFOSABI_STANDALONE`, this matches GNU LD's output (the patching is done in `elf32_msp430_post_process_headers`).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92931
Normally we should not delete options. However, the Clang driver passes
`-plugin-opt={new,legacy}-pass-manager` instead of
`--[no-]lto-legacy-pass-manager` (`-plugin-opt=new-pass-manager` has been used
since 7.0), and it is unlikely anyone will use the `--lto-*` style options directly.
So let's rename them to be consistent with the Clang driver option names.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D92988
-DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=on configured LLD and LLVMgold.so
will use the new pass manager by default. Add an option to
use the legacy pass manager. This will also be used by the Clang driver
when -fno-new-pass-manager (D92915) / -fno-experimental-new-pass-manager is set.
Reviewed By: aeubanks, tejohnson
Differential Revision: https://reviews.llvm.org/D92916
This patch changes the archive handling to enable the semantics needed
for legacy FORTRAN common blocks and block data. When we have a COMMON
definition of a symbol and are including an archive, LLD will now
search the members for global/weak defintions to override the COMMON
symbol. The previous LLD behavior (where a member would only be included
if it satisifed some other needed symbol definition) can be re-enabled with the
option '-no-fortran-common'.
Differential Revision: https://reviews.llvm.org/D86142
This reverts a side effect introduced in the code cleanup patch D43571:
LLD started to emit empty output sections that are explicitly assigned to a segment.
This patch fixes the issue by removing the !sec.phdrs.empty() special case from
isDiscardable. As compensation, we add an early phdrs propagation step (see the inline comment).
This is similar to one that we do in adjustSectionsAfterSorting.
Differential revision: https://reviews.llvm.org/D92301
Also, for .o files, include full path as given on link command line.
Before:
lld: error: undefined symbol [...], referenced from sandbox_logging.o
After:
lld: error: undefined symbol [...], referenced from libseatbelt.a(sandbox_logging.o)
Move archiveName up to InputFile so we can consistently use toString()
to print InputFiles in diags, and pass it to the ObjFile ctor. This
matches the ELF and COFF ports.
Differential Revision: https://reviews.llvm.org/D92437
If an object file has an undefined foo@v1, we emit a dynamic symbol foo.
This is incorrect if at runtime a shared object provides the non-default version foo@v1
(the undefined foo may bind to foo@@v2, for example).
GNU ld issues an error for this case, even if foo@v1 is undefined weak
(https://sourceware.org/bugzilla/show_bug.cgi?id=3351). This behavior makes
sense because to represent an undefined foo@v1, we have to construct a Verneed
entry. However, without knowing the defining filename, we cannot construct a
Verneed entry (Verneed::vn_file is unavailable).
This patch implements the error.
Depends on D92258
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D92260
The symbol resolution rules for versioned symbols are:
* foo@@v1 (default version) resolves both undefined foo and foo@v1
* foo@v1 (non-default version) resolves undefined foo@v1
Note, foo@@v1 must be defined (the assembler errors if attempting to
create an undefined foo@@v1).
For defined foo@@v1 in a shared object, we call `SymbolTable::addSymbol` twice,
one for foo and the other for foo@v1. We don't do the same for object files, so
foo@@v1 defined in one object file incorrectly does not resolve a foo@v1
reference in another object file.
This patch fixes the issue by reusing the --wrap code to redirect symbols in
object files. This has to be done after processing input files because
foo and foo@v1 are two separate symbols if we haven't seen foo@@v1.
Add a helper `Symbol::getVersionSuffix` to retrieve the optional trailing
`@...` or `@@...` from the possibly truncated symbol name.
Depends on D92258
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D92259
All three use readFile() for their argument so their argument file is
already copied to the tar, but we weren't rewriting the argument to
point to the path used in the tar file.
No test because the change is trivial (several other flags in
createResponseFile() also aren't tested, likely for the same reason.)
Differential Revision: https://reviews.llvm.org/D92356
This is the #1 of 2 changes that make remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
hotness threshold from profile summary with special value 'auto'.
This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept remarks hotness threshold. Update all the tools that use it with remarks
hotness threshold options:
* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='
Differential Revision: https://reviews.llvm.org/D85809
clang may produce `movl x@GOTPCREL+4(%rip), %eax` when loading the high 32 bits
of the address of a global variable in -fpic/-fpie mode.
If assembled by GNU as, the fixup emits an R_X86_64_GOTPCRELX with an
addend != -4. The instruction loads from the GOT entry with an offset
and thus it is incorrect to relax the instruction.
If assembled by the integrated assembler, we emit R_X86_64_GOTPCREL for
relocations that definitely cannot be relaxed (D92114), so this patch is not
needed.
This patch disables the relaxation, which is compatible with the implementation in GNU ld
("Add R_X86_64_[REX_]GOTPCRELX support to gas and ld").
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D91993
This adds support for ld.lld's --reproduce / lld-link's /reproduce:
flag to the MachO port. This flag can be added to a link command
to make the link write a tar file containing all inputs to the link
and a response file containing the link command. This can be used
to reproduce the link on another machine, which is useful for sharing
bug report inputs or performance test loads.
Since the linker is usually called through the clang driver and
adding linker flags can be a bit cumbersome, setting the env var
`LLD_REPRODUCE=foo.tar` triggers the feature as well.
The file response.txt in the archive can be used with
`ld64.lld.darwinnew $(cat response.txt)` as long as the contents are
smaller than the command-line limit, or with `ld64.lld.darwinnew
@response.txt` once D92149 is in.
The support in this patch is sufficient to create a tar file for
Chromium's base_unittests that can link after unpacking on a different
machine.
Differential Revision: https://reviews.llvm.org/D92274
For --gc-sections, SmallVector<InputSection *, 256> -> SmallVector<InputSection *, 0> because the code bloat (1296 bytes) is not worthwhile (the saved reallocation is negligible).
For OutputSection::compressedData, N=1 is useless (for a compressed .debug_*, the size is always larger than 1).
Also sync help texts for the option between elf and coff ports.
Decisions:
- Do this even if /lldignoreenv is passed. /reproduce: does not affect
the main output, and this makes the env var more convenient to use.
(On the other hand, it's now possible to set this env var and forget
about it, and all future builds in the same shell will be much slower.
That's true for ld.lld, but posix shells have an easy way to set an
env var for a single command; in cmd.exe this is not possible without
contortions. Then again, lld-link runs in posix shells too.)
Original patch rebased across D68378 and D68381.
Differential Revision: https://reviews.llvm.org/D67707
With this change, `TargetInfo::adjustRelaxExpr` is only related to TLS
relaxations and a subsequent clean-up can delete the `data` parameter.
Differential Revision: https://reviews.llvm.org/D92079
Enables overriding earlier --lto-whole-program-visibility.
Variant of D91583 while discussing alternate ways to identify and
handle the --export-dynamic case.
Differential Revision: https://reviews.llvm.org/D92060
Also use "unknown flag 'flag'" instead of "unknown flag: flag" for
consistency with the other ports.
Differential Revision: https://reviews.llvm.org/D91970
This allows to reuse the RelocationResolver from the code
that doesn't want to deal with `RelocationRef` class.
I am going to use it in llvm-readobj. See the description
of D91530 for more details.
Differential revision: https://reviews.llvm.org/D91533
As mentioned in https://reviews.llvm.org/D67479#1667256 ,
* `--[no-]allow-shlib-undefined` control the diagnostic for an unresolved symbol in a shared object
* `-z defs/-z undefs` control the diagnostic for an unresolved symbol in a regular object file
* `--unresolved-symbols=` controls both bits.
In addition, make --warn-unresolved-symbols affect --no-allow-shlib-undefined.
This patch makes the behavior match GNU ld.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D91510
This adds `--[no-]color-diagnostics[=auto,never,always]` to
the MachO port and harmonizes the flag in the other ports:
- Consistently use MetaVarName
- Consistently document the non-eq version as alias of the eq version
- Use B<> in the ports that have it (no-op, shorter)
- Fix oversight in COFF port that made the --no flag have the wrong
prefix
Differential Revision: https://reviews.llvm.org/D91640
`try ... catch` in an inline function produces `.gcc_except_table.*` in a COMDAT
group with GCC or newer Clang (since D83655). For --gc-sections, currently we
scan `.eh_frame` pieces and mark liveness of such a `.gcc_except_table.*` and
then the associated `.text.*` (if a member in a section group is retained, the
others should be retained as well).
Essentially all `.text.*` and `.gcc_except_table.*` compiled from inline
functions with `try ... catch` cannot be discarded by the imprecise
--gc-sections. Compared with the state before D83655, the output
`.gcc_except_table` is smaller (non-prevailing copies in COMDAT groups can now
be discarded) but `.text` may be larger, i.e. size regression.
This patch teaches the .eh_frame piece scanning code to not mark
`.gcc_except_table` in a section group, thus allow unused `.text.*` and
`.gcc_except_table.*` in a section group to be discarded.
Note, non-group `.gcc_except_table` can still not be discarded. That is the status quo.
Reviewed By: grimar, echristo
Differential Revision: https://reviews.llvm.org/D91579
Fixes PR48071
* The Rust compiler produces SHF_ALLOC `.debug_gdb_scripts` (which normally does not have the flag)
* `.debug_gdb_scripts` sections are removed from `inputSections` due to --strip-debug/--strip-all
* When processing --gc-sections, pieces of a SHF_MERGE section can be marked live separately
`=>` segfault when marking liveness of a `.debug_gdb_scripts` which is not split into pieces (because it is not in `inputSections`)
This patch circumvents the problem by not treating SHF_ALLOC ".debug*" as debug sections (to prevent --strip-debug's stripping)
(which is still useful on its own).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D91291
Input sections `.ctors/.ctors.N` may go to either the output section `.init_array` or the output section `.ctors`:
* output `.ctors`: currently we sort them by name. This patch changes to sort by priority from high to low. If N in `.ctors.N` is in the form of %05u, there is no semantic difference. Actually GCC and Clang do use %05u. (In the test `ctors_dtors_priority.s` and Gold's test `gold/testsuite/script_test_14.s`, we can see %03u, but they are not really produced by compilers.)
* output `.init_array`: users can provide an input section description `SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)` to mix `.init_array.*` and `.ctors.*`. This can make .init_array.N and .ctors.(65535-N) interchangeable.
With this change, users can mix `.ctors.N` and `.init_array.N` in `.init_array` (PR44698 and PR48096) with linker scripts. As an example:
```
SECTIONS {
.init_array : {
*(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*))
*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors)
}
} INSERT AFTER .fini_array;
SECTIONS {
.fini_array : {
*(SORT_BY_INIT_PRIORITY(.fini_array.* .dtors.*))
*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors)
}
} INSERT BEFORE .init_array;
```
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D91187
According to
https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html#Input-Section-Basics
for `*(.a .b)`, the order should match the input order:
* for `ld 1.o 2.o`, sections from 1.o precede sections from 2.o
* within a file, `.a` and `.b` appear in the section header table order
This patch implements the behavior. The interaction with `SORT*` and --sort-section is:
Matched sections are ordered by radix sort with the keys being `(SORT*, --sort-section, input order)`,
where `SORT*` (if present) is most significant.
> Note, multiple `SORT*` within an input section description has undocumented and
> confusing behaviors in GNU ld:
> https://sourceware.org/pipermail/binutils/2020-November/114083.html
> Therefore multiple `SORT*` is not the focus for this patch but
> this patch still strives to have an explainable behavior.
As an example, we partition `SORT(a.*) b.* c.* SORT(d.*)`, into
`SORT(a.*) | b.* c.* | SORT(d.*)` and perform sorting within groups. Sections
matched by patterns between two `SORT*` are sorted by input order. If
--sort-alignment is given, they are sorted by --sort-alignment, breaking tie by
input order.
This patch also allows a section to be matched by multiple patterns, previously
duplicated sections could occupy more space in the output and had erroneous zero bytes.
The patch is in preparation for support for
`*(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)) *(.init_array .ctors)`,
which will allow LLD to mix .ctors*/.init_array* like GNU ld (gold's --ctors-in-init-array)
PR44698 and PR48096
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D91127
The second `SORT` in `*(SORT(...) SORT(...))` is incorrectly parsed as a file pattern.
Fix the bug by stopping at `SORT*` in `readInputSectionsList`.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D91180
I noticed when running a large link with the --time-trace option that
there were several areas which were missing any specific time trace
categories (aside from the generic link/ExecuteLinker categories). This
patch adds new categories to fill most of the "gaps", or to provide more
detail than was previously provided.
Reviewed by: MaskRay, grimar, russell.gallop
Differential Revision: https://reviews.llvm.org/D90686