Printing pass manager invocations is fairly verbose and not super
useful.
This allows us to remove DebugLogging from pass managers and PassBuilder
since all logging (aside from analysis managers) goes through
instrumentation now.
This has the downside of never being able to print the top level pass
manager via instrumentation, but that seems like a minor downside.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D101797
GNU as documentation states that a `.thumb_func` directive implies `.thumb`, teach the asm parser to switch mode whenever it's encountered. On the other hand the labeled form, exclusive to Apple's toolchain, doesn't switch mode at all.
Reviewed By: nickdesaulniers, peter.smith
Differential Revision: https://reviews.llvm.org/D101975
This fixes an issue where mixed TOC / NOTOC calls can call the incorrect
thunks if a previous thunk already exists. The issue appears when a TOC
funciton calls a NOTOC callee and then a different NOTOC function calls the same
NOTOC callee. In this case the linker would sometimes incorrectly call the
same thunk for both cases.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101837
Adopt my suggestion in https://reviews.llvm.org/D91426#2653926 ,
generalizing the ppc64 specific code.
GNU ld and glibc ld.so has a contract about the first few entries of .got .
There are somewhat complex conditions when the header is needed. This patch
switches to a simpler approach: add a header unconditionally if
_GLOBAL_OFFSET_TABLE_ is used or the number of entries is more than just the
header.
GNU ld -r can create .rela.eh_frame with unordered r_offset values.
(With LLD, we can craft such a case by reordering sections in .eh_frame.)
This is currently unsupported and will trigger
`assert(pieces[i].inputOff <= off ...` in `OffsetGetter::get`
(the content is corrupted in a -DLLVM_ENABLE_ASSERTIONS=off build).
This patch supports this case.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D101116
This is the same problem as 127176e59e, but for static TLS rather than
dynamic TLS. Although we know the symbol will be the one in our own TLS
segment, and thus the offset of it within that, we don't know where in
the static TLS block our data will be allocated and thus we must emit a
dynamic relocation for this case.
Reviewed By: MaskRay, atanasyan
Differential Revision: https://reviews.llvm.org/D101381
Whilst not wrong (unless using static PIE where the relocations are
likely not implemented by the runtime), this is inefficient, as the TLS
module indices and offsets are independent of the executable's load
address.
Reviewed By: MaskRay, atanasyan
Differential Revision: https://reviews.llvm.org/D101382
An unfetched lazy symbol (undefined weak) should be considered to have its
original versionId which is VER_NDX_GLOBAL, instead of the lazy symbol's
versionId. (The original versionId cannot be non-VER_NDX_GLOBAL because a
undefined versioned symbol is an error.)
The regression was introduced in D77280 when making version scripts work
with lazy symbols fetched by LTO calls.
Fix PR49915
Differential Revision: https://reviews.llvm.org/D100624
Fix PR49897: if `__real_foo` has the isUsedInRegularObj bit set, we need to
retain `foo` in .symtab, even if `foo` is undefined. The new behavior will match
GNU ld.
Before the patch, we produced an R_X86_64_JUMP_SLOT relocation referencing the
index 0 undefined symbol, which would be erroed by glibc
(see f96ff3c0f8).
While here, fix another bug: if `__wrap_foo` does not exist, its initial binding
should be `foo`'s.
Change the default to facilitate GC for metadata section usage, so that they
don't need SHF_LINK_ORDER or SHF_GROUP just to drop the unhelpful rule (if they
want to be unconditionally retained, use SHF_GNU_RETAIN
(`__attribute__((retain))`) or linker script `KEEP`).
The dropped SHF_GROUP special case makes the behavior of -z start-stop-gc and -z
nostart-stop-gc closer to GNU ld>=2.37 (https://sourceware.org/PR27451).
However, we default to -z start-stop-gc (which actually matches more closely to
GNU ld before 2015-10 https://sourceware.org/PR19167), which is different from
modern GNU ld (which has the unhelpful rule to work around glibc). As a
compensation, we special case `__libc_` sections as a workaround for glibc<2.34
(https://sourceware.org/PR27492).
Since -z start-stop-gc as the default actually matches the traditional GNU ld
behavior, there isn't much to be aware of. There was a systemd usage which has
been fixed by https://github.com/systemd/systemd/pull/19144
The `e_flags` for a ELF file targeting the AVR ISA contains two fields at the time of writing:
- A 7-bit integer field specifying the ISA revision being targeted
- A 1-bit flag specifying whether the object files being linked are suited for applying the relaxations at link time
The linked ELF file is blessed with the arch revision shared among all the files.
The behaviour in case of mismatch is purposefully different than the one implemented in libbfd: LLD will raise a fatal error while libbfd silently picks a default value of `avr2`.
The relaxation-ready flag is handled as done by libbfd, in order for it to appear in the linked object every source object must be tagged with it.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99754
chmod tries to be very helpful on some platforms and prevent naive mistakes, by warning the user. This results in the following error during the test:
```chmod: ...resolution-err.ll.tmp.resolution.txt: new permissions are r--rw-rw-, not r--r--r--```
To fix the test, call chmod with u.
Differential Revision: https://reviews.llvm.org/D100417
From the PowerPC ELFv2 ABI section 4.2.3. Global Offset Table.
```
The GOT consists of an 8-byte header that contains the TOC base (the first TOC
base when multiple TOCs are present), followed by an array of 8-byte addresses.
```
Due to the introduction of PC Relative code it is now possible to require a GOT
without having a .TOC. symbol in the object that is being linked. Since LLD uses
the .TOC. symbol to determine whether or not a GOT is required the GOT header is
not setup correctly and the 8-byte header is missing.
This patch allows the Power PC GOT setup to happen when an element is added to
the GOT instead of at the very begining. When this header is added a .TOC.
symbol is also added.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D91426
There is a bug when initial exec is relaxed to local exec.
In the following situation:
InitExec.c
```
extern __thread unsigned TGlobal;
unsigned getConst(unsigned*);
unsigned addVal(unsigned, unsigned*);
unsigned GetAddrT() {
return addVal(getConst(&TGlobal), &TGlobal);
}
```
Def.c
```
__thread unsigned TGlobal;
unsigned getConst(unsigned* A) {
return *A + 3;
}
unsigned addVal(unsigned A, unsigned* B) {
return A + *B;
}
```
The problem is in InitExec.c but Def.c is required if you want to link the example and see the problem.
To compile everything:
```
clang -O3 -mcpu=pwr10 -c InitExec.c
clang -O3 -mcpu=pwr10 -c Def.c
ld.lld InitExec.o Def.o -o IeToLe
```
If you objdump the problem object file:
```
$ llvm-objdump -dr --mcpu=pwr10 InitExec.o
```
you will get the following assembly:
```
0000000000000000 <GetAddrT>:
0: a6 02 08 7c mflr 0
4: f0 ff c1 fb std 30, -16(1)
8: 10 00 01 f8 std 0, 16(1)
c: d1 ff 21 f8 stdu 1, -48(1)
10: 00 00 10 04 00 00 60 e4 pld 3, 0(0), 1
0000000000000010: R_PPC64_GOT_TPREL_PCREL34 TGlobal
18: 14 6a c3 7f add 30, 3, 13
0000000000000019: R_PPC64_TLS TGlobal
1c: 78 f3 c3 7f mr 3, 30
20: 01 00 00 48 bl 0x20
0000000000000020: R_PPC64_REL24_NOTOC getConst
24: 78 f3 c4 7f mr 4, 30
28: 30 00 21 38 addi 1, 1, 48
2c: 10 00 01 e8 ld 0, 16(1)
30: f0 ff c1 eb ld 30, -16(1)
34: a6 03 08 7c mtlr 0
38: 00 00 00 48 b 0x38
0000000000000038: R_PPC64_REL24_NOTOC addVal
```
The lines of interest are:
```
10: 00 00 10 04 00 00 60 e4 pld 3, 0(0), 1
0000000000000010: R_PPC64_GOT_TPREL_PCREL34 TGlobal
18: 14 6a c3 7f add 30, 3, 13
0000000000000019: R_PPC64_TLS TGlobal
1c: 78 f3 c3 7f mr 3, 30
```
Which once linked gets turned into:
```
10010210: ff ff 03 06 00 90 6d 38 paddi 3, 13, -28672, 0
10010218: 00 00 00 60 nop
1001021c: 78 f3 c3 7f mr 3, 30
```
The problem is that register 30 is never set after the optimization.
Therefore it is not correct to relax the above instructions by replacing
the add instruction with a nop.
Instead the add instruction should be replaced with a copy (mr) instruction.
If the add uses the same resgiter as input and as ouput then it is safe to
continue to replace the add with a nop.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D95262
`--shuffle-sections=<seed>` applies to all sections. The new
`--shuffle-sections=<section-glob>=<seed>` makes shuffling selective. To the
best of my knowledge, the option is only used as debugging, so just drop the
original form.
`--shuffle-sections '.init_array*=-1'` `--shuffle-sections '.fini_array*=-1'`.
reverses static constructors/destructors of the same priority.
Useful to detect some static initialization order fiasco.
`--shuffle-sections '.data*=-1'`
reverses `.data*` sections. Useful to detect unfunded pointer comparison results
of two unrelated objects.
If certain sections have an intrinsic order, the old form cannot be used.
Differential Revision: https://reviews.llvm.org/D98679
If the number of sections changes, which is common for re-links after
incremental updates, the section order may change drastically.
Special case -1 to reverse input sections. This is a stable transform.
The section order is more resilient to incremental updates. Usually the
code issue (e.g. Static Initialization Order Fiasco, assuming pointer
comparison result of two unrelated objects) is due to the relative order
between two problematic input files A and B. Checking the regular order
and the reversed order is sufficient.
Differential Revision: https://reviews.llvm.org/D98445
GNU ld supports `.` and `$` in symbol names while LLD doesn't support them in
`readPrimary` expressions. Using `.` can result in such an error:
```
https://github.com/ClangBuiltLinux/linux/issues/1318
ld.lld: error: ./arch/powerpc/kernel/vmlinux.lds:255: malformed number: .TOC.
>>> __toc_ptr = (DEFINED (.TOC.) ? .TOC. : ADDR (.got)) + 0x8000;
```
Allow `.` (ppc64 special symbol `.TOC.`) and `$` (RISC-V special symbol `__global_pointer$`).
Change `diag[3-5].test` to use an invalid character `^`.
Note: GNU ld allows `~` in non-leading positions of a symbol name. `~`
is not used in practice, conflicts with the unary operator, and can
cause some parsing difficulty, so this patch does not add it.
Differential Revision: https://reviews.llvm.org/D98306
A `local:` version node in a version script can change the effective symbol binding
to STB_LOCAL. The linker needs to communicate the fact to enable WPD
(otherwise LTO does not know that the `!vcall_visibility` metadata has
effectively changed from VCallVisibilityPublic to VCallVisibilityLinkageUnit).
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D98220
This changes the target data layout to make stack align to 16 bytes
on Power10. Before this change, stack was being aligned to 32 bytes.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D96265
Implemented the option to omit Power10 instructions from save stubs via the
option --no-power10-stubs or --power10-stubs=no on lld. --power10-stubs= will
override the other option. --power10-stubs=auto also exists to use the default
behaviour (ie allow Power10 instructions in stubs).
Differential Revision: https://reviews.llvm.org/D94627
llvm-objdump only uses one MCInstrAnalysis object, so if ARM and Thumb
code is mixed in one object, or if an object is disassembled without
explicitly setting the triple to match the ISA used, then branch and
call targets will be printed incorrectly.
This could be fixed by creating two MCInstrAnalysis objects in
llvm-objdump, like we currently do for SubtargetInfo. However, I don't
think there's any reason we need two separate sub-classes of
MCInstrAnalysis, so instead these can be merged into one, and the ISA
determined by checking the opcode of the instruction.
Differential revision: https://reviews.llvm.org/D97766
In AArch32 ARM, the PC reads two instructions ahead of the currently
executiing instruction. This evaluates to 8 in ARM state and 4 in
Thumb state. Branch instructions on AArch32 compensate for this by
subtracting the PC bias from the addend. For a branch to symbol this
will result in an addend of -8 in ARM state and -4 in Thumb state.
The existing ARM Target::inBranchRange function accounted for this
implict addend within the function meaning that if the addend were
to be taken into account by the caller then it would be double
counted. This complicates the interface for all Targets as callers
wanting to account for addends had to account for the ARM PC-bias.
In certain situations such as:
https://github.com/ClangBuiltLinux/linux/issues/1305
the PC-bias compensation code didn't match up. In particular
normalizeExistingThunk() didn't put the PC-bias back in as Arm
thunks did not store the addend.
The simplest fix for the problem is to add the PC bias in
normalizeExistingThunk when restoring the addend. However I think
it is worth refactoring the Arm inBranchRange implementation so
that fewer calls to getPCBias are needed for other Targets. I
wasn't able to remove getPCBias completely but hopefully the
Relocations.cpp code is simpler now.
In principle a test could be written to replicate the linux kernel
build failure but I wasn't able to reproduce with a small example
that I could build up from scratch.
Fixes https://github.com/ClangBuiltLinux/linux/issues/1305
Differential Revision: https://reviews.llvm.org/D97550
Also a couple of minor cleanups in merge-string.s:
- fix inconsistent use of tabs
- use `.p2align` rather than `.align` since `.p2align` works the
same on all platforms (the meaning of align seems to differ
between platforms according to `AlignmentIsInBytes`.
I noticed these potential cleanups while porting SHF_STRINGS support to
wasm-ld.
Differential Revision: https://reviews.llvm.org/D97647
For one metadata section usage, each text section references a metadata section.
The metadata sections have a C identifier name to allow the runtime to collect them via `__start_/__stop_` symbols.
Since `__start_`/`__stop_` references are always present from live sections, the
C identifier name sections appear like GC roots, which means they cannot be
discarded by `ld --gc-sections`.
To make such sections GCable, either SHF_LINK_ORDER or a section group is needed.
SHF_LINK_ORDER is not suitable for the references can be inlined into other functions
(See D97430:
Function A (in the section .text.A) references its `__sancov_guard` section.
Function B inlines A (so now .text.B references `__sancov_guard` - this is invalid with the semantics of SHF_LINK_ORDER).
In the linking stage,
if `.text.A` gets discarded, and `__sancov_guard` is retained via the reference from `.text.B`,
the output will be invalid because `__sancov_guard` references the discarded `.text.A`.
LLD errors "sh_link points to discarded section".
)
A section group have size overhead, and is cumbersome when there is just one metadata section.
Add `-z start-stop-gc` to drop the "__start_/__stop_ references retain
non-SHF_LINK_ORDER non-SHF_GROUP C identifier name sections" rule.
We reserve the rights to switch the default in the future.
Reviewed By: phosek, jrtc27
Differential Revision: https://reviews.llvm.org/D96914
The special root semantics for identifier-named sections is meant
specifically for the metadata sections. In the context of group
semantics, where group members are always retained or discarded as a
unit, it's natural not to have this semantics apply to a section in a
group, otherwise we would never discard the group defeating the purpose
of using the group in the first place.
This change modifies the GC behavior so that __start_/__stop_ references
don't retain C identifier named sections in section groups which allows
for these groups to be collected. This matches the behavior of BFD ld.
The only kind of existing case that might break is interdependent
metadata sections that are all in a group together, but that group
doesn't contain any other sections referenced by anything except
implicit inclusion in a `__start_` and/or `__stop_`-referenced
identifier-named section, but such cases should be unlikely.
Differential Revision: https://reviews.llvm.org/D96753
ST_Data is used to model BFD `BFD_OBJECT`.
A STT_TLS symbol does not have the `BFD_OBJECT` flag in BFD.
This makes sense because a STT_TLS symbol is like in a different address space,
normal data/object properties do not apply on them.
With this change, a STT_TLS symbol will not be displayed as 'O'.
This new behavior matches objdump.
Differential Revision: https://reviews.llvm.org/D96735
Adds a lld test for a case that the handling added for dynamically
exported symbols in 1487747e99 already
fixes. Because isExportDynamic returns true when the symbol is
SharedKind with default visibility, it will treat as dynamically
exported and block devirtualization when the definition of a vtable
comes from a shared library. This is desireable as it is dangerous to
devirtualize in that case, since there could be hidden overrides in the
shared library. Typically that happens when the shared library header
contains available externally definitions, which applications can
override. An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.
The regular LTO case in the new test already worked, but there are
2 fixes in this patch needed for the index-only case and the hybrid
LTO case. For the index-only case, WPD should not simply ignore
available externally vtables. A follow on fix will be made to clang to
emit type metadata for those vtables, which the new test is modeling.
For the hybrid case, we need to ensure when the module is split that any
llvm.*used globals are cloned to the regular LTO split module so
available externally vtable definitions are not prematurely deleted.
Another follow on fix will add the equivalent gold test, which requires
a small fix to the plugin to treat symbols in dynamic libraries the same
way lld already is.
Differential Revision: https://reviews.llvm.org/D96721
This change introduces support for zero flag ELF section groups to lld.
lld already supports COMDAT sections, which in ELF are a special type of
ELF section groups. These are generally useful to enable linker GC where
you want a group of sections to always travel together, that is to be
either retained or discarded as a whole, but without the COMDAT
semantics. Other ELF linkers already support zero flag ELF section
groups and this change helps us reach feature parity.
Differential Revision: https://reviews.llvm.org/D96636
As we don't sort local symbols, don't sort non-local symbols. This makes
non-local symbols appear in their register order, which matches GNU as. The
register order is nice in that you can write tests with interleaved CHECK
prefixes, e.g.
```
// CHECK: something about foo
.globl foo
foo:
// CHECK: something about bar
.globl bar
bar:
```
With the lexicographical order, the user needs to place lexicographical smallest
symbol first or keep CHECK prefixes in one place.
When parsing an object file, LLD interleaves undefined symbol resolution (which
may recursively fetch other lazy objects) with defined symbol resolution.
This may lead to surprising results, e.g. if an object file defines currently
undefined symbols and references another lazy symbol, we may interleave defined
symbols with the lazy fetch, potentially leading to the defined symbols
resolving to different files.
As an example, if both `a.a(a.o)` and `a.a(b.o)` define `foo` (not in COMDAT
group, or in different COMDAT groups) and `__profd_foo` (in COMDAT group
`__profd_foo`). LLD may resolve `foo` to `a.a(a.o)` and `__profd_foo` to
`b.a(b.o)`, i.e. different files.
```
parse ArchiveFile a.a
entry fetches a.a(a.o)
parse ObjectFile a.o
define entry
define foo
reference b
b fetches a.a(b.o)
parse ObjectFile b.o
define prevailing __profd_foo
define (ignored) non-prevailing __profd_foo
```
Assuming a set of interconnected symbols are defined all or none in several lazy
objects. Arguably making them resolve to the same file is preferable than making
them resolve to different files (some are lazy objects).
The main argument favoring the new behavior is the stability. The relative order
between a defined symbol and an undefined symbol does not change the symbol
resolution behavior. Only the relative order between two undefined symbols can
affect fetching behaviors.
---
The real world case is reduced from a Fuchsia PGO usage: `a.a(a.o)` has a
constructor within COMDAT group C5 while `a.a(b.o)` has a constructor within
COMDAT group C2. Because they use different group signatures, they are not
de-duplicated. It is not entirely whether Clang behavior is entirely conforming.
LLD selects the PGO counter section (`__profd_*`) from `a.a(b.o)` and the
constructor section from `a.a(a.o)`. The `__profd_*` is a SHF_LINK_ORDER section
linking to its own non-prevailing constructor section, so LLD errors
`sh_link points to discarded section`. This patch fixes the error.
Differential Revision: https://reviews.llvm.org/D95985
`extern const bfd_target aarch64_elf64_le_vec;` is a variable in BFD.
It was somehow misused as an emulation by Android.
```
% aarch64-linux-gnu-ld -m aarch64_elf64_le_vec a.o
aarch64-linux-gnu-ld: unrecognised emulation mode: aarch64_elf64_le_vec
Supported emulations: aarch64linux aarch64elf aarch64elf32 aarch64elf32b aarch64elfb armelf armelfb aarch64linuxb aarch64linux32 aarch64linux32b armelfb_linux_eabi armelf_linux_eabi
```
Acked by Stephen Hines, who removed the flag from Android a while back.
Rewritting the path of the sample profile file in response.txt to be relative to the repro tar.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D96193
GNU as does not sort local symbols. This has several advantages:
* The .symtab order is roughly the symbol occurrence order.
* The closest preceding STT_SECTION symbol is the definition of a local symbol.
* The closest preceding STT_FILE symbol is the defining file of a local symbol, if there are multiple default-version .file directives. (Not implemented in MC.)
A SHF_LINK_ORDER .gcc_except_table is similar to a .gcc_except_table in
a section group. The associated text section is responsible for retaining it.
LLD still does not support GC of non-group non-SHF_LINK_ORDER .gcc_except_table -
but that is not necessary because we can teach the compiler to set SHF_LINK_ORDER.
The current diagnostic has confused users. The new wording is adapted from one suggested by Ian Lance Taylor.
Differential Revision: https://reviews.llvm.org/D95917
binutils 2.36 introduced the new section flag SHF_GNU_RETAIN (for ELFOSABI_GNU &
ELFOSABI_FREEBSD) to mark a sections as a GC root. Several LLVM side toolchain
folks (including me) were involved in the design process of SHF_GNU_RETAIN and
were happy with this proposal.
Currently GNU ld only respects SHF_GNU_RETAIN semantics for ELFOSABI_GNU &
ELFOSABI_FREEBSD object files
(https://sourceware.org/bugzilla/show_bug.cgi?id=27282). GNU ld sets EI_OSABI
to ELFOSABI_GNU for relocatable output
(https://sourceware.org/bugzilla/show_bug.cgi?id=27091). In practice the single
value EI_OSABI is neither a good indicator for object file compatibility, nor a
useful mechanism marking used ELF extensions.
For input, we respect SHF_GNU_RETAIN semantics even for ELFOSABI_NONE object
files. This is compatible with how LLD and GNU ld handle (mildly useful) STT_GNU_IFUNC
/ (emitted by GCC, considered misfeature by some folks) STB_GNU_UNIQUE input.
(As of LLVM 12.0.0, the integrated assembler does not set ELFOSABI_GNU for
STT_GNU_IFUNC/STB_GNU_UNIQUE).
Arguably STT_GNU_IFUNC/STB_GNU_UNIQUE probably need indicators in object files
but SHF_GNU_RETAIN is more likely accepted by more OSABI platforms.
For output, we take a step further than GNU ld: we don't promote ELFOSABI_NONE
to ELFOSABI_GNU for all output.
Differential Revision: https://reviews.llvm.org/D95749
In GCC emitted .debug_info sections, R_386_GOTOFF may be used to
relocate DW_AT_GNU_call_site_value values
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98946).
R_386_GOTOFF (`S + A - GOT`) is one of the `isStaticLinkTimeConstant` relocation
type which is not PC-relative, so it can be used from non-SHF_ALLOC sections. We
current allow new relocation types as needs come. The diagnostic has caught some
bugs in the past.
Differential Revision: https://reviews.llvm.org/D95994
On z/OS, other error messages are not matched correctly in lit tests.
```
EDC5121I Invalid argument.
EDC5111I Permission denied.
```
This patch adds a lit substitution to fix it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D95808
The option catches incompatibility between `R_*_IRELATIVE` and DT_TEXTREL/DF_TEXTREL
before glibc 2.29. Newer glibc versions are more common nowadays and I don't
think this option has ever been used. Diagnosing this problem is also
straightforward by reading the stack trace.
On z/OS, the following error message is not matched correctly in lit tests.
```
EDC5129I No such file or directory.
```
This patch uses a lit config substitution to check for platform specific error messages.
Reviewed By: muiez, jhenderson
Differential Revision: https://reviews.llvm.org/D95246
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.
Differential Revision: https://reviews.llvm.org/D91583
A default version (@@) is only available for defined symbols.
Currently we use "@@" for undefined symbols too.
This patch fixes the issue and improves our test case.
Differential revision: https://reviews.llvm.org/D95219
Whilst migrating/retiring some downstream testing, I came across a test
for weak undef IE and LD TLS references, but was unable to find any
equivalent in LLD's upstream testing. There does seem to be some slight
subtle differences that could be worth testing compared to LE TLS
references, in particular that IE can be relaxed to LE in this case,
hence this change.
Differential Revision: https://reviews.llvm.org/D95124
Reviewed by: grimar, MaskRay
If foo is referenced in any object file, bitcode file or shared object,
`__wrap_foo` should be retained as the redirection target of sym
(f96ff3c0f8).
If the object file defining foo has foo references, we cannot easily distinguish
the case from cases where foo is not referenced (we haven't scanned
relocations). Retain `__wrap_foo` because we choose to wrap sym references
regardless of whether sym is defined to keep non-LTO/LTO/relocatable links' behaviors similar
https://sourceware.org/bugzilla/show_bug.cgi?id=26358 .
If foo is defined in a shared object, `__wrap_foo` can still be omitted
(`wrap-dynamic-undef.s`).
Reviewed By: andrewng
Differential Revision: https://reviews.llvm.org/D95152
Fixes PR48523. When the linker errors with "output file too large",
one question that comes to mind is how the section sizes differ from
what they were previously. Unfortunately, this information is lost
when the linker exits without writing the output file. This change
makes it so that the error message includes the sizes of the largest
sections.
Reviewed By: MaskRay, grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D94560
This makes the following improvements.
For `SHT_GNU_versym`:
* yaml2obj: set `sh_link` to index of `.dynsym` section automatically.
For `SHT_GNU_verdef`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version definitions.
For `SHT_GNU_verneed`:
* yaml2obj: set `sh_link` to index of `.dynstr` section automatically.
* yaml2obj: set `sh_info` field automatically.
* obj2yaml: don't dump the `Info` field when its value matches the number of version dependencies.
Also, simplifies few test cases.
Differential revision: https://reviews.llvm.org/D94956
In 8031785f4a the temporary object being built was moved to %t/main.o,
but not all run lines were updated to reflect this. Observe the failure
on this buildbot:
http://lab.llvm.org:8011/#/builders/5/builds/3646/steps/9/logs/stdio
It might pass locally for some people due to a stale %t.o hanging around
the build directory.
```
// a.s
jmp fcntl
// b.s
.globl fcntl
fcntl:
ret
```
`ld.lld -shared --wrap=fcntl a.o b.o` has an `R_X86_64_JUMP_SLOT` referencing
the index 0 undefined symbol, which will cause a glibc `symbol lookup error` at
runtime. This is because `__wrap_fcntl` is not in .dynsym
We use an approximation `!wrap->isUndefined()`, which doesn't set
`isUsedInRegularObj` of `__wrap_fcntl` when `fcntl` is referenced and
`__wrap_fcntl` is undefined.
Fix this by using `sym->referenced`.
R_PPC64_ADDR16_HI represents bits 16-31 of a 32-bit value
R_PPC64_ADDR16_HIGH represents bits 16-31 of a 64-bit value.
In the Linux kernel, `LOAD_REG_IMMEDIATE_SYM` defined in `arch/powerpc/include/asm/ppc_asm.h`
uses @l, @high, @higher, @highest to load the 64-bit value of a symbol.
Fixes https://github.com/ClangBuiltLinux/linux/issues/1260
The commit 18aa0be36e changed the default GotBaseSymInGotPlt to true
for AArch64. This is different than binutils, where
_GLOBAL_OFFSET_TABLE_ points at the start or .got.
It seems to not intefere with current relocations used by LLVM. However
as indicated by PR#40357 [1] gcc generates R_AARCH64_LD64_GOTPAGE_LO15
for -pie (in fact it also generated the relocation for -fpic).
This change is requires to correctly handle R_AARCH64_LD64_GOTPAGE_LO15
by lld from objects generated by gcc.
[1] https://bugs.llvm.org/show_bug.cgi?id=40357
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match successfully.
```
EDC5129I No such file or directory.
```
Reviewed By: muiez
Differential Revision: https://reviews.llvm.org/D94239
Fixes PR48693: --emit-relocs keeps relocation sections. --gdb-index drops
.debug_gnu_pubnames and .debug_gnu_pubtypes but not their relocation sections.
This can cause a null pointer dereference in `getOutputSectionName`.
Also delete debug-gnu-pubnames.s which is covered by gdb-index.s
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D94354
Fixes PR48681: after LTO, lto.tmp may reference a libcall symbol not in an IR
symbol table of any bitcode file. If such a symbol is defined in an archive
matched by a --exclude-libs, we don't correctly localize the symbol.
Add another `excludeLibs` after `compileBitcodeFiles` to localize such libcall
symbols. Unfortunately we have keep the existing one for D43126.
Using VER_NDX_LOCAL is an implementation detail of `--exclude-libs`, it does not
necessarily tie to the "localize" behavior. `local:` patterns in a version
script can be omitted.
The `symbol ... has undefined version ...` error should not be exempted.
Ideally we should error as GNU ld does. https://issuetracker.google.com/issues/73020933
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D94280
This character indicates that when return pointer authentication is
being used, the function signs the return address using the B key.
Differential Revision: https://reviews.llvm.org/D93954
Add support for linking powerpcle code in LLD.
Rewrite lld/test/ELF/emulation-ppc.s to use a shared check block and add powerpcle tests.
Update tests.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93917
Every basic block section symbol created by -fbasic-block-sections will contain
".__part." to know that this symbol corresponds to a basic block fragment of
the function.
This patch solves two problems:
a) Like D89617, we want function symbols with suffixes to be properly qualified
so that external tools like profile aggregators know exactly what this
symbol corresponds to.
b) The current basic block naming just adds a ".N" to the symbol name where N is
some integer. This collides with how clang creates __cxx_global_var_init.N.
clang creates these symbol names to call constructor functions and basic
block symbol naming should not use the same style.
Fixed all the test cases and added an extra test for __cxx_global_var_init
breakage.
Differential Revision: https://reviews.llvm.org/D93082
For x86-64, D33100 added a diagnostic for local-exec TLS relocations referencing a preemptible symbol.
This patch generalizes it to non-preemptible symbols (see `-Bsymbolic` in `tls.s`)
on all targets.
Local-exec TLS relocations resolve to offsets relative to a fixed point within
the static TLS block, which are only meaningful for the executable.
With this change, `clang -fpic -shared -fuse-ld=bfd a.c` on the following example will be flagged for AArch64/ARM/i386/x86-64/RISC-V
```
static __attribute__((tls_model("local-exec"))) __thread long TlsVar = 42;
long bump() { return ++TlsVar; }
```
Note, in GNU ld, at least arm, riscv and x86's ports have the similar
diagnostics, but aarch64 and ppc64 do not error.
Differential Revision: https://reviews.llvm.org/D93331
Alternative to D91611.
The TLS General Dynamic/Local Dynamic code sequences need to mark
`__tls_get_addr` with R_PPC64_TLSGD or R_PPC64_TLSLD, e.g.
```
addis r3, r2, x@got@tlsgd@ha # R_PPC64_GOT_TLSGD16_HA
addi r3, r3, x@got@tlsgd@l # R_PPC64_GOT_TLSGD16_LO
bl __tls_get_addr(x@tlsgd) # R_PPC64_TLSGD followed by R_PPC64_REL24
nop
```
However, there are two deviations form the above:
1. direct call to `__tls_get_addr`. This is essential to implement ld.so in glibc/musl/FreeBSD.
```
bl __tls_get_addr
nop
```
This is only used in a -shared link, and thus not subject to the GD/LD to IE/LE
relaxation issue below.
2. Missing R_PPC64_TLSGD/R_PPC64_TLSGD for compiler generated TLS references
According to Stefan Pintille, "In the early days of the transition from the
ELFv1 ABI that is used for big endian PowerPC Linux distributions to the ELFv2
ABI that is used for little endian PowerPC Linux distributions, there was some
ambiguity in the specification of the relocations for TLS. The GNU linker has
implemented support for correct handling of calls to __tls_get_addr with a
missing relocation. Unfortunately, we didn't notice that the IBM XL compiler
did not handle TLS according to the updated ABI until we tried linking XL
compiled libraries with LLD."
In short, LLD needs to work around the old IBM XL compiler issue.
Otherwise, if the object file is linked in -no-pie or -pie mode,
the result will be incorrect because the 4 instructions are partially
rewritten (the latter 2 are not changed).
Work around the compiler bug by disable General Dynamic/Local Dynamic to
Initial Exec/Local Exec relaxation. Note, we also disable Initial Exec
to Local Exec relaxation for implementation simplicity, though technically it can be kept.
ppc64-tls-missing-gdld.s demonstrates the updated behavior.
Reviewed By: #powerpc, stefanp, grimar
Differential Revision: https://reviews.llvm.org/D92959
We need to make sure not to emit R_X86_64_GOTPCRELX relocations for
instructions that use a REX prefix. If a REX prefix is present, we need to
instead use a R_X86_64_REX_GOTPCRELX relocation. The existing logic for
CALL64m, JMP64m, etc. already handles this by checking the HasREX parameter
and using it to determine which relocation type to use. Do this for all
instructions that can use relaxed relocations.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93561
Currently, `ELFFile<ELFT>::getEntry` does not check an index of
an entry. Because of that the code might read past the end of the symbol
table silently. I've added a test to `llvm-readobj\ELF\relocations.test`
to demonstrate the possible issue. Also, I've added a unit test for
this method.
After this change, `getEntry` stops reporting the section index and
reuses the `getSectionContentsAsArray` method, which already has
all the validation needed. Our related warnings now provide
more and better context sometimes.
Differential revision: https://reviews.llvm.org/D93209
As indicated by AArch64 ELF specification, symbols with st_other
marked with STO_AARCH64_VARIANT_PCS indicates it may follow a variant
procedure call standard with different register usage convention
(for instance SVE calls).
Static linkers must preserve the marking and propagate it to the dynamic
symbol table if any reference or definition of the symbol is marked with
STO_AARCH64_VARIANT_PCS, and add a DT_AARCH64_VARIANT_PCS dynamic tag if
there are R_<CLS>_JUMP_SLOT relocations that reference that symbols.
It implements https://bugs.llvm.org/show_bug.cgi?id=48368.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93045
Similar to D77853. Change ADRP to print the target address in hex, instead of the raw immediate.
The behavior is similar to GNU objdump but we also include `0x`.
Note: GNU objdump is not consistent whether or not to emit `0x` for different architectures. We try emitting 0x consistently for all targets.
```
GNU objdump: adrp x16, 10000000
Old llvm-objdump: adrp x16, #0
New llvm-objdump: adrp x16, 0x10000000
```
`adrp Xd, 0x...` assembles to a relocation referencing `*ABS*+0x10000` which is not intended. We need to use a linker or use yaml2obj.
The main test is `test/tools/llvm-objdump/ELF/AArch64/pcrel-address.yaml`
Differential Revision: https://reviews.llvm.org/D93241
Fix PR48357: If .rela.dyn appears as an output section description, its type may
be SHT_RELA (due to the empty synthetic .rela.plt) while there is no input
section. The empty .rela.dyn may be retained due to a reference in a linker
script. Don't crash.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D93367
Follow the naming set by TI's own GCC-based toolchain.
Also, force the `osabi` field to `ELFOSABI_STANDALONE`, this matches GNU LD's output (the patching is done in `elf32_msp430_post_process_headers`).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92931
The original tests have unneeded symbols and copy-relocation-zero-abs-addr.s
does not actually test anything.
Rewrite them and add copy-relocation-zero-addr.s instead.
Add --soname=b so that the address 0x203400 will be stable. (When linking an
executable with %t.so, the path %t.so will be recorded in the DT_NEEDED entry if
%t.so doesn't have DT_SONAME. .dynstr will have varying lengths on different
systems.)
This test may fail if there is a new changes to this tests.
The archives are not deleted so the contents from the previous test run
may affect the contents for the current run,
so this will require cleaning up the Output dir or force build of buildbot.
The fix is to put all the objects in the temporary dir that we cleanup every run,
to avoid run-2-run flaky failures.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D93128
Normally we should not delete options. However, the Clang driver passes
`-plugin-opt={new,legacy}-pass-manager` instead of
`--[no-]lto-legacy-pass-manager` (`-plugin-opt=new-pass-manager` has been used
since 7.0), and it is unlikely anyone will use the `--lto-*` style options directly.
So let's rename them to be consistent with the Clang driver option names.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D92988
-DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=on configured LLD and LLVMgold.so
will use the new pass manager by default. Add an option to
use the legacy pass manager. This will also be used by the Clang driver
when -fno-new-pass-manager (D92915) / -fno-experimental-new-pass-manager is set.
Reviewed By: aeubanks, tejohnson
Differential Revision: https://reviews.llvm.org/D92916
This patch changes the archive handling to enable the semantics needed
for legacy FORTRAN common blocks and block data. When we have a COMMON
definition of a symbol and are including an archive, LLD will now
search the members for global/weak defintions to override the COMMON
symbol. The previous LLD behavior (where a member would only be included
if it satisifed some other needed symbol definition) can be re-enabled with the
option '-no-fortran-common'.
Differential Revision: https://reviews.llvm.org/D86142
This makes the llvm-objdump output much more readable and closer to binutils objdump. This builds on D76591
It requires changing the OperandType for certain immediates to "OPERAND_PCREL" so tablegen will generate code to pass the instruction's address. This means we can't do the generic check on these instructions in verifyInstruction any more. Should I add it back with explicit opcode checks? Or should we add a new operand flag to control the passing of address instead of matching the name?
Differential Revision: https://reviews.llvm.org/D92147
This reverts a side effect introduced in the code cleanup patch D43571:
LLD started to emit empty output sections that are explicitly assigned to a segment.
This patch fixes the issue by removing the !sec.phdrs.empty() special case from
isDiscardable. As compensation, we add an early phdrs propagation step (see the inline comment).
This is similar to one that we do in adjustSectionsAfterSorting.
Differential revision: https://reviews.llvm.org/D92301
If an object file has an undefined foo@v1, we emit a dynamic symbol foo.
This is incorrect if at runtime a shared object provides the non-default version foo@v1
(the undefined foo may bind to foo@@v2, for example).
GNU ld issues an error for this case, even if foo@v1 is undefined weak
(https://sourceware.org/bugzilla/show_bug.cgi?id=3351). This behavior makes
sense because to represent an undefined foo@v1, we have to construct a Verneed
entry. However, without knowing the defining filename, we cannot construct a
Verneed entry (Verneed::vn_file is unavailable).
This patch implements the error.
Depends on D92258
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D92260
The symbol resolution rules for versioned symbols are:
* foo@@v1 (default version) resolves both undefined foo and foo@v1
* foo@v1 (non-default version) resolves undefined foo@v1
Note, foo@@v1 must be defined (the assembler errors if attempting to
create an undefined foo@@v1).
For defined foo@@v1 in a shared object, we call `SymbolTable::addSymbol` twice,
one for foo and the other for foo@v1. We don't do the same for object files, so
foo@@v1 defined in one object file incorrectly does not resolve a foo@v1
reference in another object file.
This patch fixes the issue by reusing the --wrap code to redirect symbols in
object files. This has to be done after processing input files because
foo and foo@v1 are two separate symbols if we haven't seen foo@@v1.
Add a helper `Symbol::getVersionSuffix` to retrieve the optional trailing
`@...` or `@@...` from the possibly truncated symbol name.
Depends on D92258
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D92259
Test the symbol resolution related to
* defined foo@@v1 and foo@v1 in object files/shared objects
* undefined foo@v1
* weak foo@@v1 and foo@v1
* visibility
* interaction with --wrap.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D92258
This is the #1 of 2 changes that make remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
hotness threshold from profile summary with special value 'auto'.
This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept remarks hotness threshold. Update all the tools that use it with remarks
hotness threshold options:
* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='
Differential Revision: https://reviews.llvm.org/D85809
clang may produce `movl x@GOTPCREL+4(%rip), %eax` when loading the high 32 bits
of the address of a global variable in -fpic/-fpie mode.
If assembled by GNU as, the fixup emits an R_X86_64_GOTPCRELX with an
addend != -4. The instruction loads from the GOT entry with an offset
and thus it is incorrect to relax the instruction.
If assembled by the integrated assembler, we emit R_X86_64_GOTPCREL for
relocations that definitely cannot be relaxed (D92114), so this patch is not
needed.
This patch disables the relaxation, which is compatible with the implementation in GNU ld
("Add R_X86_64_[REX_]GOTPCRELX support to gas and ld").
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D91993
Enables overriding earlier --lto-whole-program-visibility.
Variant of D91583 while discussing alternate ways to identify and
handle the --export-dynamic case.
Differential Revision: https://reviews.llvm.org/D92060
This patch:
- adds an ld64.lld.darwinnew symlink for lld, to go with f2710d4b57,
so that `clang -fuse-ld=lld.darwinnew` can be used to test new
Mach-O lld while it's in bring-up. (The expectation is that we'll
remove this again once new Mach-O lld is the defauld and only Mach-O
lld.)
- lets the clang driver know if the linker is lld (currently
only triggered if `-fuse-ld=lld` or `-fuse-ld=lld.darwinnew` is
passed). Currently only used for the next point, but could be used
to implement other features that need close coordination between
compiler and linker, e.g. having a diag for calling `clang++` instead
of `clang` when link errors are caused by a missing C++ stdlib.
- lets the clang driver pass `-demangle` to Mach-O lld (both old and
new), in addition to ld64
- implements -demangle for new Mach-O lld
- changes demangleItanium() to accept _Z, __Z, ___Z, ____Z prefixes
(and updates one test added in D68014). Mach-O has an extra
underscore for symbols, and the three (or, on Mach-O, four)
underscores are used for block names.
Differential Revision: https://reviews.llvm.org/D91884
As mentioned in https://reviews.llvm.org/D67479#1667256 ,
* `--[no-]allow-shlib-undefined` control the diagnostic for an unresolved symbol in a shared object
* `-z defs/-z undefs` control the diagnostic for an unresolved symbol in a regular object file
* `--unresolved-symbols=` controls both bits.
In addition, make --warn-unresolved-symbols affect --no-allow-shlib-undefined.
This patch makes the behavior match GNU ld.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D91510
`try ... catch` in an inline function produces `.gcc_except_table.*` in a COMDAT
group with GCC or newer Clang (since D83655). For --gc-sections, currently we
scan `.eh_frame` pieces and mark liveness of such a `.gcc_except_table.*` and
then the associated `.text.*` (if a member in a section group is retained, the
others should be retained as well).
Essentially all `.text.*` and `.gcc_except_table.*` compiled from inline
functions with `try ... catch` cannot be discarded by the imprecise
--gc-sections. Compared with the state before D83655, the output
`.gcc_except_table` is smaller (non-prevailing copies in COMDAT groups can now
be discarded) but `.text` may be larger, i.e. size regression.
This patch teaches the .eh_frame piece scanning code to not mark
`.gcc_except_table` in a section group, thus allow unused `.text.*` and
`.gcc_except_table.*` in a section group to be discarded.
Note, non-group `.gcc_except_table` can still not be discarded. That is the status quo.
Reviewed By: grimar, echristo
Differential Revision: https://reviews.llvm.org/D91579
Fixes PR48071
* The Rust compiler produces SHF_ALLOC `.debug_gdb_scripts` (which normally does not have the flag)
* `.debug_gdb_scripts` sections are removed from `inputSections` due to --strip-debug/--strip-all
* When processing --gc-sections, pieces of a SHF_MERGE section can be marked live separately
`=>` segfault when marking liveness of a `.debug_gdb_scripts` which is not split into pieces (because it is not in `inputSections`)
This patch circumvents the problem by not treating SHF_ALLOC ".debug*" as debug sections (to prevent --strip-debug's stripping)
(which is still useful on its own).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D91291
Input sections `.ctors/.ctors.N` may go to either the output section `.init_array` or the output section `.ctors`:
* output `.ctors`: currently we sort them by name. This patch changes to sort by priority from high to low. If N in `.ctors.N` is in the form of %05u, there is no semantic difference. Actually GCC and Clang do use %05u. (In the test `ctors_dtors_priority.s` and Gold's test `gold/testsuite/script_test_14.s`, we can see %03u, but they are not really produced by compilers.)
* output `.init_array`: users can provide an input section description `SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)` to mix `.init_array.*` and `.ctors.*`. This can make .init_array.N and .ctors.(65535-N) interchangeable.
With this change, users can mix `.ctors.N` and `.init_array.N` in `.init_array` (PR44698 and PR48096) with linker scripts. As an example:
```
SECTIONS {
.init_array : {
*(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*))
*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors)
}
} INSERT AFTER .fini_array;
SECTIONS {
.fini_array : {
*(SORT_BY_INIT_PRIORITY(.fini_array.* .dtors.*))
*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors)
}
} INSERT BEFORE .init_array;
```
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D91187
According to
https://sourceware.org/binutils/docs/ld/Input-Section-Basics.html#Input-Section-Basics
for `*(.a .b)`, the order should match the input order:
* for `ld 1.o 2.o`, sections from 1.o precede sections from 2.o
* within a file, `.a` and `.b` appear in the section header table order
This patch implements the behavior. The interaction with `SORT*` and --sort-section is:
Matched sections are ordered by radix sort with the keys being `(SORT*, --sort-section, input order)`,
where `SORT*` (if present) is most significant.
> Note, multiple `SORT*` within an input section description has undocumented and
> confusing behaviors in GNU ld:
> https://sourceware.org/pipermail/binutils/2020-November/114083.html
> Therefore multiple `SORT*` is not the focus for this patch but
> this patch still strives to have an explainable behavior.
As an example, we partition `SORT(a.*) b.* c.* SORT(d.*)`, into
`SORT(a.*) | b.* c.* | SORT(d.*)` and perform sorting within groups. Sections
matched by patterns between two `SORT*` are sorted by input order. If
--sort-alignment is given, they are sorted by --sort-alignment, breaking tie by
input order.
This patch also allows a section to be matched by multiple patterns, previously
duplicated sections could occupy more space in the output and had erroneous zero bytes.
The patch is in preparation for support for
`*(SORT_BY_INIT_PRIORITY(.init_array.* .ctors.*)) *(.init_array .ctors)`,
which will allow LLD to mix .ctors*/.init_array* like GNU ld (gold's --ctors-in-init-array)
PR44698 and PR48096
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D91127
The second `SORT` in `*(SORT(...) SORT(...))` is incorrectly parsed as a file pattern.
Fix the bug by stopping at `SORT*` in `readInputSectionsList`.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D91180
This covers a few cases that aren't otherwise tested:
1) Non-ascii symbol names are ordered.
2) Comments, whitespace and blank lines are trimmed.
3) Missing order files result in an error.
Reviewed by: MaskRay, grimar
Differential Revision: https://reviews.llvm.org/D90933
Add a calling convention called amdgpu_gfx for real function calls
within graphics shaders. For the moment, this uses the same calling
convention as other calls in amdgpu, with registers excluded for return
address, stack pointer and stack buffer descriptor.
Differential Revision: https://reviews.llvm.org/D88540
Make it possible for lld users to provide a custom script that would help to
find missing libraries. A possible scenario could be:
% clang /tmp/a.c -fuse-ld=lld -loauth -Wl,--error-handling-script=/tmp/addLibrary.py
unable to find library -loauth
looking for relevant packages to provides that library
liboauth-0.9.7-4.el7.i686
liboauth-devel-0.9.7-4.el7.i686
liboauth-0.9.7-4.el7.x86_64
liboauth-devel-0.9.7-4.el7.x86_64
pix-1.6.1-3.el7.x86_64
Where addLibrary would be called with the missing library name as first argument
(in that case addLibrary.py oauth)
Differential Revision: https://reviews.llvm.org/D87758
In the presence of a gap, the st_value field of a STT_SECTION symbol is the
address of the first input section (incorrect if there is a gap). Set it to the
output section address instead.
In -r mode, this bug can cause an incorrect non-zero st_value of a STT_SECTION
symbol (while output sections have zero addresses, input sections may have
non-zero outSecOff). The non-zero st_value can cause the final link to have
incorrect relocation computation (both GNU ld and LLD add st_value of the
STT_SECTION symbol to the output section address).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D90520
While MC did not produce R_X86_64_GOTPCRELX for test/binop instructions
(movl/adcl/addl/andl/...) before the previous commit, this code path has been
exercised by -fno-integrated-as for GNU as since 2016: -no-pie relaxing
may incorrectly access loc[-3] and produce a corrupted instruction.
Simply handle test/binop R_X86_64_GOTPCRELX like R_X86_64_GOTPCREL.
This partially reverts D85994.
In glibc, elf/dl-sym.c calls the raw `__tls_get_addr` by specifying the
tls_index parameter. Such a call does not have a pairing R_PPC64_TLSGD/R_PPC64_TLSLD.
This is legitimate. Since we cannot distinguish the benign case from cases due
to toolchain issues, we have to be permissive.
Acked by Stefan Pintilie
Add support to LLD for PC Relative Thread Local Storage for Local Dynamic.
This patch adds support for two relocations: R_PPC64_GOT_TLSLD_PCREL34 and
R_PPC64_DTPREL34.
The Local Dynamic code is:
```
pla r3, x@got@tlsld@pcrel R_PPC64_GOT_TLSLD_PCREL34
bl __tls_get_addr@notoc(x@tlsld) R_PPC64_TLSLD
R_PPC64_REL24_NOTOC
...
paddi r9, r3, x@dtprel R_PPC64_DTPREL34
```
After relaxation to Local Exec:
```
paddi r3, r13, 0x1000
nop
...
paddi r9, r3, x@dtprel R_PPC64_DTPREL34
```
Reviewed By: NeHuang, sfertile
Differential Revision: https://reviews.llvm.org/D87504
These are all inspired by existing test coverage we have in an internal
testsuite.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D89775
For a diagnostic `A refers to B` where B refers to a bitcode file, if the
symbol gets optimized out, the user may see `A refers to <internal>`; if the
symbol is retained, the user may see `A refers to lto.tmp`.
Save the reference InputFile * in the DenseMap so that the original filename is
available in reportBackrefs().
The ELF spec says
> If the sh_flags field for this section header includes the attribute SHF_INFO_LINK, then this member represents a section header table index.
Set SHF_INFO_LINK so that binary manipulation tools know that sh_info is
a section header table index instead of (the number of local symbols in the case of SHT_SYMTAB/SHT_DYNSYM).
We have already added SHF_INFO_LINK for --emit-relocs retained SHT_REL[A].
For example, we can teach llvm-objcopy to preserve the section index of the sh_info referenced section if
SHF_INFO_LINK is set. (GNU objcopy recognizes .rel[a].plt and updates
sh_info even if SHF_INFO_LINK is not set).
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D89828
The combination has not been tested before. In the case of ICF,
`e.section->getVA(0)` equals the start address of the output section.
This can cause incorrect overlapping with the actual function at the
start of the output section and potentially trigger a GDB internal error
in `dw2_find_pc_sect_compunit_symtab` (presumably because:
if a short address range incorrectly starts at the start address of the
output section, GDB may pick it instead of the correct longer address
range. When mapping an address within the long address range but
out of the scope of the short address range, the routine may find
nothing - while the code asserts that it can find something).
Note that in the case of ICF there may be duplicate address range entries,
but GDB appears to be fine with them.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D89751
ICF was not able to merge equivalent sections because of relocations to
sections ineligible for ICF that use alternative symbols, e.g. symbol
aliases or section relative relocations.
Merging in this scenario has been enabled by giving the sections that
are ineligible for ICF a unique ID, i.e. an equivalence class of their
own. This approach also provides another benefit as it improves the
hashing that is used to perform the initial equivalance grouping for
ICF. This is because the ICF ineligible sections can now contribute a
unique value towards the hashes instead of the same value of zero. This
has been seen to reduce link time with ICF by ~68% for objects compiled
with -fprofile-instr-generate.
In order to facilitate this use of a unique ID, the existing
inconsistent approach to the setting of the InputSection eqClass in ICF
has been changed so that there is a clear distinction between the
eqClass values of ICF eligible sections and those of the ineligible
sections that have a unique ID. This inconsistency could have caused
incorrect equivalence class equality in the past, although it appears
that no issues were encountered in actual use.
Differential Revision: https://reviews.llvm.org/D88830
Similar to D66992.
In GNU ld, a -u specified symbol is a STB_DEFAULT undefined.
It cannot be changed to STB_WEAK by a later STB_WEAK undefined in a regular object file.
The behavior is consistent with our model because -u means "we need to fetch a lazy definition".
It should not be altered just because there is also a STB_WEAK undefined.
Note, our -u semantics are still different from GNU ld (https://github.com/ClangBuiltLinux/linux/issues/515):
we don't force the specified symbol to appear in .symtab This is a deliberate decision.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D88945
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.
The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd) R_PPC64_TLSGD
R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.
Reviewed By: NeHuang, sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D87318
Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic.
The compiler will produce code that looks like:
```
pla r3, x@got@tlsgd@pcrel R_PPC64_GOT_TLSGD_PCREL34
bl __tls_get_addr@notoc(x@tlsgd) R_PPC64_TLSGD
R_PPC64_REL24_NOTOC
```
LLD should be able to correctly compute the relocation for R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible:
General Dynamic to Local Exec:
```
paddi r3, r13, x@tprel
nop
```
and General Dynamic to Initial Exec:
```
pld r3, x@got@tprel@pcrel
add r3, r3, r13
```
Note:
This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic.
The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence or and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence.
Reviewed By: NeHuang, sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D87318
When adding an archive member with a problem, e.g. a new bitcode with an
old archiver, containing an unsupported attribute, or an ELF file with a
malformed symbol table, the archiver would throw away the error and
simply add the member to the archive without any symbol entries. This
meant that the resultant archive could be silently unusable when not
using --whole-archive, and result in unexpected undefined symbols.
This change fixes this issue by addressing two FIXMEs and only throwing
away not-an-object errors. However, this meant that some LLD tests which
didn't need symbol tables and were using invalid members deliberately to
test the linker's malformed input handling no longer worked, so this
patch also stops the archiver from looking for symbols in an object if
it doesn't require a symbol table, and updates the tests accordingly.
Differential Revision: https://reviews.llvm.org/D88288
Reviewed by: grimar, rupprecht, MaskRay
The routing rules are:
sym -> __wrap_sym
__real_sym -> sym
__wrap_sym and sym are routing targets, so they need to be exposed to the symbol
table. __real_sym is not and can be eliminated if not used by regular object.
The R2 save stub will now support offsets up to 64 bits.
There are three cases that will be used.
1) The offset fits in 26 bits.
```
b <26 bit offset>
```
2) The offset does not fit in 26 bits but fits in 34 bits.
```
paddi r12, 0, <34 bit offset>, 1
mtctr r12
bctr
```
3) The offset does not fit in 34 bits. Since this is an R2 save stub we can use
the TOC in R2. We are not loading the offset but the actual address we want to
branch to.
```
addis r12, r2, <address in TOC lo>
ld r12 <address in TOC hi>(r12)
mtctr r12
bctr
```
In case 1) the stub is only 8 bytes while in cases 2) and 3) the stub will be
20 bytes.
Reviewed By: MaskRay, sfertile, NeHuang
Differential Revision: https://reviews.llvm.org/D87916