The R_PPC64_REL24 is used in function calls when the caller requires a
valid TOC pointer. If the callee shares the same TOC or does not clobber
the TOC pointer then a direct call can be made. If the callee does not
share the TOC a thunk must be added to save the TOC pointer for the caller.
Up until PC Relative was introduced all local calls on medium and large code
models were assumed to share a TOC. This is no longer the case because
if the caller requires a TOC and the callee is PC Relative then the callee
can clobber the TOC even if it is in the same DSO.
This patch is to add support for a TOC caller calling a PC Relative callee that
clobbers the TOC.
Reviewed By: sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D82950
The patch adds checking for various potential issues in parsing name
lookup tables and reporting them as recoverable errors, similarly as we
do for other tables.
Differential Revision: https://reviews.llvm.org/D83050
The parsing method did not check reading errors and might easily fall
into an infinite loop on an invalid input because of that.
Differential Revision: https://reviews.llvm.org/D83049
In the absence of TLS relaxation (rewrite of code sequences),
there is still an applicable optimization:
[gd]: General Dynamic: resolve DTPMOD to 1 and/or resolve DTPOFF statically
All the other relaxations are only performed when transiting to
executable (`!config->shared`).
Since [gd] is handled differently, we can fold `!config->shared` into canRelax
and simplify its use sites. Rename the variable to reflect to new semantics.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D83243
... to customize the tombstone value we use for an absolute relocation
referencing a discarded symbol. This can be used as a workaround when
some debug processing tool has trouble with current -1 tombstone value
(https://bugs.chromium.org/p/chromium/issues/detail?id=1102223#c11 )
For example, to get the current built-in rules (not considering the .debug_line special case for ICF):
```
-z dead-reloc-in-nonalloc='.debug_*=0xffffffffffffffff'
-z dead-reloc-in-nonalloc=.debug_loc=0xfffffffffffffffe
-z dead-reloc-in-nonalloc=.debug_ranges=0xfffffffffffffffe
```
To get GNU ld (as of binutils 2.35)'s behavior:
```
-z dead-reloc-in-nonalloc='*=0'
-z dead-reloc-in-nonalloc=.debug_ranges=1
```
This option has other use cases. For example, if we want to check
whether a non-SHF_ALLOC section has dead relocations.
With this patch, we can run a regular LLD and run another with a special
-z dead-reloc-in-nonalloc=, then compare their output.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D83264
In GNU ld, --no-relax can disable x86-64 GOTPCRELX relaxation.
It is not useful, so we don't implement it.
For RISC-V, --no-relax disables linker relaxations which have larger
impact.
Linux kernel specifies --no-relax when CONFIG_DYNAMIC_FTRACE is specified
(since http://git.kernel.org/linus/a1d2a6b4cee858a2f27eebce731fbf1dfd72cb4e ).
LLD has not implemented the relaxations, so this option is a no-op.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81359
The Symbol Table in LLD references the global object to add a symbol rather than adding it to itself.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D83184
Follow-up to D82899. Note, we need to disable R_DTPREL relaxation
because ARM psABI does not define TLS relaxation.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D83138
The location of a TLS variable is encoded as a DW_OP_const4u/DW_OP_const8u
followed by a DW_OP_push_tls_address (or DW_OP_GNU_push_tls_address https://sourceware.org/bugzilla/show_bug.cgi?id=11616 ).
This change follows up to D81784 and makes relocations types generalized as
R_DTPREL (e.g. R_X86_64_DTPOFF{32,64}, R_PPC64_DTPREL64) use -1 as the
tombstone value as well. This works for both TLS Variant I and Variant II
architectures.
* arm: .long tls(tlsldo) # not working currently (R_ARM_TLS_LDO32 is R_ABS)
* mips64: .dtpreldword tls+32768
* ppc64: .quad tls@DTPREL+0x8000
* riscv: neither GCC nor clang has implemented DW_AT_location. It is likely .long/.quad tls@dtprel+0x800
* x86-32: .long tls@DTPOFF
* x86-64: .long tls@DTPOFF; .quad tls@DTPOFF
tls has a non-negative st_value, so such relocations (st_value+addend)
never resolve to -1 in a normal (not discarded) case.
```
// clang -fuse-ld=lld -g -ffunction-sections a.c -Wl,--gc-sections
// foo and tls will be discarded by --gc-sections.
// DW_AT_location [DW_FORM_exprloc] (DW_OP_const8u 0xffffffffffffffff, DW_OP_GNU_push_tls_address)
thread_local int tls;
int foo() { return ++tls; }
int main() {}
```
Also, drop logic added in D26201 intended to address PR30793. It added a test
(gc-debuginfo-tls.s) using a non-SHF_ALLOC section and a local symbol, which
does not reflect the intended scenario: a relocation in a SHF_ALLOC section
referencing a discarded non-local symbol. For such a non .debug_* section, just
emit an error.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D82899
On Windows co-operative programs can be expected to open LLD's
output in FILE_SHARE_DELETE mode. This allows us to delete the
file (by moving it to a temporary filename and then deleting
it) so that we can link another output file that overwrites
the existing file, even if the current file is in use.
A similar strategy is documented here:
https://boostgsoc13.github.io/boost.afio/doc/html/afio/FAQ/deleting_open_files.html
Differential Revision: https://reviews.llvm.org/D82567
Previously, we only supported binding dysyms to the GOT. This
diff adds support for binding them to any arbitrary section. C++
programs appear to use this, I believe for vtables and type_info.
This diff also makes our bind opcode encoding a bit smarter -- we now
encode just the differences between bindings, which will make things
more compact.
I was initially concerned about the performance overhead of iterating
over these relocations, but it turns out that the number of such
relocations is small. A quick analysis of my llvm-project build
directory showed that < 1.3% out of ~7M relocations are RELOC_UNSIGNED
bindings to symbols (including both dynamic and static symbols).
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D83103
This patch adds a few extra cases to the existing testing for eh_frame
and eh_frame_hdr behaviour in LLD. They all come from a private
testsuite we are trying to migrate to lit.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D82852
With this, a simple hello world links against libSystem.tbd and the
old ld64.lld linker kind of works again with newer SDKs.
The motivation here is to have an arm64 cross linker that's good
enough to be able to run simple configure link checks on non-mac
systems for generating config.h files. Once -flavor darwinnew can
link arm64, we'll switch to that.
Summary:
ld64 does this, and references an internal rdar:// number as an explanation. No
idea what that rdar issue is, but in practice, it seems that not putting a BSS
section at the end can cause subsequent sections in the same segment to be
overwritten with zeroes.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81888
After D81784, we resolve a relocation in .debug_* referencing an ICF folded
section symbol to a tombstone value.
Doing this for .debug_line has a problem (https://reviews.llvm.org/D81784#2116925 ):
.debug_line may describe folded lines as having addresses UINT64_MAX or
some wraparound small addresses.
```
int foo(int x) {
return x; // line 2
}
int bar(int x) {
return x; // line 6
}
```
```
Address Line Column File ISA Discriminator Flags
------------------ ------ ------ ------ --- ------------- -------------
0x00000000002016c0 1 0 1 0 0 is_stmt
0x00000000002016c7 2 9 1 0 0 is_stmt
prologue_end
0x00000000002016ca 2 2 1 0 0
0x00000000002016cc 2 2 1 0 0 end_sequence
// UINT64_MAX and wraparound small addresses
0xffffffffffffffff 5 0 1 0 0 is_stmt
0x0000000000000006 6 9 1 0 0 is_stmt
prologue_end
0x0000000000000009 6 2 1 0 0
0x000000000000000b 6 2 1 0 0 end_sequence
0x00000000002016d0 9 0 1 0 0 is_stmt
0x00000000002016df 10 6 1 0 0 is_stmt prologue_end
0x00000000002016e6 11 11 1 0 0 is_stmt
...
```
These entries can confuse debuggers:
gdb before 2020-07-01 (binutils-gdb a8caed5d7faa639a1e6769eba551d15d8ddd9510 "Recognize -1 as a tombstone value in .debug_line")
(can't continue due to a breakpoint in an invalid region of memory):
```
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x6
```
lldb (breakpoint has no effect):
```
(lldb) b 6
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
```
This patch special cases .debug_line to not use the tombstone value,
restoring the previous behavior: .debug_line will have entries with the
same addresses (ICF) but different line numbers. A breakpoint on line 2
or 6 will trigger on both functions.
Reviewed By: dblaikie, jhenderson
Differential Revision: https://reviews.llvm.org/D82828
Include the archive name as well as the member name when an error
is encountered parsing bitcode archives.
Differential Revision: https://reviews.llvm.org/D82884
This reduces peak memory on my test case from 1960.14MB to 1700.63MB
(-260MB, -13.2%) with no measurable impact on CPU time. I'm currently
working with a publics stream that is about 277MB. Before this change,
we would allocate 277MB of heap memory, serialize publics into them,
hold onto that heap memory, open the PDB, and commit into it. After
this change, we defer the serialization until commit time.
In the last change I made to public writing, I re-sorted the list of
publics multiple times in place to avoid allocating new temporary data
structures. Deferring serialization until later requires that we don't
reorder the publics. Instead of sorting the publics, I partially
construct the hash table data structures, store a publics index in them,
and then sort the hash table data structures. Later, I replace the index
with the symbol record offset.
This change also addresses a FIXME and moves the list of global and
public records from GSIHashStreamBuilder to GSIStreamBuilder. Now that
publics aren't being serialized, it makes even less sense to store them
as a list of CVSymbol records. The hash table used to deduplicate
globals is moved as well, since that is specific to globals, and not
publics.
Reviewed By: aganea, hans
Differential Revision: https://reviews.llvm.org/D81296
D79300 forgot to change `getBuffer().empty()` in LazyObjFile::parse to
`fetched`. This caused incorrect iterating after the current LazyObjFile was
fetched. This issue is benign and can just cause loss of "undefined symbols"
and "backward reference" diagnostics.
Before D79300 `mb = {}` caused --warn-backrefs-exclude to be useless for
a fetched LazyObjFile.
Add two test cases.
The meaning of -shared and -pie are expected to be changed in the
future when Module Linking-style libraries are implemented. Begin
issuing warnings to give people a heads-up that they will be changing.
For compatibility with Emscripten, add a --experimental-pic flag which
disables these warnings.
Differential Revision: https://reviews.llvm.org/D81760
Fixes PR46420
Similar to D43307 for non-LTO.
Module-level inline assembly can use .symver to create a symbol with `@` in the name.
For relocatable output, @ should be retained in the symbol name. `@ver` should
not be parsed and dropped.
Reviewed By: grimar, psmith
Differential Revision: https://reviews.llvm.org/D82433
Add support for the 34bit relocation R_PPC64_GOT_PCREL34 for
PC Relative in LLD.
Reviewers: sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D81948
This is the followup to D77647 which implements handling for the new
R_AARCH64_PLT32 relocation type in lld. This relocation would benefit the
PIC-friendly vtables feature described in D72959.
Differential Revision: https://reviews.llvm.org/D81184
See D59553, https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html and
https://sourceware.org/pipermail/binutils/2020-May/111357.html for
extensive discussions on a tombstone value.
See http://www.dwarfstd.org/ShowIssue.php?issue=200609.1
(Reserve an address value for "not present") for a DWARF enhancement proposal.
We resolve such relocations to a tombstone value to indicate that the address is invalid.
This solves several problems (the normal behavior is to resolve the relocation to the addend):
* For an empty function in a collected section, a pair of (0,0) can
terminate .debug_loc and .debug_ranges (as of binutils 2.34, GNU ld
resolves such a relocation to 1 to avoid the .debug_ranges issue)
* If DW_AT_high_pc is sufficiently large, the address range can collide
with a regular code range of low address (https://bugs.llvm.org/show_bug.cgi?id=41124 )
* If a text section is folded into another by ICF, we may leave entries
in multiple CUs claiming ownership of the same range of code, which can
confuse consumers.
* Debug information associated with COMDAT sections can have problems
similar to ICF, but is more complex - thus not addressed by this patch.
For pre-DWARF-v5 .debug_loc and .debug_ranges, a pair of 0 can terminate
entries (invalidating subsequent ranges).
-1 is a reserved value with special meaning (base address selection entry) which can't be used either.
Use -2 instead.
For all other .debug_*, use UINT32_MAX for 32-bit targets and UINT64_MAX
for 64-bit targets. In the code, we intentionally use
`uint64_t tombstone = UINT64_MAX` for 32-bit targets as well: this matches
SignExtend64 as used in `relocateAlloc`. (Actually UINT32_MAX does not work for R_386_32)
Note 0, we only special case `target->symbolicRel` (R_X86_64_64, R_AARCH64_ABS64, R_PPC64_ADDR64), not
short-range absolute relocations (e.g. R_X86_64_32). Only forms like DW_FORM_addr need to be special cased.
They can hold an arbitrary address (must be 64-bit on a 64-bit target). (In theory,
producers can make use of small code model to emit 32-bit relocations. This doesn't seem to be leveraged.)
Note 1, we have to ignore the addend, because we don't want to resolve
DW_AT_low_pc (which may have a non-zero addend) to -1+addend (wrap
around to a low address):
__attribute__((section(".text.x"))) void f1() { }
__attribute__((section(".text.x"))) void f2() { } // DW_AT_low_pc has a non-zero addend
Note 2, if the prevailing copy does not have debugging information while
a non-prevailing copy has (partial debug build), we don't do extra work
to attach debugging information to the prevailing definition. (clang
has a lot of debug info optimizations that are on-by-default that assume
the whole program is built with debug info).
clang -c -ffunction-sections a.cc # prevailing copy has no debug info
clang -c -ffunction-sections -g b.cc
Reviewed By: dblaikie, avl, jhenderson
Differential Revision: https://reviews.llvm.org/D81784
Summary:
There were a few issues with the previous setup:
1. The section sorting comparator used a declarative map of section names to
determine the correct order, but it turns out we need to match on more than
just names -- in particular, an upcoming diff will sort based on whether the
S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but
flexible form.
2. We were sorting OutputSections stored in a MapVector, which left the
MapVector in an inconsistent state -- the wrong keys map to the wrong values!
In practice, we weren't doing key lookups (only container iteration) after the
sort, so this was fine, but it was still a dubious state of affairs. This diff
copies the OutputSections to a vector before sorting them.
3. We were adding unneeded OutputSections to OutputSegments and then filtering
them out later, which meant that we had to remember whether an OutputSegment
was in a pre- or post-filtered state. This diff only adds the sections to the
segments if they are needed.
In addition to those major changes, two minor ones worth noting:
1. I renamed all OutputSection variable names to `osec`, to parallel `isec`.
Previously we were using some inconsistent combination of `osec`, `os`, and
`section`.
2. I added a check (and a test) for InputSections with names that clashed with
those of our synthetic OutputSections.
Reviewers: #lld-macho
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81887
If neither AT(lma) nor AT>lma_region is specified,
D76995 keeps `lmaOffset` (LMA - VMA) if the previous section is in the
default LMA region.
This patch additionally checks that the two sections are in the same
memory region.
Add a test case derived from https://bugs.llvm.org/show_bug.cgi?id=45313
.mdata : AT(0xfb01000) { *(.data); } > TCM
// It is odd to make .bss inherit lmaOffset, because the two sections
// are in different memory regions.
.bss : { *(.bss) } > DDR
With this patch, section VMA/LMA match GNU ld. Note, GNU ld supports
out-of-order (w.r.t sh_offset) sections and places .text and .bss in the
same PT_LOAD. We don't have that behavior.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81986
Fixes PR46348.
ObjFile<ELFT>::initializeSymbols contains two symbol iteration loops:
```
for each symbol
if non-inheriting && non-local
fill in this->symbols[i]
for each symbol
if local
fill in this->symbols[i]
else
symbol resolution
```
Symbol resolution can trigger a duplicate symbol error which will call
InputSectionBase::getObjMsg to iterate over InputFile::symbols. If a
non-local symbol appears after the non-local symbol being resolved
(violating ELF spec), its `this->symbols[i]` entry has not been filled
in, InputSectionBase::getObjMsg will crash due to
`dyn_cast<Defined>(nullptr)`.
To fix the bug, reorganize the two loops to ensure this->symbols is
complete before symbol resolution. This enforces the invariant:
InputFile::symbols has none null entry when InputFile::getSymbols() is called.
```
for each symbol
if non-inheriting
fill in this->symbols[i]
for each symbol starting from firstGlobal
if non-local
symbol resolution
```
Additionally, move the (non-local symbol in local part of .symtab)
diagnostic from Writer<ELFT>::copyLocalSymbols() to initializeSymbols().
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D81988
Some projects use the constructor attribute on functions that also
return values. In this case we just ignore them.
The error was reported in the libgpg-error project that marks
gpg_err_init with the `__constructor__` attribute.
Differential Revision: https://reviews.llvm.org/D81962
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable).
Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding.
For more information see PR36198 and D43002.
Differential Revision: https://reviews.llvm.org/D80833