I attempted to remove it 1 or 2 year ago but kept it just to have a good
diagnostic in case the output section is nullptr (should be impossible).
It is long enough that we haven't seen such a case.
Fix r285764: there is no guarantee that Out::first is placed before other
static data members of `struct Out`. After `bufferStart` was introduced, this
out-of-bounds write is destined in many compilers. It is likely benign, though.
And move `Out::elfHeader->size` assignment beside `Out::elfHeader->sectionIndex`
For -z separate-code and -z separate-loadable-segments:
When RW is present, the RX to RW transition is aligned with max-page-size.
When RW is absent, the RX to non-SHF_ALLOC transition should use max-page-size as well.
Currently, LLD does not support the complete set of ARM group relocations.
Given that I intend to start using these in the Linux kernel [0], let's add
support for these.
This implements the group processing as documented in the ELF psABI. Notably,
this means support is dropped for very far symbol references that also carry a
small component, where the immediate is rotated in such a way that only part of
it wraps to the other end of the 32-bit word. To me, it seems unlikely that
this is something anyone could be relying on, but of course I could be wrong.
[0] https://lore.kernel.org/r/20211122092816.2865873-8-ardb@kernel.org/
Reviewed By: peter.smith, MaskRay
Differential Revision: https://reviews.llvm.org/D114172
This allows --power10-stubs= and --[no-]power10-stubs to override each other
(they are position dependent in GNU ld).
Also improve --help messages and the manpage.
Note: GNU ld's default "auto" mode uses heuristics to decide whether Power10
instructions are used. Arguably it is a design mistake of R_PPC64_REL24_NOTOC
(acked by the relevant folks on a libc-alpha discussion). We don't implement
"auto", so the default --power10-stubs is the same as "yes".
The canonical term is "extract" (GNU ld documentation, Solaris's `-z *extract`
options). Avoid inventing a term and match --why-extract. (ld64 prefers "load"
but the word is overloaded too much)
Mostly MFC, except for --help messages and the header row in
--print-archive-stats output.
BaseCommand was picked when PHDRS/INSERT/etc were not implemented. Rename it to
SectionCommand to match `sectionCommands` and make it clear that the commands
are used in SECTIONS (except a special case for SymbolAssignment).
Also, improve naming of some BaseCommand variables (base -> cmd).
This partially reverts r315409: the description applies to LinkerScript, but not
to OutputSection.
The name "sectionCommands" is used in both LinkerScript::sectionCommands and
OutputSection::sectionCommands, which may lead to confusion.
"commands" in OutputSection has no ambiguity because there are no other types
of commands.
The attribute 'r' allows (or disallows for the negative case) read-only
sections, i.e. ones without the SHF_WRITE flag, to be assigned to the
memory region. Before the patch, lld could put a section in the wrong
region or fail with "error: no memory region specified for section".
Differential Revision: https://reviews.llvm.org/D113771
The current TLSDESC optimization code assumes:
```
leaq x@tlsdesc(%rip), %rax
call *x@tlscall(%rax) # adjacent
```
From https://gitlab.freedesktop.org/mesa/mesa/-/issues/5665 , it seems that the
two instructions may not be adjacent in GCC 10's output:
```
leaq x@tlsdesc(%rip), %rax
something else
call *x@tlscall(%rax)
```
This patch supports the case. While here, support non-RAX registers for
R_X86_64_GOTPC32_TLSDESC, in case the compiler generates inefficient:
```
leaq x@tlsdesc(%rip), %rcx # or %rdx, %rbx, %rdi, ...
movq %rcx, %rax
call *x@tlscall(%rax) # GNU ld/gold error for non-RAX
```
Differential Revision: https://reviews.llvm.org/D114416
The section symbols aren't of much practical use when looking at
a linked image. This shrinks one observed mingw style unstripped
binary by 14%.
IMAGE_SYM_CLASS_LABEL is in spirit the same as a temporary assembler
label that isn't emitted on the object file level at all.
Differential Revision: https://reviews.llvm.org/D113866
std::vector can have different sizes depending on the STL's debug level,
so account for its size separately. (You could argue that we should be
accounting for all the other members separately as well, but that would
be very unergonomic, and std::vector is the only one that's caused
problems so far.)
Follup-up to D107533, where we replaced local syms with non-local.
It doesn't make sense to replace local symbol with lazy.
Differential Revision: https://reviews.llvm.org/D110040
Fix a null pointer dereference when .got.plt is discarded.
This also adds a test for discarding `.plt`.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D114180
Follow-up to https://reviews.llvm.org/D112643. Even after that change, we were
still asserting if two separate functions that are eligible for ICF (same size,
same data, same number of relocs, same reloc types, ...) referred to
Undefineds. This fixes that oversight.
Differential Revision: https://reviews.llvm.org/D114195
When aligning the start address of an output section introduces a gap between the current dot pointer
and the new aligned address, we were already properly expanding the memory region, if available.
D74286 introduced a new behavior to also align the LMA address if an LMA region is specified.
However, this did not expand the corresponding LMA region.
Now, we also expand the LMA region if it is set.
This fixes PR52510.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D114166
ld64 doesn't warn on builds using `-install_name` if it's a bundle. But, the
current warning is nice to have because `install_name` only works with dylib.
To prevent an overflow of warnings in build logs and have parity with ld64,
create a `--warn-dylib-install-name` and `--warn-no-dylib-install-name` flag
that enables this LLD specific warning.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D113534
In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here.
Differential Revision: https://reviews.llvm.org/D114017
As discussed in https://reviews.llvm.org/D113809#3128636. It's a bit
unfortunate to move the asserts away from the structs whose sizes
they're checking, but it's a far better developer experience when one of
the asserts is violated, because you get a single error instead of every
single source file including the header erroring out.
The `r_address` field of `relocation_info` is only 4 bytes, so our
offset field (which is the `r_address` field adjusted for subsection
splitting) also only needs to be 4 bytes. This reduces the structure
size from 32 bytes to 24 bytes.
Combined with https://reviews.llvm.org/D113813, this is a minor perf
improvement for linking an internal app, tested on two machines:
```
smol-relocs baseline difference (95% CI)
sys_time 7.367 ± 0.138 7.543 ± 0.157 [ +0.9% .. +3.8%]
user_time 21.843 ± 0.351 21.861 ± 0.450 [ -1.3% .. +1.4%]
wall_time 20.301 ± 0.307 20.556 ± 0.324 [ +0.1% .. +2.4%]
samples 16 16
smol-relocs baseline difference (95% CI)
sys_time 2.923 ± 0.050 2.992 ± 0.018 [ +1.4% .. +3.4%]
user_time 10.345 ± 0.039 10.448 ± 0.023 [ +0.8% .. +1.2%]
wall_time 12.068 ± 0.071 12.229 ± 0.021 [ +1.0% .. +1.7%]
samples 15 12
```
More importantly though, this change by itself reduces our maximum
resident set size by 220 MB (2.75%, from 7.85 GB to 7.64 GB) on the
first machine. On the second machine, it reduces it by 125 MB (1.94%,
from 6.31 GB to 6.19 GB).
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113818
We can lay out Symbol more optimally to reduce its size from 56 bytes to
48 bytes by eliminating unnecessary padding, and we can lay out Defined
such that its bitfield members are placed in the tail padding of Symbol
(on ABIs which support this), to reduce it from 96 bytes to 80 bytes (8
bytes from the Symbol reduction, and 8 bytes from the tail padding
reuse).
This is perf-neutral for an internal app (results from two different
machines):
```
smol-syms baseline difference (95% CI)
sys_time 7.430 ± 0.202 7.440 ± 0.193 [ -2.6% .. +2.9%]
user_time 21.443 ± 0.513 21.206 ± 0.396 [ -3.3% .. +1.1%]
wall_time 20.453 ± 0.534 20.222 ± 0.488 [ -3.7% .. +1.5%]
samples 9 8
smol-syms baseline difference (95% CI)
sys_time 3.011 ± 0.050 3.040 ± 0.052 [ -0.4% .. +2.3%]
user_time 10.416 ± 0.075 10.496 ± 0.091 [ +0.1% .. +1.4%]
wall_time 12.229 ± 0.144 12.354 ± 0.192 [ -0.1% .. +2.1%]
samples 14 13
```
However, on the first machine, it reduces maximum resident set size by
65.9 MB (0.8%, from 7.92 GB to 7.85 GB). On the second machine, it
reduces it by 92 MB (1.4%, from 6.40 GB to 6.31 GB).
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113813
It was checking for 64-bit builds incorrectly. Unfortunately,
ConcatInputSection has grown a bit in the meantime, and I don't see any
obvious way to shrink it. Perhaps icfEqClass could use 32-bit hashes
instead of 64-bit ones, but xxHash64 is supposed to be much faster than
xxHash32 (https://github.com/Cyan4973/xxHash#benchmarks), so that sounds
like a loss. (Unrelatedly, we should really look at using XXH3 instead
of xxHash64 now.)
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113809
This is an NFC diff that prepares for pruning & relocating `__eh_frame`.
Along the way, I made the following changes to ...
* clarify usage of `section` vs. `subsection`
* remove `map` & `vec` from type names
* disambiguate class `Section` from template parameter `SectionHeader`.
Differential Revision: https://reviews.llvm.org/D113241
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with merged in these comments.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D113903
Non-allocatable sections are not part of the memory image of the
program, so there is no need to find memory regions for them either
matching properties or handling explicit assignments. The early test
and return help to simplify LinkerScript::findMemoryRegion() a bit.
Differential Revision: https://reviews.llvm.org/D113768
```
/Users/ksmiley/dev/llvm-project/lld/MachO/Symbols.cpp:43:27: warning: field 'external' will be initialized after field 'weakDefCanBeHidden' [-Wreorder-ctor]
weakDef(isWeakDef), external(isExternal),
^
1 warning generated.
```
Differential Revision: https://reviews.llvm.org/D113823
autohide symbols behaves similarly to private_extern symbols.
However, LD64 allows exporting autohide symbols. LLD currently does not.
This patch allows LLD to export them.
Differential Revision: https://reviews.llvm.org/D113167
(Split from D113167)
Benchmarking on one of our large apps which exports a few thousands symbols,
this showed an improvement of ~17%.
x ./LLD_no_parallel.txt
+ ./LLD_with_parallel.txt
N Min Max Median Avg Stddev
x 10 84.01 89.41 88.64 87.693 1.7424061
+ 10 71.9 74.29 72.63 72.753 0.77734663
Difference at 95.0% confidence
-14.94 +/- 1.26763
-17.0367% +/- 1.44553%
(Student's t, pooled s = 1.34912)
(wallclock)
Differential Revision: https://reviews.llvm.org/D113820