Sorting the prefixes by decreasing frequency can improve performance.
.gcc_except_table is relatively frequent, so move it ahead.
.ctors and .dtors mostly disappear and should be the last.
SHT_GNU_verdef is typically small, so it's unnecessary to reserve the vector.
While here, fix a hypothetical issue when SHT_GNU_verdef has non-increasing
version indexes, which don't happen with GNU ld, gold, ld.lld's output.
My x86-64 lld executable is 256 bytes smaller.
sizeof(ObjFile<ELF64LE>) is decreased from 344 to 272 on an ELF64 system.
In a large link with 30000 ObjFiles, this may be 2+MiB saving.
Change std::vector members to SmallVector, and std::string members to
SmallString<0> (these members typically don't benefit from small string optimization).
On Linux x86-64 the lld executable is ~6k smaller.
(Fixed an issue about GOT on a copy relocated alias.)
(Fixed an issue about not creating r_addend=0 IRELATIVE for unreferenced non-preemptible ifunc.)
The idea is to make scanRelocations mark some actions are needed (GOT/PLT/etc)
and postpone the real work to postScanRelocations. It gives some flexibility:
* Make it feasible to support .plt.got (PR32938): we need to know whether GLOB_DAT and JUMP_SLOT are both needed.
* Make non-preemptible IFUNC handling slightly cleaner: avoid setting/clearing sym.gotInIgot
* -z nocopyrel: report all copy relocation places for one symbol
* Make GOT deduplication feasible
* Make parallel relocation scanning feasible (if we can avoid all stateful operations and make Symbol attributes atomic), but parallelism may not be the appealing choice
Since this patch moves a large chunk of code out of ELFT templates. My x86-64
executable is actually a few hundred bytes smaller.
For ppc32-ifunc-nonpreemptible-pic.s: I remove absolute relocation references to non-preemptible ifunc
because absolute relocation references are incorrect in -fpie mode.
Reviewed By: peter.smith, ikudrin
Differential Revision: https://reviews.llvm.org/D114783
If a copy related symbol (say `copy`) is referenced in two .o
files, this change removes a duplicated line from the -Map output:
```
202470 202470 1 1 .bss.rel.ro
202470 202470 1 1 <internal>:(.bss.rel.ro)
202470 202470 1 1 copy
removed 202470 202470 1 1 copy
```
Differential Revision: https://reviews.llvm.org/D115697
needsPltAddr is equivalent to `needsCopy && isFunc`. In many places, it is
equivalent to `needsCopy` because the non-STT_FUNC cases are ruled out.
Reviewed By: ikudrin, peter.smith
Differential Revision: https://reviews.llvm.org/D115603
(Fixed an issue about GOT on a copy relocated alias.)
The idea is to make scanRelocations mark some actions are needed (GOT/PLT/etc)
and postpone the real work to postScanRelocations. It gives some flexibility:
* Make it feasible to support .plt.got (PR32938): we need to know whether GLOB_DAT and JUMP_SLOT are both needed.
* Make non-preemptible IFUNC handling slightly cleaner: avoid setting/clearing sym.gotInIgot
* -z nocopyrel: report all copy relocation places for one symbol
* Make GOT deduplication feasible
* Make parallel relocation scanning feasible (if we can avoid all stateful operations and make Symbol attributes atomic), but parallelism may not be the appealing choice
Since this patch moves a large chunk of code out of ELFT templates. My x86-64
executable is actually a few hundred bytes smaller.
For ppc32-ifunc-nonpreemptible-pic.s: I remove absolute relocation references to non-preemptible ifunc
because absolute relocation references are incorrect in -fpie mode.
Reviewed By: peter.smith, ikudrin
Differential Revision: https://reviews.llvm.org/D114783
This reverts commit fc33861d48.
`replaceWithDefined` should copy needsGot, otherwise an alias for a copy
relocated symbol may not have GOT entry if its needsGot was originally true.
The idea is to make scanRelocations mark some actions are needed (GOT/PLT/etc)
and postpone the real work to postScanRelocations. It gives some flexibility:
* Make it feasible to support .plt.got (PR32938): we need to know whether GLOB_DAT and JUMP_SLOT are both needed.
* Make non-preemptible IFUNC handling slightly cleaner: avoid setting/clearing sym.gotInIgot
* -z nocopyrel: report all copy relocation places for one symbol
* Make parallel relocation scanning possible (if we can avoid all stateful operations and make Symbol attributes atomic), but parallelism may not be the appealing choice
* Make GOT deduplication feasible
Since this patch moves a large chunk of code out of ELFT templates. My x86-64
executable is actually a few hundred bytes smaller.
For ppc32-ifunc-nonpreemptible-pic.s: I remove absolute relocation references to non-preemptible ifunc
because absolute relocation references are incorrect in -fpie mode.
Reviewed By: peter.smith, ikudrin
Differential Revision: https://reviews.llvm.org/D114783
An unstable sort suffices. In a large link (11.06s), this decreases .rela.dyn
writeTo time from 1.52s to 0.81s, resulting in 6% total time speedup (the
benefit will greatly dilute if --pack-dyn-relocs=relr becomes prevailing).
Encoding the dynamic relocations then sorting raw Elf_Rel/Elf_Rela doesn't seem
to improve much (doing that would require code duplicate because of
Elf_Rel/Elf_Rela plus unfortunate mips64le), so don't do that.
This fixes an issue introduced in D101996.
A weak reference in a shared library could be incorrectly reported if
there is another library that has a strong reference to the same symbol.
Differential Revision: https://reviews.llvm.org/D115041
PLT usage needs the first 12 bytes of the .got section. We need to keep .got and
DT_GOT_PPC even if .got/_GLOBAL_OFFSET_TABLE_ are not referenced (large PIC code
may only reference .got2), which is the case in OpenBSD's ld.so, leading
to a misleading error, "unsupported insecure BSS PLT object".
Fix this by adding R_PPC32_PLTREL to the list of hasGotOffRel.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D114982
When a comdat symbol is defined in both bitcode and regular object
files, which are contained in the same archive, the linker could lose
the flag that the symbol is used in the regular object file and allow
LTO to internalize it, which led to "error: undefined symbol".
The issue was introduced in D79300.
Differential Revision: https://reviews.llvm.org/D114801
This reverts the PPC64PCRelLongBranchThunk part from D86706.
PPC64PCRelLongBranchThunk is the same as PPC64R12SetupStub.
Use `__gep_setup_` instead of `__long_branch_pcrel_` for the stub symbol name
as it more closely indicates the operation.
(Note: GNU ld uses `*.long_branch.*` and `*.plt_branch.*`).
Reviewed By: NeHuang, nemanjai
Differential Revision: https://reviews.llvm.org/D114656
There is a trend of having more optional options (usually security
hardening related) like -z cet-report=, -z bti-report=, -z force-bti.
If ld.lld 14.0.0 uses a warning, in 15/16/17/... timeframe when people
add new options to software, they can worry less about linker errors on ld.lld 14.0.0.
In some cases `-z foo` does essential work where a silent ignore can be
problematic, but the user has received a warning. From my observation, the
doing-essential-work `-z foo` is much fewer than the converse. In addition,
the user who cares can use `--fatal-warnings` (Note: GNU ld doesn't upgrade warnings to errors).
It is unclear whether we need something like `clang -Wunknown-warning-option`.
If we ever run into unfortunate transition like `-z start-stop-gc`, the
affected software (e.g. ldc is a compiler which passes linker options to the underlying ld)
can blindly add the `-z` option, without worrying it may cause a linker error to LLD 14.0.0.
Reviewed By: jrtc27, peter.smith
Differential Revision: https://reviews.llvm.org/D114748
Make one change: when the OutputSection is nullptr (due to /DISCARD/ or garbage
collected BssSection (replaceCommonSymbols)), discard the SyntheticSection as well.
I attempted to remove it 1 or 2 year ago but kept it just to have a good
diagnostic in case the output section is nullptr (should be impossible).
It is long enough that we haven't seen such a case.
Fix r285764: there is no guarantee that Out::first is placed before other
static data members of `struct Out`. After `bufferStart` was introduced, this
out-of-bounds write is destined in many compilers. It is likely benign, though.
And move `Out::elfHeader->size` assignment beside `Out::elfHeader->sectionIndex`
For -z separate-code and -z separate-loadable-segments:
When RW is present, the RX to RW transition is aligned with max-page-size.
When RW is absent, the RX to non-SHF_ALLOC transition should use max-page-size as well.
Currently, LLD does not support the complete set of ARM group relocations.
Given that I intend to start using these in the Linux kernel [0], let's add
support for these.
This implements the group processing as documented in the ELF psABI. Notably,
this means support is dropped for very far symbol references that also carry a
small component, where the immediate is rotated in such a way that only part of
it wraps to the other end of the 32-bit word. To me, it seems unlikely that
this is something anyone could be relying on, but of course I could be wrong.
[0] https://lore.kernel.org/r/20211122092816.2865873-8-ardb@kernel.org/
Reviewed By: peter.smith, MaskRay
Differential Revision: https://reviews.llvm.org/D114172