This fixes an issue where a symbol only section at the start of a
PT_LOAD segment, causes incorrect alignment of the file offset for the
start of the segment which results in the output of an invalid ELF.
SHT_PROGBITS was the default output section type in the past.
Differential Revision: https://reviews.llvm.org/D60131
llvm-svn: 358981
This patch changes the behaviour of findOrphanPos to only consider live
sections when placing orphan sections. This used to be how it behaved in
the past.
Differential Revision: https://reviews.llvm.org/D60273
llvm-svn: 358979
Make some small adjustment while touching the code: make parameters
const, use less_first(), etc.
Differential Revision: https://reviews.llvm.org/D60989
llvm-svn: 358943
Summary:
Fixes PR35242. A simplified reproduce:
thread_local int i; int f() { return i; }
% {g++,clang++} -fPIC -shared -ftls-model=local-dynamic -fuse-ld=lld a.cc
ld.lld: error: can't create dynamic relocation R_X86_64_DTPOFF32 against symbol: i in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
In isStaticLinkTimeConstant(), Syn.IsPreemptible is true, so it is not
seen as a constant. The error is then issued in processRelocAux().
A symbol of the local-dynamic TLS model cannot be preempted but it can
preempt symbols of the global-dynamic TLS model in other DSOs.
So it makes some sense that the variable is not static.
This patch fixes the linking error by changing getRelExpr() on
R_386_TLS_LDO_32 and R_X86_64_DTPOFF{32,64} from R_ABS to R_DTPREL.
R_PPC64_DTPREL_* and R_MIPS_TLS_DTPREL_* need similar fixes, but they are not handled in this patch.
As a bonus, we use `if (Expr == R_ABS && !Config->Shared)` to find
ld-to-le opportunities. R_ABS is overloaded here for such STT_TLS symbols.
A dedicated R_DTPREL is clearer.
Differential Revision: https://reviews.llvm.org/D60945
llvm-svn: 358870
Summary:
This relocation type is used by R_386_TLS_GD. Its formula is the same as
R_GOTPLT (e.g R_X86_64_GOT{32,64} R_386_TLS_GOTIE). Rename it to be clearer.
Differential Revision: https://reviews.llvm.org/D60941
llvm-svn: 358868
This is https://bugs.llvm.org//show_bug.cgi?id=39857.
I added the comment with much more details to the bug page,
the short version is below.
The following script and code demonstrates the issue:
aliasto__text = __text;
SECTIONS {
.text 0x1000 : { __text = . ; *(.text) }
}
...
call aliasto__text
LLD fails with "cannot refer to absolute symbol: aliasto__text" error.
It happens because at the moment of scanning the relocations
we do not yet assign the correct/final/any section value for the symbol aliasto__text.
I made a change to Relocations.cpp to fix that.
Also, I had to remove the symbol-location.s test case completely, because now it does not
trigger any error. Since now all linker scripts symbols are resolved to constants, no
errors can be triggered at all it seems. I checked that it is consistent with the behavior
of bfd and gold (they do not trigger errors for the case from symbol-location.s), so it should
be OK. I.e. at least it is probably not the best possible, but natural behavior we obtained.
Differential revision: https://reviews.llvm.org/D55423
llvm-svn: 358652
Summary:
If the output section contains only symbol assignments, we copy flags
from the previous sections. Don't set SHF_ALLOC if NonAlloc is true.
We also have to change the type from SHT_NOBITS to SHT_PROGBITS.
In ld.bfd, bfd_elf_get_default_section_type maps non-alloctable sections to SHT_PROGBITS.
Non-alloctable SHT_NOBITS sections do not make sense.
Fixes PR38626
Differential Revision: https://reviews.llvm.org/D59986
llvm-svn: 358650
This generalizes code and also fixes the broken behavior shown in
one of our test cases for some targets, like x86-64.
The issue occurs when the forward declarations are used in the script.
One of the samples is:
SECTIONS {
foo = ADDR(.text) - ABSOLUTE(ADDR(.text));
};
In that case, we have a broken output when output target does
not use thunks. That happens because thunks creating code
(called from maybeAddThunks)
calls Script->assignAddresses() at least one more time,
what fixups the values. As a result final symbols values can
be different on AArch64 and x86, for example.
In this patch, I generalize and rename maybeAddThunks to
finalizeAddressDependentContent and now it is used and called
by all targets.
Differential revision: https://reviews.llvm.org/D55550
llvm-svn: 358646
Summary:
We access Live and OutputOff (which may share the same memory location)
concurrently in 2 parallelForEachN loops. Separating them avoids subtle
data races like D41884/PR35788. This patch places Live and Hash
together.
2 reasons this is appealing:
1) Hash is immutable. Live is almost read-only - only written once in MarkLive.cpp where
Hash is not accessed
2) we already discard low bits of Hash to decide ShardID. It doesn't
matter much if we make 32-bit Hash to 31-bit.
For a huge internal clang -O3 executable (1.6GiB),
`Strings` in StringTableBuilder::finalizeStringTable contains at most 310253 elements.
The expected number of pair-wise collisions 2^(-31) * C(310253,2) ~= 22.41 is too small to have a negative impact on performance.
Actually, my benchmark shows there is actually a minor performance improvement.
Differential Revision: https://reviews.llvm.org/D60765
llvm-svn: 358645
Patch by Gabriel Smith.
The address for a section would be evaluated before the region was
switched to. Because of this, the position within the region would not
be updated. After the region is swapped to the dot would be set to the
out of date position within the region, undoing the section address
evaluation.
To fix this, the region is swapped to before the section's address is
evaluated. As part of the fallout of this, expandMemoryRegions needed
to be gated in setDot on the condition that the evaluated address is
less than the dot. This is for the case where sections are not listed
from lowest address to highest address.
Finally, a test for the case where sections are listed "out of order"
was added.
Differential Revision: https://reviews.llvm.org/D60744
llvm-svn: 358638
With partitions, each partition should have the same build id. This means
that the build id needs to be only computed once, otherwise we will end up
with different build ids in each partition as a result of the file contents
changing. This change moves the computation of the build id into Writer so
that it only happens once.
Differential Revision: https://reviews.llvm.org/D60342
llvm-svn: 358536
The typo was introduced to llvm MC in rL204769 (fixed in rL358247) and then to lld.
Also, for relocatable-many-sections.s, the size of .symtab changed at some point and the formula needs update.
llvm-svn: 358248
Patch by Robert O'Callahan.
Rust projects tend to link in all object files from all dependent
libraries and rely on --gc-sections to strip unused code and data.
Unfortunately --gc-sections doesn't currently strip any debuginfo
associated with GC'ed sections, so lld links in the full debuginfo from
all dependencies even if almost all that code has been discarded. See
https://github.com/rust-lang/rust/issues/56068 for some details.
Properly stripping debuginfo for discarded sections would be difficult,
but a simple approach that helps significantly is to mark debuginfo
sections as live only if their associated object file has at least one
live code/data section. This patch does that. In a (contrived but not
totally artificial) Rust testcase linked above, it reduces the final
binary size from 46MB to 5.1MB.
Differential Revision: https://reviews.llvm.org/D54747
llvm-svn: 358069
The code previously specified a 32-bit range for R_RISCV_HI20 and
R_RISCV_LO12_[IS], however this is incorrect as the maximum offset on
RV64 that can be formed from the immediate of lui and the displacement
of an I-type or S-type instruction is -0x80000800 to 0x7ffff7ff. There
is also the same issue with a c.lui and LO12 pair, whose actual
addressable range should be -0x20800 to 0x1f7ff.
The tests will be included in the next patch that converts all RISC-V
tests to use llvm-mc instead of yaml2obj, as assembler support has
matured enough to write tests in them.
Differential Revision: https://reviews.llvm.org/D60414
llvm-svn: 357995
For partitions I intend to use the same set of version indexes in
each partition for simplicity. Since each partition will need its own
VersionNeedSection this will require moving the verneed tracking out of
VersionNeedSection. The way I've done this is to move most of the tracking
into SharedFile. What will eventually become the per-partition tracking
still lives in VersionNeedSection.
As a bonus the code gets a little simpler and more consistent with how we
handle verdef.
Differential Revision: https://reviews.llvm.org/D60307
llvm-svn: 357926
Previously, we drop symbols starting with .L from the symbol table, so
if there is a relocation that refers a .L symbol, it ended up
referencing a null -- which happened to be interpreted as an absolute
symbol.
This patch copies all symbols including local ones if -emit-reloc is
given.
Fixes https://bugs.llvm.org/show_bug.cgi?id=41385
Differential Revision: https://reviews.llvm.org/D60306
llvm-svn: 357885
And rename the function to combineEhSections(). This makes the processing
of .ARM.exidx even more similar to .eh_frame and means that we can avoid an
additional loop over InputSections.
Differential Revision: https://reviews.llvm.org/D60026
llvm-svn: 357417
Summary:
Some synthetic sections can be empty while still being needed, thus they
can't be removed by removeUnusedSyntheticSections(). Rename this member
function to more appropriate isNeeded() with the opposite meaning.
No functional change intended.
Reviewers: ruiu, espindola
Reviewed By: ruiu
Subscribers: jhenderson, grimar, emaste, arichardson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59982
llvm-svn: 357377
This change itself doesn't mean anything, but it helps D59780 because
in patch, we don't know whether we need to create a CET-aware PLT or
not until we read all input files.
llvm-svn: 357194
Recommit r356666 with fixes for buildbot failure, as well as handling for
--emit-relocs, which we decide not to emit any relocation sections as the
table is already position independent and an offline tool can deduce the
relocations.
Instead of creating extra Synthetic .ARM.exidx sections to account for
gaps in the table, create a single .ARM.exidx SyntheticSection that can
derive the contents of the gaps from a sorted list of the executable
InputSections. This has the benefit of moving the ARM specific code for
SyntheticSections in SHF_LINK_ORDER processing and the table merging code
into the ARM specific SyntheticSection. This also makes it easier to create
EXIDX_CANTUNWIND table entries for executable InputSections that don't
have an associated .ARM.exidx section.
Fixes pr40277
Differential Revision: https://reviews.llvm.org/D59216
llvm-svn: 357160
Patch by Tiancong Wang.
In D36351, Call-Chain Clustering (C3) heuristic is implemented with
option --call-graph-ordering-file <file>.
This patch adds a flag --print-symbol-order=<file> to LLD, and when
specified, it prints out the symbols ordered by the heuristics to the
file. The symbols printout is helpful to those who want to understand
the heuristics and want to reproduce the ordering with
--symbol-ordering-file in later pass.
Differential Revision: https://reviews.llvm.org/D59311
llvm-svn: 357133
Summary:
This should address remaining issues discussed in PR36555.
Currently R_GOT*_FROM_END are exclusively used by x86 and x86_64 to
express relocations types relative to the GOT base. We have
_GLOBAL_OFFSET_TABLE_ (GOT base) = start(.got.plt) but end(.got) !=
start(.got.plt)
This can have problems when _GLOBAL_OFFSET_TABLE_ is used as a symbol, e.g.
glibc dl_machine_dynamic assumes _GLOBAL_OFFSET_TABLE_ is start(.got.plt),
which is not true.
extern const ElfW(Addr) _GLOBAL_OFFSET_TABLE_[] attribute_hidden;
return _GLOBAL_OFFSET_TABLE_[0]; // R_X86_64_GOTPC32
In this patch, we
* Change all GOT*_FROM_END to GOTPLT* to fix the problem.
* Add HasGotPltOffRel to denote whether .got.plt should be kept even if
the section is empty.
* Simplify GotSection::empty and GotPltSection::empty by setting
HasGotOffRel and HasGotPltOffRel according to GlobalOffsetTable early.
The change of R_386_GOTPC makes X86::writePltHeader simpler as we don't
have to compute the offset start(.got.plt) - Ebx (it is constant 0).
We still diverge from ld.bfd (at least in most cases) and gold in that
.got.plt and .got are not adjacent, but the advantage doing that is
unclear.
Reviewers: ruiu, sivachandra, espindola
Subscribers: emaste, mehdi_amini, arichardson, dexonsmith, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59594
llvm-svn: 356968
lld's mark-sweep garbage collector was written in the visitor pattern.
There are functions that traverses a given graph, and the functions calls
callback functions to dispatch according to node type.
The code was originaly pretty simple, and lambdas worked pretty
well. However, as we add more features to the garbage collector, that became
more like a callback hell. We now have a callback function that wraps
another callback function, for example. It is not easy to follow the flow of
the control.
This patch rewrites it as a regular class. What was once a lambda is now a
regular class member function. I think this change fixes the readability
issue.
No functionality change intended.
Differential Revision: https://reviews.llvm.org/D59800
llvm-svn: 356966
Previously, `Entries` contains pairs of symbols and their indices.
The indices are always 0, x, 2x, 3x, ..., where x is the size of
relocation entry. We didn't have to store that values because we can
compute them when we consume them.
llvm-svn: 356812
There is a reproducible buildbot failure (segfault) on the 2 stage
clang-cmake-armv8-lld bot. Reverting while I investigate.
Differential Revision: https://reviews.llvm.org/D59216
llvm-svn: 356684
Instead of creating extra Synthetic .ARM.exidx sections to account for
gaps in the table, create a single .ARM.exidx SyntheticSection that can
derive the contents of the gaps from a sorted list of the executable
InputSections. This has the benefit of moving the ARM specific code for
SyntheticSections in SHF_LINK_ORDER processing and the table merging code
into the ARM specific SyntheticSection. This also makes it easier to create
EXIDX_CANTUNWIND table entries for executable InputSections that don't
have an associated .ARM.exidx section.
Fixes pr40277
Differential Revision: https://reviews.llvm.org/D59216
llvm-svn: 356666
Summary:
This implements Rui Ueyama's idea in PR39044.
I've checked that ld.bfd and gold do not have the power-of-2 requirement
and do not require sh_entsize to be a multiple of sh_align.
Now on the updated test merge-entsize.s, all the 3 linkers happily
create .rodata that is not 3-byte aligned.
This has a use case in Linux arch/x86/crypto/sha512-avx2-asm.S
It uses sh_entsize of 640, which is not a power of 2.
See https://github.com/ClangBuiltLinux/linux/issues/417
Reviewers: ruiu, espindola
Reviewed By: ruiu
Subscribers: nickdesaulniers, E5ten, emaste, arichardson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59478
llvm-svn: 356428