When resolving dynamic RELA relocations the addend is taken from the
relocation and not the place being relocated. Accordingly lld does not
write the addend field to the place like it would for a REL relocation.
Unfortunately there is some system software, in particlar dynamic loaders
such as Bionic's linker64 that use the value of the place prior to
relocation to find the offset that they have been loaded at. Both gold
and bfd control this behavior with the --[no-]apply-dynamic-relocs option.
This change implements the option and defaults it to true for compatibility
with gold and bfd.
Differential Revision: https://reviews.llvm.org/D42797
llvm-svn: 324221
We did not report valid filename for duplicate symbol error when
symbol came from binary input file.
Patch fixes it.
Differential revision: https://reviews.llvm.org/D42635
llvm-svn: 324217
Initially LLD generates Elf_Rel relocations for O32 ABI and Elf_Rela
relocations for N32 / N64 ABIs. In other words, format of input and
output relocations was always the same. Now LLD generates all output
relocations using Elf_Rel format only. It conforms to ABIs requirement.
The patch suggested by Alexander Richardson.
llvm-svn: 324064
--nopie was a typo. GNU gold doesn't recognize it. It is also
inconsistent with other options that have --foo and --no-foo.
Differential Revision: https://reviews.llvm.org/D42825
llvm-svn: 324043
In GNU linkers, the last semicolon is optional. We can't link libstdc++
with lld because of that difference.
Differential Revision: https://reviews.llvm.org/D42820
llvm-svn: 324036
Currently ICF information is output through stderr if the "--verbose"
flag is used. This differs to Gold for example, which uses an explicit
flag to output this to stdout. This commit adds the
"--print-icf-sections" and "--no-print-icf-sections" flags and changes
the output message format for clarity and consistency with
"--print-gc-sections". These messages are still output to stderr if
using the verbose flag. However to avoid intermingled message output to
console, this will not occur when the "--print-icf-sections" flag is
used.
Existing tests have been modified to expect the new message format from
stderr.
Patch by Owen Reynolds.
Differential Revision: https://reviews.llvm.org/D42375
Reviewers: ruiu, rafael
Reviewed by:
llvm-svn: 323976
Summary:
While trying to make a linker script behave the same way with lld as it did
with bfd, I discovered that lld currently doesn't diagnose overlapping
output sections. I was getting very strange runtime failures which I
tracked down to overlapping sections in the resulting binary. When linking
with ld.bfd overlapping output sections are an error unless
--noinhibit-exec is passed and I believe lld should behave the same way
here to avoid surprising crashes at runtime.
The patch also uncovered an errors in the tests: arm-thumb-interwork-thunk
was creating a binary where .got.plt was placed at an address overlapping
with .got.
Reviewers: ruiu, grimar, rafael
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D41046
llvm-svn: 323856
When there is a duplicate absolute symbol, LLD reports <internal>
instead of known object file name currently.
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D42636
llvm-svn: 323849
Previously an empty CPU string was passed to the LTO engine which
resulted in a generic CPU for which certain features like NOPL were
disabled. This fixes that.
Patch by Pratik Bhatu!
llvm-svn: 323801
Currently symbols assigned or created by linkerscript are not processed early
enough. As a result it is not possible to version them or assign any other flags/properties.
Patch creates Defined symbols for -defsym and linkerscript symbols early,
so that issue from above can be addressed.
It is based on Rafael Espindola's version of D38239 patch.
Fixes PR34121.
Differential revision: https://reviews.llvm.org/D41987
llvm-svn: 323729
This should fix PR36017.
The root problem is that we were creating a PT_LOAD just for the
header. That was technically valid, but inconvenient: we should not be
making the ELF discontinuous.
The solution is to allow a section with LMAExpr to be added to a
PT_LOAD if that PT_LOAD doesn't already have a LMAExpr.
llvm-svn: 323625
Patch adds one more module with non-prevailing
version of asm symbol, defined in main module
This is for D42107, which is under review.
Extended version of testcase would fail with the
diff 9 version of patch posted.
llvm-svn: 323584
This fixes the crash reported at PR36083.
The issue is that we were trying to put all the sections in the same
PT_LOAD and crashing trying to write past the end of the file.
This also adds accounting for used space in LMARegion, without it all
3 PT_LOADs would have the same physical address.
llvm-svn: 323449
Since SyntheticSection::getParent() may return null, dereferencing
this pointer in ARMExidxSentinelSection::empty() call from
removeUnusedSyntheticSections() results in crashes when linking ARM
binaries.
Patch by vit9696!
llvm-svn: 323366
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.
There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.
When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.
We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
llvm-svn: 323155
It is possible for a link to fail with an undefined reference, unless
--gc-sections is specified, removing the reference in the process. This
doesn't look to be tested anywhere explicitly, so I thought it useful
to add a test for it to ensure the behaviour is maintained.
Reviewers: ruiu
Differential Revision: https://reviews.llvm.org/D42299
llvm-svn: 323099
The problem we had with it is that anything inside an AT is an
expression, so we failed to parse the section name because of the - in
it.
llvm-svn: 322801
Previously we always handled -defsym after other commands in command line.
That made impossible to overload values set by -defsym from linker script:
test.script:
foo = 0x22;
-defsym=foo=0x11 -script t.script
would always set foo to 0x11.
That is inconstent with common logic which allows to override command line
options. it is inconsistent with bfd behavior and seems breaks assumption that
-defsym is the same as linker script assignment, as -defsyms always handled out of
command line order.
Patch fixes the handling order.
Differential revision: https://reviews.llvm.org/D42054
llvm-svn: 322625
We track both the combined visibility that will be used for the output
symbol and the original input visibility of the selected symbol.
Almost everything should use the computed visibility.
I will make the names less confusing an a followup patch.
llvm-svn: 322576
When a section placement (AT) command references the section itself,
the physical address of the section in the ELF header was calculated
incorrectly due to alignment happening right after the location
pointer's value was captured.
The problem was diagnosed and the first version of the patch written
by Erick Reyes.
llvm-svn: 322421
Previously we checked (HeaderSize == 0) to find out if
PltSection section is IPLT or PLT. Some targets does not set
HeaderSize though. For example PPC64 has no lazy binding implemented
and does not set PltHeaderSize constant.
Because of that using of both IPLT and PLT relocations worked
incorrectly there (testcase is provided).
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D41613
llvm-svn: 322362
AT> lma_region expression allows to specify the memory region
for section load address.
Should fix PR35684.
Differential revision: https://reviews.llvm.org/D41397
llvm-svn: 322359
When setting up the chain, we copy over the bucket's previous symbol
index, assuming that this index will be 0 (STN_UNDEF) for an unused
bucket (marking the end of the chain). When linking with --no-rosegment,
however, unused buckets will in fact contain the padding value, and so
the hash table will end up containing invalid chains. Zero out the hash
table section explicitly to avoid this, similar to what's already done
for GNU hash sections.
Differential Revision: https://reviews.llvm.org/D41928
llvm-svn: 322259
When we have --icf=safe we should be able to define --icf=all as a
shorthand for --icf=safe --ignore-function-address-equality.
For now --ignore-function-address-equality is used only to control
access to non preemptable symbols in shared libraries.
llvm-svn: 322152
This splits relocation processing in two steps.
First, analyze what needs to be done at the relocation spot. This can
be a constant (non preemptible symbol, relative got reference, etc) or
require a dynamic relocation. At this step we also consider creating
copy relocations.
Once that is done we decide if we need a got or a plt entry.
The code is simpler IMHO. For example:
- There is a single call to isPicRel since the logic is not split
among adjustExpr and the caller.
- R_MIPS_GOTREL is simple to handle now.
- The tracking of what is preemptible or not is much simpler now.
This also fixes a regression with symbols being both in a got and copy
relocated. They had regressed in r268668 and r268149.
The other test changes are because of error messages changes or the
order of two relocations in the output.
llvm-svn: 322047
Previously, in r320472, I moved the calculation of section offsets and sizes
for compressed debug sections into maybeCompress, which happens before
assignAddresses, so that the compression had the required information. However,
I failed to take account of relocations that patch such sections. This had two
effects:
1. A race condition existed when a debug section referred to a different debug
section (see PR35788).
2. References to symbols in non-debug sections would be patched incorrectly.
This is because the addresses of such symbols are not calculated until after
assignAddresses (this was a partial regression caused by r320472, but they
could still have been broken before, in the event that a custom layout was used
in a linker script).
assignAddresses does not need to know about the output section size of
non-allocatable sections, because they do not affect the value of Dot. This
means that there is no longer a reason not to support custom layout of
compressed debug sections, as far as I'm aware. These two points allow for
delaying when maybeCompress can be called, removing the need for the loop I
previously added to calculate the section size, and therefore the race
condition. Furthermore, by delaying, we fix the issues of relocations getting
incorrect symbol values, because they have now all been finalized.
llvm-svn: 321986
LLD previously used to handle dynamic lists and version scripts in the
exact same way, even though they have very different semantics for
shared libraries and subtly different semantics for executables. r315114
untangled their semantics for executables (building on previous work to
correct their semantics for shared libraries). With that change, dynamic
lists won't set the default version to VER_NDX_LOCAL, and so resetting
the version to VER_NDX_GLOBAL in scanShlibUndefined is unnecessary.
This was causing an issue because version scripts containing `local: *`
work by setting the default version to VER_NDX_LOCAL, but scanShlibUndefined
would override this default, and therefore symbols which should have
been local would end up in the dynamic symbol table, which differs from
both bfd and gold's behavior. gold silently keeps the symbol hidden in
such a scenario, whereas bfd issues an error. I prefer bfd's behavior
and plan to implement that in LLD in a follow-up (and the test case
added here will be updated accordingly).
Differential Revision: https://reviews.llvm.org/D41639
llvm-svn: 321982
We normally want to ignore SHT_NOBITS sections when computing
offsets. The sh_offset of section itself seems to be irrelevant and
- If the section is in the middle of a PT_LOAD, it will make no
difference on the computed offset of the followup section.
- If it is in the end of a PT_LOAD, we want to avoid its alignment
changing the offset of the followup sections.
The issue is if it is at the start of the PT_LOAD. In that case we do
have to align it so that the following sections have congruent address
and offset module the page size. We were not handling this case.
This should fix freebsd kernel link.
llvm-svn: 321657
This is "Bug 35751 - .dynamic relocation entries omitted if output
contains only IFUNC relocations"
We have InX::RelaPlt and InX::RelaIPlt synthetic sections for PLT relocations.
They are usually live in rela.plt section. Problem appears when InX::RelaPlt
section is empty. In that case we did not produce normal set of dynamic tags
required, because logic was written in the way assuming we always have
non-IRelative relocations in rela.plt.
Patch fixes the issue.
Differential revision: https://reviews.llvm.org/D41592
llvm-svn: 321600
If using a version script with a `local: *` in it, symbols in shared
libraries will still get default visibility if another shared library on
the link line has an undefined reference to the symbol. This is quite
surprising. Neither bfd nor gold have this behavior when linking a
shared library, and none of LLD's tests fail without this behavior, so
it seems safe to limit scanShlibUndefined to executables.
As far as executables are concerned, gold doesn't do any automatic
default visibility marking, and bfd issues a link error about a shared
library having a reference to a hidden symbol rather than silently
giving that symbol default visibility. I think bfd's behavior here is
preferable to LLD's, but that's something to be considered in a
follow-up.
Differential Revision: https://reviews.llvm.org/D41524
llvm-svn: 321578
This makes adjustExpr a bit simpler too IMHO.
It seems that some of the complication around relocation processing
is that we are trying to create copy relocations too early. It seems
we could handle a few simple cases first and continue.
llvm-svn: 321507
Previously we failed to resolve them when produced executables:
"relocation R_X86_64_32 cannot be used against shared object; recompile with -fPIC"
Patch fixes it so that we resolve them to 0 for executables.
And for -shared case we still should produce the relocation.
This finishes fixing PR35720.
DIfferential revision: https://reviews.llvm.org/D41551
llvm-svn: 321473
If a relocation cannot be implemented by the dynamic linker and the
section is rw, allow creating a plt entry to use as the function
address as if the section was ro.
This matches bfd and gold. It also matches our behavior with -z
notext.
llvm-svn: 321430
Advance the memory region offset when handling a linker script data
command such as BYTE or LONG. Failure to advance the offset results
in corrupted output with overlapping sections.
Update tests to check for this combination of both a) memory regions
and b) data commands.
Fixes https://bugs.llvm.org/show_bug.cgi?id=35565
Patch by Owen Shaw!
llvm-svn: 321418
This is part of PR35720.
Currently LLD allows dynamic relocations against text when -z notext is given.
Though for non-PIC relocations like R_X86_64_PC32 that does not work,
we produce "relocation R_X86_64_PC32 cannot be used against shared object;"
error because they may overflow in runtime.
Solution implemented is to use PLT for them.
Differential revision: https://reviews.llvm.org/D41541
llvm-svn: 321400
When two linker script symbols are subtracted, the result should be absolute.
This is the behavior of binutils' ld.
Patch by Erick Reyes!
llvm-svn: 321390
A more efficient PLT sequence can be used when the distance between the
.plt and the end of the .plt.got is less than 128 Megabytes, which is
frequently true. We fall back to the old sequence when the offset is larger
than 128 Megabytes. This gives us an alternative to forcing the longer
entries with --long-plt as we gracefully fall back to it as needed.
See ELF for the ARM Architecture Appendix A for details of the PLT sequence.
Differential Revision: https://reviews.llvm.org/D41246
llvm-svn: 320987
Summary:
1. Use stream 0 only for combined module. Previously if combined module was not
processes ThinLTO used the stream for own output. However small changes in input,
could trigger combined module and shuffle outputs making life of llvm::LTO harder.
2. Always process combined module and write output to stream 0. Processing empty
combined module is cheap and allows llvm::LTO users to avoid implementing processing
which is already done in llvm::LTO.
Subscribers: mehdi_amini, inglorion, eraman, hiraditya
Differential Revision: https://reviews.llvm.org/D41267
llvm-svn: 320905
We only need to exceed 128 Megabytes to provoke the generation of a range
extension thunk. This brings the file size down to just over 128 Megabytes.
llvm-svn: 320821
I noticed that the continue this patch deletes was not tested. Trying
to add a test I realized that we never put a VER_NDX_LOCAL symbol in
the dynamic symbol table. There doesn't seem to be any reason for a
linker to use VER_NDX_LOCAL for a defined shared symbol.
llvm-svn: 320817
The ARM.exidx section contains a table of 8-byte entries with the first
word of each entry an offset to the function it describes and the second
word instructions for unwinding if an exception is thrown from that
function. The SHF_LINK_ORDER processing will order the table in ascending
order of the functions described by the exception table entries. As the
address range of an exception table entry is terminated by the next table
entry, it is possible to merge consecutive table entries that have
identical unwind instructions.
For this implementation we define a table entry to be identical if:
- Both entries are the special EXIDX_CANTUNWIND.
- Both entries have the same inline unwind instructions.
We do not attempt to establish if table entries that are references to
.ARM.extab sections are identical.
This implementation works at a granularity of a single .ARM.exidx
InputSection. If all entries in the InputSection are identical to the
previous table entry we can remove the InputSection. A more sophisticated
but more complex implementation would rewrite InputSection contents so that
duplicates within a .ARM.exidx InputSection can be merged.
Differential Revision: https://reviews.llvm.org/D40967
llvm-svn: 320803
This patch provides the mechanism to fix instances of the instruction
sequence that may trigger the cortex-a53 843419 erratum. The fix is
provided by an alternative instruction sequence to remove one of the
erratum conditions. To reach this alternative instruction sequence we
replace the original instruction with a branch to the alternative
sequence. The alternative sequence is responsible for branching back to
the original.
As there is only erratum to fix the implementation is specific to
AArch64 and the specific erratum conditions. It should be generalizable
to other targets and erratum if needed.
Differential Revision: https://reviews.llvm.org/D36749
llvm-svn: 320800
Specifically, libwidevinecdm.so in Chrome has such bad symbol.
It seems the BFD linker handles them as local symbols, so instead
of inserting them to the symbol table, we should skip them too.
Differential Revision: https://reviews.llvm.org/D41257
llvm-svn: 320770
We might crash in 'ARMExidxSentinelSection::writeTo()' because it expected
the sentinel entry to be put in the same 'InputSectionDescription' as
the last real entry. This assumption fails if the last output section command
for .ARM.exidx is anything but an input section description, because in this
case 'OutputSection::addSection()' creates a new 'InputSectionDescription'.
Differential Revision: https://reviews.llvm.org/D41105
llvm-svn: 320668
The size of an OutputSection is calculated early, to aid handling of compressed
debug sections. However, subsequent to this point, unused synthetic sections are
removed. In the event that an OutputSection, from which such an InputSection is
removed, is still required (e.g. because it has a symbol assignment), and no longer
has any InputSections, dot assignments, or BYTE()-family directives, the size
member is never updated when processing the commands. If the removed InputSection
had a non-zero size (such as a .got.plt section), the section ends up with the
wrong size in the output.
The fix is to reset the OutputSection size prior to processing the linker script
commands relating to that OutputSection. This ensures that the size is correct even
in the above situation.
Additionally, to reduce the risk of developers misusing OutputSection Size and
InputSection OutSecOff, they are set to simply the number of InputSections in an
OutputSection, and the corresponding index respectively. We cannot completely
stop using them, due to SHF_LINK_ORDER sections requiring them.
Compressed debug sections also require the full size. This is now calculated in
maybeCompress for these kinds of sections.
Reviewers: ruiu, rafael
Differential Revision: https://reviews.llvm.org/D38361
llvm-svn: 320472
An internal linker has support for merging identical data and in some
cases it can be a significant win.
This is behind an off by default flag so it has to be requested
explicitly.
llvm-svn: 320448
When an output section has no byte commands and has no input sections then it
would be ideal if the type of the section is SHT_NOBITS so that the file can
take up less space. This change sets the default type of of output sections to
SHT_NOBITS instead of SHT_PROGBITS to allow this. This required some minor test
changes (which double as tests for this new behavior) but extend-pt-load.s had
be changed in a non-trivial way. Since it seems to me that the point of the
test is to point out the consequences of how flags are assigned to output
sections that don't have input sections I changed the test to work and still
show how the memsize of the executable segment was changed.
Differential Revision: https://reviews.llvm.org/D41082
llvm-svn: 320437
log are also diagnostics so it seems like they should to
the same place as errors and debug messages.
Without this change when I enable --verbose those messages
go to stdout, but when I enable "-mllvm -debug" those messages
go to stderr (because dbgs() goes to stderr by default).
So I end up having to do this a lot:
lld <args> > output_message 2>&1
Differential Revision: https://reviews.llvm.org/D41033
llvm-svn: 320427
Now that gc sections runs after linker defined symbols are added it
can see symbols that point to an OutputSection.
Should fix a bot failure.
llvm-svn: 320412
This fixes pr35570.
We were creating these symbols after parsing version scripts, so they
could not be versioned.
We cannot move the version script parsing later because we need it for
lto.
One option is to move both addReservedSymbols and
createSyntheticSections earlier. The disadvantage is that some
sections created by createSyntheticSections replace other input
sections. For example, gdb index replaces .debug_gnu_pubnames, so it
wants to run after gc sections so that it can set S->Live to false.
What this patch does instead is to move just the ElfHeader creation
early.
llvm-svn: 320390
Summary:
I also changed the message to print both the ISA and the the architecture
name for incompatible files. Previously it would be quite hard to find the
actual path of the incompatible object files in projects that have many
object files with the same name in different directories.
Reviewers: atanasyan, ruiu
Reviewed By: atanasyan
Subscribers: emaste, sdardis, llvm-commits
Differential Revision: https://reviews.llvm.org/D40958
llvm-svn: 320056
As mentioned in PR35471, shared functions for which
.plt entry address is used shows up in bfd's map files.
Patch teaches LLD to do the same.
Differential revision: https://reviews.llvm.org/D40839
llvm-svn: 319879
We fill executable sections with trap instructions (0xcc or equivalent).
If a .gnu.hash section was put into an executable segment, we created
corrupted .gnu.hash section. This patch fixes the issue.
llvm-svn: 319863
Add a new file AArch64ErrataFix.cpp that implements the logic to scan for
the Cortex-A53 Erratum 843419. This involves finding all the executable
code, disassembling the instructions that might trigger the erratum and
reporting a message if the sequence is detected.
At this stage we do not attempt to fix the erratum, this functionality
will be added in a later patch. See D36749 for proposal.
Differential Revision: https://reviews.llvm.org/D36742
llvm-svn: 319780
With fix:
Specify -soname for input dso to fix up the .dynstr section
size in different environments.
Original commit message:
As mentioned in PR35471, copied symbols did not show
in --Map output. Patch fixes that.
Differential revision: https://reviews.llvm.org/D40785
llvm-svn: 319769
When a linker script is used with a pattern like { *(.bss .bss.*) } the
InX::BssRelRo section will match against .bss.*. By matching on the name
only, in the same way that .data.rel.ro works we prevent this
from happening, but permit scripts that want to explicitly provide
a .bss.rel.ro OutputSection.
Differential Revision: https://reviews.llvm.org/D40735
llvm-svn: 319755
As mentioned in PR35471, copied symbols did not show
in --Map output. Patch fixes that.
Differential revision: https://reviews.llvm.org/D40785
llvm-svn: 319747
Previously, lld exited with an error status if the only option given to
the command was -v. GNU linkers gracefully exit in that case. This patch
makes lld behave like GNU.
Note that even with this patch, lld's -v and --version options behave
slightly differently than GNU linkers' counterparts. For example,
if you run `ld.bfd -v -v`, the version string is printed out twice.
But that is an edge case that I don't think we need to take care of.
Fixes https://bugs.llvm.org/show_bug.cgi?id=31582
Differential Revision: https://reviews.llvm.org/D40810
llvm-svn: 319717
As well as location counter expressions. The data generating expressions
such as BYTE can generate a non-zero sized OutputSection that will report
0 until assignAddresses() is called. Add an example to the existing test
case relro-non-contiguous-script-data.s.
Differential Revision: https://reviews.llvm.org/D40732
llvm-svn: 319648
PR35478 https://bugs.llvm.org/show_bug.cgi?id=35478 points out a flaw
in the implementation of r318924 from D40364. The implementation
depends on the Size field being set or the SyntheticSection::empty()
being accurate. These functions are not reliable as some linker script
commands that have yet to be processed may affect the results, causing
some non-zero size sections to be reported as zero size.
I think the first step is to revert r318924 and come up with a better
solution for the underlying problem rather than trying to layer more
heuristics onto the zero sized output section.
Chances are I'll be out of office by the time anyone sees this so feel
free to commit the revert if you agree with me.
Fixes PR35478
Current thoughts on the underlying problem:
Revisiting the motivation for adding the zero size check in the first
place; it was to prevent 0 sized SyntheticSections that a user does
not have full control over from needlessly breaking the PT_GNU_RELRO,
rather than trying to accommodate arbitrarily complex linker
scripts. Looking at the code, it looks like
removeUnusedSyntheticSections() should remove zero sized synthetic
sections. It does, but it doesn't set the Parent to nullptr, this has
the side effect that Sec == InX::BssRelRo->getParent() will make the
parent OutputSection of InX::BssRelRo RelRo even if there is no
InX::BssRelRo.
I tried a quick experiment with setting the Parent to nullptr and this
flushed out a few interesting test failures, it feels like playing
Jenga with every change:
In the isRelroSection() we have to consider the case where there
is no .plt and .plt.got but there is a ifunc plt with accompanying
(ifunc .got or .plt.got)
The PPC64 has PltHeaderSize == 0. Unfortunately HeaderSize == 0 is
used to choose between the ifunc plt or normal plt. We seem to get
away with this at the moment, but tests start to fail when Parent
is set to nullptr for the .got.plt.
The InX::BssRelRo and InX::Bss never get their sizes set and they
are always removed by removeUnusedSyntheticSections(), their
purpose seems to be as some kind of proxy for add .bss or
.bss.relro InputSections into their parent OutputSections, they
therefore don't behave like other SyntheticSections anyway.
My thinking is that some work is needed to make sure that the Sec ==
SyntheticSection->getParent() does a bit more checking before
returning true, particularly for InX::BssRelRo as that has special
behaviour. I'll hope to post something for review as soon as possible.
Patch by Peter Smith!
llvm-svn: 319563
This is for "Bug 35474 - --emit-relocs produces wrongly-named reloc sections".
LLD currently for scripts like:
.text.boot : { *(.text.boot) }
emits relocation section with name .rela.text because does not take
redefined name of output section into account and builds section name
using rules for non-scripted case. Patch fixes this oddness.
Differential revision: https://reviews.llvm.org/D40652
llvm-svn: 319526
We had no tests for what PROVIDE should do if there is a shared symbol
with the same name.
In both bfd and our existing implementation PROVIDE wins. Add a test
for that.
llvm-svn: 319486
The ELF spec says
Symbols with section index SHN_COMMON may appear only in relocatable
objects.
Currently lld can produce file that break that requirement.
llvm-svn: 319473
When a linker script has "foo = bar" and bar is the result of a copy
relocation foo should point to the same location in .bss.
This is part of a growing evidence that copy relocations should be
implemented by using replaceSymbol to replace the SharedSymbol with a
Defined.
llvm-svn: 319449
The AArch64 unconditional branch and branch and link instructions have a
maximum range of 128 Mib. This is usually enough for most programs but
there are cases when it isn't enough. This change adds support for range
extension thunks to AArch64. For pc-relative thunks we follow the small
code model and use ADRP, ADD, BR. This has a limit of 4 gigabytes.
Differential Revision: https://reviews.llvm.org/D39744
llvm-svn: 319307
It had been reverted because it depended on r319008 which has been
recommitted.
Original message:
Add a missing test.
We were not testing that we correctly handled a .o with a weak symbol
after a .so.
llvm-svn: 319217
This includes a fix to mark copy reloc aliases as used.
Original message:
[ELF] Do not keep symbols if they referenced only from discarded sections.
This patch also ensures that in case of "--as-needed" is used,
DT_NEEDED entries are not created if they are required only by
these eliminated symbols.
llvm-svn: 319215
lld assumes some ARM features that are not available in all Arm
processors. In particular:
- The blx instruction present for interworking.
- The movt/movw instructions are used in Thunks.
- The J1=1 J2=1 encoding of branch immediates to improve Thumb wide
branch range are assumed to be present.
This patch reads the ARM Attributes section to check for the
architecture the object file was compiled with. If none of the objects
have an architecture that supports either of these features a warning
will be given. This is most likely to affect armv6 as used in the first
Raspberry Pi.
Differential Revision: https://reviews.llvm.org/D36823
llvm-svn: 319169
Currently we mark every shared symbol as STB_WEAK.
That is a hack to make it easy to decide when a .so is needed or not
because of a reference to a given symbol.
That hack leaks when we create copy relocations as shown by the update
to relocation-copy-alias.s.
This patch stores the original binding when we first read a shared
symbol. We still have to update the binding to weak if we see a weak
undef, but I find the logic easier to read where it is now.
llvm-svn: 319127
When an undefined weak reference has a PLT entry we must generate a range
extension thunk for any B or BL that can't reach the PLT entry.
This change explicitly looks for whether a PLT entry exists rather than
assuming that weak references never need PLT entries unless Config->Shared
is in operation. This covers the case where we are linking an executable
with dynamic linking, hence a PLT entry will be needed for undefined weak
references. This case comes up in real programs over 32 Mb in size as there
is a B to a weak reference __gmon__start__ in the Arm crti.o for glibc.
Differential Revision: https://reviews.llvm.org/D40248
llvm-svn: 319020
This patch also ensures that in case of "--as-needed" is used,
DT_NEEDED entries are not created if they are required only by
these eliminated symbols.
Differential Revision: https://reviews.llvm.org/D38790
llvm-svn: 319008
LLD uses .bss.rel.ro for read-only copy relocations whereas the ld.bfd and
gold linkers use .data.rel.ro. In some linker scripts including ld.bfd's
internal linker script, the relro sections are placed sequentially assuming
.data.rel.ro is used. LLD's use of .bss.rel.ro means that the copy
relocations get matched into the .bss section causing the relro sections to
be non-contiguous.
This change checks for a .data.rel.ro OutputSection when a linker script
with the SECTIONS command is used. The section will match in the
.data.rel.ro output section and will maintain contiguous relro.
Differential Revision: https://reviews.llvm.org/D40365
Fixes PR35265
llvm-svn: 318940
It now has a DT_NEEDED that could be removed by --gc-sections and one
that cannot. Without this all tests would pass if --gc-sections just
removed all DT_NEEDED.
llvm-svn: 318937
When checking for contiguous relro sections we can skip over empty sections.
If there is an empty non-relro section in the middle of a contiguous block
of relro sections then it cannot be written to so it is safe to include in
PT_GNU_RELRO header. If there is a contiguous block of empty relro sections
then no PT_GNU_RELRO header is required for them.
Differential Revision: https://reviews.llvm.org/D40364
llvm-svn: 318924
If a linker script is used that names linker generated synthetic sections
it is possible that the OutputSections for which isRelroSection() is true
are not contiguous. When the relro sections are not contiguous we cannot
describe them with a single PT_GNU_RELRO PHDR. Unfortunately at least one
contemporary dynamic loader only supports one PT_GNU_RELRO PHDR so we
cannot output more than one of these PHDRs. As not including relro
sections in the PHDR will lead to security sensitive sections being
writeable we choose to give an error message instead.
Differential Revision: https://reviews.llvm.org/D40359
[ELF] Skip over empty sections when checking for contiguous relro
llvm-svn: 318920
The default limit is 1000000 but it can be configured with a cache
policy. The motivation is that some filesystems (notably ext4) have
a limit on the number of files that can be contained in a directory
(separate from the inode limit).
Differential Revision: https://reviews.llvm.org/D40327
llvm-svn: 318857
The MIPS GOT section has a number of local entries based on the number of pages
needed for output sections referenced by GOT page relocations. The number is
recorded in the DT_MIPS_LOCAL_GOTNO dynamic section tag. However, the dynamic tag
is added before assignAddresses has been called, meaning that any section size used
to calculate the value will not include size modifications caused by, for example,
linker scripts and thunks.
This change moves the calculation of DT_MIPS_LOCAL_GOTNO until writeTo, by which
time the output section sizes have been finalized.
Reviewers: ruiu, rafael
Differential Revision: https://reviews.llvm.org/D39493
llvm-svn: 318828
Summary:
I noticed that the reproducers files I was getting from building CheriBSD
didn't work because the --sysroot option was not being rewritten. I've
updated the test to also verify that the rewritten path matches uses a
FileCheck capature instead of a {{.+}} regex
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: llvm-commits, emaste
Differential Revision: https://reviews.llvm.org/D40125
llvm-svn: 318656
Summary:
This matches the behaviour of ld.bfd:
https://sourceware.org/binutils/docs/ld/Options.html#Options
If scriptfile does not exist in the current directory, ld looks for it in
the directories specified by any preceding '-L' options. Multiple '-T'
options accumulate.
Reviewers: ruiu, grimar
Reviewed By: ruiu, grimar
Subscribers: emaste, llvm-commits
Differential Revision: https://reviews.llvm.org/D40129
llvm-svn: 318655
Summary:
The bug triggers when the following conditions are met:
- A thunk is created in a given input section S
- A linker script is specified
- There is at least one matcher in the linker script .text section output
that does not match any of the sections in the input files, before the matcher
that matches section S.
The issue was found when linking the FreeBSD kernel for MIPS when built
with -fPIC. Patch by Alfredo Mazzinghi.
Reviewers: ruiu, psmith, atanasyan
Reviewed By: ruiu
Subscribers: peter.smith, emaste, sdardis, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D40174
llvm-svn: 318653
Recently we teached LLD to report line numbers for duplicate variables
definitions, though currently LLD is unable to do that for case when
strings are not built in .debug_info, but stored in .debug_str instead.
That is because out LLDDwarfObj does not handle .debug_str yet.
Patch fixes that.
Differential revision: https://reviews.llvm.org/D39542
llvm-svn: 318519
This fixes PR35223.
Here I enabled SHF_MERGE section content merging for -r like
we do for regular linking.
Differential revision: https://reviews.llvm.org/D40026
llvm-svn: 318516
Commit r318397 fixed the cache pruning interval which broke this test
as it was assuming that the cache pruning was always being
performed. Explicitly set prune interval to 0s to ensure this.
llvm-svn: 318426
Previously our relocations we rewrote were broken for that case.
We emited incorrect addend and broken relocation info field
because did not produce section symbol for mergeable synthetic sections.
Differential revision: https://reviews.llvm.org/D40070
llvm-svn: 318394
An output section can include elements from two input sections with
different sh_entsize. When that happens the output section itself
should not have a sh_entsize.
llvm-svn: 318311
No difference in practice other than having sh_entsize in the output.
This should simplify the patch for handling SHF_MERGE in -r.
Based on a patch by George Rimar.
llvm-svn: 318306
microMIPS symbols including microMIPS PLT records created for regular
symbols needs to be marked by STO_MIPS_MICROMIPS flag in a symbol table.
Additionally microMIPS entries in a dynamic symbol table should have
configured less-significant bit. That allows to escape teaching a
dynamic linker about microMIPS symbols.
llvm-svn: 318097
Common symbols are now represented with a DefinedRegular that points
to a BssSection, even during symbol resolution.
Differential Revision: https://reviews.llvm.org/D39666
llvm-svn: 317447
Currently LLD tries to use information about functions and variables location
taking it from debug sections. When --strip-* is given we discard such sections
and that breaks error reporting.
Patch stops discarding such sections and just removes them from InputSections list.
Differential revision: https://reviews.llvm.org/D39550
llvm-svn: 317405
When findMemoryRegion do search to find a region for output section it
iterates over MemoryRegions which is DenseMap and so does not
guarantee iteration in insertion order. As a result selected region depends
on its name and not on its definition position
Testcase shows the issue, patch fixes it. Behavior after applying the patch
seems consistent with bfd.
Differential revision: https://reviews.llvm.org/D39544
llvm-svn: 317307
Currently we do not strip .zdebug_*, what looks wrong.
Also this simplifies the testcase we have for this options.
Differential revision: https://reviews.llvm.org/D39552
llvm-svn: 317306
Recently we teached LLD to report line numbers for duplicate variables
definitions, though currently LLD is unable to do that for case when
strings are not built in .debug_info, but stored in .debug_str instead.
That is because out LLDDwarfObj does not handle .debug_str yet.
Patch fixes that.
Differential revision: https://reviews.llvm.org/D39542
llvm-svn: 317305
This is PR34826.
Currently LLD is unable to report line number when reporting
duplicate declaration of some variable.
That happens because for extracting line information we always use
.debug_line section content which describes mapping from machine
instructions to source file locations, what does not help for
variables as does not describe them.
In this patch I am taking the approproate information about
variables locations from the .debug_info section.
Differential revision: https://reviews.llvm.org/D38721
llvm-svn: 317080
Summary:
**Problem**
`--exclude-libs` does not work for static libraries affected by the `--whole-archive` option.
**Description**
`--exclude-libs` creates a list of static library paths and does library lookups in this list.
`--whole-archive` splits the static libraries that follow it into separate objects. As a result, lld no longer sees static libraries among linked files and does no `--exclude-libs` lookups.
**Solution**
The proposed solution is to make `--exclude-libs` consider object files too. When lld finds an object file it checks whether this file originates from an archive and, if so, looks the archive up in the `--exclude-libs` list.
Reviewers: ruiu, rafael
Reviewed By: ruiu
Subscribers: asl, ikudrin, llvm-commits, emaste
Tags: #lld
Differential Revision: https://reviews.llvm.org/D39353
llvm-svn: 316998
edata needs to be set to the end of the last mapped initialized
section. We were previously mishandling the case where there were no
non-mapped sections by setting it to the end of the last section in
the output file.
Differential Revision: https://reviews.llvm.org/D39399
llvm-svn: 316877
I don't think we have to aim for precise bug compatibility.
We can return a nullptr if a section is consumed by the linker, and
the rest should naturally work.
llvm-svn: 316817
The Android relocation packing format is a more compact
format for dynamic relocations in executables and DSOs
that is based on delta encoding and SLEBs. An overview
of the format can be found in the Android source code:
https://android.googlesource.com/platform/bionic/+/refs/heads/master/tools/relocation_packer/src/delta_encoder.h
This patch implements relocation packing using that format.
This implementation uses a more intelligent algorithm for compressing
relative relocations than Android's own relocation packer. As a
result it can generally create smaller relocation sections than
that packer. If I link Chromium for Android targeting ARM32 I get a
.rel.dyn of size 174693 bytes, as compared to 371832 bytes with gold
and the Android packer.
Differential Revision: https://reviews.llvm.org/D39152
llvm-svn: 316775
This is for PR34852.
GCC 8.0 or earlier have a bug that it emits R_386_GOTPC relocations
against _GLOBAL_OFFSET_TABLE for .debug_info. The bug seems to have
been fixed in 2017: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82630,
but we do not want LLD to report errors for such inputs.
In this patch we ignore such relocations.
Differential revision: https://reviews.llvm.org/D38625
llvm-svn: 316761
It was reported (https://reviews.llvm.org/D38724#902841) that when we use
-ffunction-sections --emit-relocs build, REL[A] output section receives the name of first
input section, like .rela.text.first_function_in_text rather than .rela.text.
It is probably not really an issue as sh_info still points to correct target section, but
it does not look clean in output and allows internal section name to leak there,
what at least looks confusing and is not consistent with ld.bfd.
Patch changes this behavior so that target output section name is used as a base.
Differential revision: https://reviews.llvm.org/D39242
llvm-svn: 316760
This change allows Thunks to be added on multiple passes. To do this we must
merge only the thunks added in each pass, and deal with thunks that have
drifted out of range of their callers.
A thunk may end out of range of its caller if enough thunks are added in
between the caller and the thunk. To handle this we create another thunk.
Differential Revision: https://reviews.llvm.org/D34692
llvm-svn: 316754
This change adds initial support for range extension thunks. All thunks must
be created within the first pass so some corner cases are not supported. A
follow up patch will add support for multiple passes.
With this change the existing tests arm-branch-error.s and
arm-thumb-branch-error.s now no longer fail with an out of range branch.
These have been renamed and tests added for the range extension thunk.
Differential Revision: https://reviews.llvm.org/D34691
llvm-svn: 316752
When an OutputSection is larger than the branch range for a Target we
need to place thunks such that they are always in range of their caller,
and sufficiently spaced to maximise the number of callers that can use
the thunk. We use the simple heuristic of placing the
ThunkSection at intervals corresponding to a target specific branch range.
If the OutputSection is small we put the thunks at the end of the executable
sections.
Differential Revision: https://reviews.llvm.org/D34689
llvm-svn: 316751