This adds some of LLD specific scopes and picks up optimisation scopes
via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in
77e6bb3c.
Differential Revision: https://reviews.llvm.org/D71060
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.
The bot failure should be fixed by D73418, committed as
af954e441a.
I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html
This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.
Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.
Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.
Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.
I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.
Depends on D71907 and D71911.
Reviewers: pcc, evgeny777, steven_wu, espindola
Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71913
I felt really sad to push this commit for my selfish purpose to make
glibc -static-pie build with lld. Some code constructs in glibc require
R_X86_64_GOTPCREL/R_X86_64_REX_GOTPCRELX referencing undefined weak to
be resolved to a GOT entry not relocated by R_X86_64_GLOB_DAT (GNU ld
behavior), e.g.
csu/libc-start.c
if (__pthread_initialize_minimal != NULL)
__pthread_initialize_minimal ();
elf/dl-object.c
void
_dl_add_to_namespace_list (struct link_map *new, Lmid_t nsid)
{
/* We modify the list of loaded objects. */
__rtld_lock_lock_recursive (GL(dl_load_write_lock));
Emitting a GLOB_DAT will make the address equal &__ehdr_start (true
value) and cause elf/ldconfig to segfault. glibc really should move away
from weak references, which do not have defined semantics.
Temporarily special case --no-dynamic-linker.
This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang.
It adds Intel CET (Control-flow Enforcement Technology) support to lld.
The implementation follows the draft version of psABI which you can
download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI.
CET introduces a new restriction on indirect jump instructions so that
you can limit the places to which you can jump to using indirect jumps.
In order to use the feature, you need to compile source files with
-fcf-protection=full.
* IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt.
* SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified.
IBT-enabled executables/shared objects have two PLT sections, ".plt" and
".plt.sec". For the details as to why we have two sections, please read
the comments.
Reviewed By: xiangzhangllvm
Differential Revision: https://reviews.llvm.org/D59780
Summary:
Add a flag `F_no_mmap` to `FileOutputBuffer` to support
`--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores
this flag for compatibility with GNU ld and gold.
We need this flag to speed up link time for large binaries in certain
scenarios. When we link some of our larger binaries we find that LLD
takes 50+ GB of memory, which causes memory pressure. The memory
pressure causes the VM to flush dirty pages of the output file to disk.
This is normally okay, since we should be flushing cold pages. However,
when using BtrFS with compression we need to write 128KB at a time when
we flush a page. If any page in that 128KB block is written again, then
it must be flushed a second time, and so on. Since LLD doesn't write
sequentially this causes write amplification. The same 128KB block will
end up being flushed multiple times, causing the linker to many times
more IO than necessary. We've observed 3-5x faster builds with
-no-mmap-output-file when we hit this scenario.
The bad scenario only applies to compressed filesystems, which group
together multiple pages into a single compressed block. I've tested
BtrFS, but the problem will be present for any compressed filesystem
on Linux, since it is caused by the VM.
Silently ignoring --no-mmap-output-file caused a silent regression when
we switched from gold to lld. We pass --no-mmap-output-file to fix this
edge case, but since lld silently ignored the flag we didn't realize it
wasn't being respected.
Benchmark building a 9 GB binary that exposes this edge case. I linked 3
times with --mmap-output-file and 3 times with --no-mmap-output-file and
took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM,
BtrFS mounted with -compress-force=zstd, and an 80% full disk.
| Mode | Time |
|---------|-------|
| mmap | 894 s |
| no mmap | 126 s |
When compression is disabled, BtrFS performs just as well with and
without mmap on this benchmark.
I was unable to reproduce the regression with any binaries in
lld-speed-test.
Reviewed By: ruiu, MaskRay
Differential Revision: https://reviews.llvm.org/D69294
Add a new '-z nognustack' option that suppresses emitting PT_GNU_STACK
segment. This segment is not supported at all on NetBSD (stack is
always non-executable), and the option is meant to be used to disable
emitting it.
Differential Revision: https://reviews.llvm.org/D56554
D64906 allows PT_LOAD to have overlapping p_offset ranges. In the
default R RX RW RW layout + -z noseparate-code case, we do not tail pad
segments when transiting to another segment. This can save at most
3*maxPageSize bytes.
a) Before D64906, we tail pad R, RX and the first RW.
b) With -z separate-code, we tail pad R and RX, but not the first RW (RELRO).
In some cases, b) saves one file page. In some cases, b) wastes one
virtual memory page. The waste is a concern on Fuchsia. Because it uses
compressed binaries, it doesn't benefit from the saved file page.
This patch adds -z separate-loadable-segments to restore the behavior before
D64906. It can affect section addresses and can thus be used as a
debugging mechanism (see PR43214 and ld.so partition bug in
crbug.com/998712).
Reviewed By: jakehehrlich, ruiu
Differential Revision: https://reviews.llvm.org/D67481
llvm-svn: 372807
The --fix-cortex-a8 option implements a linker workaround for the
coretex-a8 erratum 657417. A summary of the erratum conditions is:
- A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two
4KiB regions.
- The destination of the branch is to the first 4KiB region.
- The instruction before the branch is a 32-bit Thumb-2 non-branch
instruction.
The linker fix is to redirect the branch to a patch not in the first
4KiB region. The patch forwards the branch on to its target.
The cortex-a8, is an old CPU, with the first implementation of this
workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in
early Android Phones and there are some critical applications that still
need to run on a cortex-a8 that have the erratum. The patch is applied
roughly 10 times on LLD and 20 on Clang when they are built with
--fix-cortex-a8 on an Arm system.
The formal erratum description is avaliable in the ARM Core Cortex-A8
(AT400/AT401) Errata Notice document. This is available from Arm on
request but it seems to be findable via a web search.
Differential Revision: https://reviews.llvm.org/D67284
llvm-svn: 371965
We prioritize non-* wildcards overs VER_NDX_LOCAL/VER_NDX_GLOBAL "*".
This patch generalizes the rule to "*" of other versions and thus fixes PR40176.
I don't feel strongly about this GNU linkers' behavior but the
generalization simplifies code.
Delete `config->defaultSymbolVersion` which was used to special case
VER_NDX_LOCAL/VER_NDX_GLOBAL "*".
In `SymbolTable::scanVersionScript`, custom versions are handled the same
way as VER_NDX_LOCAL/VER_NDX_GLOBAL. So merge
`config->versionScript{Locals,Globals}` into `config->versionDefinitions`.
Overall this seems to simplify the code.
In `SymbolTable::assign{Exact,Wildcard}Versions`,
`sym->verdefIndex == config->defaultSymbolVersion` is changed to
`verdefIndex == UINT32_C(-1)`.
This allows us to give duplicate assignment diagnostics for
`{ global: foo; };` `V1 { global: foo; };`
In test/linkerscript/version-script.s:
vs_index of an undefined symbol changes from 0 to 1. This doesn't matter (arguably 1 is better because the binding is STB_GLOBAL) because vs_index of an undefined symbol is ignored.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D65716
llvm-svn: 367869
This patch
1) adds -z separate-code and -z noseparate-code (default).
2) changes the condition that the last page of last PF_X PT_LOAD is
padded with trap instructions.
Current condition (after D33630): if there is no `SECTIONS` commands.
After this change: if -z separate-code is specified.
-z separate-code was introduced to ld.bfd in 2018, to place the text
segment in its own pages. There is no overlap in pages between an
executable segment and a non-executable segment:
1) RX cannot load initial contents from R or RW(or non-SHF_ALLOC).
2) R and RW(or non-SHF_ALLOC) cannot load initial contents from RX.
lld's current status:
- Between R and RX: in `Writer<ELFT>::fixSectionAlignments()`, the start of a
segment is always aligned to maxPageSize, so the initial contents loaded by R
and RX do not overlap. I plan to allow overlaps in D64906 if -z noseparate-code
is in effect.
- Between RX and RW(or non-SHF_ALLOC if RW doesn't exist):
we currently unconditionally pad the last page to commonPageSize
(defaults to 4096 on all targets we support).
This patch will make it effective only if -z separate-code is specified.
-z separate-code is a dubious feature that intends to reduce the number
of ROP gadgets (which is actually ineffective because attackers can find
plenty of gadgets in the text segment, no need to find gadgets in
non-code regions).
With the overlapping PT_LOAD technique D64906, -z noseparate-code
removes two more alignments at segment boundaries than -z separate-code.
This saves at most defaultCommonPageSize*2 bytes, which are significant
on targets with large defaultCommonPageSize (AArch64/MIPS/PPC: 65536).
Issues/feedback on alignment at segment boundaries to help understand
the implication:
* binutils PR24490 (the situation on ld.bfd is worse because they have
two R-- on both sides of R-E so more alignments.)
* In binutils, the 2018-02-27 commit "ld: Add --enable-separate-code" made -z separate-code the default on Linux.
d969dea983
In musl-cross-make, binutils is configured with --disable-separate-code
to address size regressions caused by -z separate-code. (lld actually has the same
issue, which I plan to fix in a future patch. The ld.bfd x86 status is
worse because they default to max-page-size=0x200000).
* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237676 people want
smaller code size. This patch will remove one alignment boundary.
* Stef O'Rear: I'm opposed to any kind of page alignment at the
text/rodata line (having a partial page of text aliased as rodata and
vice versa has no demonstrable harm, and I actually care about small
systems).
So, make -z noseparate-code the default.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64903
llvm-svn: 367537
This patch is mechanically generated by clang-llvm-rename tool that I wrote
using Clang Refactoring Engine just for creating this patch. You can see the
source code of the tool at https://reviews.llvm.org/D64123. There's no manual
post-processing; you can generate the same patch by re-running the tool against
lld's code base.
Here is the main discussion thread to change the LLVM coding style:
https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
In the discussion thread, I proposed we use lld as a testbed for variable
naming scheme change, and this patch does that.
I chose to rename variables so that they are in camelCase, just because that
is a minimal change to make variables to start with a lowercase letter.
Note to downstream patch maintainers: if you are maintaining a downstream lld
repo, just rebasing ahead of this commit would cause massive merge conflicts
because this patch essentially changes every line in the lld subdirectory. But
there's a remedy.
clang-llvm-rename tool is a batch tool, so you can rename variables in your
downstream repo with the tool. Given that, here is how to rebase your repo to
a commit after the mass renaming:
1. rebase to the commit just before the mass variable renaming,
2. apply the tool to your downstream repo to mass-rename variables locally, and
3. rebase again to the head.
Most changes made by the tool should be identical for a downstream repo and
for the head, so at the step 3, almost all changes should be merged and
disappear. I'd expect that there would be some lines that you need to merge by
hand, but that shouldn't be too many.
Differential Revision: https://reviews.llvm.org/D64121
llvm-svn: 365595
Use -fsave-optimization-record=<format> to specify a different format
than the default, which is YAML.
For now, only YAML is supported.
llvm-svn: 363573
We create several types of synthetic sections for loadable partitions, including:
- The dynamic symbol table. This allows code outside of the loadable partitions
to find entry points with dlsym.
- Creating a dynamic symbol table also requires the creation of several other
synthetic sections for the partition, such as the dynamic table and hash table
sections.
- The partition's ELF header is represented as a synthetic section in the
combined output file, and will be used by llvm-objcopy to extract partitions.
Differential Revision: https://reviews.llvm.org/D62350
llvm-svn: 362819
Branch Target Identification (BTI) and Pointer Authentication (PAC) are
architecture features introduced in v8.5a and 8.3a respectively. The new
instructions have been added in the hint space so that binaries take
advantage of support where it exists yet still run on older hardware. The
impact of each feature is:
BTI: For executable pages that have been guarded, all indirect branches
must have a destination that is a BTI instruction of the appropriate type.
For the static linker, this means that PLT entries must have a "BTI c" as
the first instruction in the sequence. BTI is an all or nothing
property for a link unit, any indirect branch not landing on a valid
destination will cause a Branch Target Exception.
PAC: The dynamic loader encodes with PACIA the address of the destination
that the PLT entry will load from the .plt.got, placing the result in a
subset of the top-bits that are not valid virtual addresses. The PLT entry
may authenticate these top-bits using the AUTIA instruction before
branching to the destination. Use of PAC in PLT sequences is a contract
between the dynamic loader and the static linker, it is independent of
whether the relocatable objects use PAC.
BTI and PAC are independent features that can be combined. So we can have
several combinations of PLT:
- Standard with no BTI or PAC
- BTI PLT with "BTI c" as first instruction.
- PAC PLT with "AUTIA1716" before the indirect branch to X17.
- BTIPAC PLT with "BTI c" as first instruction and "AUTIA1716" before the
first indirect branch to X17.
The use of BTI and PAC in relocatable object files are encoded by feature
bits in the .note.gnu.property section in a similar way to Intel CET. There
is one AArch64 specific program property GNU_PROPERTY_AARCH64_FEATURE_1_AND
and two target feature bits defined:
- GNU_PROPERTY_AARCH64_FEATURE_1_BTI
-- All executable sections are compatible with BTI.
- GNU_PROPERTY_AARCH64_FEATURE_1_PAC
-- All executable sections have return address signing enabled.
Due to the properties of FEATURE_1_AND the static linker can tell when all
input relocatable objects have the BTI and PAC feature bits set. The static
linker uses this to enable the appropriate PLT sequence.
Neither -> standard PLT
GNU_PROPERTY_AARCH64_FEATURE_1_BTI -> BTI PLT
GNU_PROPERTY_AARCH64_FEATURE_1_PAC -> PAC PLT
Both properties -> BTIPAC PLT
In addition to the .note.gnu.properties there are two new command line
options:
--force-bti : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_BTI and warn for every relocatable object
that does not.
--pac-plt : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_PAC. As PAC is a contract between the loader
and static linker no warning is given if it is not present in an input.
Two processor specific dynamic tags are used to communicate that a non
standard PLT sequence is being used.
DTI_AARCH64_BTI_PLT and DTI_AARCH64_BTI_PAC.
Differential Revision: https://reviews.llvm.org/D62609
llvm-svn: 362793
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.
Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.
The design goals were to provide:
- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
environments (MSVC in particular).
Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.
In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:
1. There are no competing defined symbols in a given set of libraries, or
if they exist, the program owner doesn't care which is linked to their
program.
2. There may be circular dependencies between libraries.
The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:
.section ".deplibs","MS",@llvm_dependent_libraries,1
.asciz "foo"
For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.
LLD processes the dependent library specifiers in the following way:
1. Dependent libraries which are found from the specifiers in .deplibs sections
of relocatable object files are added when the linker decides to include that
file (which could itself be in a library) in the link. Dependent libraries
behave as if they were appended to the command line after all other options. As
a consequence the set of dependent libraries are searched last to resolve
symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
strings in a .deplibs section by; first, handling the string as if it was
specified on the command line; second, by looking for the string in each of the
library search paths in turn; third, by looking for a lib<string>.a or
lib<string>.so (depending on the current mode of the linker) in each of the
library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
dependent libraries.
Rationale for the above points:
1. Adding the dependent libraries last makes the process simple to understand
from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
will affect all of the dependent libraries. There is a potential problem of
surprise for developers, who might not realize that these options would apply
to these "invisible" input files; however, despite the potential for surprise,
this is easy for developers to reason about and gives developers the control
that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
find input files. The different search methods are tried by the linker in most
obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
is not necessary: if finer control is required developers can fall back to using
the command line directly.
RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.
Differential Revision: https://reviews.llvm.org/D60274
llvm-svn: 360984
Patch by Mark Johnston!
Summary:
When the option is configured, ifunc calls do not go through the PLT;
rather, they appear as regular function calls with relocations
referencing the ifunc symbol, and the resolver is invoked when
applying the relocation. This is intended for use in freestanding
environments where text relocations are permissible and is incompatible
with the -z text option. The option is motivated by ifunc usage in the
FreeBSD kernel, where ifuncs are used to elide CPU feature flag bit
checks in hot paths. Instead of replacing the cost of a branch with that
of an indirect function call, the -z ifunc-noplt option is used to ensure
that ifunc calls carry no hidden overhead relative to normal function
calls.
Test Plan:
I added a couple of regression tests and tested the FreeBSD kernel
build using the latest lld sources.
To demonstrate the effects of the change, I used a micro-benchmark
which results in frequent invocations of a FreeBSD kernel ifunc. The
benchmark was run with and without IBRS enabled, and with and without
-zifunc-noplt configured. The observed speedup is small and consistent,
and is significantly larger with IBRS enabled:
https://people.freebsd.org/~markj/ifunc-noplt/noibrs.txthttps://people.freebsd.org/~markj/ifunc-noplt/ibrs.txt
Reviewed By: ruiu, MaskRay
Differential Revision: https://reviews.llvm.org/D61613
llvm-svn: 360685
The -n (--nmagic) disables page alignment, and acts as a -Bstatic
The -N (--omagic) does what -n does but also marks the executable segment as
writeable. As page alignment is disabled headers are not allocated unless
explicit in the linker script.
To disable page alignment in LLD we choose to set the page sizes to 1 so
that any alignment based on the page size does nothing. To set the
Target->PageSize to 1 we implement -z common-page-size, which has the side
effect of allowing the user to set the value as well.
Setting the page alignments to 1 does mean that any use of
CONSTANT(MAXPAGESIZE) or CONSTANT(COMMONPAGESIZE) in a linker script will
return 1, unlike in ld.bfd. However given that -n and -N disable paging
these probably shouldn't be used in a linker script where -n or -N is in
use.
Differential Revision: https://reviews.llvm.org/D61688
llvm-svn: 360593
Patch by Tiancong Wang.
In D36351, Call-Chain Clustering (C3) heuristic is implemented with
option --call-graph-ordering-file <file>.
This patch adds a flag --print-symbol-order=<file> to LLD, and when
specified, it prints out the symbols ordered by the heuristics to the
file. The symbols printout is helpful to those who want to understand
the heuristics and want to reproduce the ordering with
--symbol-ordering-file in later pass.
Differential Revision: https://reviews.llvm.org/D59311
llvm-svn: 357133
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
Differential Revision: https://reviews.llvm.org/D59268
Original llvm-svn: 355964
llvm-svn: 355984
Currently we have -Rpass for filtering the remarks that are displayed as
diagnostics, but when using -fsave-optimization-record, there is no way
to filter the remarks while generating them.
This adds support for filtering remarks by passes using a regex.
Ex: `clang -fsave-optimization-record -foptimization-record-passes=inline`
will only emit the remarks coming from the pass `inline`.
This adds:
* `-fsave-optimization-record` to the driver
* `-opt-record-passes` to cc1
* `-lto-pass-remarks-filter` to the LTOCodeGenerator
* `--opt-remarks-passes` to lld
* `-pass-remarks-filter` to llc, opt, llvm-lto, llvm-lto2
* `-opt-remarks-passes` to gold-plugin
Differential Revision: https://reviews.llvm.org/D59268
llvm-svn: 355964
With the following changes:
1) Compilation fix:
std::atomic<bool> HasStaticTlsModel = false; ->
std::atomic<bool> HasStaticTlsModel{false};
2) Adjusted the comment in code.
Initial commit message:
DF_STATIC_TLS flag indicates that the shared object or executable
contains code using a static thread-local storage scheme.
Patch checks if IE/LE relocations were used to check if the code uses
a static model. If so it sets the DF_STATIC_TLS flag.
Differential revision: https://reviews.llvm.org/D57749
----
Modified : /lld/trunk/ELF/Arch/X86.cpp
Modified : /lld/trunk/ELF/Config.h
Modified : /lld/trunk/ELF/SyntheticSections.cpp
Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model1.s
Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model2.s
Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model3.s
Added : /lld/trunk/test/ELF/Inputs/i386-static-tls-model4.s
Added : /lld/trunk/test/ELF/i386-static-tls-model.s
Modified : /lld/trunk/test/ELF/i386-tls-ie-shared.s
Modified : /lld/trunk/test/ELF/tls-dynamic-i686.s
Modified : /lld/trunk/test/ELF/tls-opt-iele-i686-nopic.s
llvm-svn: 353299
DF_STATIC_TLS flag indicates that the shared object or executable
contains code using a static thread-local storage scheme.
Patch checks if IE/LE relocations were used to check if the code uses
a static model. If so it sets the DF_STATIC_TLS flag.
Differential revision: https://reviews.llvm.org/D57749
llvm-svn: 353293
Summary:
In ld.bfd/gold, --no-allow-shlib-undefined is the default when linking
an executable. This patch implements a check to error on undefined
symbols in a shared object, if all of its DT_NEEDED entries are seen.
Our approach resembles the one used in gold, achieves a good balance to
be useful but not too smart (ld.bfd traces all DSOs and emulates the
behavior of a dynamic linker to catch more cases).
The error is issued based on the symbol table, different from undefined
reference errors issued for relocations. It is most effective when there
are DSOs that were not linked with -z defs (e.g. when static sanitizers
runtime is used).
gold has a comment that some system libraries on GNU/Linux may have
spurious undefined references and thus system libraries should be
excluded (https://sourceware.org/bugzilla/show_bug.cgi?id=6811). The
story may have changed now but we make --allow-shlib-undefined the
default for now. Its interaction with -shared can be discussed in the
future.
Reviewers: ruiu, grimar, pcc, espindola
Reviewed By: ruiu
Subscribers: joerg, emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D57385
llvm-svn: 352826
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
By default LLD will generate position independent Thunks when the --pie or
--shared option is used. Reference to absolute addresses is permitted in
other cases. For some embedded systems position independent thunks are
needed for code that executes before the MMU has been set up. The option
--pic-veneer is used by ld.bfd to force position independent thunks.
The patch adds --pic-veneer as the option is needed for the Linux kernel
on Arm.
fixes pr39886
Differential Revision: https://reviews.llvm.org/D55505
llvm-svn: 351326
`--plugin-opt=emit-llvm` is an option for LTO. It makes the linker to
combine all bitcode files and write the result to an output file without
doing codegen. Gold LTO plugin has this option.
This option is being used for some post-link code analysis tools that
have to see a whole program but don't need to see them in the native
machine code.
Differential Revision: https://reviews.llvm.org/D55717
llvm-svn: 349198
This is https://bugs.llvm.org//show_bug.cgi?id=38978
Spec says that:
"Objects may be built with the -z nodefaultlib option to
suppress any search of the default locations at runtime.
Use of this option implies that all the dependencies of an
object can be located using its runpaths.
Without this option, which is the most common case, no
matter how you augment the runtime linker's library
search path, its last element is always /usr/lib for 32-bit
objects and /usr/lib/64 for 64-bit objects."
The patch implements this option.
Differential revision: https://reviews.llvm.org/D54577
llvm-svn: 347647
Recommitting https://reviews.llvm.org/rL344544 after fixing undefined behavior
from left-shifting a negative value. Original commit message:
This support is slightly different then the X86_64 implementation in that calls
to __morestack don't need to get rewritten to calls to __moresatck_non_split
when a split-stack caller calls a non-split-stack callee. Instead the size of
the stack frame requested by the caller is adjusted prior to the call to
__morestack. The size the stack-frame will be adjusted by is tune-able through a
new --split-stack-adjust-size option.
llvm-svn: 344622
This reverts commit https://reviews.llvm.org/rL344544, which causes failures on
a undefined behaviour sanitizer bot -->
lld/ELF/Arch/PPC64.cpp:849:35: runtime error: left shift of negative value -1
llvm-svn: 344551
This support is slightly different then the X86_64 implementation in that calls
to __morestack don't need to get rewritten to calls to __moresatck_non_split
when a split-stack caller calls a non-split-stack callee. Instead the size of
the stack frame requested by the caller is adjusted prior to the call to
__morestack. The size the stack-frame will be adjusted by is tune-able through a
new --split-stack-adjust-size option.
Differential Revision: https://reviews.llvm.org/D52099
llvm-svn: 344544
Summary:
This patch adds a new flag, --warn-ifunc-textrel, to work around a glibc bug. When a code with ifunc symbols is used to produce an object file with text relocations, lld always succeeds. However, if that object file is linked using an old version of glibc, the resultant binary just crashes with segmentation fault when it is run (The bug is going to be corrected as of glibc 2.19).
Since there is no way to tell beforehand what library the object file will be linked against in the future, there does not seem to be a fool-proof way for lld to give an error only in cases where the binary will crash. So, with this change (dated 2018-09-25), lld starts to give a warning, contingent on a new command line flag that does not have a gnu counter part. The default value for --warn-ifunc-textrel is false, so lld behaviour will not change unless the user explicitly asks lld to give a warning. Users that link with a glibc library with version 2.19 or newer, or does not use ifunc symbols, or does not generate object files with text relocations do not need to take any action. Other users may consider to start passing warn-ifunc-textrel to lld to get early warnings.
Reviewers: ruiu, espindola
Reviewed By: ruiu
Subscribers: grimar, MaskRay, markj, emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D52430
llvm-svn: 343628
The access sequence for global variables in the medium and large code models use
2 instructions to add an offset to the toc-pointer. If the offset fits whithin
16-bits then the instruction that sets the high 16 bits is redundant.
This patch adds the --toc-optimize option, (on by default) and enables rewriting
of 2 instruction global variable accesses into 1 when the offset from the
TOC-pointer to the variable (or .got entry) fits in 16 signed bits. eg
addis %r3, %r2, 0 --> nop
addi %r3, %r3, -0x8000 --> addi %r3, %r2, -0x8000
This rewriting can be disabled with the --no-toc-optimize flag
Differential Revision: https://reviews.llvm.org/D49237
llvm-svn: 342602
-z interpose sets the DF_1_INTERPOSE flag, marking the object as an
interposer.
Via FreeBSD PR 230604, linking Valgrind with lld failed.
Differential Revision: https://reviews.llvm.org/D52094
llvm-svn: 342239
The code involved was simply dead. `IgnoreAll` value is used in
`maybeReportUndefined` only which is never called for -r.
And at the same time `IgnoreAll` was set only for -r.
llvm-svn: 339672
GNU ld's manual says that TARGET(foo) is basically an alias for
`--format foo` where foo is a BFD target name such as elf64-x86-64.
Unlike GNU linkers, lld doesn't allow arbitrary BFD target name for
--format. We accept only "default", "elf" or "binary". This makes
situation a bit tricky because we can't simply make TARGET an alias for
--target.
A quick code search revealed that the usage number of TARGET is very
small, and the only meaningful usage is to switch to the binary mode.
Thus, in this patch, we handle only TARGET(elf.*) and TARGET(binary).
Differential Revision: https://reviews.llvm.org/D48153
llvm-svn: 339060