Before this patch:
Symbol 56
Defined 80
Undefined 56
SharedSymbol 88
LazyArchive 72
LazyObject 56
With this patch
Symbol 48
Defined 72
Undefined 48
SharedSymbol 80
LazyArchive 64
LazyObject 48
The result is that peak allocation when linking chromium (according to
heaptrack) goes from 578 to 568 MB.
llvm-svn: 330874
Currently, LLD supports ASSERT as a separate command.
We support two forms now.
Assign expression-form: . = ASSERT(0x100)
(old GNU ld required it and some scripts in the wild are still using
something like . = ASSERT((_end - _text <= (512 * 1024 * 1024)), "kernel image bigger than KERNEL_IMAGE_SIZE");
Nowadays above is not a mandatory form and command-like form is commonly used:
ASSERT(<expr>, "text);
The return value of the ASSERT is Dot. That was implemented in D30171.
It looks like (2) is just a short version of (1) then.
GNU ld does *not* list ASSERT as a SECTIONS command:
https://sourceware.org/binutils/docs/ld/SECTIONS.html#SECTIONS
Given above we probably can change ASSERT to be an assignment to Dot.
That makes the rest of the code much simpler. Patch do that.
Differential revision: https://reviews.llvm.org/D45434
llvm-svn: 330814
The fix is to copy Used when replacing the symbol.
Original message:
Do not keep shared symbols created from garbage-collected eliminated DSOs.
If all references to a DSO happen to be weak, and if the DSO is
specified with --as-needed, the DSO is not added to DT_NEEDED.
If that happens, we also need to eliminate shared symbols created
from the DSO. Otherwise, they become dangling references that point
to non-exsitent DSO.
Fixes https://bugs.llvm.org/show_bug.cgi?id=36991
Differential Revision: https://reviews.llvm.org/D45536
llvm-svn: 330788
It turns out we should not use the std::sort anymore.
r327219 added a new wrapper llvm::sort (D39245).
When EXPENSIVE_CHECKS is defined, it shuffles the
input container and that helps to find non-deterministic
ordering.
Patch changes code to use llvm::sort and std::stable_sort
instead of std::sort
Differential revision: https://reviews.llvm.org/D45969
llvm-svn: 330702
Our code for LazyObject and LazyArchive duplicates.
This patch extracts the common part to remove
the duplication.
Differential revision: https://reviews.llvm.org/D45516
llvm-svn: 330701
The PPC64 V2 ABI restores the toc base by loading from an offset of 24 from r1.
This patch fixes the offset and updates the testcases from V1 to V2. It also
issues an error when a nop is missing after a call to an external function.
Differential Revision: https://reviews.llvm.org/D45892
llvm-svn: 330600
Now that we don't ICF synthetic sections, we can go back to the old
logic on whose responsibility it is to check Repl.
The idea is that Sec->something() will not check Repl. It is the
responsibility of the caller to find the correct Sec.
llvm-svn: 330346
We had a single symbol using -1 with a synthetic section. It is
simpler to just update its value.
This is not a big will by itself, but will allow having a simple
getOffset for InputSeciton.
llvm-svn: 330340
Using getOffset is here was a bit of an overkill. This is being
written and has relocations. This implies it is a .eh_frame or regular
section.
llvm-svn: 330307
This is causing large numbers of Chromium test executables to crash on
shutdown. The relevant symbol seems to be __cxa_finalize, which gets
removed from the dynamic symbol table for some of the support libraries.
llvm-svn: 330164
MIPS ABI requires creation of the MIPS_RLD_MAP dynamic tag for non-PIE
executables only and MIPS_RLD_MAP_REL tag for both PIE and non-PIE
executables. The patch skips definition of the MIPS_RLD_MAP for PIE
files and defines MIPS_RLD_MAP_REL.
The MIPS_RLD_MAP_REL tag stores the offset to the .rld_map section
relative to the address of the tag itself.
Differential Revision: https://reviews.llvm.org/D43347
llvm-svn: 329996
If all references to a DSO happen to be weak, and if the DSO is
specified with --as-needed, the DSO is not added to DT_NEEDED.
If that happens, we also need to eliminate shared symbols created
from the DSO. Otherwise, they become dangling references that point
to non-exsitent DSO.
Fixes https://bugs.llvm.org/show_bug.cgi?id=36991
Differential Revision: https://reviews.llvm.org/D45536
llvm-svn: 329960
LLVM_ON_WIN32 is set exactly with MSVC and MinGW (but not Cygwin) in
HandleLLVMOptions.cmake, which is where _WIN32 defined too. Just use the
default macro instead of a reinvented one.
See thread "Replacing LLVM_ON_WIN32 with just _WIN32" on llvm-dev and cfe-dev.
No intended behavior change.
llvm-svn: 329696
I'm proposing a new command line flag, --warn-backrefs in this patch.
The flag and the feature proposed below don't exist in GNU linkers
nor the current lld.
--warn-backrefs is an option to detect reverse or cyclic dependencies
between static archives, and it can be used to keep your program
compatible with GNU linkers after you switch to lld. I'll explain the
feature and why you may find it useful below.
lld's symbol resolution semantics is more relaxed than traditional
Unix linkers. Therefore,
ld.lld foo.a bar.o
succeeds even if bar.o contains an undefined symbol that have to be
resolved by some object file in foo.a. Traditional Unix linkers
don't allow this kind of backward reference, as they visit each
file only once from left to right in the command line while
resolving all undefined symbol at the moment of visiting.
In the above case, since there's no undefined symbol when a linker
visits foo.a, no files are pulled out from foo.a, and because the
linker forgets about foo.a after visiting, it can't resolve
undefined symbols that could have been resolved otherwise.
That lld accepts more relaxed form means (besides it makes more
sense) that you can accidentally write a command line or a build
file that works only with lld, even if you have a plan to
distribute it to wider users who may be using GNU linkers. With
--check-library-dependency, you can detect a library order that
doesn't work with other Unix linkers.
The option is also useful to detect cyclic dependencies between
static archives. Again, lld accepts
ld.lld foo.a bar.a
even if foo.a and bar.a depend on each other. With --warn-backrefs
it is handled as an error.
Here is how the option works. We assign a group ID to each file. A
file with a smaller group ID can pull out object files from an
archive file with an equal or greater group ID. Otherwise, it is a
reverse dependency and an error.
A file outside --{start,end}-group gets a fresh ID when
instantiated. All files within the same --{start,end}-group get the
same group ID. E.g.
ld.lld A B --start-group C D --end-group E
A and B form group 0, C, D and their member object files form group
1, and E forms group 2. I think that you can see how this group
assignment rule simulates the traditional linker's semantics.
Differential Revision: https://reviews.llvm.org/D45195
llvm-svn: 329636
Currently LLD sets OutSecOff in addSection for input sections.
That is a fake offset (just a rude approximation to remember the order),
used for sorting SHF_LINK_ORDER sections
(see resolveShfLinkOrder, compareByFilePosition).
There are 2 problems with such approach:
1. We currently change and reuse Size field as a value assigned. Changing size is
not good because leads to bugs. Currently, SIZEOF(.bss) for empty .bss returns 2
because we add two empty synthetic sections and increase size twice by 1.
(See PR37011: https://bugs.llvm.org/show_bug.cgi?id=37011)
2. Such approach simply does not work when --symbol-ordering-file is involved,
because processing of the ordering file might break the initial section order.
This fixes PR37011.
Differential revision: https://reviews.llvm.org/D45368
llvm-svn: 329560
This is for PR36716 and
this enables emitting STT_FILE symbols.
Output size affect is minor:
lld binary size changes from 52,883,408 to 52,949,400
clang binary size changes from 83,136,456 to 83,219,600
Differential revision: https://reviews.llvm.org/D45261
llvm-svn: 329557
Some system libraries have a lot of versioned symbols. When linking
scylla this brings the number of malloc calls from 49154 to 37944.
llvm-svn: 329453
Currently there are a few odd things about the warning about symbols
that cannot be ordered. This patch fixes:
* When there is an undefined symbol that resolves to a shared file, we
were printing the location of the undefined reference.
* If there are multiple comdats, we were reporting them all.
llvm-svn: 329371
This is similar to r329219, but for the entire section. Like r329219 I
don't expect this to have any real impact, it is just more consistent
and simpler.
llvm-svn: 329367
Previously, "size" column is 9 characters long which is too long
at least for 32-bit (because at maximum it needs 8 columns). This
patch make it one column shorter than before. That's also a reasonable
default for 64-bit.
llvm-svn: 329317
As was mentioned in comments for D45158,
isPicRel's name does not make much sense,
because what this method does is checks if
we need to create the dynamic relocation or not.
Instead of renaming it to something different,
we can 'isPicRel' completely.
We can reuse the getDynRel method.
They are logically very close, getDynRel can just return
R_*_NONE in case no dynamic relocation should be produced
and that would simplify things and avoid functionality
correlation/duplication with 'isPicRel'.
The patch does this change.
Differential revision: https://reviews.llvm.org/D45248
llvm-svn: 329275
Currently, LLD print symbol assignment commands to the map file,
but it does not do that for assignments that are outside of the section
descriptions. Such assignments can affect the layout though.
The patch implements the following:
* Teaches LLD to print symbol assignments outside of section declaration.
* Teaches LLD to print PROVIDE/HIDDEN/PROVIDE hidden commands.
In case when symbol is not provided, nothing will be printed.
Differential revision: https://reviews.llvm.org/D44894
llvm-svn: 329272
Currently, LLD prints VA, but not LMA in a map file.
It seems can be useful to print both to reveal layout
details and patch implements it.
Differential revision: https://reviews.llvm.org/D44899
llvm-svn: 329271
We were ignoring the addend if the piece was dead. I don't expect this
to make a difference in any real world situations, but it is simpler
anyway.
llvm-svn: 329219
isPicRel is used to check if we want to create the dynamic relocations.
Not all of the dynamic relocations we create are passing through this
check, but those that are, probably better be whitelisted.
Differential revision: https://reviews.llvm.org/D45252
llvm-svn: 329203
In the lld perf builder r328686 had a negative impact in
stalled-cycles-frontend. Somehow that stat is not showing on my
machine, but the attached patch shows an improvement on cache-misses,
which is probably a reasonable proxy.
My working theory is that given a large input the pieces vector is out
of cache by the time initOffsetMap runs.
Both finalizeContents implementation have a convenient location for
initializing the OffsetMap, so this seems the best solution.
llvm-svn: 329117
Summary:
r328610 fixed a data race in the COFF linker. This change makes a
similar fix to the ELF linker.
Reviewers: ruiu, pcc, rnk
Reviewed By: ruiu
Subscribers: emaste, llvm-commits, arichardson
Differential Revision: https://reviews.llvm.org/D45192
llvm-svn: 329088
Added checks to test that we do not produce
output where VA of sections overruns the address
space available.
Differential revision: https://reviews.llvm.org/D43820
llvm-svn: 329063
This fixes PR36927.
The issue is next. Imagine we have -Ttext 0x7c and code below.
.code16
.global _start
_start:
movb $_start+0x83,%ah
So we have R_386_8 relocation and _start at 0x7C.
Addend is 0x83 == 131. We will sign extend it to 0xffffffffffffff83.
Now, 0xffffffffffffff83 + 0x7c gives us 0xFFFFFFFFFFFFFFFF.
Techically 0x83 + 0x7c == 0xFF, we do not exceed 1 byte value, but
currently LLD errors out, because we use checkUInt<8>.
Let's try to use checkInt<8> now and the following code to see if it can help (no):
main.s:
.byte foo
input.s:
.globl foo
.hidden foo
foo = 0xff
Here, foo is 0xFF. And addend is 0x0. Final value is 0x00000000000000FF.
Again, it fits one byte well, but with checkInt<8>,
we would error out it, so we can't use it.
What we want to do is to check that the result fits 1 byte well.
Patch changes the check to checkIntUInt to fix the issue.
Differential revision: https://reviews.llvm.org/D45051
llvm-svn: 329061
Having 8/16 bits dynamic relocations is incorrect.
Both gold and bfd (built from latest sources) disallow
that too.
Differential revision: https://reviews.llvm.org/D45158
llvm-svn: 329059
OffsetMap maps to a SectionPiece index, but we were not taking
advantage of that in getSectionPiece.
With this patch both getOffset and getSectionPiece use OffsetMap and
the binary search is moved to findSectionPiece.
llvm-svn: 329044
They are to pull out an object file for a symbol, but for a historical
reason the code is written in two separate functions. This patch
merges them.
llvm-svn: 329039
The Plt relative relocations are R_PPC64_JMP_SLOT in the V2 abi, and we only
reserve 2 double words instead of 3 at the start of the array of PLT entries for
lazy linking.
Differential Revision: https://reviews.llvm.org/D44951
llvm-svn: 329006
It generally does not matter much where we place sections ordered
by --symbol-ordering-file relative to other sections. But if the
ordered sections are hot (which is the case already for some users
of --symbol-ordering-file, and is increasingly more likely to be
the case once profile-guided section layout lands) and the target
has limited-range branches, it is beneficial to place the ordered
sections in the middle of the output section in order to decrease
the likelihood that a range extension thunk will be required to call
a hot function from a cold function or vice versa.
That is what this patch does. After D44966 it reduces the size of
Chromium for Android's .text section by 60KB.
Differential Revision: https://reviews.llvm.org/D44969
llvm-svn: 328905
Now that we have the ability to create short thunks, it is beneficial
for thunk sections to be surrounded by ThunkSectionSpacing bytes
of code on both sides in order to increase the likelihood that the
distance from the thunk to the target will be sufficiently small to
allow for the creation of a short thunk. This is currently the case
for most thunks that we create, except for the last one, which could,
depending on the size of the output section, potentially appear near
the end and therefore have a relatively small amount of code after it.
This patch moves the last thunk section to ThunkSectionSpacing bytes
before the end of the output section, as long as the section is larger
than 2*ThunkSectionSpacing bytes. It reduces the size of Chromium
for Android's .text section by 32KB.
Differential Revision: https://reviews.llvm.org/D44966
llvm-svn: 328889
I tried a few different designs to find a way to implement it without
too much hassle and settled down with this. Unlike before, object files
given as arguments for --just-symbols are handled as object files, with
an exception that their section tables are handled as if they were all
null.
Differential Revision: https://reviews.llvm.org/D42025
llvm-svn: 328852
A short thunk uses a direct branch (b or b.w) instruction, and is used
when the target has the same thumbness as the thunk and is within
direct branch range (32MB for ARM, 16MB for Thumb-2). Reduces the
size of Chromium for Android's .text section by around 160KB.
Differential Revision: https://reviews.llvm.org/D44963
llvm-svn: 328846
This patch fixes an issue introduced in r328810 which made the algorithm
to always run the loop O(n^2) times, though we can break early. The
output remains the same.
llvm-svn: 328811
The PLT retpoline support for X86 and X86_64 did not include the padding
when writing the header and entries. This issue was revealed when linker
scripts were used, as this disables the built-in behaviour of filling
the last page of executable segments with trap instructions. This
particular behaviour was hiding the missing padding.
Added retpoline tests with linker scripts.
Differential Revision: https://reviews.llvm.org/D44682
llvm-svn: 328777
NonLocal is technically more accurate, but we already use the term
"Global" to specify the non-local part of the symbol table, and
Local <-> Global is easier to digest.
llvm-svn: 328740
This fixes pr36623.
The problem is that we have to parse versions out of names before LTO
so that LTO can use that information.
When we get the LTO produced .o files, we replace the previous symbols
with the LTO produced ones, but they still have @ in their names.
We could just trim the name directly, but calling parseSymbolVersion
to do it is simpler.
llvm-svn: 328738
Also make certain Thunk methods non-const as this will be required for
an upcoming change.
Differential Revision: https://reviews.llvm.org/D44961
llvm-svn: 328732
Some tools (dwarfdump for example) get confused by the current -O0 -r
output since it has multiple copies of .debug_str.
We cannot just merge sections with the same name as they can have
different sh_entsize.
We could have duplicated logic for merging sections based on name and
sh_entsize, but it seems better to just use the existing logic by
enabling optimizations.
llvm-svn: 328640
The Data member of synthetic section's is not valid and empty. The Data
member is required to be valid by ICF as it is used by ICF to determine
the equality of section contents. Therefore, exclude synthetic sections
from ICF.
Fixes bug PR36910.
Differential Revision: https://reviews.llvm.org/D44923
llvm-svn: 328624
SharedFile::parseRest function grew organically and got a bit hard to
understand. This patch refactor it. This patch also adds comments.
Differential Revision: https://reviews.llvm.org/D44860
llvm-svn: 328579
When the target saves ElfSym::GlobalOffsetTable in the .got rather than
.got.plt, Target->GotHeaderEntriesNum states the number of extra entries
required in the .got. Rather than having to add Target->GotHeaderEntriesNum to
NumEntries in every function which refers to NumEntries, this patch changes the
initial value of NumEntries in the constructor.
Differential Revision: https://reviews.llvm.org/D44744
llvm-svn: 328559
Currently, we might have a bug with scripts like below:
.foo : ALIGN(8)
{
*(.foo)
} > ram
because do not expand the memory region when doing ALIGN.
This might result in file range overlaps. The patch fixes the issue.
Differential revision: https://reviews.llvm.org/D44730
llvm-svn: 328479
Currently when we build input sections list in linker script
we ignore all rel[a] sections. That was done to support
scripts like .rela.dyn : { *(.rela.data) } for emit relocs.
Though as a result following scripts were also silently ignored:
/DISCARD/ : { *(.rela.plt)
/DISCARD/ : { *(.rela.dyn)
and we produced output with this sections. That is not ideal.
The solution this patch suggests is simple: do not ignore synthetic
rel[a] sections. That way we can enable common discarding logic
for them and report a proper error.
Differential revision: https://reviews.llvm.org/D41640
llvm-svn: 328419
Previously, we used 0 as an alias for VER_NDX_GLOBAL and had a dummy
entry in SharedFile::Verdefs so that the access to the array is within
its boundary. But that's not straightforwad. We can just stop doing both.
llvm-svn: 328401
Since SectionBase::getOutputSection handles ICF replaces and
SectionBase::getOffset was handling it in some cases, it is more
consistent to have getOffset always handle it.
llvm-svn: 328391
When looking for the output section and the output offset the
expectation was that the caller had looked at Repl. That works fine
for InputSections, but in the case of MergeInputSections the caller
doesn't have the section that is actually replaced.
The original testcase was failing because getOutputSection was
returning null. The slightly extended testcase also checks that
getOffset also checks Repl.
I will send a refactoring separetelly.
llvm-svn: 328332
This fixes PR36367 which is about segfault when --emit-relocs is
used together with .eh_frame sections which happens because
of reordering of regular and .rel[a] sections.
Path changes loop that iterates over input sections to create
relocation target sections first.
Differential revision: https://reviews.llvm.org/D44679
llvm-svn: 328299
The relocations R_PPC64_REL16_LO and R_PPC64_REL16_HA should return R_PC
for getRelExpr since they compute #lo(S + A – P) and #ha(S + A – P).
Differential Revision: https://reviews.llvm.org/D44648
llvm-svn: 328103
Patch teaches LLD to hint user about -fdebug-types-section flag
if relocation overflow happens in debug section.
Differential revision: https://reviews.llvm.org/D40954
llvm-svn: 328081
There are no reasons for them to be STV_DEFAULT,
recently bfd did the same change.
Differential revision: https://reviews.llvm.org/D44566
llvm-svn: 327983
Choosing a Shift2 value based on wordsize is cargo-culted from gold.
Assuming that djb hash is a good hash function, choosing bits [4,9]
shouldn't be any worse or better than choosing bits [5,10]. We shouldn't
have copied that behavior that we can't justify in the first place.
Differential Revision: https://reviews.llvm.org/D44547
llvm-svn: 327921
We found that when you pass --allow-multiple-definitions or `-z muldefs`
to GNU linkers, they don't complain about duplicate symbols at all. They
don't even print out warnings on it. We emit warnings in that case.
If you pass --fatal-warnings, that difference results in a link failure.
Differential Revision: https://reviews.llvm.org/D44549
llvm-svn: 327920
This patch adds changes to start supporting the Power 64-Bit ELF V2 ABI.
This includes:
- Changing the ElfSym::GlobalOffsetTable to be named .TOC.
- Creating a GotHeader so the first entry in the .got is .TOC.
- Setting the e_flags to be 1 for ELF V1 and 2 for ELF V2
Differential Revision: https://reviews.llvm.org/D44483
llvm-svn: 327871
This is the same as 327248 except Arm defining _GLOBAL_OFFSET_TABLE_ to
be the base of the .got section as some existing code is relying upon it.
For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at
the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] =
reserved value that is by convention the address of the dynamic section.
Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end
of the .got section with the intention that the .got.plt section would
follow the .got. However this does not always hold with the current
default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent
with the reserved first entry of the .got.plt.
X86, X86_64 and AArch64 will use the .got.plt. Arm, Mips and Power use .got
Fixes PR36555
Differential Revision: https://reviews.llvm.org/D44259
llvm-svn: 327823
This change broke ARM code that expects to be able to add
_GLOBAL_OFFSET_TABLE_ to the result of an R_ARM_REL32.
I will provide a reproducer on llvm-commits.
llvm-svn: 327688
Patch teaches LLD to print BYTE/SHORT/LONG/QUAD and
location move commands to the map file.
Differential revision: https://reviews.llvm.org/D44004
llvm-svn: 327612
This is an option to print out a table of symbols and filenames.
The output format of this option is the same as GNU, so that it can be
processed by the same scripts as before after migrating from GNU to lld.
This option is mildly useful; we can live without it. But it is pretty
convenient sometimes, and it can be implemented in 50 lines of code, so
I think lld should support this option.
Differential Revision: https://reviews.llvm.org/D44336
llvm-svn: 327565
This "fixes" PR36678 by just producing an error when we find a case
where we would produce an plt entry that used ebx but ebx would not be
set.
llvm-svn: 327542
Currently, we can end up with NBuckets==0 and android loader
does not like it (PR36537).
Seems we can go with a minimal amount of changes here for simplicity
and be consistent with gold and so just always use >= 1 value for NBuckets.
Differential revision: https://reviews.llvm.org/D44422
llvm-svn: 327481
This finishes PR35877.
INSERT BEFORE used similar to INSERT AFTER,
it inserts sections before the given target section.
Differential revision: https://reviews.llvm.org/D44380
llvm-svn: 327378
This is part of PR36515.
With some linkerscripts it is possible to get file offset overlaps
and overflows. Currently LLD checks overlaps in checkNoOverlappingSections().
And also we allow broken output with --no-inhibit-exec.
Problem is that sometimes final offset of sections is completely broken
and we calculate output file size wrong and might crash.
Patch implements check to verify that there is no output section
which offset exceeds file size.
Differential revision: https://reviews.llvm.org/D43819
llvm-svn: 327376
This fixes PR36598.
LLD currently crashes when we have empty output section
with SHF_LINK_ORDER flag. This might happen if we place an
empty synthetic section in the linker script, but keep output
section alive with the use of additional symbol, for example.
The patch fixes the issue by dropping all special flags
for empty sections.
Differential revision: https://reviews.llvm.org/D44376
llvm-svn: 327374
AFTER keyword is mandatory and consume() was
used by mistake here. We accepted broken script before
this patch, testcase shows the issue.
llvm-svn: 327260
the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] =
reserved value that is by convention the address of the dynamic section.
Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end
of the .got section with the intention that the .got.plt section would
follow the .got. However this does not always hold with the current
default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent
with the reserved first entry of the .got.plt.
X86, X86_64, Arm and AArch64 will use the .got.plt. Mips and Power use .got
Fixes PR36555
Differential Revision: https://reviews.llvm.org/D44259
llvm-svn: 327248