Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.
Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.
There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.
The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.
As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.
In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.
llvm-svn: 265293
For now just treat such sections as non-mergeable.
Resubmit r263660 with test fix.
Differential Revision: http://reviews.llvm.org/D18225
llvm-svn: 263664
which was reverted because included
unrelative changes by mistake.
Original commit message:
[ELF] - Change all messages to lowercase to be consistent.
That is directly opposite to http://reviews.llvm.org/D18045,
which was reverted.
This patch changes all messages to start from lowercase letter if
they were not before.
That is done to be consistent with clang.
Differential revision: http://reviews.llvm.org/D18085
llvm-svn: 263337
That is directly opposite to http://reviews.llvm.org/D18045,
which was reverted.
This patch changes all messages to start from lowercase letter if
they were not before.
That is done to be consistent with clang.
Differential revision: http://reviews.llvm.org/D18085
llvm-svn: 263252
In lld we usually avoid hash lookups. In addition to that, IR names are
not fully mangled, so it is best to avoid using them whenever possible.
llvm-svn: 263248
It was discussed to make all messages be
lowercase to be consistent with clang.
(also reverts the r263128 which fixed
build bot fail after r263125)
Original commit message:
[ELF] - Consistent spelling for error/warning messages
Previously error and warnings were not consistent in lld.
Some of them started from lowercase letter, others from
uppercase. Also there was one or two which had a dot at the end.
This patch changes all messages to start from uppercase letter if
they were not before.
Differential revision: http://reviews.llvm.org/D18045
llvm-svn: 263240
Summary:
More generally, appending linkage is a special case that we don't want
to create a SymbolBody for.
Reviewers: rafael, ruiu
Subscribers: Bigcheese, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D18012
llvm-svn: 263179
Previously error and warnings were not consistent in lld.
Some of them started from lowercase letter, others from
uppercase. Also there was one or two which had a dot at the end.
This patch changes all messages to start from uppercase letter if
they were not before.
Differential revision: http://reviews.llvm.org/D18045
llvm-svn: 263125
Each object file compiled in split stack mode will have an empty
section with a special name: .note.GNU-split-stack
We don't support this in linker now, so patch just adds an error out for that.
Differential revision: http://reviews.llvm.org/D17918
llvm-svn: 263039
Summary:
Is there any other code needed for correctly handling appending linkage?
Do we need to do something more with @llvm.global_ctors in
SymbolTable.cpp:addBitcodeFile; otherwise the combined bitcode module
won't have all the global ctors.
Reviewers: rafael
Subscribers: Bigcheese, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D17975
llvm-svn: 262992
Summary:
This causes the issue in PR26872 to go away now that we aren't creating
symbols for the string literals, but that may just be concealing a
deeper problem, so best to keep that PR open.
Reviewers: rafael
Subscribers: Bigcheese, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D17971
llvm-svn: 262968
There was a known limitation for -r option:
relocations against local symbols were not supported.
For example rel[a].eh_frame sections contained relocations against sections
and that was not supported for -r before. Patch fixes that.
Differential review: http://reviews.llvm.org/D17813
llvm-svn: 262590
This patch implements the same algorithm as LLD/COFF's ICF. I'm
not going to repeat the same description about how it works, so you
want to read the comment in ICF.cpp in this patch if you want to know
the details. This algorithm should be more powerful than the ICF
algorithm implemented in GNU gold. It can even merge mutually-recursive
functions (which is harder than one might think).
ICF is a fairly effective size optimization. Here are some examples.
LLD: 37.14 MB -> 35.80 MB (-3.6%)
Clang: 59.41 MB -> 57.80 MB (-2.7%)
The lacking feature is "safe" version of ICF. This merges all
identical sections. That is not compatible with a C/C++ language
requirement that two distinct functions must have distinct addresses.
But as long as your program do not rely on the pointer equality
(which is in many cases true), your program should work with the
feature. LLD works fine for example.
GNU gold implements so-called "safe ICF" that identifies functions
that are safe to merge by heuristics -- for example, gold thinks
that constructors are safe to merge because there is no way to
take an address of a constructor in C++. We have a different idea
which David Majnemer suggested that we add NOPs at beginning of
merged functions so that two or more pointers can have distinct
values. We can do whichever we want, but this patch does not
include neither.
http://reviews.llvm.org/D17529
llvm-svn: 261912
-r, -relocatable - Generate relocatable output
Currently does not have support for files containing
relocation sections with entries that refer to local
symbols (like rel[a].eh_frame which refer to sections
and not to symbols)
Differential revision: http://reviews.llvm.org/D14382
llvm-svn: 261838
"Discarded" section is a marker for discarded sections, and we do not
use the instance except for checking its identity. In that sense, it
is just another type of a "null" pointer for InputSectionBase. So,
it doesn't have to be a real instance of InputSectionBase class.
In this patch, we no longer instantiate Discarded section but instead
use -1 as a pointer value. This eliminates a global variable which
needed initialization at startup.
llvm-svn: 261761
This patch moves BitcodeFile instantiation into createObjectFile.
Previously, we handle bitcode files outside that function and did
not handle for --whole-archive.
http://reviews.llvm.org/D17527
llvm-svn: 261663
This reduces the .rodata of scyladb from 4501932 to 4334639 bytes (1.038
times smaller).
I don't think it is critical to support tail merging, just exact
duplicates, but given the code organization it was actually a bit easier
to support both.
llvm-svn: 261327
Summary:
LLVM3.3 (and earlier) would fail to include a relocation section in
the group that the section it was relocating is in. Object files
affected by this issue have been encountered in the wild when using LLD.
This would result in a siutation like:
Section {
Index: 5
Name: .text._Z3fooIiEvv (6)
Type: SHT_PROGBITS (0x1)
Flags [ (0x206)
SHF_ALLOC (0x2)
SHF_EXECINSTR (0x4)
SHF_GROUP (0x200)
]
Address: 0x0
Offset: 0x48
Size: 5
Link: 0
Info: 0
AddressAlignment: 1
EntrySize: 0
}
Section {
Index: 6
Name: .rela.text._Z3fooIiEvv (1)
Type: SHT_RELA (0x4)
Flags [ (0x0)
]
Address: 0x0
Offset: 0x3F0
Size: 24
Link: 8
Info: 5
AddressAlignment: 8
EntrySize: 24
}
In LLD, during symbol resolution, we discard the section containing the
weak symbol, but this amounts to replacing it with
InputSection<ELFT>::Discarded.
When we later saw the corresponding relocation section, we would then
end up pusing onto InputSection<ELFT>::Discarded.RelocSections, which is
bogus.
Reviewers: ruiu, rafael
Subscribers: llvm-commits, Bigcheese
Differential Revision: http://reviews.llvm.org/D16898
llvm-svn: 259831
If object files are drawn from archive files, the error message should
be something like "conflict symbols in foo.a(bar.o) and baz.o" instead
of "conflict symbols in bar.o and baz.o". This patch implements that.
llvm-svn: 259475
In many situations, we don't want to exit at the first error even in the
process model. For example, it is better to report all undefined symbols
rather than reporting the first one that the linker picked up randomly.
In order to handle such errors, we don't need to wrap everything with
ErrorOr (thanks for David Blaikie for pointing this out!) Instead, we
can set a flag to record the fact that we found an error and keep it
going until it reaches a reasonable checkpoint.
This idea should be applicable to other places. For example, we can
ignore broken relocations and check for errors after visiting all relocs.
In this patch, I rename error to fatal, and introduce another version of
error which doesn't call exit. That function instead sets HasError to true.
Once HasError becomes true, it stays true, so that we know that there
was an error if it is true.
I think introducing a non-noreturn error reporting function is by itself
a good idea, and it looks to me that this also provides a gradual path
towards lld-as-a-library (or at least embed-lld-to-your-program) without
sacrificing code readability with lots of ErrorOr's.
http://reviews.llvm.org/D16641
llvm-svn: 259069
MipsReginfoInputSection is basically just a container of Elf_Mips_Reginfo
struct. This patch makes that struct directly accessible from others.
llvm-svn: 256984
Unlike ObjectFile or ArchiveFile, SharedFile had two parse functions,
parseSoName() and parse(). parse must have been called after parseSoName,
but that requirement was not obvious from their names. (So it looked
like you could call parse() on a shared object file right away.)
This patch rename parseRest. It is now obvious that there's no single
parse function for the shared object file.
llvm-svn: 256898
Previously, we handle archive files with --whole-archive this way:
create instances of ArchiveFile, call getMembers to obtain memory
buffers of archive members, and create ObjectFiles for the members.
We didn't call anything except getMembers if --whole-archive was
specified.
I noticed that we didn't actually have to create ArchiveFile instaces
at all for that case. All we need is to get a list of memory buffers
for members, which can be done by a non-member function.
This patch removes getMembers member function from ArchiveFile.
Also removed unnecessary code for memory management.
llvm-svn: 256893
The R_MIPS_GPREL16 / R_MIPS_GPREL32 relocations use the following
expressions for calculations:
```
local symbol: S + A + GP0 - GP
global symbol: S + A - GP
GP - Represents the final gp value, i.e. _gp symbol
GP0 - Represents the gp value used to create the relocatable object
```
The GP0 value is taken from the .reginfo data section defined by an object
file. To implement that I keep a reference to `MipsReginfoInputSection`
in the `ObjectFile` class. This reference is used by the
`ObjectFile::getMipsGp0` method to return the GP0 value.
Differential Revision: http://reviews.llvm.org/D15760
llvm-svn: 256416
There are 3 symbol types that a .bc can provide during lto: defined,
undefined, common.
Defined and undefined symbols have already been refactored. I was
working on common and noticed that absolute symbols would become an
oddity: They would be the only symbol type present in a .o but not in
a.bc.
Looking a bit more, other than the special section number they were only
used for special rules for computing values. In that way they are
similar to TLS, and we don't have a DefinedTLS.
This patch deletes it. With it we have a reasonable rule of the thumb
for having a symbol kind: It exists if it has special resolution
semantics.
llvm-svn: 256383
I am working on adding LTO support to the new ELF lld.
In order to do that, it will be necessary to represent defined and
undefined symbols that are not from ELF files. One way to do it is to
change the symbol hierarchy to look like
Defined : SymbolBody
Undefined : SymbolBody
DefinedElf<ELFT> : Defined
UndefinedElf<ELFT> : Undefined
Another option would be to use bogus Elf_Sym, but I think that is
getting a bit too hackish.
This patch does the Undefined/UndefinedElf. Split. The next one
will do the Defined/DefinedElf split.
llvm-svn: 256289
MIPS .reginfo section provides information on the registers used by
the code in the object file. Linker should collect this information and
write .reginfo section in the output file. This section contains a union
of used registers masks taken from input .reginfo sections and final
value of the `_gp` symbol.
For details see the "Register Information" section in Chapter 4 in the
following document:
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf
The patch implements .reginfo sections handling with a couple missed
features: a) it does not put output .reginfo section into the separate
REGINFO segment; b) it does not merge `ri_cprmask` masks from input
section. These features will be implemented later.
Differential Revision: http://reviews.llvm.org/D15669
llvm-svn: 256119
With this patch, lld creates PT_GNU_STACK segments only when all input
files have .note.GNU-stack sections. This is in line with other linkers
with a minor difference (we don't care about .note.GNU-stack rwx bits as
you can always remove .note.GNU-stack sections instead of setting x bit.)
At least, NetBSD loader does not understand PT_GNU_STACK segments and
reject any executables that have the section. This patch makes lld
compatible with such operating systems.
llvm-svn: 253797
PT_GNU_STACK is a entry in the elf file format which contains the access rights (read, write, execute) of the stack,
it is always generated now. By default stack is not executable in this implementation.
-z execstack can be used to make executable.
Differential revision: http://reviews.llvm.org/D14571
llvm-svn: 253145
This adds support for:
* Uniquing CIEs
* Dropping FDEs that point to dropped sections
It drops 657 488 bytes from the .eh_frame of a Release+Asserts clang.
The link time impact is smallish. Linking clang with a Release+Asserts
lld goes from 0.488064805 seconds to 0.504763060 seconds (1.034 X slower).
llvm-svn: 252790
Section garbage collection is a feature to remove unused sections
from outputs. Unused sections are sections that cannot be reachable
from known GC-root symbols or sections. Naturally the feature is
implemented as a mark-sweep garbage collector.
In this patch, I added Live bit to InputSectionBase. If and only
if Live bit is on, the section will be written to the output.
Starting from GC-root symbols or sections, a new function, markLive(),
visits all reachable sections and sets their Live bits. Writer then
ignores sections whose Live bit is off, so that such sections are
excluded from the output.
This change has small negative impact on performance if you use
the feature because making sections means more work. The time to
link Clang changes from 0.356s to 0.386s, or +8%.
It reduces Clang size from 57,764,984 bytes to 55,296,600 bytes.
That is 4.3% reduction.
http://reviews.llvm.org/D13950
llvm-svn: 251043
BSD's DSO files have undefined symbol "__progname" which is defined
in crt1.o. On that system, both user programs and system shared
libraries depend on each other.
In general, we need to put symbols defined by user programs which are
referenced by shared libraries to user program's .dynsym.
http://reviews.llvm.org/D13637
llvm-svn: 250176
This patch adds AsNeeded and IsUsed bool fields to SharedFile. AsNeeded bit
is set if the DSO is enclosed with --as-needed and --no-as-needed. IsUsed
bit is off by default. When we adds a symbol to the symbol table for dynamic
linking, we set its SharedFile's IsUsed bit.
If AsNeeded is set but IsUsed is not set, we don't want to write that
file's SO name to DT_NEEDED field.
http://reviews.llvm.org/D13579
llvm-svn: 249998
Parse and apply emulation given with -m option.
Check input files to match ELF type and machine architecture provided with -m.
Differential Revision: http://reviews.llvm.org/D13055
llvm-svn: 249529
Summary:
If --whole-archive is used, all symbols from the following archives are added to the output. --no-whole-archive restores default behavior. These switches can be used multiple times.
NB. We have to keep an ArchiveFile instance within SymbolTable even if --whole-archive mode is active since it can be a thin archive which contains just names of external files. In that case actual memory buffers for the archive members will be stored within the File member of ArchiveFile class.
Reviewers: rafael, ruiu
Subscribers: grimar, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D13286
llvm-svn: 249045
If a shared library has a DT_SONAME entry, that is what should be included
in the DT_NEEDED of a program using it.
We don't implement -soname yet, so check in a .so for now.
llvm-svn: 249025
Opening a file and dispatching to readLinkerScript() or createFile()
is a common operation. We want to use that at least from Driver and
from LinkerScript. In COFF, we had the same problem. This patch
resolves the problem in the same way as we did for COFF.
Now, if you have a path that you want to open, just call
Driver->addFile(StringRef). That function opens the file and handles
that as if that were given by command line. This function is the
only place we call identify_magic().
llvm-svn: 249023
This linker script parser and evaluator is powerful enough to read
Linux's libc.so, which is (despite its name) a linker script that
contains OUTPUT_FORMAT, GROUP and AS_NEEDED directives.
The parser implemented in this patch is a recursive-descendent one.
It does *not* construct an AST but consumes directives in place and
sets the results to Symtab object, like what Driver is doing.
This should be very fast since less objects are allocated, and
this is also more readable.
http://reviews.llvm.org/D13232
llvm-svn: 248918
This is more consistent with OutputSection. This is also part of removing
the "Chunk" term from the ELF linker, since we just have input/output sections
and program headers.
llvm-svn: 248183
Symbol table is now populated correctly, but some fields are missing,
they'll be added in the future. This patch also adds --discard-all
flag, which was the default behavior until now.
Differential Revision: http://reviews.llvm.org/D12874
llvm-svn: 247849
There were at least two issues with having them together:
* For compatibility checks, we only want to look at the ELF kind.
* Adding support for shared libraries should introduce one InputFile kind,
not 4.
llvm-svn: 246707