In LLDB, when parsing type units, we don't need to parse the whole line
table. Instead, we only need to parse the "support files" from the line
table prologue.
To make that possible, this patch moves the respective functions from
the LineTable into the Prologue. Because I don't think users of the
LineTable should have to know that these files come from the Prologue,
I've left the original methods in place, and made them redirect to the
LineTable.
Differential revision: https://reviews.llvm.org/D64774
llvm-svn: 366164
This removes a call to `object::getSymbol<ELFT>`.
We used this function in a next way: it was given an
array of symbols and index and returned either a symbol
at the index given or a error.
This function was removed in D64631.
(rL366052, but was reverted because of LLD compilation error
that I didn't know about).
It does not make much sense to keep this function on LLVM side
only for LLD, because having only a list of symbols and the index it
is not able to produce a valueable error message about context anyways.
llvm-svn: 366057
This patch does the same thing as r365595 to other subdirectories,
which completes the naming style change for the entire lld directory.
With this, the naming style conversion is complete for lld.
Differential Revision: https://reviews.llvm.org/D64473
llvm-svn: 365730
This patch is mechanically generated by clang-llvm-rename tool that I wrote
using Clang Refactoring Engine just for creating this patch. You can see the
source code of the tool at https://reviews.llvm.org/D64123. There's no manual
post-processing; you can generate the same patch by re-running the tool against
lld's code base.
Here is the main discussion thread to change the LLVM coding style:
https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
In the discussion thread, I proposed we use lld as a testbed for variable
naming scheme change, and this patch does that.
I chose to rename variables so that they are in camelCase, just because that
is a minimal change to make variables to start with a lowercase letter.
Note to downstream patch maintainers: if you are maintaining a downstream lld
repo, just rebasing ahead of this commit would cause massive merge conflicts
because this patch essentially changes every line in the lld subdirectory. But
there's a remedy.
clang-llvm-rename tool is a batch tool, so you can rename variables in your
downstream repo with the tool. Given that, here is how to rebase your repo to
a commit after the mass renaming:
1. rebase to the commit just before the mass variable renaming,
2. apply the tool to your downstream repo to mass-rename variables locally, and
3. rebase again to the head.
Most changes made by the tool should be identical for a downstream repo and
for the head, so at the step 3, almost all changes should be merged and
disappear. I'd expect that there would be some lines that you need to merge by
hand, but that shouldn't be too many.
Differential Revision: https://reviews.llvm.org/D64121
llvm-svn: 365595
Some variables in lld have the same name as functions ignoring case.
This patch gives them different names, so that my next patch is easier
to read.
llvm-svn: 365003
Add Triple::riscv64 and Triple::riscv32 to getBitcodeMachineKind for get right
e_machine during LTO.
Reviewed By: ruiu, MaskRay
Differential Revision: https://reviews.llvm.org/D52165
llvm-svn: 364996
This restores r361830 "[ELF] Error on relocations to STT_SECTION symbols if the sections were discarded"
and dependent commits (r362218, r362497) which were reverted by r364321, with a fix of a --gdb-index issue.
.rela.debug_ranges contains relocations of range list entries:
// start address of a range list entry
// old: 0; after r361830: 0
00000000000033a0 R_X86_64_64 .text._ZN2v88internal7Isolate7factoryEv + 0
// end address of a range list entry
// old: 0xe; after r361830: 0
00000000000033a8 R_X86_64_64 .text._ZN2v88internal7Isolate7factoryEv + e
If both start and end addresses of a range list entry resolve to 0,
DWARFDebugRangeList::isEndOfListEntry() will return true, then the
.debug_range decoding loop will terminate prematurely:
while (true) {
decode StartAddress
decode EndAddress
if (Entry.isEndOfListEntry()) // prematurely
break;
Entries.push_back(Entry);
}
In lld/ELF/SyntheticSections.cpp, readAddressAreas() will read
incomplete address ranges and the resulting .gdb_index will be
incomplete. For files that gdb hasn't loaded their debug info, gdb uses
.gdb_index to map addresses to CUs. The absent entries make gdb fail to
symbolize some addresses.
To address this issue, we simply allow relocations to undefined symbols
in DWARF.cpp:findAux() and let RelocationResolver resolve them.
This patch should fix:
[1] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190603/659848.html
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=978067
llvm-svn: 364391
(In effect, reverting "[ELF] Error on relocations to STT_SECTION symbols if the sections were discarded".)
It caused debug info problems in LibreOffice [1] and Chromium/V8 [2].
Reverting until those can be fixed.
It also reverts r362497 "STT_SECTION symbol should be defined" on .eh_frame, .debug*, .zdebug* and .gcc_except_table"
which was landed as a follow-up to the above.
> With -r or --emit-relocs, we warn `STT_SECTION symbol should be defined`
> on relocations to discarded section symbol. This was added as an error
> in rLLD319404, but was not so effective before D61583 (it turned the
> error to a warning).
>
> Relocations from .eh_frame .debug* .zdebug* .gcc_except_table to
> discarded .text are very common and somewhat expected. Don't warn/error
> on them. As a reference, ld.bfd has a similar logic in
> _bfd_elf_default_action_discarded() to allow these cases.
>
> Delete invalid-undef-section-symbol.test because what it intended to
> check is now covered by the updated comdat-discarded-reloc.s
>
> Delete relocatable-eh-frame.s because we allow relocations from
> .eh_frame as a special case now.
And finally it reverts r362218 "[ELF] Replace a dead test in getSymVA() with assert()"
as that also depended on the main change reverted here.
> Symbols relative to discarded comdat sections are Undefined instead of
> Defined now (after D59649 and D61583). The `== &InputSection::Discarded`
> test becomes dead. I cannot find a test related to this behavior.
[1] http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190603/659848.html
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=978067
llvm-svn: 364321
Branch Target Identification (BTI) and Pointer Authentication (PAC) are
architecture features introduced in v8.5a and 8.3a respectively. The new
instructions have been added in the hint space so that binaries take
advantage of support where it exists yet still run on older hardware. The
impact of each feature is:
BTI: For executable pages that have been guarded, all indirect branches
must have a destination that is a BTI instruction of the appropriate type.
For the static linker, this means that PLT entries must have a "BTI c" as
the first instruction in the sequence. BTI is an all or nothing
property for a link unit, any indirect branch not landing on a valid
destination will cause a Branch Target Exception.
PAC: The dynamic loader encodes with PACIA the address of the destination
that the PLT entry will load from the .plt.got, placing the result in a
subset of the top-bits that are not valid virtual addresses. The PLT entry
may authenticate these top-bits using the AUTIA instruction before
branching to the destination. Use of PAC in PLT sequences is a contract
between the dynamic loader and the static linker, it is independent of
whether the relocatable objects use PAC.
BTI and PAC are independent features that can be combined. So we can have
several combinations of PLT:
- Standard with no BTI or PAC
- BTI PLT with "BTI c" as first instruction.
- PAC PLT with "AUTIA1716" before the indirect branch to X17.
- BTIPAC PLT with "BTI c" as first instruction and "AUTIA1716" before the
first indirect branch to X17.
The use of BTI and PAC in relocatable object files are encoded by feature
bits in the .note.gnu.property section in a similar way to Intel CET. There
is one AArch64 specific program property GNU_PROPERTY_AARCH64_FEATURE_1_AND
and two target feature bits defined:
- GNU_PROPERTY_AARCH64_FEATURE_1_BTI
-- All executable sections are compatible with BTI.
- GNU_PROPERTY_AARCH64_FEATURE_1_PAC
-- All executable sections have return address signing enabled.
Due to the properties of FEATURE_1_AND the static linker can tell when all
input relocatable objects have the BTI and PAC feature bits set. The static
linker uses this to enable the appropriate PLT sequence.
Neither -> standard PLT
GNU_PROPERTY_AARCH64_FEATURE_1_BTI -> BTI PLT
GNU_PROPERTY_AARCH64_FEATURE_1_PAC -> PAC PLT
Both properties -> BTIPAC PLT
In addition to the .note.gnu.properties there are two new command line
options:
--force-bti : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_BTI and warn for every relocatable object
that does not.
--pac-plt : Act as if all relocatable inputs had
GNU_PROPERTY_AARCH64_FEATURE_1_PAC. As PAC is a contract between the loader
and static linker no warning is given if it is not present in an input.
Two processor specific dynamic tags are used to communicate that a non
standard PLT sequence is being used.
DTI_AARCH64_BTI_PLT and DTI_AARCH64_BTI_PAC.
Differential Revision: https://reviews.llvm.org/D62609
llvm-svn: 362793
Any symbols defined in the LTO object are by definition the ones we
want in the final output so we skip the comdat group checking in those
cases.
This change makes the ELF code more explicit about this and means
that wasm and ELF do this in the same way.
Differential Revision: https://reviews.llvm.org/D62884
llvm-svn: 362625
Although many relocatable objects will have a single
GNU_PROPERTY_X86_FEATURE_1_AND in the .note.gnu.property section it is
permissible to have more than one, and there are tests in ld.bfd that use
it. The behavior that ld.bfd follows is to set the feature bit for a
relocatable object if any of the GNU_PROPERTY_X86_FEATURE_1_AND
have the feature bit set.
Differential Revision: https://reviews.llvm.org/D62862
llvm-svn: 362591
This is implemented by creating Undefined (instead of Defined) for such
local STT_SECTION symbols. It allows us to catch errors when there are
relocations to such discarded sections (e.g. in PR41693, ld.bfd and gold
error but we don't). Updated comdat-discarded-error.s checks we emit
friendly error message.
For relocatable-eh-frame.s, ld.lld -r a.o a.o will now error
"STT_SECTION symbol should be defined" because the section .eh_frame
refers to is now an Undefined instead of a Defined.
So I have to change `error()` to `warn()` to retain the output.
rLLD361144 inadvertently enabled the error for --gdb-index
(in LLDDwarfObj<ELFT>::findAux()).
Relocations from .debug_info (not in comdat) to .text.* (in comdat) for
DW_AT_low_pc are common. If an .text.* was discarded, rLLD361144 would error,
which was unexpected. (Note, if we don't error as this patch does,
InputSection::relocateNonAlloc() will resolve such relocations).
llvm-svn: 361830
This is implemented by creating Undefined (instead of Defined) for such
local STT_SECTION symbols. It allows us to catch errors when there are
relocations to such discarded sections (e.g. in PR41693, ld.bfd and gold
error but we don't). Updated comdat-discarded-error.s checks we emit
friendly error message.
For relocatable-eh-frame.s, ld.lld -r a.o a.o will now error
"STT_SECTION symbol should be defined" because the section .eh_frame
refers to is now an Undefined instead of a Defined.
So I have to change `error()` to `warn()` to retain the output.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D61583
llvm-svn: 361792
This patch simplifies ELFFile instance initialization by merging
two similar functions into a single function and call it from the
ctor.
llvm-svn: 361789
My recent commits separated symbol resolution from the symbol table,
so the functions to resolve symbols are now in a somewhat wrong file.
This patch moves it to Symbols.cpp.
The functions are now member functions of the symbol.
This is code move change. I modified function names so that they are
appropriate as member functions, though. No functionality change
intended.
Differential Revision: https://reviews.llvm.org/D62290
llvm-svn: 361474
--{start,end}-lib give files grouped by the options the archive file
semantics. That is, each object file between them acts as if it were
in an archive file whose sole member is the file.
Therefore, files between --{start,end}-lib are linked to the final
output only if they are needed to resolve some undefined symbols.
Previously, the feature was implemented this way:
1. We read a symbol table and insert defined symbols to the symbol
table as lazy symbols.
2. If an undefind symbol is resolved to a lazy symbol, that lazy
symbol instantiate ObjFile class for that symbol, which re-insert
all defined symbols to the symbol table.
So, if an ObjFile is instantiated, defined symbols are inserted to the
symbol table twice. Since inserting long symbol names is not cheap,
there's a room to optimize here.
This patch optimzies it. Now, LazyObjFile remembers symbol handles and
passed them over to a new ObjFile instance, so that the ObjFile
doesn't insert the same strings.
Here is a quick benchmark to link clang. "Original" is the original
lld with unmodified command line options. For "Case 1" and "Case 2", I
extracted all files from archive files and replace .a's in a command
line with .o's wrapped with --{start,end}-lib. I used the original lld
for Case 1" and use this patch for Case 2.
Original: 5.892
Case 1: 6.001 (+1.8%)
Case 2: 5.701 (-3.2%)
So, interestingly, --{start,end}-lib are now faster than the regular
linking scheme with archive files. That's perhaps not too surprising,
though, because for regular archive files, we look up the symbol table
with the same string twice.
Differential Revision: https://reviews.llvm.org/D62188
llvm-svn: 361473
Rather than report "undefined symbol: ", give more informative message
about the object file that defines the discarded section.
In particular, PR41133, if the section is a discarded COMDAT, print the
section group signature and the object file with the prevailing
definition. This is useful to track down some ODR issues.
We need to
* add `uint32_t DiscardedSecIdx` to Undefined for this feature.
* make ComdatGroups public and change its type to DenseMap<CachedHashStringRef, const InputFile *>
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D59649
llvm-svn: 361359
This patch implements a limited form of autolinking primarily designed to allow
either the --dependent-library compiler option, or "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in
C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically
add the specified library to the link when processing the input file generated
by the compiler.
Currently this extension is unique to LLVM and LLD. However, care has been taken
to design this feature so that it could be supported by other ELF linkers.
The design goals were to provide:
- A simple linking model for developers to reason about.
- The ability to to override autolinking from the linker command line.
- Source code compatibility, where possible, with "comment lib" pragmas in other
environments (MSVC in particular).
Dependent library support is implemented differently for ELF platforms than on
the other platforms. Primarily this difference is that on ELF we pass the
dependent library specifiers directly to the linker without manipulating them.
This is in contrast to other platforms where they are mapped to a specific
linker option by the compiler. This difference is a result of the greater
variety of ELF linkers and the fact that ELF linkers tend to handle libraries in
a more complicated fashion than on other platforms. This forces us to defer
handling the specifiers to the linker.
In order to achieve a level of source code compatibility with other platforms
we have restricted this feature to work with libraries that meet the following
"reasonable" requirements:
1. There are no competing defined symbols in a given set of libraries, or
if they exist, the program owner doesn't care which is linked to their
program.
2. There may be circular dependencies between libraries.
The binary representation is a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES
(0x6fff4c04). The compiler forms this section by concatenating the arguments of
the "comment lib" pragmas and --dependent-library options in the order they are
encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs
sections with the normal mergeable string section rules. As an example, #pragma
comment(lib, "foo") would result in:
.section ".deplibs","MS",@llvm_dependent_libraries,1
.asciz "foo"
For LTO, equivalent information to the contents of a the .deplibs section can be
retrieved by the LLD for bitcode input files.
LLD processes the dependent library specifiers in the following way:
1. Dependent libraries which are found from the specifiers in .deplibs sections
of relocatable object files are added when the linker decides to include that
file (which could itself be in a library) in the link. Dependent libraries
behave as if they were appended to the command line after all other options. As
a consequence the set of dependent libraries are searched last to resolve
symbols.
2. It is an error if a file cannot be found for a given specifier.
3. Any command line options in effect at the end of the command line parsing apply
to the dependent libraries, e.g. --whole-archive.
4. The linker tries to add a library or relocatable object file from each of the
strings in a .deplibs section by; first, handling the string as if it was
specified on the command line; second, by looking for the string in each of the
library search paths in turn; third, by looking for a lib<string>.a or
lib<string>.so (depending on the current mode of the linker) in each of the
library search paths.
5. A new command line option --no-dependent-libraries tells LLD to ignore the
dependent libraries.
Rationale for the above points:
1. Adding the dependent libraries last makes the process simple to understand
from a developers perspective. All linkers are able to implement this scheme.
2. Error-ing for libraries that are not found seems like better behavior than
failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options which
will affect all of the dependent libraries. There is a potential problem of
surprise for developers, who might not realize that these options would apply
to these "invisible" input files; however, despite the potential for surprise,
this is easy for developers to reason about and gives developers the control
that they may require.
4. This algorithm takes into account all of the different ways that ELF linkers
find input files. The different search methods are tried by the linker in most
obvious to least obvious order.
5. I considered adding finer grained control over which dependent libraries were
ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this
is not necessary: if finer control is required developers can fall back to using
the command line directly.
RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html.
Differential Revision: https://reviews.llvm.org/D60274
llvm-svn: 360984
This is the last patch of the series of patches to make it possible to
resolve symbols without asking SymbolTable to do so.
The main point of this patch is the introduction of
`elf::resolveSymbol(Symbol *Old, Symbol *New)`. That function resolves
or merges given symbols by examining symbol types and call
replaceSymbol (which memcpy's New to Old) if necessary.
With the new function, we have now separated symbol resolution from
symbol lookup. If you already have a Symbol pointer, you can directly
resolve the symbol without asking SymbolTable to do that.
Now that the nice abstraction become available, I can start working on
performance improvement of the linker. As a starter, I'm thinking of
making --{start,end}-lib faster.
--{start,end}-lib is currently unnecessarily slow because it looks up
the symbol table twice for each symbol.
- The first hash table lookup/insertion occurs when we instantiate a
LazyObject file to insert LazyObject symbols.
- The second hash table lookup/insertion occurs when we create an
ObjFile from LazyObject file. That overwrites LazyObject symbols
with Defined symbols.
I think it is not too hard to see how we can now eliminate the second
hash table lookup. We can keep LazyObject symbols in Step 1, and then
call elf::resolveSymbol() to do Step 2.
Differential Revision: https://reviews.llvm.org/D61898
llvm-svn: 360975
Module IDs can appear in diagnostic messages.
This patch adds some auxiliary symbols to improve their readability.
Differential Revision: https://reviews.llvm.org/D61857
llvm-svn: 360858
Previously, we handled common symbols as a kind of Defined symbol,
but what we were doing for common symbols is pretty different from
regular defined symbols.
Common symbol and defined symbol are probably as different as shared
symbol and defined symbols are different.
This patch introduces CommonSymbol to represent common symbols.
After symbols are resolved, they are converted to Defined symbols
residing in a .bss section.
Differential Revision: https://reviews.llvm.org/D61895
llvm-svn: 360841
SymbolTable's add-family functions have lots of parameters because
when they have to create a new symbol, they forward given arguments
to Symbol's constructors. Therefore, the functions take at least as
many arguments as their corresponding constructors.
This patch simplifies the add-family functions. Now, the functions
take a symbol instead of arguments to construct a symbol. If there's
no existing symbol, a given symbol is memcpy'ed to the symbol table.
Otherwise, the functions attempt to merge the existing and a given
new symbol.
I also eliminated `CanOmitFromDynSym` parameter, so that the functions
take really one argument.
Symbol classes are trivially constructible, so looks like constructing
them to pass to add-family functions is as cheap as passing a lot of
arguments to the functions. A quick benchmark showed that this patch
seems performance-neutral.
This is a preparation for
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html
Differential Revision: https://reviews.llvm.org/D61855
llvm-svn: 360838
The symbol table used to be a container of vectors of input files,
but that's no longer the case because the vectors are moved out of
SymbolTable and are now global variables.
Therefore, addFile doesn't have to belong to any class. This patch
moves the function out of the class.
This patch is a preparation for my RFC [1].
[1] http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html
Differential Revision: https://reviews.llvm.org/D61854
llvm-svn: 360666
For partitions I intend to use the same set of version indexes in
each partition for simplicity. Since each partition will need its own
VersionNeedSection this will require moving the verneed tracking out of
VersionNeedSection. The way I've done this is to move most of the tracking
into SharedFile. What will eventually become the per-partition tracking
still lives in VersionNeedSection.
As a bonus the code gets a little simpler and more consistent with how we
handle verdef.
Differential Revision: https://reviews.llvm.org/D60307
llvm-svn: 357926
That patch is the fix for https://bugs.llvm.org/show_bug.cgi?id=40703
"wrong line number info for obj file compiled with -ffunction-sections"
bug. The problem happened with only .o files. If object file contains
several .text sections then line number information showed incorrectly.
The reason for this is that DwarfLineTable could not detect section which
corresponds to specified address(because address is the local to the
section). And as the result it could not select proper sequence in the
line table. The fix is to pass SectionIndex with the address. So that it
would be possible to differentiate addresses from various sections. With
this fix llvm-objdump shows correct line numbers for disassembled code.
Differential review: https://reviews.llvm.org/D58194
llvm-svn: 354972
Summary:
In ld.bfd/gold, --no-allow-shlib-undefined is the default when linking
an executable. This patch implements a check to error on undefined
symbols in a shared object, if all of its DT_NEEDED entries are seen.
Our approach resembles the one used in gold, achieves a good balance to
be useful but not too smart (ld.bfd traces all DSOs and emulates the
behavior of a dynamic linker to catch more cases).
The error is issued based on the symbol table, different from undefined
reference errors issued for relocations. It is most effective when there
are DSOs that were not linked with -z defs (e.g. when static sanitizers
runtime is used).
gold has a comment that some system libraries on GNU/Linux may have
spurious undefined references and thus system libraries should be
excluded (https://sourceware.org/bugzilla/show_bug.cgi?id=6811). The
story may have changed now but we make --allow-shlib-undefined the
default for now. Its interaction with -shared can be discussed in the
future.
Reviewers: ruiu, grimar, pcc, espindola
Reviewed By: ruiu
Subscribers: joerg, emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D57385
llvm-svn: 352826
Summary:
lld discards .gnu.linonce.* sections work around a bug in glibc.
https://sourceware.org/bugzilla/show_bug.cgi?id=20543
Unfortunately, the Linux kernel uses a section named
.gnu.linkonce.this_module to store infomation about kernel modules. The
kernel reads data from this section when loading kernel modules, and
errors if it fails to find this section. The current behavior of lld
discards this section when kernel modules are linked, so kernel modules
linked with lld are unloadable by the linux kernel.
The Linux kernel should use a comdat section instead of .gnu.linkonce.
The minimum version of binutils supported by the kernel supports comdat
sections. The kernel is also not relying on the old linkonce behavior;
it seems to have chosen a name that contains a deprecated GNU feature.
Changing the section name now in the kernel would require all kernel
modules to be recompiled to make use of the new section name. Instead,
rather than discarding .gnu.linkonce.*, let's discard the more specific
section name to continue working around the glibc issue while supporting
linking Linux kernel modules.
Link: https://github.com/ClangBuiltLinux/linux/issues/329
Reviewers: pcc, espindola
Reviewed By: pcc
Subscribers: nathanchance, emaste, arichardson, void, srhines
Differential Revision: https://reviews.llvm.org/D57294
llvm-svn: 352302
This does *not* implement full SHT_GROUP semantic, yet it is a simple step forward:
Sections within a group are still considered valid, but they do not behave as
specified by the standard in case of garbage collection.
Differential Revision: https://reviews.llvm.org/D56437
llvm-svn: 352068
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Patch by Michael Skvortsov!
This change adds a basic support for linking static MSP430 ELF code.
Implemented relocation types are intended to correspond to the BFD.
Differential revision: https://reviews.llvm.org/D56535
llvm-svn: 350819
This is https://bugs.llvm.org/show_bug.cgi?id=39493.
We crashed previously because did not handle /DISCARD/ properly
when -r was used. I think it is uncommon to use scripts with -r, though I see
nothing wrong to handle the /DISCARD/ so that we will not crash at least.
Differential revision: https://reviews.llvm.org/D53864
llvm-svn: 345819
Previously, we uncompress all compressed sections before doing anything.
That works, and that is conceptually simple, but that could results in
a waste of CPU time and memory if uncompressed sections are then
discarded or just copied to the output buffer.
In particular, if .debug_gnu_pub{names,types} are compressed and if no
-gdb-index option is given, we wasted CPU and memory because we
uncompress them into newly allocated bufers and then memcpy the buffers
to the output buffer. That temporary buffer was redundant.
This patch changes how to uncompress sections. Now, compressed sections
are uncompressed lazily. To do that, `Data` member of `InputSectionBase`
is now hidden from outside, and `data()` accessor automatically expands
an compressed buffer if necessary.
If no one calls `data()`, then `writeTo()` directly uncompresses
compressed data into the output buffer. That eliminates the redundant
memory allocation and redundant memcpy.
This patch significantly reduces memory consumption (20 GiB max RSS to
15 Gib) for an executable whose .debug_gnu_pub{names,types} are in total
5 GiB in an uncompressed form.
Differential Revision: https://reviews.llvm.org/D52917
llvm-svn: 343979
r320770 made LLD handle invalid DSOs where local symbols were found in
the global part of the symbol table. Unfortunately, it didn't handle the
case where those local symbols were also undefined, and r326242 exposed
an assertion failure in that case. Just warn on that case instead of
crashing, by moving the local binding check before the undefined symbol
addition.
The input file for the test is crafted by hand, since I don't know of
any tool that would produce such a broken DSO. I also don't understand
what it even means for a symbol to be undefined but have STB_LOCAL
binding - I don't think that combination makes any sense - but we have
found broken DSOs of this nature that we were linking against. I've
included detailed instructions on how to produce the DSO in the test.
Differential Revision: https://reviews.llvm.org/D52815
llvm-svn: 343745
This uses the call graph profile embedded in the object files to construct the call graph.
This is read from a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples as (from symbol index, to symbol index, weight).
Differential Revision: https://reviews.llvm.org/D45850
llvm-svn: 343552
Previously, if you invoke lld's `main` more than once in the same process,
the second invocation could fail or produce a wrong result due to a stale
pointer values of the previous run.
Differential Revision: https://reviews.llvm.org/D52506
llvm-svn: 343009
Summary: This protects lld from a null pointer dereference when a faulty input file has such invalid sh_link fields.
Reviewers: ruiu, espindola
Reviewed By: ruiu
Subscribers: emaste, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D51743
llvm-svn: 341611
Summary:
For -thinlto-object-suffix-replace=old\;new, in
tools/gold/gold-plugin.cpp, the thinlto object filename is Path minus
optional old suffix.
static std::string getThinLTOObjectFileName(StringRef Path, StringRef OldSuffix,
StringRef NewSuffix) {
if (OldSuffix.empty() && NewSuffix.empty())
return Path;
StringRef NewPath = Path;
NewPath.consume_back(OldSuffix);
std::string NewNewPath = NewPath;
NewNewPath += NewSuffix;
return NewNewPath;
}
Currently lld will error that the path does not end with old suffix.
This patch makes lld accept such paths but only add new suffix if Path
ends with old suffix. This fixes a link error where bitcode members in
an archive are regular LTO objects without old suffix.
Acording to tejohnson, this will "enable supporting mix and match of
minimized ThinLTO bitcode files with normal ThinLTO bitcode files in a
single link (where we want to apply the suffix replacement to the
minimized files, and just ignore it for the normal ThinLTO files)."
Reviewers: ruiu, pcc, tejohnson, espindola
Reviewed By: tejohnson
Subscribers: emaste, inglorion, arichardson, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51055
llvm-svn: 340364
Our code in LazyObjFile::parse() has an ELFT switch and
adds a lazy object by its ELFT kind.
Though it might be possible to add a file using a different
architecture and make LLD to silently accept it (if the file
is empty or contains only week symbols). That itself, not a
huge issue perhaps (because the error would be reported later
if the file is fetched), but still does not look clean and correct.
It is possible to report an error earlier and clean up the
code. That is what the patch does.
Ideally, we might want to reuse isCompatible from SymbolTable.cpp,
but it is static and accepts a file as an argument, what is not
convenient. Since such a situation should be rare, I think it
should be OK to go with the way chosen in this patch.
Differential revision: https://reviews.llvm.org/D50899
llvm-svn: 340257
This patch solves 2 problems:
1) It adds a test to check the line below:
https://github.com/llvm-mirror/lld/blob/master/ELF/InputFiles.cpp#L334
Test case contains SHT_GROUP section with a broken (0xFF) flag.
2) The patch fixes the case when we silently accepted such broken groups
in the case when there were no other objects with the same group signature.
llvm-svn: 339765
The Tag_ABI_VFP_args build attribute controls the procedure call standard
used for floating point parameters on ARM. The values are:
0 - Base AAPCS (FP Parameters passed in Core (Integer) registers
1 - VFP AAPCS (FP Parameters passed in FP registers)
2 - Toolchain specific (Neither Base or VFP)
3 - Compatible with all (No use of floating point parameters)
If the Tag_ABI_VFP_args build attribute is missing it has an implicit value
of 0.
We use the attribute in two ways:
- Detect a clash in calling convention between Base, VFP and Toolchain.
we follow ld.bfd's lead and do not error if there is a clash between an
implicit Base AAPCS caused by a missing attribute. Many projects
including the hard-float (VFP AAPCS) version of glibc contain assembler
files that do not use floating point but do not have Tag_ABI_VFP_args.
- Set the EF_ARM_ABI_FLOAT_SOFT or EF_ARM_ABI_FLOAT_HARD ELF header flag
for Base or VFP AAPCS respectively. This flag is used by some ELF
loaders.
References:
- Addenda to, and Errata in, the ABI for the ARM Architecture for
Tag_ABI_VFP_args
- Elf for the ARM Architecture for ELF header flags
Fixes PR36009
Differential Revision: https://reviews.llvm.org/D49993
llvm-svn: 338377
The original computation for shared object symbol alignment is wrong when
st_value equals 0. It is very unusual for dso symbols to have st_value equal 0.
But when it happens, it causes obscure run time bugs.
Differential Revision: https://reviews.llvm.org/D47602
llvm-svn: 334135
Currently, when LLD do a lookup for variables location, it uses DW_AT_name attribute.
That is not always enough.
Imagine code:
namespace A {
int bar = 0;
}
namespace Z {
int bar = 1;
}
int hoho;
In this case there are 3 variables and their debug attributes are following:
A::bar has: DW_AT_name [DW_FORM_string] ("bar") DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x00000006] = "_ZN1A3barE")
Z::bar has: DW_AT_name [DW_FORM_string] ("bar") DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000003f] = "_ZN1Z3barE")
hoho has: DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000004a] = "hoho") and has NO DW_AT_linkage_name attribute. Because it would be
the same as DW_AT_name and DWARF producers avoids emiting excessive data.
Hence LLD should also use DW_AT_linkage_name when it is available.
(currently, LLD fails to report location correctly because thinks that A::bar and Z::bar are the same things)
Differential revision: https://reviews.llvm.org/D47373
llvm-svn: 333880
- Move some common code into Common/rrorHandler.cpp and
Common/Strings.h.
- Don't use `fatal` when incompatible bitcode files are
encountered.
- Rename NameRef variable to just Name
See D47162
Differential Revision: https://reviews.llvm.org/D47206
llvm-svn: 333021
Reviewed by: ruiu, grimar, espindola
Differential Revision: https://reviews.llvm.org/D44562
Summary:
r331971 changes the debug line parser interface to report LLVM errors in an
interface that different executables can use, rather than always being printed
directly as warnings to stderr. This change allows LLD to make use of the new
interface and call its own warning methods to report problems.
llvm-svn: 331972
I'm proposing a new command line flag, --warn-backrefs in this patch.
The flag and the feature proposed below don't exist in GNU linkers
nor the current lld.
--warn-backrefs is an option to detect reverse or cyclic dependencies
between static archives, and it can be used to keep your program
compatible with GNU linkers after you switch to lld. I'll explain the
feature and why you may find it useful below.
lld's symbol resolution semantics is more relaxed than traditional
Unix linkers. Therefore,
ld.lld foo.a bar.o
succeeds even if bar.o contains an undefined symbol that have to be
resolved by some object file in foo.a. Traditional Unix linkers
don't allow this kind of backward reference, as they visit each
file only once from left to right in the command line while
resolving all undefined symbol at the moment of visiting.
In the above case, since there's no undefined symbol when a linker
visits foo.a, no files are pulled out from foo.a, and because the
linker forgets about foo.a after visiting, it can't resolve
undefined symbols that could have been resolved otherwise.
That lld accepts more relaxed form means (besides it makes more
sense) that you can accidentally write a command line or a build
file that works only with lld, even if you have a plan to
distribute it to wider users who may be using GNU linkers. With
--check-library-dependency, you can detect a library order that
doesn't work with other Unix linkers.
The option is also useful to detect cyclic dependencies between
static archives. Again, lld accepts
ld.lld foo.a bar.a
even if foo.a and bar.a depend on each other. With --warn-backrefs
it is handled as an error.
Here is how the option works. We assign a group ID to each file. A
file with a smaller group ID can pull out object files from an
archive file with an equal or greater group ID. Otherwise, it is a
reverse dependency and an error.
A file outside --{start,end}-group gets a fresh ID when
instantiated. All files within the same --{start,end}-group get the
same group ID. E.g.
ld.lld A B --start-group C D --end-group E
A and B form group 0, C, D and their member object files form group
1, and E forms group 2. I think that you can see how this group
assignment rule simulates the traditional linker's semantics.
Differential Revision: https://reviews.llvm.org/D45195
llvm-svn: 329636
Some system libraries have a lot of versioned symbols. When linking
scylla this brings the number of malloc calls from 49154 to 37944.
llvm-svn: 329453
They are to pull out an object file for a symbol, but for a historical
reason the code is written in two separate functions. This patch
merges them.
llvm-svn: 329039
I tried a few different designs to find a way to implement it without
too much hassle and settled down with this. Unlike before, object files
given as arguments for --just-symbols are handled as object files, with
an exception that their section tables are handled as if they were all
null.
Differential Revision: https://reviews.llvm.org/D42025
llvm-svn: 328852
NonLocal is technically more accurate, but we already use the term
"Global" to specify the non-local part of the symbol table, and
Local <-> Global is easier to digest.
llvm-svn: 328740
This fixes pr36623.
The problem is that we have to parse versions out of names before LTO
so that LTO can use that information.
When we get the LTO produced .o files, we replace the previous symbols
with the LTO produced ones, but they still have @ in their names.
We could just trim the name directly, but calling parseSymbolVersion
to do it is simpler.
llvm-svn: 328738
Some tools (dwarfdump for example) get confused by the current -O0 -r
output since it has multiple copies of .debug_str.
We cannot just merge sections with the same name as they can have
different sh_entsize.
We could have duplicated logic for merging sections based on name and
sh_entsize, but it seems better to just use the existing logic by
enabling optimizations.
llvm-svn: 328640
SharedFile::parseRest function grew organically and got a bit hard to
understand. This patch refactor it. This patch also adds comments.
Differential Revision: https://reviews.llvm.org/D44860
llvm-svn: 328579
Previously, we used 0 as an alias for VER_NDX_GLOBAL and had a dummy
entry in SharedFile::Verdefs so that the access to the array is within
its boundary. But that's not straightforwad. We can just stop doing both.
llvm-svn: 328401
Our code assumes all input sections in an output SHF_LINK_ORDER
section has SHF_LINK_ORDER flag. We do not check that and that can cause a crash.
That happens because we call
std::stable_sort(Sections.begin(), Sections.end(), compareByFilePosition);,
where compareByFilePosition predicate does not expect to see
null when calls getLinkOrderDep.
The same might happen when sections refer to non-regular sections.
Test cases demonstrate the issues, patch fixes them.
Differential revision: https://reviews.llvm.org/D44193
llvm-svn: 327006
We don't need to handle an object file having more than one symbol table,
so as soon as we find the first one, we can process it and then return
from the function.
llvm-svn: 326977
addElfSymbols and readJustSymbolsFile still has duplicate code, but
I didn't come up with a good idea to eliminate them. Since this patch
is an improvement, I'm sending this for review.
Differential Revision: https://reviews.llvm.org/D44187
llvm-svn: 326972
LLD uses the debug info and debug line sections to determine the location of
e.g. references to undefined symbols, when producing error messages. In the
event that debug info was present, but debug line parsing failed for some
reason, then a nullptr would end up being dereferenced by the location-lookup
code.
Differential Revision: https://reviews.llvm.org/D44205
Reviewers: grimar
llvm-svn: 326899
It should be possible to resolve undefined symbols in dynamic libraries
using symbols defined in a linker script.
Differential Revision: https://reviews.llvm.org/D43011
llvm-svn: 326176
There seems to be no reason to collect this list of symbols.
Also fix a bug where --exclude-libs would apply to all symbols that
appear in an archive's symbol table, even if the relevant archive
member was not added to the link.
Differential Revision: https://reviews.llvm.org/D43369
llvm-svn: 325380
MIPS BFD linker puts _gp_disp symbol into DSO files and assigns zero
version definition index to it. This value means 'unversioned local
symbol' while _gp_disp is a section global symbol. We have to handle
this bug in the LLD because BFD linker is used for building MIPS
toolchain libraries.
Differential revision: https://reviews.llvm.org/D42486
llvm-svn: 324467
We did not report valid filename for duplicate symbol error when
symbol came from binary input file.
Patch fixes it.
Differential revision: https://reviews.llvm.org/D42635
llvm-svn: 324217
r323476 added support for DW_FORM_line_strp, and incorrectly made that
depend on having a DWARFUnit available. We shouldn't be tracking
.debug_line_str in DWARFUnit after all. After this patch, I can do an
NFC follow up and undo a bunch of the "plumbing" part of r323476.
Differential Revision: https://reviews.llvm.org/D42609
llvm-svn: 323691
We normally avoid "switch (Config->EKind)", but in this case I think
it is worth it.
It is only executed when there is an error and it allows detemplating
a lot of code.
llvm-svn: 321404
I noticed that the continue this patch deletes was not tested. Trying
to add a test I realized that we never put a VER_NDX_LOCAL symbol in
the dynamic symbol table. There doesn't seem to be any reason for a
linker to use VER_NDX_LOCAL for a defined shared symbol.
llvm-svn: 320817
Specifically, libwidevinecdm.so in Chrome has such bad symbol.
It seems the BFD linker handles them as local symbols, so instead
of inserting them to the symbol table, we should skip them too.
Differential Revision: https://reviews.llvm.org/D41257
llvm-svn: 320770
By using an index instead of a pointer for verdef we can put the index
next to the alignment field. This uses the otherwise wasted area and
reduces the shared symbol size.
By itself the performance change of this is in the noise, but I have a
followup patch to remove another 8 bytes that improves performance
when combined with this.
llvm-svn: 320449
This patch is to rename check CHECK and make it a C macro, so that
we can evaluate the second argument lazily.
Differential Revision: https://reviews.llvm.org/D40915
llvm-svn: 319974
This avoids allocating the error message when there is no error that
check requires.
It avoids the code duplication of inlining check.
llvm-svn: 319922
The SHT_ARM_ATTRIBUTES section type is in the processor specific space
adding guard to make sure that it is only recognized when EMachine is
EM_ARM.
llvm-svn: 319304
lld assumes some ARM features that are not available in all Arm
processors. In particular:
- The blx instruction present for interworking.
- The movt/movw instructions are used in Thunks.
- The J1=1 J2=1 encoding of branch immediates to improve Thumb wide
branch range are assumed to be present.
This patch reads the ARM Attributes section to check for the
architecture the object file was compiled with. If none of the objects
have an architecture that supports either of these features a warning
will be given. This is most likely to affect armv6 as used in the first
Raspberry Pi.
Differential Revision: https://reviews.llvm.org/D36823
llvm-svn: 319169
This fixes PR35223.
Here I enabled SHF_MERGE section content merging for -r like
we do for regular linking.
Differential revision: https://reviews.llvm.org/D40026
llvm-svn: 318516
Now that DefinedRegular is the only remaining derived class of
Defined, we can merge the two classes.
Differential Revision: https://reviews.llvm.org/D39667
llvm-svn: 317448
Currently LLD tries to use information about functions and variables location
taking it from debug sections. When --strip-* is given we discard such sections
and that breaks error reporting.
Patch stops discarding such sections and just removes them from InputSections list.
Differential revision: https://reviews.llvm.org/D39550
llvm-svn: 317405
Now that we have only SymbolBody as the symbol class. So, "SymbolBody"
is a bit strange name now. This is a mechanical change generated by
perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)
nd clang-format-diff.
Differential Revision: https://reviews.llvm.org/D39459
llvm-svn: 317370