2020-04-03 02:54:05 +08:00
|
|
|
//===- Driver.cpp ---------------------------------------------------------===//
|
|
|
|
//
|
|
|
|
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
|
|
|
|
// See https://llvm.org/LICENSE.txt for license information.
|
|
|
|
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
|
|
|
//
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
|
|
|
#include "Driver.h"
|
|
|
|
#include "Config.h"
|
[lld-macho] Move ICF earlier to avoid emitting redundant binds
This is a pretty big refactoring diff, so here are the motivations:
Previously, ICF ran after scanRelocations(), where we were emitting
bind/rebase opcodes etc. So we had a bunch of redundant leftovers after
ICF. Having ICF run before Writer seems like a better design, and is
what LLD-ELF does, so this diff refactors it accordingly.
However, ICF had two dependencies on things occurring in Writer: 1) it
needs literals to be deduplicated beforehand and 2) it needs to know
which functions have unwind info, which was being handled by
`UnwindInfoSection::prepareRelocations()`.
In order to do literal deduplication earlier, we need to add literal
input sections to their corresponding output sections. So instead of
putting all input sections into the big `inputSections` vector, and then
filtering them by type later on, I've changed things so that literal
sections get added directly to their output sections during the 'gather'
phase. Likewise for compact unwind sections -- they get added directly
to the UnwindInfoSection now. This latter change is not strictly
necessary, but makes it easier for ICF to determine which functions have
unwind info.
Adding literal sections directly to their output sections means that we
can no longer determine `inputOrder` from iterating over
`inputSections`. Instead, we store that order explicitly on
InputSection. Bloating the size of InputSection for this purpose would
be unfortunate -- but LLD-ELF has already solved this problem: it reuses
`outSecOff` to store this order value.
One downside of this refactor is that we now make an additional pass
over the unwind info relocations to figure out which functions have
unwind info, since we want to know that before `processRelocations()`. I've
made sure to run that extra loop only if ICF is enabled, so there should
be no overhead in non-optimizing runs of the linker.
The upside of all this is that the `inputSections` vector now contains
only ConcatInputSections that are destined for ConcatOutputSections, so
we can clean up a bunch of code that just existed to filter out other
elements from that vector.
I will test for the lack of redundant binds/rebases in the upcoming
cfstring deduplication diff. While binds/rebases can also happen in the
regular `.text` section, they're more common in `.data` sections, so it
seems more natural to test it that way.
This change is perf-neutral when linking chromium_framework.
Reviewed By: oontvoo
Differential Revision: https://reviews.llvm.org/D105044
2021-07-02 08:33:42 +08:00
|
|
|
#include "ICF.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "InputFiles.h"
|
2020-10-27 10:18:29 +08:00
|
|
|
#include "LTO.h"
|
[lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.
Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236MB stripped with ld64).
Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:
% ministat lld_*
x lld_nostrip.txt
+ lld_strip.txt
    N           Min           Max        Median           Avg        Stddev
x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
+  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
Difference at 95.0% confidence
        -0.144711 +/- 0.0336749
        -3.60967% +/- 0.839989%
        (Student's t, pooled s = 0.0358398)
This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols
It's possible it interacts with more features I didn't think of,
of course.
I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests
Implementation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.
Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
(`.no_dead_strip` in asm)
Fixes PR49276.
Differential Revision: https://reviews.llvm.org/D103324
2021-05-08 05:10:05 +08:00
|
|
|
#include "MarkLive.h"
|
2020-08-19 05:37:04 +08:00
|
|
|
#include "ObjC.h"
|
2020-05-02 07:29:06 +08:00
|
|
|
#include "OutputSection.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "OutputSegment.h"
|
2022-01-25 06:27:56 +08:00
|
|
|
#include "SectionPriorities.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "SymbolTable.h"
|
|
|
|
#include "Symbols.h"
|
2020-07-31 05:28:41 +08:00
|
|
|
#include "SyntheticSections.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "Target.h"
|
2021-07-02 08:33:42 +08:00
|
|
|
#include "UnwindInfoSection.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "Writer.h"
|
|
|
|
|
|
|
|
#include "lld/Common/Args.h"
|
|
|
|
#include "lld/Common/Driver.h"
|
|
|
|
#include "lld/Common/ErrorHandler.h"
|
|
|
|
#include "lld/Common/LLVM.h"
|
|
|
|
#include "lld/Common/Memory.h"
|
2020-12-02 12:31:57 +08:00
|
|
|
#include "lld/Common/Reproduce.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "lld/Common/Version.h"
|
2020-05-06 07:37:34 +08:00
|
|
|
#include "llvm/ADT/DenseSet.h"
|
2020-04-22 04:37:57 +08:00
|
|
|
#include "llvm/ADT/StringExtras.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "llvm/ADT/StringRef.h"
|
|
|
|
#include "llvm/BinaryFormat/MachO.h"
|
|
|
|
#include "llvm/BinaryFormat/Magic.h"
|
2021-05-19 23:07:39 +08:00
|
|
|
#include "llvm/Config/llvm-config.h"
|
2020-10-27 10:18:29 +08:00
|
|
|
#include "llvm/LTO/LTO.h"
|
2020-05-15 03:43:51 +08:00
|
|
|
#include "llvm/Object/Archive.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "llvm/Option/ArgList.h"
|
2020-12-09 20:06:50 +08:00
|
|
|
#include "llvm/Support/CommandLine.h"
|
2020-07-27 03:46:46 +08:00
|
|
|
#include "llvm/Support/FileSystem.h"
|
2020-06-18 10:59:27 +08:00
|
|
|
#include "llvm/Support/Host.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
#include "llvm/Support/MemoryBuffer.h"
|
2021-03-26 02:39:45 +08:00
|
|
|
#include "llvm/Support/Parallel.h"
|
2020-04-24 11:16:49 +08:00
|
|
|
#include "llvm/Support/Path.h"
|
2020-11-29 11:38:27 +08:00
|
|
|
#include "llvm/Support/TarWriter.h"
|
2020-10-27 10:18:29 +08:00
|
|
|
#include "llvm/Support/TargetSelect.h"
|
2021-03-11 22:04:27 +08:00
|
|
|
#include "llvm/Support/TimeProfiler.h"
|
2021-04-06 00:59:50 +08:00
|
|
|
#include "llvm/TextAPI/PackedVersion.h"
|
2020-04-03 02:54:05 +08:00
|
|
|
|
2020-08-11 09:47:16 +08:00
|
|
|
#include <algorithm>
|
|
|
|
|
2020-04-03 02:54:05 +08:00
|
|
|
using namespace llvm;
|
|
|
|
using namespace llvm::MachO;
|
2020-08-13 10:50:27 +08:00
|
|
|
using namespace llvm::object;
|
2020-06-16 03:36:32 +08:00
|
|
|
using namespace llvm::opt;
|
2020-08-13 10:50:27 +08:00
|
|
|
using namespace llvm::sys;
|
2020-04-03 02:54:05 +08:00
|
|
|
using namespace lld;
|
|
|
|
using namespace lld::macho;
|
|
|
|
|
2022-01-11 11:39:14 +08:00
|
|
|
std::unique_ptr<Configuration> macho::config;
|
|
|
|
std::unique_ptr<DependencyTracker> macho::depTracker;
|
2020-04-03 02:54:05 +08:00
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
static HeaderFileType getOutputType(const InputArgList &args) {
|
2020-09-01 14:23:37 +08:00
|
|
|
// TODO: -r, -dylinker, -preload...
|
2021-03-10 12:40:08 +08:00
|
|
|
Arg *outputArg = args.getLastArg(OPT_bundle, OPT_dylib, OPT_execute);
|
2020-09-01 14:23:37 +08:00
|
|
|
if (outputArg == nullptr)
|
|
|
|
return MH_EXECUTE;
|
|
|
|
|
|
|
|
switch (outputArg->getOption().getID()) {
|
|
|
|
case OPT_bundle:
|
|
|
|
return MH_BUNDLE;
|
|
|
|
case OPT_dylib:
|
|
|
|
return MH_DYLIB;
|
|
|
|
case OPT_execute:
|
|
|
|
return MH_EXECUTE;
|
|
|
|
default:
|
|
|
|
llvm_unreachable("internal error");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-11-04 00:49:13 +08:00
|
|
|
static DenseMap<CachedHashStringRef, StringRef> resolvedLibraries;
|
2021-06-19 10:19:09 +08:00
|
|
|
static Optional<StringRef> findLibrary(StringRef name) {
|
2021-11-04 00:49:13 +08:00
|
|
|
CachedHashStringRef key(name);
|
|
|
|
auto entry = resolvedLibraries.find(key);
|
|
|
|
if (entry != resolvedLibraries.end())
|
|
|
|
return entry->second;
|
|
|
|
|
|
|
|
auto doFind = [&] {
|
|
|
|
if (config->searchDylibsFirst) {
|
|
|
|
if (Optional<StringRef> path = findPathCombination(
|
|
|
|
"lib" + name, config->librarySearchPaths, {".tbd", ".dylib"}))
|
|
|
|
return path;
|
|
|
|
return findPathCombination("lib" + name, config->librarySearchPaths,
|
|
|
|
{".a"});
|
|
|
|
}
|
2021-04-16 09:14:30 +08:00
|
|
|
return findPathCombination("lib" + name, config->librarySearchPaths,
|
2021-11-04 00:49:13 +08:00
|
|
|
{".tbd", ".dylib", ".a"});
|
|
|
|
};
|
|
|
|
|
|
|
|
Optional<StringRef> path = doFind();
|
|
|
|
if (path)
|
|
|
|
resolvedLibraries[key] = *path;
|
|
|
|
|
|
|
|
return path;
|
2021-04-16 09:14:30 +08:00
|
|
|
}
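The search order in findLibrary() can be sketched in isolation. The version below is a simplified stand-in, not lld's code: `findCombination`, `findLibrarySketch`, and the `exists` callback are hypothetical names, the result cache is left out, and the helper is assumed to walk directories in the outer loop and extensions in the inner loop.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback standing in for a filesystem existence check.
using ExistsFn = std::function<bool(const std::string &)>;

// Returns the first existing "<dir>/<stem><ext>", or "" if none exists.
// Assumption: directories are the outer loop, extensions the inner loop.
std::string findCombination(const std::string &stem,
                            const std::vector<std::string> &dirs,
                            const std::vector<std::string> &exts,
                            const ExistsFn &exists) {
  for (const std::string &dir : dirs)
    for (const std::string &ext : exts) {
      std::string candidate = dir + "/" + stem + ext;
      if (exists(candidate))
        return candidate;
    }
  return "";
}

// With searchDylibsFirst, every directory is scanned for a dylib or stub
// before any directory is scanned for a static archive.
std::string findLibrarySketch(const std::string &name,
                              const std::vector<std::string> &dirs,
                              bool searchDylibsFirst, const ExistsFn &exists) {
  const std::string stem = "lib" + name;
  if (searchDylibsFirst) {
    std::string path = findCombination(stem, dirs, {".tbd", ".dylib"}, exists);
    if (!path.empty())
      return path;
    return findCombination(stem, dirs, {".a"}, exists);
  }
  return findCombination(stem, dirs, {".tbd", ".dylib", ".a"}, exists);
}
```

With `/a/libz.a` and `/b/libz.dylib` both present and search paths `/a`, `/b`, the default mode picks the archive in `/a`, while the dylibs-first mode picks the dylib in `/b`.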
|
|
|
|
|
2021-11-04 02:08:57 +08:00
|
|
|
static DenseMap<CachedHashStringRef, StringRef> resolvedFrameworks;
|
2021-10-20 23:20:21 +08:00
|
|
|
static Optional<StringRef> findFramework(StringRef name) {
|
2021-11-04 02:08:57 +08:00
|
|
|
CachedHashStringRef key(name);
|
|
|
|
auto entry = resolvedFrameworks.find(key);
|
|
|
|
if (entry != resolvedFrameworks.end())
|
|
|
|
return entry->second;
|
|
|
|
|
2021-01-10 00:58:19 +08:00
|
|
|
SmallString<260> symlink;
|
2020-07-27 03:46:46 +08:00
|
|
|
StringRef suffix;
|
|
|
|
std::tie(name, suffix) = name.split(",");
|
|
|
|
for (StringRef dir : config->frameworkSearchPaths) {
|
|
|
|
symlink = dir;
|
|
|
|
path::append(symlink, name + ".framework", name);
|
2020-08-08 02:04:54 +08:00
|
|
|
|
2020-07-27 03:46:46 +08:00
|
|
|
if (!suffix.empty()) {
|
2020-08-08 02:04:54 +08:00
|
|
|
// NOTE: we must resolve the symlink before trying the suffixes, because
|
|
|
|
// there are no symlinks for the suffixed paths.
|
2021-01-10 00:58:19 +08:00
|
|
|
SmallString<260> location;
|
2020-08-08 02:04:54 +08:00
|
|
|
if (!fs::real_path(symlink, location)) {
|
|
|
|
// only append suffix if realpath() succeeds
|
|
|
|
Twine suffixed = location + suffix;
|
|
|
|
if (fs::exists(suffixed))
|
2022-01-21 03:53:18 +08:00
|
|
|
return resolvedFrameworks[key] = saver().save(suffixed.str());
|
2020-08-08 02:04:54 +08:00
|
|
|
}
|
2020-07-27 03:46:46 +08:00
|
|
|
// Suffix lookup failed, fall through to the no-suffix case.
|
|
|
|
}
|
2020-08-08 02:04:54 +08:00
|
|
|
|
2021-10-20 23:20:21 +08:00
|
|
|
if (Optional<StringRef> path = resolveDylibPath(symlink.str()))
|
2021-11-04 02:08:57 +08:00
|
|
|
return resolvedFrameworks[key] = *path;
|
2020-07-27 03:46:46 +08:00
|
|
|
}
|
|
|
|
return {};
|
|
|
|
}
|
|
|
|
|
2020-08-14 04:03:00 +08:00
|
|
|
static bool warnIfNotDirectory(StringRef option, StringRef path) {
|
2020-06-18 10:59:27 +08:00
|
|
|
if (!fs::exists(path)) {
|
|
|
|
warn("directory not found for option -" + option + path);
|
|
|
|
return false;
|
|
|
|
} else if (!fs::is_directory(path)) {
|
|
|
|
warn("option -" + option + path + " references a non-directory path");
|
|
|
|
return false;
|
2020-04-22 04:37:57 +08:00
|
|
|
}
|
2020-06-18 10:59:27 +08:00
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2020-09-19 11:51:38 +08:00
|
|
|
static std::vector<StringRef>
|
2021-03-10 12:40:08 +08:00
|
|
|
getSearchPaths(unsigned optionCode, InputArgList &args,
|
2020-09-19 11:51:38 +08:00
|
|
|
const std::vector<StringRef> &roots,
|
|
|
|
const SmallVector<StringRef, 2> &systemPaths) {
|
|
|
|
std::vector<StringRef> paths;
|
2020-08-14 04:03:00 +08:00
|
|
|
StringRef optionLetter{optionCode == OPT_F ? "F" : "L"};
|
|
|
|
for (StringRef path : args::getStrings(args, optionCode)) {
|
2020-06-20 12:13:03 +08:00
|
|
|
// NOTE: only absolute paths are re-rooted to syslibroot(s)
|
2020-08-14 04:03:00 +08:00
|
|
|
bool found = false;
|
|
|
|
if (path::is_absolute(path, path::Style::posix)) {
|
2020-06-20 12:13:03 +08:00
|
|
|
for (StringRef root : roots) {
|
|
|
|
SmallString<261> buffer(root);
|
2020-08-14 04:03:00 +08:00
|
|
|
path::append(buffer, path);
|
2020-06-20 12:13:03 +08:00
|
|
|
// Do not warn about paths that are computed via the syslib roots
|
2020-08-14 04:03:00 +08:00
|
|
|
if (fs::is_directory(buffer)) {
|
2022-01-21 03:53:18 +08:00
|
|
|
paths.push_back(saver().save(buffer.str()));
|
2020-08-14 04:03:00 +08:00
|
|
|
found = true;
|
|
|
|
}
|
2020-06-20 12:13:03 +08:00
|
|
|
}
|
2020-06-18 10:59:27 +08:00
|
|
|
}
|
2020-08-14 04:03:00 +08:00
|
|
|
if (!found && warnIfNotDirectory(optionLetter, path))
|
|
|
|
paths.push_back(path);
|
2020-06-18 10:59:27 +08:00
|
|
|
}
|
2020-06-20 12:13:03 +08:00
|
|
|
|
|
|
|
// `-Z` suppresses the standard "system" search paths.
|
|
|
|
if (args.hasArg(OPT_Z))
|
2020-09-19 11:51:38 +08:00
|
|
|
return paths;
|
2020-06-20 12:13:03 +08:00
|
|
|
|
2021-03-10 12:15:29 +08:00
|
|
|
for (const StringRef &path : systemPaths) {
|
|
|
|
for (const StringRef &root : roots) {
|
2020-06-20 12:13:03 +08:00
|
|
|
SmallString<261> buffer(root);
|
2020-08-14 04:03:00 +08:00
|
|
|
path::append(buffer, path);
|
2020-11-21 05:05:33 +08:00
|
|
|
if (fs::is_directory(buffer))
|
2022-01-21 03:53:18 +08:00
|
|
|
paths.push_back(saver().save(buffer.str()));
|
2020-06-20 12:13:03 +08:00
|
|
|
}
|
|
|
|
}
|
2020-09-19 11:51:38 +08:00
|
|
|
return paths;
|
|
|
|
}
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
static std::vector<StringRef> getSystemLibraryRoots(InputArgList &args) {
|
2020-09-19 11:51:38 +08:00
|
|
|
std::vector<StringRef> roots;
|
|
|
|
for (const Arg *arg : args.filtered(OPT_syslibroot))
|
|
|
|
roots.push_back(arg->getValue());
|
|
|
|
// NOTE: the final `-syslibroot` being `/` will ignore all roots
|
2021-10-27 03:14:25 +08:00
|
|
|
if (!roots.empty() && roots.back() == "/")
|
2020-09-19 11:51:38 +08:00
|
|
|
roots.clear();
|
|
|
|
// NOTE: roots can never be empty - add an empty root to simplify the library
|
|
|
|
// and framework search path computation.
|
|
|
|
if (roots.empty())
|
|
|
|
roots.emplace_back("");
|
|
|
|
return roots;
|
2020-06-18 10:59:27 +08:00
|
|
|
}
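The root handling above, together with the re-rooting of absolute search paths in getSearchPaths(), can be illustrated with plain standard-library types. This is a minimal sketch: `normalizeRoots` and `rerootCandidates` are hypothetical names, and simple string concatenation stands in for `path::append`, so it assumes roots carry no trailing slash.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors the normalization in getSystemLibraryRoots(): a final "/" root
// discards every root, and an empty list becomes a single empty root.
std::vector<std::string> normalizeRoots(std::vector<std::string> roots) {
  if (!roots.empty() && roots.back() == "/")
    roots.clear();
  if (roots.empty())
    roots.emplace_back("");
  return roots;
}

// Simplified re-rooting: prefix an absolute POSIX path with each root.
// Assumption: concatenation is enough because roots have no trailing slash.
std::vector<std::string> rerootCandidates(const std::vector<std::string> &roots,
                                          const std::string &absPath) {
  std::vector<std::string> out;
  for (const std::string &root : roots)
    out.push_back(root + absPath);
  return out;
}
```

The empty root is what lets the same loop handle the case without `-syslibroot`: the candidate for `/usr/lib` is then just `/usr/lib`.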
|
|
|
|
|
2020-09-19 11:51:38 +08:00
|
|
|
static std::vector<StringRef>
|
2021-03-10 12:40:08 +08:00
|
|
|
getLibrarySearchPaths(InputArgList &args, const std::vector<StringRef> &roots) {
|
2020-09-19 11:51:38 +08:00
|
|
|
return getSearchPaths(OPT_L, args, roots, {"/usr/lib", "/usr/local/lib"});
|
2020-06-18 10:59:27 +08:00
|
|
|
}
|
|
|
|
|
2020-09-19 11:51:38 +08:00
|
|
|
static std::vector<StringRef>
|
2021-03-10 12:40:08 +08:00
|
|
|
getFrameworkSearchPaths(InputArgList &args,
|
2020-09-19 11:51:38 +08:00
|
|
|
const std::vector<StringRef> &roots) {
|
|
|
|
return getSearchPaths(OPT_F, args, roots,
|
|
|
|
{"/Library/Frameworks", "/System/Library/Frameworks"});
|
2020-04-22 04:37:57 +08:00
|
|
|
}
|
|
|
|
|
2021-07-16 00:56:13 +08:00
|
|
|
static llvm::CachePruningPolicy getLTOCachePolicy(InputArgList &args) {
|
|
|
|
SmallString<128> ltoPolicy;
|
|
|
|
auto add = [&ltoPolicy](Twine val) {
|
|
|
|
if (!ltoPolicy.empty())
|
|
|
|
ltoPolicy += ":";
|
|
|
|
val.toVector(ltoPolicy);
|
|
|
|
};
|
|
|
|
for (const Arg *arg :
|
|
|
|
args.filtered(OPT_thinlto_cache_policy, OPT_prune_interval_lto,
|
|
|
|
OPT_prune_after_lto, OPT_max_relative_cache_size_lto)) {
|
|
|
|
switch (arg->getOption().getID()) {
|
2021-10-27 03:14:25 +08:00
|
|
|
case OPT_thinlto_cache_policy:
|
|
|
|
add(arg->getValue());
|
|
|
|
break;
|
2021-07-16 00:56:13 +08:00
|
|
|
case OPT_prune_interval_lto:
|
|
|
|
if (!strcmp("-1", arg->getValue()))
|
|
|
|
add("prune_interval=87600h"); // 10 years
|
|
|
|
else
|
|
|
|
add(Twine("prune_interval=") + arg->getValue() + "s");
|
|
|
|
break;
|
|
|
|
case OPT_prune_after_lto:
|
|
|
|
add(Twine("prune_after=") + arg->getValue() + "s");
|
|
|
|
break;
|
|
|
|
case OPT_max_relative_cache_size_lto:
|
|
|
|
add(Twine("cache_size=") + arg->getValue() + "%");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return CHECK(parseCachePruningPolicy(ltoPolicy), "invalid LTO cache policy");
|
|
|
|
}
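The string that getLTOCachePolicy() hands to parseCachePruningPolicy() is a colon-separated list of `key=value` items. The sketch below rebuilds only the string assembly with standard-library types; `CacheFlag` and `buildCachePolicy` are hypothetical names, and the parsing and validation step is not modelled.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tags for the four command-line options handled above.
enum class CacheFlag { Policy, PruneInterval, PruneAfter, MaxRelativeSize };

std::string
buildCachePolicy(const std::vector<std::pair<CacheFlag, std::string>> &flags) {
  std::string policy;
  // Append one item, inserting ':' between items but not before the first.
  auto add = [&policy](const std::string &val) {
    if (!policy.empty())
      policy += ":";
    policy += val;
  };
  for (const auto &f : flags) {
    const std::string &value = f.second;
    switch (f.first) {
    case CacheFlag::Policy:
      add(value); // already in policy syntax, passed through unchanged
      break;
    case CacheFlag::PruneInterval:
      if (value == "-1")
        add("prune_interval=87600h"); // 10 years
      else
        add("prune_interval=" + value + "s");
      break;
    case CacheFlag::PruneAfter:
      add("prune_after=" + value + "s");
      break;
    case CacheFlag::MaxRelativeSize:
      add("cache_size=" + value + "%");
      break;
    }
  }
  return policy;
}
```

For the inputs `-1`, `60`, and `10` in that order, the result is `prune_interval=87600h:prune_after=60s:cache_size=10%`.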
|
|
|
|
|
2021-06-19 10:30:57 +08:00
|
|
|
static DenseMap<StringRef, ArchiveFile *> loadedArchives;
|
|
|
|
|
2021-10-29 23:00:16 +08:00
|
|
|
static InputFile *addFile(StringRef path, ForceLoad forceLoadArchive,
|
2022-01-20 02:14:49 +08:00
|
|
|
bool isLazy = false, bool isExplicit = true,
|
|
|
|
bool isBundleLoader = false) {
|
2021-03-02 17:20:22 +08:00
|
|
|
Optional<MemoryBufferRef> buffer = readFile(path);
|
2020-04-03 02:54:05 +08:00
|
|
|
if (!buffer)
|
2020-09-19 02:38:15 +08:00
|
|
|
return nullptr;
|
2020-04-03 02:54:05 +08:00
|
|
|
MemoryBufferRef mbref = *buffer;
|
2020-09-19 02:38:15 +08:00
|
|
|
InputFile *newFile = nullptr;
|
2020-04-03 02:54:05 +08:00
|
|
|
|
2021-03-10 12:15:29 +08:00
|
|
|
file_magic magic = identify_magic(mbref.getBuffer());
|
2020-12-03 07:57:30 +08:00
|
|
|
switch (magic) {
|
2020-05-15 03:43:51 +08:00
|
|
|
case file_magic::archive: {
|
2021-06-19 10:30:57 +08:00
|
|
|
// Avoid loading archives twice. If the archives are being force-loaded,
|
|
|
|
// loading them twice would create duplicate symbol errors. In the
|
|
|
|
// non-force-loading case, this is just a minor performance optimization.
|
|
|
|
// We don't take a reference to cachedFile here because the
|
|
|
|
// loadArchiveMember() call below may recursively call addFile() and
|
|
|
|
// invalidate this reference.
|
2021-06-21 09:59:11 +08:00
|
|
|
if (ArchiveFile *cachedFile = loadedArchives[path])
|
2021-06-19 10:30:57 +08:00
|
|
|
return cachedFile;
|
|
|
|
|
2021-08-26 23:49:47 +08:00
|
|
|
std::unique_ptr<object::Archive> archive = CHECK(
|
2020-05-15 03:43:51 +08:00
|
|
|
object::Archive::create(mbref), path + ": failed to parse archive");
|
|
|
|
|
2021-08-26 23:49:47 +08:00
|
|
|
if (!archive->isEmpty() && !archive->hasSymbolTable())
|
2020-05-15 03:43:51 +08:00
|
|
|
error(path + ": archive has no index; run ranlib to add one");
|
|
|
|
|
2021-08-26 23:49:47 +08:00
|
|
|
auto *file = make<ArchiveFile>(std::move(archive));
|
2021-10-29 23:00:16 +08:00
|
|
|
if ((forceLoadArchive == ForceLoad::Default && config->allLoad) ||
|
|
|
|
forceLoadArchive == ForceLoad::Yes) {
|
2021-03-02 17:20:22 +08:00
|
|
|
if (Optional<MemoryBufferRef> buffer = readFile(path)) {
|
2021-08-26 23:49:47 +08:00
|
|
|
Error e = Error::success();
|
|
|
|
for (const object::Archive::Child &c : file->getArchive().children(e)) {
|
2021-10-29 23:00:16 +08:00
|
|
|
StringRef reason =
|
|
|
|
forceLoadArchive == ForceLoad::Yes ? "-force_load" : "-all_load";
|
2021-08-26 23:49:47 +08:00
|
|
|
if (Error e = file->fetch(c, reason))
|
|
|
|
error(toString(file) + ": " + reason +
|
|
|
|
" failed to load archive member: " + toString(std::move(e)));
|
2020-11-20 23:14:57 +08:00
|
|
|
}
|
2021-08-26 23:49:47 +08:00
|
|
|
if (e)
|
|
|
|
error(toString(file) +
|
|
|
|
": Archive::children failed: " + toString(std::move(e)));
|
2020-11-20 23:14:57 +08:00
|
|
|
}
|
2021-10-29 23:00:16 +08:00
|
|
|
} else if (forceLoadArchive == ForceLoad::Default &&
|
|
|
|
config->forceLoadObjC) {
|
2021-08-26 23:49:47 +08:00
|
|
|
for (const object::Archive::Symbol &sym : file->getArchive().symbols())
|
2020-08-19 05:37:04 +08:00
|
|
|
if (sym.getName().startswith(objc::klass))
|
2021-08-27 01:51:38 +08:00
|
|
|
file->fetch(sym);
|
2020-08-19 05:37:04 +08:00
|
|
|
|
|
|
|
// TODO: no need to look for ObjC sections for a given archive member if
|
2021-08-26 23:49:47 +08:00
|
|
|
// we already found that it contains an ObjC symbol.
|
2021-03-02 17:20:22 +08:00
|
|
|
if (Optional<MemoryBufferRef> buffer = readFile(path)) {
|
2021-08-26 23:49:47 +08:00
|
|
|
Error e = Error::success();
|
|
|
|
for (const object::Archive::Child &c : file->getArchive().children(e)) {
|
|
|
|
Expected<MemoryBufferRef> mb = c.getMemoryBufferRef();
|
|
|
|
if (!mb || !hasObjCSection(*mb))
|
|
|
|
continue;
|
|
|
|
if (Error e = file->fetch(c, "-ObjC"))
|
|
|
|
error(toString(file) + ": -ObjC failed to load archive member: " +
|
|
|
|
toString(std::move(e)));
|
2020-12-02 08:00:48 +08:00
|
|
|
}
|
2021-08-26 23:49:47 +08:00
|
|
|
if (e)
|
|
|
|
error(toString(file) +
|
|
|
|
": Archive::children failed: " + toString(std::move(e)));
|
2020-12-02 08:00:48 +08:00
|
|
|
}
|
2020-08-19 05:37:04 +08:00
|
|
|
}
|
|
|
|
|
2021-08-26 23:49:47 +08:00
|
|
|
file->addLazySymbols();
|
|
|
|
newFile = loadedArchives[path] = file;
|
2020-05-15 03:43:51 +08:00
|
|
|
break;
|
|
|
|
}
|
2020-04-03 02:54:05 +08:00
|
|
|
case file_magic::macho_object:
|
2022-01-20 02:14:49 +08:00
|
|
|
newFile = make<ObjFile>(mbref, getModTime(path), "", isLazy);
|
2020-04-03 02:54:05 +08:00
|
|
|
break;
|
2020-04-22 04:37:57 +08:00
|
|
|
case file_magic::macho_dynamically_linked_shared_lib:
|
2020-10-08 05:50:42 +08:00
|
|
|
case file_magic::macho_dynamically_linked_shared_lib_stub:
|
2021-01-19 23:44:42 +08:00
|
|
|
case file_magic::tapi_file:
|
2021-07-12 06:36:59 +08:00
|
|
|
if (DylibFile *dylibFile = loadDylib(mbref)) {
|
2021-06-02 20:54:36 +08:00
|
|
|
if (isExplicit)
|
|
|
|
dylibFile->explicitlyLinked = true;
|
2021-06-02 04:09:41 +08:00
|
|
|
newFile = dylibFile;
|
2021-06-01 10:12:35 +08:00
|
|
|
}
|
2020-06-06 02:18:33 +08:00
|
|
|
break;
|
2020-10-27 10:18:29 +08:00
|
|
|
case file_magic::bitcode:
|
2022-01-20 02:14:49 +08:00
|
|
|
newFile = make<BitcodeFile>(mbref, "", 0, isLazy);
|
2020-10-27 10:18:29 +08:00
|
|
|
break;
|
2021-02-23 02:03:02 +08:00
|
|
|
case file_magic::macho_executable:
|
|
|
|
case file_magic::macho_bundle:
|
|
|
|
// We only allow executable and bundle type here if it is used
|
|
|
|
// as a bundle loader.
|
|
|
|
if (!isBundleLoader)
|
|
|
|
error(path + ": unhandled file type");
|
2021-06-02 04:09:41 +08:00
|
|
|
if (DylibFile *dylibFile = loadDylib(mbref, nullptr, isBundleLoader))
|
|
|
|
newFile = dylibFile;
|
2021-02-23 02:03:02 +08:00
|
|
|
break;
|
2020-04-03 02:54:05 +08:00
|
|
|
default:
|
|
|
|
error(path + ": unhandled file type");
|
|
|
|
}
|
2021-06-07 23:00:34 +08:00
|
|
|
if (newFile && !isa<DylibFile>(newFile)) {
|
2022-01-20 02:14:49 +08:00
|
|
|
if ((isa<ObjFile>(newFile) || isa<BitcodeFile>(newFile)) && newFile->lazy &&
|
|
|
|
config->forceLoadObjC) {
|
|
|
|
for (Symbol *sym : newFile->symbols)
|
|
|
|
if (sym && sym->getName().startswith(objc::klass)) {
|
|
|
|
extract(*newFile, "-ObjC");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
if (newFile->lazy && hasObjCSection(mbref))
|
|
|
|
extract(*newFile, "-ObjC");
|
|
|
|
}
|
|
|
|
|
2020-12-03 07:57:30 +08:00
|
|
|
// printArchiveMemberLoad() prints both .a and .o names, so no need to
|
2022-01-20 02:14:49 +08:00
|
|
|
// print the .a name here. Similarly skip lazy files.
|
|
|
|
if (config->printEachFile && magic != file_magic::archive && !isLazy)
|
2020-12-18 05:19:06 +08:00
|
|
|
message(toString(newFile));
|
2020-12-15 06:59:22 +08:00
|
|
|
inputFiles.insert(newFile);
|
2020-12-03 07:57:30 +08:00
|
|
|
}
|
2020-09-19 02:38:15 +08:00
|
|
|
return newFile;
|
2020-04-03 02:54:05 +08:00
|
|
|
}
|
|
|
|
|
2021-06-02 23:06:42 +08:00
|
|
|
static void addLibrary(StringRef name, bool isNeeded, bool isWeak,
|
2021-10-29 23:00:16 +08:00
|
|
|
bool isReexport, bool isExplicit,
|
|
|
|
ForceLoad forceLoadArchive) {
|
2021-06-19 10:19:09 +08:00
|
|
|
if (Optional<StringRef> path = findLibrary(name)) {
|
2021-06-02 20:54:36 +08:00
|
|
|
if (auto *dylibFile = dyn_cast_or_null<DylibFile>(
|
2022-01-20 02:14:49 +08:00
|
|
|
addFile(*path, forceLoadArchive, /*isLazy=*/false, isExplicit))) {
|
2021-06-02 23:06:42 +08:00
|
|
|
if (isNeeded)
|
|
|
|
dylibFile->forceNeeded = true;
|
2021-06-01 10:12:35 +08:00
|
|
|
if (isWeak)
|
|
|
|
dylibFile->forceWeakImport = true;
|
2021-06-02 08:17:04 +08:00
|
|
|
if (isReexport) {
|
|
|
|
config->hasReexports = true;
|
|
|
|
dylibFile->reexport = true;
|
|
|
|
}
|
2021-06-01 10:12:35 +08:00
|
|
|
}
|
2020-12-04 05:40:04 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
error("library not found for -l" + name);
|
|
|
|
}
|
|
|
|
|
2021-06-02 23:06:42 +08:00
|
|
|
static void addFramework(StringRef name, bool isNeeded, bool isWeak,
|
2021-10-29 23:00:16 +08:00
|
|
|
bool isReexport, bool isExplicit,
|
|
|
|
ForceLoad forceLoadArchive) {
|
2021-10-20 23:20:21 +08:00
|
|
|
if (Optional<StringRef> path = findFramework(name)) {
|
2021-06-02 20:54:36 +08:00
|
|
|
if (auto *dylibFile = dyn_cast_or_null<DylibFile>(
|
2022-01-20 02:14:49 +08:00
|
|
|
addFile(*path, forceLoadArchive, /*isLazy=*/false, isExplicit))) {
|
2021-06-02 23:06:42 +08:00
|
|
|
if (isNeeded)
|
|
|
|
dylibFile->forceNeeded = true;
|
2021-06-01 10:12:35 +08:00
|
|
|
if (isWeak)
|
|
|
|
dylibFile->forceWeakImport = true;
|
2021-06-02 08:17:04 +08:00
|
|
|
if (isReexport) {
|
|
|
|
config->hasReexports = true;
|
|
|
|
dylibFile->reexport = true;
|
|
|
|
}
|
2021-06-01 10:12:35 +08:00
|
|
|
}
|
2020-12-04 05:40:04 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
error("framework not found for -framework " + name);
|
|
|
|
}
|
|
|
|
|
2021-04-07 02:05:15 +08:00
|
|
|
// Parses LC_LINKER_OPTION contents, which can add additional command line
|
2021-11-05 10:53:18 +08:00
|
|
|
// flags. This directly parses the flags instead of using the standard argument
|
|
|
|
// parser to improve performance.
|
2021-04-07 02:05:15 +08:00
|
|
|
void macho::parseLCLinkerOption(InputFile *f, unsigned argc, StringRef data) {
|
2021-11-05 10:53:18 +08:00
|
|
|
SmallVector<StringRef, 4> argv;
|
2020-12-04 05:40:04 +08:00
|
|
|
size_t offset = 0;
|
|
|
|
for (unsigned i = 0; i < argc && offset < data.size(); ++i) {
|
|
|
|
argv.push_back(data.data() + offset);
|
|
|
|
offset += strlen(data.data() + offset) + 1;
|
|
|
|
}
|
|
|
|
if (argv.size() != argc || offset > data.size())
|
|
|
|
fatal(toString(f) + ": invalid LC_LINKER_OPTION");
|
|
|
|
|
2021-11-05 10:53:18 +08:00
|
|
|
unsigned i = 0;
|
|
|
|
StringRef arg = argv[i];
|
|
|
|
if (arg.consume_front("-l")) {
|
|
|
|
ForceLoad forceLoadArchive =
|
|
|
|
config->forceLoadSwift && arg.startswith("swift") ? ForceLoad::Yes
|
|
|
|
: ForceLoad::No;
|
|
|
|
addLibrary(arg, /*isNeeded=*/false, /*isWeak=*/false,
|
|
|
|
/*isReexport=*/false, /*isExplicit=*/false, forceLoadArchive);
|
|
|
|
} else if (arg == "-framework") {
|
|
|
|
StringRef name = argv[++i];
|
|
|
|
addFramework(name, /*isNeeded=*/false, /*isWeak=*/false,
|
|
|
|
/*isReexport=*/false, /*isExplicit=*/false, ForceLoad::No);
|
|
|
|
} else {
|
|
|
|
error(arg + " is not allowed in LC_LINKER_OPTION");
|
2020-12-04 05:40:04 +08:00
|
|
|
}
|
|
|
|
}
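The first half of parseLCLinkerOption() splits a buffer of `argc` NUL-terminated strings. A standalone sketch of that step follows; `splitLinkerOptions` is a hypothetical name, and unlike the `strlen`-based loop above it searches for the terminator inside the buffer only, so an unterminated final string is rejected rather than read past.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits `data` into exactly `argc` NUL-terminated strings. Returns false if
// the buffer runs out early or a string lacks its terminator.
bool splitLinkerOptions(const std::string &data, unsigned argc,
                        std::vector<std::string> &argv) {
  argv.clear();
  std::size_t offset = 0;
  for (unsigned i = 0; i < argc && offset < data.size(); ++i) {
    std::size_t end = data.find('\0', offset);
    if (end == std::string::npos)
      return false; // unterminated final string
    argv.push_back(data.substr(offset, end - offset));
    offset = end + 1; // skip the terminator
  }
  return argv.size() == argc;
}
```

A caller would still want to confirm that the result is non-empty before reading the first argument, and that a value follows `-framework` before reading it.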
|
|
|
|
|
2022-01-20 02:14:49 +08:00
|
|
|
static void addFileList(StringRef path, bool isLazy) {
|
2021-03-02 17:20:22 +08:00
|
|
|
Optional<MemoryBufferRef> buffer = readFile(path);
|
2020-07-29 00:56:50 +08:00
|
|
|
if (!buffer)
|
|
|
|
return;
|
|
|
|
MemoryBufferRef mbref = *buffer;
|
|
|
|
for (StringRef path : args::getLines(mbref))
|
2022-01-20 02:14:49 +08:00
|
|
|
addFile(rerootPath(path), ForceLoad::Default, isLazy);
|
2020-07-29 00:56:50 +08:00
|
|
|
}
|
|
|
|
|
2020-05-06 07:37:34 +08:00
|
|
|
// An order file has one entry per line, in the following format:
|
|
|
|
//
|
2020-12-19 01:47:15 +08:00
|
|
|
// <cpu>:<object file>:<symbol name>
|
2020-05-06 07:37:34 +08:00
|
|
|
//
|
2020-12-19 01:47:15 +08:00
|
|
|
// <cpu> and <object file> are optional. If not specified, then that entry
|
|
|
|
// matches any symbol of that name. Parsing this format is not quite
|
|
|
|
// straightforward because the symbol name itself can contain colons, so when
|
|
|
|
// encountering a colon, we consider the preceding characters to decide if it
|
|
|
|
// can be a valid CPU type or file path.
|
2020-05-06 07:37:34 +08:00
|
|
|
//
|
|
|
|
// If a symbol is matched by multiple entries, then it takes the lowest-ordered
|
|
|
|
// entry (the one nearest to the front of the list).
|
|
|
|
//
|
|
|
|
// The file can also have line comments that start with '#'.
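The entry format described above can be illustrated with a small parser. This is a sketch under stated assumptions, not the parser lld uses: `OrderEntry` and `parseOrderLine` are hypothetical names, the CPU list is a made-up subset, and an object-file prefix is recognised only by a `.o:` suffix.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct OrderEntry {
  std::string cpu;        // empty when the entry has no CPU prefix
  std::string objectFile; // empty when the entry has no object-file prefix
  std::string symbol;
};

static std::string trim(const std::string &s) {
  const char *ws = " \t\r\n";
  std::size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos)
    return "";
  std::size_t e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

OrderEntry parseOrderLine(std::string line) {
  OrderEntry entry;
  line = trim(line.substr(0, line.find('#'))); // drop a trailing comment

  // Assumed CPU names; a colon alone is not enough because symbol names
  // (for example Objective-C selectors) may contain colons themselves.
  static const char *const cpus[] = {"i386", "x86_64", "arm", "arm64"};
  for (const char *cpu : cpus) {
    std::string prefix = std::string(cpu) + ":";
    if (line.compare(0, prefix.size(), prefix) == 0) {
      entry.cpu = cpu;
      line.erase(0, prefix.size());
      break;
    }
  }

  // Assumption: an object-file prefix ends in ".o" right before the colon.
  std::size_t pos = line.find(".o:");
  if (pos != std::string::npos) {
    entry.objectFile = line.substr(0, pos + 2);
    line.erase(0, pos + 3);
  }
  entry.symbol = trim(line);
  return entry;
}
```

Under these assumptions, `x86_64:foo.o:_main # hot path` yields CPU `x86_64`, object file `foo.o`, and symbol `_main`, while `-[Foo bar:baz:]` is kept whole as a symbol name.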
|
2020-04-24 11:16:49 +08:00
|
|
|
// We expect sub-library names of the form "libfoo", which will match a dylib
|
2020-08-19 06:46:21 +08:00
|
|
|
// with a path of .*/libfoo.{dylib, tbd}.
|
|
|
|
// XXX ld64 seems to ignore the extension entirely when matching sub-libraries;
|
|
|
|
// I'm not sure what the use case for that is.
|
2020-12-15 11:55:28 +08:00
|
|
|
static bool markReexport(StringRef searchName, ArrayRef<StringRef> extensions) {
|
2020-04-24 11:16:49 +08:00
|
|
|
for (InputFile *file : inputFiles) {
|
|
|
|
if (auto *dylibFile = dyn_cast<DylibFile>(file)) {
|
|
|
|
StringRef filename = path::filename(dylibFile->getName());
|
2020-08-19 06:46:21 +08:00
|
|
|
if (filename.consume_front(searchName) &&
|
2020-12-15 11:55:28 +08:00
|
|
|
(filename.empty() ||
|
|
|
|
find(extensions, filename) != extensions.end())) {
|
2020-04-24 11:16:49 +08:00
|
|
|
dylibFile->reexport = true;
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return false;
|
|
|
|
}
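// Illustration only: a minimal standalone sketch of the filename test that
// markReexport() performs, written against the standard library instead of
// llvm::StringRef. The name matchesSubLibrary and the extension list passed
// in the usage note are assumptions made for this example; the real list is
// supplied by the caller of markReexport().

```cpp
#include <algorithm>
#include <string_view>
#include <vector>

// Hypothetical helper: true if `filename` is `searchName` followed either by
// nothing or by exactly one of the accepted extensions.
bool matchesSubLibrary(std::string_view filename, std::string_view searchName,
                       const std::vector<std::string_view> &extensions) {
  // Stand-in for StringRef::consume_front: require the prefix, then strip it.
  if (filename.substr(0, searchName.size()) != searchName)
    return false;
  filename.remove_prefix(searchName.size());
  // What remains must be empty or a known extension such as ".dylib".
  return filename.empty() ||
         std::find(extensions.begin(), extensions.end(), filename) !=
             extensions.end();
}
```

// With extensions {".dylib", ".tbd"}, "libfoo.dylib" and "libfoo" match the
// search name "libfoo", while "libfoobar.dylib" does not, because the
// leftover "bar.dylib" is not an accepted extension.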
|
|
|
|
|
2020-10-27 10:18:29 +08:00
|
|
|
// This function is called on startup. We need this for LTO since
|
|
|
|
// LTO calls LLVM functions to compile bitcode files to native code.
|
|
|
|
// Technically this can be delayed until we read bitcode files, but
|
|
|
|
// we don't bother doing it lazily because the initialization is fast.
|
|
|
|
static void initLLVM() {
|
|
|
|
InitializeAllTargets();
|
|
|
|
InitializeAllTargetMCs();
|
|
|
|
InitializeAllAsmPrinters();
|
|
|
|
InitializeAllAsmParsers();
|
|
|
|
}
|
|
|
|
|
|
|
|
static void compileBitcodeFiles() {
|
2021-07-06 12:25:01 +08:00
|
|
|
// FIXME: Remove this once LTO.cpp honors config->exportDynamic.
|
|
|
|
if (config->exportDynamic)
|
|
|
|
for (InputFile *file : inputFiles)
|
2021-07-08 15:46:30 +08:00
|
|
|
if (isa<BitcodeFile>(file)) {
|
2021-07-06 12:25:01 +08:00
|
|
|
warn("the effect of -export_dynamic on LTO is not yet implemented");
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2021-03-26 02:39:44 +08:00
|
|
|
TimeTraceScope timeScope("LTO");
|
2021-03-10 12:15:29 +08:00
|
|
|
auto *lto = make<BitcodeCompiler>();
|
2020-10-27 10:18:29 +08:00
|
|
|
for (InputFile *file : inputFiles)
|
|
|
|
if (auto *bitcodeFile = dyn_cast<BitcodeFile>(file))
|
2022-01-20 02:14:49 +08:00
|
|
|
if (!file->lazy)
|
|
|
|
lto->add(*bitcodeFile);
|
2020-10-27 10:18:29 +08:00
|
|
|
|
|
|
|
for (ObjFile *file : lto->compile())
|
2020-12-15 06:59:22 +08:00
|
|
|
inputFiles.insert(file);
|
2020-10-27 10:18:29 +08:00
|
|
|
}
|
|
|
|
|
2020-09-25 05:44:14 +08:00
|
|
|
// Replaces common symbols with defined symbols residing in __common sections.
|
|
|
|
// This function must be called after all symbol names are resolved (i.e. after
|
|
|
|
// all InputFiles have been loaded.) As a result, later operations won't see
|
|
|
|
// any CommonSymbols.
|
|
|
|
static void replaceCommonSymbols() {
|
2021-03-26 02:39:44 +08:00
|
|
|
TimeTraceScope timeScope("Replace common symbols");
|
[lld-macho] Have ICF operate on all sections at once
ICF previously operated only within a given OutputSection. We would
merge all CFStrings first, then merge all regular code sections in a
second phase. This worked fine since CFStrings would never reference
regular `__text` sections. However, I would like to expand ICF to merge
functions that reference unwind info. Unwind info references the LSDA
section, which can in turn reference the `__text` section, so we cannot
perform ICF in phases.
In order to have ICF operate on InputSections spanning multiple
OutputSections, we need a way to distinguish InputSections that are
destined for different OutputSections, so that we don't fold across
section boundaries. We achieve this by creating OutputSections early,
and setting `InputSection::parent` to point to them. This is what
LLD-ELF does. (This change should also make it easier to implement the
`section$start$` symbols.)
This diff also folds InputSections w/o checking their flags, which I
think is the right behavior -- if they are destined for the same
OutputSection, they will have the same flags in the output (even if
their input flags differ). I.e. the `parent` pointer check subsumes the
`flags` check. In practice this has nearly no effect (ICF did not become
any more effective on chromium_framework).
I've also updated ICF.cpp's block comment to better reflect its current
status.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D105641
2021-07-18 01:42:26 +08:00
|
|
|
ConcatOutputSection *osec = nullptr;
|
2021-03-30 08:19:29 +08:00
|
|
|
for (Symbol *sym : symtab->getSymbols()) {
|
2020-09-25 05:44:14 +08:00
|
|
|
auto *common = dyn_cast<CommonSymbol>(sym);
|
|
|
|
if (common == nullptr)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
// Casting to size_t will truncate large values on 32-bit architectures,
|
|
|
|
// but it's not really worth supporting the linking of 64-bit programs on
|
|
|
|
// 32-bit archs.
|
2021-07-02 08:33:55 +08:00
|
|
|
ArrayRef<uint8_t> data = {nullptr, static_cast<size_t>(common->size)};
|
|
|
|
auto *isec = make<ConcatInputSection>(
|
|
|
|
segment_names::data, section_names::common, common->getFile(), data,
|
|
|
|
common->align, S_ZEROFILL);
|
2021-07-18 01:42:26 +08:00
|
|
|
if (!osec)
|
|
|
|
osec = ConcatOutputSection::getOrCreateForInput(isec);
|
|
|
|
isec->parent = osec;
|
2020-09-25 05:44:14 +08:00
|
|
|
inputSections.push_back(isec);
|
|
|
|
|
[lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.
Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).
Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:
% ministat lld_*
x lld_nostrip.txt
+ lld_strip.txt
N Min Max Median Avg Stddev
x 10 3.929414 4.07692 4.0269079 4.0089678 0.044214794
+ 10 3.8129408 3.9025559 3.8670411 3.8642573 0.024779651
Difference at 95.0% confidence
-0.144711 +/- 0.0336749
-3.60967% +/- 0.839989%
(Student's t, pooled s = 0.0358398)
This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols
It's possible it interacts with more features I didn't think of,
of course.
I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests
Implementation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.
Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
(`.no_dead_strip` in asm)
Fixes PR49276.
Differential Revision: https://reviews.llvm.org/D103324
2021-05-08 05:10:05 +08:00
|
|
|
// FIXME: CommonSymbol should store isReferencedDynamically, noDeadStrip
|
|
|
|
// and pass them on here.
|
2021-07-02 08:33:55 +08:00
|
|
|
replaceSymbol<Defined>(sym, sym->getName(), isec->getFile(), isec,
|
|
|
|
/*value=*/0,
|
2021-04-02 08:48:09 +08:00
|
|
|
/*size=*/0,
|
2020-09-25 05:44:14 +08:00
|
|
|
/*isWeakDef=*/false,
|
2021-05-01 04:17:26 +08:00
|
|
|
/*isExternal=*/true, common->privateExtern,
|
2021-05-17 21:15:39 +08:00
|
|
|
/*isThumb=*/false,
|
2021-05-08 05:10:05 +08:00
|
|
|
/*isReferencedDynamically=*/false,
|
|
|
|
/*noDeadStrip=*/false);
|
2020-09-25 05:44:14 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-04-26 07:00:24 +08:00
|
|
|
static void initializeSectionRenameMap() {
|
|
|
|
if (config->dataConst) {
|
|
|
|
SmallVector<StringRef> v{section_names::got,
|
|
|
|
section_names::authGot,
|
|
|
|
section_names::authPtr,
|
|
|
|
section_names::nonLazySymbolPtr,
|
|
|
|
section_names::const_,
|
|
|
|
section_names::cfString,
|
|
|
|
section_names::moduleInitFunc,
|
|
|
|
section_names::moduleTermFunc,
|
|
|
|
section_names::objcClassList,
|
|
|
|
section_names::objcNonLazyClassList,
|
|
|
|
section_names::objcCatList,
|
|
|
|
section_names::objcNonLazyCatList,
|
|
|
|
section_names::objcProtoList,
|
|
|
|
section_names::objcImageInfo};
|
|
|
|
for (StringRef s : v)
|
|
|
|
config->sectionRenameMap[{segment_names::data, s}] = {
|
|
|
|
segment_names::dataConst, s};
|
|
|
|
}
|
|
|
|
config->sectionRenameMap[{segment_names::text, section_names::staticInit}] = {
|
|
|
|
segment_names::text, section_names::text};
|
|
|
|
config->sectionRenameMap[{segment_names::import, section_names::pointers}] = {
|
|
|
|
config->dataConst ? segment_names::dataConst : segment_names::data,
|
|
|
|
section_names::nonLazySymbolPtr};
|
|
|
|
}
|
|
|
|
|
2020-08-11 09:47:16 +08:00
|
|
|
static inline char toLowerDash(char x) {
|
|
|
|
if (x >= 'A' && x <= 'Z')
|
|
|
|
return x - 'A' + 'a';
|
|
|
|
else if (x == ' ')
|
|
|
|
return '-';
|
|
|
|
return x;
|
|
|
|
}
|
|
|
|
|
|
|
|
static std::string lowerDash(StringRef s) {
|
|
|
|
return std::string(map_iterator(s.begin(), toLowerDash),
|
|
|
|
map_iterator(s.end(), toLowerDash));
|
|
|
|
}
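// Illustration only: a standalone sketch of the normalization done by
// toLowerDash()/lowerDash(), using std::transform in place of
// llvm::map_iterator. The *Sketch names are hypothetical. It shows why a
// platform string such as "Mac Catalyst" can be looked up under the key
// "mac-catalyst".

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Same per-character mapping as toLowerDash(): ASCII upper case becomes
// lower case, a space becomes a dash, everything else is left unchanged.
char toLowerDashSketch(char x) {
  if (x >= 'A' && x <= 'Z')
    return static_cast<char>(x - 'A' + 'a');
  if (x == ' ')
    return '-';
  return x;
}

// Applies the mapping to every character and returns a new string.
std::string lowerDashSketch(std::string_view s) {
  std::string out(s.size(), '\0');
  std::transform(s.begin(), s.end(), out.begin(), toLowerDashSketch);
  return out;
}
```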
|
|
|
|
|
2021-03-05 03:36:47 +08:00
|
|
|
// Has the side-effect of setting Config::platformInfo.
|
2022-01-13 06:01:59 +08:00
|
|
|
static PlatformType parsePlatformVersion(const ArgList &args) {
|
2021-03-10 12:40:08 +08:00
|
|
|
const Arg *arg = args.getLastArg(OPT_platform_version);
|
2021-03-04 04:52:10 +08:00
|
|
|
if (!arg) {
|
|
|
|
error("must specify -platform_version");
|
2022-01-13 06:01:59 +08:00
|
|
|
return PLATFORM_UNKNOWN;
|
2021-03-04 04:52:10 +08:00
|
|
|
}
|
2021-03-04 04:52:06 +08:00
|
|
|
|
2020-08-11 09:47:16 +08:00
|
|
|
StringRef platformStr = arg->getValue(0);
|
|
|
|
StringRef minVersionStr = arg->getValue(1);
|
|
|
|
StringRef sdkVersionStr = arg->getValue(2);
|
|
|
|
|
|
|
|
// TODO(compnerd) see if we can generate this case list via XMACROS
|
2022-01-13 06:01:59 +08:00
|
|
|
PlatformType platform =
|
|
|
|
StringSwitch<PlatformType>(lowerDash(platformStr))
|
|
|
|
.Cases("macos", "1", PLATFORM_MACOS)
|
|
|
|
.Cases("ios", "2", PLATFORM_IOS)
|
|
|
|
.Cases("tvos", "3", PLATFORM_TVOS)
|
|
|
|
.Cases("watchos", "4", PLATFORM_WATCHOS)
|
|
|
|
.Cases("bridgeos", "5", PLATFORM_BRIDGEOS)
|
|
|
|
.Cases("mac-catalyst", "6", PLATFORM_MACCATALYST)
|
|
|
|
.Cases("ios-simulator", "7", PLATFORM_IOSSIMULATOR)
|
|
|
|
.Cases("tvos-simulator", "8", PLATFORM_TVOSSIMULATOR)
|
|
|
|
.Cases("watchos-simulator", "9", PLATFORM_WATCHOSSIMULATOR)
|
|
|
|
.Cases("driverkit", "10", PLATFORM_DRIVERKIT)
|
|
|
|
.Default(PLATFORM_UNKNOWN);
|
|
|
|
if (platform == PLATFORM_UNKNOWN)
|
2020-08-11 09:47:16 +08:00
|
|
|
error(Twine("malformed platform: ") + platformStr);
|
|
|
|
// TODO: check validity of version strings, which varies by platform
|
|
|
|
// NOTE: ld64 accepts version strings with 5 components
|
|
|
|
// llvm::VersionTuple accepts no more than 4 components
|
|
|
|
// Has Apple ever published version strings with 5 components?
|
2021-03-05 03:36:47 +08:00
|
|
|
if (config->platformInfo.minimum.tryParse(minVersionStr))
|
2020-08-11 09:47:16 +08:00
|
|
|
error(Twine("malformed minimum version: ") + minVersionStr);
|
2021-03-05 03:36:47 +08:00
|
|
|
if (config->platformInfo.sdk.tryParse(sdkVersionStr))
|
2020-08-11 09:47:16 +08:00
|
|
|
error(Twine("malformed sdk version: ") + sdkVersionStr);
|
2021-03-04 04:52:06 +08:00
|
|
|
return platform;
|
2020-06-16 03:36:32 +08:00
|
|
|
}
|
|
|
|
|
2021-03-05 03:36:47 +08:00
|
|
|
// Has the side-effect of setting Config::target.
|
2021-03-10 12:40:08 +08:00
|
|
|
static TargetInfo *createTargetInfo(InputArgList &args) {
|
2021-03-05 03:36:47 +08:00
|
|
|
StringRef archName = args.getLastArgValue(OPT_arch);
|
2021-10-31 07:45:06 +08:00
|
|
|
if (archName.empty()) {
|
|
|
|
error("must specify -arch");
|
|
|
|
return nullptr;
|
|
|
|
}
|
2021-03-05 03:36:47 +08:00
|
|
|
|
2022-01-13 06:01:59 +08:00
|
|
|
PlatformType platform = parsePlatformVersion(args);
|
2021-04-21 20:41:14 +08:00
|
|
|
config->platformInfo.target =
|
|
|
|
MachO::Target(getArchitectureFromName(archName), platform);
|
2021-03-05 03:36:47 +08:00
|
|
|
|
2021-05-01 04:17:25 +08:00
|
|
|
uint32_t cpuType;
|
|
|
|
uint32_t cpuSubtype;
|
|
|
|
std::tie(cpuType, cpuSubtype) = getCPUTypeFromArchitecture(config->arch());
|
|
|
|
|
|
|
|
switch (cpuType) {
|
2021-03-12 02:28:08 +08:00
|
|
|
case CPU_TYPE_X86_64:
|
2021-03-05 03:36:47 +08:00
|
|
|
return createX86_64TargetInfo();
|
2021-03-12 02:28:08 +08:00
|
|
|
case CPU_TYPE_ARM64:
|
2021-03-05 03:36:47 +08:00
|
|
|
return createARM64TargetInfo();
|
2021-04-16 09:14:32 +08:00
|
|
|
case CPU_TYPE_ARM64_32:
|
|
|
|
return createARM64_32TargetInfo();
|
2021-05-01 04:17:25 +08:00
|
|
|
case CPU_TYPE_ARM:
|
|
|
|
return createARMTargetInfo(cpuSubtype);
|
2021-03-05 03:36:47 +08:00
|
|
|
default:
|
2021-10-31 07:45:06 +08:00
|
|
|
error("missing or unsupported -arch " + archName);
|
|
|
|
return nullptr;
|
2021-03-05 03:36:47 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-03-04 04:52:06 +08:00
|
|
|
static UndefinedSymbolTreatment
|
2021-03-10 12:40:08 +08:00
|
|
|
getUndefinedSymbolTreatment(const ArgList &args) {
|
2021-03-04 04:52:06 +08:00
|
|
|
StringRef treatmentStr = args.getLastArgValue(OPT_undefined);
|
2021-02-18 21:48:07 +08:00
|
|
|
auto treatment =
|
2021-01-10 00:58:19 +08:00
|
|
|
StringSwitch<UndefinedSymbolTreatment>(treatmentStr)
|
2021-03-04 04:52:06 +08:00
|
|
|
.Cases("error", "", UndefinedSymbolTreatment::error)
|
2020-12-14 11:31:33 +08:00
|
|
|
.Case("warning", UndefinedSymbolTreatment::warning)
|
|
|
|
.Case("suppress", UndefinedSymbolTreatment::suppress)
|
|
|
|
.Case("dynamic_lookup", UndefinedSymbolTreatment::dynamic_lookup)
|
|
|
|
.Default(UndefinedSymbolTreatment::unknown);
|
2021-02-18 21:48:07 +08:00
|
|
|
if (treatment == UndefinedSymbolTreatment::unknown) {
|
2020-12-14 11:31:33 +08:00
|
|
|
warn(Twine("unknown -undefined TREATMENT '") + treatmentStr +
|
|
|
|
"', defaulting to 'error'");
|
2021-02-18 21:48:07 +08:00
|
|
|
treatment = UndefinedSymbolTreatment::error;
|
|
|
|
} else if (config->namespaceKind == NamespaceKind::twolevel &&
|
|
|
|
(treatment == UndefinedSymbolTreatment::warning ||
|
|
|
|
treatment == UndefinedSymbolTreatment::suppress)) {
|
|
|
|
if (treatment == UndefinedSymbolTreatment::warning)
|
|
|
|
error("'-undefined warning' only valid with '-flat_namespace'");
|
|
|
|
else
|
|
|
|
error("'-undefined suppress' only valid with '-flat_namespace'");
|
|
|
|
treatment = UndefinedSymbolTreatment::error;
|
2020-12-14 11:31:33 +08:00
|
|
|
}
|
2021-03-04 04:52:06 +08:00
|
|
|
return treatment;
|
2020-12-14 11:31:33 +08:00
|
|
|
}
|
|
|
|
|
2021-05-20 00:58:17 +08:00
|
|
|
static ICFLevel getICFLevel(const ArgList &args) {
|
2021-06-19 00:38:50 +08:00
|
|
|
StringRef icfLevelStr = args.getLastArgValue(OPT_icf_eq);
|
2021-05-20 00:58:17 +08:00
|
|
|
auto icfLevel = StringSwitch<ICFLevel>(icfLevelStr)
|
|
|
|
.Cases("none", "", ICFLevel::none)
|
|
|
|
.Case("safe", ICFLevel::safe)
|
|
|
|
.Case("all", ICFLevel::all)
|
|
|
|
.Default(ICFLevel::unknown);
|
|
|
|
if (icfLevel == ICFLevel::unknown) {
|
2021-06-19 00:38:50 +08:00
|
|
|
warn(Twine("unknown --icf=OPTION `") + icfLevelStr +
|
2021-05-20 00:58:17 +08:00
|
|
|
"', defaulting to `none'");
|
|
|
|
icfLevel = ICFLevel::none;
|
|
|
|
} else if (icfLevel == ICFLevel::safe) {
|
2021-06-19 00:38:50 +08:00
|
|
|
warn(Twine("`--icf=safe' is not yet implemented, reverting to `none'"));
|
2021-05-20 00:58:17 +08:00
|
|
|
icfLevel = ICFLevel::none;
|
|
|
|
}
|
|
|
|
return icfLevel;
|
|
|
|
}
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
static void warnIfDeprecatedOption(const Option &opt) {
|
2020-06-16 03:36:32 +08:00
|
|
|
if (!opt.getGroup().isValid())
|
|
|
|
return;
|
|
|
|
if (opt.getGroup().getID() == OPT_grp_deprecated) {
|
|
|
|
warn("Option `" + opt.getPrefixedName() + "' is deprecated in ld64:");
|
|
|
|
warn(opt.getHelpText());
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
static void warnIfUnimplementedOption(const Option &opt) {
|
2020-09-23 23:53:37 +08:00
|
|
|
if (!opt.getGroup().isValid() || !opt.hasFlag(DriverFlag::HelpHidden))
|
2020-06-16 03:36:32 +08:00
|
|
|
return;
|
|
|
|
switch (opt.getGroup().getID()) {
|
|
|
|
case OPT_grp_deprecated:
|
|
|
|
// warn about deprecated options elsewhere
|
|
|
|
break;
|
|
|
|
case OPT_grp_undocumented:
|
|
|
|
warn("Option `" + opt.getPrefixedName() +
|
|
|
|
"' is undocumented. Should lld implement it?");
|
|
|
|
break;
|
|
|
|
case OPT_grp_obsolete:
|
|
|
|
warn("Option `" + opt.getPrefixedName() +
|
|
|
|
"' is obsolete. Please modernize your usage.");
|
|
|
|
break;
|
|
|
|
case OPT_grp_ignored:
|
|
|
|
warn("Option `" + opt.getPrefixedName() + "' is ignored.");
|
|
|
|
break;
|
2021-11-04 12:47:49 +08:00
|
|
|
case OPT_grp_ignored_silently:
|
|
|
|
break;
|
2020-06-16 03:36:32 +08:00
|
|
|
default:
|
|
|
|
warn("Option `" + opt.getPrefixedName() +
|
|
|
|
"' is not yet implemented. Stay tuned...");
|
|
|
|
break;
|
2020-05-13 02:02:13 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
static const char *getReproduceOption(InputArgList &args) {
|
2021-03-10 12:15:29 +08:00
|
|
|
if (const Arg *arg = args.getLastArg(OPT_reproduce))
|
2020-11-29 11:38:27 +08:00
|
|
|
return arg->getValue();
|
|
|
|
return getenv("LLD_REPRODUCE");
|
|
|
|
}
|
|
|
|
|
2020-12-08 21:08:56 +08:00
|
|
|
static void parseClangOption(StringRef opt, const Twine &msg) {
|
|
|
|
std::string err;
|
|
|
|
raw_string_ostream os(err);
|
|
|
|
|
|
|
|
const char *argv[] = {"lld", opt.data()};
|
|
|
|
if (cl::ParseCommandLineOptions(2, argv, "", &os))
|
|
|
|
return;
|
|
|
|
os.flush();
|
|
|
|
error(msg + ": " + StringRef(err).trim());
|
|
|
|
}
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
static uint32_t parseDylibVersion(const ArgList &args, unsigned id) {
|
|
|
|
const Arg *arg = args.getLastArg(id);
|
2020-12-15 07:24:50 +08:00
|
|
|
if (!arg)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
if (config->outputType != MH_DYLIB) {
|
|
|
|
error(arg->getAsString(args) + ": only valid with -dylib");
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2020-12-16 05:37:37 +08:00
|
|
|
PackedVersion version;
|
|
|
|
if (!version.parse32(arg->getValue())) {
|
2020-12-15 07:24:50 +08:00
|
|
|
error(arg->getAsString(args) + ": malformed version");
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2020-12-16 05:37:37 +08:00
|
|
|
return version.rawValue();
|
2020-12-15 07:24:50 +08:00
|
|
|
}
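// Illustration only: a rough standalone approximation of what a packed
// 32-bit dylib version looks like, assuming the usual Mach-O X.Y.Z layout
// (16 bits for X, 8 bits each for Y and Z). packVersionSketch is a
// hypothetical name; the real parsing is done by PackedVersion::parse32 and
// may accept or reject edge cases differently from this sketch.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parses "X[.Y[.Z]]" and packs it as (X << 16) | (Y << 8) | Z.
// Returns std::nullopt on malformed input or out-of-range components.
std::optional<uint32_t> packVersionSketch(std::string_view str) {
  uint32_t parts[3] = {0, 0, 0};
  const uint32_t limits[3] = {0xFFFF, 0xFF, 0xFF};
  for (int i = 0; i < 3; ++i) {
    std::size_t dot = str.find('.');
    std::string_view field = str.substr(0, dot);
    const char *first = field.data();
    const char *last = first + field.size();
    auto [ptr, ec] = std::from_chars(first, last, parts[i], 10);
    // Reject empty fields, trailing junk, and values that do not fit.
    if (ec != std::errc() || ptr != last || parts[i] > limits[i])
      return std::nullopt;
    if (dot == std::string_view::npos)
      return (parts[0] << 16) | (parts[1] << 8) | parts[2];
    str.remove_prefix(dot + 1);
  }
  return std::nullopt; // more than three components
}
```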
|
|
|
|
|
2021-03-30 02:08:12 +08:00
|
|
|
static uint32_t parseProtection(StringRef protStr) {
|
|
|
|
uint32_t prot = 0;
|
|
|
|
for (char c : protStr) {
|
|
|
|
switch (c) {
|
|
|
|
case 'r':
|
|
|
|
prot |= VM_PROT_READ;
|
|
|
|
break;
|
|
|
|
case 'w':
|
|
|
|
prot |= VM_PROT_WRITE;
|
|
|
|
break;
|
|
|
|
case 'x':
|
|
|
|
prot |= VM_PROT_EXECUTE;
|
|
|
|
break;
|
|
|
|
case '-':
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
error("unknown -segprot letter '" + Twine(c) + "' in " + protStr);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return prot;
|
|
|
|
}
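// Illustration only: a standalone sketch of the -segprot letter parsing in
// parseProtection(). The kProt* constants are local copies of the Mach
// VM_PROT_* bit values, and the function reports failure via std::optional
// instead of calling error(); both choices are assumptions made so that the
// example compiles without the Mach or lld headers.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Local stand-ins for VM_PROT_READ, VM_PROT_WRITE and VM_PROT_EXECUTE.
constexpr uint32_t kProtRead = 0x1;
constexpr uint32_t kProtWrite = 0x2;
constexpr uint32_t kProtExecute = 0x4;

// Folds a string such as "r-x" into a protection bitmask. A '-' is a
// placeholder that sets no bit; any other unknown letter is an error.
std::optional<uint32_t> parseProtectionSketch(std::string_view protStr) {
  uint32_t prot = 0;
  for (char c : protStr) {
    switch (c) {
    case 'r':
      prot |= kProtRead;
      break;
    case 'w':
      prot |= kProtWrite;
      break;
    case 'x':
      prot |= kProtExecute;
      break;
    case '-':
      break;
    default:
      return std::nullopt;
    }
  }
  return prot;
}
```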
|
|
|
|
|
2021-05-11 23:43:48 +08:00
|
|
|
static std::vector<SectionAlign> parseSectAlign(const opt::InputArgList &args) {
|
|
|
|
std::vector<SectionAlign> sectAligns;
|
|
|
|
for (const Arg *arg : args.filtered(OPT_sectalign)) {
|
|
|
|
StringRef segName = arg->getValue(0);
|
|
|
|
StringRef sectName = arg->getValue(1);
|
|
|
|
StringRef alignStr = arg->getValue(2);
|
|
|
|
if (alignStr.startswith("0x") || alignStr.startswith("0X"))
|
|
|
|
alignStr = alignStr.drop_front(2);
|
|
|
|
uint32_t align;
|
|
|
|
if (alignStr.getAsInteger(16, align)) {
|
|
|
|
error("-sectalign: failed to parse '" + StringRef(arg->getValue(2)) +
|
|
|
|
"' as number");
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
if (!isPowerOf2_32(align)) {
|
|
|
|
error("-sectalign: '" + StringRef(arg->getValue(2)) +
|
|
|
|
"' (in base 16) not a power of two");
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
sectAligns.push_back({segName, sectName, align});
|
|
|
|
}
|
|
|
|
return sectAligns;
|
|
|
|
}
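// Illustration only: a standalone sketch of how parseSectAlign() validates
// one alignment value. The value is always read as hexadecimal, with or
// without a 0x prefix, and must be a power of two. parseSectAlignValue is a
// hypothetical name, and std::from_chars stands in for
// StringRef::getAsInteger.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Returns the alignment, or std::nullopt if the text is not a hexadecimal
// number or the number is not a power of two.
std::optional<uint32_t> parseSectAlignValue(std::string_view alignStr) {
  // Drop an optional "0x" / "0X" prefix.
  if (alignStr.substr(0, 2) == "0x" || alignStr.substr(0, 2) == "0X")
    alignStr.remove_prefix(2);
  uint32_t align = 0;
  const char *first = alignStr.data();
  const char *last = first + alignStr.size();
  auto [ptr, ec] = std::from_chars(first, last, align, 16);
  if (ec != std::errc() || ptr != last)
    return std::nullopt;
  // A power of two has exactly one bit set; zero is rejected as well.
  if (align == 0 || (align & (align - 1)) != 0)
    return std::nullopt;
  return align;
}
```

// Note that "10" means sixteen here, not ten, because the input is base 16.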
|
|
|
|
|
2022-01-13 06:01:59 +08:00
|
|
|
PlatformType macho::removeSimulator(PlatformType platform) {
|
2021-07-12 06:24:53 +08:00
|
|
|
switch (platform) {
|
2022-01-13 06:01:59 +08:00
|
|
|
case PLATFORM_IOSSIMULATOR:
|
|
|
|
return PLATFORM_IOS;
|
|
|
|
case PLATFORM_TVOSSIMULATOR:
|
|
|
|
return PLATFORM_TVOS;
|
|
|
|
case PLATFORM_WATCHOSSIMULATOR:
|
|
|
|
return PLATFORM_WATCHOS;
|
2021-07-12 06:24:53 +08:00
|
|
|
default:
|
|
|
|
return platform;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-04-26 07:00:24 +08:00
|
|
|
static bool dataConstDefault(const InputArgList &args) {
|
2022-01-13 06:01:59 +08:00
|
|
|
static const std::vector<std::pair<PlatformType, VersionTuple>> minVersion = {
|
|
|
|
{PLATFORM_MACOS, VersionTuple(10, 15)},
|
|
|
|
{PLATFORM_IOS, VersionTuple(13, 0)},
|
|
|
|
{PLATFORM_TVOS, VersionTuple(13, 0)},
|
|
|
|
{PLATFORM_WATCHOS, VersionTuple(6, 0)},
|
|
|
|
{PLATFORM_BRIDGEOS, VersionTuple(4, 0)}};
|
|
|
|
PlatformType platform = removeSimulator(config->platformInfo.target.Platform);
|
2021-07-12 06:24:53 +08:00
|
|
|
auto it = llvm::find_if(minVersion,
|
|
|
|
[&](const auto &p) { return p.first == platform; });
|
2021-07-01 06:55:38 +08:00
|
|
|
if (it != minVersion.end())
|
|
|
|
if (config->platformInfo.minimum < it->second)
|
|
|
|
return false;
|
|
|
|
|
2021-04-26 07:00:24 +08:00
|
|
|
switch (config->outputType) {
|
|
|
|
case MH_EXECUTE:
|
|
|
|
return !args.hasArg(OPT_no_pie);
|
|
|
|
case MH_BUNDLE:
|
|
|
|
// FIXME: return false when -final_name ...
|
|
|
|
// has prefix "/System/Library/UserEventPlugins/"
|
|
|
|
// or matches "/usr/libexec/locationd" "/usr/libexec/terminusd"
|
|
|
|
return true;
|
|
|
|
case MH_DYLIB:
|
|
|
|
return true;
|
|
|
|
case MH_OBJECT:
|
|
|
|
return false;
|
|
|
|
default:
|
|
|
|
llvm_unreachable(
|
|
|
|
"unsupported output type for determining data-const default");
|
|
|
|
}
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2021-03-04 04:15:09 +08:00
|
|
|
void SymbolPatterns::clear() {
|
|
|
|
literals.clear();
|
|
|
|
globs.clear();
|
|
|
|
}
|
|
|
|
|
|
|
|
void SymbolPatterns::insert(StringRef symbolName) {
|
|
|
|
if (symbolName.find_first_of("*?[]") == StringRef::npos)
|
|
|
|
literals.insert(CachedHashStringRef(symbolName));
|
|
|
|
else if (Expected<GlobPattern> pattern = GlobPattern::create(symbolName))
|
|
|
|
globs.emplace_back(*pattern);
|
|
|
|
else
|
|
|
|
error("invalid symbol-name pattern: " + symbolName);
|
|
|
|
}
|
|
|
|
|
|
|
|
bool SymbolPatterns::matchLiteral(StringRef symbolName) const {
|
|
|
|
return literals.contains(CachedHashStringRef(symbolName));
|
|
|
|
}
|
|
|
|
|
|
|
|
bool SymbolPatterns::matchGlob(StringRef symbolName) const {
|
2021-07-12 06:35:45 +08:00
|
|
|
for (const GlobPattern &glob : globs)
|
2021-03-04 04:15:09 +08:00
|
|
|
if (glob.match(symbolName))
|
|
|
|
return true;
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
bool SymbolPatterns::match(StringRef symbolName) const {
|
|
|
|
return matchLiteral(symbolName) || matchGlob(symbolName);
|
|
|
|
}
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
static void handleSymbolPatterns(InputArgList &args,
|
2021-03-04 04:15:09 +08:00
|
|
|
SymbolPatterns &symbolPatterns,
|
|
|
|
unsigned singleOptionCode,
|
|
|
|
unsigned listFileOptionCode) {
|
2021-03-10 12:15:29 +08:00
|
|
|
for (const Arg *arg : args.filtered(singleOptionCode))
|
2021-03-04 04:15:09 +08:00
|
|
|
symbolPatterns.insert(arg->getValue());
|
2021-03-10 12:15:29 +08:00
|
|
|
for (const Arg *arg : args.filtered(listFileOptionCode)) {
|
2021-03-04 04:15:09 +08:00
|
|
|
StringRef path = arg->getValue();
|
|
|
|
Optional<MemoryBufferRef> buffer = readFile(path);
|
|
|
|
if (!buffer) {
|
|
|
|
error("Could not read symbol file: " + path);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
MemoryBufferRef mbref = *buffer;
|
|
|
|
for (StringRef line : args::getLines(mbref)) {
|
|
|
|
line = line.take_until([](char c) { return c == '#'; }).trim();
|
|
|
|
if (!line.empty())
|
|
|
|
symbolPatterns.insert(line);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
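// Illustration only: a standalone sketch of the per-line handling that
// handleSymbolPatterns() applies to a symbol list file. Everything from '#'
// to the end of the line is dropped, surrounding whitespace is trimmed, and
// empty lines are skipped. parseSymbolListSketch is a hypothetical name, and
// the newline splitting only approximates args::getLines.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Returns the non-empty, comment-free entries of a symbol list.
std::vector<std::string> parseSymbolListSketch(std::string_view contents) {
  std::vector<std::string> patterns;
  const char *whitespace = " \t\r\n\v\f";
  while (!contents.empty()) {
    // Split off the next line.
    std::size_t eol = contents.find('\n');
    std::string_view line = contents.substr(0, eol);
    contents.remove_prefix(eol == std::string_view::npos ? contents.size()
                                                         : eol + 1);
    // Drop a trailing comment, then trim whitespace on both sides.
    line = line.substr(0, line.find('#'));
    std::size_t begin = line.find_first_not_of(whitespace);
    if (begin == std::string_view::npos)
      continue; // blank or comment-only line
    std::size_t end = line.find_last_not_of(whitespace);
    patterns.emplace_back(line.substr(begin, end - begin + 1));
  }
  return patterns;
}
```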
|
|
|
|
|
2021-10-29 13:14:36 +08:00
|
|
|
static void createFiles(const InputArgList &args) {
|
2021-03-26 02:39:44 +08:00
|
|
|
TimeTraceScope timeScope("Load input files");
|
|
|
|
// This loop should be reserved for options whose exact ordering matters.
|
|
|
|
// Other options should be handled via filtered() and/or getLastArg().
|
2022-01-20 02:14:49 +08:00
|
|
|
bool isLazy = false;
|
2021-03-26 02:39:44 +08:00
|
|
|
for (const Arg *arg : args) {
|
|
|
|
const Option &opt = arg->getOption();
|
|
|
|
warnIfDeprecatedOption(opt);
|
|
|
|
warnIfUnimplementedOption(opt);
|
|
|
|
|
|
|
|
switch (opt.getID()) {
|
|
|
|
case OPT_INPUT:
|
2022-01-20 02:14:49 +08:00
|
|
|
addFile(rerootPath(arg->getValue()), ForceLoad::Default, isLazy);
|
2021-03-26 02:39:44 +08:00
|
|
|
break;
|
2021-06-02 23:06:42 +08:00
|
|
|
case OPT_needed_library:
|
|
|
|
if (auto *dylibFile = dyn_cast_or_null<DylibFile>(
|
2021-10-29 23:00:16 +08:00
|
|
|
addFile(rerootPath(arg->getValue()), ForceLoad::Default)))
|
2021-06-02 23:06:42 +08:00
|
|
|
dylibFile->forceNeeded = true;
|
|
|
|
break;
|
2021-06-02 08:17:04 +08:00
|
|
|
case OPT_reexport_library:
|
2021-10-29 23:00:16 +08:00
|
|
|
if (auto *dylibFile = dyn_cast_or_null<DylibFile>(
|
|
|
|
addFile(rerootPath(arg->getValue()), ForceLoad::Default))) {
|
2021-06-02 08:17:04 +08:00
|
|
|
config->hasReexports = true;
|
|
|
|
dylibFile->reexport = true;
|
|
|
|
}
|
|
|
|
break;
|
2021-03-26 02:39:44 +08:00
|
|
|
case OPT_weak_library:
|
2021-04-16 09:14:30 +08:00
|
|
|
if (auto *dylibFile = dyn_cast_or_null<DylibFile>(
|
2021-10-29 23:00:16 +08:00
|
|
|
addFile(rerootPath(arg->getValue()), ForceLoad::Default)))
|
2021-03-26 02:39:44 +08:00
|
|
|
dylibFile->forceWeakImport = true;
|
|
|
|
break;
|
|
|
|
case OPT_filelist:
|
2022-01-20 02:14:49 +08:00
|
|
|
addFileList(arg->getValue(), isLazy);
|
2021-03-26 02:39:44 +08:00
|
|
|
break;
|
|
|
|
case OPT_force_load:
|
2021-10-29 23:00:16 +08:00
|
|
|
addFile(rerootPath(arg->getValue()), ForceLoad::Yes);
|
2021-03-26 02:39:44 +08:00
|
|
|
break;
|
|
|
|
case OPT_l:
|
2021-06-02 23:06:42 +08:00
|
|
|
case OPT_needed_l:
|
2021-06-02 08:17:04 +08:00
|
|
|
case OPT_reexport_l:
|
2021-03-26 02:39:44 +08:00
|
|
|
case OPT_weak_l:
|
2021-06-02 23:06:42 +08:00
|
|
|
addLibrary(arg->getValue(), opt.getID() == OPT_needed_l,
|
|
|
|
opt.getID() == OPT_weak_l, opt.getID() == OPT_reexport_l,
|
2021-10-29 23:00:16 +08:00
|
|
|
/*isExplicit=*/true, ForceLoad::Default);
|
2021-03-26 02:39:44 +08:00
|
|
|
break;
|
|
|
|
case OPT_framework:
|
2021-06-02 23:06:42 +08:00
|
|
|
case OPT_needed_framework:
|
2021-06-02 08:17:04 +08:00
|
|
|
case OPT_reexport_framework:
|
2021-03-26 02:39:44 +08:00
|
|
|
case OPT_weak_framework:
|
2021-06-02 23:06:42 +08:00
|
|
|
addFramework(arg->getValue(), opt.getID() == OPT_needed_framework,
|
|
|
|
opt.getID() == OPT_weak_framework,
|
2021-10-29 23:00:16 +08:00
|
|
|
opt.getID() == OPT_reexport_framework, /*isExplicit=*/true,
|
|
|
|
ForceLoad::Default);
|
2021-03-26 02:39:44 +08:00
|
|
|
break;
|
2022-01-20 02:14:49 +08:00
|
|
|
case OPT_start_lib:
|
|
|
|
if (isLazy)
|
|
|
|
error("nested --start-lib");
|
|
|
|
isLazy = true;
|
|
|
|
break;
|
|
|
|
case OPT_end_lib:
|
|
|
|
if (!isLazy)
|
|
|
|
error("stray --end-lib");
|
|
|
|
isLazy = false;
|
|
|
|
break;
|
2021-03-26 02:39:44 +08:00
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
[lld-macho] Move ICF earlier to avoid emitting redundant binds
This is a pretty big refactoring diff, so here are the motivations:
Previously, ICF ran after scanRelocations(), where we emitting
bind/rebase opcodes etc. So we had a bunch of redundant leftovers after
ICF. Having ICF run before Writer seems like a better design, and is
what LLD-ELF does, so this diff refactors it accordingly.
However, ICF had two dependencies on things occurring in Writer: 1) it
needs literals to be deduplicated beforehand and 2) it needs to know
which functions have unwind info, which was being handled by
`UnwindInfoSection::prepareRelocations()`.
In order to do literal deduplication earlier, we need to add literal
input sections to their corresponding output sections. So instead of
putting all input sections into the big `inputSections` vector, and then
filtering them by type later on, I've changed things so that literal
sections get added directly to their output sections during the 'gather'
phase. Likewise for compact unwind sections -- they get added directly
to the UnwindInfoSection now. This latter change is not strictly
necessary, but makes it easier for ICF to determine which functions have
unwind info.
Adding literal sections directly to their output sections means that we
can no longer determine `inputOrder` from iterating over
`inputSections`. Instead, we store that order explicitly on
InputSection. Bloating the size of InputSection for this purpose would
be unfortunate -- but LLD-ELF has already solved this problem: it reuses
`outSecOff` to store this order value.
One downside of this refactor is that we now make an additional pass
over the unwind info relocations to figure out which functions have
unwind info, since we want to know that before `processRelocations()`. I've
made sure to run that extra loop only if ICF is enabled, so there should
be no overhead in non-optimizing runs of the linker.
The upside of all this is that the `inputSections` vector now contains
only ConcatInputSections that are destined for ConcatOutputSections, so
we can clean up a bunch of code that just existed to filter out other
elements from that vector.
I will test for the lack of redundant binds/rebases in the upcoming
cfstring deduplication diff. While binds/rebases can also happen in the
regular `.text` section, they're more common in `.data` sections, so it
seems more natural to test it that way.
This change is perf-neutral when linking chromium_framework.
Reviewed By: oontvoo
Differential Revision: https://reviews.llvm.org/D105044
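The order-tracking idea described above can be sketched roughly as follows. This is a minimal illustration, not the lld implementation: `OrderSketchSection`, its `coalesced` flag, and `assignInputOrder` are hypothetical names, and only the reuse of an `outSecOff`-style field to hold the input order is taken from the message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an input section; not the real lld class.
struct OrderSketchSection {
  bool coalesced = false; // assumed flag: this copy was dropped in favor of another
  uint64_t outSecOff = 0; // holds the input order until layout assigns real offsets
};

// Hands out consecutive order values to the sections that will be emitted
// and returns how many values were used.
inline int assignInputOrder(std::vector<OrderSketchSection> &sections) {
  int inputOrder = 0;
  for (OrderSketchSection &s : sections) {
    if (s.coalesced)
      continue; // skipped sections do not consume an order value
    s.outSecOff = inputOrder++;
  }
  return inputOrder;
}
```

Reusing the existing field avoids growing every section object by another integer, at the cost of the field meaning different things before and after layout.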
2021-07-02 08:33:42 +08:00
|
|
|
static void gatherInputSections() {
|
|
|
|
TimeTraceScope timeScope("Gathering input sections");
|
|
|
|
int inputOrder = 0;
|
|
|
|
for (const InputFile *file : inputFiles) {
|
2021-11-05 11:55:31 +08:00
|
|
|
for (const Section &section : file->sections) {
|
2021-11-16 02:46:59 +08:00
|
|
|
const Subsections &subsections = section.subsections;
|
|
|
|
if (subsections.empty())
|
|
|
|
continue;
|
|
|
|
if (subsections[0].isec->getName() == section_names::compactUnwind)
|
|
|
|
// Compact unwind entries require special handling elsewhere.
|
|
|
|
continue;
|
[lld-macho] Have ICF operate on all sections at once
ICF previously operated only within a given OutputSection. We would
merge all CFStrings first, then merge all regular code sections in a
second phase. This worked fine since CFStrings would never reference
regular `__text` sections. However, I would like to expand ICF to merge
functions that reference unwind info. Unwind info references the LSDA
section, which can in turn reference the `__text` section, so we cannot
perform ICF in phases.
In order to have ICF operate on InputSections spanning multiple
OutputSections, we need a way to distinguish InputSections that are
destined for different OutputSections, so that we don't fold across
section boundaries. We achieve this by creating OutputSections early,
and setting `InputSection::parent` to point to them. This is what
LLD-ELF does. (This change should also make it easier to implement the
`section$start$` symbols.)
This diff also folds InputSections w/o checking their flags, which I
think is the right behavior -- if they are destined for the same
OutputSection, they will have the same flags in the output (even if
their input flags differ). I.e. the `parent` pointer check subsumes the
`flags` check. In practice this has nearly no effect (ICF did not become
any more effective on chromium_framework).
I've also updated ICF.cpp's block comment to better reflect its current
status.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D105641
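The claim that the `parent` pointer check subsumes the `flags` check can be illustrated with a small sketch. The types and the `mayFold` helper below are hypothetical and far simpler than what ICF.cpp does; the byte comparison on `data` merely stands in for the real equality test.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; not the real lld classes.
struct FoldSketchOutputSection {
  std::string name;
};

struct FoldSketchInputSection {
  const FoldSketchOutputSection *parent; // output section this input is destined for
  unsigned flags;                        // input flags, deliberately ignored below
  std::string data;                      // section contents
};

// Two sections are folding candidates only if they share a parent and have
// identical contents. Differing input flags do not block folding.
inline bool mayFold(const FoldSketchInputSection &a,
                    const FoldSketchInputSection &b) {
  return a.parent != nullptr && a.parent == b.parent && a.data == b.data;
}
```

Because every section in one output section ends up with that output section's flags, comparing parents is enough to keep folding from crossing section boundaries.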
2021-07-18 01:42:26 +08:00
|
|
|
ConcatOutputSection *osec = nullptr;
|
2021-11-16 02:46:59 +08:00
|
|
|
for (const Subsection &subsection : subsections) {
|
2021-11-05 11:55:31 +08:00
|
|
|
if (auto *isec = dyn_cast<ConcatInputSection>(subsection.isec)) {
|
2021-07-02 08:33:42 +08:00
|
|
|
if (isec->isCoalescedWeak())
|
|
|
|
continue;
|
|
|
|
isec->outSecOff = inputOrder++;
|
2021-07-18 01:42:26 +08:00
|
|
|
if (!osec)
|
|
|
|
osec = ConcatOutputSection::getOrCreateForInput(isec);
|
|
|
|
isec->parent = osec;
|
2021-07-02 08:33:42 +08:00
|
|
|
inputSections.push_back(isec);
|
2021-11-05 11:55:31 +08:00
|
|
|
} else if (auto *isec =
|
|
|
|
dyn_cast<CStringInputSection>(subsection.isec)) {
|
2021-07-02 08:33:42 +08:00
|
|
|
if (in.cStringSection->inputOrder == UnspecifiedInputOrder)
|
|
|
|
in.cStringSection->inputOrder = inputOrder++;
|
|
|
|
in.cStringSection->addInput(isec);
|
2021-11-05 11:55:31 +08:00
|
|
|
} else if (auto *isec =
|
|
|
|
dyn_cast<WordLiteralInputSection>(subsection.isec)) {
|
2021-07-02 08:33:42 +08:00
|
|
|
if (in.wordLiteralSection->inputOrder == UnspecifiedInputOrder)
|
|
|
|
in.wordLiteralSection->inputOrder = inputOrder++;
|
|
|
|
in.wordLiteralSection->addInput(isec);
|
|
|
|
} else {
|
|
|
|
llvm_unreachable("unexpected input section kind");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
assert(inputOrder <= UnspecifiedInputOrder);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void foldIdenticalLiterals() {
|
|
|
|
// We always create a cStringSection, regardless of whether dedupLiterals is
|
|
|
|
// true. If it isn't, we simply create a non-deduplicating CStringSection.
|
|
|
|
// Either way, we must unconditionally finalize it here.
|
|
|
|
in.cStringSection->finalizeContents();
|
|
|
|
if (in.wordLiteralSection)
|
|
|
|
in.wordLiteralSection->finalizeContents();
|
|
|
|
}
|
|
|
|
|
2021-07-12 01:13:34 +08:00
|
|
|
static void referenceStubBinder() {
|
|
|
|
bool needsStubHelper = config->outputType == MH_DYLIB ||
|
|
|
|
config->outputType == MH_EXECUTE ||
|
|
|
|
config->outputType == MH_BUNDLE;
|
|
|
|
if (!needsStubHelper || !symtab->find("dyld_stub_binder"))
|
|
|
|
return;
|
|
|
|
|
|
|
|
// dyld_stub_binder is used by dyld to resolve lazy bindings. This code here
|
|
|
|
// adds an opportunistic reference to dyld_stub_binder if it happens to exist.
|
|
|
|
// dyld_stub_binder is in libSystem.dylib, which is usually linked in. This
|
|
|
|
// isn't needed for correctness, but the presence of that symbol suppresses
|
|
|
|
// "no symbols" diagnostics from `nm`.
|
|
|
|
// StubHelperSection::setup() adds a reference and errors out if
|
|
|
|
// dyld_stub_binder doesn't exist in case it is actually needed.
|
|
|
|
symtab->addUndefined("dyld_stub_binder", /*file=*/nullptr, /*isWeakRef=*/false);
|
|
|
|
}
|
|
|
|
|
2022-01-21 03:53:18 +08:00
|
|
|
bool macho::link(ArrayRef<const char *> argsArr, llvm::raw_ostream &stdoutOS,
|
|
|
|
llvm::raw_ostream &stderrOS, bool exitEarly,
|
|
|
|
bool disableOutput) {
|
|
|
|
// This driver-specific context will be freed later by lldMain().
|
|
|
|
auto *ctx = new CommonLinkerContext;
|
2021-10-31 07:35:30 +08:00
|
|
|
|
2022-01-21 03:53:18 +08:00
|
|
|
ctx->e.initialize(stdoutOS, stderrOS, exitEarly, disableOutput);
|
|
|
|
ctx->e.cleanupCallback = []() {
|
2021-11-04 02:08:57 +08:00
|
|
|
resolvedFrameworks.clear();
|
2021-11-04 00:49:13 +08:00
|
|
|
resolvedLibraries.clear();
|
2021-11-05 00:42:57 +08:00
|
|
|
cachedReads.clear();
|
2021-10-31 07:35:30 +08:00
|
|
|
concatOutputSections.clear();
|
|
|
|
inputFiles.clear();
|
|
|
|
inputSections.clear();
|
|
|
|
loadedArchives.clear();
|
|
|
|
syntheticSections.clear();
|
|
|
|
thunkMap.clear();
|
|
|
|
|
|
|
|
firstTLVDataSection = nullptr;
|
|
|
|
tar = nullptr;
|
|
|
|
memset(&in, 0, sizeof(in));
|
|
|
|
|
|
|
|
resetLoadedDylibs();
|
|
|
|
resetOutputSegments();
|
|
|
|
resetWriter();
|
|
|
|
InputFile::resetIdCount();
|
|
|
|
};
|
2021-03-02 03:45:17 +08:00
|
|
|
|
2022-01-21 03:53:18 +08:00
|
|
|
ctx->e.logName = args::getFilenameWithoutExe(argsArr[0]);
|
2020-04-30 06:42:32 +08:00
|
|
|
|
2020-04-03 02:54:05 +08:00
|
|
|
MachOOptTable parser;
|
2021-03-10 12:40:08 +08:00
|
|
|
InputArgList args = parser.parse(argsArr.slice(1));
|
2020-04-03 02:54:05 +08:00
|
|
|
|
2022-01-21 03:53:18 +08:00
|
|
|
ctx->e.errorLimitExceededMsg = "too many errors emitted, stopping now "
|
|
|
|
"(use --error-limit=0 to see all errors)";
|
|
|
|
ctx->e.errorLimit = args::getInteger(args, OPT_error_limit_eq, 20);
|
|
|
|
ctx->e.verbose = args.hasArg(OPT_verbose);
|
2021-04-26 08:28:49 +08:00
|
|
|
|
2020-06-16 03:36:32 +08:00
|
|
|
if (args.hasArg(OPT_help_hidden)) {
|
|
|
|
parser.printHelp(argsArr[0], /*showHidden=*/true);
|
|
|
|
return true;
|
2020-12-23 04:51:20 +08:00
|
|
|
}
|
|
|
|
if (args.hasArg(OPT_help)) {
|
2020-06-16 03:36:32 +08:00
|
|
|
parser.printHelp(argsArr[0], /*showHidden=*/false);
|
|
|
|
return true;
|
|
|
|
}
|
2020-12-23 04:51:20 +08:00
|
|
|
if (args.hasArg(OPT_version)) {
|
|
|
|
message(getLLDVersion());
|
|
|
|
return true;
|
|
|
|
}
|
2020-06-16 03:36:32 +08:00
|
|
|
|
2022-01-11 11:39:14 +08:00
|
|
|
config = std::make_unique<Configuration>();
|
|
|
|
symtab = std::make_unique<SymbolTable>();
|
2021-05-06 02:38:36 +08:00
|
|
|
target = createTargetInfo(args);
|
2022-01-11 11:39:14 +08:00
|
|
|
depTracker = std::make_unique<DependencyTracker>(
|
|
|
|
args.getLastArgValue(OPT_dependency_info));
|
2021-10-31 07:45:06 +08:00
|
|
|
if (errorCount())
|
|
|
|
return false;
|
2021-05-06 02:38:36 +08:00
|
|
|
|
2021-10-22 10:38:12 +08:00
|
|
|
config->osoPrefix = args.getLastArgValue(OPT_oso_prefix);
|
|
|
|
if (!config->osoPrefix.empty()) {
|
|
|
|
// Expand special characters, such as ".", "..", or "~", if present.
|
|
|
|
// Note: LD64 only expands "." and not other special characters.
|
|
|
|
// That seems silly to imitate so we will not try to follow it, but rather
|
|
|
|
// just use real_path() to do it.
|
|
|
|
|
|
|
|
// The max path length is 4096, in theory. However that seems quite long
|
|
|
|
// and it seems unlikely that anyone would want to strip everything from the
|
|
|
|
// path. Hence we've picked a reasonably large number here.
|
|
|
|
SmallString<1024> expanded;
|
|
|
|
if (!fs::real_path(config->osoPrefix, expanded,
|
|
|
|
/*expand_tilde=*/true)) {
|
|
|
|
// Note: LD64 expands "." to be `<current_dir>/`
|
|
|
|
// (i.e., it has a slash suffix) whereas real_path() doesn't.
|
|
|
|
// So we have to append '/' to be consistent.
|
|
|
|
StringRef sep = sys::path::get_separator();
|
2021-11-10 13:28:56 +08:00
|
|
|
// real_path removes trailing slashes as part of the normalization, but
|
|
|
|
// these are meaningful for our text based stripping
|
|
|
|
if (config->osoPrefix.equals(".") || config->osoPrefix.endswith(sep))
|
2021-10-22 10:38:12 +08:00
|
|
|
expanded += sep;
|
2022-01-21 03:53:18 +08:00
|
|
|
config->osoPrefix = saver().save(expanded.str());
|
2021-10-22 10:38:12 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
[lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.
Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236M stripped with ld64).
Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:
% ministat lld_*
x lld_nostrip.txt
+ lld_strip.txt
N Min Max Median Avg Stddev
x 10 3.929414 4.07692 4.0269079 4.0089678 0.044214794
+ 10 3.8129408 3.9025559 3.8670411 3.8642573 0.024779651
Difference at 95.0% confidence
-0.144711 +/- 0.0336749
-3.60967% +/- 0.839989%
(Student's t, pooled s = 0.0358398)
This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols
It's possible it interacts with more features I didn't think of,
of course.
I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests
Implementation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.
Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
(`.no_dead_strip` in asm)
Fixes PR49276.
Differential Revision: https://reviews.llvm.org/D103324
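A section-based dead-strip pass of the kind described above is commonly written as a worklist traversal from a set of roots. The sketch below shows only that general technique and makes no claim about how MarkLive.cpp is organized; `LiveSketchSection` and `markLive` are hypothetical names, and references are modeled as plain indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an input section with outgoing references.
struct LiveSketchSection {
  std::vector<std::size_t> refs; // indices of sections this one references
  bool live = false;
};

// Marks every section reachable from the roots as live. Sections that are
// never reached keep live == false and could be dropped from the output.
inline void markLive(std::vector<LiveSketchSection> &sections,
                     const std::vector<std::size_t> &roots) {
  std::vector<std::size_t> worklist;
  for (std::size_t r : roots) {
    if (!sections[r].live) {
      sections[r].live = true;
      worklist.push_back(r);
    }
  }
  while (!worklist.empty()) {
    std::size_t cur = worklist.back();
    worklist.pop_back();
    for (std::size_t next : sections[cur].refs) {
      if (!sections[next].live) {
        sections[next].live = true;
        worklist.push_back(next);
      }
    }
  }
}
```

In a real linker the roots would come from things like the entry point, exported symbols and no_dead_strip sections; here they are simply passed in by the caller.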
2021-05-08 05:10:05 +08:00
|
|
|
// Must be set before any InputSections and Symbols are created.
|
|
|
|
config->deadStrip = args.hasArg(OPT_dead_strip);
|
|
|
|
|
2021-05-06 02:38:36 +08:00
|
|
|
config->systemLibraryRoots = getSystemLibraryRoots(args);
|
2020-11-29 11:38:27 +08:00
|
|
|
if (const char *path = getReproduceOption(args)) {
|
|
|
|
// Note that --reproduce is a debug option so you can ignore it
|
|
|
|
// if you are trying to understand the whole picture of the code.
|
|
|
|
Expected<std::unique_ptr<TarWriter>> errOrWriter =
|
|
|
|
TarWriter::create(path, path::stem(path));
|
|
|
|
if (errOrWriter) {
|
|
|
|
tar = std::move(*errOrWriter);
|
|
|
|
tar->append("response.txt", createResponseFile(args));
|
|
|
|
tar->append("version.txt", getLLDVersion() + "\n");
|
|
|
|
} else {
|
|
|
|
error("--reproduce: " + toString(errOrWriter.takeError()));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-03-26 02:39:45 +08:00
|
|
|
if (auto *arg = args.getLastArg(OPT_threads_eq)) {
|
|
|
|
StringRef v(arg->getValue());
|
|
|
|
unsigned threads = 0;
|
|
|
|
if (!llvm::to_integer(v, threads, 0) || threads == 0)
|
|
|
|
error(arg->getSpelling() + ": expected a positive integer, but got '" +
|
|
|
|
arg->getValue() + "'");
|
|
|
|
parallel::strategy = hardware_concurrency(threads);
|
2021-04-09 00:14:47 +08:00
|
|
|
config->thinLTOJobs = v;
|
2021-03-26 02:39:45 +08:00
|
|
|
}
|
2021-04-09 00:14:47 +08:00
|
|
|
if (auto *arg = args.getLastArg(OPT_thinlto_jobs_eq))
|
|
|
|
config->thinLTOJobs = arg->getValue();
|
|
|
|
if (!get_threadpool_strategy(config->thinLTOJobs))
|
|
|
|
error("--thinlto-jobs: invalid job count: " + config->thinLTOJobs);
|
2021-03-26 02:39:45 +08:00
|
|
|
|
2021-03-10 12:15:29 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_u)) {
|
2021-02-09 21:18:23 +08:00
|
|
|
config->explicitUndefineds.push_back(symtab->addUndefined(
|
|
|
|
arg->getValue(), /*file=*/nullptr, /*isWeakRef=*/false));
|
|
|
|
}
|
2021-02-26 08:56:31 +08:00
|
|
|
|
2021-03-10 12:15:29 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_U))
|
2021-07-22 23:20:36 +08:00
|
|
|
config->explicitDynamicLookups.insert(arg->getValue());
|
2021-02-26 08:56:31 +08:00
|
|
|
|
2021-03-18 22:38:30 +08:00
|
|
|
config->mapFile = args.getLastArgValue(OPT_map);
|
2021-07-16 09:29:05 +08:00
|
|
|
config->optimize = args::getInteger(args, OPT_O, 1);
|
2020-04-03 02:54:05 +08:00
|
|
|
config->outputFile = args.getLastArgValue(OPT_o, "a.out");
|
2021-07-14 19:33:09 +08:00
|
|
|
config->finalOutput =
|
|
|
|
args.getLastArgValue(OPT_final_output, config->outputFile);
|
2021-04-09 02:12:20 +08:00
|
|
|
config->astPaths = args.getAllArgValues(OPT_add_ast_path);
|
2020-07-31 05:38:58 +08:00
|
|
|
config->headerPad = args::getHex(args, OPT_headerpad, /*Default=*/32);
|
2020-09-22 02:04:13 +08:00
|
|
|
config->headerPadMaxInstallNames =
|
|
|
|
args.hasArg(OPT_headerpad_max_install_names);
|
2021-06-10 03:16:45 +08:00
|
|
|
config->printDylibSearch =
|
|
|
|
args.hasArg(OPT_print_dylib_search) || getenv("RC_TRACE_DYLIB_SEARCHING");
|
2020-12-03 07:57:30 +08:00
|
|
|
config->printEachFile = args.hasArg(OPT_t);
|
2020-12-03 07:59:00 +08:00
|
|
|
config->printWhyLoad = args.hasArg(OPT_why_load);
|
2021-10-27 12:42:25 +08:00
|
|
|
config->omitDebugInfo = args.hasArg(OPT_S);
|
2020-09-01 14:23:37 +08:00
|
|
|
config->outputType = getOutputType(args);
|
2021-11-04 12:23:04 +08:00
|
|
|
config->errorForArchMismatch = args.hasArg(OPT_arch_errors_fatal);
|
2021-03-10 12:40:08 +08:00
|
|
|
if (const Arg *arg = args.getLastArg(OPT_bundle_loader)) {
|
2021-02-23 02:03:02 +08:00
|
|
|
if (config->outputType != MH_BUNDLE)
|
|
|
|
error("-bundle_loader can only be used with MachO bundle output");
|
2022-01-20 02:14:49 +08:00
|
|
|
addFile(arg->getValue(), ForceLoad::Default, /*isLazy=*/false,
|
|
|
|
/*isExplicit=*/false,
|
2021-06-02 20:54:36 +08:00
|
|
|
/*isBundleLoader=*/true);
|
2021-02-23 02:03:02 +08:00
|
|
|
}
|
2021-07-06 02:40:52 +08:00
|
|
|
if (const Arg *arg = args.getLastArg(OPT_umbrella)) {
|
|
|
|
if (config->outputType != MH_DYLIB)
|
|
|
|
warn("-umbrella used, but not creating dylib");
|
|
|
|
config->umbrella = arg->getValue();
|
|
|
|
}
|
2020-12-03 12:34:17 +08:00
|
|
|
config->ltoObjPath = args.getLastArgValue(OPT_object_path_lto);
|
2021-01-13 03:41:56 +08:00
|
|
|
config->ltoNewPassManager =
|
|
|
|
args.hasFlag(OPT_no_lto_legacy_pass_manager, OPT_lto_legacy_pass_manager,
|
|
|
|
LLVM_ENABLE_NEW_PASS_MANAGER);
|
2021-07-02 03:01:59 +08:00
|
|
|
config->ltoo = args::getInteger(args, OPT_lto_O, 2);
|
|
|
|
if (config->ltoo > 3)
|
|
|
|
error("--lto-O: invalid optimization level: " + Twine(config->ltoo));
|
2021-07-16 00:56:13 +08:00
|
|
|
config->thinLTOCacheDir = args.getLastArgValue(OPT_cache_path_lto);
|
|
|
|
config->thinLTOCachePolicy = getLTOCachePolicy(args);
|
2020-08-13 10:50:28 +08:00
|
|
|
config->runtimePaths = args::getStrings(args, OPT_rpath);
|
2022-01-19 09:17:28 +08:00
|
|
|
config->allLoad = args.hasFlag(OPT_all_load, OPT_noall_load, false);
|
2021-07-06 08:52:09 +08:00
|
|
|
config->archMultiple = args.hasArg(OPT_arch_multiple);
|
2021-07-12 22:26:54 +08:00
|
|
|
config->applicationExtension = args.hasFlag(
|
|
|
|
OPT_application_extension, OPT_no_application_extension, false);
|
2021-07-06 12:25:01 +08:00
|
|
|
config->exportDynamic = args.hasArg(OPT_export_dynamic);
|
2020-09-19 11:51:38 +08:00
|
|
|
config->forceLoadObjC = args.hasArg(OPT_ObjC);
|
2021-06-08 11:48:16 +08:00
|
|
|
config->forceLoadSwift = args.hasArg(OPT_force_load_swift_libs);
|
2021-06-01 10:12:35 +08:00
|
|
|
config->deadStripDylibs = args.hasArg(OPT_dead_strip_dylibs);
|
clang+lld: Improve clang+ld.darwinnew.lld interaction, pass -demangle
This patch:
- adds an ld64.lld.darwinnew symlink for lld, to go with f2710d4b576,
so that `clang -fuse-ld=lld.darwinnew` can be used to test new
Mach-O lld while it's in bring-up. (The expectation is that we'll
remove this again once new Mach-O lld is the default and only Mach-O
lld.)
- lets the clang driver know if the linker is lld (currently
only triggered if `-fuse-ld=lld` or `-fuse-ld=lld.darwinnew` is
passed). Currently only used for the next point, but could be used
to implement other features that need close coordination between
compiler and linker, e.g. having a diag for calling `clang++` instead
of `clang` when link errors are caused by a missing C++ stdlib.
- lets the clang driver pass `-demangle` to Mach-O lld (both old and
new), in addition to ld64
- implements -demangle for new Mach-O lld
- changes demangleItanium() to accept _Z, __Z, ___Z, ____Z prefixes
(and updates one test added in D68014). Mach-O has an extra
underscore for symbols, and the three (or, on Mach-O, four)
underscores are used for block names.
Differential Revision: https://reviews.llvm.org/D91884
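The prefix rule mentioned above (accepting `_Z`, `__Z`, `___Z` and `____Z`) can be sketched as a small predicate. `hasItaniumPrefix` is a hypothetical helper written for illustration; it only checks the prefix and is not the `demangleItanium()` function itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if the name starts with one to four underscores followed by
// 'Z', which is the set of prefixes listed in the commit message.
inline bool hasItaniumPrefix(const std::string &name) {
  std::size_t underscores = 0;
  while (underscores < name.size() && name[underscores] == '_')
    ++underscores;
  return underscores >= 1 && underscores <= 4 &&
         underscores < name.size() && name[underscores] == 'Z';
}
```

The extra leading underscore comes from Mach-O symbol naming, and the three (or four) underscore forms cover block names, as the message notes.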
2020-11-21 02:57:44 +08:00
|
|
|
config->demangle = args.hasArg(OPT_demangle);
|
2020-12-10 07:08:05 +08:00
|
|
|
config->implicitDylibs = !args.hasArg(OPT_no_implicit_dylibs);
|
2021-06-18 23:47:49 +08:00
|
|
|
config->emitFunctionStarts =
|
|
|
|
args.hasFlag(OPT_function_starts, OPT_no_function_starts, true);
|
2021-04-17 04:46:45 +08:00
|
|
|
config->emitBitcodeBundle = args.hasArg(OPT_bitcode_bundle);
|
2021-06-18 23:47:49 +08:00
|
|
|
config->emitDataInCodeInfo =
|
|
|
|
args.hasFlag(OPT_data_in_code_info, OPT_no_data_in_code_info, true);
|
2021-06-29 02:43:34 +08:00
|
|
|
config->icfLevel = getICFLevel(args);
|
2022-01-15 15:06:13 +08:00
|
|
|
config->dedupLiterals =
|
|
|
|
args.hasFlag(OPT_deduplicate_literals, OPT_icf_eq, false) ||
|
|
|
|
config->icfLevel != ICFLevel::none;
|
2021-11-10 10:10:28 +08:00
|
|
|
config->warnDylibInstallName = args.hasFlag(
|
|
|
|
OPT_warn_dylib_install_name, OPT_no_warn_dylib_install_name, false);
|
2022-01-12 23:47:04 +08:00
|
|
|
config->callGraphProfileSort = args.hasFlag(
|
|
|
|
OPT_call_graph_profile_sort, OPT_no_call_graph_profile_sort, true);
|
|
|
|
config->printSymbolOrder = args.getLastArgValue(OPT_print_symbol_order);
|
2021-04-17 04:46:45 +08:00
|
|
|
|
2021-06-01 18:55:36 +08:00
|
|
|
// FIXME: Add a commandline flag for this too.
|
|
|
|
config->zeroModTime = getenv("ZERO_AR_DATE");
|
|
|
|
|
2022-01-13 06:01:59 +08:00
|
|
|
std::array<PlatformType, 3> encryptablePlatforms{
|
|
|
|
PLATFORM_IOS, PLATFORM_WATCHOS, PLATFORM_TVOS};
|
2021-04-22 03:43:38 +08:00
|
|
|
config->emitEncryptionInfo =
|
|
|
|
args.hasFlag(OPT_encryptable, OPT_no_encryption,
|
|
|
|
is_contained(encryptablePlatforms, config->platform()));
|
2021-04-22 01:35:12 +08:00
|
|
|
|
2021-05-19 23:07:39 +08:00
|
|
|
#ifndef LLVM_HAVE_LIBXAR
|
2021-04-17 04:46:45 +08:00
|
|
|
if (config->emitBitcodeBundle)
|
|
|
|
error("-bitcode_bundle unsupported because LLD wasn't built with libxar");
|
|
|
|
#endif
|
2020-04-22 04:37:57 +08:00
|
|
|
|
2021-03-09 23:02:24 +08:00
|
|
|
if (const Arg *arg = args.getLastArg(OPT_install_name)) {
|
2021-11-10 10:10:28 +08:00
|
|
|
if (config->warnDylibInstallName && config->outputType != MH_DYLIB)
|
|
|
|
warn(
|
|
|
|
arg->getAsString(args) +
|
|
|
|
": ignored, only has effect with -dylib [--warn-dylib-install-name]");
|
2021-03-09 23:02:24 +08:00
|
|
|
else
|
|
|
|
config->installName = arg->getValue();
|
|
|
|
} else if (config->outputType == MH_DYLIB) {
|
2021-07-06 07:46:09 +08:00
|
|
|
config->installName = config->finalOutput;
|
2021-03-09 23:02:24 +08:00
|
|
|
}
|
|
|
|
|
2021-03-09 23:17:01 +08:00
|
|
|
if (args.hasArg(OPT_mark_dead_strippable_dylib)) {
|
|
|
|
if (config->outputType != MH_DYLIB)
|
|
|
|
warn("-mark_dead_strippable_dylib: ignored, only has effect with -dylib");
|
|
|
|
else
|
|
|
|
config->markDeadStrippableDylib = true;
|
|
|
|
}
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
if (const Arg *arg = args.getLastArg(OPT_static, OPT_dynamic))
|
2020-09-22 04:21:45 +08:00
|
|
|
config->staticLink = (arg->getOption().getID() == OPT_static);
|
|
|
|
|
2021-03-10 12:40:08 +08:00
|
|
|
if (const Arg *arg =
|
2021-03-04 04:52:06 +08:00
|
|
|
args.getLastArg(OPT_flat_namespace, OPT_twolevel_namespace))
|
2021-02-18 21:48:07 +08:00
|
|
|
config->namespaceKind = arg->getOption().getID() == OPT_twolevel_namespace
|
|
|
|
? NamespaceKind::twolevel
|
|
|
|
: NamespaceKind::flat;
|
2021-03-04 04:52:06 +08:00
|
|
|
|
|
|
|
config->undefinedSymbolTreatment = getUndefinedSymbolTreatment(args);
|
2021-02-18 21:48:07 +08:00
|
|
|
|
2021-05-10 08:05:45 +08:00
|
|
|
if (config->outputType == MH_EXECUTE)
|
|
|
|
config->entry = symtab->addUndefined(args.getLastArgValue(OPT_e, "_main"),
|
|
|
|
/*file=*/nullptr,
|
|
|
|
/*isWeakRef=*/false);
|
|
|
|
|
2020-09-19 11:51:38 +08:00
|
|
|
config->librarySearchPaths =
|
|
|
|
getLibrarySearchPaths(args, config->systemLibraryRoots);
|
|
|
|
config->frameworkSearchPaths =
|
|
|
|
getFrameworkSearchPaths(args, config->systemLibraryRoots);
|
2021-03-10 12:40:08 +08:00
|
|
|
if (const Arg *arg =
|
2020-09-20 23:37:20 +08:00
|
|
|
args.getLastArg(OPT_search_paths_first, OPT_search_dylibs_first))
|
|
|
|
config->searchDylibsFirst =
|
2021-01-10 10:17:59 +08:00
|
|
|
arg->getOption().getID() == OPT_search_dylibs_first;
|
2020-06-20 12:13:03 +08:00
|
|
|
|
2020-12-15 07:24:50 +08:00
|
|
|
config->dylibCompatibilityVersion =
|
|
|
|
parseDylibVersion(args, OPT_compatibility_version);
|
|
|
|
config->dylibCurrentVersion = parseDylibVersion(args, OPT_current_version);
|
|
|
|
|
2021-04-26 07:00:24 +08:00
|
|
|
config->dataConst =
|
|
|
|
args.hasFlag(OPT_data_const, OPT_no_data_const, dataConstDefault(args));
|
|
|
|
// Populate config->sectionRenameMap with builtin default renames.
|
|
|
|
// Options -rename_section and -rename_segment are able to override.
|
|
|
|
initializeSectionRenameMap();
|
2021-02-27 07:36:49 +08:00
|
|
|
// Reject every special character except '.' and '$'
|
|
|
|
// TODO(gkm): verify that this is the proper set of invalid chars
|
|
|
|
StringRef invalidNameChars("!\"#%&'()*+,-/:;<=>?@[\\]^`{|}~");
|
|
|
|
auto validName = [invalidNameChars](StringRef s) {
|
|
|
|
if (s.find_first_of(invalidNameChars) != StringRef::npos)
|
|
|
|
error("invalid name for segment or section: " + s);
|
|
|
|
return s;
|
|
|
|
};
|
2021-03-10 12:15:29 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_rename_section)) {
|
2021-02-27 07:36:49 +08:00
|
|
|
config->sectionRenameMap[{validName(arg->getValue(0)),
|
|
|
|
validName(arg->getValue(1))}] = {
|
|
|
|
validName(arg->getValue(2)), validName(arg->getValue(3))};
|
|
|
|
}
|
2021-03-10 12:15:29 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_rename_segment)) {
|
2021-02-27 07:36:49 +08:00
|
|
|
config->segmentRenameMap[validName(arg->getValue(0))] =
|
|
|
|
validName(arg->getValue(1));
|
|
|
|
}
|
|
|
|
|
2021-05-11 23:43:48 +08:00
|
|
|
config->sectionAlignments = parseSectAlign(args);
|
|
|
|
|
2021-03-30 02:08:12 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_segprot)) {
|
|
|
|
StringRef segName = arg->getValue(0);
|
|
|
|
uint32_t maxProt = parseProtection(arg->getValue(1));
|
|
|
|
uint32_t initProt = parseProtection(arg->getValue(2));
|
2021-04-22 03:43:38 +08:00
|
|
|
if (maxProt != initProt && config->arch() != AK_i386)
|
2021-03-30 02:08:12 +08:00
|
|
|
error("invalid argument '" + arg->getAsString(args) +
|
|
|
|
"': max and init must be the same for non-i386 archs");
|
|
|
|
if (segName == segment_names::linkEdit)
|
|
|
|
error("-segprot cannot be used to change __LINKEDIT's protections");
|
|
|
|
config->segmentProtections.push_back({segName, maxProt, initProt});
|
|
|
|
}
|
|
|
|
|
2021-03-04 04:15:09 +08:00
|
|
|
handleSymbolPatterns(args, config->exportedSymbols, OPT_exported_symbol,
|
|
|
|
OPT_exported_symbols_list);
|
|
|
|
handleSymbolPatterns(args, config->unexportedSymbols, OPT_unexported_symbol,
|
|
|
|
OPT_unexported_symbols_list);
|
|
|
|
if (!config->exportedSymbols.empty() && !config->unexportedSymbols.empty()) {
|
|
|
|
error("cannot use both -exported_symbol* and -unexported_symbol* options\n"
|
|
|
|
">>> ignoring unexports");
|
|
|
|
config->unexportedSymbols.clear();
|
|
|
|
}
|
2021-05-08 09:05:47 +08:00
|
|
|
// Explicitly-exported literal symbols must be defined, but might
|
|
|
|
// languish in an archive if unreferenced elsewhere. Light a fire
|
|
|
|
// under those lazy symbols!
|
|
|
|
for (const CachedHashStringRef &cachedName : config->exportedSymbols.literals)
|
|
|
|
symtab->addUndefined(cachedName.val(), /*file=*/nullptr,
|
|
|
|
/*isWeakRef=*/false);
|
2021-03-04 04:15:09 +08:00
|
|
|
|
2020-10-27 10:18:29 +08:00
|
|
|
config->saveTemps = args.hasArg(OPT_save_temps);
|
|
|
|
|
2021-03-06 01:12:56 +08:00
|
|
|
config->adhocCodesign = args.hasFlag(
|
|
|
|
OPT_adhoc_codesign, OPT_no_adhoc_codesign,
|
2021-04-22 03:43:38 +08:00
|
|
|
(config->arch() == AK_arm64 || config->arch() == AK_arm64e) &&
|
2022-01-13 06:01:59 +08:00
|
|
|
config->platform() == PLATFORM_MACOS);
|
2021-03-05 22:07:58 +08:00
|
|
|
|
2020-04-22 04:37:57 +08:00
|
|
|
if (args.hasArg(OPT_v)) {
|
2021-11-02 22:13:27 +08:00
|
|
|
message(getLLDVersion(), lld::errs());
|
2020-06-18 10:59:27 +08:00
|
|
|
message(StringRef("Library search paths:") +
|
2021-11-02 22:13:27 +08:00
|
|
|
(config->librarySearchPaths.empty()
|
|
|
|
? ""
|
|
|
|
: "\n\t" + join(config->librarySearchPaths, "\n\t")),
|
|
|
|
lld::errs());
|
2020-06-18 10:59:27 +08:00
|
|
|
message(StringRef("Framework search paths:") +
|
2021-11-02 22:13:27 +08:00
|
|
|
(config->frameworkSearchPaths.empty()
|
|
|
|
? ""
|
|
|
|
: "\n\t" + join(config->frameworkSearchPaths, "\n\t")),
|
|
|
|
lld::errs());
|
2020-04-22 04:37:57 +08:00
|
|
|
}
|
2020-04-03 02:54:05 +08:00
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
config->progName = argsArr[0];
|
2020-12-09 13:51:32 +08:00
|
|
|
|
2021-04-09 02:11:45 +08:00
|
|
|
config->timeTraceEnabled = args.hasArg(
|
|
|
|
OPT_time_trace, OPT_time_trace_granularity_eq, OPT_time_trace_file_eq);
|
2021-03-25 09:05:26 +08:00
|
|
|
config->timeTraceGranularity =
|
|
|
|
args::getInteger(args, OPT_time_trace_granularity_eq, 500);
|
2021-02-23 02:03:02 +08:00
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
// Initialize time trace profiler.
|
|
|
|
if (config->timeTraceEnabled)
|
|
|
|
timeTraceProfilerInitialize(config->timeTraceGranularity, config->progName);
|
|
|
|
|
|
|
|
{
|
2021-03-25 09:05:26 +08:00
|
|
|
TimeTraceScope timeScope("ExecuteLinker");
|
2021-03-11 22:04:27 +08:00
|
|
|
|
|
|
|
initLLVM(); // must be run before any call to addFile()
|
2021-03-26 02:39:44 +08:00
|
|
|
createFiles(args);
|
2020-04-03 02:54:05 +08:00
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
config->isPic = config->outputType == MH_DYLIB ||
|
2021-04-30 03:09:01 +08:00
|
|
|
config->outputType == MH_BUNDLE ||
|
|
|
|
(config->outputType == MH_EXECUTE &&
|
|
|
|
args.hasFlag(OPT_pie, OPT_no_pie, true));
|
2021-03-11 22:04:27 +08:00
|
|
|
|
|
|
|
// Now that all dylibs have been loaded, search for those that should be
|
|
|
|
// re-exported.
|
2021-04-07 13:32:59 +08:00
|
|
|
{
|
|
|
|
auto reexportHandler = [](const Arg *arg,
|
|
|
|
const std::vector<StringRef> &extensions) {
|
|
|
|
config->hasReexports = true;
|
|
|
|
StringRef searchName = arg->getValue();
|
|
|
|
if (!markReexport(searchName, extensions))
|
|
|
|
error(arg->getSpelling() + " " + searchName +
|
|
|
|
" does not match a supplied dylib");
|
|
|
|
};
|
|
|
|
std::vector<StringRef> extensions = {".tbd"};
|
|
|
|
for (const Arg *arg : args.filtered(OPT_sub_umbrella))
|
|
|
|
reexportHandler(arg, extensions);
|
|
|
|
|
|
|
|
extensions.push_back(".dylib");
|
|
|
|
for (const Arg *arg : args.filtered(OPT_sub_library))
|
|
|
|
reexportHandler(arg, extensions);
|
2021-03-11 22:04:27 +08:00
|
|
|
}
|
2020-04-24 11:16:49 +08:00
|
|
|
|
2021-10-31 07:35:30 +08:00
|
|
|
cl::ResetAllOptionOccurrences();
|
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
// Parse LTO options.
|
|
|
|
if (const Arg *arg = args.getLastArg(OPT_mcpu))
|
2022-01-21 03:53:18 +08:00
|
|
|
parseClangOption(saver().save("-mcpu=" + StringRef(arg->getValue())),
|
2021-03-11 22:04:27 +08:00
|
|
|
arg->getSpelling());
|
2020-12-08 21:08:56 +08:00
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_mllvm))
|
|
|
|
parseClangOption(arg->getValue(), arg->getSpelling());
|
2020-12-08 21:08:56 +08:00
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
compileBitcodeFiles();
|
|
|
|
replaceCommonSymbols();
|
2020-09-25 05:44:14 +08:00
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
StringRef orderFile = args.getLastArgValue(OPT_order_file);
|
2022-01-12 23:47:04 +08:00
|
|
|
if (!orderFile.empty()) {
|
2021-03-11 22:04:27 +08:00
|
|
|
parseOrderFile(orderFile);
|
2022-01-12 23:47:04 +08:00
|
|
|
config->callGraphProfileSort = false;
|
|
|
|
}
|
2020-05-06 07:37:34 +08:00
|
|
|
|
2021-07-12 01:13:34 +08:00
|
|
|
referenceStubBinder();
|
|
|
|
|
2021-05-11 03:45:18 +08:00
|
|
|
// FIXME: should terminate the link early based on errors encountered so
|
|
|
|
// far?
|
|
|
|
|
2021-05-04 06:31:23 +08:00
|
|
|
createSyntheticSections();
|
2021-03-19 06:49:45 +08:00
|
|
|
createSyntheticSymbols();
|
2020-04-22 04:37:57 +08:00
|
|
|
|
2021-05-18 08:53:55 +08:00
|
|
|
if (!config->exportedSymbols.empty()) {
|
2021-11-13 09:26:30 +08:00
|
|
|
parallelForEach(symtab->getSymbols(), [](Symbol *sym) {
|
2021-05-18 08:53:55 +08:00
|
|
|
if (auto *defined = dyn_cast<Defined>(sym)) {
|
|
|
|
StringRef symbolName = defined->getName();
|
|
|
|
if (config->exportedSymbols.match(symbolName)) {
|
|
|
|
if (defined->privateExtern) {
|
2021-11-09 08:50:34 +08:00
|
|
|
if (defined->weakDefCanBeHidden) {
|
|
|
|
// weak_def_can_be_hidden symbols behave similarly to
|
|
|
|
// private_extern symbols in most cases, except for when
|
|
|
|
// they are explicitly exported.
|
|
|
|
// The former can be exported but the latter cannot.
|
|
|
|
defined->privateExtern = false;
|
|
|
|
} else {
|
|
|
|
warn("cannot export hidden symbol " + symbolName +
|
|
|
|
"\n>>> defined in " + toString(defined->getFile()));
|
|
|
|
}
|
2021-05-18 08:53:55 +08:00
|
|
|
}
|
|
|
|
} else {
|
|
|
|
defined->privateExtern = true;
|
|
|
|
}
|
|
|
|
}
|
2021-11-13 09:26:30 +08:00
|
|
|
});
|
2021-05-18 08:53:55 +08:00
|
|
|
} else if (!config->unexportedSymbols.empty()) {
|
2021-11-13 09:26:30 +08:00
|
|
|
parallelForEach(symtab->getSymbols(), [](Symbol *sym) {
|
2021-05-18 08:53:55 +08:00
|
|
|
if (auto *defined = dyn_cast<Defined>(sym))
|
|
|
|
if (config->unexportedSymbols.match(defined->getName()))
|
|
|
|
defined->privateExtern = true;
|
2021-11-13 09:26:30 +08:00
|
|
|
});
|
2021-05-18 08:53:55 +08:00
|
|
|
}
|
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_sectcreate)) {
|
|
|
|
StringRef segName = arg->getValue(0);
|
|
|
|
StringRef sectName = arg->getValue(1);
|
|
|
|
StringRef fileName = arg->getValue(2);
|
|
|
|
Optional<MemoryBufferRef> buffer = readFile(fileName);
|
|
|
|
if (buffer)
|
|
|
|
inputFiles.insert(make<OpaqueFile>(*buffer, segName, sectName));
|
|
|
|
}
|
2020-08-11 09:47:13 +08:00
|
|
|
|
2022-01-29 10:46:51 +08:00
|
|
|
for (const Arg *arg : args.filtered(OPT_add_empty_section)) {
|
|
|
|
StringRef segName = arg->getValue(0);
|
|
|
|
StringRef sectName = arg->getValue(1);
|
|
|
|
inputFiles.insert(make<OpaqueFile>(MemoryBufferRef(), segName, sectName));
|
|
|
|
}
|
|
|
|
|
[lld-macho] Move ICF earlier to avoid emitting redundant binds
This is a pretty big refactoring diff, so here are the motivations:
Previously, ICF ran after scanRelocations(), where we emit
bind/rebase opcodes etc. So we had a bunch of redundant leftovers after
ICF. Having ICF run before Writer seems like a better design, and is
what LLD-ELF does, so this diff refactors it accordingly.
However, ICF had two dependencies on things occurring in Writer: 1) it
needs literals to be deduplicated beforehand and 2) it needs to know
which functions have unwind info, which was being handled by
`UnwindInfoSection::prepareRelocations()`.
In order to do literal deduplication earlier, we need to add literal
input sections to their corresponding output sections. So instead of
putting all input sections into the big `inputSections` vector, and then
filtering them by type later on, I've changed things so that literal
sections get added directly to their output sections during the 'gather'
phase. Likewise for compact unwind sections -- they get added directly
to the UnwindInfoSection now. This latter change is not strictly
necessary, but makes it easier for ICF to determine which functions have
unwind info.
Adding literal sections directly to their output sections means that we
can no longer determine `inputOrder` from iterating over
`inputSections`. Instead, we store that order explicitly on
InputSection. Bloating the size of InputSection for this purpose would
be unfortunate -- but LLD-ELF has already solved this problem: it reuses
`outSecOff` to store this order value.
One downside of this refactor is that we now make an additional pass
over the unwind info relocations to figure out which functions have
unwind info, since we want to know that before `processRelocations()`. I've
made sure to run that extra loop only if ICF is enabled, so there should
be no overhead in non-optimizing runs of the linker.
The upside of all this is that the `inputSections` vector now contains
only ConcatInputSections that are destined for ConcatOutputSections, so
we can clean up a bunch of code that just existed to filter out other
elements from that vector.
I will test for the lack of redundant binds/rebases in the upcoming
cfstring deduplication diff. While binds/rebases can also happen in the
regular `.text` section, they're more common in `.data` sections, so it
seems more natural to test it that way.
This change is perf-neutral when linking chromium_framework.
Reviewed By: oontvoo
Differential Revision: https://reviews.llvm.org/D105044
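To illustrate why literals need to be deduplicated before ICF, here is a minimal sketch under simplifying assumptions. `SketchFunc`, `foldLiteralsSketch` and `foldFuncsSketch` are hypothetical names, not LLD types or functions; a "function" is reduced to a body string plus indices into a literal table, and folding is done by exact key comparison rather than by the real ICF algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a code section: `body` models its bytes and
// `literalRefs` models relocations pointing at entries of a literal table.
struct SketchFunc {
  std::string body;
  std::vector<std::size_t> literalRefs;
};

// Step 1: map every literal index to the first index with equal contents.
std::vector<std::size_t>
foldLiteralsSketch(const std::vector<std::string> &literals) {
  std::map<std::string, std::size_t> firstSeen;
  std::vector<std::size_t> canonical(literals.size());
  for (std::size_t i = 0; i < literals.size(); ++i)
    canonical[i] = firstSeen.emplace(literals[i], i).first->second;
  return canonical;
}

// Step 2: fold functions whose bodies and canonicalized literal references
// are identical. Without step 1, references to equal literals stored at
// different indices would make otherwise identical functions look different.
std::vector<std::size_t>
foldFuncsSketch(const std::vector<SketchFunc> &funcs,
                const std::vector<std::size_t> &canonicalLiteral) {
  using Key = std::pair<std::string, std::vector<std::size_t>>;
  std::map<Key, std::size_t> firstSeen;
  std::vector<std::size_t> canonical(funcs.size());
  for (std::size_t i = 0; i < funcs.size(); ++i) {
    Key key;
    key.first = funcs[i].body;
    for (std::size_t ref : funcs[i].literalRefs)
      key.second.push_back(canonicalLiteral[ref]);
    canonical[i] = firstSeen.emplace(std::move(key), i).first->second;
  }
  return canonical;
}
```

In this sketch, two functions with the same body that reference literal 0 and literal 2 fold together only when those two literals have already been mapped to the same canonical index.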
2021-07-02 08:33:42 +08:00
|
|
|
gatherInputSections();
|
2022-01-12 23:47:04 +08:00
|
|
|
if (config->callGraphProfileSort)
|
|
|
|
extractCallGraphProfile();
|
2021-03-11 22:04:27 +08:00
|
|
|
|
[lld/mac] Implement -dead_strip
Also adds support for live_support sections, no_dead_strip sections,
.no_dead_strip symbols.
Chromium Framework 345MB unstripped -> 250MB stripped
(vs 290MB unstripped -> 236MB stripped with ld64).
Doing dead stripping is a bit faster than not, because so much less
data needs to be processed:
    % ministat lld_*
    x lld_nostrip.txt
    + lld_strip.txt
        N           Min           Max        Median           Avg        Stddev
    x  10      3.929414       4.07692     4.0269079     4.0089678   0.044214794
    +  10     3.8129408     3.9025559     3.8670411     3.8642573   0.024779651
    Difference at 95.0% confidence
            -0.144711 +/- 0.0336749
            -3.60967% +/- 0.839989%
            (Student's t, pooled s = 0.0358398)
This interacts with many parts of the linker. I tried to add test coverage
for all added `isLive()` checks, so that some test will fail if any of them
is removed. I checked that the test expectations for the most part match
ld64's behavior (except for live-support-iterations.s, see the comment
in the test). Interacts with:
- debug info
- export tries
- import opcodes
- flags like -exported_symbol(s_list)
- -U / dynamic_lookup
- mod_init_funcs, mod_term_funcs
- weak symbol handling
- unwind info
- stubs
- map files
- -sectcreate
- undefined, dylib, common, defined (both absolute and normal) symbols
It's possible it interacts with more features I didn't think of,
of course.
I also did some manual testing:
- check-llvm check-clang check-lld work with lld with this patch
as host linker and -dead_strip enabled
- Chromium still starts
- Chromium's base_unittests still pass, including unwind tests
Implementation-wise, this is InputSection-based, so it'll work for
object files with .subsections_via_symbols (which includes all
object files generated by clang). I first based this on the COFF
implementation, but later realized that things are more similar to ELF.
I think it'd be good to refactor MarkLive.cpp to look more like the ELF
part at some point, but I'd like to get a working state checked in first.
Mechanical parts:
- Rename canOmitFromOutput to wasCoalesced (no behavior change)
since it really is for weak coalesced symbols
- Add noDeadStrip to Defined, corresponding to N_NO_DEAD_STRIP
(`.no_dead_strip` in asm)
Fixes PR49276.
Differential Revision: https://reviews.llvm.org/D103324
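The InputSection-based approach described above can be sketched as a worklist traversal. This is a simplified illustration, not the code in MarkLive.cpp: `SketchSection` and `markLiveSketch` are hypothetical names, sections are identified by index, and the special cases listed in the message (live_support, no_dead_strip, exported symbols and so on) are assumed to have been reduced to a plain list of roots.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an input section: `refs` lists the indices of
// the sections that this section's relocations point at.
struct SketchSection {
  std::vector<std::size_t> refs;
  bool live = false;
};

// Marks every section reachable from `roots` as live. Sections that are
// never reached keep live == false and would be dropped from the output.
void markLiveSketch(std::vector<SketchSection> &sections,
                    const std::vector<std::size_t> &roots) {
  std::vector<std::size_t> worklist;
  auto enqueue = [&](std::size_t idx) {
    if (sections[idx].live)
      return; // already visited, which also terminates reference cycles
    sections[idx].live = true;
    worklist.push_back(idx);
  };
  for (std::size_t root : roots)
    enqueue(root);
  while (!worklist.empty()) {
    std::size_t idx = worklist.back();
    worklist.pop_back();
    for (std::size_t target : sections[idx].refs)
      enqueue(target);
  }
}
```

Because each section is enqueued at most once, the traversal is linear in the number of sections plus references, which fits the observation above that stripping can make the overall link slightly faster.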
2021-05-08 05:10:05 +08:00
|
|
|
if (config->deadStrip)
|
|
|
|
markLive();
|
|
|
|
|
[lld-macho] Move ICF earlier to avoid emitting redundant binds
2021-07-02 08:33:42 +08:00
|
|
|
// ICF assumes that all literals have been folded already, so we must run
|
|
|
|
// foldIdenticalLiterals before foldIdenticalSections.
|
|
|
|
foldIdenticalLiterals();
|
|
|
|
if (config->icfLevel != ICFLevel::none)
|
|
|
|
foldIdenticalSections();
|
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
// Write to an output file.
|
2021-04-03 06:46:18 +08:00
|
|
|
if (target->wordSize == 8)
|
|
|
|
writeResult<LP64>();
|
|
|
|
else
|
|
|
|
writeResult<ILP32>();
|
2021-03-23 10:05:46 +08:00
|
|
|
|
|
|
|
depTracker->write(getLLDVersion(), inputFiles, config->outputFile);
|
[lld-macho][re-land] Support .subsections_via_symbols
Summary:
This diff restores and builds upon @pcc and @ruiu's initial work on
subsections.
The .subsections_via_symbols directive indicates we can split each
section along symbol boundaries, unless those symbols have been marked
with `.alt_entry`.
We exercise this functionality in our tests by using order files that
rearrange those symbols.
Depends on D79668.
Reviewers: ruiu, pcc, MaskRay, smeenai, alexshap, gkm, Ktwu, christylee
Reviewed By: smeenai
Subscribers: thakis, llvm-commits, pcc, ruiu
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79926
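The splitting rule described above (cut at symbol boundaries unless the symbol is marked `.alt_entry`) can be sketched as follows. The names `SketchSymbol`, `SketchRange` and `splitAtSymbols` are hypothetical, and the sketch assumes the symbols are already sorted by offset and lie inside the section; it is not the parsing code LLD uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical symbol record: offset within the section, and whether the
// symbol was marked with .alt_entry (so it must stay with its predecessor).
struct SketchSymbol {
  std::size_t offset;
  bool altEntry;
};

// Half-open byte range [begin, end) of one subsection.
struct SketchRange {
  std::size_t begin;
  std::size_t end;
};

// Splits a section of `sectionSize` bytes at every non-alt_entry symbol.
// Assumes `symbols` is sorted by offset and every offset < sectionSize.
std::vector<SketchRange>
splitAtSymbols(std::size_t sectionSize,
               const std::vector<SketchSymbol> &symbols) {
  std::vector<std::size_t> cuts;
  cuts.push_back(0); // the section start always begins a subsection
  for (const SketchSymbol &sym : symbols)
    if (!sym.altEntry && sym.offset > cuts.back())
      cuts.push_back(sym.offset);

  std::vector<SketchRange> ranges;
  for (std::size_t i = 0; i < cuts.size(); ++i) {
    std::size_t end = i + 1 < cuts.size() ? cuts[i + 1] : sectionSize;
    ranges.push_back({cuts[i], end});
  }
  return ranges;
}
```

For a 16-byte section with symbols at offsets 0, 4, 8 (alt_entry) and 12, the sketch yields the subsections [0, 4), [4, 12) and [12, 16), so an order file could rearrange those three pieces independently.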
2020-05-19 23:46:07 +08:00
|
|
|
}
|
2020-04-03 02:54:05 +08:00
|
|
|
|
2021-03-11 22:04:27 +08:00
|
|
|
if (config->timeTraceEnabled) {
|
2021-10-04 23:45:55 +08:00
|
|
|
checkError(timeTraceProfilerWrite(
|
|
|
|
args.getLastArgValue(OPT_time_trace_file_eq).str(),
|
|
|
|
config->outputFile));
|
2021-03-11 22:04:27 +08:00
|
|
|
|
|
|
|
timeTraceProfilerCleanup();
|
|
|
|
}
|
2022-01-21 03:53:18 +08:00
|
|
|
return errorCount() == 0;
|
2020-04-03 02:54:05 +08:00
|
|
|
}
|