Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection. We can move other global variables into ctx without
worrying about indirection. In the long term we may consider passing Ctx
as a parameter to various functions and eliminate global state as
much as possible and then remove `Ctx::reset`.
`config` has 1000+ uses so we try to avoid changing `config->foo`. Define a
wrapper with LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr
indirection.
My x86-64 lld executable is 11+KiB smaller.
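A minimal sketch of the wrapper idea, assuming a much smaller `Config` than the real one. `SKETCH_LIBRARY_VISIBILITY` is a hypothetical stand-in for LLVM_LIBRARY_VISIBILITY, and the field names and `demoConfigAccess` are illustrative only:

```cpp
#include <cassert>

// Hypothetical stand-in for LLVM_LIBRARY_VISIBILITY: hidden visibility on
// GCC/Clang for non-Windows targets, nothing elsewhere.
#if defined(__GNUC__) && !defined(_WIN32)
#define SKETCH_LIBRARY_VISIBILITY __attribute__((visibility("hidden")))
#else
#define SKETCH_LIBRARY_VISIBILITY
#endif

// Illustrative configuration with two made-up fields.
struct Config {
  bool verbose = false;
  unsigned optLevel = 0;
};

// The wrapper stores Config by value, so `config->foo` is a fixed offset from
// the wrapper's address rather than a load through a unique_ptr.
struct ConfigWrapper {
  Config c;
  Config *operator->() { return &c; }
};

// With hidden visibility the global can be addressed directly inside the
// library instead of through the GOT.
SKETCH_LIBRARY_VISIBILITY ConfigWrapper config;

// Existing `config->foo` call sites keep compiling unchanged.
inline void demoConfigAccess() {
  config->optLevel = 2;
  assert(config->optLevel == 2);
}
```

The point of `operator->` is that none of the existing `config->foo` uses need to change.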
Symbol::replace intends to overwrite a few fields (mostly Elf{32,64}_Sym
fields), but the implementation copies all fields then restores some old fields.
This is error-prone and wasteful. Add Symbol::overwrite to copy just the
needed fields and add other overwrite member functions to copy the extra
fields.
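A rough sketch of the difference, using a hypothetical and heavily simplified `Sym` struct. The field names and the direction of the copy are assumptions for illustration, not the actual Symbol layout:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified symbol.
struct Sym {
  // Fields that mirror the ELF symbol entry and should follow the new symbol.
  uint8_t binding = 0;
  uint8_t type = 0;
  uint8_t stOther = 0;
  uint64_t value = 0;
  // Bookkeeping that must survive a replacement.
  bool isUsedInRegularObj = false;
  uint32_t versionId = 0;

  // Copy only the fields a replacement is meant to change, instead of
  // copying everything and then restoring the old bookkeeping fields.
  void overwrite(Sym &dst) const {
    dst.binding = binding;
    dst.type = type;
    dst.stOther = stOther;
    dst.value = value;
  }
};

inline void demoOverwrite() {
  Sym oldSym;
  oldSym.isUsedInRegularObj = true;
  Sym newSym;
  newSym.value = 0x1000;
  newSym.overwrite(oldSym);
  assert(oldSym.value == 0x1000 && oldSym.isUsedInRegularObj);
}
```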
For RVC, the GNU assembler and the LLVM integrated assembler emit c.nop followed
by a sequence of 4-byte nops. Even if remove % 4 == 0, we have to split one 4-byte
nop and therefore need to write the code sequence; otherwise we create an
incorrect c.unimp.
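To illustrate the padding layout described above, here is a sketch that produces an optional 2-byte c.nop followed by 4-byte nops, in little-endian byte order. `makeNopPadding` is a hypothetical helper, not the linker's code, and the ordering simply follows the assembler behavior stated above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns `size` bytes of RISC-V padding: a 2-byte c.nop when size % 4 == 2,
// followed by as many 4-byte nops as needed. `size` must be even.
inline std::vector<uint8_t> makeNopPadding(std::size_t size) {
  assert(size % 2 == 0 && "RVC padding is a multiple of 2 bytes");
  std::vector<uint8_t> out;
  out.reserve(size);
  if (size % 4 == 2) {
    // c.nop is the 16-bit encoding 0x0001.
    out.push_back(0x01);
    out.push_back(0x00);
  }
  while (out.size() < size) {
    // nop (addi x0, x0, 0) is the 32-bit encoding 0x00000013.
    out.insert(out.end(), {0x13, 0x00, 0x00, 0x00});
  }
  return out;
}
```

Cutting a 4-byte nop in half without rewriting the bytes would leave its upper half, 0x0000, in place, which is why the sequence has to be written out explicitly.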
This fixes a regression since fa74144c64dff6b145b0b3fa9397f913ddaa87bf;
even if we're linking to the dylib (which handles all the dependencies
in LLVMSupport), we're now also directly referencing zstd from
lld/ELF, and thus need to explicitly express our dependency on it.
https://reviews.llvm.org/D133003#3806508 can reproduce a non-determinism with
--threads=4. Making the config serial fixes the non-determinism (checked by
running the link many times and comparing the output).
See D117853: compressing debug sections is a bottleneck, so parallelizing
the step has large value.
zstd provides a multi-threading API and the output is deterministic even with
different numbers of threads (see https://github.com/facebook/zstd/issues/2238).
Therefore we can leverage it instead of using the pigz-style sharding approach.
Also, switch to the default compression level 3. The current level 5
is significantly slower without a size benefit that justifies it.
```
'dash b.sh 1' ran
1.05 ± 0.01 times faster than 'dash b.sh 3'
1.18 ± 0.01 times faster than 'dash b.sh 4'
1.29 ± 0.02 times faster than 'dash b.sh 5'
level=1 size: 358946945
level=3 size: 309002145
level=4 size: 307693204
level=5 size: 297828315
```
Reviewed By: andrewng, peter.smith
Differential Revision: https://reviews.llvm.org/D133679
In GNU ld,
* --version skips linker input processing.
* -v and -V keep processing if there is any input file. -V prints additional
information that we don't support.
We currently make -V an alias for --version which skips input processing.
On many `*-freebsd` and `powerpc-*` targets, `gcc -v` passes `-V` to ld
and expects to process input. Make -V an alias for -v to provide
compatibility.
Fix https://github.com/llvm/llvm-project/issues/57859
They may modify ThinLTO optimization.
This patch only extends support for `-mllvm`. There is another way to
pass llvm flags, `-plugin-opt`, but its processing is different and will
be provided in a subsequent patch.
Differential Revision: https://reviews.llvm.org/D134013
This improves consistency with other places (e.g. llvm::compression::decompress,
llvm::object::Decompressor::decompress, llvm-objcopy).
Note: when zstd::uncompress was added, we noticed that the API `ZSTD_decompress`
is fine while the zlib API `uncompress` is a misnomer.
On Unix platforms, this wrapper function is inline, so it should
expand to the same direct access to the thread local variable. On
Windows, it's a non-inline function within Parallel.cpp, which allows
the thread_local variable to be made static.
Windows Native TLS doesn't support direct access to thread local
variables in a different DLL, and GCC/binutils on Windows occasionally
have problems with non-static thread local variables too.
This fixes mingw dylib builds with native TLS after
e6aebff674.
At the same time, move the whole thread local variable within
#if LLVM_ENABLE_THREADS
to fix builds without threading support.
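A single-file sketch of the accessor pattern, under the assumption that a plain `unsigned` index is all that needs to be exposed. The names in namespace `sketch` are hypothetical; in a real library the Windows branch would be split between a header declaration and one .cpp definition:

```cpp
#include <cassert>

namespace sketch {
#ifdef _WIN32
// The variable has internal linkage, so no other DLL accesses it directly;
// callers go through the non-inline accessor.
static thread_local unsigned threadIndexTLS = 0;
unsigned getThreadIndex() { return threadIndexTLS; }
#else
// The accessor is inline, so it should compile down to the same direct
// access to the thread local variable as before.
thread_local unsigned threadIndexTLS = 0;
inline unsigned getThreadIndex() { return threadIndexTLS; }
#endif

// Illustrative setter; each thread sees and modifies only its own copy.
inline void setThreadIndex(unsigned i) { threadIndexTLS = i; }

inline void demoThreadIndex() {
  setThreadIndex(3);
  assert(getThreadIndex() == 3);
}
} // namespace sketch
```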
Differential Revision: https://reviews.llvm.org/D133759
When the same bitcode object file is given multiple times on the command line
as a lazy object file, an empty index is generated, which overwrites the one from
the thin link. This could cause undefined symbols during the final link.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D133740
* Change `Symbol::flags` to a `std::atomic<uint16_t>`
* Add `llvm::parallel::threadIndex` as a thread-local non-negative integer
* Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex
* Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output.
MIPS and PPC64 use global states for relocation scanning. Keep serial scanning.
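The mutex-free pattern in the list above can be sketched roughly as follows: each worker appends to a vector selected by its thread index, and flag bits are merged with an atomic OR. `ShardedRelocs`, `Reloc` and `SymFlags` are hypothetical names for illustration, not the actual lld types:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Reloc {
  uint64_t offset;
};

class ShardedRelocs {
public:
  explicit ShardedRelocs(std::size_t numThreads) : shards(numThreads) {}

  // Each worker appends only to the shard for its own thread index, so
  // concurrent calls with different indices need no mutex.
  void add(std::size_t threadIndex, Reloc r) {
    assert(threadIndex < shards.size());
    shards[threadIndex].push_back(r);
  }

  // Called after all workers have finished: concatenates shards in index order.
  std::vector<Reloc> merge() const {
    std::vector<Reloc> all;
    for (const std::vector<Reloc> &s : shards)
      all.insert(all.end(), s.begin(), s.end());
    return all;
  }

private:
  std::vector<std::vector<Reloc>> shards;
};

// Flag bits set from several threads are combined with an atomic OR.
struct SymFlags {
  std::atomic<uint16_t> flags{0};
  void set(uint16_t bits) { flags.fetch_or(bits, std::memory_order_relaxed); }
};
```

The merged order depends on which thread handled which input, so the result is only deterministic if the entries are sorted afterwards or the scan is kept serial.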
Speed-up with mimalloc and --threads=8 on an Intel Skylake machine:
* clang (Release): 1.27x as fast
* clang (Debug): 1.06x as fast
* chrome (default): 1.05x as fast
* scylladb (default): 1.04x as fast
Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64):
* clang (Release): 1.31x as fast
* scylladb (default): 1.06x as fast
Reviewed By: andrewng
Differential Revision: https://reviews.llvm.org/D133003
`clang -gz=zstd a.o` passes this option to the linker. This option compresses output
debug sections with zstd and sets ch_type to ELFCOMPRESS_ZSTD. As of today, very
few DWARF consumers recognize ELFCOMPRESS_ZSTD.
Use the llvm::zstd::compress API with level llvm::zstd::DefaultCompression (5),
which we may tune after we have more experience with zstd output.
zstd has built-in parallel compression support (so we don't need to do D117853
for zlib), which is not leveraged yet.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D133548
so that lld accepts relocatable object files produced by `clang -c -g -gz=zstd`.
We don't want to increase the size of InputSection, so do redundant but cheap
ch_type checks instead.
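As an illustration of how cheap a ch_type check is: ch_type is the first 32-bit word of the compression header in both the 32-bit and 64-bit layouts, so it can be re-read on demand. `classifyCompression` is a hypothetical helper and assumes a little-endian object file:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// ch_type values from the ELF specification.
constexpr uint32_t kElfCompressZlib = 1; // ELFCOMPRESS_ZLIB
constexpr uint32_t kElfCompressZstd = 2; // ELFCOMPRESS_ZSTD

enum class Compression { Unknown, Zlib, Zstd };

// Decodes ch_type from the start of a compressed section's contents.
// Assumes little-endian data; nothing is cached on the section itself.
inline Compression classifyCompression(const uint8_t *data, std::size_t size) {
  assert(data != nullptr);
  if (size < 4)
    return Compression::Unknown;
  uint32_t chType = uint32_t(data[0]) | uint32_t(data[1]) << 8 |
                    uint32_t(data[2]) << 16 | uint32_t(data[3]) << 24;
  if (chType == kElfCompressZlib)
    return Compression::Zlib;
  if (chType == kElfCompressZstd)
    return Compression::Zstd;
  return Compression::Unknown;
}
```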
Differential Revision: https://reviews.llvm.org/D129406
GNU uses a different hashing function compared to the SysV standard
function already provided in libObject. This is already used internally
in LLD for generating synthetic sections. This patch simply extracts
this definition and makes it available to other users of `libObject`.
This is done in preparation for supporting symbol name lookups via the
GNU hash table.
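For reference, a standalone sketch of the two hash functions as they are commonly defined. `gnuHash` and `sysvHash` are illustrative names and not necessarily the ones exported by libObject:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// GNU hash: starts at 5381 and computes h = h * 33 + c over the name bytes.
inline uint32_t gnuHash(const std::string &name) {
  uint32_t h = 5381;
  for (unsigned char c : name)
    h = h * 33 + c;
  return h;
}

// SysV hash: shifts in 4 bits per byte and folds the top nibble back in.
inline uint32_t sysvHash(const std::string &name) {
  uint32_t h = 0;
  for (unsigned char c : name) {
    h = (h << 4) + c;
    uint32_t g = h & 0xf0000000;
    if (g != 0)
      h ^= g >> 24;
    h &= ~g;
  }
  return h;
}

inline void demoHashes() {
  assert(gnuHash("") == 5381);
  assert(sysvHash("") == 0);
}
```

With these definitions `gnuHash("exit")` evaluates to 0x7c967e3f.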
Reviewed By: MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D132696
This simplifies SymbolTableSection<ELFT>::writeTo. Add dsoProtected to be used
in canDefineSymbolInExecutable and get the side benefit that the protected DSO
preemption diagnostic is clearer.
VER_NDX_LOCAL/VER_NDX_GLOBAL cannot be hidden, so we can compare them with
versyms[i] instead of versyms[i] & ~VERSYM_HIDDEN. In the presence of an error,
we can suppress addSymbol.