llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	d001ab82e4	[ELF] Don't fall back to .text for e_entry We have the rule to simulate GNU ld (https://sourceware.org/binutils/docs/ld/Entry-Point.html), but the behavior is questionable (https://sourceware.org/pipermail/binutils/2021-September/117929.html). gold doesn't fall back to .text. The behavior is unlikely to be relied on by projects (there is even a warning for executable links), so let's just delete this fallback path. Reviewed By: jhenderson, peter.smith Differential Revision: https://reviews.llvm.org/D110014	2021-09-20 09:35:12 -07:00
Jessica Clarke	cfaa5bf4ce	[ELF] Align the first section of a PT_TLS even if its type is SHT_NOBITS This is somewhat of a repeat of D66658 but for sections in PT_TLS segments. Although such sections don't need to be aligned such that address and offset are congruent modulo the page size, they do need to be congruent modulo the segment alignment, otherwise the whole PT_TLS will be unaligned. We therefore use the normal calculation to determine the section's address within the PT_LOAD rather than bailing out early due to being SHT_NOBITS. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D106987	2021-07-29 15:14:00 +01:00
Amilendra Kodithuwakku	b9cf1769de	[lld][ELF] remove empty SyntheticSections from inputSections Change removeUnusedSyntheticSections() to actually remove empty SyntheticSections in inputSections. In addition to doing what removeUnusedSyntheticSections() was meant to do, this will also make the shuffle-sections tests, which shuffles inputSections, less sensitive to empty Synthetic Sections that will not appear in the final image. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D106427 Change-Id: I589eaf596472161a4395fb658aea0fad73318088	2021-07-27 23:29:02 +01:00
Fangrui Song	f8cb78e99a	[ELF] Don't define __rela_iplt_start for -pie/-shared `clang -fuse-ld=lld -static-pie -fpie` produced executable currently crashes and this patch makes it work. See https://sourceware.org/bugzilla/show_bug.cgi?id=27164 and https://sourceware.org/pipermail/libc-alpha/2021-July/128810.html While it seems unreasonable to keep csu/libc-start.c ARCH_APPLY_IREL unclear in static-pie mode and have an unneeded diff -u =(ld.bfd --verbose) =(ld.bfd -pie --verbose) difference, glibc folks don't want to fix their code. I feel sad about that but this patch can remove an iffy condition for lld/ELF as well: `needsInterpSection()`.	2021-07-15 11:31:11 -07:00
Alex Richardson	35c5e564e6	[ELF] Check the Elf_Rel addends for dynamic relocations There used to be many cases where addends for Elf_Rel were not emitted in the final object file (mostly when building for MIPS64 since the input .o files use RELA but the output uses REL). These cases have been fixed since, but this patch adds a check to ensure that the written values are correct. It is based on a previous patch that I added to the CHERI fork of LLD since we were using MIPS64 as a baseline. The work has now almost entirely shifted to RISC-V and Arm Morello (which use Elf_Rela), but I thought it would be useful to upstream our local changes anyway. This patch adds a (hidden) command line flag --check-dynamic-relocations that can be used to enable these checks. It is also on by default in assertions builds for targets that handle all dynamic relocation kinds that LLD can emit in Target::getImplicitAddend(). Currently this is enabled for ARM, MIPS, and I386. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101450	2021-07-09 10:41:40 +01:00
Alex Richardson	6d87ca08ae	[ELF] Refactor DynamicReloc to fix incorrect relocation addends This patch changes the DynamicReloc class to store an enum instead of the overloaded useSymVA member to make it easier to understand and fix incorrect addends being written in some corner cases. The change is motivated by a follow-up review that checks the value of implicit Elf_Rel addends written to the output file. This patch fixes an incorrect output when using `-z rela` for i386 files with R_386_GOT32 relocations (not that this really matters since it's an unsupported configuration). Storing the relocation expression kind also addresses an incorrect addend FIXME in ppc64-abs64-dyn.s introduced in D63383. DynamicReloc now also has a special case for the MIPS TLS relocations (DynamicReloc::AgainstSymbolWithTargetVA) since the R_MIPS_TLS_TPREL{32/64} relocations write the symbol VA to the GOT for preemptible symbols. I'm not sure if the symbol value actually should be written for R_MIPS_TLS_TPREL32, but this patch does not attempt to change that behaviour. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100490	2021-07-09 10:41:40 +01:00
Konstantin Schwarz	5d621ed85d	[ELF] Consider that NOLOAD sections should be placed in a PT_LOAD segment During PHDR creation, the case where an output section does not require a PT_LOAD header but still occupies memory in the current VMA region was not handled. If such an output section interleaves two output sections that have the same VMA and LMA regions set, we would previously re-use the existing PT_LOAD header for the second output section. However, since the memory region is not contiguous, we need to start a new PT_LOAD segment. This fixes https://bugs.llvm.org/show_bug.cgi?id=50558 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D103815	2021-06-16 12:36:45 +02:00
Igor Kudrin	70c23e232e	[LLD] Improve reporting unresolved symbols in shared libraries Currently, when reporting unresolved symbols in shared libraries, an undefined symbol that is first seen in a regular object file shadows the reference to the same symbol in a shared object. As a result, the error for the unresolved symbol in the shared library is not reported. If the referencing sections in regular object files are discarded because of '--gc-sections', no reports about such symbols are generated, and the linker finishes successfully, generating an output image that fails at run time. The patch fixes the issue by keeping the symbols that should be checked separately for each shared library. Differential Revision: https://reviews.llvm.org/D101996	2021-05-11 12:48:29 +07:00
Jez Ng	9b6dde8af8	[lld-macho] Parallelize UUID hash computation This reuses the approach (and some code) from LLD-ELF. It's a decent win when linking chromium_framework on a Mac Pro (3.2 GHz 16-Core Intel Xeon W): N Min Max Median Avg Stddev x 20 4.58 4.83 4.66 4.6685 0.066591844 + 20 4.42 4.61 4.5 4.505 0.04751731 Difference at 95.0% confidence -0.1635 +/- 0.0370242 -3.5022% +/- 0.793064% (Student's t, pooled s = 0.0578462) The output binary is 381MB. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D99279	2021-03-31 15:48:36 -04:00
Fangrui Song	16c30c3c23	[ELF] Change --shuffle-sections=<seed> to --shuffle-sections=<section-glob>=<seed> `--shuffle-sections=<seed>` applies to all sections. The new `--shuffle-sections=<section-glob>=<seed>` makes shuffling selective. To the best of my knowledge, the option is only used for debugging, so just drop the original form. `--shuffle-sections '.init_array=-1'` `--shuffle-sections '.fini_array=-1'` reverses static constructors/destructors of the same priority. Useful to detect some static initialization order fiasco. `--shuffle-sections '.data=-1'` reverses `.data` sections. Useful to detect unfounded pointer comparison results of two unrelated objects. If certain sections have an intrinsic order, the old form cannot be used. Differential Revision: https://reviews.llvm.org/D98679	2021-03-18 10:18:19 -07:00
Fangrui Song	423cb321df	[ELF] Special case --shuffle-sections=-1 to reverse input sections If the number of sections changes, which is common for re-links after incremental updates, the section order may change drastically. Special case -1 to reverse input sections. This is a stable transform. The section order is more resilient to incremental updates. Usually the code issue (e.g. Static Initialization Order Fiasco, assuming pointer comparison result of two unrelated objects) is due to the relative order between two problematic input files A and B. Checking the regular order and the reversed order is sufficient. Differential Revision: https://reviews.llvm.org/D98445	2021-03-17 09:32:44 -07:00
Nico Weber	cb4df6eb8d	fix comment typos to cycle bots	2021-02-18 14:25:21 -05:00
Bob Haarman	8e0b179315	[ELF] report section sizes when output file too large Fixes PR48523. When the linker errors with "output file too large", one question that comes to mind is how the section sizes differ from what they were previously. Unfortunately, this information is lost when the linker exits without writing the output file. This change makes it so that the error message includes the sizes of the largest sections. Reviewed By: MaskRay, grimar, jhenderson Differential Revision: https://reviews.llvm.org/D94560	2021-01-21 19:47:03 +00:00
Georgii Rymar	ed146d6291	[LLD][ELF] - Use LLVM_ELF_IMPORT_TYPES_ELFT instead of multiple types definitions. NFCI. We can reduce the number of "using" declarations. `LLVM_ELF_IMPORT_TYPES_ELFT` was extended in D93801. Differential revision: https://reviews.llvm.org/D93856	2020-12-29 10:50:07 +03:00
Fangrui Song	55d310adc0	[ELF] Fix interaction between --unresolved-symbols= and --[no-]allow-shlib-undefined As mentioned in https://reviews.llvm.org/D67479#1667256 , * `--[no-]allow-shlib-undefined` control the diagnostic for an unresolved symbol in a shared object * `-z defs/-z undefs` control the diagnostic for an unresolved symbol in a regular object file * `--unresolved-symbols=` controls both bits. In addition, make --warn-unresolved-symbols affect --no-allow-shlib-undefined. This patch makes the behavior match GNU ld. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D91510	2020-11-17 12:20:57 -08:00
James Henderson	439341b9bf	[lld][ELF] Add additional time trace categories I noticed when running a large link with the --time-trace option that there were several areas which were missing any specific time trace categories (aside from the generic link/ExecuteLinker categories). This patch adds new categories to fill most of the "gaps", or to provide more detail than was previously provided. Reviewed by: MaskRay, grimar, russell.gallop Differential Revision: https://reviews.llvm.org/D90686	2020-11-10 10:28:46 +00:00
Fangrui Song	2fc704a0a5	[ELF] --emit-relocs: fix st_value of STT_SECTION in the presence of a gap before the first input section In the presence of a gap, the st_value field of a STT_SECTION symbol is the address of the first input section (incorrect if there is a gap). Set it to the output section address instead. In -r mode, this bug can cause an incorrect non-zero st_value of a STT_SECTION symbol (while output sections have zero addresses, input sections may have non-zero outSecOff). The non-zero st_value can cause the final link to have incorrect relocation computation (both GNU ld and LLD add st_value of the STT_SECTION symbol to the output section address). Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D90520	2020-11-02 08:37:15 -08:00
Snehasish Kumar	070555c6c0	[lld] Make -z keep-text-section-prefix recognize .text.split. as a prefix. ".text.split." holds symbols which are split out from functions in other input sections. For example, with -fsplit-machine-functions, placing the cold parts in .text.split instead of .text.unlikely mitigates profile inaccuracy. Techniques such as hugepage remapping can make conservative decisions at the section granularity. Differential Revision: https://reviews.llvm.org/D87840	2020-09-24 15:02:48 -07:00
Fangrui Song	15f0ad2fa2	[ELF] Bump the limit of thunk creation passes from 10 to 15 I have noticed that a 374MiB powerpc64le 'ld.lld' requires 11 passes to link. There is a ThunkSection (whose parent OutputSection is ".text" of 169MiB) with 12867 thunks.	2020-09-16 14:05:22 -07:00
Fangrui Song	e59d9df774	[ELF] --symbol-ordering-file: optimize a loop	2020-09-07 21:47:30 -07:00
Fangrui Song	ec29538af2	[ELF] Assign file offsets of non-SHF_ALLOC after SHF_ALLOC and set sh_addr=0 to non-SHF_ALLOC * GNU ld places non-SHF_ALLOC sections after SHF_ALLOC sections. This has the advantage that the file offsets of a non-SHF_ALLOC cannot be contained in a PT_LOAD. This patch matches the behavior. * For non-SHF_ALLOC non-orphan sections, GNU ld may assign non-zero sh_addr and treat them similar to SHT_NOBITS (not advance location counter). This is an alternative approach to what we have done in D85100. By placing non-SHF_ALLOC sections at the end, we can drop special cases in createSection and findOrphanPos added by D85100. Different from GNU ld, we set sh_addr to 0 for non-SHF_ALLOC sections. 0 arguably is better because non-SHF_ALLOC sections don't appear in the memory image. ELF spec says: > sh_addr - If the section will appear in the memory image of a process, this > member gives the address at which the section's first byte should > reside. Otherwise, the member contains 0. D85100 appeared to take a detour. If we take a combined view on D85100 and this patch, the overall complexity slightly increases (one more 3-line loop) and compatibility with GNU ld improves. The behavior we don't want to match is the special treatment of .symtab .shstrtab .strtab: they can be matched in LLD but not in GNU ld. Reviewed By: jhenderson, psmith Differential Revision: https://reviews.llvm.org/D85867	2020-08-18 09:03:01 -07:00
Fangrui Song	e8a11c0558	[ELF] Allow mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER sections and sort within InputSectionDescription LLD currently does not allow non-contiguous SHF_LINK_ORDER components in an output section. This makes it infeasible to add SHF_LINK_ORDER to an existing metadata section if backward compatibility with older object files is a concern. We did not allow mixed components (like GNU ld) and D77007 relaxed this to allow non-contiguous SHF_LINK_ORDER components. This patch allows an arbitrary mix, with sorting performed within an InputSectionDescription. For example, `.rodata : {(.rodata.foo) (.rodata.bar)}`, has two InputSectionDescription's. If there is at least one SHF_LINK_ORDER and at least one non-SHF_LINK_ORDER in .rodata.foo, they are ordered within `(.rodata.foo)`: we arbitrarily place SHF_LINK_ORDER components before non-SHF_LINK_ORDER components (like Solaris ld). `(.rodata.bar)` is ordered similarly, but the two InputSectionDescription's don't interact. It can be argued that this is more reasonable than the previous behavior where written order was not respected. It would be nice if the two different semantics (ordering requirement & garbage collection) were not overloaded on one section flag, however, it is probably difficult to obtain a generic flag at this point (https://groups.google.com/forum/#!topic/generic-abi/hgx_m1aXqUo "SHF_LINK_ORDER's original semantics make upgrade difficult"). (Actually, without the GC semantics, SHF_LINK_ORDER would still have the sh_link!=0 & sh_link=0 issue. It is just that people find the GC semantics more useful and tend to use the feature more often.) GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=16833 Differential Revision: https://reviews.llvm.org/D84001	2020-08-17 11:29:05 -07:00
Georgii Rymar	c135a68d42	[LLD][ELF] - Do not produce an invalid dynamic relocation order with --shuffle-sections. Normally (when not on android with android relocation packing enabled), we put IRelative relocations to ".rel[a].dyn", after other relocations, to ensure that IRelatives are processed last by the dynamic loader. To achieve that we add the `in.relaIplt` after the `part.relaDyn`: https://github.com/llvm/llvm-project/blob/master/lld/ELF/Writer.cpp#L540 The problem is that `--shuffle-sections` might break the sections order. This patch fixes it. Fixes https://bugs.llvm.org/show_bug.cgi?id=47056. Differential revision: https://reviews.llvm.org/D85651	2020-08-17 14:46:52 +03:00
Fangrui Song	a6db64ef4a	[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD (PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to SHF_ALLOC NOBITS sections. The location counter is not advanced). This patch tries to fix PR37607 (remove a special case in `Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot reset dot to 0 for a middle non-SHF_ALLOC output section. This results in removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC non-orphan sections can have non-zero addresses like in GNU ld. The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan only. This results in a special case in createSection and findOrphanPos, respectively. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85100	2020-08-06 08:27:15 -07:00
Muhammad Omair Javaid	d9e191cb17	Revert "[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD" This reverts commit `030ddc0a0b`. This breaks http://lab.llvm.org:8011/builders/lldb-arm-ubuntu and http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu Differential Revision: https://reviews.llvm.org/D85100	2020-08-06 16:30:05 +05:00
Fangrui Song	b216c80cc2	[ELF] Allow SHF_LINK_ORDER sections to have sh_link=0 Part of https://bugs.llvm.org/show_bug.cgi?id=41734 The semantics of SHF_LINK_ORDER have been extended to represent metadata sections associated with some other sections (usually text). The associated text section may be discarded (e.g. LTO) and we want the metadata section to have sh_link=0 (D72899, D76802). Normally the metadata section is only referenced by the associated text section. sh_link=0 means the associated text section is discarded, and the metadata section will be garbage collected. If there is another section (.gc_root) referencing the metadata section, the metadata section will be retained. It's the .gc_root consumer's job to validate the metadata sections. # This creates a SHF_LINK_ORDER .meta with sh_link=0 .section .meta,"awo",@progbits,0 1: .section .meta,"awo",@progbits,foo 2: .section .gc_root,"a",@progbits .quad 1b .quad 2b Reviewed By: pcc, jhenderson Differential Revision: https://reviews.llvm.org/D72904	2020-08-05 16:17:42 -07:00
Fangrui Song	030ddc0a0b	[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD (PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to SHF_ALLOC NOBITS sections. The location counter is not advanced). This patch tries to fix PR37607 (remove a special case in `Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot reset dot to 0 for a middle non-SHF_ALLOC output section. This results in removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC non-orphan sections can have non-zero addresses like in GNU ld. The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan only. This results in a special case in createSection and findOrphanPos, respectively. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85100	2020-08-05 09:30:23 -07:00
Fangrui Song	acb66b9111	[ELF] --oformat=binary: use LMA to compute file offsets --oformat=binary is rare (used in a few places in FreeBSD, see `stand/i386/mbr/Makefile` `LDFLAGS_BIN`). The result should be identical to a normal output transformed by `objcopy -O binary`. The current implementation ignores addresses and lays out sections by respecting output section alignments. It can fail when an output section address is specified, e.g. `.rodata ALIGN(16) :` (PR33651). Fix PR33651 by respecting LMA. The code is similar to `tools/llvm-objcopy/ELF/Object.cpp` BinaryWriter::finalize after D71035 and D79229. Unfortunately for an output section without PT_LOAD, we assume its LMA is equal to its VMA. So the result is still incorrect when an output section LMA (`AT(...)`) is specified. Also drop `alignTo(off, config->wordsize)`. GNU ld does not round up the file size. Differential Revision: https://reviews.llvm.org/D85086	2020-08-05 09:10:01 -07:00
Petr Hosek	fffd05d525	[ELF] Add -z start-stop-visibility= to set __start_/__stop_ symbol visibility This matches the equivalent flag implemented in GNU linkers, see https://sourceware.org/pipermail/binutils/2020-June/111685.html for the associated discussion. Differential Revision: https://reviews.llvm.org/D55682	2020-06-23 15:59:59 -07:00
Fangrui Song	c4d13f72a6	[ELF] Refactor ObjFile<ELFT>::initializeSymbols to enforce the invariant: InputFile::symbols has non null entry Fixes PR46348. ObjFile<ELFT>::initializeSymbols contains two symbol iteration loops: ``` for each symbol if non-inheriting && non-local fill in this->symbols[i] for each symbol if local fill in this->symbols[i] else symbol resolution ``` Symbol resolution can trigger a duplicate symbol error which will call InputSectionBase::getObjMsg to iterate over InputFile::symbols. If a non-local symbol appears after the non-local symbol being resolved (violating ELF spec), its `this->symbols[i]` entry has not been filled in, InputSectionBase::getObjMsg will crash due to `dyn_cast<Defined>(nullptr)`. To fix the bug, reorganize the two loops to ensure this->symbols is complete before symbol resolution. This enforces the invariant: InputFile::symbols has no null entry when InputFile::getSymbols() is called. ``` for each symbol if non-inheriting fill in this->symbols[i] for each symbol starting from firstGlobal if non-local symbol resolution ``` Additionally, move the (non-local symbol in local part of .symtab) diagnostic from Writer<ELFT>::copyLocalSymbols() to initializeSymbols(). Reviewed By: grimar, jhenderson Differential Revision: https://reviews.llvm.org/D81988	2020-06-19 09:05:37 -07:00
Fangrui Song	3eb4bf13ba	[ELF] Append " [--no-allow-shlib-undefined]" to the corresponding diagnostics --no-allow-shlib-undefined (enabled by default when linking an executable) rejects unresolved references in shared objects. Users may be confused by the common diagnostics of unresolved symbols in object files (LLD: "undefined symbol: foo"; GNU ld/gold: "undefined reference to") Learn from GCC/clang " [-Wfoo]": append the option name to the diagnostics. Users can find relevant information by searching "--no-allow-shlib-undefined". It should also be obvious to them that the positive form --allow-shlib-undefined can suppress the error. Also downgrade the error to a warning if --noinhibit-exec is used (compatible with GNU ld and gold). Reviewed By: grimar, psmith Differential Revision: https://reviews.llvm.org/D81028	2020-06-03 07:59:37 -07:00
Fangrui Song	bae7cf6746	[ELF][PPC64] Synthesize _savegpr[01]_{14..31} and _restgpr[01]_{14..31} In the 64-bit ELF V2 ABI Specification: Power Architecture, 2.3.3.1. GPR Save and Restore Functions defines some special functions which may be referenced by GCC produced assembly (LLVM does not reference them). With GCC -Os, when the number of call-saved registers exceeds a certain threshold, GCC generates `_savegpr0_* _restgpr0_*` calls and expects the linker to define them. See https://sourceware.org/pipermail/binutils/2002-February/017444.html and https://sourceware.org/pipermail/binutils/2004-August/036765.html . This is weird because libgcc.a would be the natural place. However, the linker generation approach has the advantage that the linker can generate multiple copies to avoid long branch thunks. We don't consider the advantage significant enough to complicate our trunk implementation, so we take a simple approach. Check whether `_savegpr0_{14..31}` are used * If yes, define needed symbols and add an InputSection with the code sequence. `_savegpr1_*` `_restgpr0_*` and `_restgpr1_*` are similar. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D79977	2020-05-26 09:35:41 -07:00
Fangrui Song	07837b8f49	[ELF] Use namespace qualifiers (lld:: or elf::) instead of `namespace lld { namespace elf {` Similar to D74882. This reverts much code from commit `bd8cfe65f5` (D68323) and fixes some problems before D68323. Sorry for the churn but D68323 was a mistake. Namespace qualifiers avoid bugs where the definition does not match the declaration from the header. See https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions (D74515) Differential Revision: https://reviews.llvm.org/D79982	2020-05-15 08:49:53 -07:00
Wei Mi	538208f6c0	[lld] Add a new output section ".text.unknown" for functions with unknown hotness For sampleFDO, because the optimized build uses profile generated from previous release, often we couldn't tell a function without profile was truly cold or just newly created so we had to treat them conservatively and put them in .text section instead of .text.unlikely. The result was when we pursue the best performance by locking .text.hot and .text in memory, we wasted a lot of memory to keep cold functions inside. This problem has been largely solved for regular sampleFDO using profile-symbol-list (https://reviews.llvm.org/D66374), but for the case when we use partial profile, we still waste a lot of memory because of it. In https://reviews.llvm.org/D62540, we propose to save functions with unknown hotness information in a special section called ".text.unknown", so that compiler will treat those functions as lukewarm, but runtime can choose not to mlock the special section in memory or use other strategy to save memory. That will solve most of the memory problem even if we use a partial profile. The patch adds the support in lld for the special section. Differential Revision: https://reviews.llvm.org/D79590	2020-05-08 11:14:48 -07:00
Reid Kleckner	932f0276ea	[Support] Move LLD's parallel algorithm wrappers to support Essentially takes the lld/Common/Threads.h wrappers and moves them to the llvm/Support/Parallel.h algorithm header. The changes are: - Remove policy parameter, since all clients use `par`. - Rename the methods to `parallelSort` etc to match LLVM style, since they are no longer C++17 pstl compatible. - Move algorithms from llvm::parallel:: to llvm::, since they have "parallel" in the name and are no longer overloads of the regular algorithms. - Add range overloads - Use the sequential algorithm directly when 1 thread is requested (skips task grouping) - Fix the index type of parallelForEachN to size_t. Nobody in LLVM was using any other parameter, and it made overload resolution hard for for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t. Remove Threads.h and update LLD for that. This is a prerequisite for parallel public symbol processing in the PDB library, which is in LLVM. Reviewed By: MaskRay, aganea Differential Revision: https://reviews.llvm.org/D79390	2020-05-05 15:21:05 -07:00
Fangrui Song	c49f83b6e9	[ELF] Don't advance sh_offset for an empty section whose PT_LOAD is removed (due to p_memsz=0) removeEmptyPTLoad() removes empty (p_memsz=0) PT_LOAD segments. In assignFileOffsets(), setFileOffset() unnecessarily advances file offsets for containing empty sections. This is exposed by arm Linux kernel's multi_v5_defconfig (see https://bugs.llvm.org/show_bug.cgi?id=45632) ``` ld.lld (max-page-size=65536): [34] .init.data PROGBITS c0c24000 c34000 0128ac 00 WA 0 0 4096 [35] .text_itcm PROGBITS fffe0000 c50000 000000 00 WA 0 0 1 [36] .data_dtcm PROGBITS fffe8000 c58000 000000 00 WA 0 0 1 [37] .data PROGBITS c0c38000 c58000 0647a0 00 WA 0 0 32 arm-linux-gnueabi-ld (max-page-size=65536): [23] .init.data PROGBITS c0c12000 c22000 0128ac 00 WA 0 0 4096 [24] .text_itcm PROGBITS fffe0000 ca2558 000000 00 W 0 0 1 [25] .data_dtcm PROGBITS fffe8000 ca2558 000000 00 W 0 0 1 [26] .data PROGBITS c0c26000 c36000 0647a0 00 WA 0 0 32 ``` This patch clears OutputSection::ptLoad if ptLoad is removed by removeEmptyPTLoad(). Conceptually this removes "dangling" references. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D79254	2020-05-04 08:07:34 -07:00
Peter Smith	3834385f27	[ELF] Move SHF_LINK_ORDER till OutputSection addresses are known Sections with the SHF_LINK_ORDER flag must be ordered in the same relative order as the Sections they have a link to. When using a linker script an arbitrary expression may be used for the virtual address of the OutputSection. In some cases the virtual address does not monotonically increase as the OutputSection index increases, so if we base the ordering of the SHF_LINK_ORDER sections on the index then we can get the order wrong. We fix this by moving SHF_LINK_ORDER resolution till after we have created OutputSection virtual addresses. Differential Revision: https://reviews.llvm.org/D79286	2020-05-04 14:25:25 +01:00
Fangrui Song	b257d3c8a8	[ELF][PPC64] Suppress toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen The current implementation assumes that R_PPC64_TOC16_HA is always followed by R_PPC64_TOC16_LO_DS. This can break with R_PPC64_TOC16_LO: // Load the address of the TOC entry, instead of the value stored at that address addis 3, 2, .LC0@toc@ha # R_PPC64_TOC16_HA addi 3, 3, .LC0@toc@l # R_PPC64_TOC16_LO blr which is used by boringssl's util/fipstools/delocate/delocate.go https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation. In short, this tool converts an assembly file to avoid any potential relocations. The distance to an input .toc is not a constant after linking, so it cannot use an `addis;ld` pair. Instead, it jumps to a stub which loads the TOC entry address with `addis;addi`. This patch checks the presence of R_PPC64_TOC16_LO and suppresses toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen. This approach is conservative and loses some relaxation opportunities but is easy to implement. addis 3, 2, .LC0@toc@ha # no relaxation addi 3, 3, .LC0@toc@l # no relaxation li 9, 0 addis 4, 2, .LC0@toc@ha # can relax but suppressed ld 4, .LC0@toc@l(4) # can relax but suppressed Also note that interleaved R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS is possible and this patch accounts for that. addis 3, 2, .LC1@toc@ha # can relax addis 4, 2, .LC2@toc@ha # can relax ld 3, .LC1@toc@l(3) # can relax ld 4, .LC2@toc@l(4) # can relax Reviewed By: #powerpc, sfertile Differential Revision: https://reviews.llvm.org/D78431	2020-04-30 09:16:51 -07:00
Fangrui Song	b912b887d8	[ELF] Add --print-archive-stats= gold has an option --print-symbol-counts= which prints: // For each archive archive $archive $members $fetched_members // For each object file symbols $object $defined_symbols $used_defined_symbols In most cases, `$defined_symbols = $used_defined_symbols` unless weak symbols are present. Strangely `$used_defined_symbols` includes symbols defined relative to --gc-sections discarded sections. The `symbols` lines do not appear to be useful. `archive` lines are useful: `$fetched_members=0` lines correspond to unused archives. The information can be used to trim dependencies. This patch implements --print-archive-stats= which prints the number of members and the number of fetched members for each archive. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D78983	2020-04-29 18:04:37 -07:00
Igor Kudrin	9f65f5acca	[LLD][ELF] Eliminate symbols of merged .ARM.exidx sections. GNU tools generate mapping symbols "$d" for .ARM.exidx sections. The symbols are added to the symbol table much earlier than the merging takes place, and after that, they become dangling. Before the patch, LLD output those symbols as SHN_ABS with the value of 0. The patch removes such symbols from the symbol table. Differential Revision: https://reviews.llvm.org/D78820	2020-04-28 18:58:40 +07:00
Igor Kudrin	66e4eb9c1b	[LLD][ELF] Implement --discard-* for cases when -r or --emit-relocs are used. When discarding local symbols with --discard-all or --discard-locals, the ones which are used in relocations should be preserved. LLD used the simplest approach and just ignored those switches when -r or --emit-relocs was specified. The patch implements handling the --discard-* switches for the cases when relocations are kept by identifying used local symbols and allowing removing only unused ones. This makes the behavior of LLD compatible with GNU linkers. Differential Revision: https://reviews.llvm.org/D77807	2020-04-25 18:59:41 +07:00
Peter Smith	3b1622d63a	[LLD][ELF][ARM] recommit Fix ARM Exidx order for non monotonic section order Fixed error detected by msan. The size field of the .ARM.exidx synthetic section needs to be initialized to at least estimation level before calling assignAddresses as that will use the size field. This was previously reverted in `1ca16fc4f5`. Differential Revision: https://reviews.llvm.org/D78422	2020-04-24 13:47:28 +01:00
Peter Smith	1ca16fc4f5	Revert "[LLD][ELF][ARM] Fix ARM Exidx order for non monotonic section order" This reverts commit `f969c2aa65`. There are some msan buildbot failures on sanitizer-x86_64-linux-fast that I need to investigate. Differential Revision: https://reviews.llvm.org/D78422	2020-04-23 16:58:50 +01:00
Peter Smith	f969c2aa65	[LLD][ELF][ARM] Fix ARM Exidx order for non monotonic section order The contents of the .ARM.exidx section must be ordered by SHF_LINK_ORDER rules. We don't need to know the precise address for this order, but we do need to know the relative order of sections. We have been using the sectionIndex for this purpose, this works when the OutputSection order has a monotonically increasing virtual address, but it is possible to write a linker script with non-monotonically increasing virtual address. For these cases we need to evaluate the base address of the OutputSection so that we can order the .ARM.exidx sections properly. This change moves the finalisation of .ARM.exidx till after the first call to AssignAddresses. This permits us to sort on virtual address which is linker script safe. It also permits a fix for part of pr44824 where we generate .ARM.exidx section for the vector table when that table is so far away it is out of range of the .ARM.exidx section. This fix will come in a follow up patch. Differential Revision: https://reviews.llvm.org/D78422	2020-04-23 15:46:44 +01:00
Fangrui Song	497c76e96d	[ELF] Keep local symbols when both --emit-relocs and --discard-all are specified This fixes a bug as exposed by D77807. Add tests for {--emit-relocs,-r} x {--discard-locals,--discard-all}. They add coverage for previously undertested cases: * STT_SECTION associated to GCed sections (`gc`) * STT_SECTION associated to retained sections (`text`) * STT_SECTION associated to non-SHF_ALLOC sections (`.comment`) * STB_LOCAL in GCed sections (`unused_gc`) Reviewed By: grimar, ikudrin Differential Revision: https://reviews.llvm.org/D78389	2020-04-21 08:28:12 -07:00
Sriraman Tallam	94317878d8	LLD Support for Basic Block Sections This is part of the Propeller framework to do post link code layout optimizations. Please see the RFC here: https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the detailed RFC doc here: https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf This patch adds lld support for basic block sections and performs relaxations after the basic blocks have been reordered. After the linker has reordered the basic block sections according to the desired sequence, it runs a relaxation pass to optimize jump instructions. Currently, the compiler emits the long form of all jump instructions. AMD64 ISA supports variants of jump instructions with one byte offset or a four byte offset. The compiler generates jump instructions with R_X86_64 32-bit PC relative relocations. We would like to use a new relocation type for these jump instructions as it makes it easy and accurate while relaxing these instructions. The relaxation pass does two things: First, it deletes all explicit fall-through direct jump instructions between adjacent basic blocks. This is done by discarding the tail of the basic block section. Second, if there are consecutive jump instructions, it checks if the first conditional jump can be inverted to convert the second into a fall through and delete the second. The jump instructions are relaxed by using jump instruction mods, something like relocations. These are used to modify the opcode of the jump instruction. Jump instruction mods contain three values, instruction offset, jump type and size. While writing this jump instruction out to the final binary, the linker uses the jump instruction mod to determine the opcode and the size of the modified jump instruction. These mods are required because the input object files are memory-mapped without write permissions and directly modifying the object files requires copying these sections. Copying a large number of basic block sections significantly bloats memory. Differential Revision: https://reviews.llvm.org/D68065	2020-04-07 06:55:57 -07:00
Peter Smith	2539b4ae47	[LLD][ELF] Allow empty (.init\|.preinit\|.fini)_array to be RELRO The default GNU linker script uses the following idiom for the array sections. I'll use .init_array here, but this also applies to .preinit_array and .fini_array sections. .init_array : { PROVIDE_HIDDEN (__init_array_start = .); KEEP (*(.init_array)) PROVIDE_HIDDEN (__init_array_end = .); } The C-library will take references to the _start and _end symbols to process the array. This will make LLD keep the OutputSection even if there are no .init_array sections. As the current check for RELRO uses the section type for .init_array, the above example with no .init_array InputSections fails the checks as there are no .init_array sections to give the OutputSection a type of SHT_INIT_ARRAY. This often leads to a non-contiguous RELRO error message. The simple fix is to do a textual section match as well as a section type match. Differential Revision: https://reviews.llvm.org/D76915	2020-03-31 12:53:12 +01:00
Fangrui Song	673e81eee4	[ELF] Allow SHF_LINK_ORDER and non-SHF_LINK_ORDER to be mixed Currently, `error: incompatible section flags for .rodata` is reported when we mix SHF_LINK_ORDER and non-SHF_LINK_ORDER sections in an output section. This is overconstrained. This patch allows mixed flags with the requirement that SHF_LINK_ORDER sections must be contiguous. Mixing flags is used by Linux aarch64 (https://github.com/ClangBuiltLinux/linux/issues/953) .init.data : { ... KEEP(*(__patchable_function_entries)) ... } When the integrated assembler is enabled, clang's -fpatchable-function-entry=N[,M] implementation sets the SHF_LINK_ORDER flag (D72215) to fix a number of garbage collection issues. Strictly speaking, the ELF specification does not require contiguous SHF_LINK_ORDER sections but for many current uses of SHF_LINK_ORDER like .ARM.exidx/__patchable_function_entries there has been a requirement for the sections to be contiguous on top of the requirements of the ELF specification. This patch also imposes one restriction: SHF_LINK_ORDER sections cannot be separated by a symbol assignment or a BYTE command. Not allowing BYTE is a natural extension of the rule that a non-SHF_LINK_ORDER section cannot be a separator. Symbol assignments can delimit the contents of SHF_LINK_ORDER sections. Allowing SHF_LINK_ORDER sections across symbol assignments (especially __start_/__stop_) can make things hard to explain. The restriction should not be a problem for practical use cases. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D77007	2020-03-30 10:03:55 -07:00
Fangrui Song	9e33c09647	[ELF] Keep orphan section names (.rodata.foo .text.foo) unchanged if linker script has the SECTIONS command This behavior matches GNU ld and seems reasonable. ``` // If a SECTIONS command is not specified .text.* -> .text .rodata.* -> .rodata .init_array.* -> .init_array ``` A proposed Linux feature CONFIG_FG_KASLR may depend on the GNU ld behavior. Reword a comment about -z keep-text-section-prefix and a comment about CommonSection (deleted by rL286234). Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D75225	2020-03-23 10:30:06 -07:00
Sid Manning	5a5a075c5b	[LLD][ELF][Hexagon] Support GDPLT transforms Hexagon ABI specifies that call x@gdplt is transformed to call __tls_get_addr. Example: call x@gdplt is changed to call __tls_get_addr When x is an external tls variable. Differential Revision: https://reviews.llvm.org/D74443	2020-03-13 11:02:11 -05:00
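
The orphan-section naming rule described in `9e33c09647` above can be sketched as follows. This is a minimal illustration in Python, not lld's actual code: the function name `output_section_name`, the `has_sections_command` parameter, and the `FOLDED_PREFIXES` tuple are hypothetical, and the prefix list is limited to the three prefixes named in that commit message.

```python
# Hypothetical model of the naming rule only; lld itself is written in C++.
# Assumption: only the prefixes listed in the commit message are folded.
FOLDED_PREFIXES = (".text", ".rodata", ".init_array")


def output_section_name(input_name: str, has_sections_command: bool) -> str:
    """Return the output section name chosen for an orphan input section."""
    if has_sections_command:
        # With a SECTIONS command, orphan section names are kept unchanged.
        return input_name
    for prefix in FOLDED_PREFIXES:
        # Fold ".text.foo" into ".text", but leave names such as ".textual" alone.
        if input_name == prefix or input_name.startswith(prefix + "."):
            return prefix
    return input_name
```

Under these assumptions, `output_section_name(".text.foo", False)` folds the name to `.text`, while passing `True` for the second argument keeps `.text.foo` as-is.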

1608 Commits