llvm-project

Commit Graph

Author	SHA1	Message	Date
Snehasish Kumar	070555c6c0	[lld] Make -z keep-text-section-prefix recognize .text.split. as a prefix. ".text.split." holds symbols which are split out from functions in other input sections. For example, with -fsplit-machine-functions, placing the cold parts in .text.split instead of .text.unlikely mitigates against poor profile inaccuracy. Techniques such as hugepage remapping can make conservative decisions at the section granularity. Differential Revision: https://reviews.llvm.org/D87840	2020-09-24 15:02:48 -07:00
Fangrui Song	15f0ad2fa2	[ELF] Bump the limit of thunk creation passes from 10 to 15 I have noticed that a 374MiB powerpc64le 'ld.lld' requires 11 passes to link. There is a ThunkSection (whose parent OutputSection is ".text" of 169MiB) with 12867 thunks.	2020-09-16 14:05:22 -07:00
Fangrui Song	e59d9df774	[ELF] --symbol-ordering-file: optimize a loop	2020-09-07 21:47:30 -07:00
Fangrui Song	ec29538af2	[ELF] Assign file offsets of non-SHF_ALLOC after SHF_ALLOC and set sh_addr=0 to non-SHF_ALLOC * GNU ld places non-SHF_ALLOC sections after SHF_ALLOC sections. This has the advantage that the file offsets of a non-SHF_ALLOC cannot be contained in a PT_LOAD. This patch matches the behavior. * For non-SHF_ALLOC non-orphan sections, GNU ld may assign non-zero sh_addr and treat them similar to SHT_NOBITS (not advance location counter). This is an alternative approach to what we have done in D85100. By placing non-SHF_ALLOC sections at the end, we can drop special cases in createSection and findOrphanPos added by D85100. Different from GNU ld, we set sh_addr to 0 for non-SHF_ALLOC sections. 0 arguably is better because non-SHF_ALLOC sections don't appear in the memory image. ELF spec says: > sh_addr - If the section will appear in the memory image of a process, this > member gives the address at which the section's first byte should > reside. Otherwise, the member contains 0. D85100 appeared to take a detour. If we take a combined view on D85100 and this patch, the overall complexity slightly increases (one more 3-line loop) and compatibility with GNU ld improves. The behavior we don't want to match is the special treatment of .symtab .shstrtab .strtab: they can be matched in LLD but not in GNU ld. Reviewed By: jhenderson, psmith Differential Revision: https://reviews.llvm.org/D85867	2020-08-18 09:03:01 -07:00
Fangrui Song	e8a11c0558	[ELF] Allow mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER sections and sort within InputSectionDescription LLD currently does not allow non-contiguous SHF_LINK_ORDER components in an output section. This makes it infeasible to add SHF_LINK_ORDER to an existing metadata section if backward compatibility with older object files are concerned. We did not allow mixed components (like GNU ld) and D77007 relaxed to allow non-contiguous SHF_LINK_ORDER components. This patch allows arbitrary mix, with sorting performed within an InputSectionDescription. For example, `.rodata : {(.rodata.foo) (.rodata.bar)}`, has two InputSectionDescription's. If there is at least one SHF_LINK_ORDER and at least one non-SHF_LINK_ORDER in .rodata.foo, they are ordered within `(.rodata.foo)`: we arbitrarily place SHF_LINK_ORDER components before non-SHF_LINK_ORDER components (like Solaris ld). `(.rodata.bar)` is ordered similarly, but the two InputSectionDescription's don't interact. It can be argued that this is more reasonable than the previous behavior where written order was not respected. It would be nice if the two different semantics (ordering requirement & garbage collection) were not overloaded on one section flag, however, it is probably difficult to obtain a generic flag at this point (https://groups.google.com/forum/#!topic/generic-abi/hgx_m1aXqUo "SHF_LINK_ORDER's original semantics make upgrade difficult"). (Actually, without the GC semantics, SHF_LINK_ORDER would still have the sh_link!=0 & sh_link=0 issue. It is just that people find the GC semantics more useful and tend to use the feature more often.) GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=16833 Differential Revision: https://reviews.llvm.org/D84001	2020-08-17 11:29:05 -07:00
Georgii Rymar	c135a68d42	[LLD][ELF] - Do not produce an invalid dynamic relocation order with --shuffle-sections. Normally (when not on android with android relocation packing enabled), we put IRelative relocations to ".rel[a].dyn", after other relocations, to ensure that IRelatives are processed last by the dynamic loader. To achieve that we add the `in.relaIplt` after the `part.relaDyn`: https://github.com/llvm/llvm-project/blob/master/lld/ELF/Writer.cpp#L540 The problem is that `--shuffle-sections` might break the sections order. This patch fixes it. Fixes https://bugs.llvm.org/show_bug.cgi?id=47056. Differential revision: https://reviews.llvm.org/D85651	2020-08-17 14:46:52 +03:00
Fangrui Song	a6db64ef4a	[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD (PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to SHF_ALLOC NOBITS sections. The location counter is not advanced). This patch tries to fix PR37607 (remove a special case in `Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot reset dot to 0 for a middle non-SHF_ALLOC output section. This results in removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC non-orphan sections can have non-zero addresses like in GNU ld. The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan only. This results in a special case in createSection and findOrphanPos, respectively. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85100	2020-08-06 08:27:15 -07:00
Muhammad Omair Javaid	d9e191cb17	Revert "[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD" This reverts commit `030ddc0a0b`. This breaks http://lab.llvm.org:8011/builders/lldb-arm-ubuntu and http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu Differential Revision: https://reviews.llvm.org/D85100	2020-08-06 16:30:05 +05:00
Fangrui Song	b216c80cc2	[ELF] Allow SHF_LINK_ORDER sections to have sh_link=0 Part of https://bugs.llvm.org/show_bug.cgi?id=41734 The semantics of SHF_LINK_ORDER have been extended to represent metadata sections associated with some other sections (usually text). The associated text section may be discarded (e.g. LTO) and we want the metadata section to have sh_link=0 (D72899, D76802). Normally the metadata section is only referenced by the associated text section. sh_link=0 means the associated text section is discarded, and the metadata section will be garbage collected. If there is another section (.gc_root) referencing the metadata section, the metadata section will be retained. It's the .gc_root consumer's job to validate the metadata sections. # This creates a SHF_LINK_ORDER .meta with sh_link=0 .section .meta,"awo",@progbits,0 1: .section .meta,"awo",@progbits,foo 2: .section .gc_root,"a",@progbits .quad 1b .quad 2b Reviewed By: pcc, jhenderson Differential Revision: https://reviews.llvm.org/D72904	2020-08-05 16:17:42 -07:00
Fangrui Song	030ddc0a0b	[ELF] Allow sections after a non-SHF_ALLOC section to be covered by PT_LOAD GNU ld allows sections after a non-SHF_ALLOC section to be covered by PT_LOAD (PR37607) and assigns addresses to non-SHF_ALLOC output sections (similar to SHF_ALLOC NOBITS sections. The location counter is not advanced). This patch tries to fix PR37607 (remove a special case in `Writer<ELFT>::createPhdrs`). To make the created PT_LOAD meaningful, we cannot reset dot to 0 for a middle non-SHF_ALLOC output section. This results in removal of two special cases in LinkerScript::assignOffsets. Non-SHF_ALLOC non-orphan sections can have non-zero addresses like in GNU ld. The zero address rule for non-SHF_ALLOC sections is weakened to apply to orphan only. This results in a special case in createSection and findOrphanPos, respectively. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D85100	2020-08-05 09:30:23 -07:00
Fangrui Song	acb66b9111	[ELF] --oformat=binary: use LMA to compute file offsets --oformat=binary is rare (used in a few places in FreeBSD, see `stand/i386/mbr/Makefile` `LDFLAGS_BIN`) The result should be identical to a normal output transformed by `objcopy -O binary`. The current implementation ignores addresses and lays out sections by respecting output section alignments. It can fail when an output section address is specified, e.g. `.rodata ALIGN(16) :` (PR33651). Fix PR33651 by respecting LMA. The code is similar to `tools/llvm-objcop/ELF/Object.cpp` BinaryWriter::finalize after D71035 and D79229. Unforunately for an output section without PT_LOAD, we assume its LMA is equal to its VMA. So the result is still incorrect when an output section LMA (`AT(...)`) is specified Also drop `alignTo(off, config->wordsize)`. GNU ld does not round up the file size. Differential Revision: https://reviews.llvm.org/D85086	2020-08-05 09:10:01 -07:00
Petr Hosek	fffd05d525	[ELF] Add -z start-stop-visibility= to set __start_/__stop_ symbol visibility This matches the equivalent flag implemented in GNU linkers, see https://sourceware.org/pipermail/binutils/2020-June/111685.html for the associated discussion. Differential Revision: https://reviews.llvm.org/D55682	2020-06-23 15:59:59 -07:00
Fangrui Song	c4d13f72a6	[ELF] Refactor ObjFile<ELFT>::initializeSymbols to enforce the invariant: InputFile::symbols has non null entry Fixes PR46348. ObjFile<ELFT>::initializeSymbols contains two symbol iteration loops: ``` for each symbol if non-inheriting && non-local fill in this->symbols[i] for each symbol if local fill in this->symbols[i] else symbol resolution ``` Symbol resolution can trigger a duplicate symbol error which will call InputSectionBase::getObjMsg to iterate over InputFile::symbols. If a non-local symbol appears after the non-local symbol being resolved (violating ELF spec), its `this->symbols[i]` entry has not been filled in, InputSectionBase::getObjMsg will crash due to `dyn_cast<Defined>(nullptr)`. To fix the bug, reorganize the two loops to ensure this->symbols is complete before symbol resolution. This enforces the invariant: InputFile::symbols has none null entry when InputFile::getSymbols() is called. ``` for each symbol if non-inheriting fill in this->symbols[i] for each symbol starting from firstGlobal if non-local symbol resolution ``` Additionally, move the (non-local symbol in local part of .symtab) diagnostic from Writer<ELFT>::copyLocalSymbols() to initializeSymbols(). Reviewed By: grimar, jhenderson Differential Revision: https://reviews.llvm.org/D81988	2020-06-19 09:05:37 -07:00
Fangrui Song	3eb4bf13ba	[ELF] Append " [--no-allow-shlib-undefined]" to the corresponding diagnostics --no-allow-shlib-undefined (enabled by default when linking an executable) rejects unresolved references in shared objects. Users may be confused by the common diagnostics of unresolved symbols in object files (LLD: "undefined symbol: foo"; GNU ld/gold: "undefined reference to") Learn from GCC/clang " [-Wfoo]": append the option name to the diagnostics. Users can find relevant information by searching "--no-allow-shlib-undefined". It should also be obvious to them that the positive form --allow-shlib-undefined can suppress the error. Also downgrade the error to a warning if --noinhibit-exec is used (compatible with GNU ld and gold). Reviewed By: grimar, psmith Differential Revision: https://reviews.llvm.org/D81028	2020-06-03 07:59:37 -07:00
Fangrui Song	bae7cf6746	[ELF][PPC64] Synthesize _savegpr[01]_{14..31} and _restgpr[01]_{14..31} In the 64-bit ELF V2 API Specification: Power Architecture, 2.3.3.1. GPR Save and Restore Functions defines some special functions which may be referenced by GCC produced assembly (LLVM does not reference them). With GCC -Os, when the number of call-saved registers exceeds a certain threshold, GCC generates `_savegpr0_* _restgpr0_` calls and expects the linker to define them. See https://sourceware.org/pipermail/binutils/2002-February/017444.html and https://sourceware.org/pipermail/binutils/2004-August/036765.html . This is weird because libgcc.a would be the natural place. However, the linker generation approach has the advantage that the linker can generate multiple copies to avoid long branch thunks. We don't consider the advantage significant enough to complicate our trunk implementation, so we take a simple approach. Check whether `_savegpr0_{14..31}` are used * If yes, define needed symbols and add an InputSection with the code sequence. `_savegpr1_` `_restgpr0_` and `_restgpr1_*` are similar. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D79977	2020-05-26 09:35:41 -07:00
Fangrui Song	07837b8f49	[ELF] Use namespace qualifiers (lld:: or elf::) instead of `namespace lld { namespace elf {` Similar to D74882. This reverts much code from commit `bd8cfe65f5` (D68323) and fixes some problems before D68323. Sorry for the churn but D68323 was a mistake. Namespace qualifiers avoid bugs where the definition does not match the declaration from the header. See https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions (D74515) Differential Revision: https://reviews.llvm.org/D79982	2020-05-15 08:49:53 -07:00
Wei Mi	538208f6c0	[lld] Add a new output section ".text.unknown" for funtions with unknown hotness For sampleFDO, because the optimized build uses profile generated from previous release, often we couldn't tell a function without profile was truely cold or just newly created so we had to treat them conservatively and put them in .text section instead of .text.unlikely. The result was when we persue the best performance by locking .text.hot and .text in memory, we wasted a lot of memory to keep cold functions inside. This problem has been largely solved for regular sampleFDO using profile-symbol-list (https://reviews.llvm.org/D66374), but for the case when we use partial profile, we still waste a lot of memory because of it. In https://reviews.llvm.org/D62540, we propose to save functions with unknown hotness information in a special section called ".text.unknown", so that compiler will treat those functions as luck-warm, but runtime can choose not to mlock the special section in memory or use other strategy to save memory. That will solve most of the memory problem even if we use a partial profile. The patch adds the support in lld for the special section.For sampleFDO, because the optimized build uses profile generated from previous release, often we couldn't tell a function without profile was truely cold or just newly created so we had to treat them conservatively and put them in .text section instead of .text.unlikely. The result was when we persue the best performance by locking .text.hot and .text in memory, we wasted a lot of memory to keep cold functions inside. This problem has been largely solved for regular sampleFDO using profile-symbol-list (https://reviews.llvm.org/D66374), but for the case when we use partial profile, we still waste a lot of memory because of it. In https://reviews.llvm.org/D62540, we propose to save functions with unknown hotness information in a special section called ".text.unknown", so that compiler will treat those functions as luck-warm, but runtime can choose not to mlock the special section in memory or use other strategy to save memory. That will solve most of the memory problem even if we use a partial profile. The patch adds the support in lld for the special section. Differential Revision: https://reviews.llvm.org/D79590	2020-05-08 11:14:48 -07:00
Reid Kleckner	932f0276ea	[Support] Move LLD's parallel algorithm wrappers to support Essentially takes the lld/Common/Threads.h wrappers and moves them to the llvm/Support/Paralle.h algorithm header. The changes are: - Remove policy parameter, since all clients use `par`. - Rename the methods to `parallelSort` etc to match LLVM style, since they are no longer C++17 pstl compatible. - Move algorithms from llvm::parallel:: to llvm::, since they have "parallel" in the name and are no longer overloads of the regular algorithms. - Add range overloads - Use the sequential algorithm directly when 1 thread is requested (skips task grouping) - Fix the index type of parallelForEachN to size_t. Nobody in LLVM was using any other parameter, and it made overload resolution hard for for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t. Remove Threads.h and update LLD for that. This is a prerequisite for parallel public symbol processing in the PDB library, which is in LLVM. Reviewed By: MaskRay, aganea Differential Revision: https://reviews.llvm.org/D79390	2020-05-05 15:21:05 -07:00
Fangrui Song	c49f83b6e9	[ELF] Don't advance sh_offset for an empty section whose PT_LOAD is removed (due to p_memsz=0) removeEmptyPTLoad() removes empty (p_memsz=0) PT_LOAD segments. In assignFileOffsets(), setFileOffset() unnecessarily advances file offsets for containing empty sections. This is exposed by arm Linux kernel's multi_v5_defconfig (see https://bugs.llvm.org/show_bug.cgi?id=45632) ``` ld.lld (max-page-size=65536): [34] .init.data PROGBITS c0c24000 c34000 0128ac 00 WA 0 0 4096 [35] .text_itcm PROGBITS fffe0000 c50000 000000 00 WA 0 0 1 [36] .data_dtcm PROGBITS fffe8000 c58000 000000 00 WA 0 0 1 [37] .data PROGBITS c0c38000 c58000 0647a0 00 WA 0 0 32 arm-linux-gnueabi-ld (max-page-size=65536): [23] .init.data PROGBITS c0c12000 c22000 0128ac 00 WA 0 0 4096 [24] .text_itcm PROGBITS fffe0000 ca2558 000000 00 W 0 0 1 [25] .data_dtcm PROGBITS fffe8000 ca2558 000000 00 W 0 0 1 [26] .data PROGBITS c0c26000 c36000 0647a0 00 WA 0 0 32 ``` This patch clears OutputSection::ptLoad if ptLoad is removed by removeEmptyPTLoad(). Conceptually this removes "dangling" references. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D79254	2020-05-04 08:07:34 -07:00
Peter Smith	3834385f27	[ELF] Move SHF_LINK_ORDER till OutputSection addresses are known Sections with the SHF_LINK_ORDER flag must be ordered in the same relative order as the Sections they have a link to. When using a linker script an arbitrary expression may be used for the virtual address of the OutputSection. In some cases the virtual address does not monotonically increase as the OutputSection index increases, so if we base the ordering of the SHF_LINK_ORDER sections on the index then we can get the order wrong. We fix this by moving SHF_LINK_ORDER resolution till after we have created OutputSection virtual addresses. Differential Revision: https://reviews.llvm.org/D79286	2020-05-04 14:25:25 +01:00
Fangrui Song	b257d3c8a8	[ELF][PPC64] Suppress toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen The current implementation assumes that R_PPC64_TOC16_HA is always followed by R_PPC64_TOC16_LO_DS. This can break with R_PPC64_TOC16_LO: // Load the address of the TOC entry, instead of the value stored at that address addis 3, 2, .LC0@tloc@ha # R_PPC64_TOC16_HA addi 3, 3, .LC0@tloc@l # R_PPC64_TOC16_LO blr which is used by boringssl's util/fipstools/delocate/delocate.go https://github.com/google/boringssl/blob/master/crypto/fipsmodule/FIPS.md has some documentation. In short, this tool converts an assembly file to avoid any potential relocations. The distance to an input .toc is not a constant after linking, so it cannot use an `addis;ld` pair. Instead, it jumps to a stub which loads the TOC entry address with `addis;addi`. This patch checks the presence of R_PPC64_TOC16_LO and suppresses toc-indirect to toc-relative relaxation if R_PPC64_TOC16_LO is seen. This approach is conservative and loses some relaxation opportunities but is easy to implement. addis 3, 2, .LC0@toc@ha # no relaxation addi 3, 3, .LC0@toc@l # no relaxation li 9, 0 addis 4, 2, .LC0@toc@ha # can relax but suppressed ld 4, .LC0@toc@l(4) # can relax but suppressed Also note that interleaved R_PPC64_TOC16_HA and R_PPC64_TOC16_LO_DS is possible and this patch accounts for that. addis 3, 2, .LC1@toc@ha # can relax addis 4, 2, .LC2@toc@ha # can relax ld 3, .LC1@toc@l(3) # can relax ld 4, .LC2@toc@l(4) # can relax Reviewed By: #powerpc, sfertile Differential Revision: https://reviews.llvm.org/D78431	2020-04-30 09:16:51 -07:00
Fangrui Song	b912b887d8	[ELF] Add --print-archive-stats= gold has an option --print-symbol-counts= which prints: // For each archive archive $archive $members $fetched_members // For each object file symbols $object $defined_symbols $used_defined_symbols In most cases, `$defined_symbols = $used_defined_symbols` unless weak symbols are present. Strangely `$used_defined_symbols` includes symbols defined relative to --gc-sections discarded sections. The `symbols` lines do not appear to be useful. `archive` lines are useful: `$fetched_members=0` lines correspond to unused archives. The information can be used to trim dependencies. This patch implements --print-archive-stats= which prints the number of members and the number of fetched members for each archive. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D78983	2020-04-29 18:04:37 -07:00
Igor Kudrin	9f65f5acca	[LLD][ELF] Eliminate symbols of merged .ARM.exidx sections. GNU tools generate mapping symbols "$d" for .ARM.exidx sections. The symbols are added to the symbol table much earlier than the merging takes place, and after that, they become dangling. Before the patch, LLD output those symbols as SHN_ABS with the value of 0. The patch removes such symbols from the symbol table. Differential Revision: https://reviews.llvm.org/D78820	2020-04-28 18:58:40 +07:00
Igor Kudrin	66e4eb9c1b	[LLD][ELF] Implement --discard-* for cases when -r or --emit-relocs are used. When discarding local symbols with --discard-all or --discard-locals, the ones which are used in relocations should be preserved. LLD used the simplest approach and just ignored those switches when -r or --emit-relocs was specified. The patch implements handling the --discard-* switches for the cases when relocations are kept by identifying used local symbols and allowing removing only unused ones. This makes the behavior of LLD compatible with GNU linkers. Differential Revision: https://reviews.llvm.org/D77807	2020-04-25 18:59:41 +07:00
Peter Smith	3b1622d63a	[LLD][ELF][ARM] recommit Fix ARM Exidx order for non monotonic section order Fixed error detected by msan. The size field of the .ARM.exidx synthetic section needs to be initialized to at least estimation level before calling assignAddresses as that will use the size field. This was previously reverted in `1ca16fc4f5`. Differential Revision: https://reviews.llvm.org/D78422	2020-04-24 13:47:28 +01:00
Peter Smith	1ca16fc4f5	Revert "[LLD][ELF][ARM] Fix ARM Exidx order for non monotonic section order" This reverts commit `f969c2aa65`. There are some msan buildbot failures sanitzer-x86_64-linux-fast that I need to investigate. Differential Revision: https://reviews.llvm.org/D78422	2020-04-23 16:58:50 +01:00
Peter Smith	f969c2aa65	[LLD][ELF][ARM] Fix ARM Exidx order for non monotonic section order The contents of the .ARM.exidx section must be ordered by SHF_LINK_ORDER rules. We don't need to know the precise address for this order, but we do need to know the relative order of sections. We have been using the sectionIndex for this purpose, this works when the OutputSection order has a monotonically increasing virtual address, but it is possible to write a linker script with non-monotonically increasing virtual address. For these cases we need to evaluate the base address of the OutputSection so that we can order the .ARM.exidx sections properly. This change moves the finalisation of .ARM.exidx till after the first call to AssignAddresses. This permits us to sort on virtual address which is linker script safe. It also permits a fix for part of pr44824 where we generate .ARM.exidx section for the vector table when that table is so far away it is out of range of the .ARM.exidx section. This fix will come in a follow up patch. Differential Revision: https://reviews.llvm.org/D78422	2020-04-23 15:46:44 +01:00
Fangrui Song	497c76e96d	[ELF] Keep local symbols when both --emit-relocs and --discard-all are specified This fixes a bug as exposed by D77807. Add tests for {--emit-relocs,-r} x {--discard-locals,--discard-all}. They add coverage for previously undertested cases: * STT_SECTION associated to GCed sections (`gc`) * STT_SECTION associated to retained sections (`text`) * STT_SECTION associated to non-SHF_ALLOC sections (`.comment`) * STB_LOCAL in GCed sections (`unused_gc`) Reviewed By: grimar, ikudrin Differential Revision: https://reviews.llvm.org/D78389	2020-04-21 08:28:12 -07:00
Sriraman Tallam	94317878d8	LLD Support for Basic Block Sections This is part of the Propeller framework to do post link code layout optimizations. Please see the RFC here: https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the detailed RFC doc here: https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf This patch adds lld support for basic block sections and performs relaxations after the basic blocks have been reordered. After the linker has reordered the basic block sections according to the desired sequence, it runs a relaxation pass to optimize jump instructions. Currently, the compiler emits the long form of all jump instructions. AMD64 ISA supports variants of jump instructions with one byte offset or a four byte offset. The compiler generates jump instructions with R_X86_64 32-bit PC relative relocations. We would like to use a new relocation type for these jump instructions as it makes it easy and accurate while relaxing these instructions. The relaxation pass does two things: First, it deletes all explicit fall-through direct jump instructions between adjacent basic blocks. This is done by discarding the tail of the basic block section. Second, If there are consecutive jump instructions, it checks if the first conditional jump can be inverted to convert the second into a fall through and delete the second. The jump instructions are relaxed by using jump instruction mods, something like relocations. These are used to modify the opcode of the jump instruction. Jump instruction mods contain three values, instruction offset, jump type and size. While writing this jump instruction out to the final binary, the linker uses the jump instruction mod to determine the opcode and the size of the modified jump instruction. These mods are required because the input object files are memory-mapped without write permissions and directly modifying the object files requires copying these sections. Copying a large number of basic block sections significantly bloats memory. Differential Revision: https://reviews.llvm.org/D68065	2020-04-07 06:55:57 -07:00
Peter Smith	2539b4ae47	[LLD][ELF] Allow empty (.init\|.preinit\|.fini)_array to be RELRO The default GNU linker script uses the following idiom for the array sections. I'll use .init_array here, but this also applies to .preinit_array and .fini_array sections. .init_array : { PROVIDE_HIDDEN (__init_array_start = .); KEEP (*(.init_array)) PROVIDE_HIDDEN (__init_array_end = .); } The C-library will take references to the _start and _end symbols to process the array. This will make LLD keep the OutputSection even if there are no .init_array sections. As the current check for RELRO uses the section type for .init_array the above example with no .init_array InputSections fails the checks as there are no .init_array sections to give the OutputSection a type of SHT_INIT_ARRAY. This often leads to a non-contiguous RELRO error message. The simple fix is to a textual section match as well as a section type match. Differential Revision: https://reviews.llvm.org/D76915	2020-03-31 12:53:12 +01:00
Fangrui Song	673e81eee4	[ELF] Allow SHF_LINK_ORDER and non-SHF_LINK_ORDER to be mixed Currently, `error: incompatible section flags for .rodata` is reported when we mix SHF_LINK_ORDER and non-SHF_LINK_ORDER sections in an output section. This is overconstrained. This patch allows mixed flags with the requirement that SHF_LINK_ORDER sections must be contiguous. Mixing flags is used by Linux aarch64 (https://github.com/ClangBuiltLinux/linux/issues/953) .init.data : { ... KEEP(*(__patchable_function_entries)) ... } When the integrated assembler is enabled, clang's -fpatchable-function-entry=N[,M] implementation sets the SHF_LINK_ORDER flag (D72215) to fix a number of garbage collection issues. Strictly speaking, the ELF specification does not require contiguous SHF_LINK_ORDER sections but for many current uses of SHF_LINK_ORDER like .ARM.exidx/__patchable_function_entries there has been a requirement for the sections to be contiguous on top of the requirements of the ELF specification. This patch also imposes one restriction: SHF_LINK_ORDER sections cannot be separated by a symbol assignment or a BYTE command. Not allowing BYTE is a natural extension that a non-SHF_LINK_ORDER cannot be a separator. Symbol assignments can delimiter the contents of SHF_LINK_ORDER sections. Allowing SHF_LINK_ORDER sections across symbol assignments (especially __start_/__stop_) can make things hard to explain. The restriction should not be a problem for practical use cases. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D77007	2020-03-30 10:03:55 -07:00
Fangrui Song	9e33c09647	[ELF] Keep orphan section names (.rodata.foo .text.foo) unchanged if !hasSectionsCommand This behavior matches GNU ld and seems reasonable. ``` // If a SECTIONS command is not specified .text.* -> .text .rodata.* -> .rodata .init_array.* -> .init_array ``` A proposed Linux feature CONFIG_FG_KASLR may depend on the GNU ld behavior. Reword a comment about -z keep-text-section-prefix and a comment about CommonSection (deleted by rL286234). Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D75225	2020-03-23 10:30:06 -07:00
Sid Manning	5a5a075c5b	[LLD][ELF][Hexagon] Support GDPLT transforms Hexagon ABI specifies that call x@gdplt is transformed to call __tls_get_addr. Example: call x@gdplt is changed to call __tls_get_addr When x is an external tls variable. Differential Revision: https://reviews.llvm.org/D74443	2020-03-13 11:02:11 -05:00
Fangrui Song	eb4b5a36a6	[ELF] Move --print-map(-M)/--cref before checkSections() and openFile() -M output can be useful when diagnosing an "error: output file too large" problem (emitted in openFile()). I just ran into such a situation where I had to debug an erronerous Linux kernel linker script. It tried to create a file larger than INT64_MAX bytes. This patch could have helped https://bugs.llvm.org/show_bug.cgi?id=44715 as well. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D75966	2020-03-12 08:00:18 -07:00
Fangrui Song	fbf41b5267	[ELF] Simplify sh_addr computation and warn if sh_addr is not a multiple of sh_addralign See `docs/ELF/linker_script.rst` for the new computation for sh_addr and sh_addralign. `ALIGN(section_align)` now means: "increase alignment to section_align" (like yet another input section requirement). The "start of section .foo changes from 0x11 to 0x20" warning no longer makes sense. Change it to warn if sh_addr%sh_addralign!=0. To decrease the alignment from the default max_input_align, use `.output ALIGN(8) : {}` instead of `.output : ALIGN(8) {}` See linkerscript/section-address-align.test as an example. When both an output section address and ALIGN are set (can be seen as an "undefined behavior" https://sourceware.org/ml/binutils/2020-03/msg00115.html), lld may align more than GNU ld, but it makes a linker script working with GNU ld hard to break with lld. This patch can be considered as restoring part of the behavior before D74736. Differential Revision: https://reviews.llvm.org/D75724	2020-03-11 09:35:42 -07:00
Fangrui Song	00925aadb3	[ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order Reviewed By: Bdragon28 Differential Revision: https://reviews.llvm.org/D75394	2020-02-28 22:23:14 -08:00
Fangrui Song	423194098b	[ELF] --orphan-handling=: don't warn/error for unused synthesized sections This makes --orphan-handling= less noisy. This change also improves our compatibility with GNU ld. GNU ld special cases .symtab, .strtab and .shstrtab . We need output section descriptions for .symtab, .strtab and .shstrtab to suppress: <internal>:(.symtab) is being placed in '.symtab' <internal>:(.shstrtab) is being placed in '.shstrtab' <internal>:(.strtab) is being placed in '.strtab' With --strip-all, .symtab and .strtab can be omitted (note, --strip-all is not compatible with --emit-relocs). Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D75149	2020-02-26 08:56:12 -08:00
Rafael Ávila de Espíndola	7b44f0428a	Add a llvm::shuffle and use it in lld With this --shuffle-sections=seed produces the same result in every host. Reviewed By: grimar, MaskRay Differential Revision: https://reviews.llvm.org/D74971	2020-02-22 10:05:29 -08:00
Fangrui Song	dbd7281aa7	[ELF] Shuffle .init_array/.fini_array with --shuffle-sections= Useful for detecting static initialization order fiasco. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74887	2020-02-21 08:16:07 -08:00
Fangrui Song	de0dda54d3	[ELF] Warn changed output section address When the output section address (addrExpr) is specified, GNU ld warns if sh_addr is different. This patch implements the warning. Note, LinkerScript::assignAddresses can be called more than once. We need to record the changed section addresses, and only report the warnings after the addresses are finalized. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74741	2020-02-21 08:13:29 -08:00
Fangrui Song	6ed8e20143	[ELF] Ignore the maximum of input section alignments for two cases Follow-up for D74286. Notations: * alignExpr: the computed ALIGN value * max_input_align: the maximum of input section alignments This patch changes the following two cases to match GNU ld: * When ALIGN is present, GNU ld sets output sh_addr to alignExpr, while lld use max(alignExpr, max_input_align) * When addrExpr is specified but alignExpr is not, GNU ld sets output sh_addr to addrExpr, while lld uses `advance(0, max_input_align)` Note, sh_addralign is still set to max(alignExpr, max_input_align). lma-align.test is enhanced a bit to check we don't overalign sh_addr. fixSectionAlignments() sets addrExpr but not alignExpr for the `!hasSectionsCommand` case. This patch sets alignExpr as well so that max_input_align will be respected. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74736	2020-02-21 08:12:00 -08:00
Rafael Ávila de Espíndola	d48d339156	[lld][ELF] Add --shuffle-sections=seed to shuffle input sections Summary: This option causes lld to shuffle sections by assigning different priorities in each run. The use case for this is to introduce randomization in benchmarks. The idea is inspired by the paper "Producing Wrong Data Without Doing Anything Obviously Wrong!" (https://www.inf.usi.ch/faculty/hauswirth/publications/asplos09.pdf). Unlike the paper, we shuffle individual sections, not just input files. Doing this in lld is particularly convenient as the --reproduce option makes it easy to collect all the necessary bits for relinking the program being benchmarked. Once that it is done, all that is needed is to add --shuffle-sections=0 to the response file and relink before each run of the benchmark. Differential Revision: https://reviews.llvm.org/D74791	2020-02-19 13:44:12 -08:00
Fangrui Song	7c426fb1a6	[ELF] Support INSERT [AFTER\|BEFORE] for orphan sections D43468+D44380 added INSERT [AFTER\|BEFORE] for non-orphan sections. This patch makes INSERT work for orphan sections as well. `SECTIONS {...} INSERT [AFTER\|BEFORE] .foo` does not set `hasSectionCommands`, so the result will be similar to a regular link without a linker script. The differences when `hasSectionCommands` is set include: * image base is different * -z noseparate-code/-z noseparate-loadable-segments are unavailable * some special symbols such as `_end _etext _edata` are not defined The behavior is similar to GNU ld: INSERT is not considered an external linker script. This feature makes the section layout more flexible. It can be used to: * Place .nv_fatbin before other readonly SHT_PROGBITS sections to mitigate relocation overflows. * Disturb the layout to expose address sensitive application bugs. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74375	2020-02-12 08:21:52 -08:00
Fangrui Song	b498d99338	[ELF] Start a new PT_LOAD if LMA region is different GNU ld has a counterintuitive lang_propagate_lma_regions rule. ``` // .foo's LMA region is propagated to .bar because their VMA region is the same, // and .bar does not have an explicit output section address (addr_tree). .foo : { (.foo) } >RAM AT> FLASH .bar : { (.bar) } >RAM // An explicit output section address disables propagation. .foo : { (.foo) } >RAM AT> FLASH .bar . : { (.bar) } >RAM ``` In both cases, lld thinks .foo's LMA region is propagated and places .bar in the same PT_LOAD, so lld diverges from GNU ld w.r.t. the second case (lma-align.test). This patch changes Writer<ELFT>::createPhdrs to disable propagation (start a new PT_LOAD). A user of the first case can make linker scripts portable by explicitly specifying `AT>`. By contrast, there was no workaround for the old behavior. This change uncovers another LMA related bug in assignOffsets() where `ctx->lmaOffset = 0;` was omitted. It caused a spurious "load address range overlaps" error for at2.test The new PT_LOAD rule is complex. For convenience, I listed the origins of some subexpressions: * rL323449: `sec->memRegion == load->firstSec->memRegion`; linkerscript/at3.test * D43284: `load->lastSec == Out::programHeaders` (don't start a new PT_LOAD after program headers); linkerscript/at4.test * D58892: `sec != relroEnd` (start a new PT_LOAD after PT_GNU_RELRO) Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D74297	2020-02-12 08:20:14 -08:00
Russell Gallop	e7cb374433	[LLD][ELF] Add time-trace to ELF LLD This adds some of LLD specific scopes and picks up optimisation scopes via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in `77e6bb3c`. Differential Revision: https://reviews.llvm.org/D71060	2020-02-06 12:14:13 +00:00
Fangrui Song	2d7a8cf904	[ELF] -r: don't create .interp `{clang,gcc} -nostdlib -r a.c` passes --dynamic-linker to the linker, and the expected behavior is to ignore it. If .interp is kept in the relocatable object file, a final link will get PT_INTERP even if --dynamic-linker is not specified. glibc ld.so expects to see PT_DYNAMIC and the executable will likely fail to run. Ignore --dynamic-linker in -r mode as well as -shared.	2020-01-16 12:14:32 -08:00
Fangrui Song	7cd429f27d	[ELF] Add -z force-ibt and -z shstk for Intel Control-flow Enforcement Technology This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang. It adds Intel CET (Control-flow Enforcement Technology) support to lld. The implementation follows the draft version of psABI which you can download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI. CET introduces a new restriction on indirect jump instructions so that you can limit the places to which you can jump to using indirect jumps. In order to use the feature, you need to compile source files with -fcf-protection=full. * IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt. * SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified. IBT-enabled executables/shared objects have two PLT sections, ".plt" and ".plt.sec". For the details as to why we have two sections, please read the comments. Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D59780	2020-01-13 23:39:28 -08:00
Fangrui Song	dce7a362be	[ELF] Improve the condition to create .interp This restores commit `1417558e4a` and its follow-up, reverted by commit `c3dbd782f1`. After this commit: clang -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created clang -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created clang -fuse-ld=gold -no-pie -nostdlib a.c => .interp not created clang -fuse-ld=gold -pie -fPIE -nostdlib a.c => .interp created clang -fuse-ld=lld -no-pie -nostdlib a.c => .interp created clang -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp created	2019-12-27 15:34:25 -08:00
Reid Kleckner	c3dbd782f1	Revert "[ELF] Improve the condition to create .interp" This reverts commit `1417558e4a`. Also reverts commit `019a92bb28`. This causes check-sanitizer to fail. The "-Nolib" variant of the test crashes on startup in the loader.	2019-12-27 13:05:41 -08:00
Fangrui Song	1417558e4a	[ELF] Improve the condition to create .interp Similar to rL362355, but with the `!config->shared` guard. (1) {gcc,clang} -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created (2) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp not created (3) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c a.so => .interp created The inconsistency of (2) is due to the condition `!Config->SharedFiles.empty()`. To make lld behave more like ld.bfd, we could change the condition to: config->hasDynSymTab && !config->dynamicLinker.empty() && script->needsInterpSection(); However, that would bring another inconsistency as can be observed with: (4) {gcc,clang} -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created	2019-12-26 13:26:43 -08:00
Fangrui Song	891a8655ab	[ELF] Add IpltSection PltSection is used by both PLT and IPLT. The PLT section may have a header while the IPLT section does not. Split off IpltSection from PltSection to be clearer. Unlike other targets, PPC64 cannot use the same code sequence for PLT and IPLT. This helps make a future PPC64 patch (D71509) more isolated. On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently we are inconsistent whether the PLT header is conceptually attached to in.plt or in.iplt . Consistently attach the header to in.plt can make the -z retpolineplt logic simpler. It also makes `jmp` point to an aesthetically better place for non-retpolineplt cases. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D71519	2019-12-17 00:06:04 -08:00
Rui Ueyama	69da7e29de	Revert an accidental commit `af5ca40b47`	2019-12-13 15:17:40 +09:00
Rui Ueyama	af5ca40b47	temporary	2019-12-13 14:35:03 +09:00
Fangrui Song	cd0ab2428f	[ELF] --icf: do not fold preemptible symbols Fixes PR44124. A preemptible symbol may refer to a different definition at runtime. When comparing a pair of relocations, if they refer to different symbols, and either symbol is preemptible, the two containing sections should be considered different. gold has a similar rule https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=ce97fa81e0c46d216b80b143ad8c02fff6906fef Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D71163	2019-12-10 09:06:08 -08:00
Peter Smith	4d6c4cb426	[LLD][ELF] Add support for PT_GNU_PROPERTY The PT_GNU_PROPERTY program header describes the location of the .note.gnu.property SHT_NOTES section. The linux kernel uses this program header to find the .note.gnu.property section rather than parsing. Executables that have properties that the kernel needs to act on that don't have the PT_GNU_PROPERTY program header will not boot. Differential Revision: https://reviews.llvm.org/D70961	2019-12-05 09:54:58 +00:00
Fangrui Song	a2fc964417	[ELF] Replace SymbolTable::forEachSymbol with iterator_range symbols() D62381 introduced forEachSymbol(). It seems that many call sites cannot be parallelized because the body shared some states. Replace forEachSymbol with iterator_range<filter_iterator<...>> symbols() to simplify code and improve debuggability (std::function calls take some frames). It also allows us to use early return to simplify code added in D69650. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D70505	2019-11-26 09:09:32 -08:00
Nick Terrell	6814232429	[LLD][ELF] Support --[no-]mmap-output-file with F_no_mmap Summary: Add a flag `F_no_mmap` to `FileOutputBuffer` to support `--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores this flag for compatibility with GNU ld and gold. We need this flag to speed up link time for large binaries in certain scenarios. When we link some of our larger binaries we find that LLD takes 50+ GB of memory, which causes memory pressure. The memory pressure causes the VM to flush dirty pages of the output file to disk. This is normally okay, since we should be flushing cold pages. However, when using BtrFS with compression we need to write 128KB at a time when we flush a page. If any page in that 128KB block is written again, then it must be flushed a second time, and so on. Since LLD doesn't write sequentially this causes write amplification. The same 128KB block will end up being flushed multiple times, causing the linker to many times more IO than necessary. We've observed 3-5x faster builds with -no-mmap-output-file when we hit this scenario. The bad scenario only applies to compressed filesystems, which group together multiple pages into a single compressed block. I've tested BtrFS, but the problem will be present for any compressed filesystem on Linux, since it is caused by the VM. Silently ignoring --no-mmap-output-file caused a silent regression when we switched from gold to lld. We pass --no-mmap-output-file to fix this edge case, but since lld silently ignored the flag we didn't realize it wasn't being respected. Benchmark building a 9 GB binary that exposes this edge case. I linked 3 times with --mmap-output-file and 3 times with --no-mmap-output-file and took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM, BtrFS mounted with -compress-force=zstd, and an 80% full disk. \| Mode \| Time \| \|---------\|-------\| \| mmap \| 894 s \| \| no mmap \| 126 s \| When compression is disabled, BtrFS performs just as well with and without mmap on this benchmark. I was unable to reproduce the regression with any binaries in lld-speed-test. Reviewed By: ruiu, MaskRay Differential Revision: https://reviews.llvm.org/D69294	2019-10-29 15:49:08 -07:00
Michał Górny	2a0fcae3d4	[lld] [ELF] Add '-z nognustack' opt to suppress emitting PT_GNU_STACK Add a new '-z nognustack' option that suppresses emitting PT_GNU_STACK segment. This segment is not supported at all on NetBSD (stack is always non-executable), and the option is meant to be used to disable emitting it. Differential Revision: https://reviews.llvm.org/D56554	2019-10-29 17:54:23 +01:00
Nico Weber	5976a3f5aa	Fix a few typos in lld/ELF to cycle bots	2019-10-28 21:41:47 -04:00
Fangrui Song	bd8cfe65f5	[ELF] Wrap things in `namespace lld { namespace elf {`, NFC This makes it clear `ELF/*/.cpp` files define things in the `lld::elf` namespace and simplifies `elf::foo` to `foo`. Reviewed By: atanasyan, grimar, ruiu Differential Revision: https://reviews.llvm.org/D68323 llvm-svn: 373885	2019-10-07 08:31:18 +00:00
Peter Collingbourne	0bb825d208	ELF: Add .interp synthetic sections first in createSyntheticSections(). Our .interp section is not a SyntheticSection. As a result, it terminates the loop in removeUnusedSyntheticSections(). This has at least two consequences: - The synthetic .bss and .bss.rel.ro sections are always present in dynamically linked executables, even when they are not needed. - The synthetic .ARM.exidx (and possibly other) sections are always present in partitions other than the last one, even when not needed. .ARM.exidx in particular is problematic because it assumes that its list of code sections is non-empty in getLinkOrderDep(), which can lead to a crash if the partition does not have any code sections. Fix these problems by moving the creation of the .interp sections to the top of createSyntheticSections(). While here, make the code a little less error-prone by changing the add() lambdas to take a SyntheticSection instead of an InputSectionBase. Differential Revision: https://reviews.llvm.org/D68256 llvm-svn: 373347	2019-10-01 16:10:13 +00:00
Fangrui Song	0264950697	[ELF] Add -z separate-loadable-segments to complement separate-code and noseparate-code D64906 allows PT_LOAD to have overlapping p_offset ranges. In the default R RX RW RW layout + -z noseparate-code case, we do not tail pad segments when transiting to another segment. This can save at most 3*maxPageSize bytes. a) Before D64906, we tail pad R, RX and the first RW. b) With -z separate-code, we tail pad R and RX, but not the first RW (RELRO). In some cases, b) saves one file page. In some cases, b) wastes one virtual memory page. The waste is a concern on Fuchsia. Because it uses compressed binaries, it doesn't benefit from the saved file page. This patch adds -z separate-loadable-segments to restore the behavior before D64906. It can affect section addresses and can thus be used as a debugging mechanism (see PR43214 and ld.so partition bug in crbug.com/998712). Reviewed By: jakehehrlich, ruiu Differential Revision: https://reviews.llvm.org/D67481 llvm-svn: 372807	2019-09-25 03:39:31 +00:00
Fangrui Song	e47bbd28f8	[ELF] Make MergeInputSection merging aware of output sections Fixes PR38748 mergeSections() calls getOutputSectionName() to get output section names. Two MergeInputSections may be merged even if they are made different by SECTIONS commands. This patch moves mergeSections() after processSectionCommands() and addOrphanSections() to fix the issue. The new pass is renamed to OutputSection::finalizeInputSections(). processSectionCommands() and addorphanSections() are changed to add sections to InputSectionDescription::sectionBases. finalizeInputSections() merges MergeInputSections and migrates `sectionBases` to `sections`. For the -r case, we drop an optimization that tries keeping sh_entsize non-zero. This is for the simplicity of addOrphanSections(). The updated merge-entsize2.s reflects the change. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D67504 llvm-svn: 372734	2019-09-24 11:48:31 +00:00
Fangrui Song	2672051495	[ELF] Error if the linked-to section of a SHF_LINK_ORDER section is discarded Summary: If st_link(A)=B, and A has the SHF_LINK_ORDER flag, we may dereference a null pointer if B is garbage collected (PR43147): 1. In Wrter.cpp:compareByFilePosition, `aOut->sectionIndex` or `bOut->sectionIndex` 2. In OutputSections::finalize, `d->getParent()->sectionIndex` Simply error and bail out to avoid null pointer dereferences. ld.bfd has a similar error: sh_link of section `.bar' points to discarded section `.foo0' of `a.o' ld.bfd is more permissive in that it just checks whether the linked-to section of the first input section is discarded. This is likely because it sets sh_link of the output section according to the first input section. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D67761 llvm-svn: 372400	2019-09-20 15:03:21 +00:00
Fangrui Song	4816e516e5	[ELF][Hexagon] Allow PT_LOAD to have overlapping p_offset ranges on EM_HEXAGON Port the D64906 technique to EM_HEXAGON. This concludes the patch series. Differential Revision: https://reviews.llvm.org/D67605 llvm-svn: 372059	2019-09-17 02:45:38 +00:00
Peter Smith	ea99ce5e9b	[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965	2019-09-16 09:38:38 +00:00
Fangrui Song	d4306e90cb	[ELF][X86] Allow PT_LOAD to have overlapping p_offset ranges on EM_X86_64 Port the D64906 technique to EM_X86_64. Differential Revision: https://reviews.llvm.org/D67482 llvm-svn: 371958	2019-09-16 07:05:34 +00:00
Fangrui Song	06bb7dfbd4	[ELF] Map the ELF header at imageBase If there is no readonly section, we map: * The ELF header at imageBase+maxPageSize * Program headers at imageBase+maxPageSize+sizeof(Ehdr) * The first section .text at imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers) Due to the interaction between Writer<ELFT>::fixSectionAlignments and LinkerScript::allocateHeaders, `alignDown(p_vaddr(R PT_LOAD)) = alignDown(p_vaddr(RX PT_LOAD))`. The RX PT_LOAD will override the R PT_LOAD at runtime, which is not ideal: ``` // PHDR at 0x401034, should be 0x400034 PHDR 0x000034 0x00401034 0x00401034 0x000a0 0x000a0 R 0x4 // R PT_LOAD contains just Ehdr and program headers. // At 0x401000, should be 0x400000 LOAD 0x000000 0x00401000 0x00401000 0x000d4 0x000d4 R 0x1000 LOAD 0x0000d4 0x004010d4 0x004010d4 0x00001 0x00001 R E 0x1000 ``` * createPhdrs allocates the headers to the R PT_LOAD. * fixSectionAlignments assigns `imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)` (formula: `alignTo(dot, maxPageSize) + dot % config->maxPageSize`) to addrExpr of .text * allocateHeaders computes the minimum address among SHF_ALLOC sections, i.e. addr(.text) * allocateHeaders sets address of ELF header to `addr(.text)-sizeof(Ehdr)-sizeof(program headers) = imageBase+maxPageSize` The main observation is that when the SECTIONS command is not used, we don't have to call allocateHeaders. This requires an assumption that the presence of PT_PHDR and addresses of headers can be decided regardless of address information. This may seem natural because dot is not manipulated by a linker script. The other thing is that we have to drop the special rule for -T<section> in `getInitialDot`. If -Ttext is smaller than the image base, the headers will not be allocated with the old behavior (allocateHeaders is called) but always allocated with the new behavior. The behavior change is not a problem. Whether and where headers are allocated can vary among linkers, or ld.bfd across different versions (--enable-separate-code or not). It is thus advised to use a linker script with the PHDRS command to have a consistent behavior across linkers. If PT_PHDR is needed, an explicit --image-base can be a simpler alternative. Differential Revision: https://reviews.llvm.org/D67325 llvm-svn: 371957	2019-09-16 07:04:16 +00:00
Simon Atanasyan	6c6f5a9984	[mips] Allow PT_LOAD to have overlapping p_offset ranges on EM_MIPS Port the D64906 <https://reviews.llvm.org/D64906> technique to MIPS. Fix PR33131 llvm-svn: 371554	2019-09-10 20:19:59 +00:00
Fangrui Song	e8c0d93360	[ELF] nmagic or omagic: don't allocate PT_PHDR or PF_R PT_LOAD for the !hasPhdrsCommands case ``` part.phdrs = script->hasPhdrsCommands() ? script->createPhdrs() : createPhdrs(part); ``` createPhdrs() allocates a PT_PHDR and a PF_R PT_LOAD, which will be deleted later in LinkerScript::allocateHeaders, but leave a gap between the program headers and the first section. Don't allocate the segments to avoid the gap. PT_INTERP is likely not needed as well. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D67324 llvm-svn: 371398	2019-09-09 13:08:51 +00:00
Fangrui Song	8d30c1dcec	Reland D66717 [ELF] Do not ICF two sections with different output sections (by SECTIONS commands) Recommit r370635 (reverted by r371202), with one change: move addOrphanSections() before ICF. Before, orphan sections in two different partitions may be folded and moved to the main partition. Now, InputSection->OutputSection assignment for orphans happens before ICF. ICF does not fold input sections with different output sections. With the PR43241 reproduce, `llvm-objcopy --extract-partition libvr.so libchrome__combined.so libvr.so` => no error Updated description: Fixes PR39418. Complements D47241 (the non-linker-script case). processSectionCommands() assigns input sections to output sections. ICF is called before it, so .text.foo and .text.bar may be folded even if their output sections are made different by SECTIONS commands. ``` markLive<ELFT>() doIcf<ELFT>() // During ICF, we don't know the output sections writeResult() combineEhSections<ELFT>() script->processSectionCommands() // InputSection -> OutputSection assignment ``` This patch splits processSectionCommands() into processSectionCommands() and processSymbolAssignments(), and moves processSectionCommands()/addOrphanSections() before ICF: ``` markLive<ELFT>() combineEhSections<ELFT>() script->processSectionCommands() script->addOrphanSections(); doIcf<ELFT>() // should remove folded input sections writeResult() script->processSymbolAssignments() ``` An alternative approach is to unfold a section `sec` in processSectionCommands() when we find `sec` and `sec->repl` belong to different output sections. I feel this patch is superior because this can fold more sections and the decouple of SectionCommand/SymbolAssignment gives flexibility: * An ExprValue can't be evaluated before its section is assigned to an output section -> we can delete getOutputSectionVA and simplify another place where we had to check if the output section is null. Moreover, a case in linkerscript/early-assign-symbol.s can be handled now. * processSectionCommands/processSymbolAssignments can be freely moved around. llvm-svn: 371216	2019-09-06 15:57:44 +00:00
Fangrui Song	5d9f419a2e	Revert "Revert r370635, it caused PR43241." This reverts commit 50d2dca22b3b05d0ee4883b0cbf93d7d15f241fc. llvm-svn: 371215	2019-09-06 15:57:24 +00:00
Nico Weber	8455294f2a	Revert r370635, it caused PR43241. llvm-svn: 371202	2019-09-06 13:23:42 +00:00
Fangrui Song	6dc2bd70bb	[ELF] Initialize PhdrEntry::p_align to maxPageSize for PT_LOAD ``` Writer<ELFT>::run assignFileOffsets setFileOffset computeFileOffset os->ptLoad->p_align may be smaller than config->maxPageSize setPhdrs p_align = max(p_align, config->maxPageSize) ``` If we move the config->maxPageSize logic to the constructor of PhdrEntry, computeFileOffset can be simplified. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D67211 llvm-svn: 371085	2019-09-05 16:32:31 +00:00
Rui Ueyama	e99dc4ba57	Align output segments correctly Previously, segments were aligned according to their first section's alignment requirements. That was not correct, but segments are also aligned to a page boundary, and a page boundary is usually much larger than a section alignment requirement, so no one noticed this bug before. Now, lld has --nmagic option which sets maxPageSize to 1 to effectively disable page alignment, which reveals the issue. Fixes https://bugs.llvm.org/show_bug.cgi?id=43212 Differential Revision: https://reviews.llvm.org/D67152 llvm-svn: 371013	2019-09-05 05:30:24 +00:00
Fangrui Song	d8bc6a48ea	[ELF] Do not ICF two sections with different output sections (by SECTIONS commands) Fixes PR39418. Complements D47241 (the non-linker-script case). processSectionCommands() assigns input sections to output sections. ICF is called before it, so .text.foo and .text.bar may be folded even if their output sections are made different by SECTIONS commands. ``` markLive<ELFT>() doIcf<ELFT>() // During ICF, we don't know the output sections writeResult() combineEhSections<ELFT>() script->processSectionCommands() // InputSection -> OutputSection assignment ``` This patch splits processSectionCommands() into processSectionCommands() and processSymbolAssignments(), and moves processSectionCommands() before ICF: ``` markLive<ELFT>() combineEhSections<ELFT>() script->processSectionCommands() doIcf<ELFT>() // should remove folded input sections writeResult() script->processSymbolAssignments() ``` An alternative approach is to unfold a section `sec` in processSectionCommands() when we find `sec` and `sec->repl` belong to different output sections. I feel this patch is superior because this can fold more sections and the decouple of SectionCommand/SymbolAssignment gives flexibility: * An ExprValue can't be evaluated before its section is assigned to an output section -> we can delete getOutputSectionVA and simplify another place where we had to check if the output section is null. Moreover, a case in linkerscript/early-assign-symbol.s can be handled now. * processSectionCommands/processSymbolAssignments can be freely moved around. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66717 llvm-svn: 370635	2019-09-02 10:33:58 +00:00
Fangrui Song	4514ac7cfb	[ELF] Align SHT_LLVM_PART_EHDR to a maximum page size boundary Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=998712 SHT_LLVM_PART_EHDR marks the start of a partition. The partition sections will be extracted to a separate file. Align to the next maximum page size boundary so that we can find the ELF header at the start. We cannot benefit from overlapping p_offset ranges with the previous segment anyway. It seems we lack some llvm-objcopy --extract-main-partition and --extract-partition sanity checks. It may place EHDR at the start even if p_offset if non zero. Anyway, the lld change is justified for the reasons above. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D67032 llvm-svn: 370629	2019-09-02 08:49:50 +00:00
Fangrui Song	523f999acf	[ELF][RISCV] Allow PT_LOAD to have overlapping p_offset ranges on EM_RISCV Port the D64906 technique to RISC-V. It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of a RISC-V binary decreases by at most 12kb. llvm-svn: 370192	2019-08-28 12:06:06 +00:00
Fangrui Song	54a6f6839b	[ELF][AMDGPU][SPARC] Allow PT_LOAD to have overlapping p_offset ranges on EM_AMDGPU and EM_SPARCV9 llvm-svn: 370180	2019-08-28 09:45:06 +00:00
Fangrui Song	8fbe81fb29	[ELF][RISCV] Assign st_shndx of __global_pointer$ to 1 if .sdata does not exist This essentially reverts the code change of D63132 and switches to a simpler approach. In an executable/shared object, st_shndx of a symbol can be: 1) SHN_UNDEF: undefined symbol (or canonical PLT) 2) SHN_ABS: absolute symbol 3) any other value (usually a regular section index) represents a relative symbol. The actual value does not matter. Many ld.so (musl, all archs except MIPS of FreeBSD rtld-elf) even treat 2) and 3) the same. If .sdata does not exist, it does not matter what value/section __global_pointer$ has, as long as it is relative (otherwise there will be a pedantic lld error. See D63132). Just set the st_shndx arbitrarily to 1. Dummy st_shndx=1 may be used by __rela_iplt_start, linker-script-defined symbols outside a section, __dso_handle, etc. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66798 llvm-svn: 370172	2019-08-28 09:01:03 +00:00
Fangrui Song	024bf27ddf	[ELF][ARM] Allow PT_LOAD to have overlapping p_offset ranges on EM_ARM Port the D64906 technique to ARM. It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of an arm binary decreases by at most 12kb. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D66749 llvm-svn: 370049	2019-08-27 11:52:36 +00:00
Fangrui Song	1681ceb2c4	[ELF] EhFrameSection: postpone FDE liveness check to finalizeSections EhFrameSection::addSection checks liveness of FDE early. This makes it infeasible to move combineEhSections() before ICF. Postpone the check to EhFrameSection::finalizeContents(). This is what ARMExidxSyntheticSection does and it will make a subsequent patch D66717 simpler. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66727 llvm-svn: 369890	2019-08-26 10:32:12 +00:00
Fangrui Song	debcac9fef	[ELF] Make LinkerScript::assignAddresses iterative PR42990. For `SECTIONS { b = a; . = 0xff00 + (a >> 8); a = .; }`, we currently set st_value(a)=0xff00 while st_value(b)=0xffff. The following call tree demonstrates the problem: ``` link<ELF64LE>(Args); Script->declareSymbols(); // insert a and b as absolute Defined Writer<ELFT>().run(); Script->processSectionCommands(); addSymbol(cmd); // a and b are re-inserted. LinkerScript::getSymbolValue // is lazily called by subsequent evaluation finalizeSections(); forEachRelSec(scanRelocations<ELFT>); processRelocAux // another problem PR42506, not affected by this patch finalizeAddressDependentContent(); // loop executed once script->assignAddresses(); // a = 0, b = 0xff00 script->assignAddresses(); // a = 0xff00, _end = 0xffff ``` We need another assignAddresses() to finalize the value of `a`. This patch 1) modifies assignAddress() to track the original section/value of each symbol and return a symbol whose section/value has changed. 2) moves the post-finalizeSections assignAddress() inside the loop of finalizeAddressDependentContent() and makes it iterative. Symbol assignment may not converge so we make a few attempts before bailing out. Note, assignAddresses() must be called at least twice. The penultimate call finalized section addresses while the last finalized symbol values. It is somewhat obscure and there was no comment. linkerscript/addr-zero.test tests this. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66279 llvm-svn: 369889	2019-08-26 10:23:31 +00:00
Fangrui Song	6d5a8c92bf	[ELF] Simplify with less_second. NFC llvm-svn: 369844	2019-08-24 08:40:20 +00:00
Fangrui Song	62083ec157	[ELF] Make member function Writer<ELFT>::removeEmptyPTLoad non-member. NFC llvm-svn: 369838	2019-08-24 06:31:34 +00:00
Fangrui Song	af47d0021c	[ELF] Align the first section of a PT_LOAD even if its type is SHT_NOBITS Reported at https://reviews.llvm.org/D64930#1642223 If the only section of a PT_LOAD is a SHT_NOBITS section (e.g. .bss), we may not align its sh_offset. p_offset of the PT_LOAD will be set to sh_offset, and we will get p_offset!=p_vaddr (mod p_align). If such executable is mapped by the Linux kernel, it will segfault. After D64906, this may happen the non-linker script case. The linker script case has had this issue for a long time. This was fixed by rL321657 (but the test linkerscript/nobits-offset.s failed to test a SHT_NOBITS section), but broken by rL345154. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D66658 llvm-svn: 369828	2019-08-24 00:41:15 +00:00
Fangrui Song	12d83b4270	[ELF][PPC] Allow PT_LOAD to have overlapping p_offset ranges on EM_PPC Ported the D64906 technique to EM_PPC. Delete ppc-rela.s that is covered by ppc32-abs-pic.s llvm-svn: 369351	2019-08-20 09:20:05 +00:00
Fangrui Song	9c371309f3	[ELF][X86] Allow PT_LOAD to have overlapping p_offset ranges on EM_386 Ported the D64906 technique to EM_386. If `sh_addralign(.tdata) < sh_addralign(.tbss)`, we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`. ld.so that are known to have problems if p_vaddr%p_align!=0: * FreeBSD 13.0-CURRENT rtld-elf * glibc https://sourceware.org/bugzilla/show_bug.cgi?id=24606 New test i386-tls-vaddr-align.s checks our workaround makes p_vaddr%p_align = 0. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D65865 llvm-svn: 369347	2019-08-20 08:43:47 +00:00
Fangrui Song	f66b767abe	[ELF][AArch64] Allow PT_LOAD to have overlapping p_offset ranges Ported the D64906 technique to AArch64. It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of an aarch64 binary decreases by at most 192kb. If `sh_addralign(.tdata) < sh_addralign(.tbss)`, we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`. ld.so that are known to have problems if p_vaddr%p_align!=0: * musl<=1.1.22 * FreeBSD 13.0-CURRENT (and before) rtld-elf arm64 New test aarch64-tls-vaddr-align.s checks that our workaround makes p_vaddr%p_align = 0. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D64930 llvm-svn: 369344	2019-08-20 08:34:56 +00:00
Fangrui Song	01c7f4b606	[ELF][PPC] Allow PT_LOAD to have overlapping p_offset ranges This change affects the non-linker script case (precisely, when the `SECTIONS` command is not used). It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of a powerpc64 binary can be decreased by at most 192kb. The technique can be ported to other targets. Let me demonstrate the idea with a maxPageSize=65536 example: When assigning the address to the first output section of a new PT_LOAD, if the end p_vaddr of the previous PT_LOAD is 0x10020, we advance to the next multiple of maxPageSize: 0x20000. The new PT_LOAD will thus have p_vaddr=0x20000. Because p_offset and p_vaddr are congruent modulo maxPageSize, p_offset will be 0x20000, leaving a p_offset gap [0x10020, 0x20000) in the output. Alternatively, if we advance to 0x20020, the new PT_LOAD will have p_vaddr=0x20020. We can pick either 0x10020 or 0x20020 for p_offset! Obviously 0x10020 is the choice because it leaves no gap. At runtime, p_vaddr will be rounded down by pagesize (65536 if pagesize=maxPageSize). This PT_LOAD will load additional initial contents from p_offset ranges [0x10000,0x10020), which will also be loaded by the previous PT_LOAD. This is fine if -z noseparate-code is in effect or if we are not transiting between executable and non-executable segments. ld.bfd -z noseparate-code leverages this technique to keep output small. This patch implements the technique in lld, which is mostly effective on targets with large defaultMaxPageSize (AArch64/MIPS/PPC: 65536). The 3 removed alignments can save almost 3*65536 bytes. Two places that rely on p_vaddr%pagesize = 0 have to be updated. 1) We used to round p_memsz(PT_GNU_RELRO) up to commonPageSize (defaults to 4096 on all targets). Now p_vaddr%commonPageSize may be non-zero. The updated formula takes account of that factor. 2) Our TP offsets formulae are only correct if p_vaddr%p_align = 0. Fix them. See the updated comments in InputSection.cpp for details. On targets that we enable the technique (only PPC64 now), we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0` if `sh_addralign(.tdata) < sh_addralign(.tbss)` This exposes many problems in ld.so implementations, especially the offsets of dynamic TLS blocks. Known issues: FreeBSD 13.0-CURRENT rtld-elf (i386/amd64/powerpc/arm64) glibc (HEAD) i386 and x86_64 https://sourceware.org/bugzilla/show_bug.cgi?id=24606 musl<=1.1.22 on TLS Variant I architectures (aarch64/powerpc64/...) So, force p_vaddr%p_align = 0 by rounding dot up to p_align(PT_TLS). The technique will be enabled (with updated tests) for other targets in subsequent patches. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D64906 llvm-svn: 369343	2019-08-20 08:34:25 +00:00
Fangrui Song	dc06b0bc9a	[ELF] Don't special case symbolic relocations with 0 addend to ifunc in writable locations Currently the following 3 relocation types do not trigger the creation of a canonical PLT (which changes STT_GNU_IFUNC to STT_FUNC and redirects all references): 1) GOT-generating (`needsGot`) 2) PLT-generating (`needsPlt`) 3) R_ABS with 0 addend in a writable location. This is used for for ifunc function pointers in writable sections such as .data and .toc. This patch deletes case 3) to simplify the R__IRELATIVE generating logic added in D57371. Other advantages: It is guaranteed no more than 1 R__IRELATIVE is created for an ifunc. PPC64: no need to special case ifunc in toc-indirect to toc-relative relaxation. See D65755 The deleted elf::addIRelativeRelocs demonstrates that one-pass scan through relocations makes several optimizations difficult. This is something we can think about in the future. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D65995 llvm-svn: 368661	2019-08-13 09:43:40 +00:00
Fangrui Song	c6cd62352c	[ELF] Simplify handling of exportDynamic and isPreemptible In Writer::includeInDynSym(), exportDynamic is used by a Defined with protected or default visibility, to record whether it is required to be exported into .dynsym. It is set when any of the following conditions hold: 1) There is an interposable symbol from a DSO (Undefined or SharedSymbol with default visibility) 2) If -shared or --export-dynamic is specified, any symbol in an object file/bitcode sets this property, unless suppressed by canBeOmittedFromSymbolTable(). 3) --dynamic-list when producing an executable 4) protected symbol from a DSO preempted by copy relocation/canonical PLT when --ignore-{data,function}-address-equality is specified 5) ifunc is exported when -z ifunc-noplt is specified Bullet points 4) and 5) are irrelevant in this patch. Bullet 3) does not play well with 1) and 2). When -shared is specified, exportDynamic of most symbols is true. This makes it incapable to record --dynamic-list marked symbols. We thus have obscure: if (!config->shared) b->exportDynamic = true; else if (b->includeInDynsym()) b->isPreemptible = true; This patch adds another bit `Symbol::inDynamicList` to record 3). We can thus simplify handleDynamicList() by unifying the DSO and executable cases. It also allows us to simplify isPreemptible - now the field is only used in finalizeSections() and later stages. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D66091 llvm-svn: 368659	2019-08-13 09:12:52 +00:00
Fangrui Song	e28a70daf4	[ELF] Consistently prioritize non-* wildcards overs "" in version scripts We prioritize non- wildcards overs VER_NDX_LOCAL/VER_NDX_GLOBAL "". This patch generalizes the rule to "" of other versions and thus fixes PR40176. I don't feel strongly about this GNU linkers' behavior but the generalization simplifies code. Delete `config->defaultSymbolVersion` which was used to special case VER_NDX_LOCAL/VER_NDX_GLOBAL "*". In `SymbolTable::scanVersionScript`, custom versions are handled the same way as VER_NDX_LOCAL/VER_NDX_GLOBAL. So merge `config->versionScript{Locals,Globals}` into `config->versionDefinitions`. Overall this seems to simplify the code. In `SymbolTable::assign{Exact,Wildcard}Versions`, `sym->verdefIndex == config->defaultSymbolVersion` is changed to `verdefIndex == UINT32_C(-1)`. This allows us to give duplicate assignment diagnostics for `{ global: foo; };` `V1 { global: foo; };` In test/linkerscript/version-script.s: vs_index of an undefined symbol changes from 0 to 1. This doesn't matter (arguably 1 is better because the binding is STB_GLOBAL) because vs_index of an undefined symbol is ignored. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D65716 llvm-svn: 367869	2019-08-05 14:31:39 +00:00
Fangrui Song	25ab1c6471	[ELF] Move R__IRELATIVE from .rel[a].plt to .rel[a].dyn unless --pack-dyn-relocs=android[+relr] An R__IRELATIVE represents the address of a STT_GNU_IFUNC symbol (redirected at runtime) which is non-preemptable and is not associated with a canonical PLT (associated with a symbol with a section index of SHN_UNDEF but a non-zero st_value). .rel[a].plt [DT_JMPREL, DT_JMPREL+DT_JMPRELSZ) contains relocations that can be lazily resolved. R__IRELATIVE are always eagerly resolved, so conceptually they do not belong to .rela.plt. "iplt" is mostly a misnomer. glibc powerpc and powerpc64 do not resolve R__IRELATIVE if they are in .rela.plt. // a.o - synthesized PLT call stub has an R__IRELATIVE void ifunc(); int main() { ifunc(); } // b.o static void real() {} asm (".type ifunc, %gnu_indirect_function"); void ifunc() { return &real; } The lld-linked executable crashes. ld.bfd places R__IRELATIVE in .rela.dyn and the executable works. glibc i386, x86_64, and aarch64 have logic (glibc/sysdeps//dl-machine.h:elf_machine_lazy_rel) to eagerly resolve R__IRELATIVE in .rel[a].plt so the lld-linked executable works. Move R__IRELATIVE from .rel[a].plt to .rel[a].dyn to fix the crashes on glibc powerpc/powerpc64. This also helps simplifying ifunc implementation in FreeBSD rtld-elf powerpc64. If --pack-dyn-relocs=android[+relr] is specified, the Android packed dynamic relocation format is used for .rela.dyn. We cannot name in.relaIplt ".rela.dyn" because the output section will have mixed formats. This can be improved in the future. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D65651 llvm-svn: 367745	2019-08-03 02:26:52 +00:00
Fangrui Song	5391f158c2	[ELF] Add -z separate-code and pad the last page of last PF_X PT_LOAD with traps only if -z separate-code is specified This patch 1) adds -z separate-code and -z noseparate-code (default). 2) changes the condition that the last page of last PF_X PT_LOAD is padded with trap instructions. Current condition (after D33630): if there is no `SECTIONS` commands. After this change: if -z separate-code is specified. -z separate-code was introduced to ld.bfd in 2018, to place the text segment in its own pages. There is no overlap in pages between an executable segment and a non-executable segment: 1) RX cannot load initial contents from R or RW(or non-SHF_ALLOC). 2) R and RW(or non-SHF_ALLOC) cannot load initial contents from RX. lld's current status: - Between R and RX: in `Writer<ELFT>::fixSectionAlignments()`, the start of a segment is always aligned to maxPageSize, so the initial contents loaded by R and RX do not overlap. I plan to allow overlaps in D64906 if -z noseparate-code is in effect. - Between RX and RW(or non-SHF_ALLOC if RW doesn't exist): we currently unconditionally pad the last page to commonPageSize (defaults to 4096 on all targets we support). This patch will make it effective only if -z separate-code is specified. -z separate-code is a dubious feature that intends to reduce the number of ROP gadgets (which is actually ineffective because attackers can find plenty of gadgets in the text segment, no need to find gadgets in non-code regions). With the overlapping PT_LOAD technique D64906, -z noseparate-code removes two more alignments at segment boundaries than -z separate-code. This saves at most defaultCommonPageSize2 bytes, which are significant on targets with large defaultCommonPageSize (AArch64/MIPS/PPC: 65536). Issues/feedback on alignment at segment boundaries to help understand the implication: binutils PR24490 (the situation on ld.bfd is worse because they have two R-- on both sides of R-E so more alignments.) * In binutils, the 2018-02-27 commit "ld: Add --enable-separate-code" made -z separate-code the default on Linux. `d969dea983` In musl-cross-make, binutils is configured with --disable-separate-code to address size regressions caused by -z separate-code. (lld actually has the same issue, which I plan to fix in a future patch. The ld.bfd x86 status is worse because they default to max-page-size=0x200000). * https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237676 people want smaller code size. This patch will remove one alignment boundary. * Stef O'Rear: I'm opposed to any kind of page alignment at the text/rodata line (having a partial page of text aliased as rodata and vice versa has no demonstrable harm, and I actually care about small systems). So, make -z noseparate-code the default. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D64903 llvm-svn: 367537	2019-08-01 09:58:25 +00:00
Fangrui Song	3d51d4ed6d	[ELF] Detemplate maybeReportUndefined and copySectionsIntoPartitions llvm-svn: 367117	2019-07-26 14:57:53 +00:00
Fangrui Song	2be0ebb0d8	[ELF] Delete redundant pageAlign at PT_GNU_RELRO boundaries after D58892 Summary: After D58892 split the RW PT_LOAD on the PT_GNU_RELRO boundary, the new layout is: PT_LOAD(PT_GNU_RELRO(.data.rel.ro .bss.rel.ro)) PT_LOAD(.data. .bss) The two pageAlign() calls at PT_GNU_RELRO boundaries are redundant due to the existence of PT_LOAD. Reviewers: grimar, peter.smith, ruiu, espindola Reviewed By: ruiu Subscribers: sfertile, atanasyan, emaste, arichardson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64854 llvm-svn: 366307	2019-07-17 09:23:04 +00:00
Fangrui Song	47cfe8f321	[ELF] Fix variable names in comments after VariableName -> variableName change Also fix some typos. llvm-svn: 366181	2019-07-16 05:50:45 +00:00
Rui Ueyama	bfaf64ae57	Update comments for r365730. NFC. llvm-svn: 365733	2019-07-11 06:08:54 +00:00
Rui Ueyama	136d27ab4d	[Coding style change][lld] Rename variables for non-ELF ports This patch does the same thing as r365595 to other subdirectories, which completes the naming style change for the entire lld directory. With this, the naming style conversion is complete for lld. Differential Revision: https://reviews.llvm.org/D64473 llvm-svn: 365730	2019-07-11 05:40:30 +00:00
Rui Ueyama	3837f4273f	[Coding style change] Rename variables so that they start with a lowercase letter This patch is mechanically generated by clang-llvm-rename tool that I wrote using Clang Refactoring Engine just for creating this patch. You can see the source code of the tool at https://reviews.llvm.org/D64123. There's no manual post-processing; you can generate the same patch by re-running the tool against lld's code base. Here is the main discussion thread to change the LLVM coding style: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html In the discussion thread, I proposed we use lld as a testbed for variable naming scheme change, and this patch does that. I chose to rename variables so that they are in camelCase, just because that is a minimal change to make variables to start with a lowercase letter. Note to downstream patch maintainers: if you are maintaining a downstream lld repo, just rebasing ahead of this commit would cause massive merge conflicts because this patch essentially changes every line in the lld subdirectory. But there's a remedy. clang-llvm-rename tool is a batch tool, so you can rename variables in your downstream repo with the tool. Given that, here is how to rebase your repo to a commit after the mass renaming: 1. rebase to the commit just before the mass variable renaming, 2. apply the tool to your downstream repo to mass-rename variables locally, and 3. rebase again to the head. Most changes made by the tool should be identical for a downstream repo and for the head, so at the step 3, almost all changes should be merged and disappear. I'd expect that there would be some lines that you need to merge by hand, but that shouldn't be too many. Differential Revision: https://reviews.llvm.org/D64121 llvm-svn: 365595	2019-07-10 05:00:37 +00:00
Nico Weber	2c45043415	lld/elf: Deduplicate undefined symbol diagnostics Before: ``` ld.lld: error: undefined symbol: f() >>> referenced by test.cc:3 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-9c0808.o:(g()) ld.lld: error: undefined symbol: f() >>> referenced by test.cc:4 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-9c0808.o:(h()) ld.lld: error: undefined symbol: f() >>> referenced by test.cc:5 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-9c0808.o:(j()) ld.lld: error: undefined symbol: k() >>> referenced by test.cc:5 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-9c0808.o:(j()) ld.lld: error: undefined symbol: f() >>> referenced by test2.cc:2 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test2-07b391.o:(asdf()) clang: error: linker command failed with exit code 1 (use -v to see invocation) ``` Now: ``` ld.lld: error: undefined symbol: f() >>> referenced by test.cc:3 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-0e07ba.o:(g()) >>> referenced by test.cc:4 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-0e07ba.o:(h()) >>> referenced by test.cc:5 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-0e07ba.o:(j()) >>> referenced by test2.cc:2 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test2-6bdb24.o:(asdf()) ld.lld: error: undefined symbol: k() >>> referenced by test.cc:5 >>> /var/folders/c5/8d7sdn1x2mg92mj0rndghhdr0000gn/T/test-0e07ba.o:(j()) clang: error: linker command failed with exit code 1 (use -v to see invocation) ``` If there are more than 10 references to an undefined symbol, only the first 10 are printed. Fixes PR42260. Differential Revision: https://reviews.llvm.org/D63344 llvm-svn: 363962	2019-06-20 18:25:57 +00:00
Fangrui Song	5b4285d82d	[ELF][RISCV] Create dummy .sdata for __global_pointer$ if .sdata does not exist If .sdata is absent, linker synthesized __global_pointer$ gets a section index of SHN_ABS. (ld.bfd has a similar issue: binutils PR24678) Scrt1.o may use `lla gp, __global_pointer$` to reference the symbol PC relatively. In -pie/-shared mode, lld complains if a PC relative relocation references an absolute symbol (SHN_ABS) but ld.bfd doesn't: ld.lld: error: relocation R_RISCV_PCREL_HI20 cannot refer to lute symbol: __global_pointer$ Let the reference of __global_pointer$ to force creation of .sdata to fix the problem. This is similar to _GLOBAL_OFFSET_TABLE_, which forces creation of .got or .got.plt . Also, change the visibility from STV_HIDDEN to STV_DEFAULT and don't define the symbol for -shared. This matches ld.bfd, though I don't understand why it uses STV_DEFAULT. Reviewed By: ruiu, jrtc27 Differential Revision: https://reviews.llvm.org/D63132 llvm-svn: 363351	2019-06-14 02:14:53 +00:00
Peter Collingbourne	eaf3f56924	ELF: Don't process the partition end marker during combineEhSections(). Otherwise the getPartition() accessor may return an OOB pointer. Found using _GLIBCXX_DEBUG. The error is benign (we never dereference the pointer for the end marker) so this wasn't caught by e.g. the sanitizer bots. llvm-svn: 363026	2019-06-11 02:54:30 +00:00
Peter Collingbourne	0282898586	ELF: Create synthetic sections for loadable partitions. We create several types of synthetic sections for loadable partitions, including: - The dynamic symbol table. This allows code outside of the loadable partitions to find entry points with dlsym. - Creating a dynamic symbol table also requires the creation of several other synthetic sections for the partition, such as the dynamic table and hash table sections. - The partition's ELF header is represented as a synthetic section in the combined output file, and will be used by llvm-objcopy to extract partitions. Differential Revision: https://reviews.llvm.org/D62350 llvm-svn: 362819	2019-06-07 17:57:58 +00:00
Jordan Rupprecht	0629e1252f	Revert [ELF] Simplify the condition to create .interp This reverts r362355 (git commit `c78c999a9c`) This causes some internal tests to fail; details provided offthread. llvm-svn: 362755	2019-06-06 23:23:14 +00:00
Fangrui Song	82442adfc0	[PPC32] Improve the 32-bit PowerPC port Many -static/-no-pie/-shared/-pie applications linked against glibc or musl should work with this patch. This also helps FreeBSD PowerPC64 to migrate their lib32 (PR40888). * Fix default image base and max page size. * Support new-style Secure PLT (see below). Old-style BSS PLT is not implemented, so it is not suitable for FreeBSD rtld now because it doesn't support Secure PLT yet. * Support more initial relocation types: R_PPC_ADDR32, R_PPC_REL16, R_PPC_LOCAL24PC, R_PPC_PLTREL24, and R_PPC_GOT16. The addend of R_PPC_PLTREL24 is special: it decides the call stub PLT type but it should be ignored for the computation of target symbol VA. Support GNU ifunc * Support .glink used for lazy PLT resolution in glibc * Add a new thunk type: PPC32PltCallStub that is similar to PPC64PltCallStub. It is used by R_PPC_REL24 and R_PPC_PLTREL24. A PLT stub used in -fPIE/-fPIC usually loads an address relative to .got2+0x8000 (-fpie/-fpic code uses _GLOBAL_OFFSET_TABLE_ relative addresses). Two .got2 sections in two object files have different addresses, thus a PLT stub can't be shared by two object files. To handle this incompatibility, change the parameters of Thunk::isCompatibleWith to `const InputSection &, const Relocation &`. PowerPC psABI specified an old-style .plt (BSS PLT) that is both writable and executable. Linkers don't make separate RW- and RWE segments, which causes all initially writable memory (think .data) executable. This is a big security concern so a new PLT scheme (secure PLT) was developed to address the security issue. TLS will be implemented in D62940. glibc older than ~2012 requires .rela.dyn to include .rela.plt, it can not handle the DT_RELA+DT_RELASZ == DT_JMPREL case correctly. A hack (not included in this patch) in LinkerScript.cpp addOrphanSections() to work around the issue: if (Config->EMachine == EM_PPC) { // Older glibc assumes .rela.dyn includes .rela.plt Add(In.RelaDyn); if (In.RelaPlt->isLive() && !In.RelaPlt->Parent) In.RelaDyn->getParent()->addSection(In.RelaPlt); } Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D62464 llvm-svn: 362721	2019-06-06 17:03:00 +00:00
Rui Ueyama	2057f8366a	Read .note.gnu.property sections and emit a merged .note.gnu.property section. This patch also adds `--require-cet` option for the sake of testing. The actual feature for IBT-aware PLT is not included in this patch. This is a part of https://reviews.llvm.org/D59780. Submitting this first should make it easy to work with a related change (https://reviews.llvm.org/D62609). Differential Revision: https://reviews.llvm.org/D62853 llvm-svn: 362579	2019-06-05 03:04:46 +00:00
Peter Collingbourne	06f3b094e4	ELF: Introduce a separate bit for tracking whether an output section has ever had an input section added to it. NFCI. We currently (ab)use the Live bit on output sections to track whether the section has ever had an input section added to it, and then later use it during orphan placement. This will conflict with one of my upcoming partition-related changes that will assign all output sections to a partition (thus marking them as live) so that they can be added to the correct segment by the code that creates program headers. Instead of using the Live bit for this purpose, create a new flag and start using it to track the property explicitly. Differential Revision: https://reviews.llvm.org/D62348 llvm-svn: 362444	2019-06-03 20:14:25 +00:00
Fangrui Song	c78c999a9c	[ELF] Simplify the condition to create .interp (1) {gcc,clang} -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created (2) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp not created (3) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c a.so => .interp created The inconsistency of (2) is due to the condition `!Config->SharedFiles.empty()`. To make lld behave more like ld.bfd, we could change the condition to: Config->HasDynSymTab && !Config->DynamicLinker.empty() && Script->needsInterpSection(); However, that would bring another inconsistency as can be observed with: (4) {gcc,clang} -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created So instead, use `!Config->DynamicLinker.empty() && Script->needsInterpSection()`, which is both simple and consistent in these cases. The inconsistency of (4) likely originated from ld.bfd and gold's choice to have a default --dynamic-linker. Their condition to create .interp is ANDed with (not -shared). Since lld doesn't have a default --dynamic-linker, compiler drivers (gcc/clang) don't pass --dynamic-linker for -shared, and direct ld users are not supposed to specify --dynamic-linker for -shared, we do not need the condition !Config->Shared. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D62765 llvm-svn: 362355	2019-06-03 05:25:03 +00:00
Fangrui Song	0a6bababa8	[ELF][MIPS] Delete dead !Sym->isDefined() check in addAbsolute() llvm-svn: 362314	2019-06-02 02:43:38 +00:00
Fangrui Song	0526c0cd8e	[ELF] Implement Local Dynamic style TLSDESC for x86-64 For the Local Dynamic case of TLSDESC, _TLS_MODULE_BASE_ is defined as a special TLS symbol that makes: 1) Without relaxation: it produces a dynamic TLSDESC relocation that computes 0. Adding @dtpoff to access a TLS symbol. 2) With LD->LE relaxation: _TLS_MODULE_BASE_@tpoff = 0 (lowest address in the TLS block). Adding @tpoff to access a TLS symbol. For 1), this saves dynamic relocations and GOT slots as otherwise (General Dynamic) we would create an R_X86_64_TLSDESC and reserve two GOT slots for each symbol. Add ElfSym::TlsModuleBase and change the signature of getTlsTpOffset() to special case _TLS_MODULE_BASE_. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D62577 llvm-svn: 362078	2019-05-30 10:00:20 +00:00
Peter Collingbourne	ba2816be82	ELF: Add basic partition data structures and behaviours. This change causes us to read partition specifications from partition specification sections and split output sections into partitions according to their reachability from partition entry points. This is only the first step towards a full implementation of partitions. Later changes will add additional synthetic sections to each partition so that they can be loaded independently. Differential Revision: https://reviews.llvm.org/D60353 llvm-svn: 361925	2019-05-29 03:55:20 +00:00
Fangrui Song	173a68f1fb	[ELF] Replace two addSymbol() call sites with Symbol::resolve(). NFC If we have a handle of the symbol, insert() called by addSymbol() is redundant. Just call resolve(). llvm-svn: 361802	2019-05-28 10:12:06 +00:00
Rui Ueyama	d8f8abbd4a	Use SymbolTable::insert() to implement --trace. Differential Revision: https://reviews.llvm.org/D62381 llvm-svn: 361791	2019-05-28 06:33:06 +00:00
Fangrui Song	ed2ad77ccb	[ARM][AArch64] Revert Android Bionic PT_TLS overaligning hack This reverts D53906. D53906 increased p_align of PT_TLS on ARM/AArch64 to 32/64 to make the static TLS layout compatible with Android Bionic's ELF TLS. However, this may cause glibc ARM/AArch64 programs to crash (see PR41527). The faulty PT_TLS in the executable satisfies p_vaddr%p_align != 0. The remainder is normally 0 but may be non-zero with the hack in place. The problem is that we increase PT_TLS's p_align after OutputSections' addresses are fixed (assignAddress()). It is possible that p_vaddr%old_p_align = 0 while p_vaddr%new_p_align != 0. For a thread local variable defined in the executable, lld computed TLS offset (local exec) is different from glibc computed TLS offset from another module (initial exec/generic dynamic). Note: PR41527 said the bug affects initial exec but actually generic dynamic is affected as well. (glibc is correct in that it compute offsets that satisfy `offset%p_align == p_vaddr%p_align`, which is a basic ELF requirement. This hack appears to work on FreeBSD rtld, musl<=1.1.22, and Bionic, but that is just because they (and lld) incorrectly compute offsets that satisfy `offset%p_align = 0` instead.) Android developers are fine to revert this patch, carry this patch in their tree before figuring out a long-term solution (e.g. a dummy .tdata with sh_addralign=64 sh_size={0,1} in crtbegin*.o files. The overhead is now insignificant after D62059). Reviewed By: rprichard, srhines Differential Revision: https://reviews.llvm.org/D62055 llvm-svn: 361090	2019-05-18 03:16:00 +00:00
Fangrui Song	f3dccc64af	[ELF] Don't align PT_TLS's p_memsz The code was added in r252352, probably to address some layout issues. Actually PT_TLS's p_memsz doesn't need to be aligned on either variant. ld.bfd doesn't do that. In case of larger alignment (e.g. 64 for Android Bionic on AArch64, see D62055), this may make the overhead smaller. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D62059 llvm-svn: 361029	2019-05-17 12:48:53 +00:00
Rui Ueyama	bbf154cf9c	Move symbol resolution code out of SymbolTable class. This is the last patch of the series of patches to make it possible to resolve symbols without asking SymbolTable to do so. The main point of this patch is the introduction of `elf::resolveSymbol(Symbol Old, Symbol New)`. That function resolves or merges given symbols by examining symbol types and call replaceSymbol (which memcpy's New to Old) if necessary. With the new function, we have now separated symbol resolution from symbol lookup. If you already have a Symbol pointer, you can directly resolve the symbol without asking SymbolTable to do that. Now that the nice abstraction become available, I can start working on performance improvement of the linker. As a starter, I'm thinking of making --{start,end}-lib faster. --{start,end}-lib is currently unnecessarily slow because it looks up the symbol table twice for each symbol. - The first hash table lookup/insertion occurs when we instantiate a LazyObject file to insert LazyObject symbols. - The second hash table lookup/insertion occurs when we create an ObjFile from LazyObject file. That overwrites LazyObject symbols with Defined symbols. I think it is not too hard to see how we can now eliminate the second hash table lookup. We can keep LazyObject symbols in Step 1, and then call elf::resolveSymbol() to do Step 2. Differential Revision: https://reviews.llvm.org/D61898 llvm-svn: 360975	2019-05-17 01:55:20 +00:00
Rui Ueyama	5c073a94f9	Introduce CommonSymbol. Previously, we handled common symbols as a kind of Defined symbol, but what we were doing for common symbols is pretty different from regular defined symbols. Common symbol and defined symbol are probably as different as shared symbol and defined symbols are different. This patch introduces CommonSymbol to represent common symbols. After symbols are resolved, they are converted to Defined symbols residing in a .bss section. Differential Revision: https://reviews.llvm.org/D61895 llvm-svn: 360841	2019-05-16 03:29:03 +00:00
Rui Ueyama	7d4761928e	Simplify SymbolTable::add{Defined,Undefined,...} functions. SymbolTable's add-family functions have lots of parameters because when they have to create a new symbol, they forward given arguments to Symbol's constructors. Therefore, the functions take at least as many arguments as their corresponding constructors. This patch simplifies the add-family functions. Now, the functions take a symbol instead of arguments to construct a symbol. If there's no existing symbol, a given symbol is memcpy'ed to the symbol table. Otherwise, the functions attempt to merge the existing and a given new symbol. I also eliminated `CanOmitFromDynSym` parameter, so that the functions take really one argument. Symbol classes are trivially constructible, so looks like constructing them to pass to add-family functions is as cheap as passing a lot of arguments to the functions. A quick benchmark showed that this patch seems performance-neutral. This is a preparation for http://lists.llvm.org/pipermail/llvm-dev/2019-April/131902.html Differential Revision: https://reviews.llvm.org/D61855 llvm-svn: 360838	2019-05-16 02:14:00 +00:00
Peter Smith	4e21c770ec	[ELF] Full support for -n (--nmagic) and -N (--omagic) via common page The -n (--nmagic) disables page alignment, and acts as a -Bstatic The -N (--omagic) does what -n does but also marks the executable segment as writeable. As page alignment is disabled headers are not allocated unless explicit in the linker script. To disable page alignment in LLD we choose to set the page sizes to 1 so that any alignment based on the page size does nothing. To set the Target->PageSize to 1 we implement -z common-page-size, which has the side effect of allowing the user to set the value as well. Setting the page alignments to 1 does mean that any use of CONSTANT(MAXPAGESIZE) or CONSTANT(COMMONPAGESIZE) in a linker script will return 1, unlike in ld.bfd. However given that -n and -N disable paging these probably shouldn't be used in a linker script where -n or -N is in use. Differential Revision: https://reviews.llvm.org/D61688 llvm-svn: 360593	2019-05-13 16:01:26 +00:00
Ben Dunbobbin	3edca1ac1a	[LLD][NFC] Refactor: BuildID hash size now computed in one place. Differential Revision: https://reviews.llvm.org/D61078 llvm-svn: 360316	2019-05-09 08:08:09 +00:00
Fangrui Song	d45df09435	[ELF] Place SHT_NOTE sections with the same alignment into one PT_NOTE Summary: While the generic ABI requires notes to be 8-byte aligned in ELF64, many vendor-specific notes (from Linux, NetBSD, Solaris, etc) use 4-byte alignment. In a PT_NOTE segment, if 4-byte aligned notes are followed by an 8-byte aligned note, the possible 4-byte padding may make consumers fail to parse the 8-byte aligned note. See PR41000 for a recent report about .note.gnu.property (NT_GNU_PROPERTY_TYPE_0). (Note, for NT_GNU_PROPERTY_TYPE_0, the consumers should probably migrate to PT_GNU_PROPERTY, but the alignment issue affects other notes as well.) To fix the issue, don't mix notes with different alignments in one PT_NOTE. If compilers emit 4-byte aligned notes before 8-byte aligned notes, we'll create at most 2 segments. sh_size%sh_addralign=0 is actually implied by the rule for linking unrecognized sections (in generic ABI), so we don't have to check that. Notes that match in name, type and attribute flags are concatenated into a single output section. The compilers have to ensure sh_size%sh_addralign=0 to make concatenated notes parsable. An alternative approach is to create a PT_NOTE for each SHT_NOTE, but we'll have to incur the sizeof(Elf64_Phdr)=56 overhead every time a new note section is introduced. Reviewers: ruiu, jakehehrlich, phosek, jhenderson, pcc, espindola Subscribers: emaste, arichardson, krytarowski, fedor.sergeev, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61296 llvm-svn: 359853	2019-05-03 00:35:49 +00:00
Andrew Ng	0f4c58f6f4	[LLD][ELF] Fix getRankProximity to "ignore" not live sections This is a follow up to r358979 which made findOrphanPos only consider live sections. Unfortunately, this required change to getRankProximity, used by findOrphanPos, was missed. Differential Revision: https://reviews.llvm.org/D61197 llvm-svn: 359554	2019-04-30 12:27:06 +00:00
Andrew Ng	98c858a23b	[ELF] Change findOrphanPos to only consider live sections This patch changes the behaviour of findOrphanPos to only consider live sections when placing orphan sections. This used to be how it behaved in the past. Differential Revision: https://reviews.llvm.org/D60273 llvm-svn: 358979	2019-04-23 12:17:15 +00:00
Fangrui Song	32c0ebe615	Use llvm::stable_sort Make some small adjustment while touching the code: make parameters const, use less_first(), etc. Differential Revision: https://reviews.llvm.org/D60989 llvm-svn: 358943	2019-04-23 02:42:06 +00:00
George Rimar	da49faf15e	[LLD][ELF] - Fix the different behavior of the linker script symbols on different platforms. This generalizes code and also fixes the broken behavior shown in one of our test cases for some targets, like x86-64. The issue occurs when the forward declarations are used in the script. One of the samples is: SECTIONS { foo = ADDR(.text) - ABSOLUTE(ADDR(.text)); }; In that case, we have a broken output when output target does not use thunks. That happens because thunks creating code (called from maybeAddThunks) calls Script->assignAddresses() at least one more time, what fixups the values. As a result final symbols values can be different on AArch64 and x86, for example. In this patch, I generalize and rename maybeAddThunks to finalizeAddressDependentContent and now it is used and called by all targets. Differential revision: https://reviews.llvm.org/D55550 llvm-svn: 358646	2019-04-18 08:15:54 +00:00
Peter Collingbourne	97d25e068f	ELF: Move build id computation to Writer. NFCI. With partitions, each partition should have the same build id. This means that the build id needs to be only computed once, otherwise we will end up with different build ids in each partition as a result of the file contents changing. This change moves the computation of the build id into Writer so that it only happens once. Differential Revision: https://reviews.llvm.org/D60342 llvm-svn: 358536	2019-04-16 22:45:14 +00:00
Peter Collingbourne	d3e207057f	ELF: Move verneed tracking data structures out of VersionNeedSection. For partitions I intend to use the same set of version indexes in each partition for simplicity. Since each partition will need its own VersionNeedSection this will require moving the verneed tracking out of VersionNeedSection. The way I've done this is to move most of the tracking into SharedFile. What will eventually become the per-partition tracking still lives in VersionNeedSection. As a bonus the code gets a little simpler and more consistent with how we handle verdef. Differential Revision: https://reviews.llvm.org/D60307 llvm-svn: 357926	2019-04-08 17:48:05 +00:00
Peter Collingbourne	cc1618e668	ELF: De-template SharedFile. NFCI. Differential Revision: https://reviews.llvm.org/D60305 llvm-svn: 357925	2019-04-08 17:35:55 +00:00
Rui Ueyama	4af8d47d05	Fix -emit-reloc against local symbols. Previously, we drop symbols starting with .L from the symbol table, so if there is a relocation that refers a .L symbol, it ended up referencing a null -- which happened to be interpreted as an absolute symbol. This patch copies all symbols including local ones if -emit-reloc is given. Fixes https://bugs.llvm.org/show_bug.cgi?id=41385 Differential Revision: https://reviews.llvm.org/D60306 llvm-svn: 357885	2019-04-08 06:45:07 +00:00
Rui Ueyama	7c28937baf	Remove redundant parameters. NFC. llvm-svn: 357738	2019-04-05 01:30:09 +00:00
Peter Collingbourne	a9e847238e	ELF: Perform per-section .ARM.exidx processing during combineEhFrameSections(). NFCI. And rename the function to combineEhSections(). This makes the processing of .ARM.exidx even more similar to .eh_frame and means that we can avoid an additional loop over InputSections. Differential Revision: https://reviews.llvm.org/D60026 llvm-svn: 357417	2019-04-01 18:01:18 +00:00
Fangrui Song	d83fb24533	[ELF] Rename SyntheticSection::empty to more appropriate isNeeded() with opposite meaning Summary: Some synthetic sections can be empty while still being needed, thus they can't be removed by removeUnusedSyntheticSections(). Rename this member function to more appropriate isNeeded() with the opposite meaning. No functional change intended. Reviewers: ruiu, espindola Reviewed By: ruiu Subscribers: jhenderson, grimar, emaste, arichardson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59982 llvm-svn: 357377	2019-04-01 08:16:08 +00:00
Rui Ueyama	68b9f45fee	Replace `typedef A B` with `using B = A`. NFC. I did this using Perl. Differential Revision: https://reviews.llvm.org/D60003 llvm-svn: 357372	2019-04-01 00:11:24 +00:00
Fangrui Song	8048fe2b8c	[ELF][MachO][wasm] Simplify range-style std::find{,_if} with STLExtras.h utilities. NFC llvm-svn: 357269	2019-03-29 16:21:16 +00:00
Rui Ueyama	f28825bc06	Create an instance of Target after reading all input files. NFC. This change itself doesn't mean anything, but it helps D59780 because in patch, we don't know whether we need to create a CET-aware PLT or not until we read all input files. llvm-svn: 357194	2019-03-28 17:38:53 +00:00
Peter Smith	3ce9af9370	[ELF][ARM] Recommit Redesign of .ARM.exidx handling to use a SyntheticSection Recommit r356666 with fixes for buildbot failure, as well as handling for --emit-relocs, which we decide not to emit any relocation sections as the table is already position independent and an offline tool can deduce the relocations. Instead of creating extra Synthetic .ARM.exidx sections to account for gaps in the table, create a single .ARM.exidx SyntheticSection that can derive the contents of the gaps from a sorted list of the executable InputSections. This has the benefit of moving the ARM specific code for SyntheticSections in SHF_LINK_ORDER processing and the table merging code into the ARM specific SyntheticSection. This also makes it easier to create EXIDX_CANTUNWIND table entries for executable InputSections that don't have an associated .ARM.exidx section. Fixes pr40277 Differential Revision: https://reviews.llvm.org/D59216 llvm-svn: 357160	2019-03-28 11:10:20 +00:00
Fangrui Song	210949a221	[ELF] Change GOT_FROM_END (relative to end(.got)) to GOTPLT (start(.got.plt)) Summary: This should address remaining issues discussed in PR36555. Currently R_GOT_FROM_END are exclusively used by x86 and x86_64 to express relocations types relative to the GOT base. We have _GLOBAL_OFFSET_TABLE_ (GOT base) = start(.got.plt) but end(.got) != start(.got.plt) This can have problems when _GLOBAL_OFFSET_TABLE_ is used as a symbol, e.g. glibc dl_machine_dynamic assumes _GLOBAL_OFFSET_TABLE_ is start(.got.plt), which is not true. extern const ElfW(Addr) _GLOBAL_OFFSET_TABLE_[] attribute_hidden; return _GLOBAL_OFFSET_TABLE_[0]; // R_X86_64_GOTPC32 In this patch, we Change all GOT_FROM_END to GOTPLT to fix the problem. * Add HasGotPltOffRel to denote whether .got.plt should be kept even if the section is empty. * Simplify GotSection::empty and GotPltSection::empty by setting HasGotOffRel and HasGotPltOffRel according to GlobalOffsetTable early. The change of R_386_GOTPC makes X86::writePltHeader simpler as we don't have to compute the offset start(.got.plt) - Ebx (it is constant 0). We still diverge from ld.bfd (at least in most cases) and gold in that .got.plt and .got are not adjacent, but the advantage doing that is unclear. Reviewers: ruiu, sivachandra, espindola Subscribers: emaste, mehdi_amini, arichardson, dexonsmith, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59594 llvm-svn: 356968	2019-03-25 23:46:19 +00:00
Peter Smith	54dab70bb7	[ELF][ARM] Revert Redesign of .ARM.exidx handling to use a SyntheticSection There is a reproducible buildbot failure (segfault) on the 2 stage clang-cmake-armv8-lld bot. Reverting while I investigate. Differential Revision: https://reviews.llvm.org/D59216 llvm-svn: 356684	2019-03-21 17:17:54 +00:00
Peter Smith	d3511a214e	[ELF][ARM] Redesign of .ARM.exidx handling to use a SyntheticSection Instead of creating extra Synthetic .ARM.exidx sections to account for gaps in the table, create a single .ARM.exidx SyntheticSection that can derive the contents of the gaps from a sorted list of the executable InputSections. This has the benefit of moving the ARM specific code for SyntheticSections in SHF_LINK_ORDER processing and the table merging code into the ARM specific SyntheticSection. This also makes it easier to create EXIDX_CANTUNWIND table entries for executable InputSections that don't have an associated .ARM.exidx section. Fixes pr40277 Differential Revision: https://reviews.llvm.org/D59216 llvm-svn: 356666	2019-03-21 14:06:40 +00:00
Fangrui Song	6778b53e95	[ELF] De-virtualize findOrphanPos, excludeLibs and handleARMTlsRelocation llvm-svn: 356331	2019-03-17 13:53:42 +00:00
Fangrui Song	e8710ef1fb	[ELF] Split RW PT_LOAD on the PT_GNU_RELRO boundary Summary: Based on Peter Collingbourne's suggestion in D56828. Before D56828: PT_LOAD(.data PT_GNU_RELRO(.data.rel.ro .bss.rel.ro) .bss) Old: PT_LOAD(PT_GNU_RELRO(.data.rel.ro .bss.rel.ro) .data .bss) New: PT_LOAD(PT_GNU_RELRO(.data.rel.ro .bss.rel.ro)) PT_LOAD(.data. .bss) The new layout reflects the runtime memory mappings. By having two PT_LOAD segments, we can utilize the NOBITS part of the first PT_LOAD and save bytes for .bss.rel.ro. .bss.rel.ro is currently small and only used by copy relocations of symbols in read-only segments, but it can be used for other purposes in the future, e.g. if a relro section's statically relocated data is all zeros, we can move it to .bss.rel.ro. Reviewers: espindola, ruiu, pcc Reviewed By: ruiu Subscribers: nemanjai, jvesely, nhaehnle, javed.absar, kbarton, emaste, arichardson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58892 llvm-svn: 356226	2019-03-15 01:29:57 +00:00
Fangrui Song	07f8daf05e	[ELF] Simplify RelRo, TLS, NOBITS section ranks and make RW PT_LOAD start with RelRo Old: PT_LOAD(.data \| PT_GNU_RELRO(.data.rel.ro .bss.rel.ro) \| .bss) New: PT_LOAD(PT_GNU_RELRO(.data.rel.ro .bss.rel.ro) \| .data .bss) The placement of \| indicates page alignment caused by PT_GNU_RELRO. The new layout has simpler rules and saves space for many cases. Old size: roundup(.data) + roundup(.data.rel.ro) New size: roundup(.data.rel.ro + .bss.rel.ro) + .data Other advantages: * At runtime the 3 memory mappings decrease to 2. * start(PT_TLS) = start(PT_GNU_RELRO) = start(RW PT_LOAD). This simplifies binary manipulation tools. GNU strip before 2.31 discards PT_GNU_RELRO if its address is not equal to the start of its associated PT_LOAD. This has been fixed by https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f2731e0c374e5323ce4cdae2bcc7b7fe22da1a6f But with this change, we will be compatible with GNU strip before 2.31 * Before, .got.plt (non-relro by default) was placed before .got (relro by default), which made it impossible to have _GLOBAL_OFFSET_TABLE_ (start of .got.plt on x86-64) equal to the end of .got (R_GOT*_FROM_END) (https://bugs.llvm.org/show_bug.cgi?id=36555). With the new ordering, we can improve on this regard if we'd like to. Reviewers: ruiu, espindola, pcc Subscribers: emaste, arichardson, llvm-commits, joerg, jdoerfert Differential Revision: https://reviews.llvm.org/D56828 llvm-svn: 356117	2019-03-14 03:47:45 +00:00
Peter Collingbourne	8a28673a2e	ELF: Don't add .dynamic strings to .dynstr early. This does not appear to be necessary because StringTableSection does not need to be finalized, which also means that we can remove the call to finalizeSynthetic on .dynstr. Differential Revision: https://reviews.llvm.org/D59240 llvm-svn: 355977	2019-03-12 20:58:34 +00:00
Rui Ueyama	7fd99fc475	Fail early if an output file is not writable Fixes https://bugs.llvm.org/show_bug.cgi?id=36478 Differential Revision: https://reviews.llvm.org/D43664 llvm-svn: 355834	2019-03-11 16:30:55 +00:00
Peter Collingbourne	5ee9abd4c8	ELF: De-template OutputSection::finalize() and MipsGotSection::build(). NFCI. Differential Revision: https://reviews.llvm.org/D58810 llvm-svn: 355479	2019-03-06 03:07:57 +00:00
Peter Collingbourne	704dfd6e28	ELF: Extract a non-ELFT base class for VersionNeedSection. We're going to need a separate VersionNeedSection for each partition, and the partition data structure won't be templated. With this the VersionTableSection class no longer needs ELFT, so detemplate it. Differential Revision: https://reviews.llvm.org/D58808 llvm-svn: 355478	2019-03-06 03:07:48 +00:00
Peter Collingbourne	16d9a0acfd	ELF: Change FileSize back to a uint64_t. This lets us detect file size overflows when creating a 64-bit binary on a 32-bit machine. Differential Revision: https://reviews.llvm.org/D58840 llvm-svn: 355218	2019-03-01 18:53:41 +00:00
Peter Smith	22ce712c19	[ELF][ARM] Fix clang-armv7-linux-build-cache builds of LLD [NFC] r355153 introduced a build failure on a build bot that uses clang natively on an armv7-a machine. This a temporary fix to use size_t rather than uint64_t. llvm-svn: 355195	2019-03-01 10:52:25 +00:00

1 2 3 4 5 ...

1691 Commits