llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	fbf41b5267	[ELF] Simplify sh_addr computation and warn if sh_addr is not a multiple of sh_addralign See `docs/ELF/linker_script.rst` for the new computation for sh_addr and sh_addralign. `ALIGN(section_align)` now means: "increase alignment to section_align" (like yet another input section requirement). The "start of section .foo changes from 0x11 to 0x20" warning no longer makes sense. Change it to warn if sh_addr%sh_addralign!=0. To decrease the alignment from the default max_input_align, use `.output ALIGN(8) : {}` instead of `.output : ALIGN(8) {}` See linkerscript/section-address-align.test as an example. When both an output section address and ALIGN are set (can be seen as an "undefined behavior" https://sourceware.org/ml/binutils/2020-03/msg00115.html), lld may align more than GNU ld, but it makes a linker script working with GNU ld hard to break with lld. This patch can be considered as restoring part of the behavior before D74736. Differential Revision: https://reviews.llvm.org/D75724	2020-03-11 09:35:42 -07:00
Fangrui Song	00925aadb3	[ELF][PPC32] Fix canonical PLTs when the order does not match the PLT order Reviewed By: Bdragon28 Differential Revision: https://reviews.llvm.org/D75394	2020-02-28 22:23:14 -08:00
Fangrui Song	423194098b	[ELF] --orphan-handling=: don't warn/error for unused synthesized sections This makes --orphan-handling= less noisy. This change also improves our compatibility with GNU ld. GNU ld special cases .symtab, .strtab and .shstrtab . We need output section descriptions for .symtab, .strtab and .shstrtab to suppress: <internal>:(.symtab) is being placed in '.symtab' <internal>:(.shstrtab) is being placed in '.shstrtab' <internal>:(.strtab) is being placed in '.strtab' With --strip-all, .symtab and .strtab can be omitted (note, --strip-all is not compatible with --emit-relocs). Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D75149	2020-02-26 08:56:12 -08:00
Rafael Ávila de Espíndola	7b44f0428a	Add a llvm::shuffle and use it in lld With this --shuffle-sections=seed produces the same result in every host. Reviewed By: grimar, MaskRay Differential Revision: https://reviews.llvm.org/D74971	2020-02-22 10:05:29 -08:00
Fangrui Song	dbd7281aa7	[ELF] Shuffle .init_array/.fini_array with --shuffle-sections= Useful for detecting static initialization order fiasco. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74887	2020-02-21 08:16:07 -08:00
Fangrui Song	de0dda54d3	[ELF] Warn changed output section address When the output section address (addrExpr) is specified, GNU ld warns if sh_addr is different. This patch implements the warning. Note, LinkerScript::assignAddresses can be called more than once. We need to record the changed section addresses, and only report the warnings after the addresses are finalized. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74741	2020-02-21 08:13:29 -08:00
Fangrui Song	6ed8e20143	[ELF] Ignore the maximum of input section alignments for two cases Follow-up for D74286. Notations: * alignExpr: the computed ALIGN value * max_input_align: the maximum of input section alignments This patch changes the following two cases to match GNU ld: * When ALIGN is present, GNU ld sets output sh_addr to alignExpr, while lld use max(alignExpr, max_input_align) * When addrExpr is specified but alignExpr is not, GNU ld sets output sh_addr to addrExpr, while lld uses `advance(0, max_input_align)` Note, sh_addralign is still set to max(alignExpr, max_input_align). lma-align.test is enhanced a bit to check we don't overalign sh_addr. fixSectionAlignments() sets addrExpr but not alignExpr for the `!hasSectionsCommand` case. This patch sets alignExpr as well so that max_input_align will be respected. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74736	2020-02-21 08:12:00 -08:00
Rafael Ávila de Espíndola	d48d339156	[lld][ELF] Add --shuffle-sections=seed to shuffle input sections Summary: This option causes lld to shuffle sections by assigning different priorities in each run. The use case for this is to introduce randomization in benchmarks. The idea is inspired by the paper "Producing Wrong Data Without Doing Anything Obviously Wrong!" (https://www.inf.usi.ch/faculty/hauswirth/publications/asplos09.pdf). Unlike the paper, we shuffle individual sections, not just input files. Doing this in lld is particularly convenient as the --reproduce option makes it easy to collect all the necessary bits for relinking the program being benchmarked. Once that it is done, all that is needed is to add --shuffle-sections=0 to the response file and relink before each run of the benchmark. Differential Revision: https://reviews.llvm.org/D74791	2020-02-19 13:44:12 -08:00
Fangrui Song	7c426fb1a6	[ELF] Support INSERT [AFTER\|BEFORE] for orphan sections D43468+D44380 added INSERT [AFTER\|BEFORE] for non-orphan sections. This patch makes INSERT work for orphan sections as well. `SECTIONS {...} INSERT [AFTER\|BEFORE] .foo` does not set `hasSectionCommands`, so the result will be similar to a regular link without a linker script. The differences when `hasSectionCommands` is set include: * image base is different * -z noseparate-code/-z noseparate-loadable-segments are unavailable * some special symbols such as `_end _etext _edata` are not defined The behavior is similar to GNU ld: INSERT is not considered an external linker script. This feature makes the section layout more flexible. It can be used to: * Place .nv_fatbin before other readonly SHT_PROGBITS sections to mitigate relocation overflows. * Disturb the layout to expose address sensitive application bugs. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D74375	2020-02-12 08:21:52 -08:00
Fangrui Song	b498d99338	[ELF] Start a new PT_LOAD if LMA region is different GNU ld has a counterintuitive lang_propagate_lma_regions rule. ``` // .foo's LMA region is propagated to .bar because their VMA region is the same, // and .bar does not have an explicit output section address (addr_tree). .foo : { (.foo) } >RAM AT> FLASH .bar : { (.bar) } >RAM // An explicit output section address disables propagation. .foo : { (.foo) } >RAM AT> FLASH .bar . : { (.bar) } >RAM ``` In both cases, lld thinks .foo's LMA region is propagated and places .bar in the same PT_LOAD, so lld diverges from GNU ld w.r.t. the second case (lma-align.test). This patch changes Writer<ELFT>::createPhdrs to disable propagation (start a new PT_LOAD). A user of the first case can make linker scripts portable by explicitly specifying `AT>`. By contrast, there was no workaround for the old behavior. This change uncovers another LMA related bug in assignOffsets() where `ctx->lmaOffset = 0;` was omitted. It caused a spurious "load address range overlaps" error for at2.test The new PT_LOAD rule is complex. For convenience, I listed the origins of some subexpressions: * rL323449: `sec->memRegion == load->firstSec->memRegion`; linkerscript/at3.test * D43284: `load->lastSec == Out::programHeaders` (don't start a new PT_LOAD after program headers); linkerscript/at4.test * D58892: `sec != relroEnd` (start a new PT_LOAD after PT_GNU_RELRO) Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D74297	2020-02-12 08:20:14 -08:00
Russell Gallop	e7cb374433	[LLD][ELF] Add time-trace to ELF LLD This adds some of LLD specific scopes and picks up optimisation scopes via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in `77e6bb3c`. Differential Revision: https://reviews.llvm.org/D71060	2020-02-06 12:14:13 +00:00
Fangrui Song	2d7a8cf904	[ELF] -r: don't create .interp `{clang,gcc} -nostdlib -r a.c` passes --dynamic-linker to the linker, and the expected behavior is to ignore it. If .interp is kept in the relocatable object file, a final link will get PT_INTERP even if --dynamic-linker is not specified. glibc ld.so expects to see PT_DYNAMIC and the executable will likely fail to run. Ignore --dynamic-linker in -r mode as well as -shared.	2020-01-16 12:14:32 -08:00
Fangrui Song	7cd429f27d	[ELF] Add -z force-ibt and -z shstk for Intel Control-flow Enforcement Technology This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang. It adds Intel CET (Control-flow Enforcement Technology) support to lld. The implementation follows the draft version of psABI which you can download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI. CET introduces a new restriction on indirect jump instructions so that you can limit the places to which you can jump to using indirect jumps. In order to use the feature, you need to compile source files with -fcf-protection=full. * IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt. * SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified. IBT-enabled executables/shared objects have two PLT sections, ".plt" and ".plt.sec". For the details as to why we have two sections, please read the comments. Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D59780	2020-01-13 23:39:28 -08:00
Fangrui Song	dce7a362be	[ELF] Improve the condition to create .interp This restores commit `1417558e4a` and its follow-up, reverted by commit `c3dbd782f1`. After this commit: clang -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created clang -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created clang -fuse-ld=gold -no-pie -nostdlib a.c => .interp not created clang -fuse-ld=gold -pie -fPIE -nostdlib a.c => .interp created clang -fuse-ld=lld -no-pie -nostdlib a.c => .interp created clang -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp created	2019-12-27 15:34:25 -08:00
Reid Kleckner	c3dbd782f1	Revert "[ELF] Improve the condition to create .interp" This reverts commit `1417558e4a`. Also reverts commit `019a92bb28`. This causes check-sanitizer to fail. The "-Nolib" variant of the test crashes on startup in the loader.	2019-12-27 13:05:41 -08:00
Fangrui Song	1417558e4a	[ELF] Improve the condition to create .interp Similar to rL362355, but with the `!config->shared` guard. (1) {gcc,clang} -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created (2) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp not created (3) {gcc,clang} -fuse-ld=lld -pie -fPIE -nostdlib a.c a.so => .interp created The inconsistency of (2) is due to the condition `!Config->SharedFiles.empty()`. To make lld behave more like ld.bfd, we could change the condition to: config->hasDynSymTab && !config->dynamicLinker.empty() && script->needsInterpSection(); However, that would bring another inconsistency as can be observed with: (4) {gcc,clang} -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created	2019-12-26 13:26:43 -08:00
Fangrui Song	891a8655ab	[ELF] Add IpltSection PltSection is used by both PLT and IPLT. The PLT section may have a header while the IPLT section does not. Split off IpltSection from PltSection to be clearer. Unlike other targets, PPC64 cannot use the same code sequence for PLT and IPLT. This helps make a future PPC64 patch (D71509) more isolated. On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently we are inconsistent whether the PLT header is conceptually attached to in.plt or in.iplt . Consistently attach the header to in.plt can make the -z retpolineplt logic simpler. It also makes `jmp` point to an aesthetically better place for non-retpolineplt cases. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D71519	2019-12-17 00:06:04 -08:00
Rui Ueyama	69da7e29de	Revert an accidental commit `af5ca40b47`	2019-12-13 15:17:40 +09:00
Rui Ueyama	af5ca40b47	temporary	2019-12-13 14:35:03 +09:00
Fangrui Song	cd0ab2428f	[ELF] --icf: do not fold preemptible symbols Fixes PR44124. A preemptible symbol may refer to a different definition at runtime. When comparing a pair of relocations, if they refer to different symbols, and either symbol is preemptible, the two containing sections should be considered different. gold has a similar rule https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=ce97fa81e0c46d216b80b143ad8c02fff6906fef Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D71163	2019-12-10 09:06:08 -08:00
Peter Smith	4d6c4cb426	[LLD][ELF] Add support for PT_GNU_PROPERTY The PT_GNU_PROPERTY program header describes the location of the .note.gnu.property SHT_NOTES section. The linux kernel uses this program header to find the .note.gnu.property section rather than parsing. Executables that have properties that the kernel needs to act on that don't have the PT_GNU_PROPERTY program header will not boot. Differential Revision: https://reviews.llvm.org/D70961	2019-12-05 09:54:58 +00:00
Fangrui Song	a2fc964417	[ELF] Replace SymbolTable::forEachSymbol with iterator_range symbols() D62381 introduced forEachSymbol(). It seems that many call sites cannot be parallelized because the body shared some states. Replace forEachSymbol with iterator_range<filter_iterator<...>> symbols() to simplify code and improve debuggability (std::function calls take some frames). It also allows us to use early return to simplify code added in D69650. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D70505	2019-11-26 09:09:32 -08:00
Nick Terrell	6814232429	[LLD][ELF] Support --[no-]mmap-output-file with F_no_mmap Summary: Add a flag `F_no_mmap` to `FileOutputBuffer` to support `--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores this flag for compatibility with GNU ld and gold. We need this flag to speed up link time for large binaries in certain scenarios. When we link some of our larger binaries we find that LLD takes 50+ GB of memory, which causes memory pressure. The memory pressure causes the VM to flush dirty pages of the output file to disk. This is normally okay, since we should be flushing cold pages. However, when using BtrFS with compression we need to write 128KB at a time when we flush a page. If any page in that 128KB block is written again, then it must be flushed a second time, and so on. Since LLD doesn't write sequentially this causes write amplification. The same 128KB block will end up being flushed multiple times, causing the linker to many times more IO than necessary. We've observed 3-5x faster builds with -no-mmap-output-file when we hit this scenario. The bad scenario only applies to compressed filesystems, which group together multiple pages into a single compressed block. I've tested BtrFS, but the problem will be present for any compressed filesystem on Linux, since it is caused by the VM. Silently ignoring --no-mmap-output-file caused a silent regression when we switched from gold to lld. We pass --no-mmap-output-file to fix this edge case, but since lld silently ignored the flag we didn't realize it wasn't being respected. Benchmark building a 9 GB binary that exposes this edge case. I linked 3 times with --mmap-output-file and 3 times with --no-mmap-output-file and took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM, BtrFS mounted with -compress-force=zstd, and an 80% full disk. \| Mode \| Time \| \|---------\|-------\| \| mmap \| 894 s \| \| no mmap \| 126 s \| When compression is disabled, BtrFS performs just as well with and without mmap on this benchmark. I was unable to reproduce the regression with any binaries in lld-speed-test. Reviewed By: ruiu, MaskRay Differential Revision: https://reviews.llvm.org/D69294	2019-10-29 15:49:08 -07:00
Michał Górny	2a0fcae3d4	[lld] [ELF] Add '-z nognustack' opt to suppress emitting PT_GNU_STACK Add a new '-z nognustack' option that suppresses emitting PT_GNU_STACK segment. This segment is not supported at all on NetBSD (stack is always non-executable), and the option is meant to be used to disable emitting it. Differential Revision: https://reviews.llvm.org/D56554	2019-10-29 17:54:23 +01:00
Nico Weber	5976a3f5aa	Fix a few typos in lld/ELF to cycle bots	2019-10-28 21:41:47 -04:00
Fangrui Song	bd8cfe65f5	[ELF] Wrap things in `namespace lld { namespace elf {`, NFC This makes it clear `ELF/*/.cpp` files define things in the `lld::elf` namespace and simplifies `elf::foo` to `foo`. Reviewed By: atanasyan, grimar, ruiu Differential Revision: https://reviews.llvm.org/D68323 llvm-svn: 373885	2019-10-07 08:31:18 +00:00
Peter Collingbourne	0bb825d208	ELF: Add .interp synthetic sections first in createSyntheticSections(). Our .interp section is not a SyntheticSection. As a result, it terminates the loop in removeUnusedSyntheticSections(). This has at least two consequences: - The synthetic .bss and .bss.rel.ro sections are always present in dynamically linked executables, even when they are not needed. - The synthetic .ARM.exidx (and possibly other) sections are always present in partitions other than the last one, even when not needed. .ARM.exidx in particular is problematic because it assumes that its list of code sections is non-empty in getLinkOrderDep(), which can lead to a crash if the partition does not have any code sections. Fix these problems by moving the creation of the .interp sections to the top of createSyntheticSections(). While here, make the code a little less error-prone by changing the add() lambdas to take a SyntheticSection instead of an InputSectionBase. Differential Revision: https://reviews.llvm.org/D68256 llvm-svn: 373347	2019-10-01 16:10:13 +00:00
Fangrui Song	0264950697	[ELF] Add -z separate-loadable-segments to complement separate-code and noseparate-code D64906 allows PT_LOAD to have overlapping p_offset ranges. In the default R RX RW RW layout + -z noseparate-code case, we do not tail pad segments when transiting to another segment. This can save at most 3*maxPageSize bytes. a) Before D64906, we tail pad R, RX and the first RW. b) With -z separate-code, we tail pad R and RX, but not the first RW (RELRO). In some cases, b) saves one file page. In some cases, b) wastes one virtual memory page. The waste is a concern on Fuchsia. Because it uses compressed binaries, it doesn't benefit from the saved file page. This patch adds -z separate-loadable-segments to restore the behavior before D64906. It can affect section addresses and can thus be used as a debugging mechanism (see PR43214 and ld.so partition bug in crbug.com/998712). Reviewed By: jakehehrlich, ruiu Differential Revision: https://reviews.llvm.org/D67481 llvm-svn: 372807	2019-09-25 03:39:31 +00:00
Fangrui Song	e47bbd28f8	[ELF] Make MergeInputSection merging aware of output sections Fixes PR38748 mergeSections() calls getOutputSectionName() to get output section names. Two MergeInputSections may be merged even if they are made different by SECTIONS commands. This patch moves mergeSections() after processSectionCommands() and addOrphanSections() to fix the issue. The new pass is renamed to OutputSection::finalizeInputSections(). processSectionCommands() and addorphanSections() are changed to add sections to InputSectionDescription::sectionBases. finalizeInputSections() merges MergeInputSections and migrates `sectionBases` to `sections`. For the -r case, we drop an optimization that tries keeping sh_entsize non-zero. This is for the simplicity of addOrphanSections(). The updated merge-entsize2.s reflects the change. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D67504 llvm-svn: 372734	2019-09-24 11:48:31 +00:00
Fangrui Song	2672051495	[ELF] Error if the linked-to section of a SHF_LINK_ORDER section is discarded Summary: If st_link(A)=B, and A has the SHF_LINK_ORDER flag, we may dereference a null pointer if B is garbage collected (PR43147): 1. In Wrter.cpp:compareByFilePosition, `aOut->sectionIndex` or `bOut->sectionIndex` 2. In OutputSections::finalize, `d->getParent()->sectionIndex` Simply error and bail out to avoid null pointer dereferences. ld.bfd has a similar error: sh_link of section `.bar' points to discarded section `.foo0' of `a.o' ld.bfd is more permissive in that it just checks whether the linked-to section of the first input section is discarded. This is likely because it sets sh_link of the output section according to the first input section. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D67761 llvm-svn: 372400	2019-09-20 15:03:21 +00:00
Fangrui Song	4816e516e5	[ELF][Hexagon] Allow PT_LOAD to have overlapping p_offset ranges on EM_HEXAGON Port the D64906 technique to EM_HEXAGON. This concludes the patch series. Differential Revision: https://reviews.llvm.org/D67605 llvm-svn: 372059	2019-09-17 02:45:38 +00:00
Peter Smith	ea99ce5e9b	[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965	2019-09-16 09:38:38 +00:00
Fangrui Song	d4306e90cb	[ELF][X86] Allow PT_LOAD to have overlapping p_offset ranges on EM_X86_64 Port the D64906 technique to EM_X86_64. Differential Revision: https://reviews.llvm.org/D67482 llvm-svn: 371958	2019-09-16 07:05:34 +00:00
Fangrui Song	06bb7dfbd4	[ELF] Map the ELF header at imageBase If there is no readonly section, we map: * The ELF header at imageBase+maxPageSize * Program headers at imageBase+maxPageSize+sizeof(Ehdr) * The first section .text at imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers) Due to the interaction between Writer<ELFT>::fixSectionAlignments and LinkerScript::allocateHeaders, `alignDown(p_vaddr(R PT_LOAD)) = alignDown(p_vaddr(RX PT_LOAD))`. The RX PT_LOAD will override the R PT_LOAD at runtime, which is not ideal: ``` // PHDR at 0x401034, should be 0x400034 PHDR 0x000034 0x00401034 0x00401034 0x000a0 0x000a0 R 0x4 // R PT_LOAD contains just Ehdr and program headers. // At 0x401000, should be 0x400000 LOAD 0x000000 0x00401000 0x00401000 0x000d4 0x000d4 R 0x1000 LOAD 0x0000d4 0x004010d4 0x004010d4 0x00001 0x00001 R E 0x1000 ``` * createPhdrs allocates the headers to the R PT_LOAD. * fixSectionAlignments assigns `imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)` (formula: `alignTo(dot, maxPageSize) + dot % config->maxPageSize`) to addrExpr of .text * allocateHeaders computes the minimum address among SHF_ALLOC sections, i.e. addr(.text) * allocateHeaders sets address of ELF header to `addr(.text)-sizeof(Ehdr)-sizeof(program headers) = imageBase+maxPageSize` The main observation is that when the SECTIONS command is not used, we don't have to call allocateHeaders. This requires an assumption that the presence of PT_PHDR and addresses of headers can be decided regardless of address information. This may seem natural because dot is not manipulated by a linker script. The other thing is that we have to drop the special rule for -T<section> in `getInitialDot`. If -Ttext is smaller than the image base, the headers will not be allocated with the old behavior (allocateHeaders is called) but always allocated with the new behavior. The behavior change is not a problem. Whether and where headers are allocated can vary among linkers, or ld.bfd across different versions (--enable-separate-code or not). It is thus advised to use a linker script with the PHDRS command to have a consistent behavior across linkers. If PT_PHDR is needed, an explicit --image-base can be a simpler alternative. Differential Revision: https://reviews.llvm.org/D67325 llvm-svn: 371957	2019-09-16 07:04:16 +00:00
Simon Atanasyan	6c6f5a9984	[mips] Allow PT_LOAD to have overlapping p_offset ranges on EM_MIPS Port the D64906 <https://reviews.llvm.org/D64906> technique to MIPS. Fix PR33131 llvm-svn: 371554	2019-09-10 20:19:59 +00:00
Fangrui Song	e8c0d93360	[ELF] nmagic or omagic: don't allocate PT_PHDR or PF_R PT_LOAD for the !hasPhdrsCommands case ``` part.phdrs = script->hasPhdrsCommands() ? script->createPhdrs() : createPhdrs(part); ``` createPhdrs() allocates a PT_PHDR and a PF_R PT_LOAD, which will be deleted later in LinkerScript::allocateHeaders, but leave a gap between the program headers and the first section. Don't allocate the segments to avoid the gap. PT_INTERP is likely not needed as well. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D67324 llvm-svn: 371398	2019-09-09 13:08:51 +00:00
Fangrui Song	8d30c1dcec	Reland D66717 [ELF] Do not ICF two sections with different output sections (by SECTIONS commands) Recommit r370635 (reverted by r371202), with one change: move addOrphanSections() before ICF. Before, orphan sections in two different partitions may be folded and moved to the main partition. Now, InputSection->OutputSection assignment for orphans happens before ICF. ICF does not fold input sections with different output sections. With the PR43241 reproduce, `llvm-objcopy --extract-partition libvr.so libchrome__combined.so libvr.so` => no error Updated description: Fixes PR39418. Complements D47241 (the non-linker-script case). processSectionCommands() assigns input sections to output sections. ICF is called before it, so .text.foo and .text.bar may be folded even if their output sections are made different by SECTIONS commands. ``` markLive<ELFT>() doIcf<ELFT>() // During ICF, we don't know the output sections writeResult() combineEhSections<ELFT>() script->processSectionCommands() // InputSection -> OutputSection assignment ``` This patch splits processSectionCommands() into processSectionCommands() and processSymbolAssignments(), and moves processSectionCommands()/addOrphanSections() before ICF: ``` markLive<ELFT>() combineEhSections<ELFT>() script->processSectionCommands() script->addOrphanSections(); doIcf<ELFT>() // should remove folded input sections writeResult() script->processSymbolAssignments() ``` An alternative approach is to unfold a section `sec` in processSectionCommands() when we find `sec` and `sec->repl` belong to different output sections. I feel this patch is superior because this can fold more sections and the decouple of SectionCommand/SymbolAssignment gives flexibility: * An ExprValue can't be evaluated before its section is assigned to an output section -> we can delete getOutputSectionVA and simplify another place where we had to check if the output section is null. Moreover, a case in linkerscript/early-assign-symbol.s can be handled now. * processSectionCommands/processSymbolAssignments can be freely moved around. llvm-svn: 371216	2019-09-06 15:57:44 +00:00
Fangrui Song	5d9f419a2e	Revert "Revert r370635, it caused PR43241." This reverts commit 50d2dca22b3b05d0ee4883b0cbf93d7d15f241fc. llvm-svn: 371215	2019-09-06 15:57:24 +00:00
Nico Weber	8455294f2a	Revert r370635, it caused PR43241. llvm-svn: 371202	2019-09-06 13:23:42 +00:00
Fangrui Song	6dc2bd70bb	[ELF] Initialize PhdrEntry::p_align to maxPageSize for PT_LOAD ``` Writer<ELFT>::run assignFileOffsets setFileOffset computeFileOffset os->ptLoad->p_align may be smaller than config->maxPageSize setPhdrs p_align = max(p_align, config->maxPageSize) ``` If we move the config->maxPageSize logic to the constructor of PhdrEntry, computeFileOffset can be simplified. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D67211 llvm-svn: 371085	2019-09-05 16:32:31 +00:00
Rui Ueyama	e99dc4ba57	Align output segments correctly Previously, segments were aligned according to their first section's alignment requirements. That was not correct, but segments are also aligned to a page boundary, and a page boundary is usually much larger than a section alignment requirement, so no one noticed this bug before. Now, lld has --nmagic option which sets maxPageSize to 1 to effectively disable page alignment, which reveals the issue. Fixes https://bugs.llvm.org/show_bug.cgi?id=43212 Differential Revision: https://reviews.llvm.org/D67152 llvm-svn: 371013	2019-09-05 05:30:24 +00:00
Fangrui Song	d8bc6a48ea	[ELF] Do not ICF two sections with different output sections (by SECTIONS commands) Fixes PR39418. Complements D47241 (the non-linker-script case). processSectionCommands() assigns input sections to output sections. ICF is called before it, so .text.foo and .text.bar may be folded even if their output sections are made different by SECTIONS commands. ``` markLive<ELFT>() doIcf<ELFT>() // During ICF, we don't know the output sections writeResult() combineEhSections<ELFT>() script->processSectionCommands() // InputSection -> OutputSection assignment ``` This patch splits processSectionCommands() into processSectionCommands() and processSymbolAssignments(), and moves processSectionCommands() before ICF: ``` markLive<ELFT>() combineEhSections<ELFT>() script->processSectionCommands() doIcf<ELFT>() // should remove folded input sections writeResult() script->processSymbolAssignments() ``` An alternative approach is to unfold a section `sec` in processSectionCommands() when we find `sec` and `sec->repl` belong to different output sections. I feel this patch is superior because this can fold more sections and the decouple of SectionCommand/SymbolAssignment gives flexibility: * An ExprValue can't be evaluated before its section is assigned to an output section -> we can delete getOutputSectionVA and simplify another place where we had to check if the output section is null. Moreover, a case in linkerscript/early-assign-symbol.s can be handled now. * processSectionCommands/processSymbolAssignments can be freely moved around. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66717 llvm-svn: 370635	2019-09-02 10:33:58 +00:00
Fangrui Song	4514ac7cfb	[ELF] Align SHT_LLVM_PART_EHDR to a maximum page size boundary Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=998712 SHT_LLVM_PART_EHDR marks the start of a partition. The partition sections will be extracted to a separate file. Align to the next maximum page size boundary so that we can find the ELF header at the start. We cannot benefit from overlapping p_offset ranges with the previous segment anyway. It seems we lack some llvm-objcopy --extract-main-partition and --extract-partition sanity checks. It may place EHDR at the start even if p_offset if non zero. Anyway, the lld change is justified for the reasons above. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D67032 llvm-svn: 370629	2019-09-02 08:49:50 +00:00
Fangrui Song	523f999acf	[ELF][RISCV] Allow PT_LOAD to have overlapping p_offset ranges on EM_RISCV Port the D64906 technique to RISC-V. It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of a RISC-V binary decreases by at most 12kb. llvm-svn: 370192	2019-08-28 12:06:06 +00:00
Fangrui Song	54a6f6839b	[ELF][AMDGPU][SPARC] Allow PT_LOAD to have overlapping p_offset ranges on EM_AMDGPU and EM_SPARCV9 llvm-svn: 370180	2019-08-28 09:45:06 +00:00
Fangrui Song	8fbe81fb29	[ELF][RISCV] Assign st_shndx of __global_pointer$ to 1 if .sdata does not exist This essentially reverts the code change of D63132 and switches to a simpler approach. In an executable/shared object, st_shndx of a symbol can be: 1) SHN_UNDEF: undefined symbol (or canonical PLT) 2) SHN_ABS: absolute symbol 3) any other value (usually a regular section index) represents a relative symbol. The actual value does not matter. Many ld.so (musl, all archs except MIPS of FreeBSD rtld-elf) even treat 2) and 3) the same. If .sdata does not exist, it does not matter what value/section __global_pointer$ has, as long as it is relative (otherwise there will be a pedantic lld error. See D63132). Just set the st_shndx arbitrarily to 1. Dummy st_shndx=1 may be used by __rela_iplt_start, linker-script-defined symbols outside a section, __dso_handle, etc. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66798 llvm-svn: 370172	2019-08-28 09:01:03 +00:00
Fangrui Song	024bf27ddf	[ELF][ARM] Allow PT_LOAD to have overlapping p_offset ranges on EM_ARM Port the D64906 technique to ARM. It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of an arm binary decreases by at most 12kb. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D66749 llvm-svn: 370049	2019-08-27 11:52:36 +00:00
Fangrui Song	1681ceb2c4	[ELF] EhFrameSection: postpone FDE liveness check to finalizeSections EhFrameSection::addSection checks liveness of FDE early. This makes it infeasible to move combineEhSections() before ICF. Postpone the check to EhFrameSection::finalizeContents(). This is what ARMExidxSyntheticSection does and it will make a subsequent patch D66717 simpler. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66727 llvm-svn: 369890	2019-08-26 10:32:12 +00:00
Fangrui Song	debcac9fef	[ELF] Make LinkerScript::assignAddresses iterative PR42990. For `SECTIONS { b = a; . = 0xff00 + (a >> 8); a = .; }`, we currently set st_value(a)=0xff00 while st_value(b)=0xffff. The following call tree demonstrates the problem: ``` link<ELF64LE>(Args); Script->declareSymbols(); // insert a and b as absolute Defined Writer<ELFT>().run(); Script->processSectionCommands(); addSymbol(cmd); // a and b are re-inserted. LinkerScript::getSymbolValue // is lazily called by subsequent evaluation finalizeSections(); forEachRelSec(scanRelocations<ELFT>); processRelocAux // another problem PR42506, not affected by this patch finalizeAddressDependentContent(); // loop executed once script->assignAddresses(); // a = 0, b = 0xff00 script->assignAddresses(); // a = 0xff00, _end = 0xffff ``` We need another assignAddresses() to finalize the value of `a`. This patch 1) modifies assignAddress() to track the original section/value of each symbol and return a symbol whose section/value has changed. 2) moves the post-finalizeSections assignAddress() inside the loop of finalizeAddressDependentContent() and makes it iterative. Symbol assignment may not converge so we make a few attempts before bailing out. Note, assignAddresses() must be called at least twice. The penultimate call finalized section addresses while the last finalized symbol values. It is somewhat obscure and there was no comment. linkerscript/addr-zero.test tests this. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D66279 llvm-svn: 369889	2019-08-26 10:23:31 +00:00
Fangrui Song	6d5a8c92bf	[ELF] Simplify with less_second. NFC llvm-svn: 369844	2019-08-24 08:40:20 +00:00

1 2 3 4 5 ...

1557 Commits