llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	06bb7dfbd4	[ELF] Map the ELF header at imageBase

If there is no readonly section, we map:

* The ELF header at imageBase+maxPageSize
* Program headers at imageBase+maxPageSize+sizeof(Ehdr)
* The first section .text at imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)

Due to the interaction between Writer<ELFT>::fixSectionAlignments and LinkerScript::allocateHeaders, `alignDown(p_vaddr(R PT_LOAD)) = alignDown(p_vaddr(RX PT_LOAD))`. The RX PT_LOAD will override the R PT_LOAD at runtime, which is not ideal:

```
// PHDR at 0x401034, should be 0x400034
PHDR 0x000034 0x00401034 0x00401034 0x000a0 0x000a0 R   0x4
// R PT_LOAD contains just Ehdr and program headers.
// At 0x401000, should be 0x400000
LOAD 0x000000 0x00401000 0x00401000 0x000d4 0x000d4 R   0x1000
LOAD 0x0000d4 0x004010d4 0x004010d4 0x00001 0x00001 R E 0x1000
```

* createPhdrs allocates the headers to the R PT_LOAD.
* fixSectionAlignments assigns `imageBase+maxPageSize+sizeof(Ehdr)+sizeof(program headers)` (formula: `alignTo(dot, maxPageSize) + dot % config->maxPageSize`) to addrExpr of .text
* allocateHeaders computes the minimum address among SHF_ALLOC sections, i.e. addr(.text)
* allocateHeaders sets address of ELF header to `addr(.text)-sizeof(Ehdr)-sizeof(program headers) = imageBase+maxPageSize`

The main observation is that when the SECTIONS command is not used, we don't have to call allocateHeaders. This requires an assumption that the presence of PT_PHDR and addresses of headers can be decided regardless of address information. This may seem natural because dot is not manipulated by a linker script.

The other thing is that we have to drop the special rule for -T<section> in `getInitialDot`. If -Ttext is smaller than the image base, the headers will not be allocated with the old behavior (allocateHeaders is called) but always allocated with the new behavior.

The behavior change is not a problem. Whether and where headers are allocated can vary among linkers, or ld.bfd across different versions (--enable-separate-code or not). It is thus advised to use a linker script with the PHDRS command to have a consistent behavior across linkers. If PT_PHDR is needed, an explicit --image-base can be a simpler alternative.

Differential Revision: https://reviews.llvm.org/D67325
llvm-svn: 371957	2019-09-16 07:04:16 +00:00
Fangrui Song	f66b767abe	[ELF][AArch64] Allow PT_LOAD to have overlapping p_offset ranges Ported the D64906 technique to AArch64. It deletes 3 alignments at PT_LOAD boundaries for the default case: the size of an aarch64 binary decreases by at most 192kb. If `sh_addralign(.tdata) < sh_addralign(.tbss)`, we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`. ld.so that are known to have problems if p_vaddr%p_align!=0: * musl<=1.1.22 * FreeBSD 13.0-CURRENT (and before) rtld-elf arm64 New test aarch64-tls-vaddr-align.s checks that our workaround makes p_vaddr%p_align = 0. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D64930 llvm-svn: 369344	2019-08-20 08:34:56 +00:00
Fangrui Song	5391f158c2	[ELF] Add -z separate-code and pad the last page of last PF_X PT_LOAD with traps only if -z separate-code is specified

This patch

1) adds -z separate-code and -z noseparate-code (default).
2) changes the condition that the last page of last PF_X PT_LOAD is padded with trap instructions. Current condition (after D33630): if there are no `SECTIONS` commands. After this change: if -z separate-code is specified.

-z separate-code was introduced to ld.bfd in 2018, to place the text segment in its own pages. There is no overlap in pages between an executable segment and a non-executable segment:

1) RX cannot load initial contents from R or RW(or non-SHF_ALLOC).
2) R and RW(or non-SHF_ALLOC) cannot load initial contents from RX.

lld's current status:

- Between R and RX: in `Writer<ELFT>::fixSectionAlignments()`, the start of a segment is always aligned to maxPageSize, so the initial contents loaded by R and RX do not overlap. I plan to allow overlaps in D64906 if -z noseparate-code is in effect.
- Between RX and RW(or non-SHF_ALLOC if RW doesn't exist): we currently unconditionally pad the last page to commonPageSize (defaults to 4096 on all targets we support). This patch will make it effective only if -z separate-code is specified.

-z separate-code is a dubious feature that intends to reduce the number of ROP gadgets (which is actually ineffective because attackers can find plenty of gadgets in the text segment, no need to find gadgets in non-code regions).

With the overlapping PT_LOAD technique D64906, -z noseparate-code removes two more alignments at segment boundaries than -z separate-code. This saves at most defaultCommonPageSize*2 bytes, which are significant on targets with large defaultCommonPageSize (AArch64/MIPS/PPC: 65536).

Issues/feedback on alignment at segment boundaries to help understand the implication:

* binutils PR24490 (the situation on ld.bfd is worse because they have two R-- on both sides of R-E so more alignments.)
* In binutils, the 2018-02-27 commit "ld: Add --enable-separate-code" made -z separate-code the default on Linux. `d969dea983`
* In musl-cross-make, binutils is configured with --disable-separate-code to address size regressions caused by -z separate-code. (lld actually has the same issue, which I plan to fix in a future patch. The ld.bfd x86 status is worse because they default to max-page-size=0x200000).
* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237676 people want smaller code size. This patch will remove one alignment boundary.
* Stef O'Rear: I'm opposed to any kind of page alignment at the text/rodata line (having a partial page of text aliased as rodata and vice versa has no demonstrable harm, and I actually care about small systems).

So, make -z noseparate-code the default.

Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64903
llvm-svn: 367537	2019-08-01 09:58:25 +00:00
Fangrui Song	b159906a9a	[test] Change llvm-readobj -long-option to --long-option or well-known short options. NFC Also change some options that have different semantics (cause confusion) in llvm-readelf mode: -s => -S -t => --symbols -sd => --section-data llvm-svn: 359651	2019-05-01 05:49:01 +00:00
Dimitry Andric	c9de3b4d26	Align AArch64 and i386 image base to superpage Summary: As for x86_64, the default image base for AArch64 and i386 should be aligned to a superpage appropriate for the architecture. On AArch64, this is 2 MiB, on i386 it is 4 MiB. Reviewers: emaste, grimar, javed.absar, espindola, ruiu, peter.smith, srhines, rprichard Reviewed By: ruiu, peter.smith Subscribers: jfb, markj, arichardson, krytarowski, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D50297 llvm-svn: 342746	2018-09-21 16:58:13 +00:00
Fangrui Song	eb75b8f8f7	[ELF] Move `# REQUIRES:` line to the top llvm-svn: 335625	2018-06-26 16:58:19 +00:00
Rafael Espindola	a5d43d004a	Propagate sh_entsize out. No difference in practice other than having sh_entsize in the output. This should simplify the patch for handling SHF_MERGE in -r. Based on a patch by George Rimar. llvm-svn: 318306	2017-11-15 16:56:20 +00:00
Petr Hosek	7ab9f7be0c	[ELF] Set p_memsz to p_filesz when aligning the last segment to page boundary Having p_filesz different from p_memsz is confusing some tools. Differential Revision: https://reviews.llvm.org/D37369 llvm-svn: 312384	2017-09-01 21:48:20 +00:00
Petr Hosek	edd6c3587c	[ELF] When the code segment is the last, align it to the page boundary When the data segment is the last segment, it is correct to leave it unaligned. However, when the code segment is the last segment, it should be aligned to the page boundary to avoid loading the non-segment parts of the ELF file at the end of the file. Differential Revision: https://reviews.llvm.org/D33630 llvm-svn: 309829	2017-08-02 16:35:00 +00:00
Rafael Espindola	2532431332	Stop propagating Entsize. Now that we combine multiple synthetic merge section into one output section there is no point in trying to propagate a value. llvm-svn: 294048	2017-02-03 21:29:51 +00:00
Rui Ueyama	3da3f06dd3	Include version string into ".comment" section.

Summary: This patch adds a ".comment" section to an output. The comment section contains the linker's version string. You can now find out whether a binary is created by LLD or not using objdump command like this.

  $ objdump -s -j .comment foo

  foo: file format elf64-x86-64

  Contents of section .comment:
   0000 00474343 3a202855 62756e74 7520342e  .GCC: (Ubuntu 4.
   0010 382e342d 32756275 6e747531 7e31342e  8.4-2ubuntu1~14.
   ...
   00c0 766d2f74 72756e6b 20323835 38343629  vm/trunk 285846)
   00d0 004c696e 6b65723a 204c4c44 20342e30  .Linker: LLD 4.0
   00e0 2e302028 7472756e 6b203238 36343036  .0 (trunk 286406
   00f0 2900                                 ).

Compilers emit a .comment section as well, so the output contains both compiler and linker information.

Alternative considered: I first tried to add a SHT_NOTE section because GNU gold does that. A NOTE section starts with a header which contains content type. It turned out that ld.gold sets type NT_GNU_GOLD_VERSION to their NOTE section. So the NOTE type is only for GNU gold (surprise!)

Next, I tried to create ".linker-version" section. However, it seems that reusing the existing ".comment" section is better because 1) other tools already know about the .comment section and are able to strip it and 2) the result contains not only linker info but also compiler info.

Differential Revision: https://reviews.llvm.org/D26487
llvm-svn: 286496	2016-11-10 20:20:37 +00:00
Rui Ueyama	5fc84a1828	Remove string table offsets from tests. <N> where "foo (<N>)" is the offset of string "foo" in the string table. llvm-svn: 285751	2016-11-01 21:26:28 +00:00
Eugene Leviant	ee8dcfbdf7	[ELF] Set max page size to 64K for AArch64 Differential revision: https://reviews.llvm.org/D25079 llvm-svn: 283200	2016-10-04 08:58:55 +00:00
Rui Ueyama	144debcc0f	Remove unnecessary trailing semicolons. Since this semicolon existed in an early test file, it has spread to many files. llvm-svn: 267659	2016-04-27 02:58:27 +00:00
Rui Ueyama	76c0063eeb	ELF: Improve performance of string table construction. String tables in unstripped executable files are fairly large in size. For example, lld's executable file is about 34.4 MB in my environment, and of which 3.5 MB is the string table. Efficiency of string table construction matters. Previously, the string table was built in an inefficient way. We used StringTableBuilder to build that and enabled string tail merging, although tail merging is not effective for the symbol table (you can only make the string table 0.3% smaller for lld.) Tail merging is computation intensive task and slow. This patch eliminates string tail merging. I changed the way of adding strings to the string table in this patch too. Previously, strings were added using add() and the same strings were then passed to getOffset() to get their offsets in the string table. In this way, getOffset() needs to look up a hash table to get offsets for given strings. This is a violation of "we look up the symbol table (or a hash table) only once for each symbol" dogma of the new LLD's design. Hash table lookup for long C++ mangled names is slow. I eliminated that lookup in this patch. In total, this patch improves link time of lld itself about 12% (3.50 seconds -> 3.08 seconds.) llvm-svn: 257017	2016-01-07 02:35:32 +00:00
Rui Ueyama	7b19c34550	Revert "ELF: Make .note.GNU-stack more compatible with traditional linkers." This reverts commit r253797 because it was based on a misunderstanding that lld wouldn't work on NetBSD without this change. llvm-svn: 254003	2015-11-24 18:48:16 +00:00
Rui Ueyama	e79b09a616	ELF: Make .note.GNU-stack more compatible with traditional linkers. With this patch, lld creates PT_GNU_STACK segments only when all input files have .note.GNU-stack sections. This is in line with other linkers with a minor difference (we don't care about .note.GNU-stack rwx bits as you can always remove .note.GNU-stack sections instead of setting x bit.) At least the NetBSD loader does not understand PT_GNU_STACK segments and rejects any executables that have the section. This patch makes lld compatible with such operating systems. llvm-svn: 253797	2015-11-21 22:19:32 +00:00
Rafael Espindola	9c8904fb38	Rename ld.lld2 to ld.lld since it is the default. llvm-svn: 253437	2015-11-18 06:11:01 +00:00
Rafael Espindola	4b1285c55a	Rename test/elf2 to test/ELF. llvm-svn: 253313	2015-11-17 05:36:42 +00:00

19 Commits
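
The address arithmetic described in commit 06bb7dfbd4 can be traced with a short sketch. It is not lld code: the helper names are hypothetical, the constants are read off the readelf excerpt in that commit message (headers of 0x34 + 0xa0 bytes, 0x1000 alignment, intended base 0x400000), and the starting value of dot is an assumption chosen to be consistent with the addresses shown there.

```python
# Illustrative sketch only: helper names are hypothetical, not lld's API.
# Constants are read off the readelf excerpt in commit 06bb7dfbd4.
IMAGE_BASE = 0x400000      # "should be 0x400000" in the excerpt
MAX_PAGE_SIZE = 0x1000     # p_align of the PT_LOAD segments
SIZEOF_EHDR = 0x34         # PT_PHDR starts at file offset 0x34
SIZEOF_PHDRS = 0xA0        # p_filesz of PT_PHDR


def align_to(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def align_down(value, align):
    """Round value down to a multiple of align."""
    return value - value % align


def old_layout():
    """Return (ehdr, phdr, text) addresses under the pre-patch behavior."""
    header_size = SIZEOF_EHDR + SIZEOF_PHDRS
    # Assumption: dot starts just past the headers at the image base.
    dot = IMAGE_BASE + header_size
    # fixSectionAlignments formula quoted in the commit message.
    text = align_to(dot, MAX_PAGE_SIZE) + dot % MAX_PAGE_SIZE
    # allocateHeaders places the headers immediately before the lowest section.
    ehdr = text - header_size
    return ehdr, ehdr + SIZEOF_EHDR, text


def intended_header_addresses():
    """Return (ehdr, phdr) addresses when the ELF header is mapped at imageBase."""
    return IMAGE_BASE, IMAGE_BASE + SIZEOF_EHDR


if __name__ == "__main__":
    ehdr, phdr, text = old_layout()
    print(hex(ehdr), hex(phdr), hex(text))  # 0x401000 0x401034 0x4010d4
    # The R and RX PT_LOADs start on the same page, so they overlap at runtime.
    print(align_down(ehdr, MAX_PAGE_SIZE) == align_down(text, MAX_PAGE_SIZE))  # True
    print([hex(a) for a in intended_header_addresses()])  # ['0x400000', '0x400034']
```

The old layout reproduces the 0x401000, 0x401034 and 0x4010d4 addresses from the excerpt, while the intended header addresses match its "should be" comments.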