llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	891a8655ab	[ELF] Add IpltSection PltSection is used by both PLT and IPLT. The PLT section may have a header while the IPLT section does not. Split off IpltSection from PltSection to be clearer. Unlike other targets, PPC64 cannot use the same code sequence for PLT and IPLT. This helps make a future PPC64 patch (D71509) more isolated. On EM_386 and EM_X86_64, when PLT is empty while IPLT is not, currently we are inconsistent whether the PLT header is conceptually attached to in.plt or in.iplt . Consistently attach the header to in.plt can make the -z retpolineplt logic simpler. It also makes `jmp` point to an aesthetically better place for non-retpolineplt cases. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D71519	2019-12-17 00:06:04 -08:00
Fangrui Song	90d195d026	[ELF] Delete relOff from TargetInfo::writePLT This change only affects EM_386. relOff can be computed from `index` easily, so it is unnecessarily passed as a parameter. Both in.plt and in.iplt entries are written by writePLT. For in.iplt, the instruction `push reloc_offset` will change because `index` is now different. Fortunately, this does not matter because `push; jmp` is only used by PLT. IPLT does not need the code sequence. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D71518	2019-12-16 11:10:02 -08:00
Fangrui Song	bd8cfe65f5	[ELF] Wrap things in `namespace lld { namespace elf {`, NFC This makes it clear `ELF/*/.cpp` files define things in the `lld::elf` namespace and simplifies `elf::foo` to `foo`. Reviewed By: atanasyan, grimar, ruiu Differential Revision: https://reviews.llvm.org/D68323 llvm-svn: 373885	2019-10-07 08:31:18 +00:00
Rui Ueyama	3837f4273f	[Coding style change] Rename variables so that they start with a lowercase letter This patch is mechanically generated by clang-llvm-rename tool that I wrote using Clang Refactoring Engine just for creating this patch. You can see the source code of the tool at https://reviews.llvm.org/D64123. There's no manual post-processing; you can generate the same patch by re-running the tool against lld's code base. Here is the main discussion thread to change the LLVM coding style: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html In the discussion thread, I proposed we use lld as a testbed for variable naming scheme change, and this patch does that. I chose to rename variables so that they are in camelCase, just because that is a minimal change to make variables to start with a lowercase letter. Note to downstream patch maintainers: if you are maintaining a downstream lld repo, just rebasing ahead of this commit would cause massive merge conflicts because this patch essentially changes every line in the lld subdirectory. But there's a remedy. clang-llvm-rename tool is a batch tool, so you can rename variables in your downstream repo with the tool. Given that, here is how to rebase your repo to a commit after the mass renaming: 1. rebase to the commit just before the mass variable renaming, 2. apply the tool to your downstream repo to mass-rename variables locally, and 3. rebase again to the head. Most changes made by the tool should be identical for a downstream repo and for the head, so at the step 3, almost all changes should be merged and disappear. I'd expect that there would be some lines that you need to merge by hand, but that shouldn't be too many. Differential Revision: https://reviews.llvm.org/D64121 llvm-svn: 365595	2019-07-10 05:00:37 +00:00
Fangrui Song	2fb6b0f2ba	[ELF][PPC][X86] Use [-2(n-1), 2n) to check overflows for R_PPC_ADDR16, R_PPC64_ADDR{16,32}, R_X86_64_{8,16} Similar to R_AARCH64_ABS32, R_PPC64_ADDR32 can represent either a signed value or unsigned value, thus we should use `[-2(n-1), 2n)` instead of `[-2(n-1), 2(n-1))` to check overflows. The issue manifests as a bogus linker error when linking the powerpc64le Linux kernel. The new behavior is compatible with ld.bfd's complain_overflow_bitfield. The upper bound of the error message is not correct. Fix it as well. The changes to R_PPC_ADDR16, R_PPC64_ADDR16, R_X86_64_8 and R_X86_64_16 are similar. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D63690 llvm-svn: 364164	2019-06-24 05:37:20 +00:00
Fangrui Song	025a815d75	[ELF] Make the rule to create relative relocations in a writable section stricter The current rule is loose: `!Sym.IsPreemptible \|\| Expr == R_GOT`. When the symbol is non-preemptable, this allows absolute relocation types with smaller numbers of bits, e.g. R_X86_64_{8,16,32}. They are disallowed by ld.bfd and gold, e.g. ld.bfd: a.o: relocation R_X86_64_8 against `.text' can not be used when making a shared object; recompile with -fPIC This patch: a) Add TargetInfo::SymbolicRel to represent relocation types that resolve to a symbol value (e.g. R_AARCH_ABS64, R_386_32, R_X86_64_64). As a side benefit, we currently (ab)use GotRel (R__GLOB_DAT) to resolve GOT slots that are link-time constants. Since we now use Target->SymbolRel to do the job, we can remove R__GLOB_DAT from relocateOne() for all targets. R_*_GLOB_DAT cannot be used as static relocation types. b) Change the condition to `!Sym.IsPreemptible && Type != Target->SymbolicRel \|\| Expr == R_GOT`. Some tests are caught by the improved error checking (ld.bfd/gold also issue errors on them). Many misuse .long where .quad should be used instead. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D63121 llvm-svn: 363059	2019-06-11 12:59:30 +00:00
Peter Collingbourne	0282898586	ELF: Create synthetic sections for loadable partitions. We create several types of synthetic sections for loadable partitions, including: - The dynamic symbol table. This allows code outside of the loadable partitions to find entry points with dlsym. - Creating a dynamic symbol table also requires the creation of several other synthetic sections for the partition, such as the dynamic table and hash table sections. - The partition's ELF header is represented as a synthetic section in the combined output file, and will be used by llvm-objcopy to extract partitions. Differential Revision: https://reviews.llvm.org/D62350 llvm-svn: 362819	2019-06-07 17:57:58 +00:00
Fangrui Song	e98baf8631	[ELF] Delete GotEntrySize and GotPltEntrySize GotEntrySize and GotPltEntrySize were added in D22288. Later, with the introduction of wordsize() (then Config->Wordsize), they become redundant, because there is no target that sets GotEntrySize or GotPltEntrySize to a number different from Config->Wordsize. Reviewed By: grimar, ruiu Differential Revision: https://reviews.llvm.org/D62727 llvm-svn: 362220	2019-05-31 10:35:45 +00:00
Fangrui Song	719322411c	[ELF] Implement General Dynamic style TLSDESC for x86-64 This handles two initial relocation types R_X86_64_GOTPC32_TLSDESC and R_X86_64_TLSDESC_CALL, as well as the GD->LE and GD->IE relaxations. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D62513 llvm-svn: 361911	2019-05-29 02:03:56 +00:00
Fangrui Song	912251e82f	[PPC64] toc-indirect to toc-relative relaxation This is based on D54720 by Sean Fertile. When accessing a global symbol which is not defined in the translation unit, compilers will generate instructions that load the address from the toc entry. If the symbol is defined, non-preemptable, and addressable with a 32-bit signed offset from the toc pointer, the address can be computed directly. e.g. addis 3, 2, .LC0@toc@ha # R_PPC64_TOC16_HA ld 3, .LC0@toc@l(3) # R_PPC64_TOC16_LO_DS, load the address from a .toc entry ld/lwa 3, 0(3) # load the value from the address .section .toc,"aw",@progbits .LC0: .tc var[TC],var can be relaxed to addis 3,2,var@toc@ha # this may be relaxed to a nop, addi 3,3,var@toc@l # then this becomes addi 3,2,var@toc ld/lwa 3, 0(3) # load the value from the address We can delete the test ppc64-got-indirect.s as its purpose is covered by newly added ppc64-toc-relax.s and ppc64-toc-relax-constants.s Reviewed By: ruiu, sfertile Differential Revision: https://reviews.llvm.org/D60958 llvm-svn: 360112	2019-05-07 04:26:05 +00:00
Fangrui Song	bc4b159bb1	[ELF][X86] Allow R_386_TLS_LDO_32 and R_X86_64_DTPOFF{32,64} to preemptable local-dynamic symbols Summary: Fixes PR35242. A simplified reproduce: thread_local int i; int f() { return i; } % {g++,clang++} -fPIC -shared -ftls-model=local-dynamic -fuse-ld=lld a.cc ld.lld: error: can't create dynamic relocation R_X86_64_DTPOFF32 against symbol: i in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output In isStaticLinkTimeConstant(), Syn.IsPreemptible is true, so it is not seen as a constant. The error is then issued in processRelocAux(). A symbol of the local-dynamic TLS model cannot be preempted but it can preempt symbols of the global-dynamic TLS model in other DSOs. So it makes some sense that the variable is not static. This patch fixes the linking error by changing getRelExpr() on R_386_TLS_LDO_32 and R_X86_64_DTPOFF{32,64} from R_ABS to R_DTPREL. R_PPC64_DTPREL_* and R_MIPS_TLS_DTPREL_* need similar fixes, but they are not handled in this patch. As a bonus, we use `if (Expr == R_ABS && !Config->Shared)` to find ld-to-le opportunities. R_ABS is overloaded here for such STT_TLS symbols. A dedicated R_DTPREL is clearer. Differential Revision: https://reviews.llvm.org/D60945 llvm-svn: 358870	2019-04-22 03:10:40 +00:00
Rui Ueyama	8521ba37d7	Make a member function a non-member function. Since this member function doesn't use anything in the class, it doesn't have to be a member of the class. llvm-svn: 357193	2019-03-28 17:35:00 +00:00
Rui Ueyama	676d25ab94	De-template X86_64TargetInfo. NFC. llvm-svn: 357191	2019-03-28 17:31:12 +00:00
Rui Ueyama	a0a50a7a5b	Inline a trivial function. NFC. I found that hiding this particular actual expression doesn't help readers understand the code. So I remove and inline that function. llvm-svn: 357140	2019-03-28 01:37:48 +00:00
Fangrui Song	210949a221	[ELF] Change GOT_FROM_END (relative to end(.got)) to GOTPLT (start(.got.plt)) Summary: This should address remaining issues discussed in PR36555. Currently R_GOT_FROM_END are exclusively used by x86 and x86_64 to express relocations types relative to the GOT base. We have _GLOBAL_OFFSET_TABLE_ (GOT base) = start(.got.plt) but end(.got) != start(.got.plt) This can have problems when _GLOBAL_OFFSET_TABLE_ is used as a symbol, e.g. glibc dl_machine_dynamic assumes _GLOBAL_OFFSET_TABLE_ is start(.got.plt), which is not true. extern const ElfW(Addr) _GLOBAL_OFFSET_TABLE_[] attribute_hidden; return _GLOBAL_OFFSET_TABLE_[0]; // R_X86_64_GOTPC32 In this patch, we Change all GOT_FROM_END to GOTPLT to fix the problem. * Add HasGotPltOffRel to denote whether .got.plt should be kept even if the section is empty. * Simplify GotSection::empty and GotPltSection::empty by setting HasGotOffRel and HasGotPltOffRel according to GlobalOffsetTable early. The change of R_386_GOTPC makes X86::writePltHeader simpler as we don't have to compute the offset start(.got.plt) - Ebx (it is constant 0). We still diverge from ld.bfd (at least in most cases) and gold in that .got.plt and .got are not adjacent, but the advantage doing that is unclear. Reviewers: ruiu, sivachandra, espindola Subscribers: emaste, mehdi_amini, arichardson, dexonsmith, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59594 llvm-svn: 356968	2019-03-25 23:46:19 +00:00
Rui Ueyama	b8b81e9b43	Improve error message for unknown relocations. Previously, we showed the following message for an unknown relocation: foo.o: unrecognized reloc 256 This patch improves it so that the error message includes a symbol name: foo.o: unknown relocation (256) against symbol bar llvm-svn: 354040	2019-02-14 18:02:20 +00:00
Rui Ueyama	04cbd988f9	Fix a bug in R_X86_64_PC{8,16} relocation handling. R_X86_64_PC{8,16} relocations are sign-extended, so when we check for relocation overflow, we had to use checkInt instead of checkUInt. I confirmed that GNU linkers create the same output for the test case. llvm-svn: 353437	2019-02-07 18:12:57 +00:00
George Rimar	55f7c72bea	[LLD][ELF] - Set DF_STATIC_TLS flag for X64 target This is the same as D57749, but for x64 target. "ELF Handling For Thread-Local Storage" p41 says (https://www.akkadia.org/drepper/tls.pdf): R_X86_64_GOTTPOFF relocation is used for IE TLS models. Hence if linker sees this relocation we should add DF_STATIC_TLS flag. Differential revision: https://reviews.llvm.org/D57821 llvm-svn: 353378	2019-02-07 07:59:43 +00:00
Rui Ueyama	33dbcbb2bc	Support R_X86_64_PC8 and R_X86_64_PC16. They are defined by the x86-64 ELF ABI standard. Differential Revision: https://reviews.llvm.org/D57799 llvm-svn: 353314	2019-02-06 16:50:09 +00:00
Fangrui Song	f55e9a2d2e	[PPC64] Set the number of relocations processed for R_PPC64_TLS[GL]D to 2 Summary: R_PPC64_TLSGD and R_PPC64_TLSLD are used as markers on TLS code sequences. After GD-to-IE or GD-to-LE relaxation, the next relocation R_PPC64_REL24 should be skipped to not create a false dependency on __tls_get_addr. When linking statically, the false dependency may cause an "undefined symbol: __tls_get_addr" error. R_PPC64_GOT_TLSGD16_HA R_PPC64_GOT_TLSGD16_LO R_PPC64_TLSGD R_TLSDESC_CALL R_PPC64_REL24 __tls_get_addr Reviewers: ruiu, sfertile, syzaara, espindola Reviewed By: sfertile Subscribers: emaste, nemanjai, arichardson, kbarton, jsji, llvm-commits, tamur Tags: #llvm Differential Revision: https://reviews.llvm.org/D57673 llvm-svn: 353262	2019-02-06 02:00:24 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Peter Wu	5391489674	[ELF][X86_64] Fix corrupted LD -> LE optimization for TLS without PLT The LD -> LE optimization for Thread-Local Storage without PLT requires an additional "66" prefix, otherwise the next instruction will be corrupted, causing runtime misbehavior (crashes) of the linked object. The other (GD -> IE/LD) optimizations are the same with or without PLT, but add tests for completeness. The instructions are copied from https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf#subsection.11.1.2 This does not try to address ILP32 (x32) support. Fixes https://bugs.llvm.org/show_bug.cgi?id=37303 Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D56779 llvm-svn: 351396	2019-01-16 23:28:51 +00:00
Rui Ueyama	63d397ea6e	Simplify Symbol::getPltVA. This patch also makes getPltEntryOffset a non-member function because it doesn't depend on any private members of the TargetInfo class. I tried a few different ideas, and it seems this change fits in best to me. Differential Revision: https://reviews.llvm.org/D54981 llvm-svn: 347781	2018-11-28 17:42:59 +00:00
Simon Atanasyan	b0486051d2	[ELF] Make TrapInstr and Filler byte arrays. NFC. The uint32_t type does not clearly convey that these fields are interpreted in the target endianness. Converting them to byte arrays should make this more obvious and less error-prone. Patch by James Clarke Differential Revision: http://reviews.llvm.org/D54207 llvm-svn: 346893	2018-11-14 21:05:20 +00:00
Sean Fertile	4b5ec7fb80	Reland "[PPC64] Add split - stack support." Recommitting https://reviews.llvm.org/rL344544 after fixing undefined behavior from left-shifting a negative value. Original commit message: This support is slightly different then the X86_64 implementation in that calls to __morestack don't need to get rewritten to calls to __moresatck_non_split when a split-stack caller calls a non-split-stack callee. Instead the size of the stack frame requested by the caller is adjusted prior to the call to __morestack. The size the stack-frame will be adjusted by is tune-able through a new --split-stack-adjust-size option. llvm-svn: 344622	2018-10-16 17:13:01 +00:00
Sean Fertile	831a1336ff	Revert "[PPC64] Add split - stack support." This reverts commit https://reviews.llvm.org/rL344544, which causes failures on a undefined behaviour sanitizer bot --> lld/ELF/Arch/PPC64.cpp:849:35: runtime error: left shift of negative value -1 llvm-svn: 344551	2018-10-15 20:20:28 +00:00
Sean Fertile	795cc9332b	[PPC64] Add split - stack support. This support is slightly different then the X86_64 implementation in that calls to __morestack don't need to get rewritten to calls to __moresatck_non_split when a split-stack caller calls a non-split-stack callee. Instead the size of the stack frame requested by the caller is adjusted prior to the call to __morestack. The size the stack-frame will be adjusted by is tune-able through a new --split-stack-adjust-size option. Differential Revision: https://reviews.llvm.org/D52099 llvm-svn: 344544	2018-10-15 19:05:57 +00:00
George Rimar	95aae4c59d	[ELF] - Do not fail on R__NONE relocations when parsing the debug info. This is https://bugs.llvm.org//show_bug.cgi?id=38919. Currently, LLD may report "unsupported relocation target while parsing debug info" when parsing the debug information. At the same time LLD does that for zeroed R_X86_64_NONE relocations, which obviously has "invalid" targets. The nature of R__NONE relocation assumes them should be ignored. This patch teaches LLD to stop reporting the debug information parsing errors for them. Differential revision: https://reviews.llvm.org/D52408 llvm-svn: 343078	2018-09-26 08:11:34 +00:00
Rui Ueyama	4e247522ac	Reset input section pointers to null on each linker invocation. Previously, if you invoke lld's `main` more than once in the same process, the second invocation could fail or produce a wrong result due to a stale pointer values of the previous run. Differential Revision: https://reviews.llvm.org/D52506 llvm-svn: 343009	2018-09-25 19:26:58 +00:00
Jordan Rupprecht	0f6d31812e	[LLD] Update split stack support to handle more generic prologues. Improve error handling. Add test file for better code-coverage. Update tests to be more complete. Submitting patch on behalf of saugustine. Differential Revision: https://reviews.llvm.org/D49926 llvm-svn: 338750	2018-08-02 18:13:40 +00:00
Sterling Augustine	2526104ee4	Workaround warning bug in old versions of gcc. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56480 llvm-svn: 337340	2018-07-18 00:33:25 +00:00
Sterling Augustine	4fd84c18df	Implement framework for linking split-stack object files, and x86_64 support. llvm-svn: 337332	2018-07-17 23:16:02 +00:00
Fangrui Song	4ff63648ad	[ELF][X86_64] Use R_GOTREL_FROM_END instead of R_GOTREL for R_X86_64_GOTOFF64 Summary: R_X86_64_GOTOFF64: S + A - GOT R_X86_64_GOTPC{32,64}: GOT + A - P (R_GOTONLY_PC_FROM_END) R_X86_64_GOTOFF64 should use R_GOTREL_FROM_END so that in conjunction with R_X86_64_GOTPC{32,64}, the `GOT` term is neutralized. This also matches the handling of R_386_GOTOFF (S + A - GOT). Reviewers: ruiu, espindola Subscribers: emaste, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D48095 llvm-svn: 334672	2018-06-13 23:29:28 +00:00
Rui Ueyama	864289990a	Handle R_X86_64_GOTOFF64. R_X86_64_GOTOFF64 is a relocation type to set to a distance betwween a symbol and the beginning of the .got section. Previously, we always created a dynamic relocation for the relocation type even though it can be resolved at link-time. Creating a dynamic relocation for R_X86_64_GOTOFF64 caused link failure for some programs that do have a relocation of the type in a .text section, as text relocations are prohibited in most configurations. Differential Revision: https://reviews.llvm.org/D48058 llvm-svn: 334534	2018-06-12 20:27:16 +00:00
Fangrui Song	aa473c5387	[ELF] Support R_X86_64_GOTPC{32,64} Reviewers: ruiu, grimar, espindola Subscribers: emaste, arichardson, llvm-commits Differential Revision: https://reviews.llvm.org/D47098 llvm-svn: 334532	2018-06-12 20:18:41 +00:00
Rui Ueyama	1fa3c728b9	Fix retpoline PLT for x86-64 when used for >4GB address. Previously, we wrote only the least significant 32 bits. llvm-svn: 333313	2018-05-25 21:14:45 +00:00
Rui Ueyama	95d6ca52ac	Add a comment for retpoline PLT. llvm-svn: 333312	2018-05-25 21:02:47 +00:00
Rui Ueyama	7ae70fc14a	Fix a bug that we truncated GOTPLT entries to 32 bits. llvm-svn: 333294	2018-05-25 18:26:14 +00:00
Benjamin Kramer	5455038d98	[lld] Make helpers static. NFC. llvm-svn: 332408	2018-05-15 22:01:54 +00:00
George Rimar	f9936e1fc9	[ELF] - Eliminate Target::isPicRel method. As was mentioned in comments for D45158, isPicRel's name does not make much sense, because what this method does is checks if we need to create the dynamic relocation or not. Instead of renaming it to something different, we can 'isPicRel' completely. We can reuse the getDynRel method. They are logically very close, getDynRel can just return R_*_NONE in case no dynamic relocation should be produced and that would simplify things and avoid functionality correlation/duplication with 'isPicRel'. The patch does this change. Differential revision: https://reviews.llvm.org/D45248 llvm-svn: 329275	2018-04-05 12:07:20 +00:00
George Rimar	785a79133b	[ELF] - X86_64: Use white list for relocations checked by isPicRel. isPicRel is used to check if we want to create the dynamic relocations. Not all of the dynamic relocations we create are passing through this check, but those that are, probably better be whitelisted. Differential revision: https://reviews.llvm.org/D45252 llvm-svn: 329203	2018-04-04 15:21:21 +00:00
George Rimar	c6735c23d2	[ELF] - X86_64: don't allow 8/16 bit dynamic relocations. Having 8/16 bits dynamic relocations is incorrect. Both gold and bfd (built from latest sources) disallow that too. Differential revision: https://reviews.llvm.org/D45158 llvm-svn: 329059	2018-04-03 11:58:23 +00:00
Rui Ueyama	f001ead490	Do not use template for check{Int,UInt,IntUInt,Alignment}. Template is just unnecessary. Differential Revision: https://reviews.llvm.org/D45063 llvm-svn: 328843	2018-03-29 22:40:52 +00:00
Andrew Ng	fe1d346f99	[ELF] Fix X86 & X86_64 PLT retpoline padding The PLT retpoline support for X86 and X86_64 did not include the padding when writing the header and entries. This issue was revealed when linker scripts were used, as this disables the built-in behaviour of filling the last page of executable segments with trap instructions. This particular behaviour was hiding the missing padding. Added retpoline tests with linker scripts. Differential Revision: https://reviews.llvm.org/D44682 llvm-svn: 328777	2018-03-29 14:03:01 +00:00
Peter Smith	3d044f57d4	[ELF] Recommit 327248 with Arm using the .got for _GLOBAL_OFFSET_TABLE_ This is the same as 327248 except Arm defining _GLOBAL_OFFSET_TABLE_ to be the base of the .got section as some existing code is relying upon it. For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] = reserved value that is by convention the address of the dynamic section. Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end of the .got section with the intention that the .got.plt section would follow the .got. However this does not always hold with the current default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent with the reserved first entry of the .got.plt. X86, X86_64 and AArch64 will use the .got.plt. Arm, Mips and Power use .got Fixes PR36555 Differential Revision: https://reviews.llvm.org/D44259 llvm-svn: 327823	2018-03-19 06:52:51 +00:00
Peter Collingbourne	5c902845e5	Revert r327248, "For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at" This change broke ARM code that expects to be able to add _GLOBAL_OFFSET_TABLE_ to the result of an R_ARM_REL32. I will provide a reproducer on llvm-commits. llvm-svn: 327688	2018-03-16 01:01:44 +00:00
Rafael Espindola	74acdfa691	Reduce code duplication a bit. The code for computing the offset of an entry in the plt is simple, but it was duplicated in quite a few places. llvm-svn: 327536	2018-03-14 17:41:34 +00:00
Peter Smith	18aa0be36e	For most Targets the _GLOBAL_OFFSET_TABLE_ symbol is expected to be at the start of the .got.plt section so that _GLOBAL_OFFSET_TABLE_[0] = reserved value that is by convention the address of the dynamic section. Previously we had defined _GLOBAL_OFFSET_TABLE_ as either the start or end of the .got section with the intention that the .got.plt section would follow the .got. However this does not always hold with the current default section ordering so _GLOBAL_OFFSET_TABLE_[0] may not be consistent with the reserved first entry of the .got.plt. X86, X86_64, Arm and AArch64 will use the .got.plt. Mips and Power use .got Fixes PR36555 Differential Revision: https://reviews.llvm.org/D44259 llvm-svn: 327248	2018-03-11 20:58:18 +00:00
Chandler Carruth	c58f2166ab	Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took in the speculative execution, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` in addition to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile all libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we strongly recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 llvm-svn: 323155	2018-01-22 22:05:25 +00:00
Rui Ueyama	17a3077f59	Make it clear where is a placeholder for later binary patching. This is an aesthetic change to represent a placeholder for later binary patching as "0, 0, 0, 0" instead of "0x00, 0x00, 0x00, 0x00". The former is how we represent it in COFF, and I found it easier to read than the latter. llvm-svn: 321471	2017-12-27 06:54:18 +00:00

1 2

60 Commits