llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	4adf7a7604	[ELF] Add -Bno-symbolic This option will be available in GNU ld 2.27 (https://sourceware.org/bugzilla/show_bug.cgi?id=27834). This option can cancel previously specified -Bsymbolic and -Bsymbolic-functions. This is useful for excluding some links when the default uses -Bsymbolic-functions. Reviewed By: jhenderson, peter.smith Differential Revision: https://reviews.llvm.org/D102383	2021-05-14 09:40:32 -07:00
Fangrui Song	16c30c3c23	[ELF] Change --shuffle-sections=<seed> to --shuffle-sections=<section-glob>=<seed> `--shuffle-sections=<seed>` applies to all sections. The new `--shuffle-sections=<section-glob>=<seed>` makes shuffling selective. To the best of my knowledge, the option is only used as debugging, so just drop the original form. `--shuffle-sections '.init_array=-1'` `--shuffle-sections '.fini_array=-1'`. reverses static constructors/destructors of the same priority. Useful to detect some static initialization order fiasco. `--shuffle-sections '.data=-1'` reverses `.data` sections. Useful to detect unfunded pointer comparison results of two unrelated objects. If certain sections have an intrinsic order, the old form cannot be used. Differential Revision: https://reviews.llvm.org/D98679	2021-03-18 10:18:19 -07:00
Albion Fung	36192790d8	[PowerPC][PC Rel] Implement option to omit Power10 instructions from stubs Implemented the option to omit Power10 instructions from save stubs via the option --no-power10-stubs or --power10-stubs=no on lld. --power10-stubs= will override the other option. --power10-stubs=auto also exists to use the default behaviour (ie allow Power10 instructions in stubs). Differential Revision: https://reviews.llvm.org/D94627	2021-03-04 13:27:46 -05:00
Fangrui Song	4bbcd63eea	[ELF] Add -z start-stop-gc to let __start_/__stop_ not retain C identifier name sections For one metadata section usage, each text section references a metadata section. The metadata sections have a C identifier name to allow the runtime to collect them via `__start_/__stop_` symbols. Since `__start_`/`__stop_` references are always present from live sections, the C identifier name sections appear like GC roots, which means they cannot be discarded by `ld --gc-sections`. To make such sections GCable, either SHF_LINK_ORDER or a section group is needed. SHF_LINK_ORDER is not suitable for the references can be inlined into other functions (See D97430: Function A (in the section .text.A) references its `__sancov_guard` section. Function B inlines A (so now .text.B references `__sancov_guard` - this is invalid with the semantics of SHF_LINK_ORDER). In the linking stage, if `.text.A` gets discarded, and `__sancov_guard` is retained via the reference from `.text.B`, the output will be invalid because `__sancov_guard` references the discarded `.text.A`. LLD errors "sh_link points to discarded section". ) A section group have size overhead, and is cumbersome when there is just one metadata section. Add `-z start-stop-gc` to drop the "__start_/__stop_ references retain non-SHF_LINK_ORDER non-SHF_GROUP C identifier name sections" rule. We reserve the rights to switch the default in the future. Reviewed By: phosek, jrtc27 Differential Revision: https://reviews.llvm.org/D96914	2021-02-25 15:46:37 -08:00
Fangrui Song	eea34aae2e	[ELF] Inspect -EL & -EB for OUTPUT_FORMAT(default, big, little) Choose big if -EB is specified, little if -EL is specified, or default if neither is specified. The new behavior matches GNU ld. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1025 Differential Revision: https://reviews.llvm.org/D96214	2021-02-08 10:34:57 -08:00
Fangrui Song	57bfa2ddb6	[ELF] Delete unused --warn-ifunc-textrel The option catches incompatibility between `R_*_IRELATIVE` and DT_TEXTREL/DF_TEXTREL before glibc 2.29. Newer glibc versions are more common nowadays and I don't think this option has ever been used. Diagnosing this problem is also straightforward by reading the stack trace.	2021-02-02 09:47:06 -08:00
Hongtao Yu	8aa3ee241d	[CSSPGO] LTO option for pseudo probe Adding a lld option to support emitting pseudo probe metadata in LTO mode. Reviewed By: MaskRay, wmi, wenlei Differential Revision: https://reviews.llvm.org/D95056	2021-01-22 11:07:10 -08:00
Sean Fertile	8f91f38148	[LLD] Search archives for symbol defs to override COMMON symbols. This patch changes the archive handling to enable the semantics needed for legacy FORTRAN common blocks and block data. When we have a COMMON definition of a symbol and are including an archive, LLD will now search the members for global/weak defintions to override the COMMON symbol. The previous LLD behavior (where a member would only be included if it satisifed some other needed symbol definition) can be re-enabled with the option '-no-fortran-common'. Differential Revision: https://reviews.llvm.org/D86142	2020-12-07 10:09:19 -05:00
Wei Wang	3acda91742	[Remarks][1/2] Expand remarks hotness threshold option support in more tools This is the #1 of 2 changes that make remarks hotness threshold option available in more tools. The changes also allow the threshold to sync with hotness threshold from profile summary with special value 'auto'. This change modifies the interface of lto::setupLLVMOptimizationRemarks() to accept remarks hotness threshold. Update all the tools that use it with remarks hotness threshold options: * lld: '--opt-remarks-hotness-threshold=' * llvm-lto2: '--pass-remarks-hotness-threshold=' * llvm-lto: '--lto-pass-remarks-hotness-threshold=' * gold plugin: '-plugin-opt=opt-remarks-hotness-threshold=' Differential Revision: https://reviews.llvm.org/D85809	2020-11-30 21:55:49 -08:00
Fangrui Song	55d310adc0	[ELF] Fix interaction between --unresolved-symbols= and --[no-]allow-shlib-undefined As mentioned in https://reviews.llvm.org/D67479#1667256 , * `--[no-]allow-shlib-undefined` control the diagnostic for an unresolved symbol in a shared object * `-z defs/-z undefs` control the diagnostic for an unresolved symbol in a regular object file * `--unresolved-symbols=` controls both bits. In addition, make --warn-unresolved-symbols affect --no-allow-shlib-undefined. This patch makes the behavior match GNU ld. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D91510	2020-11-17 12:20:57 -08:00
Nemanja Ivanovic	cddb0dbcef	[LLD][PowerPC] Implement GOT to PC-Rel relaxation This patch implements the handling for the R_PPC64_PCREL_OPT relocation as well as the GOT relocation for the associated R_PPC64_GOT_PCREL34 relocation. On Power10 targets with PC-Relative addressing, the linker can relax GOT-relative accesses to PC-Relative under some conditions. Since the sequence consists of a prefixed load, followed by a non-prefixed access (load or store), the linker needs to replace the first instruction (as the replacement instruction will be prefixed). The compiler communicates to the linker that this optimization is safe by placing the two aforementioned relocations on the GOT load (of the address). The linker then does two things: - Convert the load from the got into a PC-Relative add to compute the address relative to the PC - Find the instruction referred to by the second relocation (R_PPC64_PCREL_OPT) and replace the first with the PC-Relative version of it It is important to synchronize the mapping from legacy memory instructions to their PC-Relative form. Hence, this patch adds a file to be included by both the compiler and the linker so they're always in agreement. Differential revision: https://reviews.llvm.org/D84360	2020-08-17 09:36:09 -05:00
Petr Hosek	81eeabbd97	[ELF] Add --dependency-file option Clang and GCC have a feature (-MD flag) to create a dependency file in a format that build systems such as Make or Ninja can read, which specifies all the additional inputs such .h files. This change introduces the same functionality to lld bringing it to feature parity with ld and gold which gained this feature recently. See https://sourceware.org/bugzilla/show_bug.cgi?id=22843 for more details and discussion. The implementation corresponds to -MD -MP compiler flag where the generated dependency file also includes phony targets which works around the errors where the dependency is removed. This matches the format used by ld and gold. Fixes PR42806 Differential Revision: https://reviews.llvm.org/D82437	2020-08-03 16:59:13 -07:00
Petr Hosek	0bd918c828	Revert "[ELF] Add --dependency-file option" This reverts commit `b4c7657ba6` which seems to be breaking certain bots with assertion error.	2020-07-31 01:12:59 -07:00
Petr Hosek	b4c7657ba6	[ELF] Add --dependency-file option Clang and GCC have a feature (-MD flag) to create a dependency file in a format that build systems such as Make or Ninja can read, which specifies all the additional inputs such .h files. This change introduces the same functionality to lld bringing it to feature parity with ld and gold which gained this feature recently. See https://sourceware.org/bugzilla/show_bug.cgi?id=22843 for more details and discussion. The implementation corresponds to -MD -MP compiler flag where the generated dependency file also includes phony targets which works around the errors where the dependency is removed. This matches the format used by ld and gold. Fixes PR42806 Differential Revision: https://reviews.llvm.org/D82437	2020-07-30 12:31:20 -07:00
Fangrui Song	4ce56b8122	[ELF] Add -z dead-reloc-in-nonalloc=<section_glob>=<value> ... to customize the tombstone value we use for an absolute relocation referencing a discarded symbol. This can be used as a workaround when some debug processing tool has trouble with current -1 tombstone value (https://bugs.chromium.org/p/chromium/issues/detail?id=1102223#c11 ) For example, to get the current built-in rules (not considering the .debug_line special case for ICF): ``` -z dead-reloc-in-nonalloc='.debug_=0xffffffffffffffff' -z dead-reloc-in-nonalloc=.debug_loc=0xfffffffffffffffe -z dead-reloc-in-nonalloc=.debug_ranges=0xfffffffffffffffe ``` To get GNU ld (as of binutils 2.35)'s behavior: ``` -z dead-reloc-in-nonalloc='=0' -z dead-reloc-in-nonalloc=.debug_ranges=1 ``` This option has other use cases. For example, if we want to check whether a non-SHF_ALLOC section has dead relocations. With this patch, we can run a regular LLD and run another with a special -z dead-reloc-in-nonalloc=, then compare their output. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D83264	2020-07-08 10:15:16 -07:00
Petr Hosek	fffd05d525	[ELF] Add -z start-stop-visibility= to set __start_/__stop_ symbol visibility This matches the equivalent flag implemented in GNU linkers, see https://sourceware.org/pipermail/binutils/2020-June/111685.html for the associated discussion. Differential Revision: https://reviews.llvm.org/D55682	2020-06-23 15:59:59 -07:00
Hongtao Yu	2638aafe12	[LLD][ThinLTO] Add --thinlto-single-module to allow compiling partial modules. This change introduces an LLD switch --thinlto-single-module to allow compiling only a part of the input modules. This is specifically enables: 1. Fast investigating/debugging modules of interest without spending time on compiling unrelated modules. 2. Compiler debug dump with -mllvm -debug-only= for specific modules. It will be useful for large applications which has 1K+ input modules for thinLTO. The switch can be combined with `--lto-obj-path=` or `--lto-emit-asm` to obtain intermediate object files or assembly files. So far the module name matching is implemented as a fuzzy name lookup where the modules with name containing the switch value are compiled. E.g, Command: ld.lld main.o thin.a --thinlto-single-module=thin.a --lto-obj-path=single.o log: [ThinLTO] Selecting thin.a(thin1.o at 168) to compile [ThinLTO] Selecting thin.a(thin2.o at 228) to compile Command: ld.lld main.o thin.a --thinlto-single-module=thin1.o --lto-obj-path=single.o log: [ThinLTO] Selecting thin.a(thin1.o at 168) to compile Differential Revision: https://reviews.llvm.org/D80406	2020-06-10 15:32:30 -07:00
Sriraman Tallam	e0bca46b08	Options for Basic Block Sections, enabled in D68063 and D73674. This patch adds clang options: -fbasic-block-sections={all,<filename>,labels,none} and -funique-basic-block-section-names. LLVM Support for basic block sections is already enabled. + -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic block sections for all or a subset of basic blocks. "labels" only enables basic block symbols. + -funique-basic-block-section-names: Enables unique section names for basic block sections, disabled by default. Differential Revision: https://reviews.llvm.org/D68049	2020-06-02 00:23:32 -07:00
Fangrui Song	751f18e7d4	[ELF] Refine --export-dynamic-symbol semantics to be compatible GNU ld 2.35 GNU ld from binutils 2.35 onwards will likely support --export-dynamic-symbol but with different semantics. https://sourceware.org/pipermail/binutils/2020-May/111302.html Differences: 1. -export-dynamic-symbol is not supported 2. --export-dynamic-symbol takes a glob argument 3. --export-dynamic-symbol can suppress binding the references to the definition within the shared object if (-Bsymbolic or -Bsymbolic-functions) 4. --export-dynamic-symbol does not imply -u I don't think the first three points can affect any user. For the fourth point, Not implying -u can lead to some archive members unfetched. Add -u foo to restore the previous behavior. Exact semantics: * -no-pie or -pie: matched non-local defined symbols will be added to the dynamic symbol table. * -shared: matched non-local STV_DEFAULT symbols will not be bound to definitions within the shared object even if they would otherwise be due to -Bsymbolic, -Bsymbolic-functions, or --dynamic-list. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D80487	2020-06-01 11:30:03 -07:00
Fangrui Song	b912b887d8	[ELF] Add --print-archive-stats= gold has an option --print-symbol-counts= which prints: // For each archive archive $archive $members $fetched_members // For each object file symbols $object $defined_symbols $used_defined_symbols In most cases, `$defined_symbols = $used_defined_symbols` unless weak symbols are present. Strangely `$used_defined_symbols` includes symbols defined relative to --gc-sections discarded sections. The `symbols` lines do not appear to be useful. `archive` lines are useful: `$fetched_members=0` lines correspond to unused archives. The information can be used to trim dependencies. This patch implements --print-archive-stats= which prints the number of members and the number of fetched members for each archive. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D78983	2020-04-29 18:04:37 -07:00
Hongtao Yu	964ef8eecc	[lld] Support --lto-emit-asm and --plugin-opt=emit-asm Summary: The switch --plugin-opt=emit-asm can be used with the gold linker to dump the final assembly code generated by LTO in a user-friendly way. Unfortunately it doesn't work with lld. I'm hooking it up with lld. With that switch, lld emits assembly code into the output file (specified by -o) and if there are multiple input files, each of their assembly code will be emitted into a separate file named by suffixing the output file name with a unique number, respectively. The linking then stops after generating those assembly files. Reviewers: espindola, wenlei, tejohnson, MaskRay, grimar Reviewed By: tejohnson, MaskRay, grimar Subscribers: pcc, emaste, inglorion, arichardson, hiraditya, MaskRay, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77231	2020-04-27 11:00:46 -07:00
Fangrui Song	232578804a	[ELF] Add --warn-backrefs-exclude=<glob> D77522 changed --warn-backrefs to not warn for linking sandwich problems (-ldef1 -lref -ldef2). This removed lots of false positives. However, glibc still has some problems. libc.a defines some symbols which are normally in libm.a and libpthread.a, e.g. __isnanl/raise. For a linking order `-lm -lpthread -lc`, I have seen: ``` // different resolutions: GNU ld/gold select libc.a(s_isnan.o) as the definition backward reference detected: __isnanl in libc.a(printf_fp.o) refers to libm.a(m_isnanl.o) // different resolutions: GNU ld/gold select libc.a(raise.o) as the definition backward reference detected: raise in libc.a(abort.o) refers to libpthread.a(pt-raise.o) ``` To facilitate deployment of --warn-backrefs, add --warn-backrefs-exclude= so that certain known issues (which may be impractical to fix) can be whitelisted. Deliberate choices: * Not a comma-separated list (`--warn-backrefs-exclude=liba.a,libb.a`). -Wl, splits the argument at commas, so we cannot use commas. --export-dynamic-symbol is similar. * Not in the style of `--warn-backrefs='*' --warn-backrefs=-liba.a`. We just need exclusion, not inclusion. For easier build system integration, we should avoid order dependency. With the current scheme, we enable --warn-backrefs, and indivial libraries can add --warn-backrefs-exclude=<glob> to their LDFLAGS. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D77512	2020-04-20 07:52:15 -07:00
Sriraman Tallam	94317878d8	LLD Support for Basic Block Sections This is part of the Propeller framework to do post link code layout optimizations. Please see the RFC here: https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the detailed RFC doc here: https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf This patch adds lld support for basic block sections and performs relaxations after the basic blocks have been reordered. After the linker has reordered the basic block sections according to the desired sequence, it runs a relaxation pass to optimize jump instructions. Currently, the compiler emits the long form of all jump instructions. AMD64 ISA supports variants of jump instructions with one byte offset or a four byte offset. The compiler generates jump instructions with R_X86_64 32-bit PC relative relocations. We would like to use a new relocation type for these jump instructions as it makes it easy and accurate while relaxing these instructions. The relaxation pass does two things: First, it deletes all explicit fall-through direct jump instructions between adjacent basic blocks. This is done by discarding the tail of the basic block section. Second, If there are consecutive jump instructions, it checks if the first conditional jump can be inverted to convert the second into a fall through and delete the second. The jump instructions are relaxed by using jump instruction mods, something like relocations. These are used to modify the opcode of the jump instruction. Jump instruction mods contain three values, instruction offset, jump type and size. While writing this jump instruction out to the final binary, the linker uses the jump instruction mod to determine the opcode and the size of the modified jump instruction. These mods are required because the input object files are memory-mapped without write permissions and directly modifying the object files requires copying these sections. Copying a large number of basic block sections significantly bloats memory. Differential Revision: https://reviews.llvm.org/D68065	2020-04-07 06:55:57 -07:00
Alexandre Ganea	09158252f7	[ThinLTO] Allow usage of all hardware threads in the system Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled. One can now say in LLD: /opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified. /opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency(). /opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask. When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows). When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows). Differential Revision: https://reviews.llvm.org/D75153	2020-03-27 10:20:58 -04:00
Shoaib Meenai	2822852ffc	[ELF] Correct error message when OUTPUT_FORMAT is used Any OUTPUT_FORMAT in a linker script overrides the emulation passed on the command line, so record the passed bfdname and use that in the error message about incompatible input files. This prevents confusing error messages. For example, if you explicitly pass `-m elf_x86_64` to LLD but accidentally include a linker script which sets `OUTPUT_FORMAT(elf32-i386)`, LLD would previously complain about your input files being compatible with elf_x86_64, which isn't the actual issue, and is confusing because the input files are in fact x86-64 ELF files. Interestingly enough, this also prevents a segfault! When we don't pass `-m` and we have an object file which is incompatible with the `OUTPUT_FORMAT` set by a linker script, the object file is checked for compatibility before it's added to the objectFiles vector. config->emulation, objectFiles, and sharedFiles will all be empty, so we'll attempt to access bitcodeFiles[0], but bitcodeFiles is also empty, so we'll segfault. This commit prevents the segfault by adding OUTPUT_FORMAT as a possible source of machine configuration, and it also adds an llvm_unreachable to diagnose similar issues in the future. Differential Revision: https://reviews.llvm.org/D76109	2020-03-12 22:54:53 -07:00
David Bozier	6e2804ce6b	[LLD] Add support for --unique option Summary: Places orphan sections into a unique output section. This prevents the merging of orphan sections of the same name. Matches behaviour of GNU ld --unique. --unique=pattern is not implemented. Motivated user case shown in the test has 2 local symbols as they would appear if C++ source has been compiled with -ffunction-sections. The merging of these sections in the case of a partial link (-r) may limit the effectiveness of -gc-sections of a subsequent link. Reviewers: espindola, jhenderson, bd1976llvm, edd, andrewng, JonChesterfield, MaskRay, grimar, ruiu, psmith Reviewed By: MaskRay, grimar Subscribers: emaste, arichardson, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75536	2020-03-10 12:20:21 +00:00
Rafael Ávila de Espíndola	d48d339156	[lld][ELF] Add --shuffle-sections=seed to shuffle input sections Summary: This option causes lld to shuffle sections by assigning different priorities in each run. The use case for this is to introduce randomization in benchmarks. The idea is inspired by the paper "Producing Wrong Data Without Doing Anything Obviously Wrong!" (https://www.inf.usi.ch/faculty/hauswirth/publications/asplos09.pdf). Unlike the paper, we shuffle individual sections, not just input files. Doing this in lld is particularly convenient as the --reproduce option makes it easy to collect all the necessary bits for relinking the program being benchmarked. Once that it is done, all that is needed is to add --shuffle-sections=0 to the response file and relink before each run of the benchmark. Differential Revision: https://reviews.llvm.org/D74791	2020-02-19 13:44:12 -08:00
Fangrui Song	105a270028	[ELF][AArch64] Rename pacPlt to zPacPlt and forceBti to zForceIbt after D71327. NFC We use config->z* for -z options.	2020-02-13 21:02:54 -08:00
Russell Gallop	e7cb374433	[LLD][ELF] Add time-trace to ELF LLD This adds some of LLD specific scopes and picks up optimisation scopes via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in `77e6bb3c`. Differential Revision: https://reviews.llvm.org/D71060	2020-02-06 12:14:13 +00:00
Teresa Johnson	2f63d549f1	Restore "[LTO/WPD] Enable aggressive WPD under LTO option" This restores `59733525d3` (D71913), along with bot fix `19c76989bb`. The bot failure should be fixed by D73418, committed as `af954e441a`. I also added a fix for non-x86 bot failures by requiring x86 in new test lld/test/ELF/lto/devirt_vcall_vis_public.ll.	2020-01-27 07:55:05 -08:00
Teresa Johnson	90e630a95e	Revert "[LTO/WPD] Enable aggressive WPD under LTO option" This reverts commit `59733525d3`. There is a windows sanitizer bot failure in one of the cfi tests that I will need some time to figure out: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio	2020-01-23 17:29:24 -08:00
Teresa Johnson	59733525d3	[LTO/WPD] Enable aggressive WPD under LTO option Summary: Third part in series to support Safe Whole Program Devirtualization Enablement, see RFC here: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html This patch adds type test metadata under -fwhole-program-vtables, even for classes without hidden visibility. It then changes WPD to skip devirtualization for a virtual function call when any of the compatible vtables has public vcall visibility. Additionally, internal LLVM options as well as lld and gold-plugin options are added which enable upgrading all public vcall visibility to linkage unit (hidden) visibility during LTO. This enables the more aggressive WPD to kick in based on LTO time knowledge of the visibility guarantees. Support was added to all flavors of LTO WPD (regular, hybrid and index-only), and to both the new and old LTO APIs. Unfortunately it was not simple to split the first and second parts of this part of the change (the unconditional emission of type tests and the upgrading of the vcall visiblity) as I needed a way to upgrade the public visibility on legacy WPD llvm assembly tests that don't include linkage unit vcall visibility specifiers, to avoid a lot of test churn. I also added a mechanism to LowerTypeTests that allows dropping type test assume sequences we now aggressively insert when we invoke distributed ThinLTO backends with null indexes, which is used in testing mode, and which doesn't invoke the normal ThinLTO backend pipeline. Depends on D71907 and D71911. Reviewers: pcc, evgeny777, steven_wu, espindola Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71913	2020-01-23 16:09:44 -08:00
Fangrui Song	0fbf28f7aa	[ELF] --no-dynamic-linker: don't emit undefined weak symbols to .dynsym I felt really sad to push this commit for my selfish purpose to make glibc -static-pie build with lld. Some code constructs in glibc require R_X86_64_GOTPCREL/R_X86_64_REX_GOTPCRELX referencing undefined weak to be resolved to a GOT entry not relocated by R_X86_64_GLOB_DAT (GNU ld behavior), e.g. csu/libc-start.c if (__pthread_initialize_minimal != NULL) __pthread_initialize_minimal (); elf/dl-object.c void _dl_add_to_namespace_list (struct link_map new, Lmid_t nsid) { / We modify the list of loaded objects. */ __rtld_lock_lock_recursive (GL(dl_load_write_lock)); Emitting a GLOB_DAT will make the address equal &__ehdr_start (true value) and cause elf/ldconfig to segfault. glibc really should move away from weak references, which do not have defined semantics. Temporarily special case --no-dynamic-linker.	2020-01-23 12:25:15 -08:00
Fangrui Song	7cd429f27d	[ELF] Add -z force-ibt and -z shstk for Intel Control-flow Enforcement Technology This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang. It adds Intel CET (Control-flow Enforcement Technology) support to lld. The implementation follows the draft version of psABI which you can download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI. CET introduces a new restriction on indirect jump instructions so that you can limit the places to which you can jump to using indirect jumps. In order to use the feature, you need to compile source files with -fcf-protection=full. * IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt. * SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified. IBT-enabled executables/shared objects have two PLT sections, ".plt" and ".plt.sec". For the details as to why we have two sections, please read the comments. Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D59780	2020-01-13 23:39:28 -08:00
Rui Ueyama	69da7e29de	Revert an accidental commit `af5ca40b47`	2019-12-13 15:17:40 +09:00
Rui Ueyama	af5ca40b47	temporary	2019-12-13 14:35:03 +09:00
Fangrui Song	f0558f582a	[ELF] Delete unused Configuration::zExecstack after D56554	2019-11-25 14:44:09 -08:00
Nick Terrell	6814232429	[LLD][ELF] Support --[no-]mmap-output-file with F_no_mmap Summary: Add a flag `F_no_mmap` to `FileOutputBuffer` to support `--[no-]mmap-output-file` in ELF LLD. LLD currently explicitly ignores this flag for compatibility with GNU ld and gold. We need this flag to speed up link time for large binaries in certain scenarios. When we link some of our larger binaries we find that LLD takes 50+ GB of memory, which causes memory pressure. The memory pressure causes the VM to flush dirty pages of the output file to disk. This is normally okay, since we should be flushing cold pages. However, when using BtrFS with compression we need to write 128KB at a time when we flush a page. If any page in that 128KB block is written again, then it must be flushed a second time, and so on. Since LLD doesn't write sequentially this causes write amplification. The same 128KB block will end up being flushed multiple times, causing the linker to many times more IO than necessary. We've observed 3-5x faster builds with -no-mmap-output-file when we hit this scenario. The bad scenario only applies to compressed filesystems, which group together multiple pages into a single compressed block. I've tested BtrFS, but the problem will be present for any compressed filesystem on Linux, since it is caused by the VM. Silently ignoring --no-mmap-output-file caused a silent regression when we switched from gold to lld. We pass --no-mmap-output-file to fix this edge case, but since lld silently ignored the flag we didn't realize it wasn't being respected. Benchmark building a 9 GB binary that exposes this edge case. I linked 3 times with --mmap-output-file and 3 times with --no-mmap-output-file and took the average. The machine has 24 cores @ 2.4 GHz, 112 GB of RAM, BtrFS mounted with -compress-force=zstd, and an 80% full disk. \| Mode \| Time \| \|---------\|-------\| \| mmap \| 894 s \| \| no mmap \| 126 s \| When compression is disabled, BtrFS performs just as well with and without mmap on this benchmark. I was unable to reproduce the regression with any binaries in lld-speed-test. Reviewed By: ruiu, MaskRay Differential Revision: https://reviews.llvm.org/D69294	2019-10-29 15:49:08 -07:00
Michał Górny	2a0fcae3d4	[lld] [ELF] Add '-z nognustack' opt to suppress emitting PT_GNU_STACK Add a new '-z nognustack' option that suppresses emitting PT_GNU_STACK segment. This segment is not supported at all on NetBSD (stack is always non-executable), and the option is meant to be used to disable emitting it. Differential Revision: https://reviews.llvm.org/D56554	2019-10-29 17:54:23 +01:00
Nico Weber	5976a3f5aa	Fix a few typos in lld/ELF to cycle bots	2019-10-28 21:41:47 -04:00
Fangrui Song	0264950697	[ELF] Add -z separate-loadable-segments to complement separate-code and noseparate-code D64906 allows PT_LOAD to have overlapping p_offset ranges. In the default R RX RW RW layout + -z noseparate-code case, we do not tail pad segments when transiting to another segment. This can save at most 3*maxPageSize bytes. a) Before D64906, we tail pad R, RX and the first RW. b) With -z separate-code, we tail pad R and RX, but not the first RW (RELRO). In some cases, b) saves one file page. In some cases, b) wastes one virtual memory page. The waste is a concern on Fuchsia. Because it uses compressed binaries, it doesn't benefit from the saved file page. This patch adds -z separate-loadable-segments to restore the behavior before D64906. It can affect section addresses and can thus be used as a debugging mechanism (see PR43214 and ld.so partition bug in crbug.com/998712). Reviewed By: jakehehrlich, ruiu Differential Revision: https://reviews.llvm.org/D67481 llvm-svn: 372807	2019-09-25 03:39:31 +00:00
Peter Smith	ea99ce5e9b	[ELF][ARM] Implement --fix-cortex-a8 to fix erratum 657417 The --fix-cortex-a8 option implements a linker workaround for the coretex-a8 erratum 657417. A summary of the erratum conditions is: - A 32-bit Thumb-2 branch instruction B.w, Bcc.w, BL, BLX spans two 4KiB regions. - The destination of the branch is to the first 4KiB region. - The instruction before the branch is a 32-bit Thumb-2 non-branch instruction. The linker fix is to redirect the branch to a patch not in the first 4KiB region. The patch forwards the branch on to its target. The cortex-a8, is an old CPU, with the first implementation of this workaround in ld.bfd appearing in 2009. The cortex-a8 has been used in early Android Phones and there are some critical applications that still need to run on a cortex-a8 that have the erratum. The patch is applied roughly 10 times on LLD and 20 on Clang when they are built with --fix-cortex-a8 on an Arm system. The formal erratum description is avaliable in the ARM Core Cortex-A8 (AT400/AT401) Errata Notice document. This is available from Arm on request but it seems to be findable via a web search. Differential Revision: https://reviews.llvm.org/D67284 llvm-svn: 371965	2019-09-16 09:38:38 +00:00
Fangrui Song	e28a70daf4	[ELF] Consistently prioritize non-* wildcards overs "" in version scripts We prioritize non- wildcards overs VER_NDX_LOCAL/VER_NDX_GLOBAL "". This patch generalizes the rule to "" of other versions and thus fixes PR40176. I don't feel strongly about this GNU linkers' behavior but the generalization simplifies code. Delete `config->defaultSymbolVersion` which was used to special case VER_NDX_LOCAL/VER_NDX_GLOBAL "*". In `SymbolTable::scanVersionScript`, custom versions are handled the same way as VER_NDX_LOCAL/VER_NDX_GLOBAL. So merge `config->versionScript{Locals,Globals}` into `config->versionDefinitions`. Overall this seems to simplify the code. In `SymbolTable::assign{Exact,Wildcard}Versions`, `sym->verdefIndex == config->defaultSymbolVersion` is changed to `verdefIndex == UINT32_C(-1)`. This allows us to give duplicate assignment diagnostics for `{ global: foo; };` `V1 { global: foo; };` In test/linkerscript/version-script.s: vs_index of an undefined symbol changes from 0 to 1. This doesn't matter (arguably 1 is better because the binding is STB_GLOBAL) because vs_index of an undefined symbol is ignored. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D65716 llvm-svn: 367869	2019-08-05 14:31:39 +00:00
Fangrui Song	5391f158c2	[ELF] Add -z separate-code and pad the last page of last PF_X PT_LOAD with traps only if -z separate-code is specified This patch 1) adds -z separate-code and -z noseparate-code (default). 2) changes the condition that the last page of last PF_X PT_LOAD is padded with trap instructions. Current condition (after D33630): if there is no `SECTIONS` commands. After this change: if -z separate-code is specified. -z separate-code was introduced to ld.bfd in 2018, to place the text segment in its own pages. There is no overlap in pages between an executable segment and a non-executable segment: 1) RX cannot load initial contents from R or RW(or non-SHF_ALLOC). 2) R and RW(or non-SHF_ALLOC) cannot load initial contents from RX. lld's current status: - Between R and RX: in `Writer<ELFT>::fixSectionAlignments()`, the start of a segment is always aligned to maxPageSize, so the initial contents loaded by R and RX do not overlap. I plan to allow overlaps in D64906 if -z noseparate-code is in effect. - Between RX and RW(or non-SHF_ALLOC if RW doesn't exist): we currently unconditionally pad the last page to commonPageSize (defaults to 4096 on all targets we support). This patch will make it effective only if -z separate-code is specified. -z separate-code is a dubious feature that intends to reduce the number of ROP gadgets (which is actually ineffective because attackers can find plenty of gadgets in the text segment, no need to find gadgets in non-code regions). With the overlapping PT_LOAD technique D64906, -z noseparate-code removes two more alignments at segment boundaries than -z separate-code. This saves at most defaultCommonPageSize2 bytes, which are significant on targets with large defaultCommonPageSize (AArch64/MIPS/PPC: 65536). Issues/feedback on alignment at segment boundaries to help understand the implication: binutils PR24490 (the situation on ld.bfd is worse because they have two R-- on both sides of R-E so more alignments.) * In binutils, the 2018-02-27 commit "ld: Add --enable-separate-code" made -z separate-code the default on Linux. `d969dea983` In musl-cross-make, binutils is configured with --disable-separate-code to address size regressions caused by -z separate-code. (lld actually has the same issue, which I plan to fix in a future patch. The ld.bfd x86 status is worse because they default to max-page-size=0x200000). * https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237676 people want smaller code size. This patch will remove one alignment boundary. * Stef O'Rear: I'm opposed to any kind of page alignment at the text/rodata line (having a partial page of text aliased as rodata and vice versa has no demonstrable harm, and I actually care about small systems). So, make -z noseparate-code the default. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D64903 llvm-svn: 367537	2019-08-01 09:58:25 +00:00
Fangrui Song	47cfe8f321	[ELF] Fix variable names in comments after VariableName -> variableName change Also fix some typos. llvm-svn: 366181	2019-07-16 05:50:45 +00:00
Rui Ueyama	3837f4273f	[Coding style change] Rename variables so that they start with a lowercase letter This patch is mechanically generated by clang-llvm-rename tool that I wrote using Clang Refactoring Engine just for creating this patch. You can see the source code of the tool at https://reviews.llvm.org/D64123. There's no manual post-processing; you can generate the same patch by re-running the tool against lld's code base. Here is the main discussion thread to change the LLVM coding style: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html In the discussion thread, I proposed we use lld as a testbed for variable naming scheme change, and this patch does that. I chose to rename variables so that they are in camelCase, just because that is a minimal change to make variables to start with a lowercase letter. Note to downstream patch maintainers: if you are maintaining a downstream lld repo, just rebasing ahead of this commit would cause massive merge conflicts because this patch essentially changes every line in the lld subdirectory. But there's a remedy. clang-llvm-rename tool is a batch tool, so you can rename variables in your downstream repo with the tool. Given that, here is how to rebase your repo to a commit after the mass renaming: 1. rebase to the commit just before the mass variable renaming, 2. apply the tool to your downstream repo to mass-rename variables locally, and 3. rebase again to the head. Most changes made by the tool should be identical for a downstream repo and for the head, so at the step 3, almost all changes should be merged and disappear. I'd expect that there would be some lines that you need to merge by hand, but that shouldn't be too many. Differential Revision: https://reviews.llvm.org/D64121 llvm-svn: 365595	2019-07-10 05:00:37 +00:00
Francis Visoiu Mistrih	34667519dc	[Remarks] Extend -fsave-optimization-record to specify the format Use -fsave-optimization-record=<format> to specify a different format than the default, which is YAML. For now, only YAML is supported. llvm-svn: 363573	2019-06-17 16:06:00 +00:00
Peter Collingbourne	0282898586	ELF: Create synthetic sections for loadable partitions. We create several types of synthetic sections for loadable partitions, including: - The dynamic symbol table. This allows code outside of the loadable partitions to find entry points with dlsym. - Creating a dynamic symbol table also requires the creation of several other synthetic sections for the partition, such as the dynamic table and hash table sections. - The partition's ELF header is represented as a synthetic section in the combined output file, and will be used by llvm-objcopy to extract partitions. Differential Revision: https://reviews.llvm.org/D62350 llvm-svn: 362819	2019-06-07 17:57:58 +00:00
Peter Smith	e208208a31	[ELF][AArch64] Support for BTI and PAC Branch Target Identification (BTI) and Pointer Authentication (PAC) are architecture features introduced in v8.5a and 8.3a respectively. The new instructions have been added in the hint space so that binaries take advantage of support where it exists yet still run on older hardware. The impact of each feature is: BTI: For executable pages that have been guarded, all indirect branches must have a destination that is a BTI instruction of the appropriate type. For the static linker, this means that PLT entries must have a "BTI c" as the first instruction in the sequence. BTI is an all or nothing property for a link unit, any indirect branch not landing on a valid destination will cause a Branch Target Exception. PAC: The dynamic loader encodes with PACIA the address of the destination that the PLT entry will load from the .plt.got, placing the result in a subset of the top-bits that are not valid virtual addresses. The PLT entry may authenticate these top-bits using the AUTIA instruction before branching to the destination. Use of PAC in PLT sequences is a contract between the dynamic loader and the static linker, it is independent of whether the relocatable objects use PAC. BTI and PAC are independent features that can be combined. So we can have several combinations of PLT: - Standard with no BTI or PAC - BTI PLT with "BTI c" as first instruction. - PAC PLT with "AUTIA1716" before the indirect branch to X17. - BTIPAC PLT with "BTI c" as first instruction and "AUTIA1716" before the first indirect branch to X17. The use of BTI and PAC in relocatable object files are encoded by feature bits in the .note.gnu.property section in a similar way to Intel CET. There is one AArch64 specific program property GNU_PROPERTY_AARCH64_FEATURE_1_AND and two target feature bits defined: - GNU_PROPERTY_AARCH64_FEATURE_1_BTI -- All executable sections are compatible with BTI. - GNU_PROPERTY_AARCH64_FEATURE_1_PAC -- All executable sections have return address signing enabled. Due to the properties of FEATURE_1_AND the static linker can tell when all input relocatable objects have the BTI and PAC feature bits set. The static linker uses this to enable the appropriate PLT sequence. Neither -> standard PLT GNU_PROPERTY_AARCH64_FEATURE_1_BTI -> BTI PLT GNU_PROPERTY_AARCH64_FEATURE_1_PAC -> PAC PLT Both properties -> BTIPAC PLT In addition to the .note.gnu.properties there are two new command line options: --force-bti : Act as if all relocatable inputs had GNU_PROPERTY_AARCH64_FEATURE_1_BTI and warn for every relocatable object that does not. --pac-plt : Act as if all relocatable inputs had GNU_PROPERTY_AARCH64_FEATURE_1_PAC. As PAC is a contract between the loader and static linker no warning is given if it is not present in an input. Two processor specific dynamic tags are used to communicate that a non standard PLT sequence is being used. DTI_AARCH64_BTI_PLT and DTI_AARCH64_BTI_PAC. Differential Revision: https://reviews.llvm.org/D62609 llvm-svn: 362793	2019-06-07 13:00:17 +00:00
Rui Ueyama	2057f8366a	Read .note.gnu.property sections and emit a merged .note.gnu.property section. This patch also adds `--require-cet` option for the sake of testing. The actual feature for IBT-aware PLT is not included in this patch. This is a part of https://reviews.llvm.org/D59780. Submitting this first should make it easy to work with a related change (https://reviews.llvm.org/D62609). Differential Revision: https://reviews.llvm.org/D62853 llvm-svn: 362579	2019-06-05 03:04:46 +00:00

1 2 3 4 5 ...

363 Commits