llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	f77b77e8db	[ELF][RISCV] Relax local-exec TLS model In -mrelax mode, GCC/Clang may generate a local-exec TLS code sequence like: ``` # R_RISCV_TPREL_HI20, R_RISCV_RELAX lui rd, %tprel_hi(x) # R_RISCV_TPREL_ADD, R_RISCV_RELAX add rd, rd, tp, %tprel_add(x) # (R_RISCV_TPREL_LO12_I || R_RISCV_TPREL_LO12_S), R_RISCV_RELAX addi rd, rd, %tprel_lo(x) || sw rs, %tprel(x)(rd) ``` Note: st_value(x) for TLS should be in the range [0,p_memsz(PT_TLS)). When st_value(x) < 2048 (i.e. hi20(x) == 0), the linker can relax the code sequence to: ``` addi rd, tp, st_value(x) || sw rs, st_value(x)(rd) ``` Differential Revision: https://reviews.llvm.org/D129425	2022-07-15 10:08:08 -07:00
Fangrui Song	51b9e099d5	[ELF] Reword --no-allow-shlib-undefined diagnostic Use a format more similar to unresolved references from regular object files. It's probably easier to read for people who are less familiar with the linker diagnostics. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D129790	2022-07-15 01:29:58 -07:00
Alexandre Ganea	17a4427e82	[LLD][COFF] On Windows, fix the date formatting in the 'incremental' test. On my system the date formatting is a bit different from what the test used to support. I'm using: Windows 11 version 21H2, build 22000.795 using the English(Canada) region. ls from BusyBox 1.36 VS 2022 17.2.5 WinSDK 10.0.22000	2022-07-14 17:10:09 -04:00
Fangrui Song	889c6f3996	[ELF][test] Fix a typo in aarch64-ifunc-bti.s to actually test what was intended Thanks to Alex Brachet for spotting it in D110217.	2022-07-14 13:46:38 -07:00
Jez Ng	403d61aedd	[lld-macho] Enable EH frame relocation / pruning This just removes the code that gates the logic. The main issue here is perf impact: without {D122258}, LLD takes a significant perf hit because it now has to do a lot more work in the input parsing phase. But with that change to eliminate unnecessary EH frames from input object files, the perf overhead here is minimal. Concretely, here are the numbers for some builds as measured on my 16-core Mac Pro: chromium_framework This is without the use of `-femit-dwarf-unwind=no-compact-unwind`: base diff difference (95% CI) sys_time 1.826 ± 0.019 1.962 ± 0.034 [ +6.5% .. +8.4%] user_time 9.306 ± 0.054 9.926 ± 0.082 [ +6.2% .. +7.1%] wall_time 8.225 ± 0.068 8.947 ± 0.128 [ +8.0% .. +9.6%] samples 15 22 With that flag enabled, the regression mostly disappears, as hoped: base diff difference (95% CI) sys_time 1.839 ± 0.062 1.866 ± 0.068 [ -0.9% .. +3.8%] user_time 9.452 ± 0.068 9.490 ± 0.067 [ -0.1% .. +0.9%] wall_time 8.383 ± 0.127 8.452 ± 0.114 [ -0.1% .. +1.8%] samples 17 21 Unnamed internal app Without `-femit-dwarf-unwind`, this is the perf hit: base diff difference (95% CI) sys_time 1.372 ± 0.029 1.317 ± 0.024 [ -4.6% .. -3.5%] user_time 2.835 ± 0.028 2.980 ± 0.027 [ +4.8% .. +5.4%] wall_time 3.205 ± 0.079 3.383 ± 0.066 [ +4.9% .. +6.2%] samples 102 83 With `-femit-dwarf-unwind`, the perf hit almost disappears: base diff difference (95% CI) sys_time 1.274 ± 0.026 1.270 ± 0.025 [ -0.9% .. +0.3%] user_time 2.812 ± 0.023 2.822 ± 0.035 [ +0.1% .. +0.7%] wall_time 3.166 ± 0.047 3.174 ± 0.059 [ -0.2% .. +0.7%] samples 95 97 Just for fun, I measured the impact of `-femit-dwarf-unwind` on ld64 (`base` has the extra DWARF unwind info in the input object files, `diff` doesn't): base diff difference (95% CI) sys_time 1.128 ± 0.010 1.124 ± 0.023 [ -1.3% .. +0.6%] user_time 7.176 ± 0.030 7.106 ± 0.094 [ -1.5% .. -0.4%] wall_time 7.874 ± 0.041 7.795 ± 0.121 [ -1.7% .. -0.3%] samples 16 25 And for LLD: base diff difference (95% CI) sys_time 1.315 ± 0.019 1.280 ± 0.019 [ -3.2% .. -2.0%] user_time 2.980 ± 0.022 2.822 ± 0.016 [ -5.5% .. -5.0%] wall_time 3.369 ± 0.038 3.175 ± 0.033 [ -6.2% .. -5.3%] samples 47 47 So parsing the extra EH frames is a lot more expensive for us than for ld64. But given that we are quite a lot faster than ld64 to begin with, I guess this isn't entirely unexpected... Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D129540	2022-07-13 21:14:05 -04:00
Stefan Pintilie	c1f3cffee1	[PowerPC][LLD] Change PPC64R2SaveStub to only use non-PC-relative code Currently the PPC64R2SaveStub thunk will produce Power 10 code by default. This produced an issue when linking older code that made use of the st_other=1 bit but was never meant to be linked or run on Power 10. This patch makes it so that only the R_PPC64_REL24_NOTOC relocation can produce Power 10 code. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129580	2022-07-13 19:34:33 -05:00
Fangrui Song	e690137dde	[Support] Change compression::zlib::{compress,uncompress} to use uint8_t * It's more natural to use uint8_t * (std::byte needs C++17 and LLVM already uses uint8_t heavily) and most callers use uint8_t * instead of char *. The functions were recently moved into `llvm::compression::zlib::`, so downstream projects need to adapt anyway.	2022-07-13 16:26:54 -07:00
Daniel Bertalan	94e0f8e001	[lld-macho] Accept dylibs with LC_DYLD_EXPORTS_TRIE This load command specifies the offset and size of the exports trie. This information used to be a field in LC_DYLD_INFO, but in newer libraries, it has a dedicated load command: LC_DYLD_EXPORTS_TRIE. The format of the trie is the same for both load commands, so the code for parsing it can be shared. LLD does not generate this yet; it is mainly useful when chained fixups are in use, as the other members of LC_DYLD_INFO are unused then, so the smaller LC_DYLD_EXPORTS_TRIE can be output instead. LLDB gained support for this in D107673. Fixes #54550 Differential Revision: https://reviews.llvm.org/D129430	2022-07-13 22:34:11 +02:00
Daniel Bertalan	ecb14fd872	[lld-macho] Add LOH_ARM64_ADRP_LDR_GOT_LDR optimization hint support This hint instructs the linker to relax a GOT-indirect load. If the referenced symbol is external and its GOT entry is within +/- 1 MiB, the GOT entry can be loaded with a single literal ldr instruction. If the referenced symbol is local, its address may be loaded directly if it's close enough, or with an adr(p) + ldr pair if it's not. This type accounts for more than half of all LOHs in chromium_framework. This commit moves the eligibility checks into helper functions to improve the readability of the LOH processing code. No functional changes are intended to the previously implemented LOH types. Differential Revision: https://reviews.llvm.org/D129427	2022-07-13 12:20:14 +02:00
Kazu Hirata	e5f568a49f	Use has_value instead of hasValue (NFC)	2022-07-13 01:58:03 -07:00
Fangrui Song	9ea5b34f05	[ELF][RISCV] Use unshifted value for overflow check The unshifted value indicates a displacement in bytes, which is more meaningful.	2022-07-13 00:28:29 -07:00
Fangrui Song	6b1d151fe3	[ELF] Fix displacement computation for intra-section branch after D127611 The st_value computed by D127611 is inaccurate: * For a backward branch, the destination address may be wrong if there is no relaxable relocation between it and the current location due to `if (remove)`. We may incorrectly relax a branch to c.j, which ends up overflowing. * For a forward branch, the destination address may be overestimated and lose relaxation opportunities. To fix the issues, * Don't reset st_value to the original value. * Save the st_value delta from the previous iteration into valueDelta, and use `sa[0].d->value -= delta - valueDelta.find(sa[0].d)->second`.	2022-07-13 00:17:17 -07:00
Fangrui Song	67d760dd49	[ELF][test] Remove unneeded --mcpu=future from llvm-objdump commands	2022-07-12 21:08:52 -07:00
Fangrui Song	4864aba631	[ELF][test] Remove unneeded --mcpu=pwr10 from llvm-objdump commands llvm-objdump has defaulted to decode all known instructions for PPC64.	2022-07-12 21:07:45 -07:00
Jez Ng	61ace8f78b	[lld-macho][nfc] Change force-load.s test to actually test I'd forgotten to change a copypasted line...	2022-07-12 17:57:09 -04:00
YongKang Zhu	2324c2e3c3	[LLD] Two tweaks to symbol ordering scheme When `--symbol-ordering-file` is specified, the linker today will always put hot contributions in the middle of cold ones when targeting a RISC machine, so as to minimize the chance that branch thunks need to be generated for hot code calling into cold code. This is not necessary when the user specifies an ordering of read-only data (vs. function) symbols, or when the output section is small enough that no branch thunk would ever be required. The latter is common for mobile apps. For example, among all the native ARM64 libraries in Facebook Instagram App for Android, 80% of them have a text section smaller than 64KB and the largest text section seen is less than 8MB, well below the distance that a BRANCH26 can reach. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D128382	2022-07-12 11:34:17 -07:00
Alex Brachet	5176a7671f	Fix build on Windows It seems like the `sed` on Windows is not particularly smart. It's not actually needed in this place, so I've removed its usage and just created an invalid yaml another way.	2022-07-11 22:47:26 +00:00
Alex Brachet	d27984a651	Fix build on Windows Error message is not capitalized on Windows	2022-07-11 21:44:28 +00:00
Alex Brachet	fd9962e75d	[COFF] Add vfsoverlay flag This patch adds a new flag vfsoverlay similar to clang’s ivfsoverlay flag. This is helpful when compiling on case sensitive file systems when cross compiling to Windows. Particularly when compiling third party code containing #pragma comment(“linker”, “/defaultlib:...”) which can’t be easily changed. Differential Revision: https://reviews.llvm.org/D125800	2022-07-11 21:31:01 +00:00
Kaining Zhong	6c641d0de6	[lld-macho] Handle user-provided dtrace symbols to avoid linking failure This fixes https://github.com/llvm/llvm-project/issues/56238. ld64.lld currently does not generate __dof section in Mach-O, and -no_dtrace_dof option is on by default. However when there are user-defined dtrace symbols, ld64.lld will treat them as undefined symbols, which causes the linking to fail because lld cannot find their definitions. This patch allows ld64.lld to rewrite the instructions calling dtrace symbols to instructions like nop, as ld64 does; therefore, when it encounters user-provided dtrace probes, the linking can still succeed. I'm not sure whether support for dtrace is expected in lld, so for now I didn't add code to make lld emit __dof section like ld64, and only made it possible to link with dtrace symbols provided. If this feature is needed, I can add that part in Dtrace.cpp & Dtrace.h. Reviewed By: int3, #lld-macho Differential Revision: https://reviews.llvm.org/D129062	2022-07-11 15:32:26 -04:00
David Spickett	79942d32a6	[lld-macho] Fix compact unwind output for 32 bit builds This test was failing on our 32 bit build bot: https://lab.llvm.org/buildbot/#/builders/178/builds/2463 This happened because in UnwindInfoSectionImpl::finalize a decision is made whether to write out regular or compressed unwind info. One check in this does: ``` if (cuPtr->functionAddress >= functionAddressMax) { break; ``` Where cuPtr->functionAddress was uint64_t and functionAddressMax was uintptr_t, which is 4 bytes on a 32 bit system. Using uint64_t for functionAddressMax fixes this problem. Presumably because at only 4 bytes, the max is much lower than we expect. We're targeting 64 bit though so the size of the max should match the size of the addresses. Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D129363	2022-07-11 08:21:03 +00:00
Nico Weber	109d7fb4e6	fix comment typo to cycle bots	2022-07-09 22:41:58 +02:00
Fangrui Song	dd74d3117d	[ELF] Refactor ELFCOMPRESS_ZLIB handling and improve diagnostics And add some tests.	2022-07-08 14:04:19 -07:00
Leonard Chan	474c873148	Revert "[llvm] cmake config groundwork to have ZSTD in LLVM" This reverts commit `f07caf20b9` which seems to break upstream https://lab.llvm.org/buildbot/#/builders/109/builds/42253.	2022-07-08 13:48:05 -07:00
Cole Kissane	f07caf20b9	[llvm] cmake config groundwork to have ZSTD in LLVM - added `FindZSTD.cmake` - added a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - likewise added have_zstd to compiler-rt/test/lit.common.cfg.py, clang-tools-extra/clangd/test/lit.cfg.py, and several lit.site.cfg.py.in files mirroring have_zlib behavior Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-08 11:46:52 -07:00
Cole Kissane	ea61750c35	[NFC] Refactor llvm::zlib namespace * Refactor compression namespaces across the project, making way for a possible introduction of alternatives to zlib compression. Changes are as follows: * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`. Reviewed By: MaskRay, leonardchan, phosek Differential Revision: https://reviews.llvm.org/D128953	2022-07-08 11:19:07 -07:00
Fangrui Song	75e551e5d8	[ELF] Relax R_RISCV_CALL and R_RISCV_CALL_PLT A pair of auipc+jalr relocated by R_RISCV_CALL or R_RISCV_CALL_PLT can be converted to c.j, c.jal, or jal. * c.j: RVC and displacement is representable as an int12 * c.jal: RV32C and displacement is representable as an int12 * jal: displacement is representable as an int21 Use the D127581 relaxation framework to implement the relaxation. If a shorter sequence is satisfied, we record the new relocation type in `relocTypes` and save the new instruction into `writes`. Finally let `riscvFinalizeRelax` rewrite the instruction by setting `skip`. Differential Revision: https://reviews.llvm.org/D127611	2022-07-07 10:18:45 -07:00
Fangrui Song	6611d58f5b	[ELF] Relax R_RISCV_ALIGN Alternative to D125036. Implement R_RISCV_ALIGN relaxation so that we can handle -mrelax object files (i.e. -mno-relax is no longer needed) and create a framework for future relaxation. `relaxAux` is placed in a union with InputSectionBase::jumpInstrMod, storing auxiliary information for relaxation. In the first pass, `relaxAux` is allocated. The main data structure is `relocDeltas`: when referencing `relocations[i]`, the actual offset is `r_offset - (i ? relocDeltas[i-1] : 0)`. `relaxOnce` performs one relaxation pass. It computes `relocDeltas` for all text sections. Then, adjust st_value/st_size for symbols relative to this section based on `SymbolAnchor`. `bytesDropped` is set so that `assignAddresses` knows that the size has changed. Run `relaxOnce` in the `finalizeAddressDependentContent` loop to wait for convergence of text sections and other address dependent sections (e.g. SHT_RELR). Note: extracting `relaxOnce` into a separate loop works for many cases but has issues in some linker script edge cases. After convergence, compute section contents: shrink the NOP sequence of each R_RISCV_ALIGN as appropriate. Instead of deleting bytes, we run a sequence of memcpy on the content delimited by relocation locations. For R_RISCV_ALIGN let the next memcpy skip the desired number of bytes. Section content computation is parallelizable, but let's ensure the implementation is mature before optimizations. Technically we can save a copy if we interleave some code with `OutputSection::writeTo`, but let's not pollute the generic code (we don't have templated relocation resolving, so using conditions can impose overhead to non-RISCV.) Tested: `make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- LLVM=1 defconfig all` built Linux kernel using -mrelax is bootable. FreeBSD RISCV64 system using -mrelax is bootable. bash/curl/firefox/libevent/vim/tmux using -mrelax works. Differential Revision: https://reviews.llvm.org/D127581	2022-07-07 10:16:09 -07:00
Tim Northover	0f4339a835	lld test fix: don't check the precise hex emitted as a comment. It can vary depending on the platform, so as with the NO-FMA test just check for "0x".	2022-07-07 13:25:24 +01:00
Tim Northover	fe62019387	lld: fix test after x86 instruction comments now end in newline	2022-07-07 13:01:32 +01:00
Jin Xin Ng	65001f5777	[LTO][ELF] Add selective --save-temps= option Allows specific “temps” to be saved, instead of the current all-or-nothing nature of --save-temps. Multiple of these “temps” can be saved by specifying the argument multiple times. Differential Revision: https://reviews.llvm.org/D127778	2022-07-06 10:06:18 -07:00
Fangrui Song	e0612c91cd	[ELF] Optimize getInputSections. NFC In the majority of cases (e.g. orphan sections), an OutputSection has at most one InputSectionDescription (isd). By changing the return type to ArrayRef<InputSection *> we can just reference the isd->sections. For OutputSections with more than one InputSectionDescription we use a caller provided SmallVector to copy the elements as before. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D129111	2022-07-05 23:31:09 -07:00
Ben Dunbobbin	c35a6454b1	[BUILD] Add missed CMakeLists.txt change from `dfb77f2` See: https://reviews.llvm.org/D128195	2022-07-05 16:04:58 +01:00
Ben Dunbobbin	dfb77f2e99	[LLD][ELF] Add FORCE_LLD_DIAGNOSTICS_CRASH to force LLD to crash Add FORCE_LLD_DIAGNOSTICS_CRASH inspired by the existing FORCE_CLANG_DIAGNOSTICS_CRASH. This is particularly useful for people customizing LLD as they may want to modify the crash reporting behavior. Differential Revision: https://reviews.llvm.org/D128195	2022-07-05 09:43:09 +01:00
Daniel Bertalan	2028fe6fbc	[lld-macho] Handle LOH_ARM64_ADRP_LDR_GOT optimization hints This hint instructs the linker to perform the AdrpLdr or AdrpAdd transformation depending on whether the GOT load has been relaxed to load a local symbol's address. Differential Revision: https://reviews.llvm.org/D129059	2022-07-05 07:33:13 +02:00
Pengxuan Zheng	b5e49cdea9	[LLD][COFF] Ignore /kernel flag There exists some description of the flag from Microsoft, but not sure if there's more to it. We ignore the flag for now until we find out more about it. https://docs.microsoft.com/en-us/cpp/build/reference/kernel-create-kernel-mode-binary?view=msvc-170 Reviewed By: thieta, hans Differential Revision: https://reviews.llvm.org/D128238	2022-07-01 10:03:02 -07:00
Daniel Bertalan	73b659ff55	[lld-macho] Fix left shift of negative value UB I introduced this mistake in `573c7e6b3c`. Fixes the failure on this UBSan bot: https://lab.llvm.org/buildbot/#/builders/5/builds/25537	2022-07-01 12:00:16 +02:00
Daniel Bertalan	573c7e6b3c	[lld-macho] Handle LOH_ARM64_ADRP_LDR linker optimization hints This linker optimization hint transforms a pair of adrp+ldr (immediate) instructions into an ldr (literal) load from a PC-relative address if it is 4-byte aligned and within +/- 1 MiB, as ldr can encode a signed 19-bit offset that gets multiplied by 4. In the wild, only a small number of these hints are applicable because not many loads end up close enough to the data segment. However, the added helper functions will be useful in implementing the rest of the LOH types. Differential Revision: https://reviews.llvm.org/D128942	2022-07-01 09:44:24 +02:00
Daniel Bertalan	a3f67f0920	[lld-macho] Initial support for Linker Optimization Hints Linker optimization hints mark a sequence of instructions used for synthesizing an address, like ADRP+ADD. If the referenced symbol ends up close enough, it can be replaced by a faster sequence of instructions like ADR+NOP. This commit adds support for 2 of the 7 defined ARM64 optimization hints: - LOH_ARM64_ADRP_ADD, which transforms a pair of ADRP+ADD into ADR+NOP if the referenced address is within +/- 1 MiB - LOH_ARM64_ADRP_ADRP, which transforms two ADRP instructions into ADR+NOP if they reference the same page These two kinds already cover more than 50% of all LOHs in chromium_framework. Differential Review: https://reviews.llvm.org/D128093	2022-06-30 06:28:42 +02:00
Fangrui Song	9a572164d5	[ELF] Move InputFiles global variables (memoryBuffers, objectFiles, etc) into Ctx. NFC	2022-06-29 18:53:38 -07:00
Fangrui Song	e980f16d52	[ELF] Move whyExtract/backwardReferences from LinkerDriver to Ctx. NFC Ctx was recently added as a more suitable place for such singletons.	2022-06-29 17:34:31 -07:00
Daniel Bertalan	8d29f0fdb9	[lld-macho] Emit REBASE_OPCODE_ADD_ADDR_IMM_SCALED if possible An ADD_ADDR rebase opcode's argument can be encoded as an immediate if the offset is less than 15 * word size. This change reduces the size of chromium_framework by 100+ KiB. Differential Revision: https://reviews.llvm.org/D128798	2022-06-29 22:28:39 +02:00
Brad Smith	84b2e04aea	[docs] Remove outdated status update for FreeBSD Reviewed By: emaste, MaskRay Differential Revision: https://reviews.llvm.org/D128592	2022-06-27 19:41:53 -04:00
Sam Clegg	53217ecb88	[lld][WebAssembly] Don't apply data relocations at static constructor time Instead, export `__wasm_apply_data_relocs` and `__wasm_call_ctors` separately. This is required since user code in a shared library (such as static constructors) should not be run until relocations have been applied to all loaded libraries. See: https://github.com/emscripten-core/emscripten/issues/17295 Differential Revision: https://reviews.llvm.org/D128515	2022-06-27 15:50:02 -07:00
Kazu Hirata	586fb81eee	[lld] Don't use Optional::hasValue (NFC) This patch replaces x.hasValue() with x where x is contextually convertible to bool.	2022-06-26 19:37:14 -07:00
Fangrui Song	0688b00fc3	[ELF] Remove deprecated -dc -dc is deprecated in release/14.x. Remove it for 15.0. The only usage I know was FreeBSD crunchgen which was removed by https://reviews.freebsd.org/D34215 glibc just dropped -Wl,-d today. Keep -d for now.	2022-06-26 17:26:44 -07:00
Fangrui Song	b95cca03cd	[ELF] Improve compound assignment tests Also use strchr instead of is_contained.	2022-06-25 22:30:52 -07:00
Fangrui Song	0a0effdd5b	[ELF] Support -= *= /= <<= >>= &= |= in symbol assignments	2022-06-25 22:22:59 -07:00
Fangrui Song	77295c5486	[ELF] Allow ? without adjacent space GNU ld allows 1 ? 2?3:4 : 5?6 :7	2022-06-25 21:16:59 -07:00
Fangrui Song	e3f3d2abf0	[ELF][test] Improve expression test	2022-06-25 21:11:32 -07:00
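
Several of the messages above state numeric range conditions for relaxation: `st_value(x) < 2048` for local-exec TLS (f77b77e8db), int12/int21 displacements for call relaxation (75e551e5d8), and 4-byte alignment within +/- 1 MiB for the ADRP+LDR hint (573c7e6b3c). The following is a minimal sketch of those checks only, not LLD's implementation. All names (`fitsSigned`, `hi20`, `canRelaxTprel`, `pickCallForm`, `canUseLdrLiteral`) are hypothetical, and the call check ignores the destination-register and c.jal/RV32C distinctions described in the commit.

```cpp
#include <cstdint>

// Hypothetical helper: true if v is representable as a signed N-bit integer.
template <unsigned N> constexpr bool fitsSigned(int64_t v) {
  return v >= -(int64_t(1) << (N - 1)) && v < (int64_t(1) << (N - 1));
}

// Upper 20 bits as used by a lui/addi pair; the +0x800 compensates for the
// sign extension of the low 12 bits. Assumes v is a small TLS offset.
constexpr uint64_t hi20(uint64_t v) { return (v + 0x800) >> 12; }

// f77b77e8db: the lui/add/addi sequence can collapse to one tp-relative
// access when hi20(st_value) == 0, i.e. st_value < 2048.
constexpr bool canRelaxTprel(uint64_t stValue) { return hi20(stValue) == 0; }

enum class CallForm { AuipcJalr, Jal, CompressedJ };

// 75e551e5d8 (simplified): prefer the shortest form whose displacement
// range covers the target. `rvc` says whether compressed instructions
// are available.
constexpr CallForm pickCallForm(int64_t displacement, bool rvc) {
  if (rvc && fitsSigned<12>(displacement))
    return CallForm::CompressedJ;
  if (fitsSigned<21>(displacement))
    return CallForm::Jal;
  return CallForm::AuipcJalr;
}

// 573c7e6b3c: ldr (literal) encodes a signed 19-bit offset scaled by 4,
// so the byte delta must be 4-byte aligned and fit in 21 signed bits.
constexpr bool canUseLdrLiteral(int64_t delta) {
  return delta % 4 == 0 && fitsSigned<21>(delta);
}
```

Under these assumptions, `canRelaxTprel(2047)` holds while `canRelaxTprel(2048)` does not, which matches the `st_value(x) < 2048` condition quoted in the commit message.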


15417 Commits