llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	70257fab68	Use any_of (NFC)	2022-07-22 01:05:17 -07:00
John Ericson	07b749800c	[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as `CMAKE_INSTALL_BINDIR` becomes an absolute path, and then when downstream projects try to install there too this breaks because our builds always install to fresh directories for isolation's sake. Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the other specially crafted `LLVM_CONFIG_*` variables substituted in `llvm/cmake/modules/LLVMConfig.cmake.in`. @beanz added it in `d0e1c2a550` to fix a dangling reference in `AddLLVM`, but I am suspicious of how this variable doesn't follow the pattern. Those other ones are carefully made to be build-time vs install-time variables depending on which `LLVMConfig.cmake` is being generated, are carefully made relative as appropriate, etc. etc. For my NixOS use-case they are also fine because they are never used as downstream install variables, only for reading not writing. To avoid the problems I face, and restore symmetry, I deleted the exported variable and arranged to have many `${project}_TOOLS_INSTALL_DIR`s. `AddLLVM` now instead expects each project to define its own, and they do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports `LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in the usual way, matching the other remaining exported variables. For the `AddLLVM` changes, I tried to copy the existing pattern of internal vs non-internal or for LLVM vs for downstream function/macro names, but it would be good to confirm I did that correctly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117977	2022-07-21 19:04:00 +00:00
Jez Ng	ec315a5fa1	[lld-macho] Fix LOH parsing segfault `advanceSubsection()` didn't account for the possibility that a section could have no subsections. Reviewed By: #lld-macho, thakis, BertalanD Differential Revision: https://reviews.llvm.org/D130288	2022-07-21 13:59:39 -04:00
Jez Ng	241f62d8d3	[lld-macho] Fix assertion when two symbols at same addr have unwind info If there are multiple symbols at the same address, our unwind info implementation assumes that we always register unwind entries to a single canonical symbol. This assumption was violated by the `registerEhFrame` code. Fixes #56570. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D130208	2022-07-21 09:44:49 -04:00
Daniel Bertalan	888d0a5ef2	[lld-macho][NFC] Remove redundant StringRef construction It's only used in one branch, so we were unnecessarily calculating the length of many symbol names. Tiny speedup when linking chromium_framework on my M1 Mac mini: x before.txt + after.txt N Min Max Median Avg Stddev x 10 3.9917109 4.0418 4.0318099 4.0203902 0.021459873 + 10 3.944725 4.053988 3.9708955 3.9825602 0.037257609 Difference at 95.0% confidence -0.03783 +/- 0.0285663 -0.940953% +/- 0.710536% (Student's t, pooled s = 0.0304028) Differential Revision: https://reviews.llvm.org/D130234	2022-07-21 15:36:56 +02:00
Daniel Bertalan	54e18b2397	[lld-macho] Optimize rebase opcode generation This commit reduces the size of the emitted rebase sections by generating the REBASE_OPCODE_DO_REBASE_ADD_ADDR_ULEB and REBASE_OPCODE_DO_REBASE_ULEB_TIMES_SKIPPING_ULEB opcodes. With this change, chromium_framework's rebase section is a 40% smaller 197 kilobytes, down from the previous 320 kB. That is 6 kB smaller than what ld64 produces for the same input. Performance figures from my M1 Mac mini: x before + after N Min Max Median Avg Stddev x 10 4.2269349 4.3300061 4.2689675 4.2690016 0.031151669 + 10 4.219331 4.2914009 4.2398136 4.2448277 0.023817308 No difference proven at 95.0% confidence Differential Revision: https://reviews.llvm.org/D130180	2022-07-21 10:00:39 +02:00
Keith Smiley	15f685eaa8	[lld-macho] Fold cfstrings with --deduplicate-literals Similar to cstrings ld64 always deduplicates cfstrings. This was already being done when enabling ICF, but for debug builds you may want to flip this on if you cannot eliminate your instances of this, so this change makes --deduplicate-literals also apply to cfstrings. Differential Revision: https://reviews.llvm.org/D130134	2022-07-20 11:11:09 -07:00
Kazu Hirata	360c1111e3	Use llvm::is_contained (NFC)	2022-07-20 09:09:19 -07:00
Martin Storsjö	801971e5b4	[LLD] [COFF] Improve the error message for too many exported symbols Print the actual number of symbols that would have been exported too, which helps assessing the situation. Differential Revision: https://reviews.llvm.org/D130117	2022-07-20 16:58:29 +03:00
Jez Ng	87ce7b41d8	[lld-macho] Simplify archive loading logic This is a follow-on to {D129556}. I've refactored the code such that `addFile()` no longer needs to take an extra parameter. Additionally, the "do we force-load or not" policy logic is now fully contained within addFile, instead of being split between `addFile` and `parseLCLinkerOptions`. This also allows us to move the `ForceLoad` (now `LoadType`) enum out of the header file. Additionally, we can now correctly report loads induced by `LC_LINKER_OPTION` in our `-why_load` output. I've also added another test to check that CLI library non-force-loads take precedence over `LC_LINKER_OPTION` + `-force_load_swift_libs`. (The existing logic is correct, just untested.) Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D130137	2022-07-19 21:56:24 -04:00
Jez Ng	2d889a87fb	[lld-macho] Read in new addrsig format The new format uses symbol relocations, as described in {D127637}. Reviewed By: #lld-macho, alx32 Differential Revision: https://reviews.llvm.org/D128938	2022-07-19 21:22:27 -04:00
Kaining Zhong	dd5635541c	[lld-macho] Fix loading same libraries from both LC_LINKER_OPTION and command line This fixes https://github.com/llvm/llvm-project/issues/56059 and https://github.com/llvm/llvm-project/issues/56440. This is inspired by tapthaker's patch (https://reviews.llvm.org/D127941), and has reused his test cases. This patch adds a bool "isCommandLineLoad" to indicate where archives are from. If lld tries to load the same library loaded previously by LC_LINKER_OPTION from CLI, it will use this isCommandLineLoad to determine if it should be affected by -all_load & -ObjC flags. This also prevents -force_load from affecting archives loaded previously from CLI without such flag, whereas tapthaker's patch will fail such test case (introduced by https://reviews.llvm.org/D128025). Reviewed By: int3, #lld-macho Differential Revision: https://reviews.llvm.org/D129556	2022-07-19 17:46:14 -04:00
Keith Smiley	0bc100986c	[lld-macho] Add support for -alias This creates a symbol alias similar to --defsym in the elf linker. This is used by swiftpm for all executables, so it's useful to support. This doesn't implement -alias_list but that could be done pretty easily as needed. Differential Revision: https://reviews.llvm.org/D129938	2022-07-19 13:55:56 -07:00
Arthur Eubanks	5bce73ba75	[test] Convert some tests to use opaque pointers	2022-07-19 13:11:08 -07:00
Jez Ng	f6017abb60	[lld-macho] Support folding of functions with identical LSDAs To do this, we need to slice away the LSDA pointer, just like we are slicing away the functionAddress pointer. No observable difference in perf on chromium_framework: base diff difference (95% CI) sys_time 1.769 ± 0.068 1.761 ± 0.065 [ -2.7% .. +1.8%] user_time 9.517 ± 0.110 9.528 ± 0.116 [ -0.6% .. +0.8%] wall_time 8.291 ± 0.174 8.307 ± 0.183 [ -1.1% .. +1.5%] samples 21 25 Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D129830	2022-07-19 13:29:52 -04:00
Daniel Bertalan	1fb9466c6a	[lld-macho] Devirtualize TargetInfo::getRelocAttrs This method is called on each relocation when parsing input files, so the overhead of using virtual functions ends up being quite large. We now have a single non-virtual method, which reads from the appropriate array of relocation attributes set in the TargetInfo constructor. This change results in a modest 2.3% reduction in link time for chromium_framework measured on an x86-64 VPS, and 0.7% on an arm64 Mac. N Min Max Median Avg Stddev x 10 11.869417 12.032609 11.935041 11.938268 0.045802324 + 10 11.581526 11.785265 11.649885 11.659507 0.054634834 Difference at 95.0% confidence -0.278761 +/- 0.0473673 -2.33502% +/- 0.396768% (Student's t, pooled s = 0.0504124) Differential Revision: https://reviews.llvm.org/D130000	2022-07-18 19:32:58 +02:00
Nico Weber	7b3146dcd3	fix comment typo to cycle bots	2022-07-17 09:10:05 -04:00
Daniel Bertalan	2b2e858e9c	[lld-macho] Handle filename being passed in -lto_object_path Clang passes a filename rather than a directory in -lto_object_path when using FullLTO. Previously, it was always treated as a directory, so lld would crash when it attempted to create temporary files inside it. Fixes #54805 Differential Revision: https://reviews.llvm.org/D129705	2022-07-16 21:46:47 +02:00
Jez Ng	fe47cfb324	[lld-macho][nfc] Add more tests + comments around ICF + unwind info interaction While working on {D129830}, I realized that our handling of ICF + eh_frame combined was untested. Additionally I realized that the comment explaining why we were safely slicing away the functionAddress reloc from our compact unwind entries was... insufficient and slightly misleading. I've tried to clarify it. Reviewed By: #lld-macho, thevinster Differential Revision: https://reviews.llvm.org/D129894	2022-07-16 00:52:47 -04:00
Kazu Hirata	5cff5142a8	Use value instead of getValue (NFC)	2022-07-15 20:03:13 -07:00
Jez Ng	dbbdc3d6fb	[lld-macho][nfc] Fix numeric substitutions in icf.s test We were re-defining the various numeric variables when we actually intended to check already-defined variables against the value on the current CHECK line. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D129831	2022-07-15 15:28:27 -04:00
Fangrui Song	f77b77e8db	[ELF][RISCV] Relax local-exec TLS model In -mrelax mode, GCC/Clang may generate a local-exec TLS code sequence like: ``` # R_RISCV_TPREL_HI20, R_RISCV_RELAX lui rd, %tprel_hi(x) # R_RISCV_TPREL_ADD, R_RISCV_RELAX add rd, rd, tp, %tprel_add(x) # (R_RISCV_TPREL_LO12_I || R_RISCV_TPREL_LO12_S), R_RISCV_RELAX addi rd, rd, %tprel_lo(x) || sw rs, %tprel(x)(rd) ``` Note: st_value(x) for TLS should be in the range [0,p_memsz(PT_TLS)). When st_value(x) < 2048 (i.e. hi20(x) == 0), the linker can relax the code sequence to: ``` addi rd, tp, st_value(x) || sw rs, st_value(x)(rd) ``` Differential Revision: https://reviews.llvm.org/D129425	2022-07-15 10:08:08 -07:00
Fangrui Song	51b9e099d5	[ELF] Reword --no-allow-shlib-undefined diagnostic Use a format more similar to unresolved references from regular object files. It's probably easier to read for people who are less familiar with the linker diagnostics. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D129790	2022-07-15 01:29:58 -07:00
Alexandre Ganea	17a4427e82	[LLD][COFF] On Windows, fix the date formatting in the 'incremental' test. On my system the date formatting is a bit different from what the test used to support. I'm using: Windows 11 version 21H2, build 22000.795 using the English(Canada) region. ls from BusyBox 1.36 VS 2022 17.2.5 WinSDK 10.0.22000	2022-07-14 17:10:09 -04:00
Fangrui Song	889c6f3996	[ELF][test] Fix a typo in aarch64-ifunc-bti.s to actually test what was intended Thanks to Alex Brachet for spotting it in D110217.	2022-07-14 13:46:38 -07:00
Jez Ng	403d61aedd	[lld-macho] Enable EH frame relocation / pruning This just removes the code that gates the logic. The main issue here is perf impact: without {D122258}, LLD takes a significant perf hit because it now has to do a lot more work in the input parsing phase. But with that change to eliminate unnecessary EH frames from input object files, the perf overhead here is minimal. Concretely, here are the numbers for some builds as measured on my 16-core Mac Pro: chromium_framework This is without the use of `-femit-dwarf-unwind=no-compact-unwind`: base diff difference (95% CI) sys_time 1.826 ± 0.019 1.962 ± 0.034 [ +6.5% .. +8.4%] user_time 9.306 ± 0.054 9.926 ± 0.082 [ +6.2% .. +7.1%] wall_time 8.225 ± 0.068 8.947 ± 0.128 [ +8.0% .. +9.6%] samples 15 22 With that flag enabled, the regression mostly disappears, as hoped: base diff difference (95% CI) sys_time 1.839 ± 0.062 1.866 ± 0.068 [ -0.9% .. +3.8%] user_time 9.452 ± 0.068 9.490 ± 0.067 [ -0.1% .. +0.9%] wall_time 8.383 ± 0.127 8.452 ± 0.114 [ -0.1% .. +1.8%] samples 17 21 Unnamed internal app Without `-femit-dwarf-unwind`, this is the perf hit: base diff difference (95% CI) sys_time 1.372 ± 0.029 1.317 ± 0.024 [ -4.6% .. -3.5%] user_time 2.835 ± 0.028 2.980 ± 0.027 [ +4.8% .. +5.4%] wall_time 3.205 ± 0.079 3.383 ± 0.066 [ +4.9% .. +6.2%] samples 102 83 With `-femit-dwarf-unwind`, the perf hit almost disappears: base diff difference (95% CI) sys_time 1.274 ± 0.026 1.270 ± 0.025 [ -0.9% .. +0.3%] user_time 2.812 ± 0.023 2.822 ± 0.035 [ +0.1% .. +0.7%] wall_time 3.166 ± 0.047 3.174 ± 0.059 [ -0.2% .. +0.7%] samples 95 97 Just for fun, I measured the impact of `-femit-dwarf-unwind` on ld64 (`base` has the extra DWARF unwind info in the input object files, `diff` doesn't): base diff difference (95% CI) sys_time 1.128 ± 0.010 1.124 ± 0.023 [ -1.3% .. +0.6%] user_time 7.176 ± 0.030 7.106 ± 0.094 [ -1.5% .. -0.4%] wall_time 7.874 ± 0.041 7.795 ± 0.121 [ -1.7% .. -0.3%] samples 16 25 And for LLD: base diff difference (95% CI) sys_time 1.315 ± 0.019 1.280 ± 0.019 [ -3.2% .. -2.0%] user_time 2.980 ± 0.022 2.822 ± 0.016 [ -5.5% .. -5.0%] wall_time 3.369 ± 0.038 3.175 ± 0.033 [ -6.2% .. -5.3%] samples 47 47 So parsing the extra EH frames is a lot more expensive for us than for ld64. But given that we are quite a lot faster than ld64 to begin with, I guess this isn't entirely unexpected... Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D129540	2022-07-13 21:14:05 -04:00
Stefan Pintilie	c1f3cffee1	[PowerPC][LLD] Change PPC64R2SaveStub to only use non-PC-relative code Currently the PPC64R2SaveStub thunk will produce Power 10 code by default. This produced an issue when linking older code that made use of the st_other=1 bit but was never meant to be linked or run on Power 10. This patch makes it so that only the R_PPC64_REL24_NOTOC relocation can produce Power 10 code. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129580	2022-07-13 19:34:33 -05:00
Fangrui Song	e690137dde	[Support] Change compression::zlib::{compress,uncompress} to use uint8_t * It's more natural to use uint8_t * (std::byte needs C++17 and llvm has too much uint8_t ) and most callers use uint8_t instead of char *. The functions are recently moved into `llvm::compression::zlib::`, so downstream projects need to make adaption anyway.	2022-07-13 16:26:54 -07:00
Daniel Bertalan	94e0f8e001	[lld-macho] Accept dylibs with LC_DYLD_EXPORTS_TRIE This load command specifies the offset and size of the exports trie. This information used to be a field in LC_DYLD_INFO, but in newer libraries, it has a dedicated load command: LC_DYLD_EXPORTS_TRIE. The format of the trie is the same for both load commands, so the code for parsing it can be shared. LLD does not generate this yet; it is mainly useful when chained fixups are in use, as the other members of LC_DYLD_INFO are unused then, so the smaller LC_DYLD_EXPORTS_TRIE can be output instead. LLDB gained support for this in D107673. Fixes #54550 Differential Revision: https://reviews.llvm.org/D129430	2022-07-13 22:34:11 +02:00
Daniel Bertalan	ecb14fd872	[lld-macho] Add LOH_ARM64_ADRP_LDR_GOT_LDR optimization hint support This hint instructs the linker to relax a GOT-indirect load. If the referenced symbol is external and its GOT entry is within +/- 1 MiB, the GOT entry can be loaded with a single literal ldr instruction. If the referenced symbol is local, its address may be loaded directly if it's close enough, or with an adr(p) + ldr pair if it's not. This type accounts for more than half of all LOHs in chromium_framework. This commit moves the eligibility checks into helper functions to improve the readability of the LOH processing code. No functional changes are intended to the previously implemented LOH types. Differential Revision: https://reviews.llvm.org/D129427	2022-07-13 12:20:14 +02:00
Kazu Hirata	e5f568a49f	Use has_value instead of hasValue (NFC)	2022-07-13 01:58:03 -07:00
Fangrui Song	9ea5b34f05	[ELF][RISCV] Use unshifted value for overflow check The unshifted value indicates a displacement in bytes, which is more meaningful.	2022-07-13 00:28:29 -07:00
Fangrui Song	6b1d151fe3	[ELF] Fix displacement computation for intra-section branch after D127611 D127611 computed st_value is inaccurate: * For a backward branch, the destination address may be wrong if there is no relaxable relocation between it and the current location due to `if (remove)`. We may incorrectly relax a branch to c.j which ends up an overflow. * For a forward branch, the destination address may be overestimated and lose relaxation opportunities. To fix the issues, * Don't reset st_value to the original value. * Save the st_value delta from the previous iteration into valueDelta, and use `sa[0].d->value -= delta - valueDelta.find(sa[0].d)->second`.	2022-07-13 00:17:17 -07:00
Fangrui Song	67d760dd49	[ELF][test] Remove unneeded --mcpu=future from llvm-objdump commands	2022-07-12 21:08:52 -07:00
Fangrui Song	4864aba631	[ELF][test] Remove unneeded --mcpu=pwr10 from llvm-objdump commands llvm-objdump has defaulted to decode all known instructions for PPC64.	2022-07-12 21:07:45 -07:00
Jez Ng	61ace8f78b	[lld-macho][nfc] Change force-load.s test to actually test I'd forgotten to change a copypasted line...	2022-07-12 17:57:09 -04:00
YongKang Zhu	2324c2e3c3	[LLD] Two tweaks to symbol ordering scheme When `--symbol-ordering-file` is specified, the linker today will always put hot contributions in the middle of cold ones when targeting RISC machine, so to minimize the chances that branch thunks need be generated for hot code calling into cold code. This is not necessary when user specifies an ordering of read-only data (vs. function) symbols, or when output section is small such that no branch thunk would ever be required. The latter is common for mobile apps. For example, among all the native ARM64 libraries in Facebook Instagram App for Android, 80% of them have text section smaller than 64KB and the largest text section seen is less than 8MB, well below the distance that a BRANCH26 can reach. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D128382	2022-07-12 11:34:17 -07:00
Alex Brachet	5176a7671f	Fix build on Windows It seems like the `sed` on Windows is not particularly smart. It's not actually needed in this place, so I've removed its usage and just created an invalid yaml another way.	2022-07-11 22:47:26 +00:00
Alex Brachet	d27984a651	Fix build on Windows Error message is not capitalized on Windows	2022-07-11 21:44:28 +00:00
Alex Brachet	fd9962e75d	[COFF] Add vfsoverlay flag This patch adds a new flag vfsoverlay similar to clang’s ivfsoverlay flag. This is helpful when compiling on case sensitive file systems when cross compiling to Windows. Particularly when compiling third party code containing #pragma comment(“linker”, “/defaultlib:...”) which can’t be easily changed. Differential Revision: https://reviews.llvm.org/D125800	2022-07-11 21:31:01 +00:00
Kaining Zhong	6c641d0de6	[lld-macho] Handle user-provided dtrace symbols to avoid linking failure This fixes https://github.com/llvm/llvm-project/issues/56238. ld64.lld currently does not generate __dof section in Mach-O, and -no_dtrace_dof option is on by default. However when there are user-defined dtrace symbols, ld64.lld will treat them as undefined symbols, which causes the linking to fail because lld cannot find their definitions. This patch allows ld64.lld to rewrite the instructions calling dtrace symbols to instructions like nop as what ld64 does; therefore, when encountered with user-provided dtrace probes, the linking can still succeed. I'm not sure whether support for dtrace is expected in lld, so for now I didn't add codes to make lld emit __dof section like ld64, and only made it possible to link with dtrace symbols provided. If this feature is needed, I can add that part in Dtrace.cpp & Dtrace.h. Reviewed By: int3, #lld-macho Differential Revision: https://reviews.llvm.org/D129062	2022-07-11 15:32:26 -04:00
David Spickett	79942d32a6	[lld-macho] Fix compact unwind output for 32 bit builds This test was failing on our 32 bit build bot: https://lab.llvm.org/buildbot/#/builders/178/builds/2463 This happened because in UnwindInfoSectionImpl::finalize a decision is made whether to write out regular or compressed unwind info. One check in this does: ``` if (cuPtr->functionAddress >= functionAddressMax) { break; ``` Where cuPtr->functionAddress was uint64_t and functionAddressMax was uintptr_t, which is 4 bytes on a 32 bit system. Using uint64_t for functionAddressMax fixes this problem. Presumably because at only 4 bytes, the max is much lower than we expect. We're targeting 64 bit though so the size of the max should match the size of the addresses. Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D129363	2022-07-11 08:21:03 +00:00
Nico Weber	109d7fb4e6	fix comment typo to cycle bots	2022-07-09 22:41:58 +02:00
Fangrui Song	dd74d3117d	[ELF] Refactor ELFCOMPRESS_ZLIB handling and improve diagnostics And add some tests.	2022-07-08 14:04:19 -07:00
Leonard Chan	474c873148	Revert "[llvm] cmake config groundwork to have ZSTD in LLVM" This reverts commit `f07caf20b9` which seems to break upstream https://lab.llvm.org/buildbot/#/builders/109/builds/42253.	2022-07-08 13:48:05 -07:00
Cole Kissane	f07caf20b9	[llvm] cmake config groundwork to have ZSTD in LLVM - added `FindZSTD.cmake` - added a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - likewise added have_zstd to compiler-rt/test/lit.common.cfg.py, clang-tools-extra/clangd/test/lit.cfg.py, and several lit.site.cfg.py.in files mirroring have_zlib behavior Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-08 11:46:52 -07:00
Cole Kissane	ea61750c35	[NFC] Refactor llvm::zlib namespace * Refactor compression namespaces across the project, making way for a possible introduction of alternatives to zlib compression. Changes are as follows: * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`. Reviewed By: MaskRay, leonardchan, phosek Differential Revision: https://reviews.llvm.org/D128953	2022-07-08 11:19:07 -07:00
Fangrui Song	75e551e5d8	[ELF] Relax R_RISCV_CALL and R_RISCV_CALL_PLT A pair of auipc+jalr relocated by R_RISCV_CALL or R_RISCV_CALL_PLT can be converted to c.j, c.jal, or jal. * c.j: RVC and displacement is representable as an int12 * c.jal: RV32C and displacement is representable as an int12 * jal: displacement is representable as an int21 Use the D127581 relaxation framework to implement the relaxation. If a shorter sequence is satisfied, we record the new relocation type in `relocTypes` and save the new instruction into `writes`. Finally let `riscvFinalizeRelax` rewrite the instruction by setting `skip`. Differential Revision: https://reviews.llvm.org/D127611	2022-07-07 10:18:45 -07:00
Fangrui Song	6611d58f5b	[ELF] Relax R_RISCV_ALIGN Alternative to D125036. Implement R_RISCV_ALIGN relaxation so that we can handle -mrelax object files (i.e. -mno-relax is no longer needed) and create a framework for future relaxation. `relaxAux` is placed in a union with InputSectionBase::jumpInstrMod, storing auxiliary information for relaxation. In the first pass, `relaxAux` is allocated. The main data structure is `relocDeltas`: when referencing `relocations[i]`, the actual offset is `r_offset - (i ? relocDeltas[i-1] : 0)`. `relaxOnce` performs one relaxation pass. It computes `relocDeltas` for all text sections. Then, adjust st_value/st_size for symbols relative to this section based on `SymbolAnchor`. `bytesDropped` is set so that `assignAddresses` knows that the size has changed. Run `relaxOnce` in the `finalizeAddressDependentContent` loop to wait for convergence of text sections and other address dependent sections (e.g. SHT_RELR). Note: extracting `relaxOnce` into a separate loop works for many cases but has issues in some linker script edge cases. After convergence, compute section contents: shrink the NOP sequence of each R_RISCV_ALIGN as appropriate. Instead of deleting bytes, we run a sequence of memcpy on the content delimited by relocation locations. For R_RISCV_ALIGN let the next memcpy skip the desired number of bytes. Section content computation is parallelizable, but let's ensure the implementation is mature before optimizations. Technically we can save a copy if we interleave some code with `OutputSection::writeTo`, but let's not pollute the generic code (we don't have templated relocation resolving, so using conditions can impose overhead to non-RISCV.) Tested: `make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- LLVM=1 defconfig all` built Linux kernel using -mrelax is bootable. FreeBSD RISCV64 system using -mrelax is bootable. bash/curl/firefox/libevent/vim/tmux using -mrelax works. Differential Revision: https://reviews.llvm.org/D127581	2022-07-07 10:16:09 -07:00
Tim Northover	0f4339a835	lld test fix: don't check the precise hex emitted as a comment. It can vary depending on the platform, so as with the NO-FMA test just check for "0x".	2022-07-07 13:25:24 +01:00


15438 Commits