llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	72bedf46c7	[ELF] Inline InputSection::getParent. NFC Combined with the previous change, lld executable is ~2K smaller and some code paths using InputSection::getParent are more efficient. The fragmented headers lead to a design limitation that OutputSection has to be incomplete, so we cannot use static_cast.	2022-03-08 11:26:12 -08:00
Fangrui Song	6c814931bc	[ELF] Don't use multiple inheritance for OutputSection. NFC Add an OutputDesc class inheriting from SectionCommand. An OutputDesc wraps an OutputSection. This change allows InputSection::getParent to be inlined. Differential Revision: https://reviews.llvm.org/D120650	2022-03-08 11:23:42 -08:00
Jez Ng	ce2ae38124	[lld-macho] Deduplicate the `__objc_classrefs` section contents ld64 breaks down `__objc_classrefs` on a per-word level and deduplicates them. This greatly reduces the number of bind entries emitted (and therefore the amount of work `dyld` has to do at runtime). For chromium_framework, this change to LLD cuts the number of (non-lazy) binds from 912 to 190, getting us to parity with ld64 in this aspect. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D121053	2022-03-08 08:34:04 -05:00
Jez Ng	8ec1033933	[lld-macho] Deduplicate CFStrings during ICF `__cfstring` has embedded addends that foil ICF's hashing / equality checks. (We can ignore embedded addends when doing ICF because the same information gets recorded in our Reloc structs.) Therefore, in order to properly dedup CFStrings, we create a mutable copy of the CFString and zero out the embedded addends before performing any hashing / equality checks. (We did in fact have a partial implementation of CFString deduplication already. However, it only worked when the cstrings they point to are at identical offsets in their object files.) I anticipate this approach can be extended to other similar statically-allocated struct sections in the future. In addition, we previously treated all references with differing addends as unequal. This is not true when the references are to literals: different addends may point to the same literal in the output binary. In particular, `__cfstring` has such references to `__cstring`. I've adjusted ICF's `equalsConstant` logic accordingly, and I've added a few more tests to make sure the addend-comparison code path is adequately covered. Fixes https://github.com/llvm/llvm-project/issues/51281. Reviewed By: #lld-macho, Roger Differential Revision: https://reviews.llvm.org/D120137	2022-03-08 08:34:03 -05:00
Jez Ng	0405920c5f	Re-land [lld-macho][nfc] Don't use `stubsHelperIndex` in ICF hash Previous attempt was commit `112135e774` and reverted in `d86d431814`.	2022-03-07 16:58:00 -05:00
Nico Weber	d86d431814	Revert "[lld-macho][nfc] Don't use `stubsHelperIndex` in ICF hash" This reverts commit `112135e774`. Breaks lld/test/MachO/{icf.s,cfstring-dedup.s,invalid/cfstring.s}	2022-03-07 13:50:38 -05:00
Jez Ng	ad1c32e9b3	[lld-macho][nfc] Reduce size of icfEqClass hash ... from a `uint64_t` to a `uint32_t`. (LLD-ELF uses a `uint32_t` too.) About a 1.7% reduction in peak RSS when linking chromium_framework on my 3.2 GHz 16-Core Intel Xeon W Mac Pro, and no stat sig change in wall time. </Users/jezng/test2.sh ["before"]> </Users/jezng/test2.sh ["after"]> difference (95% CI) RSS 1003036672.000 ± 9891065.259 985539505.231 ± 10272748.749 [ -2.3% .. -1.2%] samples 27 26 base diff difference (95% CI) sys_time 1.277 ± 0.023 1.277 ± 0.024 [ -0.9% .. +0.9%] user_time 6.682 ± 0.046 6.598 ± 0.043 [ -1.6% .. -0.9%] wall_time 5.904 ± 0.062 5.895 ± 0.063 [ -0.7% .. +0.4%] samples 46 28 No appreciable change (~0.01%) in number of `equals` comparisons either: Before: ld64.lld: ICF needed 8 iterations ld64.lld: equalsConstant() called 701643 times ld64.lld: equalsVariable() called 3438526 times After: ld64.lld: ICF needed 8 iterations ld64.lld: equalsConstant() called 701729 times ld64.lld: equalsVariable() called 3438526 times Reviewed By: #lld-macho, MaskRay, thakis Differential Revision: https://reviews.llvm.org/D121052	2022-03-07 12:36:28 -05:00
Jez Ng	112135e774	[lld-macho][nfc] Don't use `stubsHelperIndex` in ICF hash The existing hashing of stubsHelperIndex has mostly been a no-op* for some time now (ever since we made ICF run before dylib symbols get their stubs indices assigned). I guess we could consider hashing the name + filename of the DylibSymbol instead, but I'm not sure the overhead's worth it... moreover, LLD/ELF only hashes their Defined symbols as well. *: Technically it does change the hash value since stubsHelperIndex is initialized to `UINT32_MAX` by default. But since all stubsHelperIndex values are the same when ICF runs, they don't add any useful information to the hash.	2022-03-07 12:36:28 -05:00
Jez Ng	7028799ca3	[lld-macho][nfc] Rename isec -> referentIsec to avoid shadowing I found the shadowing a bit confusing	2022-03-07 12:36:28 -05:00
Jez Ng	64cc719766	[lld-macho][nfc] Track # of ICF calls to `equals` methods This is debug code that is disabled by default. It'll provide an easy way to figure out the impact (if any) of tweaking ICF's hashing algorithm (since a poor quality hash will result in many more `equals` calls). Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D121051	2022-03-07 12:36:27 -05:00
Jez Ng	53e7eef43f	[lld-macho][nfc] Use llvm::function_ref instead of std::function	2022-03-07 12:36:27 -05:00
Jez Ng	c416f3fafd	[lld-macho][nfc] Remove file statics from ICF.cpp This gets us closer to the [LLD-as-a-library goal][1]. [1]: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D121050	2022-03-07 12:36:26 -05:00
Fangrui Song	a815424cc5	Reland D119909 [ELF] Parallelize initializeLocalSymbols ObjFile::parse combines symbol initialization and resolution. Many tasks unrelated to symbol resolution can be postponed and parallelized. This patch extracts local symbol initialization and parallelizes it. Technically the new function initializeLocalSymbols can be merged into ObjFile::postParse, but functions like getSrcMsg may access the uninitialized (all nullptr) local part of InputFile::symbols. Linking chrome: 1.02x as fast with glibc malloc, 1.04x as fast with mimalloc Depends on `f456c3ae3f` and D119908 Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D119909	2022-03-04 19:00:10 -08:00
Fangrui Song	f456c3ae3f	[ELF] Move addWrappedSymbols before postParseObjectFile addWrappedSymbols may trigger archive extraction: split stack implementation uses --wrap=pthread_create, which extracts libgcc.a(generic-morestack-thread.o). This fixes the regression caused by `09602d3b47` by making the invariant satisfied: no more non-compileBitcodeFiles object file is produced at postParseObjectFile.	2022-03-04 18:56:37 -08:00
Jorge Gorbe Moya	449b649fec	Revert "[ELF] Parallelize initializeLocalSymbols" This reverts commit `09602d3b47`.	2022-03-04 15:01:17 -08:00
Jez Ng	72c5b26f3d	[lld-macho][nfc] Use %X in mapfile test LLD (and ld64) emits uppercase hex addresses in the mapfile. The map-file.s test passes right now because the addresses we emit happen not to include any letters, but that can easily change. I noticed this while dealing with https://github.com/llvm/llvm-project/issues/54184. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D120941	2022-03-04 14:21:17 -05:00
Jez Ng	984197612c	[lld-macho][nfc] Rename some tests for consistency Now all the tests that cover symbol resolution / precedence have "resolution" in their filename. I also added a couple of extra comments. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D120938	2022-03-04 14:21:16 -05:00
Jez Ng	070af48d13	[lld-macho][nfc] Decouple tapi-link.s test from libSystem If we fix https://github.com/llvm/llvm-project/issues/54184, we will end up including libSystem in every %lld invocation, which would break tapi-link.s as it assumes that libSystem isn't directly linked (instead it goes through libReexportSystem). Let's remove this unnecessary coupling, as well as use `split-file` instead of having a separate file under `Inputs`. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D120939	2022-03-03 19:48:59 -05:00
Jez Ng	dd29597e10	[LTO] Initialize canAutoHide() using canBeOmittedFromSymbolTable() Per discussion on https://reviews.llvm.org/D59709#inline-1148734, this seems like the right course of action. `canBeOmittedFromSymbolTable()` subsumes and generalizes the previous logic. In addition to handling `linkonce_odr` `unnamed_addr` globals, we now also internalize `linkonce_odr` + `local_unnamed_addr` constants. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D120173	2022-03-03 19:04:11 -05:00
Jez Ng	5c268743da	[lld-macho][nfc] Use %lld-watchos substitution in bind-opcodes.s Previously, we were using a syslibroot that pointed to macos while linking against arch arm64_32, which didn't really make sense. It isn't currently an issue, but will be if we add the `-lSystem` as part of dealing with https://github.com/llvm/llvm-project/issues/54184.	2022-03-03 19:00:28 -05:00
Jez Ng	f7547558c9	[lld-macho][nfc] Avoid using absolute addresses in cgprofile-icf.s If we fix https://github.com/llvm/llvm-project/issues/54184, the `dyld_stub_binder` symbol will get included in every output dylib. This would cause the addresses of the other symbols to shift, breaking the test as it currently stands. Let's make the test more flexible. Reviewed By: lgrey Differential Revision: https://reviews.llvm.org/D120940	2022-03-03 19:00:28 -05:00
Martin Storsjö	4c3b74b7f5	[LLD] [COFF] Order .debug_* sections at the end, to avoid leaving gaps if stripped So far, we sort all discardable sections at the end, with only some extra logic to make sure that the .reloc section is at the start of that group of sections. But if there are other discardable sections, other than .reloc, they must also be ordered before .debug_* sections, to avoid leaving gaps if the executable is stripped. (Stripping executables doesn't remove all discardable sections, only the ones named .debug_*). Rust binaries seem to include a .rmeta section, which is marked discardable. This fixes stripping such binaries if built with dwarf debug info included. This fixes issues observed in MSYS2 in https://github.com/msys2/MINGW-packages/pull/10555. Differential Revision: https://reviews.llvm.org/D120805	2022-03-03 10:08:51 +02:00
Douglas Yung	e81e5d788c	Add "REQUIRES: x86" to test as it calls llc with an x86_64 triple.	2022-03-02 11:12:41 -08:00
Sam Clegg	1cf6ebc0e9	[lld][WebAssembly] Improve error reporting for bad ar archive members Show the name of the archive in the error message as well as the name of the object within it. Differential Revision: https://reviews.llvm.org/D120689	2022-03-01 15:21:53 -08:00
Zequan Wu	5c9e20d7d0	[PDB] Add char8_t type Differential Revision: https://reviews.llvm.org/D120690	2022-03-01 13:39:51 -08:00
Martin Storsjö	9ffeaaa0ea	[LLD] [COFF] Use StringTableBuilder to optimize the string table This does tail merging (and deduplication) of the strings. On a statically linked clang.exe, this shrinks the ~17 MB string table by around 0.5 MB. This adds ~160 ms to the linking time which originally was around 950 ms. For cases where `-debug:symtab` or `-debug:dwarf` isn't set, the string table is only used for long section names, where this shouldn't make any difference at all. Differential Revision: https://reviews.llvm.org/D120677	2022-03-01 18:44:03 +02:00
Martin Storsjö	9dd2d50984	[LLD] [COFF] Use the new encodeSectionName() helper for long section names The previous code used an unbounded sprintf, which in theory can overflow, writing either the null terminator or the last digits into the next struct member. In practice, in LLD, all long section names are written sequentially first at the start of the string table, followed by all the long symbol names. Due to this, even if the total string table would end up large, the long section names have fairly short offsets, which is why this hasn't been an issue in practice. I don't think it's worth trying to write a test that produces an executable with enough long section names to make the section names themselves exceed 10^6 bytes, which is currently necessary to trigger faults with the previous form. Differential Revision: https://reviews.llvm.org/D120676	2022-03-01 11:33:02 +02:00
Fangrui Song	87034ad2a4	[ELF] isKnownZFlag: move known literal flags to an array. NFC The chain of == comparisons is a bit unwieldy to update. While here, sort the entries alphabetically.	2022-02-28 23:23:33 -08:00
Jez Ng	a552fb2a86	[lld-macho] Have relocation address included in range-check error message This makes it easier to debug those errors. See e.g. https://github.com/llvm/llvm-project/issues/52767#issuecomment-1028713943 We take the approach of 'reverse-engineering' the InputSection from the output buffer offset. This provides for a cleaner Target API, and is similar to LLD-ELF's implementation of getErrorPlace(). Reviewed By: #lld-macho, Roger Differential Revision: https://reviews.llvm.org/D118903	2022-02-28 21:56:38 -05:00
Fangrui Song	9e9c86fd67	[ELF] Change some non-null pointer parameters to references. NFC To decrease difference for D120650. Also, rename some `OutputSection *sec` (and `cmd`) to the more common `osec`.	2022-02-28 11:19:00 -08:00
Fangrui Song	b07ef4d566	[ELF] Rename Symbol::compare to shouldReplace. NFC The return value is now a boolean instead of a tri-state. Suggested by Peter Smith in D120640.	2022-02-28 18:25:21 +00:00
Fangrui Song	8d01ac75e7	[ELF] Replace an unneeded dyn_cast_or_null with dyn_cast. NFC	2022-02-28 00:50:06 -08:00
Fangrui Song	fee78961f5	[ELF] Optimize SectionBase::Kind values to make isa<InputSection> more efficient. NFC Surprisingly my lld executable is 1.5KiB smaller.	2022-02-28 00:24:25 -08:00
Fangrui Song	bb3eeac773	[ELF] Make InputSection::classof inline. NFC	2022-02-28 00:16:45 -08:00
Fangrui Song	4976d1fe58	[ELF] Move SyntheticSection check from InputSection::writeTo to OutputSection::writeTo. NFC Simplify code and move the heavyweight operation to the call site so that it is clearer how to improve the inefficient scheduling in the future.	2022-02-27 23:28:52 -08:00
Fangrui Song	d07ff99591	[ELF] Enforce double-dash form --error-limit It's ld.lld specific and by convention we enforce the double-dash form to avoid collision with the short option -e (--entry).	2022-02-27 20:49:36 +00:00
Fangrui Song	87e6251d66	[ELF] Use --error-limit instead of -error-limit	2022-02-27 20:47:37 +00:00
Fangrui Song	d14d8664e3	[ELF] Change global variable backwardReferences to a LinkerDriver member variable. NFC Similar to whyExtract.	2022-02-27 20:33:28 +00:00
Fangrui Song	7fd3849b35	[ELF] Move --print-archive-stats= and --why-extract= beside --warn-backrefs report So that early errors don't suppress their output.	2022-02-27 20:23:09 +00:00
Fangrui Song	bd448f01a6	[ELF] BitcodeFile: resolve defined symbols before undefined symbols This ports D95985 for ELF relocatable object files to BitcodeFile.	2022-02-27 05:37:08 +00:00
Joao Moreira	9d7001eba9	[ELF][X86] Don't create IBT .plt if there is no PLT entry https://github.com/ClangBuiltLinux/linux/issues/1606 When GNU_PROPERTY_X86_FEATURE_1_IBT is enabled, ld.lld will create .plt output section even if there is no PLT entry. Fix this by implementing IBTPltSection::isNeeded instead of using the default code path (which always returns true). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D120600	2022-02-26 03:55:40 +00:00
Fangrui Song	767e64fc11	[ELF] Support some absolute/PC-relative relocation types for REL format ctfconvert seems to use REL-format `.rel.SUNW_dof` for 32-bit architectures. ``` Binary file usr/ports/lang/perl5.32/work/perl-5.32.1/dtrace_mini.o matches [alfredo.junior@dell-a ~/tmp/llvm-bug]$ readelf -r dtrace_mini.o Relocation section (.rel.SUNW_dof): r_offset r_info r_type st_value st_name 00000184 0000281a R_PPC_REL32 00000000 $dtrace1772974259.Perl_dtrace_probe_load ``` Support R_PPC_REL32 to fix `ld.lld: error: drti.c:(.SUNW_dof+0x4E4): internal linker error: cannot read addend for relocation R_PPC_REL32`. While here, add some common relocation types for AArch64, PPC, and PPC64. We perform minimum tests. Reviewed By: adalava, arichardson Differential Revision: https://reviews.llvm.org/D120535	2022-02-25 19:25:18 +00:00
Sam Clegg	4c75521ce0	[MC][WebAssembly] Fix crash when relocation addend underflows U32 For the object file writer we need to allow the underflow (ar write zero), but for the final linker output we should probably generate an error (I've left that as a TODO for now). Fixes: https://github.com/llvm/llvm-project/issues/54012 Differential Revision: https://reviews.llvm.org/D120522	2022-02-25 07:13:15 -08:00
Fangrui Song	09602d3b47	[ELF] Parallelize initializeLocalSymbols ObjFile::parse combines symbol initialization and resolution. Many tasks unrelated to symbol resolution can be postponed and parallelized. This patch extracts local symbol initialization and parallelizes it. Technically the new function initializeLocalSymbols can be merged into ObjFile::postParse, but functions like getSrcMsg may access the uninitialized (all nullptr) local part of InputFile::symbols. Linking chrome: 1.02x as fast with glibc malloc, 1.04x as fast with mimalloc Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D119909	2022-02-24 20:05:59 -08:00
Fangrui Song	19e37a7415	[ELF] Update comment. NFC	2022-02-24 14:09:00 -08:00
Fangrui Song	6d94340809	[ELF] Simplify resolveDefined and resolveCommon This is NFC for valid input (COMMON symbols cannot be weak or versioned).	2022-02-24 14:08:06 -08:00
Reid Kleckner	da11f17e90	[lld/MachO] Fix +asserts build after recent change	2022-02-24 13:12:48 -08:00
Fangrui Song	b6a71d9e12	[ELF][test] Remove invalid weak COMMON tests GNU as reports `Error: symbol `foo' can not be both weak and common`, though LLVM integrated assembler does not report an error yet.	2022-02-24 12:54:16 -08:00
Jez Ng	850592ec14	[lld-macho] Implement -why_live (without perf overhead) This was based off @thakis' draft in {D103517}. I employed templates to ensure the support for `-why_live` wouldn't slow down the regular non-why-live code path. No stat sig perf difference on my 3.2 GHz 16-Core Intel Xeon W: base diff difference (95% CI) sys_time 1.195 ± 0.015 1.199 ± 0.022 [ -0.4% .. +1.0%] user_time 3.716 ± 0.022 3.701 ± 0.025 [ -0.7% .. -0.1%] wall_time 4.606 ± 0.034 4.597 ± 0.046 [ -0.6% .. +0.2%] samples 44 37 Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D120377	2022-02-24 15:49:36 -05:00
Fangrui Song	15617cdb55	[ELF] Simplify --fortran-common. NFC	2022-02-24 12:21:40 -08:00

15144 Commits