llvm-project

Commit Graph

Author	SHA1	Message	Date
Luqman Aden	32a4ad3b6c	[LLD] Set alignment as part of Characteristics in TLS table. Fixes https://bugs.llvm.org/show_bug.cgi?id=46473 LLD wasn't previously specifying any specific alignment in the TLS table's Characteristics field so the loader would just assume the default value (16 bytes). This works most of the time except if you have thread locals that want specific higher alignments (e.g. 32 as in the bug) even if they specify an alignment on the thread local. This change updates LLD to take the max alignment from tls section. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D88637	2020-10-14 19:41:03 -07:00
Luqman Aden	7fe13af676	Nit: Use early return to reduce indentation.	2020-10-14 19:34:32 -07:00
Luqman Aden	f80950a8bb	Update tests.	2020-10-14 19:34:32 -07:00
Luqman Aden	8b70d527d7	[LLD] Set alignment as part of Characteristics in TLS table. Differential Revision: https://reviews.llvm.org/D88637	2020-10-14 19:34:31 -07:00
Konstantin Zhuravlyov	3fdf3b1539	AMDGPU: Update AMDHSA code object version handling Differential Revision: https://reviews.llvm.org/D89076	2020-10-14 13:04:27 -04:00
jasonliu	f85bcc21dd	[AIX] Turn -fdata-sections on by default in Clang Summary: This patch does the following: 1. Make InitTargetOptionsFromCodeGenFlags() accept Triple as a parameter, because some options' default values are triple-dependent. 2. DataSections is turned on by default on AIX for llc. 3. Test cases change accordingly because of the default behaviour change. 4. Clang Driver passes in -fdata-sections by default on AIX. Reviewed By: MaskRay, DiggerLin Differential Revision: https://reviews.llvm.org/D88737	2020-10-14 15:58:31 +00:00
Luqman Aden	dc128e5968	[test][lld] Mark TLS tests as REQUIRES: x86. Fixes http://lab.llvm.org:8011/#/builders/119/builds/92	2020-10-14 00:29:06 -07:00
Luqman Aden	6b7738e204	[LLD] Add baseline test for TLS alignment. NFC. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D88646	2020-10-13 20:53:32 -07:00
Alexandre Ganea	617d64f6c5	Re-land [ThinLTO] Re-order modules for optimal multi-threaded processing This reverts `9b5b305023` and fixes the unwanted re-ordering when generating ThinLTO indexes. The goal of this patch is to better balance thread utilization during ThinLTO in-process linking (in llvm-lto2 or in LLD). Before this patch, large modules would often be scheduled late during execution, taking a long time to complete, thus starving the thread pool. We now sort modules in descending order, based on each module's bitcode size, so that larger modules are processed first. By doing so, smaller modules have a better chance to keep the thread pool active, and thus avoid starvation when the bitcode compilation is almost complete. In our case (on dual Intel Xeon Gold 6140, Windows 10 version 2004, two-stage build), this saves 15 sec when linking `clang.exe` with LLD & -flto=thin, /opt:lldltojobs=all, no ThinLTO cache, -DLLVM_INTEGRATED_CRT_ALLOC=d:\git\rpmalloc. Before patch: 100 sec After patch: 85 sec Inspired by the work done by David Callahan in D60495. Differential Revision: https://reviews.llvm.org/D87966	2020-10-13 21:54:15 -04:00
Andrew Paverd	cfd8481da1	Reland [CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-10-13 13:20:52 -07:00
Konstantin Zhuravlyov	f218652a36	LLD/AMDGPU: Infer os abi based on input llvm bitcode Differential Revision: https://reviews.llvm.org/D89042	2020-10-13 12:20:28 -04:00
Paulo Matos	388fb67b0d	[WebAssembly] Added .tabletype to asm and multiple table support in obj files Adds more testing in basic-assembly.s and a new test tables.s. Adds support to yaml reading and writing of tables as well. Differential Revision: https://reviews.llvm.org/D88815	2020-10-13 07:52:23 -07:00
Sam Clegg	b3b4cda104	[lld][WebAssembly] Don't GC library objects under `--whole-archive` Followup on https://reviews.llvm.org/D85062 which ignores entire library objects when no symbols are used within them. This shouldn't apply with `--whole-archive`, since that flag is specified to treat archive members like direct object inputs. Differential Revision: https://reviews.llvm.org/D89290	2020-10-12 21:19:19 -07:00
Dan Gohman	950ae43091	[WebAssembly] GC constructor functions in otherwise unused archive objects This allows `__wasilibc_populate_libpreopen` to be GC'd in more cases where it isn't needed, including when linked from Rust's libstd. Differential Revision: https://reviews.llvm.org/D85062	2020-10-12 18:54:57 -07:00
Sam Clegg	2513407d39	[lld][WebAssembly] Add support for -Bsymbolic flag This flag works in a similar way to the ELF linker in that it will resolve any defined symbols to their local definition with a shared library or -pie executable. This flag has no effect on static linking. Differential Revision: https://reviews.llvm.org/D89152	2020-10-12 17:25:04 -07:00
Martin Storsjö	d77d727339	[LLD] [COFF] Fix a ubsan error in pdb-type-server-missing.yaml This error has been present since `5519e4da83`. Differential Revision: https://reviews.llvm.org/D89027	2020-10-12 23:28:23 +03:00
Christian Iversen	a9cefc3dee	[ELF] Fix broken bitstream linking with lld when e_machine > 255 In ELF/InputFiles.cpp, getBitcodeMachineKind() is limited to uint8_t return type. This works as long as EM_xxx is < 256, which is true for common architectures, but not for some newly assigned or unofficial EM_* values. The corresponding ELF field (e_machine) can hold uint16_t. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D89185	2020-10-11 14:19:25 -07:00
Martin Storsjö	1dbfd87319	[LLD] [ELF] Fix the help listing for the wrap option. NFC. This option just takes a single symbol name per invocation of the option. Differential Revision: https://reviews.llvm.org/D89007	2020-10-09 15:32:00 +03:00
Fangrui Song	db1988f038	[ELF] Don't change binding to STB_WEAK for an undefined specified by -u Similar to D66992. In GNU ld, a -u specified symbol is a STB_DEFAULT undefined. It cannot be changed to STB_WEAK by a later STB_WEAK undefined in a regular object file. The behavior is consistent with our model because -u means "we need to fetch a lazy definition". It should not be altered just because there is also a STB_WEAK undefined. Note, our -u semantics are still different from GNU ld (https://github.com/ClangBuiltLinux/linux/issues/515): we don't force the specified symbol to appear in .symtab This is a deliberate decision. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D88945	2020-10-08 08:31:34 -07:00
Mateusz Mikuła	9b58b0c06e	[LLD] Ignore ELF tests when ld.lld defaults to MinGW Follow-up to D87418. Differential Revision: https://reviews.llvm.org/D88991	2020-10-08 09:34:46 +03:00
Martin Storsjö	9b2b32743d	[LLD] [ELF] Fix up a comment regarding the --wrap option. NFC. Add missing leading underscores to the __wrap_<symbol> and __real_<symbol> names. Differential Revision: https://reviews.llvm.org/D89008	2020-10-08 09:33:23 +03:00
Martin Storsjö	6e6a5acf00	[LLD] [MinGW] Move an option definition to alphabetical order, wrap a line. NFC.	2020-10-07 15:14:07 +03:00
Martin Storsjö	61e2f9fa2e	[LLD] [MinGW] Support setting the subsystem version via the subsystem argument If a version is specified both with --{major,minor}-subsystem-version and with --subsystem <name>:<version>, the one specified last (that actually sets a version) takes precedence in GNU ld; thus doing the same here. Differential Revision: https://reviews.llvm.org/D88804	2020-10-05 23:08:08 +03:00
Martin Storsjö	bc8f3b424c	[LLD] [MinGW] Simplify handling of os/subsystem version As they can be set independently after D88802, we can get rid of a bit of extra code - simplifying the logic here before adding more complication to it later. Differential Revision: https://reviews.llvm.org/D88803	2020-10-05 23:08:02 +03:00
Martin Storsjö	45c4c54003	[LLD] [COFF] Add a private option for setting the os version separately from subsystem version The MinGW driver has separate options for OS and subsystem version. Having this available in lld-link allows the MinGW driver to both match GNU ld better and simplifies the code for merging two (potentially mismatching) arguments into one. Differential Revision: https://reviews.llvm.org/D88802	2020-10-05 23:08:01 +03:00
Martin Storsjö	19e86336ef	[LLD] [COFF] Fix parsing version numbers with leading zeros Parse the components as decimal, instead of deducing the base from the string. This avoids ambiguity if the second number contains leading zeros, which previously were parsed as indicating an octal number. MS link.exe doesn't support hexadecimal numbers in the version numbers, neither in /version nor in /subsystem. Differential Revision: https://reviews.llvm.org/D88801	2020-10-05 23:08:00 +03:00
Alexandre Ganea	fe1f0a1a19	[LLD] Fix /time formatting for very long runs. NFC.	2020-10-02 09:53:43 -04:00
Alexandre Ganea	55b97a6d2a	[LLD][COFF] Add more type record information to /summary This adds the following two new lines to /summary: 21351 Input OBJ files (expanded from all cmd-line inputs) 61 PDB type server dependencies 38 Precomp OBJ dependencies 1420669231 Input type records <<<< 78665073382 Input type records bytes <<<< 8801393 Merged TPI records 3177158 Merged IPI records 59194 Output PDB strings 71576766 Global symbol records 25416935 Module symbol records 2103431 Public symbol records Differential Revision: https://reviews.llvm.org/D88703	2020-10-02 09:36:11 -04:00
Alexandre Ganea	4140f0744f	[LLD][COFF] Fix crash with /summary and PCH input files Before this patch /summary was crashing with some .PCH.OBJ files, because tpiMap[srcIdx++] was reading at the wrong location. When the TpiSource depends on a .PCH.OBJ file, the types should be offset by the previously merged PCH.OBJ set of indices. Differential Revision: https://reviews.llvm.org/D88678	2020-10-01 17:08:35 -04:00
Fangrui Song	88f2fe5cad	Reland D87318 [LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic. The compiler will produce code that looks like: ``` pla r3, x@got@tlsgd@pcrel R_PPC64_GOT_TLSGD_PCREL34 bl __tls_get_addr@notoc(x@tlsgd) R_PPC64_TLSGD R_PPC64_REL24_NOTOC ``` LLD should be able to correctly compute the relocation for R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible: General Dynamic to Local Exec: ``` paddi r3, r13, x@tprel nop ``` and General Dynamic to Initial Exec: ``` pld r3, x@got@tprel@pcrel add r3, r3, r13 ``` Note: This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic. The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence. Reviewed By: NeHuang, sfertile, MaskRay Differential Revision: https://reviews.llvm.org/D87318	2020-10-01 12:36:33 -07:00
Reid Kleckner	5d46d7e8b2	[PDB] Use one func id DenseMap instead of per-source maps, NFC This avoids some DenseMap copies when /Zi is in use, and results in fewer data structures. Differential Revision: https://reviews.llvm.org/D88617	2020-10-01 12:22:27 -07:00
Arthur Eubanks	499260c03b	Revert "[CFGuard] Add address-taken IAT tables and delay-load support" This reverts commit `ef4e971e5e`.	2020-10-01 11:29:54 -07:00
Stefan Pintilie	5f3e565f59	Revert "[LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic" This reverts commit `79122868f9`.	2020-10-01 13:28:35 -05:00
Stefan Pintilie	79122868f9	[LLD][PowerPC] Add support for R_PPC64_GOT_TLSGD_PCREL34 used in TLS General Dynamic Add Thread Local Storage support for the 34 bit relocation R_PPC64_GOT_TLSGD_PCREL34 used in General Dynamic. The compiler will produce code that looks like: ``` pla r3, x@got@tlsgd@pcrel R_PPC64_GOT_TLSGD_PCREL34 bl __tls_get_addr@notoc(x@tlsgd) R_PPC64_TLSGD R_PPC64_REL24_NOTOC ``` LLD should be able to correctly compute the relocation for R_PPC64_GOT_TLSGD_PCREL34 as well as do the following two relaxations where possible: General Dynamic to Local Exec: ``` paddi r3, r13, x@tprel nop ``` and General Dynamic to Initial Exec: ``` pld r3, x@got@tprel@pcrel add r3, r3, r13 ``` Note: This patch adds support for the PC Relative (no TOC) version of General Dynamic on top of the existing support for the TOC version of General Dynamic. The ABI does not provide any way to tell by looking only at the relocation `R_PPC64_TLSGD` when it is being used in a TOC instruction sequence and when it is being used in a no TOC sequence. The TOC sequence should always be 4 byte aligned. This patch adds one to the offset of the relocation when it is being used in a no TOC sequence. In this way LLD can tell by looking at the alignment of the offset of `R_PPC64_TLSGD` whether or not it is being used as part of a TOC or no TOC sequence. Reviewed By: NeHuang, sfertile, MaskRay Differential Revision: https://reviews.llvm.org/D87318	2020-10-01 13:00:37 -05:00
James Henderson	a20168d030	[Archive] Don't throw away errors for malformed archive members When adding an archive member with a problem, e.g. a new bitcode with an old archiver, containing an unsupported attribute, or an ELF file with a malformed symbol table, the archiver would throw away the error and simply add the member to the archive without any symbol entries. This meant that the resultant archive could be silently unusable when not using --whole-archive, and result in unexpected undefined symbols. This change fixes this issue by addressing two FIXMEs and only throwing away not-an-object errors. However, this meant that some LLD tests which didn't need symbol tables and were using invalid members deliberately to test the linker's malformed input handling no longer worked, so this patch also stops the archiver from looking for symbols in an object if it doesn't require a symbol table, and updates the tests accordingly. Differential Revision: https://reviews.llvm.org/D88288 Reviewed by: grimar, rupprecht, MaskRay	2020-10-01 14:03:34 +01:00
Andrew Paverd	ef4e971e5e	[CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-10-01 12:45:07 +01:00
Fangrui Song	4e9277eda1	[ELF] --wrap: don't unnecessarily expose __real_ The routing rules are: sym -> __wrap_sym __real_sym -> sym __wrap_sym and sym are routing targets, so they need to be exposed to the symbol table. __real_sym is not and can be eliminated if not used by regular object.	2020-09-30 20:09:25 -07:00
Dan Gohman	6cd8511e59	[WebAssembly] New-style command support This adds support for new-style commands. In this mode, all exports are considered command entrypoints, and the linker inserts calls to `__wasm_call_ctors` and `__wasm_call_dtors` for all such entrypoints. This enables support for: - Command entrypoints taking arguments other than strings and return values other than `int`. - Multicall executables without relying on the use of string-based command-line arguments. This new behavior is disabled when the input has an explicit call to `__wasm_call_ctors`, indicating code not expecting new-style command support. This change does mean that wasm-ld no longer supports DCE-ing the `__wasm_call_ctors` function when there are no calls to it. If there are no calls to it, and there are ctors present, we assume it's wasm-ld's job to insert the calls. This seems ok though, because if there are ctors present, the program is expecting them to be called. This change affects the init-fini-gc.ll test.	2020-09-30 19:02:40 -07:00
Sam Clegg	3c45a06f26	[lld][WebAssembly] Allow exporting of mutable globals In particular allow explicit exporting of `__stack_pointer` but exclude this from `--export-all` to avoid requiring the mutable globals feature whenever `--export-all` is used. This uncovered a bug in populateTargetFeatures regarding checking if the mutable-globals feature is allowed. See: https://github.com/WebAssembly/binaryen/issues/2934 Differential Revision: https://reviews.llvm.org/D88506	2020-09-30 17:53:27 -07:00
Reid Kleckner	5519e4da83	Re-land "[PDB] Merge types in parallel when using ghashing" Stored Error objects have to be checked, even if they are success values. This reverts commit `8d250ac3cd`. Relands commit 49b3459930655d879b2dc190ff8fe11c38a8be5f.. Original commit message: ----------------------------------------- This makes type merging much faster (-24% on chrome.dll) when multiple threads are available, but it slightly increases the time to link (+10%) when /threads:1 is passed. With only one more thread, the new type merging is faster (-11%). The output PDB should be identical to what it was before this change. To give an idea, here is the /time output placed side by side: BEFORE \| AFTER Input File Reading: 956 ms \| 968 ms Code Layout: 258 ms \| 190 ms Commit Output File: 6 ms \| 7 ms PDB Emission (Cumulative): 6691 ms \| 4253 ms Add Objects: 4341 ms \| 2927 ms Type Merging: 2814 ms \| 1269 ms -55%! Symbol Merging: 1509 ms \| 1645 ms Publics Stream Layout: 111 ms \| 112 ms TPI Stream Layout: 764 ms \| 26 ms trivial Commit to Disk: 1322 ms \| 1036 ms -300ms ----------------------------------------- -------- Total Link Time: 8416 ms 5882 ms -30% overall The main source of the additional overhead in the single-threaded case is the need to iterate all .debug$T sections up front to check which type records should go in the IPI stream. See fillIsItemIndexFromDebugT. With changes to the .debug$H section, we could pre-calculate this info and eliminate the need to do this walk up front. That should restore single-threaded performance back to what it was before this change. This change will cause LLD to be much more parallel than it used to, and for users who do multiple links in parallel, it could regress performance. However, when the user is only doing one link, it's a huge improvement. 
In the future, we can use NT worker threads to avoid oversaturating the machine with work, but for now, this is such an improvement for the single-link use case that I think we should land this as is. Algorithm ---------- Before this change, we essentially used a DenseMap<GloballyHashedType, TypeIndex> to check if a type has already been seen, and if it hasn't been seen, insert it now and use the next available type index for it in the destination type stream. DenseMap does not support concurrent insertion, and even if it did, the linker must be deterministic: it cannot produce different PDBs by using different numbers of threads. The output type stream must be in the same order regardless of the order of hash table insertions. In order to create a hash table that supports concurrent insertion, the table cells must be small enough that they can be updated atomically. The algorithm I used for updating the table using linear probing is described in this paper, "Concurrent Hash Tables: Fast and General(?)!": https://dl.acm.org/doi/10.1145/3309206 The GHashCell in this change is essentially a pair of 32-bit integer indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the TpiSource object, and it represents an input type stream. The typeIndex is the index of the type in the stream. Together, we have something like a ragged 2D array of ghashes, which can be looked up as: tpiSources[tpiSrcIndex]->ghashes[typeIndex] By using these side tables, we can omit the key data from the hash table, and keep the table cell small. There is a cost to this: resolving hash table collisions requires many more loads than simply looking at the key in the same cache line as the insertion position. However, most supported platforms should have a 64-bit CAS operation to update the cell atomically. To make the result of concurrent insertion deterministic, the cell payloads must have a priority function. 
Defining one is pretty straightforward: compare the two 32-bit numbers as a combined 64-bit number. This means that types coming from inputs earlier on the command line have a higher priority and are more likely to appear earlier in the final PDB type stream than types from an input appearing later on the link line. After table insertion, the non-empty cells in the table can be copied out of the main table and sorted by priority to determine the ordering of the final type index stream. At this point, item and type records must be separated, either by sorting or by splitting into two arrays, and I chose sorting. This is why the GHashCell must contain the isItem bit. Once the final PDB TPI stream ordering is known, we need to compute a mapping from source type index to PDB type index. To avoid starting over from scratch and looking up every type again by its ghash, we save the insertion position of every hash table insertion during the first insertion phase. Because the table does not support rehashing, the insertion position is stable. Using the array of insertion positions indexed by source type index, we can replace the source type indices in the ghash table cells with the PDB type indices. Once the table cells have been updated to contain PDB type indices, the mapping for each type source can be computed in parallel. Simply iterate the list of cell positions and replace them with the PDB type index, since the insertion positions are no longer needed. Once we have a source to destination type index mapping for every type source, there are no more data dependencies. We know which type records are "unique" (not duplicates), and what their final type indices will be. We can do the remapping in parallel, and accumulate type sizes and type hashes in parallel by type source. Lastly, TPI stream layout must be done serially. Accumulate all the type records, sizes, and hashes, and add them to the PDB. 
Differential Revision: https://reviews.llvm.org/D87805	2020-09-30 15:44:38 -07:00
Reid Kleckner	8d250ac3cd	Revert "[PDB] Merge types in parallel when using ghashing" This reverts commit `49b3459930`.	2020-09-30 14:55:32 -07:00
Reid Kleckner	49b3459930	[PDB] Merge types in parallel when using ghashing This makes type merging much faster (-24% on chrome.dll) when multiple threads are available, but it slightly increases the time to link (+10%) when /threads:1 is passed. With only one more thread, the new type merging is faster (-11%). The output PDB should be identical to what it was before this change. To give an idea, here is the /time output placed side by side: BEFORE \| AFTER Input File Reading: 956 ms \| 968 ms Code Layout: 258 ms \| 190 ms Commit Output File: 6 ms \| 7 ms PDB Emission (Cumulative): 6691 ms \| 4253 ms Add Objects: 4341 ms \| 2927 ms Type Merging: 2814 ms \| 1269 ms -55%! Symbol Merging: 1509 ms \| 1645 ms Publics Stream Layout: 111 ms \| 112 ms TPI Stream Layout: 764 ms \| 26 ms trivial Commit to Disk: 1322 ms \| 1036 ms -300ms ----------------------------------------- -------- Total Link Time: 8416 ms 5882 ms -30% overall The main source of the additional overhead in the single-threaded case is the need to iterate all .debug$T sections up front to check which type records should go in the IPI stream. See fillIsItemIndexFromDebugT. With changes to the .debug$H section, we could pre-calculate this info and eliminate the need to do this walk up front. That should restore single-threaded performance back to what it was before this change. This change will cause LLD to be much more parallel than it used to, and for users who do multiple links in parallel, it could regress performance. However, when the user is only doing one link, it's a huge improvement. In the future, we can use NT worker threads to avoid oversaturating the machine with work, but for now, this is such an improvement for the single-link use case that I think we should land this as is. 
Algorithm ---------- Before this change, we essentially used a DenseMap<GloballyHashedType, TypeIndex> to check if a type has already been seen, and if it hasn't been seen, insert it now and use the next available type index for it in the destination type stream. DenseMap does not support concurrent insertion, and even if it did, the linker must be deterministic: it cannot produce different PDBs by using different numbers of threads. The output type stream must be in the same order regardless of the order of hash table insertions. In order to create a hash table that supports concurrent insertion, the table cells must be small enough that they can be updated atomically. The algorithm I used for updating the table using linear probing is described in this paper, "Concurrent Hash Tables: Fast and General(?)!": https://dl.acm.org/doi/10.1145/3309206 The GHashCell in this change is essentially a pair of 32-bit integer indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the TpiSource object, and it represents an input type stream. The typeIndex is the index of the type in the stream. Together, we have something like a ragged 2D array of ghashes, which can be looked up as: tpiSources[tpiSrcIndex]->ghashes[typeIndex] By using these side tables, we can omit the key data from the hash table, and keep the table cell small. There is a cost to this: resolving hash table collisions requires many more loads than simply looking at the key in the same cache line as the insertion position. However, most supported platforms should have a 64-bit CAS operation to update the cell atomically. To make the result of concurrent insertion deterministic, the cell payloads must have a priority function. Defining one is pretty straightforward: compare the two 32-bit numbers as a combined 64-bit number. 
This means that types coming from inputs earlier on the command line have a higher priority and are more likely to appear earlier in the final PDB type stream than types from an input appearing later on the link line. After table insertion, the non-empty cells in the table can be copied out of the main table and sorted by priority to determine the ordering of the final type index stream. At this point, item and type records must be separated, either by sorting or by splitting into two arrays, and I chose sorting. This is why the GHashCell must contain the isItem bit. Once the final PDB TPI stream ordering is known, we need to compute a mapping from source type index to PDB type index. To avoid starting over from scratch and looking up every type again by its ghash, we save the insertion position of every hash table insertion during the first insertion phase. Because the table does not support rehashing, the insertion position is stable. Using the array of insertion positions indexed by source type index, we can replace the source type indices in the ghash table cells with the PDB type indices. Once the table cells have been updated to contain PDB type indices, the mapping for each type source can be computed in parallel. Simply iterate the list of cell positions and replace them with the PDB type index, since the insertion positions are no longer needed. Once we have a source to destination type index mapping for every type source, there are no more data dependencies. We know which type records are "unique" (not duplicates), and what their final type indices will be. We can do the remapping in parallel, and accumulate type sizes and type hashes in parallel by type source. Lastly, TPI stream layout must be done serially. Accumulate all the type records, sizes, and hashes, and add them to the PDB. Differential Revision: https://reviews.llvm.org/D87805	2020-09-30 14:22:48 -07:00
Fangrui Song	259bb61c11	[ELF] Fix multiple -mllvm after D70378 Fixes https://reviews.llvm.org/D70378#2299569 Multiple -mllvm is intended to be supported. We don't have a proper test for `-plugin-opt=-`. This patch adds the test as well. Differential Revision: https://reviews.llvm.org/D88461	2020-09-29 10:26:58 -07:00
Benjamin Kramer	b59dff4b16	[wasm] Move WasmTraits.h to BinaryFormat There's no dependency on Object in there and this avoids a cyclic dependency between libMC and libObject.	2020-09-28 22:07:28 +02:00
Fangrui Song	20e9c36c01	Internalize functions from various tools. NFC And internalize some classes if I noticed them:)	2020-09-26 15:57:13 -07:00
Jez Ng	2c2a749448	[lld-macho] Ignore a few more undocumented flags Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D88268	2020-09-25 11:28:37 -07:00
Jez Ng	643ec67a64	[lld-macho] Always include custom syslibroot when running tests This greatly reduces the amount of boilerplate in our tests. Reviewed By: #lld-macho, compnerd Differential Revision: https://reviews.llvm.org/D87960	2020-09-25 11:28:36 -07:00
Jez Ng	62a3f0c984	[lld-macho] Support absolute symbols They operate like Defined symbols but with no associated InputSection. Note that `ld64` seems to treat the weak definition flag like a no-op for absolute symbols, so I have replicated that behavior. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D87909	2020-09-25 11:28:35 -07:00
Jez Ng	c7c9776f77	[lld-macho] Allow the entry symbol to be dynamically bound Apparently this is used in real programs. I've handled this by reusing the logic we already have for branch (function call) relocations. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D87852	2020-09-25 11:28:33 -07:00
Jez Ng	f23f512691	[lld-macho] Support -bundle Not 100% sure but it appears that bundles are almost identical to dylibs, aside from the fact that they do not contain `LC_ID_DYLIB`. ld64's code seems to treat bundles and dylibs identically in most places. Supporting bundles allows us to run e.g. XCTests, as all test suites are compiled into bundles which get dynamically loaded by the `xctest` test runner. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D87856	2020-09-25 11:28:32 -07:00
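
Note on `617d64f6c5` (ThinLTO module ordering): the message describes sorting modules in descending order of bitcode size so that large modules are scheduled first. A minimal sketch of that ordering step follows. The function name `orderLargestFirst` and the plain `sizes` vector are assumptions made for illustration; they are not taken from the LLVM sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns module indices ordered so that the largest module comes first.
// sizes[i] is a hypothetical stand-in for the bitcode size of module i.
std::vector<std::size_t> orderLargestFirst(const std::vector<std::size_t> &sizes) {
  std::vector<std::size_t> order(sizes.size());
  std::iota(order.begin(), order.end(), 0); // 0, 1, 2, ...
  // stable_sort keeps the original relative order of equally sized modules,
  // so the resulting schedule does not depend on sort implementation details.
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) { return sizes[a] > sizes[b]; });
  return order;
}
```

For sizes `{10, 300, 20, 300}` this yields the index order `1, 3, 2, 0`. Returning indices rather than reordering the modules themselves leaves the original module numbering untouched, which matters when other outputs must keep the input order (the re-land message mentions fixing an unwanted re-ordering of ThinLTO indexes).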

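Note on `19e86336ef` (version numbers with leading zeros): the ambiguity described there can be reproduced with the C library alone. The sketch below is not LLD's parser; `parseVersionWithBase` is a hypothetical helper that uses `std::strtoul`, where base 0 means "guess the base from the prefix" and base 10 means "always decimal".

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <utility>

// Parses "major[.minor]" using the given strtoul base.
// base 0: a leading "0" selects octal, "0x" selects hexadecimal.
// base 10: digits are always read as decimal.
std::pair<std::uint32_t, std::uint32_t>
parseVersionWithBase(const std::string &s, int base) {
  std::size_t dot = s.find('.');
  std::string major = s.substr(0, dot); // whole string if there is no dot
  std::string minor = dot == std::string::npos ? "0" : s.substr(dot + 1);
  return {static_cast<std::uint32_t>(std::strtoul(major.c_str(), nullptr, base)),
          static_cast<std::uint32_t>(std::strtoul(minor.c_str(), nullptr, base))};
}
```

With the input `"6.010"`, base 0 gives a minor version of 8 (octal `010`), while base 10 gives the intended 10. Always parsing as decimal removes that surprise.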

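Note on `5519e4da83` (parallel ghash type merging): the message describes table cells that pack `<sourceIndex, typeIndex>` into 64 bits, are updated with a 64-bit CAS under linear probing, and use the packed value as a priority so that the result does not depend on thread timing. The following is a simplified sketch under stated assumptions, not the code from the patch: `GHashTableSketch`, `packCell`, the all-ones empty marker, and the plain 64-bit `Hash` type are invented here, and the `isItem` bit is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Cell = std::uint64_t; // packed <sourceIndex, typeIndex>
using Hash = std::uint64_t; // stand-in for a ghash value

constexpr Cell kEmpty = ~Cell{0}; // assumed empty marker; larger than any real cell

// Earlier sources produce smaller values, so "smaller" means "higher priority".
inline Cell packCell(std::uint32_t sourceIndex, std::uint32_t typeIndex) {
  return (Cell{sourceIndex} << 32) | typeIndex;
}

class GHashTableSketch {
public:
  // sideTables[source][type] holds the hash of each input record; the table
  // itself stores no keys, only packed indices into these side tables.
  GHashTableSketch(std::size_t capacity,
                   const std::vector<std::vector<Hash>> &sideTables)
      : cells(capacity), ghashes(sideTables) {
    assert(capacity > 0);
    for (auto &c : cells)
      c.store(kEmpty, std::memory_order_relaxed);
  }

  // Inserts one record and returns the slot holding its hash. If the hash is
  // already present, the slot keeps the cell with the smaller packed value.
  std::size_t insert(std::uint32_t sourceIndex, std::uint32_t typeIndex) {
    const Hash hash = ghashes[sourceIndex][typeIndex];
    const Cell newCell = packCell(sourceIndex, typeIndex);
    std::size_t slot = hash % cells.size();
    for (std::size_t probes = 0; probes < cells.size(); ++probes) {
      Cell old = cells[slot].load();
      for (;;) {
        if (old != kEmpty && hashOf(old) != hash)
          break; // slot belongs to a different hash: probe the next slot
        if (old != kEmpty && old <= newCell)
          return slot; // an equal or higher-priority duplicate is already stored
        // Slot is empty or holds a lower-priority duplicate: try to claim it.
        // On failure `old` is refreshed and the checks above run again.
        if (cells[slot].compare_exchange_weak(old, newCell))
          return slot;
      }
      slot = (slot + 1) % cells.size();
    }
    assert(false && "table is full; this sketch never rehashes");
    return cells.size();
  }

  Cell cellAt(std::size_t slot) const { return cells[slot].load(); }

private:
  Hash hashOf(Cell c) const {
    return ghashes[static_cast<std::uint32_t>(c >> 32)]
                  [static_cast<std::uint32_t>(c)];
  }

  std::vector<std::atomic<Cell>> cells;
  const std::vector<std::vector<Hash>> &ghashes;
};
```

Because a claimed slot is only ever overwritten by a cell with the same hash and a smaller packed value, each slot ends up holding the highest-priority duplicate regardless of insertion order. The returned slot number plays the role of the saved insertion position that the message describes, and it stays stable because the table is never rehashed.
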
13427 Commits