llvm-project

Commit Graph

Author	SHA1	Message	Date
Martin Storsjö	3c6f8ca7c9	[lld] Rename StringRef _lower() method calls to _insensitive()	2021-06-25 00:22:01 +03:00
Reid Kleckner	8d84751ac4	Revert "[LLD] [COFF] Avoid doing repeated fuzzy symbol lookup for each iteration. NFC." This reverts commit `e1adf90826`. This appears to affect the way that C++ mangled symbols appear in the import library when using a .def file that names a C++ free function with no name decoration. I will follow up with a reduced test case shortly.	2021-06-22 11:35:14 -07:00
Martin Storsjö	e1adf90826	[LLD] [COFF] Avoid doing repeated fuzzy symbol lookup for each iteration. NFC. This is run every time around in the main linker loop. Once a match has been found, stop trying to rematch such a symbol. Not sure if this has any actual measurable performance impact though (SymbolTable::findMangle() iterates over the whole symbol table for each call and does fuzzy matching on top of that) but this makes the code more reassuring to read at least. (This is in practice run for def files listing undecorated stdcall functions to be exported.) Differential Revision: https://reviews.llvm.org/D104529	2021-06-19 22:32:37 +03:00
Reid Kleckner	109aac9212	[PDB] Enable parallel ghash type merging by default Ghashing is probably going to be faster in most cases, even without precomputed ghashes in object files. Here is my table of results linking clang.pdb: ------------------------------- \| threads \| GHASH \| NOGHASH \| ------------------------------- \| j1 \| 51.031s \| 25.141s \| \| j2 \| 31.079s \| 22.109s \| \| j4 \| 18.609s \| 23.156s \| \| j8 \| 11.938s \| 21.984s \| \| j28 \| 8.375s \| 18.391s \| ------------------------------- This shows that ghashing is faster if at least four cores are available. This may make the linker slower if most cores are busy in the middle of a build, but in that case, the linker probably isn't on the critical path of the build. Incremental build performance is arguably more important than highly contended batch build link performance. The -time output indicates that ghash computation is the dominant factor: Input File Reading: 924 ms ( 1.8%) GC: 689 ms ( 1.3%) ICF: 527 ms ( 1.0%) Code Layout: 414 ms ( 0.8%) Commit Output File: 24 ms ( 0.0%) PDB Emission (Cumulative): 49938 ms ( 94.8%) Add Objects: 46783 ms ( 88.8%) Global Type Hashing: 38983 ms ( 74.0%) GHash Type Merging: 5640 ms ( 10.7%) Symbol Merging: 2154 ms ( 4.1%) Publics Stream Layout: 188 ms ( 0.4%) TPI Stream Layout: 18 ms ( 0.0%) Commit to Disk: 2818 ms ( 5.4%) -------------------------------------------------- Total Link Time: 52669 ms (100.0%) We can speed that up with a faster content hash (not SHA1). Differential Revision: https://reviews.llvm.org/D102888	2021-05-27 14:19:36 -07:00
Martin Storsjö	33b71ec9c6	[LLD] [COFF] Fix automatic export of symbols from LTO objects Differential Revision: https://reviews.llvm.org/D101569	2021-05-21 00:36:58 +03:00
Martin Storsjö	7e0768329c	[LLD] [COFF] Fix including the personality function for DWARF EH when linking with --gc-sections Since `c579a5b1d9` we don't traverse .eh_frame when doing GC. But the exception handling personality function needs to be included, and is only referenced from within .eh_frame. Differential Revision: https://reviews.llvm.org/D102138	2021-05-12 22:23:01 +03:00
Alex Reinking	7ac3fcc526	Allow /STACK in #pragma comment(linker, ...) The Halide project uses `#pragma comment(linker, "/STACK:...")` to set the stack size high enough for our embedded compiler to run in end-user programs on Windows. Unfortunately, lld-link.exe breaks on this when embedded in a COFF object, despite supporting the flag on the command line. MSVC's link.exe supports this fine. This patch extends support for this to lld-link.exe for better compatibility with MSVC projects. Differential Revision: https://reviews.llvm.org/D99680	2021-05-05 16:00:33 -07:00
Martin Storsjö	82de4e0753	[LLD] [COFF] Actually include the exported comdat symbols This is a followup to 2b01a417d7ccb001ccc1185ef5fdc967c9fac8d7; previously the RVAs of the exported symbols from comdats were left zero. Thanks to Kleis Auke Wolthuizen for the fix suggestion and pointing out the omission. Differential Revision: https://reviews.llvm.org/D101615	2021-05-04 22:13:08 +03:00
Pengfei Wang	184377da5c	[LLD] Implement /guard:[no]ehcont Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99078	2021-04-14 15:06:49 +08:00
Abhina Sreeskantharajan	c83cd8feef	[NFC] Reordering parameters in getFile and getFileOrSTDIN In future patches I will be setting the IsText parameter frequently so I will refactor the args to be in the following order. I have removed the FileSize parameter because it is never used. ``` static ErrorOr<std::unique_ptr<MemoryBuffer>> getFile(const Twine &Filename, bool IsText = false, bool RequiresNullTerminator = true, bool IsVolatile = false); static ErrorOr<std::unique_ptr<MemoryBuffer>> getFileOrSTDIN(const Twine &Filename, bool IsText = false, bool RequiresNullTerminator = true); static ErrorOr<std::unique_ptr<MB>> getFileAux(const Twine &Filename, uint64_t MapSize, uint64_t Offset, bool IsText, bool RequiresNullTerminator, bool IsVolatile); static ErrorOr<std::unique_ptr<WritableMemoryBuffer>> getFile(const Twine &Filename, bool IsVolatile = false); ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D99182	2021-03-25 09:47:49 -04:00
Yolanda Chen	4f9c61ef72	[lld] add context-sensitive PGO options for COFF. Add lld CSPGO (Context-Sensitive PGO) options for COFF target. Reference the ELF options from https://reviews.llvm.org/D56675 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D98763	2021-03-24 23:40:09 -07:00
Zequan Wu	5bdc5e7efd	[lld-link] Add safe icf mode to lld-link, which does safe icf for all sections. Differential Revision: https://reviews.llvm.org/D97436	2021-03-03 14:52:33 -08:00
Martin Storsjö	075539ddf6	[LLD] [COFF] Allow invoking lib.exe mode via -lib in addition to /lib Remove a stray -lib argument in guardcf-lto.ll; llvm-lib doesn't support generating import libs from a def file unlike lib.exe. Previously this worked because the -lib argument was ignored (printing only a warning). Differential Revision: https://reviews.llvm.org/D96699	2021-02-24 11:16:12 +02:00
Nico Weber	e6d1f261a5	[lld-link] Add /reproduce: support for several flags /reproduce: now works correctly with: - /call-graph-ordering-file: - /def: - /natvis: - /order: - /pdbstream: I went through all instances of MemoryBuffer::getFile() and made sure everything that didn't already do so called takeBuffer(). For natvis, that wasn't possible since DebugInfo/PDB wants to take ownership of the natvis buffer. For that case, I'm manually adding the tar file entry. /natvis: and /pdbstream: is slightly awkward, since createResponseFile() always adds these flags to the response file but createPDB() (which ultimately adds the files referenced by the flags) is only called if /debug is also passed. So when using /natvis: without /debug with /reproduce:, lld won't warn, but when linking using the response file from the archive, it won't find the natvis file since it's not in the tar. This isn't a new issue though, and after this patch things at least work with using /natvis: _with_ debug with /reproduce:. (Same for /pdbstream:) Differential Revision: https://reviews.llvm.org/D97212	2021-02-22 16:52:49 -05:00
Reshabh Sharma	fdd6ed8e93	[LLD] Rename lld port driver entry function to a consistent name Libraries linked to the lld elf library exposes a function named main. When debugging code linked to such libraries and intending to set a breakpoint at main, the debugger also sets breakpoint at the main function at lld elf driver. The possible choice was to rename it to link but that would again clash with lld::*::link. This patch tries to consistently rename them to linkerMain. Differential Revision: https://reviews.llvm.org/D91418	2020-12-18 12:18:37 +05:30
Nico Weber	f710bb7063	lld: Replace some lld::outs()s with message() No behavior change.	2020-12-17 16:19:09 -05:00
Arthur Eubanks	fed7565ee2	[COFF][LTO][NPM] Use NPM for LTO with ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER Reviewed By: hans Differential Revision: https://reviews.llvm.org/D92866	2020-12-09 08:53:50 -08:00
Nico Weber	a0994cbe27	lld-link: Let LLD_REPRODUCE control /reproduce:, like in ld.lld Also sync help texts for the option between elf and coff ports. Decisions: - Do this even if /lldignoreenv is passed. /reproduce: does not affect the main output, and this makes the env var more convenient to use. (On the other hand, it's now possible to set this env var and forget about it, and all future builds in the same shell will be much slower. That's true for ld.lld, but posix shells have an easy way to set an env var for a single command; in cmd.exe this is not possible without contortions. Then again, lld-link runs in posix shells too.) Original patch rebased across D68378 and D68381. Differential Revision: https://reviews.llvm.org/D67707	2020-11-27 13:33:55 -05:00
rojamd	b79e990f40	[lld][COFF] Add command line options for LTO with new pass manager This is more or less a port of rL329598 (D45275) to the COFF linker. Since there were already LTO-related settings under -opt:, I added them there instead of new flags. Differential Revision: https://reviews.llvm.org/D90624	2020-11-05 14:41:35 -05:00
Martin Storsjö	3785a413fe	Reapply [LLD] [COFF] Implement a GNU/ELF like -wrap option Add a simple forwarding option in the MinGW frontend, and implement the private -wrap option in the COFF linker. The feature in lld-link isn't gated by the -lldmingw option, but the option is left as a private, undocumented option primarily used by the MinGW driver. The implementation is significantly based on the support for --wrap in the ELF linker, but many small nuance details are different between the ELF and COFF linkers, ending up with more than a few implementation differences. This fixes https://bugs.llvm.org/show_bug.cgi?id=47384. Differential Revision: https://reviews.llvm.org/D89004 Reapplied with the bitfield member canInline fixed so it doesn't break builds targeting windows.	2020-10-15 22:14:02 +03:00
Arthur Eubanks	3d338f6813	Revert "[LLD] [COFF] Implement a GNU/ELF like -wrap option" This reverts commit `a012c704b5`. Breaks Windows builds. C:\src\llvm-mint\lld\COFF\Symbols.cpp(26,1): error: static_assert failed due to requirement 'sizeof(lld::coff::SymbolUnion) <= 48' "symbols should be optimized for memory usage" static_assert(sizeof(SymbolUnion) <= 48,	2020-10-15 10:27:25 -07:00
Martin Storsjö	a012c704b5	[LLD] [COFF] Implement a GNU/ELF like -wrap option Add a simple forwarding option in the MinGW frontend, and implement the private -wrap option in the COFF linker. The feature in lld-link isn't gated by the -lldmingw option, but the option is left as a private, undocumented option primarily used by the MinGW driver. The implementation is significantly based on the support for --wrap in the ELF linker, but many small nuance details are different between the ELF and COFF linkers, ending up with more than a few implementation differences. This fixes https://bugs.llvm.org/show_bug.cgi?id=47384. Differential Revision: https://reviews.llvm.org/D89004	2020-10-15 18:34:02 +03:00
Martin Storsjö	45c4c54003	[LLD] [COFF] Add a private option for setting the os version separately from subsystem version The MinGW driver has separate options for OS and subsystem version. Having this available in lld-link allows the MinGW driver to both match GNU ld better and simplifies the code for merging two (potentially mismatching) arguments into one. Differential Revision: https://reviews.llvm.org/D88802	2020-10-05 23:08:01 +03:00
Reid Kleckner	5519e4da83	Re-land "[PDB] Merge types in parallel when using ghashing" Stored Error objects have to be checked, even if they are success values. This reverts commit `8d250ac3cd`. Relands commit 49b3459930655d879b2dc190ff8fe11c38a8be5f.. Original commit message: ----------------------------------------- This makes type merging much faster (-24% on chrome.dll) when multiple threads are available, but it slightly increases the time to link (+10%) when /threads:1 is passed. With only one more thread, the new type merging is faster (-11%). The output PDB should be identical to what it was before this change. To give an idea, here is the /time output placed side by side: BEFORE \| AFTER Input File Reading: 956 ms \| 968 ms Code Layout: 258 ms \| 190 ms Commit Output File: 6 ms \| 7 ms PDB Emission (Cumulative): 6691 ms \| 4253 ms Add Objects: 4341 ms \| 2927 ms Type Merging: 2814 ms \| 1269 ms -55%! Symbol Merging: 1509 ms \| 1645 ms Publics Stream Layout: 111 ms \| 112 ms TPI Stream Layout: 764 ms \| 26 ms trivial Commit to Disk: 1322 ms \| 1036 ms -300ms ----------------------------------------- -------- Total Link Time: 8416 ms 5882 ms -30% overall The main source of the additional overhead in the single-threaded case is the need to iterate all .debug$T sections up front to check which type records should go in the IPI stream. See fillIsItemIndexFromDebugT. With changes to the .debug$H section, we could pre-calculate this info and eliminate the need to do this walk up front. That should restore single-threaded performance back to what it was before this change. This change will cause LLD to be much more parallel than it used to, and for users who do multiple links in parallel, it could regress performance. However, when the user is only doing one link, it's a huge improvement. 
In the future, we can use NT worker threads to avoid oversaturating the machine with work, but for now, this is such an improvement for the single-link use case that I think we should land this as is. Algorithm ---------- Before this change, we essentially used a DenseMap<GloballyHashedType, TypeIndex> to check if a type has already been seen, and if it hasn't been seen, insert it now and use the next available type index for it in the destination type stream. DenseMap does not support concurrent insertion, and even if it did, the linker must be deterministic: it cannot produce different PDBs by using different numbers of threads. The output type stream must be in the same order regardless of the order of hash table insertions. In order to create a hash table that supports concurrent insertion, the table cells must be small enough that they can be updated atomically. The algorithm I used for updating the table using linear probing is described in this paper, "Concurrent Hash Tables: Fast and General(?)!": https://dl.acm.org/doi/10.1145/3309206 The GHashCell in this change is essentially a pair of 32-bit integer indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the TpiSource object, and it represents an input type stream. The typeIndex is the index of the type in the stream. Together, we have something like a ragged 2D array of ghashes, which can be looked up as: tpiSources[tpiSrcIndex]->ghashes[typeIndex] By using these side tables, we can omit the key data from the hash table, and keep the table cell small. There is a cost to this: resolving hash table collisions requires many more loads than simply looking at the key in the same cache line as the insertion position. However, most supported platforms should have a 64-bit CAS operation to update the cell atomically. To make the result of concurrent insertion deterministic, the cell payloads must have a priority function. 
Defining one is pretty straightforward: compare the two 32-bit numbers as a combined 64-bit number. This means that types coming from inputs earlier on the command line have a higher priority and are more likely to appear earlier in the final PDB type stream than types from an input appearing later on the link line. After table insertion, the non-empty cells in the table can be copied out of the main table and sorted by priority to determine the ordering of the final type index stream. At this point, item and type records must be separated, either by sorting or by splitting into two arrays, and I chose sorting. This is why the GHashCell must contain the isItem bit. Once the final PDB TPI stream ordering is known, we need to compute a mapping from source type index to PDB type index. To avoid starting over from scratch and looking up every type again by its ghash, we save the insertion position of every hash table insertion during the first insertion phase. Because the table does not support rehashing, the insertion position is stable. Using the array of insertion positions indexed by source type index, we can replace the source type indices in the ghash table cells with the PDB type indices. Once the table cells have been updated to contain PDB type indices, the mapping for each type source can be computed in parallel. Simply iterate the list of cell positions and replace them with the PDB type index, since the insertion positions are no longer needed. Once we have a source to destination type index mapping for every type source, there are no more data dependencies. We know which type records are "unique" (not duplicates), and what their final type indices will be. We can do the remapping in parallel, and accumulate type sizes and type hashes in parallel by type source. Lastly, TPI stream layout must be done serially. Accumulate all the type records, sizes, and hashes, and add them to the PDB. 
Differential Revision: https://reviews.llvm.org/D87805	2020-09-30 15:44:38 -07:00
Reid Kleckner	8d250ac3cd	Revert "[PDB] Merge types in parallel when using ghashing" This reverts commit `49b3459930`.	2020-09-30 14:55:32 -07:00
Reid Kleckner	49b3459930	[PDB] Merge types in parallel when using ghashing This makes type merging much faster (-24% on chrome.dll) when multiple threads are available, but it slightly increases the time to link (+10%) when /threads:1 is passed. With only one more thread, the new type merging is faster (-11%). The output PDB should be identical to what it was before this change. To give an idea, here is the /time output placed side by side: BEFORE \| AFTER Input File Reading: 956 ms \| 968 ms Code Layout: 258 ms \| 190 ms Commit Output File: 6 ms \| 7 ms PDB Emission (Cumulative): 6691 ms \| 4253 ms Add Objects: 4341 ms \| 2927 ms Type Merging: 2814 ms \| 1269 ms -55%! Symbol Merging: 1509 ms \| 1645 ms Publics Stream Layout: 111 ms \| 112 ms TPI Stream Layout: 764 ms \| 26 ms trivial Commit to Disk: 1322 ms \| 1036 ms -300ms ----------------------------------------- -------- Total Link Time: 8416 ms 5882 ms -30% overall The main source of the additional overhead in the single-threaded case is the need to iterate all .debug$T sections up front to check which type records should go in the IPI stream. See fillIsItemIndexFromDebugT. With changes to the .debug$H section, we could pre-calculate this info and eliminate the need to do this walk up front. That should restore single-threaded performance back to what it was before this change. This change will cause LLD to be much more parallel than it used to, and for users who do multiple links in parallel, it could regress performance. However, when the user is only doing one link, it's a huge improvement. In the future, we can use NT worker threads to avoid oversaturating the machine with work, but for now, this is such an improvement for the single-link use case that I think we should land this as is. 
Algorithm ---------- Before this change, we essentially used a DenseMap<GloballyHashedType, TypeIndex> to check if a type has already been seen, and if it hasn't been seen, insert it now and use the next available type index for it in the destination type stream. DenseMap does not support concurrent insertion, and even if it did, the linker must be deterministic: it cannot produce different PDBs by using different numbers of threads. The output type stream must be in the same order regardless of the order of hash table insertions. In order to create a hash table that supports concurrent insertion, the table cells must be small enough that they can be updated atomically. The algorithm I used for updating the table using linear probing is described in this paper, "Concurrent Hash Tables: Fast and General(?)!": https://dl.acm.org/doi/10.1145/3309206 The GHashCell in this change is essentially a pair of 32-bit integer indices: <sourceIndex, typeIndex>. The sourceIndex is the index of the TpiSource object, and it represents an input type stream. The typeIndex is the index of the type in the stream. Together, we have something like a ragged 2D array of ghashes, which can be looked up as: tpiSources[tpiSrcIndex]->ghashes[typeIndex] By using these side tables, we can omit the key data from the hash table, and keep the table cell small. There is a cost to this: resolving hash table collisions requires many more loads than simply looking at the key in the same cache line as the insertion position. However, most supported platforms should have a 64-bit CAS operation to update the cell atomically. To make the result of concurrent insertion deterministic, the cell payloads must have a priority function. Defining one is pretty straightforward: compare the two 32-bit numbers as a combined 64-bit number. 
This means that types coming from inputs earlier on the command line have a higher priority and are more likely to appear earlier in the final PDB type stream than types from an input appearing later on the link line. After table insertion, the non-empty cells in the table can be copied out of the main table and sorted by priority to determine the ordering of the final type index stream. At this point, item and type records must be separated, either by sorting or by splitting into two arrays, and I chose sorting. This is why the GHashCell must contain the isItem bit. Once the final PDB TPI stream ordering is known, we need to compute a mapping from source type index to PDB type index. To avoid starting over from scratch and looking up every type again by its ghash, we save the insertion position of every hash table insertion during the first insertion phase. Because the table does not support rehashing, the insertion position is stable. Using the array of insertion positions indexed by source type index, we can replace the source type indices in the ghash table cells with the PDB type indices. Once the table cells have been updated to contain PDB type indices, the mapping for each type source can be computed in parallel. Simply iterate the list of cell positions and replace them with the PDB type index, since the insertion positions are no longer needed. Once we have a source to destination type index mapping for every type source, there are no more data dependencies. We know which type records are "unique" (not duplicates), and what their final type indices will be. We can do the remapping in parallel, and accumulate type sizes and type hashes in parallel by type source. Lastly, TPI stream layout must be done serially. Accumulate all the type records, sizes, and hashes, and add them to the PDB. Differential Revision: https://reviews.llvm.org/D87805	2020-09-30 14:22:48 -07:00
Fangrui Song	1ca6bd261e	[lld] Clean up in lld::{coff,elf}::link after D70378 Library users should not need to call errorHandler().reset() explicitly. google/iree calls lld::elf::link and without the patch some global variables are not cleaned up in the next invocation.	2020-09-24 18:02:45 -07:00
Alexandre Ganea	f2efb5742c	[LLD][COFF] Cover usage of LLD-as-a-library in tests In lit tests, we run each LLD invocation twice (LLD_IN_TEST=2), without shutting down the process in-between. This ensures a full cleanup is properly done between runs. Only active for the COFF driver for now. Other drivers still use LLD_IN_TEST=1 which executes just one iteration with full cleanup, like before. When the environment variable LLD_IN_TEST is unset, a shortcut is taken, only one iteration is executed, no cleanup for faster exit, like before. A public API, lld::safeLldMain(), is also available when using LLD as a library. Differential Revision: https://reviews.llvm.org/D70378	2020-09-24 15:07:50 -04:00
Zequan Wu	763671f387	[COFF] Port CallGraphSort to COFF from ELF	2020-07-30 15:21:44 -07:00
Martin Storsjö	745eb02496	[LLD] [MinGW] Implement the --no-seh flag Previously this flag was just ignored. If set, set the IMAGE_DLL_CHARACTERISTICS_NO_SEH bit, regardless of the normal safeSEH machinery. In mingw configurations, the safeSEH bit might not be set in e.g. object files built from handwritten assembly, making it impossible to use the normal safeseh flag. As mingw setups don't generally use SEH on 32 bit x86 at all, it should be fine to set that flag bit though - hook up the existing GNU ld flag for controlling that. Differential Revision: https://reviews.llvm.org/D84701	2020-07-28 21:08:37 +03:00
Reid Kleckner	3508c1d8fb	[LLD] Make scoped timers thread safe Summary: This is a pre-requisite to parallelizing PDB symbol and type merging. Currently this timer usage would not be thread safe. Reviewers: aganea, MaskRay Subscribers: jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80298	2020-05-20 16:16:08 -07:00
Reid Kleckner	54a335a2f6	[COFF] Move type merging to TpiSource::mergeDebugT virtual method This paves the way to doing more things in parallel, and allows us to order type sources in dependency order. PDBs and PCH objects have to be loaded before object files which use them. This is a rebase of the unapplied remaining changes in https://reviews.llvm.org/D59226. I found it very challenging to rebase this across the LLD variable name style change. I recall there was a tool for that, but I didn't take the time to use it. Reviewers: aganea, akhuang Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79672	2020-05-14 09:47:00 -07:00
Martin Storsjö	7f0e6c31c2	[LLD] [COFF] Add options for disabling auto import and runtime pseudo relocs Allow disabling either the full auto import feature, or just forbidding the cases that require runtime fixups. As long as all auto imported variables are referenced from separate .refptr$<name> sections, we can alias them on top of the IAT entries and don't actually need any runtime fixups via pseudo relocations. LLVM generates references to variables in .refptr stubs, if it isn't known that the variable for sure is defined in the same object module. Runtime pseudo relocs are needed if the addresses of auto imported variables are used in constant initializers though. Fixing up runtime pseudo relocations requires the use of VirtualProtect (which is disallowed in WinStore/UWP apps) or VirtualProtectFromApp. To avoid any risk of ambiguity, allow rejecting cases that would require this at the linker stage. This adds support for the --disable-runtime-pseudo-reloc and --disable-auto-import options in the MinGW driver (matching GNU ld.bfd) with corresponding lld private options in the COFF driver. Differential Revision: https://reviews.llvm.org/D78923	2020-05-14 13:05:14 +03:00
Martin Storsjö	ed0a57f753	[LLD] [COFF] Fix def file exporting of symbols containing periods This fixes an accidental breakage of exporting symbols using def files, when the symbol name contains a period, since commit `0ca06f7950`, mixing up a symbol name containing a period with the case of exporting a symbol as a forward to another dll. Differential Revision: https://reviews.llvm.org/D79619	2020-05-10 23:30:14 +03:00
Reid Kleckner	932f0276ea	[Support] Move LLD's parallel algorithm wrappers to support Essentially takes the lld/Common/Threads.h wrappers and moves them to the llvm/Support/Parallel.h algorithm header. The changes are: - Remove policy parameter, since all clients use `par`. - Rename the methods to `parallelSort` etc to match LLVM style, since they are no longer C++17 pstl compatible. - Move algorithms from llvm::parallel:: to llvm::, since they have "parallel" in the name and are no longer overloads of the regular algorithms. - Add range overloads - Use the sequential algorithm directly when 1 thread is requested (skips task grouping) - Fix the index type of parallelForEachN to size_t. Nobody in LLVM was using any other parameter, and it made overload resolution hard for for_each_n(par, 0, foo.size(), ...) because 0 is int, not size_t. Remove Threads.h and update LLD for that. This is a prerequisite for parallel public symbol processing in the PDB library, which is in LLVM. Reviewed By: MaskRay, aganea Differential Revision: https://reviews.llvm.org/D79390	2020-05-05 15:21:05 -07:00
Reid Kleckner	3542384ae9	[COFF] Use a global option table to avoid reconstructing it Otherwise an ArgumentParser is constructed for every directive section, and that involves copying the entire table of options into a vector. There is no need for this, just have one option table.	2020-05-02 15:04:19 -07:00
Reid Kleckner	01b5f52140	[COFF] Add a fastpath for /INCLUDE: in .drectve sections This speeds up linking chrome.dll with PGO instrumentation by 13% (154271ms -> 134033ms). LLVM's Option library is very slow. In particular, it allocates at least one large-ish heap object (Arg) for every argument. When PGO instrumentation is enabled, all the __profd_* symbols are added to the @llvm.used list, which compiles down to these /INCLUDE: directives. This means we have O(#symbols) directives to parse in the section, so we end up allocating an Arg for every function symbol in the object file. This is unnecessary. To address the issue and speed up the link, extend the fast path that we already have for /EXPORT:, which has similar scaling issues. I promise that I took a hard look at optimizing the Option library, but its data structures are very general and would need a lot of cleanup. We have accumulated lots of optional features (option groups, aliases, multiple values) over the years, and these are now properties of every parsed argument, when the vast majority of arguments do not use these features. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D78845	2020-04-28 10:35:57 -07:00
Eric Astor	a39b14f0b4	[ms] Add new /PDBSTREAM option to lld-link allowing injection of streams into PDB files. Summary: /PDBSTREAM:<name>=<file> adds the contents of <file> to stream <name> in the resulting PDB. This allows native users with workflows that (for example) add srcsrv streams to PDB files to provide a location for the build's source files. Results should be equivalent to linking with lld-link, then running Microsoft's pdbstr tool with the command line: pdbstr.exe -w -p:<PDB LOCATION> -s:<name> -i:<file> except in cases where the named stream overlaps with a default named stream, such as "/names". In those cases, the added stream will be overridden, making the /pdbstream option a no-op. Reviewers: thakis, rnk Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D77310	2020-04-07 16:19:38 -04:00
Fangrui Song	eb4663d8c6	[lld][COFF][ELF][WebAssembly] Replace --[no-]threads /threads[:no] with --threads={1,2,...} /threads:{1,2,...} --no-threads is a name copied from gold. gold has --no-thread, --thread-count and several other --thread-count-*. There are needs to customize the number of threads (running several lld processes concurrently or customizing the number of LTO threads). Having a single --threads=N is a straightforward replacement of gold's --no-threads + --thread-count. --no-threads is used rarely. So just delete --no-threads instead of keeping it for compatibility for a while. If --threads= is specified (ELF,wasm; COFF /threads: is similar), --thinlto-jobs= defaults to --threads=, otherwise all available hardware threads are used. There is currently no way to override a --threads={1,2,...}. It is still a debate whether we should use --threads=all. Reviewed By: rnk, aganea Differential Revision: https://reviews.llvm.org/D76885	2020-03-31 08:46:12 -07:00
Alexandre Ganea	09158252f7	[ThinLTO] Allow usage of all hardware threads in the system Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled. One can now say in LLD: /opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified. /opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency(). /opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask. When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows). When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows). Differential Revision: https://reviews.llvm.org/D75153	2020-03-27 10:20:58 -04:00
Sylvain Audi	b91905a263	[lld-link] Support /map option, matching link.exe 's /map output format Added support for /map and /map:[filepath]. The output was derived from Microsoft's Link.exe output when using that same option. Note that /MAPINFO support was not added. The previous implementation of MapFile.cpp/.h was meant for /lldmap, and was renamed to LLDMapFile.cpp/.h MapFile.cpp/.h is now for /MAP However, a small fix was added to lldmap, replacing a std::sort with std::stable_sort to enforce reproducibility. Differential Revision: https://reviews.llvm.org/D70557	2020-03-24 09:48:00 -04:00
Rui Ueyama	a2923b2a1e	Implement CET Shadow Stack (Intel Controlflow Enforcement Technology) support on Windows Patch by Petr Penzin. Windows support for CET is limited to shadow stack, which is enabled by setting a PE bit in the linker. Docs: MSVC linker flag: https://docs.microsoft.com/en-us/cpp/build/reference/cetcompat?view=vs-2019 IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT PE bit: https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#extended-dll-characteristics Differential Revision: https://reviews.llvm.org/D70606	2020-03-16 17:51:32 +09:00
Jonas Devlieghere	3e24242a7d	[lld] Replace SmallStr.str().str() with std::string conversion operator. Use the std::string conversion operator introduced in `d7049213d0`.	2020-01-29 21:30:21 -08:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Martin Storsjö	e6b0ce70bd	[LLD] [COFF] Silence a GCC warning about an unused variable. NFC.	2020-01-23 13:23:56 +02:00
Reid Kleckner	8045a8a7f1	[COFF] Warn that LLD does not support /PDBSTRIPPED: Doesn't really fix PR44491, but it avoids treating it as an input.	2020-01-15 15:11:19 -08:00
James Y Knight	d3fec7fb45	LLD: Don't use the stderrOS stream in link before it's reassigned. Remove the lld::enableColors function, as it just obscures which stream it's affecting, and replace with explicit calls to the stream's enable_colors. Also, assign the stderrOS and stdoutOS globals first in link function, just to ensure nothing might use them. (Either change individually fixes the issue of using the old stream, but both together seems best.) Follow-up to `b11386f9be`. Differential Revision: https://reviews.llvm.org/D70492	2019-11-21 10:55:03 -05:00
Rui Ueyama	b11386f9be	Make it possible to redirect not only errs() but also outs() This change is for those who use lld as a library. Context: https://reviews.llvm.org/D70287 This patch adds a new parameter to lld::*::link() so that we can pass a raw_ostream object representing stdout. Previously, lld::*::link() took only an stderr object. Justification for making stdoutOS and stderrOS mandatory: I wanted to make link() functions to take stdout and stderr in that order. However, if we change the function signature from bool link(ArrayRef<const char *> args, bool canExitEarly, raw_ostream &stderrOS = llvm::errs()); to bool link(ArrayRef<const char *> args, bool canExitEarly, raw_ostream &stdoutOS = llvm::outs(), raw_ostream &stderrOS = llvm::errs()); , then the meaning of existing code that passes stderrOS silently changes (stderrOS would be interpreted as stdoutOS). So, I chose to make existing code not to compile, so that developers can fix their code. Differential Revision: https://reviews.llvm.org/D70292	2019-11-18 11:18:06 +09:00
Reid Kleckner	ce0f3ee5e4	[COFF] Don't error if the only inputs are from /wholearchive: Fixes PR43744 Differential Revision: https://reviews.llvm.org/D69968	2019-11-15 16:09:07 -08:00
Reid Kleckner	4c1a1d3cf9	Add missing includes needed to prune LLVMContext.h include, NFC These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280	2019-11-14 15:23:15 -08:00
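
The default-argument hazard described in commit `b11386f9be` can be sketched roughly as follows. This is an illustration only, not lld's code: `std::ostream` stands in for `raw_ostream`, the argument list is left out, and the names `linkOld`, `linkDefaulted`, `linkMandatory` and the `captureWith*` helpers are hypothetical.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins, not lld's real API: std::ostream replaces
// raw_ostream, and the args parameter is omitted.

// Old shape: the only stream parameter receives diagnostics.
bool linkOld(bool /*canExitEarly*/, std::ostream &stderrOS = std::cerr) {
  stderrOS << "diagnostic\n";
  return true;
}

// Rejected shape: stdout comes first and both streams are defaulted, so an
// old call site that passes a single stream still compiles.
bool linkDefaulted(bool /*canExitEarly*/, std::ostream &stdoutOS = std::cout,
                   std::ostream &stderrOS = std::cerr) {
  stdoutOS << "regular output\n";
  stderrOS << "diagnostic\n";
  return true;
}

// Chosen shape: both streams are mandatory, so an old one-stream call site
// no longer compiles and has to be updated by hand.
bool linkMandatory(bool /*canExitEarly*/, std::ostream &stdoutOS,
                   std::ostream &stderrOS) {
  stdoutOS << "regular output\n";
  stderrOS << "diagnostic\n";
  return true;
}

// A caller written against the old shape: the buffer captures diagnostics.
std::string captureWithOld() {
  std::ostringstream buffer;
  linkOld(false, buffer);
  return buffer.str();
}

// The same call against the rejected shape: the buffer now binds to
// stdoutOS, and the diagnostic goes to std::cerr instead.
std::string captureWithDefaulted() {
  std::ostringstream buffer;
  linkDefaulted(false, buffer);
  return buffer.str();
}

// With mandatory streams the caller must name both, so nothing is misrouted.
std::string captureWithMandatory() {
  std::ostringstream out, err;
  linkMandatory(false, out, err);
  return err.str();
}
```

In this sketch `captureWithOld()` returns the diagnostic text, while the textually identical call in `captureWithDefaulted()` returns the regular output, which is the silent change in meaning the commit message warns about. With `linkMandatory`, a one-stream call such as `linkMandatory(false, buffer)` is rejected at compile time.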

480 Commits