llvm-project

Commit Graph

Author	SHA1	Message	Date
Rui Ueyama	43155d0d48	[LLD] Fix Clang-tidy modernize-use-nullptr warnings; other minor cleanups. Patch from Eugene Zelenko! llvm-svn: 249111	2015-10-02 00:36:00 +00:00
Rui Ueyama	548d22c073	COFF: ICF should not merge sections if their alignments are not the same. There's actually room to improve this patch. Instead of not merging sections that have different alignments, we can choose the section that has the largest alignment requirement among all sections that are otherwise considered the same. Then all section alignments are satisfied, so we can merge them. I don't know if that improvement could make any difference for real-world input, so I'll leave it alone. Would be interesting to revisit later. llvm-svn: 248581	2015-09-25 16:50:12 +00:00
Rui Ueyama	c9e746b9e6	COFF: Fix local variable type. This is intended to be a 64-bit integer, but size_t is not guaranteed to be the same or larger type than uint64_t. llvm-svn: 248580	2015-09-25 16:38:13 +00:00
Rui Ueyama	de88072a00	COFF: Rename Ptr -> Repl. This pointer points to a replacement for this chunk. Ptr was not a good name. llvm-svn: 248579	2015-09-25 16:20:24 +00:00
Rui Ueyama	c28a08b8d2	COFF: Remove duplicate parameter from hash value calculation. llvm-svn: 248526	2015-09-24 19:00:42 +00:00
Rui Ueyama	9640173e16	COFF: Add /nosymtab command line option. This is an LLD extension to MSVC link.exe command line. MSVC linker does not write symbol tables for executables. We do unless no /debug option is given. There's a situation that we want to enable debug info but don't want to emit the symbol table. One example is when we are comparing output file size. With this patch, you can tell the linker to not create a symbol table by just specifying /nosymtab. llvm-svn: 248225	2015-09-21 23:43:31 +00:00
Rui Ueyama	97d92736f5	COFF: Improve section hash value. std::distance(C->Relocs.end(), C->Relocs.begin()) is the same as NumRelocs which is already added to the hash value. What we are missing here is the section size. llvm-svn: 248202	2015-09-21 19:41:38 +00:00
Rui Ueyama	3cb1f5c860	COFF: Rename A.replaceWith(B) -> B.replace(A). NFC. llvm-svn: 248197	2015-09-21 19:36:51 +00:00
Rui Ueyama	98a98cffb6	COFF: Do not call std::async with std::launch::async if multithreading is disabled. llvm-svn: 248193	2015-09-21 19:12:36 +00:00
Rui Ueyama	5f38915624	COFF: Fix ICF regression. This patch fixes a regression introduced by r247964. Relocations that are referring the same symbol should be considered equal, but they were not if they were pointing to non-section chunks. llvm-svn: 248132	2015-09-20 20:19:12 +00:00
Rui Ueyama	997b357ac1	COFF: Run InputFile::parse() in background using std::async(). Previously, InputFile::parse() was run in batch. We construct a list of all input files and call parse() on each file using parallel_for_each. That means we cannot start parsing files until we get a complete list of input files, although InputFile::parse() is safe to call from anywhere. This patch makes it asynchronous. As soon as we add a file to the symbol table, we now start parsing the file using std::async(). This change shortens self-hosting time (650 ms) by 28 ms. It's about 4% improvement. llvm-svn: 248109	2015-09-20 03:11:16 +00:00
Rui Ueyama	f49712a853	COFF: Fix race condition. NextID is updated inside parallel_for_each, so it needs mutual exclusion. llvm-svn: 248106	2015-09-20 01:44:44 +00:00
Rui Ueyama	3cfd2bff1e	Remove dead code. llvm-svn: 248105	2015-09-20 01:19:36 +00:00
Rui Ueyama	1cce300843	COFF: Change Symbol::Body type from atomic pointer to regular pointer. I made the field an atomic pointer in hope that we would be able to parallelize the symbol resolver soon, but that's not going to happen soon. This patch reverts that change for the sake of readability. llvm-svn: 248104	2015-09-20 00:00:05 +00:00
Rui Ueyama	63bbe84b27	COFF: Make Chunk::writeTo() const. NFC. This should improve code readability especially because this function is called inside parallel_for_each. llvm-svn: 248103	2015-09-19 23:28:57 +00:00
Rui Ueyama	ebb0ebff4b	COFF: Fix thread-safety bug. LTOModule doesn't seem to be thread-safe, so guard that with mutex. llvm-svn: 248102	2015-09-19 23:14:51 +00:00
Rui Ueyama	a5f0f758d3	COFF: Move markLive() from Writer.cpp to its own file. Conceptually, garbage collection is not part of Writer, so move the function out of the file. llvm-svn: 248099	2015-09-19 21:36:28 +00:00
Rui Ueyama	0652c59506	COFF: Actually parallelize InputFile::parse(). This is a follow-up patch to r248078. llvm-svn: 248098	2015-09-19 21:33:26 +00:00
Rui Ueyama	27e9e6540c	Remove unused #includes. llvm-svn: 248081	2015-09-19 02:28:32 +00:00
Rui Ueyama	f4d05d7a80	COFF: Parallelize InputFile::parse(). InputFile::parse() can be called in parallel with other calls of the same function. By doing that, time to self-link improves from 741 ms to 654 ms or 12% faster. This is probably the last low hanging fruit in terms of parallelism. Input file parsing and symbol table insertion takes 450 ms in total. If we want to optimize further, we probably have to parallelize symbol table insertion using concurrent hashmap or something. That's doable, but that's not easy, especially if you want to keep the exact same semantics and linking order. I'm not going to do that at least soon. Anyway, compared to r248019 (the change before the first attempt for parallelism), we achieved 36% performance improvement from 1022 ms to 654 ms. MSVC linker takes 3.3 seconds to link the same program. MSVC's ICF feature is very slow for some reason, but even if we disable the feature, it still takes about 1.2 seconds. Our number is probably good enough. llvm-svn: 248078	2015-09-19 01:48:26 +00:00
Rui Ueyama	8197a4e0bf	COFF: Use parallel_sort in Writer::sortExceptionTable(). This patch saves 4 ms out of 5 ms. Very small improvement, but maybe better than nothing. llvm-svn: 248063	2015-09-18 23:17:34 +00:00
Rui Ueyama	49e72e69e5	Fix build error that std::atomic is not copy-constructible. llvm-svn: 248061	2015-09-18 22:58:12 +00:00
Rui Ueyama	e629a45531	COFF: Address review comments. - Fix race condition of `Redo` - Avoid std::distance llvm-svn: 248058	2015-09-18 22:31:15 +00:00
Rui Ueyama	e0e0796d83	COFF: Parallelize Writer::writeSections(). Self-hosting took 801 ms on my machine. Of which this function took 69 ms. Now it takes 37 ms. That is about 4% overall performance improvement. llvm-svn: 248052	2015-09-18 22:07:10 +00:00
Rui Ueyama	e8d1c59756	Style fix to make it look consistent. NFC. llvm-svn: 248044	2015-09-18 21:17:44 +00:00
Rui Ueyama	aa95e5a4cc	COFF: Parallelize ICF. The LLD's ICF algorithm is highly parallelizable. This patch does that using parallel_for_each. ICF accounted for about one third of total execution time. Previously, it took 324 ms when self-hosting. Now it takes only 62 ms. Of course your mileage may vary. My machine is a beefy 24-core Xeon machine, so you may not see this much speedup. But this optimization should be effective even for 2-core machine, since I saw speedup (324 ms -> 189 ms) when setting parallelism parameter to 2. llvm-svn: 248038	2015-09-18 21:06:34 +00:00
Rui Ueyama	603d51104b	COFF: Reorder comparisons. This change makes equalsConstant a bit faster (193ms -> 163ms). llvm-svn: 247965	2015-09-18 02:40:54 +00:00
Rui Ueyama	8c73dfb6bf	COFF: Remove useless micro-optimization. This patch simplifies code by removing micro-optimization that doesn't contribute to speed. llvm-svn: 247964	2015-09-18 02:15:34 +00:00
Rui Ueyama	c9a6e827bd	COFF: Optimize ICF by not creating temporary vectors. Previously, ICF created a vector for each SectionChunk. The vector contained pointers to successors, which are namely associative sections and COMDAT relocation targets. The reason I created vectors is because I thought that that would make section comparison faster. It did make the comparison faster. When self-linking, for example, it saved about 10 ms on each iteration. The time we spent on constructing the vectors was 124 ms. If we iterate more than 12 times, return from the investment exceeds the initial cost. In reality, it usually needs 5 iterations. So we shouldn't construct the vectors. llvm-svn: 247963	2015-09-18 01:51:37 +00:00
Rui Ueyama	7d8263bf1d	COFF: Optimize ICF by comparing relocations before section contents. equalsConstants() is the heaviest function in ICF, and that consumes more than half of total ICF execution time. Of which, section content comparison accounts for roughly one third. Previously, we compared section contents at the beginning of the function after comparing their checksums. The comparison is very likely to succeed because when the control reaches that comparison, their checksums are always equal. And because checksums are 64-bit CRC, they are unlikely to collide. We compared relocations and associative sections after that. If they are different, the time we spent on byte-by-byte comparison of section contents was wasted. This patch moves the comparison to the end of the function. If the comparison fails, the time we spent on relocation comparison is wasted, but as I wrote it's very unlikely to happen. LLD took 1198 ms to link itself to produce a 27.11 MB executable. Of which, ICF accounted for 536 ms. This patch cuts it by 90 ms, which is 17% speedup of ICF and 7.5% speedup overall. All numbers are median of ten runs. llvm-svn: 247961	2015-09-18 01:30:56 +00:00
Rui Ueyama	4151972c22	Enable extra LTO verification only when build type is debug. llvm-svn: 247956	2015-09-17 22:54:08 +00:00
Rui Ueyama	63dd8766ab	COFF: Remove DefinedSymbol::isLive() and markLive(). NFC. Basically the concept of "liveness" is for sections (or chunks in LLD terminology) and not for symbols. Symbols are always available or live, or otherwise it indicates a link failure. Previously, we had isLive() and markLive() methods for DefinedSymbol. They are confusing methods. What they actually did is to act as a proxy to backing section chunks. We can simply eliminate these methods and call section chunk's methods directly. llvm-svn: 247869	2015-09-16 23:55:52 +00:00
Rui Ueyama	66c06ceaca	COFF: ICF: Print out the number of iterations. NFC. llvm-svn: 247868	2015-09-16 23:55:39 +00:00
Rui Ueyama	4dbff20c91	COFF: Fix bug that not all symbols were written to symtab if /opt:noref. Only live symbols are written to the symbol table. Because isLive() returned false if dead-stripping was disabled entirely, only non-COMDAT sections were written to the symbol table. This patch fixes the issue. llvm-svn: 247856	2015-09-16 21:40:47 +00:00
Rui Ueyama	3b153e6541	COFF: Fix bug that /opt:noicf was ignored. llvm-svn: 247854	2015-09-16 21:30:55 +00:00
Rui Ueyama	4bce7bcc88	COFF: Output messages for /verbose to stdout instead of stderr. This patch also makes the message less verbose. llvm-svn: 247853	2015-09-16 21:30:40 +00:00
Rui Ueyama	b02a320f5e	COFF: Enable ICF by default. MSVC linker enables ICF as long as /opt:ref is enabled, so do we. llvm-svn: 247817	2015-09-16 16:41:38 +00:00
Rui Ueyama	c04d5dbf20	COFF: Rename /opt:lldicf -> /opt:icf. Now that ICF is complete, we can rename this option so that the driver accepts the MSVC-compatible command line option. llvm-svn: 247816	2015-09-16 16:33:57 +00:00
Rui Ueyama	92298d5418	COFF: Create ICF class to move code from SectionChunk to ICF. NFC. This patch defines ICF class and defines ICF-related functions as members of the class. By doing this we can move code that are related only to ICF from SectionChunk to the newly-defined class. This also eliminates a global variable "NextID". llvm-svn: 247802	2015-09-16 14:19:10 +00:00
Rui Ueyama	9cb2870ce0	ICF: Improve ICF to reduce more sections than before. This is a patch to put LLD on par with MSVC in terms of ICF effectiveness. MSVC produces a 27.14MB executable when linking LLD. LLD previously produced a 27.61MB executable when self-linking. Now the size is reduced to 27.11MB. Note that without ICF the size is 29.63MB. In r247387, I implemented an algorithm that handles section graphs as cyclic graphs and merges them using SCC. The algorithm did not always work as intended as I demonstrated in r247721. The new algorithm implemented in this patch is different from the previous one. If you are interested in the details, read the file comment of ICF.cpp. llvm-svn: 247770	2015-09-16 03:26:31 +00:00
Duncan P. N. Exon Smith	a11f81973a	LTO: Adjust to LLVM r247735 Perhaps lld wants to disable the verifier sometimes during COFF LTO, but for now just match behaviour from before r247735. llvm-svn: 247736	2015-09-15 23:06:16 +00:00
Rui Ueyama	c48f78ca5a	Fix typo. llvm-svn: 247645	2015-09-15 00:35:41 +00:00
Rui Ueyama	13563d8645	Fix style. llvm-svn: 247644	2015-09-15 00:33:11 +00:00
Rui Ueyama	8e186df2f5	COFF: Corrected error message if a section failed to load. There is no sense to use Name in these lines as it is not initialized yet. Patch from Igor Kudrin! llvm-svn: 247531	2015-09-13 20:22:22 +00:00
Rui Ueyama	5b93aa51de	COFF: Teach ICF to merge cyclic graphs. Previously, LLD's ICF couldn't merge cyclic graphs. That was unfortunate because, in COFF, cyclic graphs are not exceptional at all. That is pretty common. In this patch, sections are grouped by Tarjan's strongly connected component algorithm to get acyclic graphs. And then we try to merge SCCs whose outdegree is zero, and remove them from the graph. This makes other SCCs to have outdegree zero, so we can repeat the process until all SCCs are removed. When comparing two SCCs, we handle cycles properly. This algorithm works better than previous one. Previously, self-linking produced a 29.0MB executable. It now produces a 27.7MB. There's still some gap compared to MSVC linker which produces a 27.1MB executable for the same input. So the gap is narrowed, but still LLD is not on par with MSVC. I'll investigate that later. llvm-svn: 247387	2015-09-11 04:29:03 +00:00
Rui Ueyama	3c28ba38de	COFF: Split doICF(). No functionality change. llvm-svn: 246934	2015-09-05 23:06:32 +00:00
Rui Ueyama	ef907ec82d	COFF: Implement a better algorithm for ICF. Identical COMDAT Folding is a feature to merge COMDAT sections by contents. Two sections are considered the same if their contents, relocations, attributes, etc, are all the same. An interesting fact is that MSVC linker takes "iterations" parameter for ICF because the algorithm they are using is iterative. Merging two sections could make more sections mergeable because different relocations could now point to the same section. ICF is repeated until we get a convergence (until no section can be merged). This algorithm is not fast. Usually it needs three iterations until a convergence is obtained. In the new algorithm implemented in this patch, we consider sections and relocations as a directed acyclic graph, and we try to merge sections whose outdegree is zero. Sections with outdegree zero are then removed from the graph, which gives other sections outdegree zero. We repeat that until all sections are processed. In this algorithm, we don't iterate over the same sections many times. There's an apparent issue in the algorithm -- the section graph is not guaranteed to be acyclic. It's actually pretty often cyclic. So this algorithm cannot eliminate all possible duplicates. That's OK for now because the previous algorithm was not able to eliminate cycles either. I'll address the issue in a follow-up patch. llvm-svn: 246878	2015-09-04 21:35:54 +00:00
Rui Ueyama	434de7a33f	Remove unused variable. llvm-svn: 246874	2015-09-04 21:05:30 +00:00
Rui Ueyama	2dcc23580e	COFF: Use section content checksum for ICF. Previously, we calculated our own hash values for section contents. Of course that's slow because we had to access all bytes in sections. Fortunately, COFF objects usually contain hash values for COMDAT sections. We can use that to speed up Identical COMDAT Folding. llvm-svn: 246869	2015-09-04 20:45:50 +00:00
Rui Ueyama	31e66e32b4	COFF: Ignore /GUARDSYM option. The option is added in MSVC 2015, and there's no documentation about what the option is. This patch is to ignore the option for now, so that at least LLD is usable with MSVC 2015. llvm-svn: 246780	2015-09-03 16:20:47 +00:00
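
Several messages above (ef907ec82d in particular) describe ICF as an iterative process: two sections are the same if their contents and relocation targets match, and merging two sections can make further sections mergeable, so the pass repeats until nothing changes. The following is a minimal sketch of that idea only, not LLD's actual code. The Section struct, foldIdentical(), and findRepl() are hypothetical names made up for illustration, and attributes, alignment, and checksums are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a COMDAT section: raw contents plus the indices
// of the sections its relocations point to.
struct Section {
  std::string Contents;
  std::vector<std::size_t> Relocs;
};

// Follows replacement links until reaching a section that was not folded.
static std::size_t findRepl(const std::vector<std::size_t> &Repl,
                            std::size_t I) {
  while (Repl[I] != I)
    I = Repl[I];
  return I;
}

// Returns Repl, where Repl[I] is the index of the section that replaces I
// (Repl[I] == I if section I survives). Repeats until convergence.
std::vector<std::size_t> foldIdentical(const std::vector<Section> &Secs) {
  typedef std::pair<std::string, std::vector<std::size_t> > Key;
  std::vector<std::size_t> Repl(Secs.size());
  for (std::size_t I = 0; I < Secs.size(); ++I)
    Repl[I] = I;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    std::map<Key, std::size_t> Seen; // (contents, folded targets) -> survivor
    for (std::size_t I = 0; I < Secs.size(); ++I) {
      if (Repl[I] != I)
        continue; // already folded in an earlier pass
      Key K(Secs[I].Contents, std::vector<std::size_t>());
      for (std::size_t T : Secs[I].Relocs) {
        assert(T < Secs.size() && "relocation target out of range");
        K.second.push_back(findRepl(Repl, T));
      }
      auto Res = Seen.insert(std::make_pair(K, I));
      if (!Res.second) {
        // An identical section was seen earlier in this pass; fold into it.
        Repl[I] = Res.first->second;
        Changed = true;
      }
    }
  }

  // Flatten chains so every entry points directly at its survivor.
  for (std::size_t I = 0; I < Secs.size(); ++I)
    Repl[I] = findRepl(Repl, I);
  return Repl;
}
```

With four sections {"call", relocs to 2}, {"call", relocs to 3}, {"ret"}, {"ret"}, the first pass folds section 3 into 2, and only the second pass can fold section 1 into 0, giving {0, 0, 2, 2}. Two identical self-referencing sections are never folded by this sketch, which matches the limitation on cyclic graphs that the messages mention for the earlier algorithms.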


399 Commits