llvm-project

Commit Graph

Author	SHA1	Message	Date
Rui Ueyama	29d8eef440	Rename so that the function name is consistent between ELF and COFF. llvm-svn: 261914	2016-02-25 18:49:11 +00:00
Rui Ueyama	43e12900d9	COFF: Non-external COMDAT sections should not be merged by ICF. If a section symbol is not external, that COMDAT section should never be merged with other sections in other compilation units. Previously, we didn't take visibility into account. Note that COMDAT sections with non-external visibility make sense because they can be removed by dead-stripping. Fixes https://llvm.org/bugs/show_bug.cgi?id=25686 llvm-svn: 254578	2015-12-03 02:23:33 +00:00
Rui Ueyama	df985afa14	COFF: De-parallelize ICF for now. There was a threading issue in the ICF code for COFF. That seems like a benign bug in the sense that it doesn't produce an incorrect output, but it oftentimes misses reducible sections. As a result, mergeable sections could remain in outputs, which makes the output nondeterministic. Basically the algorithm we are using for ICF is this: We group sections so that identical sections will eventually be in the same group. Initially, all sections are in one group. We split the group by relocation targets until we get a convergence (if relocation targets are in different groups, the sections are different). Once a group is split, its parts will never be merged again. Each section has a group ID. That variable itself is atomic, so there's no threading issue at the level that we can use thread sanitizer. The point is, when we split a group, we re-assign new group IDs to a group of sections. Those are multiple separate writes to atomic variables. Thus, splitting a group is not an atomic operation, and there's a small chance that the other thread observes inconsistent group IDs. Over-splitting is always "safe", so it will never create incorrect output. I suspect that the nondeterminism stems from that point. However, I cannot prove or fix that at this moment, so I'm going to avoid using threads here. llvm-svn: 251300	2015-10-26 16:20:00 +00:00
Rui Ueyama	548d22c073	COFF: ICF should not merge sections if their alignments are not the same. There's actually room to improve this patch. Instead of not merging sections that have different alignments, we can choose the section that has the largest alignment requirement among all sections that are otherwise considered the same. Then all section alignments are satisfied, so we can merge them. I don't know if that improvement could make any difference for real-world input, so I'll leave it alone. Would be interesting to revisit later. llvm-svn: 248581	2015-09-25 16:50:12 +00:00
Rui Ueyama	c9e746b9e6	COFF: Fix local variable type. This is intended to be 64-bit integer, but size_t is not guaranteed to be the same or larger type than uint64_t. llvm-svn: 248580	2015-09-25 16:38:13 +00:00
Rui Ueyama	c28a08b8d2	COFF: Remove duplicate parameter from hash value calculation. llvm-svn: 248526	2015-09-24 19:00:42 +00:00
Rui Ueyama	97d92736f5	COFF: Improve section hash value. std::distance(C->Relocs.end(), C->Relocs.begin()) is the same as NumRelocs which is already added to the hash value. What we are missing here is the section size. llvm-svn: 248202	2015-09-21 19:41:38 +00:00
Rui Ueyama	3cb1f5c860	COFF: Rename A.replaceWith(B) -> B.replace(A). NFC. llvm-svn: 248197	2015-09-21 19:36:51 +00:00
Rui Ueyama	5f38915624	COFF: Fix ICF regression. This patch fixes a regression introduced by r247964. Relocations that refer to the same symbol should be considered equal, but they were not if they were pointing to non-section chunks. llvm-svn: 248132	2015-09-20 20:19:12 +00:00
Rui Ueyama	f49712a853	COFF: Fix race condition. NextID is updated inside parallel_for_each, so it needs mutual exclusion. llvm-svn: 248106	2015-09-20 01:44:44 +00:00
Rui Ueyama	49e72e69e5	Fix build error that std::atomic is not copy-constructible. llvm-svn: 248061	2015-09-18 22:58:12 +00:00
Rui Ueyama	e629a45531	COFF: Address review comments. - Fix race condition of `Redo` - Avoid std::distance llvm-svn: 248058	2015-09-18 22:31:15 +00:00
Rui Ueyama	e8d1c59756	Style fix to make it look consistent. NFC. llvm-svn: 248044	2015-09-18 21:17:44 +00:00
Rui Ueyama	aa95e5a4cc	COFF: Parallelize ICF. The LLD's ICF algorithm is highly parallelizable. This patch does that using parallel_for_each. ICF accounted for about one third of total execution time. Previously, it took 324 ms when self-hosting. Now it takes only 62 ms. Of course your mileage may vary. My machine is a beefy 24-core Xeon machine, so you may not see this much speedup. But this optimization should be effective even for 2-core machine, since I saw speedup (324 ms -> 189 ms) when setting parallelism parameter to 2. llvm-svn: 248038	2015-09-18 21:06:34 +00:00
Rui Ueyama	603d51104b	COFF: Reorder comparisons. This change makes equalsConstant a bit faster (193ms -> 163ms). llvm-svn: 247965	2015-09-18 02:40:54 +00:00
Rui Ueyama	8c73dfb6bf	COFF: Remove useless micro-optimization. This patch simplifies code by removing micro-optimization that doesn't contribute to speed. llvm-svn: 247964	2015-09-18 02:15:34 +00:00
Rui Ueyama	c9a6e827bd	COFF: Optimize ICF by not creating temporary vectors. Previously, ICF created a vector for each SectionChunk. The vector contained pointers to successors, which are namely associative sections and COMDAT relocation targets. The reason I created vectors is because I thought that that would make section comparison faster. It did make the comparison faster. When self-linking, for example, it saved about 10 ms on each iteration. The time we spent on constructing the vectors was 124 ms. If we iterate more than 12 times, return from the investment exceeds the initial cost. In reality, it usually needs 5 iterations. So we shouldn't construct the vectors. llvm-svn: 247963	2015-09-18 01:51:37 +00:00
Rui Ueyama	7d8263bf1d	COFF: Optimize ICF by comparing relocations before section contents. equalsConstants() is the heaviest function in ICF, and that consumes more than half of total ICF execution time. Of which, section content comparison accounts for roughly one third. Previously, we compared section contents at the beginning of the function after comparing their checksums. The comparison is very likely to succeed because when the control reaches that comparison, their checksums are always equal. And because checksums are 64-bit CRC, they are unlikely to collide. We compared relocations and associative sections after that. If they are different, the time we spent on byte-by-byte comparison of section contents was wasted. This patch moves the comparison to the end of the function. If the comparison fails, the time we spent on relocation comparison is wasted, but as I wrote it's very unlikely to happen. LLD took 1198 ms to link itself to produce a 27.11 MB executable. Of which, ICF accounted for 536 ms. This patch cuts it by 90 ms, which is 17% speedup of ICF and 7.5% speedup overall. All numbers are median of ten runs. llvm-svn: 247961	2015-09-18 01:30:56 +00:00
Rui Ueyama	66c06ceaca	COFF: ICF: Print out the number of iterations. NFC. llvm-svn: 247868	2015-09-16 23:55:39 +00:00
Rui Ueyama	92298d5418	COFF: Create ICF class to move code from SectionChunk to ICF. NFC. This patch defines ICF class and defines ICF-related functions as members of the class. By doing this we can move code that is related only to ICF from SectionChunk to the newly-defined class. This also eliminates a global variable "NextID". llvm-svn: 247802	2015-09-16 14:19:10 +00:00
Rui Ueyama	9cb2870ce0	ICF: Improve ICF to reduce more sections than before. This is a patch to make LLD to be on par with MSVC in terms of ICF effectiveness. MSVC produces a 27.14MB executable when linking LLD. LLD previously produced a 27.61MB executable when self-linking. Now the size is reduced to 27.11MB. Note that without ICF the size is 29.63MB. In r247387, I implemented an algorithm that handles section graphs as cyclic graphs and merges them using SCC. The algorithm did not always work as intended as I demonstrated in r247721. The new algorithm implemented in this patch is different from the previous one. If you are interested in the details, you may want to read the file comment of ICF.cpp. llvm-svn: 247770	2015-09-16 03:26:31 +00:00
Rui Ueyama	c48f78ca5a	Fix typo. llvm-svn: 247645	2015-09-15 00:35:41 +00:00
Rui Ueyama	5b93aa51de	COFF: Teach ICF to merge cyclic graphs. Previously, LLD's ICF couldn't merge cyclic graphs. That was unfortunate because, in COFF, cyclic graphs are not exceptional at all. That is pretty common. In this patch, sections are grouped by Tarjan's strongly connected component algorithm to get acyclic graphs. And then we try to merge SCCs whose outdegree is zero, and remove them from the graph. This makes other SCCs have outdegree zero, so we can repeat the process until all SCCs are removed. When comparing two SCCs, we handle cycles properly. This algorithm works better than the previous one. Previously, self-linking produced a 29.0MB executable. It now produces a 27.7MB executable. There's still some gap compared to MSVC linker which produces a 27.1MB executable for the same input. So the gap is narrowed, but still LLD is not on par with MSVC. I'll investigate that later. llvm-svn: 247387	2015-09-11 04:29:03 +00:00
Rui Ueyama	3c28ba38de	COFF: Split doICF(). No functionality change. llvm-svn: 246934	2015-09-05 23:06:32 +00:00
Rui Ueyama	ef907ec82d	COFF: Implement a better algorithm for ICF. Identical COMDAT Folding is a feature to merge COMDAT sections by contents. Two sections are considered the same if their contents, relocations, attributes, etc, are all the same. An interesting fact is that MSVC linker takes "iterations" parameter for ICF because the algorithm they are using is iterative. Merging two sections could make more sections mergeable because different relocations could now point to the same section. ICF is repeated until we get a convergence (until no section can be merged). This algorithm is not fast. Usually it needs three iterations until a convergence is obtained. In the new algorithm implemented in this patch, we consider sections and relocations as a directed acyclic graph, and we try to merge sections whose outdegree is zero. Sections with outdegree zero are then removed from the graph, which makes other sections have outdegree zero. We repeat that until all sections are processed. In this algorithm, we don't iterate over the same sections many times. There's an apparent issue in the algorithm -- the section graph is not guaranteed to be acyclic. It's actually pretty often cyclic. So this algorithm cannot eliminate all possible duplicates. That's OK for now because the previous algorithm was not able to eliminate cycles either. I'll address the issue in a follow-up patch. llvm-svn: 246878	2015-09-04 21:35:54 +00:00
Rui Ueyama	434de7a33f	Remove unused variable. llvm-svn: 246874	2015-09-04 21:05:30 +00:00
Rui Ueyama	2dcc23580e	COFF: Use section content checksum for ICF. Previously, we calculated our own hash values for section contents. Of course that's slow because we had to access all bytes in sections. Fortunately, COFF objects usually contain hash values for COMDAT sections. We can use that to speed up Identical COMDAT Folding. llvm-svn: 246869	2015-09-04 20:45:50 +00:00
Rui Ueyama	7b276e2fb4	COFF: Move code for Identical COMDAT Folding to ICF.cpp. llvm-svn: 243701	2015-07-30 22:57:21 +00:00
Rui Ueyama	50be6edfa6	COFF: Make doICF non-recursive. NFC. llvm-svn: 240898	2015-06-28 01:35:59 +00:00
Rui Ueyama	fc510f4cf8	COFF: Devirtualize mark(), markLive() and isCOMDAT(). Only SectionChunk can be dead-stripped. Previously, all types of chunks implemented these functions, but their functions were blank. Likewise, only DefinedRegular and DefinedCOMDAT symbols can be dead-stripped. markLive() function was implemented for other symbol types, but they were blank. I started thinking that the change I made in r240319 was a mistake. I separated DefinedCOMDAT from DefinedRegular because I thought that would make the code cleaner, but now we want to handle them as the same type here. Maybe we should roll it back. This change should improve readability a bit as this removes some dubious uses of reinterpret_cast. Previously, we assumed that all COMDAT chunks are actually SectionChunks, which was not very obvious. llvm-svn: 240675	2015-06-25 19:10:58 +00:00
Rui Ueyama	49560c7a10	COFF: Move code for ICF from Writer.cpp to ICF.cpp. llvm-svn: 240590	2015-06-24 20:40:03 +00:00

31 Commits
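
The group-splitting scheme described in commit df985afa14 (all sections start in one group, groups are split by relocation targets until convergence) can be illustrated with a minimal single-threaded sketch. This is not LLD's code: the `Section` struct, its fields, and `computeGroups` are hypothetical names chosen for illustration, and a real implementation would also compare section contents, sizes, alignments, and attributes.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a COMDAT section (not LLD's SectionChunk).
struct Section {
  uint64_t Checksum;          // content checksum, assumed precomputed
  std::vector<size_t> Relocs; // indices of the sections the relocations target
};

// Returns one group ID per section. Sections that end up with the same ID
// are candidates for folding. Groups are only ever split, never merged.
std::vector<size_t> computeGroups(const std::vector<Section> &Secs) {
  std::vector<size_t> Group(Secs.size(), 0); // initially, a single group
  size_t NumGroups = 0;
  for (;;) {
    // A section's key is its current group, its checksum, and the current
    // groups of its relocation targets. Different keys mean the sections
    // cannot be identical, so they are placed in different groups.
    using Key = std::pair<std::pair<size_t, uint64_t>, std::vector<size_t>>;
    std::map<Key, size_t> Ids;
    std::vector<size_t> Next(Secs.size());
    for (size_t I = 0; I < Secs.size(); ++I) {
      std::vector<size_t> Targets;
      for (size_t T : Secs[I].Relocs)
        Targets.push_back(Group[T]);
      Key K{{Group[I], Secs[I].Checksum}, std::move(Targets)};
      Next[I] = Ids.emplace(std::move(K), Ids.size()).first->second;
    }
    // Because the key includes the old group, each pass refines the
    // previous partition. An unchanged group count means convergence.
    if (Ids.size() == NumGroups)
      return Next;
    NumGroups = Ids.size();
    Group = std::move(Next);
  }
}
```

Because every section starts in the same group and groups are split only when a difference is observed, this sketch also handles cyclic relocation graphs: two mutually recursive pairs with matching checksums stay grouped together. All new IDs here are computed from a consistent snapshot of the old ones (`Group`) and written to a separate vector (`Next`), which sidesteps the inconsistent-ID problem the commit message attributes to concurrent, non-atomic group splitting.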