llvm-project

Commit Graph

Author	SHA1	Message	Date
Reid Kleckner	8456411e3b	[COFF] Fix SECTION and SECREL relocation handling for absolute symbols Summary: For SECTION relocations against absolute symbols, MSVC emits the largest output section index plus one. I've implemented that by threading a global variable through DefinedAbsolute that is filled in by the Writer. A more library-oriented approach would be to thread the Writer through Chunk::writeTo and SectionChunk::applyRel*, but Rui seems to prefer doing it this way. MSVC rejects SECREL relocations against absolute symbols, but only when the relocation is in a real output section. When the relocation is in a CodeView debug info section destined for the PDB, it seems that this relocation error is suppressed, and absolute symbols become zeros in the object file. This is easily implemented by checking the input section from which we're applying relocations. This should fix errors about __safe_se_handler_table and __guard_fids_table when linking the CRT and generating a PDB. Reviewers: ruiu Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D34541 llvm-svn: 306071	2017-06-22 23:33:04 +00:00
Rui Ueyama	9aa82f76ac	Garbage collect dllimported symbols. This is a different implementation than r303225 (which was reverted in r303270, re-submitted in r303304 and then re-reverted in r303527). In the previous patch, I tried to add Live bit to each dllimported symbol. It turned out that it didn't work with "oldnames.lib" which contains a lot of weak aliases to dllimported symbols. The way we handle weak aliases is to check if undefined symbols can be resolved using weak aliases, and if so, memcpy the Defined symbols to weak Undefined symbols, so that any references to weak aliases automatically see defined symbols instead of undefined ones. This memcpy happens before MarkLive kicks in. That means we may have multiple copies of dllimported symbols. So turning on one instance's Live bit is not enough. This patch moves the Live bit to dllimport file. Since multiple copies of dllsymbols still point to the same file, we can use it as the central repository to keep track of liveness. Differential Revision: https://reviews.llvm.org/D33520 llvm-svn: 303814	2017-05-24 22:30:06 +00:00
Rui Ueyama	b6632d9cd1	Revert r303304: Re-submit r303225: Garbage collect dllimported symbols. This reverts commit r303304 because it looks like the change introduced a crash bug. At least after that change, LLD with thinlto crashes when linking Chromium. llvm-svn: 303527	2017-05-22 06:01:37 +00:00
Rui Ueyama	cd41bc8dec	Re-submit r303225: Garbage collect dllimported symbols. This re-submits r303225, which was reverted in r303270 because it broke the sanitizer-windows bot. The reason for the failure is that we were writing dead symbols to the symbol table. I fixed the issue. llvm-svn: 303304	2017-05-17 21:36:08 +00:00
Hans Wennborg	e67c5f6b52	Revert r303225 "Garbage collect dllimported symbols." and follow-up r303226 "Fix Windows buildbots." This broke the sanitizer-windows buildbot. > Previously, the garbage collector (enabled by default or by explicitly > passing /opt:ref) did not kill dllimported symbols. As a result, > dllimported symbols could be added to resulting executables' dllimport > list even if no one was actually using them. > > This patch implements dllexported symbol garbage collection. Just like > COMDAT sections, dllimported symbols now have Live bits to manage their > liveness, and MarkLive marks reachable dllimported symbols. > > Fixes https://bugs.llvm.org/show_bug.cgi?id=32950 > > Reviewers: pcc > > Subscribers: llvm-commits > > Differential Revision: https://reviews.llvm.org/D33264 llvm-svn: 303270	2017-05-17 16:22:03 +00:00
Rui Ueyama	02df7a6cf1	Garbage collect dllimported symbols. Summary: Previously, the garbage collector (enabled by default or by explicitly passing /opt:ref) did not kill dllimported symbols. As a result, dllimported symbols could be added to resulting executables' dllimport list even if no one was actually using them. This patch implements dllexported symbol garbage collection. Just like COMDAT sections, dllimported symbols now have Live bits to manage their liveness, and MarkLive marks reachable dllimported symbols. Fixes https://bugs.llvm.org/show_bug.cgi?id=32950 Reviewers: pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33264 llvm-svn: 303225	2017-05-17 00:35:50 +00:00
Bob Haarman	cd7197fec3	fix nullptr dereference in COFF/Symbol.h llvm-svn: 294064	2017-02-03 23:05:17 +00:00
Bob Haarman	cde5e5b600	refactor COFF linker to use new LTO API Summary: The COFF linker previously implemented link-time optimization using an API which has now been marked as legacy. This change refactors the COFF linker to use the new LTO API, which is also used by the ELF linker. Reviewers: pcc, ruiu Reviewed By: pcc Subscribers: mgorny, mehdi_amini Differential Revision: https://reviews.llvm.org/D29059 llvm-svn: 293967	2017-02-02 23:58:14 +00:00
Rui Ueyama	ce039266c1	Merge elf::toString and coff::toString. The two overloaded functions hid each other. This patch merges them. llvm-svn: 291222	2017-01-06 10:04:08 +00:00
Rui Ueyama	9381eb1045	Remove lld/Support/Memory.h. I thought for a while about how to remove it, but it looks like we can just copy the file for now. Of course I'm not happy about that, but it's just less than 50 lines of code, and we already have duplicate code in Error.h and some other places. I want to solve them all at once later. Differential Revision: https://reviews.llvm.org/D27819 llvm-svn: 290062	2016-12-18 14:06:06 +00:00
Peter Collingbourne	6ee0b4e9f5	COFF: Open and map input files asynchronously on Windows. Profiling revealed that the majority of lld's execution time on Windows was spent opening and mapping input files. We can reduce this cost significantly by performing these operations asynchronously. This change introduces a queue for all operations on input file data. When we discover that we need to load a file (for example, when we find a lazy archive for an undefined symbol, or when we read a linker directive to load a file from disk), the file operation is launched using a future and the symbol resolution operation is enqueued. This implies another change to symbol resolution semantics, but it seems to be harmless ("ninja All" in Chromium still succeeds). To measure the perf impact of this change I linked Chromium's chrome_child.dll with both thin and fat archives. Thin archives: Before (median of 5 runs): 19.50s After: 10.93s Fat archives: Before: 12.00s After: 9.90s On Linux I found that doing this asynchronously had a negative effect on performance, probably because the cost of mapping a file is small enough that it becomes outweighed by the cost of managing the futures. So on non-Windows platforms I use the deferred execution strategy. Differential Revision: https://reviews.llvm.org/D27768 llvm-svn: 289760	2016-12-15 04:02:23 +00:00
Peter Collingbourne	e50f4854c2	COFF: Fix memory leaks reported by lsan. llvm-svn: 289451	2016-12-12 18:42:09 +00:00
Peter Collingbourne	99111287fc	COFF: Use a bit in SymbolBody to track which symbols are written to the symbol table. Using a set here caused us to take about 1 second longer to write the symbol table when linking chrome_child.dll. With this I consistently get better performance on Windows with the new symbol table. Before r289280 and with r289183 reverted (median of 5 runs): 17.65s After this change: 17.33s On Linux things look even better: Before: 10.700480444s After: 5.735681610s Differential Revision: https://reviews.llvm.org/D27648 llvm-svn: 289408	2016-12-11 22:15:20 +00:00
Peter Collingbourne	79a5e6b1b7	COFF: New symbol table design. This ports the ELF linker's symbol table design, introduced in r268178, to the COFF linker. Differential Revision: http://reviews.llvm.org/D21166 llvm-svn: 289280	2016-12-09 21:55:24 +00:00
Rui Ueyama	bf4ad1d63d	COFF: Use make() to create a new file object in createFile. llvm-svn: 289097	2016-12-08 20:20:22 +00:00
Rui Ueyama	aa4f4450af	Revert r289084: Start using make() in COFF. This reverts commit r289084 to appease buildbots. llvm-svn: 289086	2016-12-08 18:49:04 +00:00
Rui Ueyama	843233b920	Start using make() in COFF. We don't want ELF and COFF to diverge too much. llvm-svn: 289085	2016-12-08 18:31:18 +00:00
Rui Ueyama	a45d45e231	COFF: Define overloaded toString functions. Previously, we had different way to stringize SymbolBody and InputFile to construct error messages. This patch defines overloaded function toString() so that we don't need to memorize all these different function names. With that change, it is now easy to include demangled names in error messages. Now, if there is a symbol name conflict, we'll print out both mangled and demangled names. llvm-svn: 288992	2016-12-07 23:17:02 +00:00
David Majnemer	1706719972	[COFF] Remove an unused function, getFileOff The function was not used and was not functional: all paths would lead to report_fatal_error or endless stack recursion. llvm-svn: 263542	2016-03-15 09:48:18 +00:00
Rui Ueyama	de88072a00	COFF: Rename Ptr -> Repl. This pointer points to a replacement for this chunk. Ptr was not a good name. llvm-svn: 248579	2015-09-25 16:20:24 +00:00
Rui Ueyama	3cfd2bff1e	Remove dead code. llvm-svn: 248105	2015-09-20 01:19:36 +00:00
Rui Ueyama	1cce300843	COFF: Change Symbol::Body type from atomic pointer to regular pointer. I made the field an atomic pointer in hope that we would be able to parallelize the symbol resolver soon, but that's not going to happen soon. This patch reverts that change for the sake of readability. llvm-svn: 248104	2015-09-20 00:00:05 +00:00
Rui Ueyama	63dd8766ab	COFF: Remove DefinedSymbol::isLive() and markLive(). NFC. Basically the concept of "liveness" is for sections (or chunks in LLD terminology) and not for symbols. Symbols are always available or live, or otherwise it indicates a link failure. Previously, we had isLive() and markLive() methods for DefinedSymbol. They are confusing methods. What they actually did is to act as a proxy to backing section chunks. We can simply eliminate these methods and call section chunk's methods directly. llvm-svn: 247869	2015-09-16 23:55:52 +00:00
Rafael Espindola	beee25e484	Make these headers as being c++. llvm-svn: 245050	2015-08-14 14:12:54 +00:00
Rafael Espindola	5c546a1437	COFF: In chunks, store the offset from the start of the output section. NFC. This is more convenient than the offset from the start of the file as we don't have to worry about it changing when we move the output section. This is a port of r245008 from ELF. llvm-svn: 245018	2015-08-14 03:30:59 +00:00
Rui Ueyama	234afc4a0e	Remove unused `using`. llvm-svn: 244422	2015-08-09 20:38:58 +00:00
Rafael Espindola	b835ae8e4a	Port the error functions from ELF to COFF. This has a few advantages * Less C++ code (about 300 lines less). * Less machine code (about 14 KB of text on a linux x86_64 build). * It is more debugger friendly. Just set a breakpoint on the exit function and you get the complete lld stack trace of when the error was found. * It is a more robust API. The errors are handled early and we don't get a std::error_code hot potato being passed around. * In most cases the error function in a better position to print diagnostics (it has more context). llvm-svn: 244215	2015-08-06 14:58:50 +00:00
Rui Ueyama	8bc43a142b	COFF: ARM: Fix relocations to thumb code. Windows ARM is the thumb ARM environment, and pointers to thumb code need to have their LSB set. When we apply relocations, we need to adjust the LSB if it points to an executable section. llvm-svn: 243560	2015-07-29 19:25:00 +00:00
Rui Ueyama	eb26e1d03c	COFF: Fix SECREL and SECTION relocations. SECREL should set the 32-bit offset of the target from the beginning of target's output section. Previously, the offset from the beginning of source's output section was used instead. SECTION means the target section's index, and not the source section's index. This patch fixes that issue too. llvm-svn: 243535	2015-07-29 16:30:45 +00:00
Rui Ueyama	5e706b3ee3	COFF: Use short identifiers. NFC. llvm-svn: 243229	2015-07-25 21:54:50 +00:00
Rui Ueyama	28df04211c	COFF: Split ImportThunkChunk into x86 and x64. NFC. This change should make it easy to port this code to ARM. llvm-svn: 243195	2015-07-25 01:16:06 +00:00
Rui Ueyama	cd3f99b6c5	COFF: Implement Safe SEH support for x86. An object file compatible with Safe SEH contains a .sxdata section. The section contains a list of symbol table indices, each of which is an exception handler function. A safe SEH-enabled executable contains a list of exception handler RVAs. So, what the linker has to do to support Safe SEH is basically to read the .sxdata section, interpret the contents as a list of symbol indices, unique-fy and sort their RVAs, and then emit that list to .rdata. This patch implements that feature. llvm-svn: 243182	2015-07-24 23:51:14 +00:00
Rui Ueyama	3cb895c930	COFF: Fix __ImageBase symbol relocation. __ImageBase is a special symbol whose value is the image base address. Previously, we handled __ImageBase symbol as an absolute symbol. Absolute symbols point to specific locations in memory and the locations never change even if an image is base-relocated. That means that we don't have base relocation entries for absolute symbols. This is not a case for __ImageBase. If an image is base-relocated, its base address changes, and __ImageBase needs to be shifted as well. So we have to have base relocations for __ImageBase. That means that __ImageBase is not really an absolute symbol but a different kind of symbol. In this patch, I introduced a new type of symbol -- DefinedRelative. DefinedRelative is similar to DefinedAbsolute, but it has not a VA but RVA and is a subject of base relocation. Currently only __ImageBase is of the new symbol type. llvm-svn: 243176	2015-07-24 22:58:44 +00:00
Rui Ueyama	cb71c72ccc	COFF: Inline Defined::getRVA because it's very hot. llvm-svn: 242075	2015-07-13 22:01:27 +00:00
David Majnemer	3a62d3d456	COFF: Fill in the type and storage class in the symbol table We can use the type and storage class from the symbol's original object file to fill in the linked executable's symbol table. llvm-svn: 241828	2015-07-09 17:43:50 +00:00
Rui Ueyama	183f53fd22	COFF: Support isa<> for Symbol::Body, whose type is std::atomic<SymbolBody *>. llvm-svn: 241477	2015-07-06 17:45:22 +00:00
Rui Ueyama	c80c03da6c	COFF: Use atomic pointers in preparation for parallelizing. In the new design, mutation of Symbol pointers is the name resolution operation. This patch makes them atomic pointers so that they can be mutated by multiple threads safely. I'm going to use atomic compare-exchange on these pointers. dyn_cast<> doesn't recognize atomic pointers as pointers, so we need to call load(). This is unfortunate, but in other places automatic type conversion works fine. llvm-svn: 241416	2015-07-05 21:54:42 +00:00
Peter Collingbourne	2612a32ce5	COFF: Numerous fixes for interaction between LTO and weak externals. We were previously hitting assertion failures in the writer in cases where a regular object file defined a weak external symbol that was defined by a bitcode file. Because /export and /entry name mangling were implemented using weak externals, the same problem affected mangled symbol names in bitcode files. The underlying cause of the problem was that weak external symbols were being resolved before doing LTO, so the symbol table may have contained stale references to bitcode symbols. The fix here is to defer weak external symbol resolution until after LTO. Also implement support for weak external symbols in bitcode files by modelling them as replaceable DefinedBitcode symbols. Differential Revision: http://reviews.llvm.org/D10940 llvm-svn: 241391	2015-07-04 05:28:41 +00:00
Peter Collingbourne	da2f094bbb	COFF: Fix the case where an object defines a weak external and its alias. This worked before, but only by accident, and only with assertions disabled. We ended up storing a DefinedRegular symbol in the WeakAlias field, and never using it as an Undefined. Differential Revision: http://reviews.llvm.org/D10934 llvm-svn: 241376	2015-07-03 22:03:36 +00:00
Rui Ueyama	65813edfe2	COFF: Make symbols satisfy weak ordering. Previously, SymbolBody::compare(A, B) didn't satisfy weak ordering. There was a case that A < B and B < A could have been true. This is because we just pick LHS if A and B are considered equivalent. This patch makes symbols weakly ordered. If A and B are not tied, one of A < B && B > A or A > B && B < A is true. This is not an important property for a single-threaded environment because everything is deterministic anyways. However, in a multi-threaded environment, this property becomes important. If a symbol is defined or lazy, ties are resolved by its file index. For simple types that we don't really care about their identities, symbols are compared by their addresses. llvm-svn: 241294	2015-07-02 20:33:48 +00:00
Rui Ueyama	3d4c69c04d	COFF: Resolve AlternateNames using weak aliases. Previously, we use SymbolTable::rename to resolve AlternateName symbols. This patch is to merge that mechanism with weak aliases, so that we remove that function. llvm-svn: 241230	2015-07-02 02:38:59 +00:00
Rui Ueyama	0744e87fad	COFF: Rename getReplacement -> repl. The previous name was too long to my taste. llvm-svn: 241215	2015-07-02 00:21:11 +00:00
Rui Ueyama	4897596728	COFF: Change weak alias' type from SymbolBody** to SymbolBody*. NFC. llvm-svn: 241198	2015-07-01 22:32:23 +00:00
Rui Ueyama	8d3010a1a6	COFF: Change the order of adding symbols to the symbol table. Previously, the order of adding symbols to the symbol table was simple. We have a list of all input files. We read each file from beginning of the list and add all symbols in it to the symbol table. This patch changes that order. Now all archive files are added to the symbol table first, and then all the other object files are added. This shouldn't change the behavior in single-threading, and make room to parallelize in multi-threading. In the first step, only lazy symbols are added to the symbol table because archives contain only Lazy symbols. Member object files found to be necessary are queued. In the second step, defined and undefined symbols are added from object files. Adding an undefined symbol to the symbol table may cause more member files to be added to the queue. We simply continue reading all object files until the queue is empty. Finally, new archive or object files may be added to the queues by object files' directive sections (which contain new command line options). The above process is repeated until we get no new files. Symbols defined both in object files and in archives can make results undeterministic. If an archive is read before an object, a new member file gets linked, while in the other way, no new file would be added. That is the most popular cause of an undeterministic result or linking failure as I observed. Separating phases of adding lazy symbols and undefined symbols makes that deterministic. Adding symbols in each phase should be parallelizable. llvm-svn: 241107	2015-06-30 19:35:21 +00:00
Peter Collingbourne	f7b27d15f2	COFF: Implement SymbolBody::getDebugName() for DefinedBitcode symbols. Differential Revision: http://reviews.llvm.org/D10827 llvm-svn: 241029	2015-06-30 00:47:52 +00:00
Rui Ueyama	c15139bb6d	COFF: Make DefinedCOFF one pointer smaller. The size of this class actually matters because this is the most popular class among all classes. We create a Defined symbol for each defined symbol in a symbol table. That can be millions for a large program. For example, linking LLD instantiates this class millions of times. llvm-svn: 241025	2015-06-30 00:10:54 +00:00
Chandler Carruth	64c17c7d67	[opt] Devirtualize the SymbolBody type hierarchy and start compacting its members into the base class. First, to help motivate this kind of change, understand that in a self-link, LLD creates 5.5 million defined regular symbol bodies (and 6 million symbol bodies total). A significant portion of its time is spent allocating the memory for these symbols, and before this patch the defined regular symbol body objects alone consumed some 420mb of memory during the self link. As a consequence, I think it is worth expending considerable effort to make these objects as memory efficient as possible. This is the first of several components of that. This change starts with the goal of removing the virtual functions from SymbolBody so that it can avoid having a vptr embedded in it when it already contains a "kind" member, and that member can be much more compact than a vptr. The primary way of doing this is to sink as much of the logic that we would have to dispatch for into data in the base class. As part of this, I made the various flags bits that will pack into a bitfield with the kind tag. I also sank the Name down to eliminate the dispatch for that, and used LLVM's RTTI-style dispatch for everything else (most of which is cold and so doesn't matter terribly if we get minutely worse lowering than a vtable dispatch). As I was doing this, I wanted to make the RTTI-dispatch (which would become much hotter than before) as efficient as possible, so I've re-organized the tags somewhat. Notably, the common case (regular defined symbols) is now zero which we can test for faster. I also needed to rewrite the comparison routine used during resolving symbols. This proved to be quite complex as the semantics of the existing one were very subtle due to the back-and-forth virtual dispatch caused by re-dispatching with reversed operands. I've consolidated it to a single function and tried to comment it quite a bit more to help explain what is going on. However, this may need more comments or other explanations. It at least passes all the regression tests. I'm not working on Windows, so I can't fully test it. With all of these changes, the size of a DefinedRegular symbol on a 64-bit build goes from 80 bytes to 64 bytes, and we save approximately 84mb or 20% of the memory consumed by these symbol bodies during the link. The link time appears marginally faster as well, and the profile hotness of the memory allocation subsystem got a bit better, but there is still a lot of allocation traffic. Differential Revision: http://reviews.llvm.org/D10792 llvm-svn: 241001	2015-06-29 21:35:48 +00:00
Chandler Carruth	59013c387e	[opt] Replace the recursive walk for GC with a worklist algorithm. This flattens the entire liveness walk from a recursive mark approach to a worklist approach. It also sinks the worklist management completely out of the SectionChunk and into the Writer by exposing the ability to iterate over children of a chunk and over the symbol bodies of relocated symbols. I'm not 100% happy with the API names, so suggestions welcome there. This allows us to use a single worklist for the entire recursive walk and would also be a natural place to take advantage of parallelism at some future point. With this, we completely inline away the GC walk into the Writer::markLive function and it makes it very easy to profile what is slow. Currently, time is being wasted checking whether a Chunk isa SectionChunk (it essentially always is), finding (or skipping) a replacement for a symbol, and chasing pointers between symbols and their chunks. There are a bunch of things we can do to fix this, and it's easier to do them after this change IMO. This change alone saves 1-2% of the time for my self-link of lld.exe (which I'm running and benchmarking on Linux ironically). Perhaps more notably, we'll no longer blow out the stack for large links. =] Just as an FYI, at this point, I/O is starting to really dominate the profile. Well over 10% of the time appears to be inside the kernel doing page table silliness. I think a decent chunk of this can be nuked as well, but it's a little odd as cross-linking in this way isn't really the primary goal here. Differential Revision: http://reviews.llvm.org/D10790 llvm-svn: 240995	2015-06-29 21:12:49 +00:00
Rui Ueyama	871847e32d	COFF: Fix ICF correctness bug. When comparing two COMDAT sections, we need to take section values and associative sections into account. This patch fixes that bug. It fixes a crash bug of llvm-tblgen when linked with /opt:lldicf. One thing I don't understand yet is that this logic seems to be too strict. MSVC linker is able to create more compact executables (which of course work correctly). With this ICF algorithm, LLD is able to make executable smaller, but the outputs are larger than MSVC's. There must be something I'm missing here. llvm-svn: 240897	2015-06-28 01:30:54 +00:00
Peter Collingbourne	be54955bba	COFF: Implement /lldmap flag. This flag can be used to produce a map file, which is essentially a list of objects linked into the final output file together with the RVAs of their symbols. Because our format differs from MSVC's we expose it as a separate flag. Differential Revision: http://reviews.llvm.org/D10773 llvm-svn: 240812	2015-06-26 18:58:24 +00:00
Rui Ueyama	ccde19d77e	COFF: Fix local absolute symbols. Absolute symbols were always handled as external symbols, so if two or more object files define the same absolute symbol, they would conflict even if the symbol is private to each file. This patch fixes that bug. llvm-svn: 240756	2015-06-26 03:09:23 +00:00
Rui Ueyama	68633f1719	COFF: Better error message for duplicate symbols. Now the symbol table prints out not only symbol names but also file names for duplicate symbols. llvm-svn: 240719	2015-06-25 23:22:00 +00:00
Rui Ueyama	9b921e5dc9	COFF: Merge DefinedRegular and DefinedCOMDAT. I split them in r240319 because I thought they are different enough that we should treat them as different types. It turned out that that was not a good idea. They are so similar that we ended up having many duplicate code. llvm-svn: 240706	2015-06-25 22:00:42 +00:00
Rui Ueyama	fc510f4cf8	COFF: Devirtualize mark(), markLive() and isCOMDAT(). Only SectionChunk can be dead-stripped. Previously, all types of chunks implemented these functions, but their functions were blank. Likewise, only DefinedRegular and DefinedCOMDAT symbols can be dead-stripped. markLive() function was implemented for other symbol types, but they were blank. I started thinking that the change I made in r240319 was a mistake. I separated DefinedCOMDAT from DefinedRegular because I thought that would make the code cleaner, but now we want to handle them as the same type here. Maybe we should roll it back. This change should improve readability a bit as this removes some dubious uses of reinterpret_cast. Previously, we assumed that all COMDAT chunks are actually SectionChunks, which was not very obvious. llvm-svn: 240675	2015-06-25 19:10:58 +00:00
Rui Ueyama	88e0f9206b	COFF: Fix a bug of __imp_ symbol. The change I made in r240620 was not correct. If a symbol foo is defined, and if you use __imp_foo, __imp_foo symbol is automatically defined as a pointer (not just an alias) to foo. Now that we need to create a chunk for automatically-created symbols. I defined LocalImportChunk class for them. llvm-svn: 240622	2015-06-25 03:31:47 +00:00
Rui Ueyama	ddf71fc370	COFF: Initial implementation of Identical COMDAT Folding. Identical COMDAT Folding (ICF) is an optimization to reduce binary size by merging COMDAT sections that contain the same metadata, actual data and relocations. MSVC link.exe and many other linkers have this feature. LLD is on par with MSVC in terms of produced binary size with this patch. This technique is pretty effective. For example, LLD's size is reduced from 64MB to 54MB by enabling this optimization. The algorithm implemented in this patch is extremely inefficient. It puts all COMDAT sections into a set to identify duplicates. Times to self-link without and with ICF are 3.3 and 320 seconds, respectively. So this option roughly makes LLD 100x slower. But it's okay as I wanted to achieve correctness first. LLD is still able to link itself with this optimization. I'm going to make it more efficient in followup patches. Note that this optimization is not entirely safe. C/C++ require that different functions have different addresses. If your program relies on that property, your program wouldn't work with ICF. However, it's not going to be an issue on Windows because MSVC link.exe turns ICF on by default. As long as your program works with default settings (or not passing /opt:noicf), your program would work with LLD too. llvm-svn: 240519	2015-06-24 04:36:52 +00:00
Rui Ueyama	617f5ccb5c	COFF: Separate DefinedCOMDAT from DefinedRegular symbol type. NFC. Before this change, you had to cast a symbol to DefinedRegular and then call isCOMDAT() to determine if a given symbol is a COMDAT symbol. Now you can just use isa<DefinedCOMDAT>(). As to the class definition of DefinedCOMDAT, I could remove duplicate code from DefinedRegular and DefinedCOMDAT by introducing another base class for them, but I chose to not do that to keep the class hierarchy shallow. This amount of code duplication isn't worth defining a new class for. llvm-svn: 240319	2015-06-22 19:56:01 +00:00
Rui Ueyama	efb7e1aa29	COFF: Fix a common symbol bug. This is a case that one mistake caused a very mysterious bug. I made a mistake to calculate addresses of common symbols, so each common symbol pointed not to the beginning of its location but to the end of its location. (Ouch!) Common symbols are aligned on 16 byte boundaries. If a common symbol is small enough to fit between the end of its real location and whatever comes next, this bug didn't cause any harm. However, if a common symbol is larger than that, its memory naturally overlapped with other symbols. That means some uninitialized variables accidentally shared memory. Because totally unrelated memory writes mutated other variables, it was hard to debug. It's surprising that LLD was able to link itself and all LLD tests except gunit tests passed with this nasty bug. With this fix, the new COFF linker is able to pass all tests for LLVM, Clang and LLD if I use MSVC cl.exe as a compiler. Only three tests are failing when used with clang-cl. llvm-svn: 240216	2015-06-20 07:21:57 +00:00
Rui Ueyama	e25147626c	COFF: Simplify SymbolBody::compare(SymbolBody *Other). We are currently handling all combinations of SymbolBody types directly. This patch is to flip this and Other if Other->kind() < this->kind() to reduce number of combinations. No functionality change intended. llvm-svn: 239745	2015-06-15 19:06:53 +00:00
Peter Collingbourne	1b6fd1f5fd	COFF: Symbol resolution for common and comdat symbols defined in bitcode. In the case where either a bitcode file and a regular file or two bitcode files export a common or comdat symbol with the same name, the linker needs to pick one of them following COFF semantics. This patch implements a design for resolving such symbols that pushes most of the work onto either LLD's regular mechanism for resolving common or comdat symbols or the IR linker's mechanism for doing the same. We modify SymbolBody::compare to always prefer non-bitcode symbols, so that during the initial phase of symbol resolution, the symbol table always contains a regular symbol in any case where we need to choose between a regular and a bitcode symbol. In SymbolTable::addCombinedLTOObject, we force export any bitcode symbols that were initially pre-empted by a regular symbol, and later use SymbolBody::compare to choose between the regular symbol in the symbol table and the regular symbol from the combined LTO object file. This design seems to be sound, so long as the resolution mechanism is defined to be commutative and associative modulo arbitrary choices between symbols (which seems to be the case for COFF). Differential Revision: http://reviews.llvm.org/D10329 llvm-svn: 239563	2015-06-11 21:49:54 +00:00
Rui Ueyama	57fe78d339	COFF: Read symbol names lazily. This change seems to make the linker about 10% faster. Reading symbol name is not very cheap because it needs strlen() on the string table. We were wasting time on reading non-external symbol names that would never be used by the linker. llvm-svn: 239332	2015-06-08 19:43:59 +00:00
Rui Ueyama	2d7627198f	Fix typo. llvm-svn: 238937	2015-06-03 16:50:41 +00:00
Rui Ueyama	fd99e01b91	COFF: Support import-by-ordinal DLL imports. Symbols exported by DLLs can be imported not by name but by small number or ordinal. Usually, symbols have both ordinals and names, and in that case ordinals are called "hints" and used by the loader as hints. However, symbols can have only ordinals. They are called import-by-ordinal symbols. You need to manage ordinals by hand so that they will never change if you choose to use the feature. But it's supposed to make dynamic linking faster because it needs no string comparison. Not sure if that claim still stands in year 2015, though. Anyways, the feature exists, and this patch implements that. llvm-svn: 238780	2015-06-01 21:05:27 +00:00
Peter Collingbourne	60c1616613	COFF: Initial implementation of link-time optimization. This implementation is known to work in very simple cases (see new test case). Differential Revision: http://reviews.llvm.org/D10115 llvm-svn: 238777	2015-06-01 20:10:10 +00:00
Rui Ueyama	68216c680d	Fix comments. llvm-svn: 238718	2015-06-01 03:55:02 +00:00
Rui Ueyama	7c4fcdd559	COFF: Move Windows-specific function under Windows-specific marker. llvm-svn: 238563	2015-05-29 15:49:09 +00:00
Rui Ueyama	c9bfe32010	COFF: Fill import table HintName field. Currently we set the field to zero, but as per the spec, we should set numbers we read from import library files. The loader uses the values as starting offsets for binary search when looking up imported symbols from DLL. llvm-svn: 238562	2015-05-29 15:45:35 +00:00
Rui Ueyama	411c636081	COFF: Add a new PE/COFF port. This is an initial patch for a section-based COFF linker. The patch has 2300 lines of code including comments and blank lines. Before diving into details, you want to start from reading README because it should give you an overview of the design. All important things are written in the README file, so I write a summary here. - The linker is already able to self-link on Windows. - It's significantly faster than the existing implementation. The existing one takes 5 seconds to link LLD on my machine, while the new one only takes 1.2 seconds, even though the new one is not multi-threaded yet. (And a proof-of-concept multi-threaded version was able to link it in 0.5 seconds.) - It uses much less memory (250MB vs. 2GB virtual memory space to self-host). - IMHO the new code is much simpler and easier to read than the existing PE/COFF port. http://reviews.llvm.org/D10036 llvm-svn: 238458	2015-05-28 19:09:30 +00:00


118 Commits