llvm-project

Commit Graph

Author	SHA1	Message	Date
Rui Ueyama	cd3f99b6c5	COFF: Implement Safe SEH support for x86. An object file compatible with Safe SEH contains a .sxdata section. The section contains a list of symbol table indices, each of which is an exception handler function. A safe SEH-enabled executable contains a list of exception handler RVAs. So, what the linker has to do to support Safe SEH is basically to read the .sxdata section, interpret the contents as a list of symbol indices, unique-fy and sort their RVAs, and then emit that list to .rdata. This patch implements that feature. llvm-svn: 243182	2015-07-24 23:51:14 +00:00
Rui Ueyama	3cb895c930	COFF: Fix __ImageBase symbol relocation. __ImageBase is a special symbol whose value is the image base address. Previously, we handled the __ImageBase symbol as an absolute symbol. Absolute symbols point to specific locations in memory, and the locations never change even if an image is base-relocated. That means that we don't have base relocation entries for absolute symbols. This is not the case for __ImageBase. If an image is base-relocated, its base address changes, and __ImageBase needs to be shifted as well. So we have to have base relocations for __ImageBase. That means that __ImageBase is not really an absolute symbol but a different kind of symbol. In this patch, I introduced a new type of symbol -- DefinedRelative. DefinedRelative is similar to DefinedAbsolute, but it has an RVA rather than a VA and is subject to base relocation. Currently only __ImageBase is of the new symbol type. llvm-svn: 243176	2015-07-24 22:58:44 +00:00
Rui Ueyama	afad42f9ea	COFF: Set Load Configuration entry in Data Directory. Load Configuration field points to a structure containing information for SEH. That data structure is not created by the linker but provided by an external file. All we have to do is set the address of __load_config_used in the header. llvm-svn: 242427	2015-07-16 18:30:35 +00:00
Rui Ueyama	6d24908fe7	COFF: Fix x86 delay-load helper function name. If /delayload option is given, we have to resolve __delayLoadHelper2 since the function is the dynamic loader to delay-load DLLs. The function name is mangled in x86 as ___delayLoadHelper2@8. llvm-svn: 242078	2015-07-13 22:31:45 +00:00
Rui Ueyama	e59a530a6c	COFF: Split createSymbolAndSymbolTable to small functions. NFC. llvm-svn: 242066	2015-07-13 20:56:31 +00:00
Rui Ueyama	ea533cde30	COFF: Infer machine type earlier than before. Previously, we inferred machine type at the very end of linking after all symbols are resolved. That's actually too late because machine type affects how we mangle symbols (whether or not we need to add "_"). For example, /entry:foo adds "_foo" to the symbol table if x86 but "foo" if x64. This patch moves the code to infer machine type, so that machine type is inferred based on input files given via the command line (but not based on .directives files). llvm-svn: 241843	2015-07-09 19:54:13 +00:00
David Majnemer	3a62d3d456	COFF: Fill in the type and storage class in the symbol table. We can use the type and storage class from the symbol's original object file to fill in the linked executable's symbol table. llvm-svn: 241828	2015-07-09 17:43:50 +00:00
Rui Ueyama	1b53ec796a	COFF: Remove Writer::Is64 and use Config::is64 instead. NFC. llvm-svn: 241819	2015-07-09 16:40:39 +00:00
Rui Ueyama	7c3e23fffd	COFF: Fix import thunks and name mangling for x86. With this patch, LLD is now able to correctly link a "hello world" program written in assembly for 32-bit x86. llvm-svn: 241771	2015-07-09 01:25:49 +00:00
Rui Ueyama	dcb46d6a74	COFF: Remove dead code. r241647 made Driver infer machine type, so this code is not actually in use. llvm-svn: 241720	2015-07-08 20:35:29 +00:00
David Majnemer	2c345a337c	COFF: Emit a symbol table if /debug is specified. Providing a symbol table in the executable is quite useful when debugging a fully-linked executable without having to reconstruct one from DWARF. Differential Revision: http://reviews.llvm.org/D11023 llvm-svn: 241689	2015-07-08 16:37:50 +00:00
Rui Ueyama	11863b4ae1	COFF: Support x86 file header and relocations. llvm-svn: 241657	2015-07-08 01:45:29 +00:00
Rui Ueyama	183f53fd22	COFF: Support isa<> for Symbol::Body, whose type is std::atomic<SymbolBody *>. llvm-svn: 241477	2015-07-06 17:45:22 +00:00
Rui Ueyama	92a8c82076	COFF: Set TLS table header field. TLS table header field is supposed to have address and size of TLS table. The linker doesn't have to understand what TLS table is. TLS table's name is always "_tls_used", so if there's that symbol, the linker simply sets that symbol's RVA to the header. The size of the TLS table is always 40 bytes. llvm-svn: 241426	2015-07-06 01:48:01 +00:00
Rui Ueyama	c80c03da6c	COFF: Use atomic pointers in preparation for parallelizing. In the new design, mutation of Symbol pointers is the name resolution operation. This patch makes them atomic pointers so that they can be mutated by multiple threads safely. I'm going to use atomic compare-exchange on these pointers. dyn_cast<> doesn't recognize atomic pointers as pointers, so we need to call load(). This is unfortunate, but in other places automatic type conversion works fine. llvm-svn: 241416	2015-07-05 21:54:42 +00:00
Rui Ueyama	6600eb18cd	COFF: Implement /merge option. /merge:.foo=.bar makes the linker merge section .foo with section .bar. llvm-svn: 241396	2015-07-04 23:37:32 +00:00
Rui Ueyama	b0398827c2	COFF: Fix bug in garbage collector. GC root may have non-regular defined symbols, such as DefinedImportThunk, so this cast<> was a wrong assumption. llvm-svn: 241382	2015-07-04 01:10:32 +00:00
Rui Ueyama	7a247ee242	COFF: Fix a bug that /delayload was case-sensitive. llvm-svn: 241316	2015-07-03 01:40:14 +00:00
Rui Ueyama	458d74421b	COFF: Merge SymbolTable::find{,Symbol}. NFC llvm-svn: 241238	2015-07-02 03:59:04 +00:00
Rui Ueyama	0744e87fad	COFF: Rename getReplacement -> repl. The previous name was too long for my taste. llvm-svn: 241215	2015-07-02 00:21:11 +00:00
Rui Ueyama	18f8d2c5c0	COFF: Change GCRoot member type from StringRef to Undefined. NFC. I think Undefined symbols are a bit more convenient than StringRefs since SymbolBodies are handles for symbols. You can get resolved symbols for undefined symbols just by calling getReplacement without looking up the symbol table. llvm-svn: 241214	2015-07-02 00:21:08 +00:00
Rui Ueyama	6bf638e688	COFF: Simplify and rename findMangle. NFC. Occasionally we have to resolve an undefined symbol to its mangled symbol. Previously, we did that on calling side of findMangle by explicitly updating SymbolBody. In this patch, mangled symbols are handled as weak aliases for undefined symbols. llvm-svn: 241213	2015-07-02 00:04:14 +00:00
Chandler Carruth	59013c387e	[opt] Replace the recursive walk for GC with a worklist algorithm. This flattens the entire liveness walk from a recursive mark approach to a worklist approach. It also sinks the worklist management completely out of the SectionChunk and into the Writer by exposing the ability to iterate over children of a chunk and over the symbol bodies of relocated symbols. I'm not 100% happy with the API names, so suggestions welcome there. This allows us to use a single worklist for the entire recursive walk and would also be a natural place to take advantage of parallelism at some future point. With this, we completely inline away the GC walk into the Writer::markLive function and it makes it very easy to profile what is slow. Currently, time is being wasted checking whether a Chunk isa SectionChunk (it essentially always is), finding (or skipping) a replacement for a symbol, and chasing pointers between symbols and their chunks. There are a bunch of things we can do to fix this, and it's easier to do them after this change IMO. This change alone saves 1-2% of the time for my self-link of lld.exe (which I'm running and benchmarking on Linux ironically). Perhaps more notably, we'll no longer blow out the stack for large links. =] Just as an FYI, at this point, I/O is starting to really dominate the profile. Well over 10% of the time appears to be inside the kernel doing page table silliness. I think a decent chunk of this can be nuked as well, but it's a little odd as cross-linking in this way isn't really the primary goal here. Differential Revision: http://reviews.llvm.org/D10790 llvm-svn: 240995	2015-06-29 21:12:49 +00:00
Rui Ueyama	a8b60458ea	COFF: Add /noentry flag. This option is sometimes used to create a resource-only DLL that doesn't need any initialization. llvm-svn: 240915	2015-06-28 19:56:30 +00:00
Rui Ueyama	382dc96e29	COFF: Fix delay-import tables. There were a few issues with the previous delay-import tables. - "Attribute" field should have been 1 instead of 0. (I don't know the meaning of this field, though.) - LEA and CALL operands had wrong addresses. - Address tables are in .didat (which is read-only). They should have been in .data. llvm-svn: 240837	2015-06-26 21:40:15 +00:00
Rui Ueyama	9b921e5dc9	COFF: Merge DefinedRegular and DefinedCOMDAT. I split them in r240319 because I thought they were different enough that we should treat them as different types. It turned out that that was not a good idea. They are so similar that we ended up having a lot of duplicate code. llvm-svn: 240706	2015-06-25 22:00:42 +00:00
Rui Ueyama	b0c001c055	COFF: Remove dead code. llvm-svn: 240682	2015-06-25 20:12:15 +00:00
Rui Ueyama	fc510f4cf8	COFF: Devirtualize mark(), markLive() and isCOMDAT(). Only SectionChunk can be dead-stripped. Previously, all types of chunks implemented these functions, but their functions were blank. Likewise, only DefinedRegular and DefinedCOMDAT symbols can be dead-stripped. markLive() function was implemented for other symbol types, but they were blank. I started thinking that the change I made in r240319 was a mistake. I separated DefinedCOMDAT from DefinedRegular because I thought that would make the code cleaner, but now we want to handle them as the same type here. Maybe we should roll it back. This change should improve readability a bit as this removes some dubious uses of reinterpret_cast. Previously, we assumed that all COMDAT chunks are actually SectionChunks, which was not very obvious. llvm-svn: 240675	2015-06-25 19:10:58 +00:00
Rui Ueyama	88e0f9206b	COFF: Fix a bug of __imp_ symbol. The change I made in r240620 was not correct. If a symbol foo is defined, and if you use __imp_foo, __imp_foo symbol is automatically defined as a pointer (not just an alias) to foo. Now we need to create a chunk for automatically-created symbols, so I defined LocalImportChunk class for them. llvm-svn: 240622	2015-06-25 03:31:47 +00:00
Rui Ueyama	49560c7a10	COFF: Move code for ICF from Writer.cpp to ICF.cpp. llvm-svn: 240590	2015-06-24 20:40:03 +00:00
Rui Ueyama	ddf71fc370	COFF: Initial implementation of Identical COMDAT Folding. Identical COMDAT Folding (ICF) is an optimization to reduce binary size by merging COMDAT sections that contain the same metadata, actual data and relocations. MSVC link.exe and many other linkers have this feature. LLD achieves on par with MSVC in terms of produced binary size with this patch. This technique is pretty effective. For example, LLD's size is reduced from 64MB to 54MB by enabling this optimization. The algorithm implemented in this patch is extremely inefficient. It puts all COMDAT sections into a set to identify duplicates. Times to self-link without/with ICF are 3.3 and 320 seconds, respectively. So this option roughly makes LLD 100x slower. But it's okay as I wanted to achieve correctness first. LLD is still able to link itself with this optimization. I'm going to make it more efficient in followup patches. Note that this optimization is not entirely safe. C/C++ require that different functions have different addresses. If your program relies on that property, your program wouldn't work with ICF. However, it's not going to be an issue on Windows because MSVC link.exe turns ICF on by default. As long as your program works with default settings (or not passing /opt:noicf), your program would work with LLD too. llvm-svn: 240519	2015-06-24 04:36:52 +00:00
Rui Ueyama	0d2e999050	COFF: Make link order compatible with MSVC link.exe. Previously, we added files in directive sections to the symbol table as we read the sections, so the link order was depth-first. That's not compatible with MSVC link.exe nor the old LLD. This patch is to queue files so that new files are added to the end of the queue and processed last. Now addFile() doesn't parse files nor resolve symbols. You need to call run() to process queued files. llvm-svn: 240483	2015-06-23 23:56:39 +00:00
Rui Ueyama	a77336bd5d	COFF: Support delay-load import tables. DLLs are usually resolved at process startup, but you can delay-load them by passing /delayload option to the linker. If /delayload is specified, the linker has to create data which is similar to a regular import table. One notable difference is that the pointers in a delay-load import table initially point to thunks that resolve themselves. Each thunk loads a DLL, resolves its name, and then overwrites the pointer with the result so that subsequent function calls directly call a desired function. The linker has to emit thunks. llvm-svn: 240250	2015-06-21 22:31:52 +00:00
Rui Ueyama	1a109285c2	COFF: Use short variable name. NFC. llvm-svn: 240232	2015-06-21 04:10:54 +00:00
Rui Ueyama	4d769c3a57	COFF: Support exception table. .pdata section contains a list of triplets of function start address, function end address and its unwind information. Linkers have to sort section contents by function start address and set the section address to the file header (so that runtime is able to find it and do binary search.) This change seems to resolve all but one remaining test failures in check{,-clang,-lld} when building the entire stuff with clang-cl and lld-link. llvm-svn: 240231	2015-06-21 04:00:54 +00:00
Rui Ueyama	97dff9ee3a	COFF: Support creating DLLs. DLL files are in the same format as executables but they have export tables. The format of the export table is described in PE/COFF spec section 5.3. A new class, EdataContents, takes care of creating chunks for export tables. What we need to do is to parse command line flags for dllexports, and then instantiate the class to create chunks. For the writer, export table chunks are opaque data -- it just adds chunks to .edata section. llvm-svn: 239869	2015-06-17 00:16:33 +00:00
Rui Ueyama	6592ff8c93	COFF: Add miscellaneous boolean flags. llvm-svn: 239864	2015-06-16 23:13:00 +00:00
Rui Ueyama	f3770d3edb	COFF: Use ulittle32_t::operator|=. NFC. llvm-svn: 239717	2015-06-15 03:03:23 +00:00
Rui Ueyama	59e9578f20	COFF: Fix resource table size. The size field shouldn't include trailing padding. llvm-svn: 239712	2015-06-15 01:35:56 +00:00
Rui Ueyama	588e832d0a	COFF: Support base relocations. PE/COFF executables/DLLs usually contain data which is called base relocations. Base relocations are a list of addresses that need to be fixed by the loader if load-time relocation is needed. Base relocations are in .reloc section. We emit one base relocation entry for each IMAGE_REL_AMD64_ADDR64 relocation. In order to save disk space, base relocations are grouped by page. Each group is called a block. A block starts with a 32-bit page address followed by 16-bit offsets in the page. That is a more efficient representation of addresses than just an array of 32-bit addresses. llvm-svn: 239710	2015-06-15 01:23:58 +00:00
Rui Ueyama	9a03362a08	COFF: Change const name. NFC. llvm-svn: 239707	2015-06-14 22:21:29 +00:00
Rui Ueyama	669236fef3	COFF: Set Chunk to OutputSection backreference in addChunk(). When we add a chunk to an OutputSection, we always want to create a backreference from an OutputSection to a Chunk. To make sure we always do, do that in addChunk(). NFC. llvm-svn: 239706	2015-06-14 22:16:47 +00:00
Rui Ueyama	2bf6a12238	COFF: Support Windows resource files. Resource files are data files containing i18n messages, icon images, etc. MSVC has a tool to convert a resource file to a regular COFF file so that you can just link that file to embed resources to an executable. However, you can directly pass resource files to the linker. If you do that, the linker invokes the tool automatically. This patch implements that feature. llvm-svn: 239704	2015-06-14 21:50:50 +00:00
Rui Ueyama	f533d3e09d	COFF: Avoid calling stable_sort. MSVC profiler reported that this stable_sort takes 7% time when self-linking. As a result, createSection was taking 10% time. Now createSection takes 3%. This small change actually makes the linker slightly but perceptibly faster. llvm-svn: 239292	2015-06-08 08:26:28 +00:00
Rui Ueyama	e2cbfeae5c	COFF: Add /opt:noref option. This option disables dead-stripping. llvm-svn: 239243	2015-06-07 03:17:42 +00:00
Rui Ueyama	4a9fbbca9f	COFF: Add comments and move main function to the top. NFC. llvm-svn: 239237	2015-06-06 23:32:08 +00:00
Rui Ueyama	cc608e4f35	COFF: Rename writeHeader -> writeHeaderTo. Chunk has writeTo function which takes uint8_t Buf. writeHeaderTo feels more consistent with that because this member function also takes uint8_t Buf. llvm-svn: 239236	2015-06-06 23:19:38 +00:00
Rui Ueyama	929d8c52b1	COFF: Inline a constant that is used only once. llvm-svn: 239235	2015-06-06 23:19:36 +00:00
Rui Ueyama	55168c9f70	COFF: Add .didat section. llvm-svn: 239233	2015-06-06 23:07:01 +00:00
Rui Ueyama	458df98869	COFF: Update comments. llvm-svn: 239232	2015-06-06 22:56:55 +00:00
Rui Ueyama	c6ea057d7f	COFF: Move .idata constructor from Writer to Chunk. Previously, half of the constructor for .idata contents was in Chunks.cpp and the rest was in Writer.cpp. This patch moves the latter to Chunks.cpp. Now IdataContents class manages everything for .idata section. llvm-svn: 239230	2015-06-06 22:46:15 +00:00
Rui Ueyama	743afa0736	COFF: Merge Chunk::applyRelocations with Chunk::writeTo. In this design, Chunk is the only thing that knows how to write its contents to output file as well as how to apply relocations there. The writer shouldn't know about the details. llvm-svn: 239216	2015-06-06 04:07:39 +00:00
Rui Ueyama	eb262ce4b6	COFF: /include'd symbols must be preserved. Not only entry point symbol but also symbols specified by /include option must be preserved, as they will never be dead-stripped. http://reviews.llvm.org/D10220 llvm-svn: 239005	2015-06-04 02:12:16 +00:00
Rui Ueyama	bda72a4af4	COFF: Change OutputSections' type from vector<unique_ptr<T>> to vector<T*>. This is mainly for readability. OutputSection objects are still owned by the writer using SpecificBumpPtrAllocator. llvm-svn: 238936	2015-06-03 16:44:00 +00:00
Rui Ueyama	c2abdd9152	COFF: Use Chunk instead of its derived classes. I'm adding ordinal-only (nameless) imports to the import table. The chunk for that type is going to be different from LookupChunk. Without this change, we cannot add objects of the new type to the vectors. llvm-svn: 238779	2015-06-01 21:05:24 +00:00
Rui Ueyama	68216c680d	Fix comments. llvm-svn: 238718	2015-06-01 03:55:02 +00:00
Rui Ueyama	8fd9fb9857	COFF: Define an error category for the linker. Instead of returning non-categorized errors, return categorized errors. All uses of make_dynamic_error_code are removed. Because we don't have error reporting mechanism, I just chose to print out error messages to stderr, and then return an error object. Not sure if that's the right thing to do, but at least it seems practical. http://reviews.llvm.org/D10129 llvm-svn: 238714	2015-06-01 02:58:15 +00:00
Rui Ueyama	e00d651071	Use initializer instead of memset to zero out. llvm-svn: 238662	2015-05-30 19:28:58 +00:00
Rui Ueyama	bfb4aa1791	COFF: Support long section name. Section names were truncated to 8 bytes because the section table's name field is 8 bytes long. This patch creates the string table to store long names. llvm-svn: 238661	2015-05-30 19:09:50 +00:00
Peter Collingbourne	246ccc5f51	COFF: Move machine type auto-detection to SymbolTable. The new mechanism is less code, and fixes the case where all inputs are archives. Differential Revision: http://reviews.llvm.org/D10136 llvm-svn: 238618	2015-05-29 21:47:36 +00:00
Rui Ueyama	15cc47ee81	COFF: Add /subsystem option. llvm-svn: 238571	2015-05-29 16:34:31 +00:00
Rui Ueyama	b9dcdb5fc9	COFF: Add /version option. llvm-svn: 238570	2015-05-29 16:28:29 +00:00
Rui Ueyama	c377e9aefe	COFF: Add /heap option. llvm-svn: 238569	2015-05-29 16:23:40 +00:00
Rui Ueyama	b41b7e5a69	Add /stack option. llvm-svn: 238568	2015-05-29 16:21:11 +00:00
Rui Ueyama	3d3e6fba6e	COFF: Add /machine option. llvm-svn: 238564	2015-05-29 16:06:00 +00:00
Rui Ueyama	322b2c413d	COFF: Return an error_code directly. llvm-svn: 238486	2015-05-28 20:39:29 +00:00
Rui Ueyama	d6fefba447	COFF: Teach Chunk to write to a mmap'ed output file. Previously Writer directly handled writes to a file. Chunks needed to give Writer a continuous chunk of memory. That was inefficient if you construct data in chunks because it would require two memory copies (one to construct a chunk and the other to write that to a file). This patch teaches chunk to write directly to a file. From readability point of view, this is also good because you no longer have to call hasData() before calling getData(). llvm-svn: 238464	2015-05-28 19:45:43 +00:00
Rui Ueyama	411c636081	COFF: Add a new PE/COFF port. This is an initial patch for a section-based COFF linker. The patch has 2300 lines of code including comments and blank lines. Before diving into details, you want to start from reading README because it should give you an overview of the design. All important things are written in the README file, so I only write a summary here. - The linker is already able to self-link on Windows. - It's significantly faster than the existing implementation. The existing one takes 5 seconds to link LLD on my machine, while the new one only takes 1.2 seconds, even though the new one is not multi-threaded yet. (And a proof-of-concept multi-threaded version was able to link it in 0.5 seconds.) - It uses much less memory (250MB vs. 2GB virtual memory space to self-host). - IMHO the new code is much simpler and easier to read than the existing PE/COFF port. http://reviews.llvm.org/D10036 llvm-svn: 238458	2015-05-28 19:09:30 +00:00
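
The base relocation commit above (588e832d0a) describes grouping relocated addresses by page into blocks. A minimal C++ sketch of that grouping follows. It is not LLD's code: `BaserelBlock`, `groupByPage` and `blockSize` are hypothetical names chosen for illustration. The sketch also assumes the on-disk layout from the PE/COFF specification, where each block header carries a 32-bit block size after the page RVA, each 16-bit entry holds a 4-bit type and a 12-bit offset, and blocks are padded to a 4-byte boundary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types and names for illustration; not LLD's actual classes.
struct BaserelBlock {
  uint32_t PageRVA;              // RVA of the 4 KiB page this block covers
  std::vector<uint16_t> Entries; // each entry: (type << 12) | offset in page
};

constexpr uint32_t PageSize = 4096;
constexpr uint16_t RelBasedAbsolute = 0; // IMAGE_REL_BASED_ABSOLUTE (padding)
constexpr uint16_t RelBasedDir64 = 10;   // IMAGE_REL_BASED_DIR64

// Groups the RVAs of 64-bit absolute relocations into per-page blocks.
std::vector<BaserelBlock> groupByPage(std::vector<uint32_t> RVAs) {
  std::sort(RVAs.begin(), RVAs.end());
  std::vector<BaserelBlock> Blocks;
  for (uint32_t RVA : RVAs) {
    uint32_t Page = RVA & ~(PageSize - 1);
    // Start a new block whenever the page changes.
    if (Blocks.empty() || Blocks.back().PageRVA != Page)
      Blocks.push_back({Page, {}});
    uint32_t Off = RVA - Page;
    assert(Off < PageSize && "offset must fit in 12 bits");
    Blocks.back().Entries.push_back(uint16_t((RelBasedDir64 << 12) | Off));
  }
  // Pad to an even entry count so every block stays 4-byte aligned.
  for (BaserelBlock &B : Blocks)
    if (B.Entries.size() % 2)
      B.Entries.push_back(RelBasedAbsolute << 12);
  return Blocks;
}

// On-disk size: 4-byte page RVA + 4-byte block size + 2 bytes per entry.
size_t blockSize(const BaserelBlock &B) { return 8 + 2 * B.Entries.size(); }
```

Under these assumptions a page with N relocations costs 8 + 2N bytes (plus up to 2 bytes of padding), compared with 4N bytes for a flat array of 32-bit addresses, so the grouping pays off once a page holds more than a handful of relocations.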

168 Commits