llvm-project

Commit Graph

Author	SHA1	Message	Date
Reid Kleckner	11c141eb68	[COFF] Remove finalizeContents virtual method from Chunk, NFC This only needs to be done for MergeChunks, so just do that in a separate pass in the Writer. This is one small step towards eliminating the vtable in Chunk. llvm-svn: 361573	2019-05-24 00:02:00 +00:00
Reid Kleckner	ee4e0a2942	Re-land r361206 "[COFF] Store alignment in log2 form, NFC" The previous patch lost the call to PowerOf2Ceil, which causes LLD to crash when handling common symbols with a non-power-of-2 size. I tweaked the existing common.test to make the bsspad16 common symbol be 15 bytes to add coverage for this case. llvm-svn: 361426	2019-05-22 20:21:52 +00:00
Nico Weber	67510fac36	Revert r361206 "[COFF] Store alignment in log2 form, NFC" Makes the linker crash when linking nasm.exe. llvm-svn: 361212	2019-05-21 02:06:59 +00:00
Reid Kleckner	1a5cc629de	[COFF] Store alignment in log2 form, NFC Summary: Valid section or chunk alignments are powers of 2 in the range [1, 8192]. These can be stored more canonically in log2 form to free up some bits in Chunk. Combined with D61696, SectionChunk gets 8 bytes smaller. Reviewers: ruiu, aganea Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61698 llvm-svn: 361206	2019-05-20 22:57:52 +00:00
Reid Kleckner	4c64256b51	[COFF] Simplify Chunk::writeTo and remove OutputSectionOff, NFC Summary: Prior to this change, every implementation of writeTo would add OutputSectionOff to the output section buffer start before writing data. Instead, do this math in the caller, so that it can be written once instead of many times. The output section offset is always equivalent to the difference between the chunk RVA and the output section RVA, so we can replace the one remaining usage of OutputSectionOff with that subtraction. This doesn't change the size of SectionChunk because of alignment requirements, but I will rearrange the fields in a follow-up change to accomplish that. Reviewers: ruiu, aganea Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61696 llvm-svn: 360376	2019-05-09 21:21:22 +00:00
Reid Kleckner	0a1b1d6e62	Shrink SectionChunk by combining Relocs and SectionName sizes SectionChunk is one of the most frequently allocated data structures in LLD, since there are about four per function when optimizations and debug info are enabled (.text, .pdata, .xdata, .debug$S). A PE COFF file cannot be larger than 2GB, so there is an inherent limit on the length of the section name and the number of relocations. Decompose the ArrayRef and StringRef into pointer and size, and put them back together in the accessors for section name and relocation list. I plan to gather complete performance numbers later by padding SectionChunk with dead data and measuring performance after all the size optimizations are done. llvm-svn: 359923	2019-05-03 20:17:14 +00:00
Nico Weber	c0838af754	lld-link: Implement /swaprun: flag r191276 added this to old LLD, but it never made it to new LLD -- except that the flag was in Options.td, so it was silently ignored. I figured it should be easy to implement, so I did that instead of removing the flags from Options.td. I then discovered that link.exe also supports comma-separated lists of 'cd' and 'net', which made the parsing code a bit annoying. The Alias technique in Options.td is to get nice help output. Differential Revision: https://reviews.llvm.org/D61067 llvm-svn: 359192	2019-04-25 14:02:26 +00:00
Fangrui Song	32c0ebe615	Use llvm::stable_sort Make some small adjustments while touching the code: make parameters const, use less_first(), etc. Differential Revision: https://reviews.llvm.org/D60989 llvm-svn: 358943	2019-04-23 02:42:06 +00:00
Alexandre Ganea	09cca5b243	[LLD][COFF] Generate import modules & COFF groups in PDB Generate import modules for each imported DLL, along with its symbol stream. Also create COFF groups in the * Linker * module, one for each PartialSection (input, unmerged sections) Currently COFF groups are disabled for MINGW because it significantly increases PDB sizes. We could enable that later with an option. The overall objective for this change is to support code hot patching tools. Such tools need to know the import libraries used, from the PDB alone. Differential Revision: https://reviews.llvm.org/D54802 llvm-svn: 357308	2019-03-29 20:25:34 +00:00
Reid Kleckner	1600490af1	[COFF] Optimize range extension thunk insertion memory usage Summary: This avoids allocating O(#relocs) of intermediate data for each section when range extension thunks aren't needed for that section. This also removes a std::vector from SectionChunk, which further reduces its size. Instead, this change adds the range extension thunk symbols to the object files that contain sections that need extension thunks. Adding them to the symbol table of the parent object means they now have a symbol table index. We can then modify the original relocation, after copying it to read-write memory, to use the new symbol table index. This makes linking browser_tests.exe with no PDB 10.46% faster, moving it from 11.364s to 10.288s averaged over five runs. Reviewers: mstorsjo, ruiu Subscribers: aganea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59902 llvm-svn: 357200	2019-03-28 18:30:03 +00:00
Reid Kleckner	7818144ff3	[COFF] Add address-taken import thunks to the fid table Summary: Fixes PR39799 Reviewers: dmajor, hans Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58739 llvm-svn: 355141	2019-02-28 21:05:41 +00:00
Alexandre Ganea	97b2b0636b	[LLD][COFF] Support /threads[:no] like the ELF driver Differential review: https://reviews.llvm.org/D58594 llvm-svn: 355029	2019-02-27 20:53:50 +00:00
Alexandre Ganea	d307c4c47f	[LLD][COFF] Add support for /FUNCTIONPADMIN command-line option Initial patch by Stefan Reinalter. Fixes PR36775 Differential Revision: https://reviews.llvm.org/D49366 llvm-svn: 354716	2019-02-23 01:46:18 +00:00
Martin Storsjo	ccd4e5e016	[COFF] Avoid O(n^2) accesses into PartialSections For MinGW, unique partial sections are much more common, e.g. comdat functions get sections named e.g. text$symbol. A moderate sized example of this contains over 200K Chunks which create 174K unique PartialSections. Prior to SVN r352928 (D57574), linking this took around 1,5 seconds for me, while it afterwards takes around 13 minutes. After this patch, the linking time is back to what it was before. The std::find_if in findPartialSection will do a linear scan of the whole container until a match is found. To use something like binary_search or the std::set container's own methods, we'd need to already have a PartialSection*. Reinstate a proper map instead of having a set with a custom sorting comparator. Differential Revision: https://reviews.llvm.org/D57666 llvm-svn: 353146	2019-02-05 08:16:10 +00:00
Martin Storsjo	c9f4d25f26	[COFF] Create range extension thunks for ARM64 On ARM64, this is normally necessary only after a module exceeds 128 MB in size (while the limit for thumb is 16 MB). For conditional branches, the range limit is only 1 MB though (the same as for thumb), and for the tbz instruction, the range is only 32 KB, which allows for a test much smaller than the full 128 MB. This fixes PR40467. Differential Revision: https://reviews.llvm.org/D57575 llvm-svn: 352929	2019-02-01 22:08:09 +00:00
Martin Storsjo	b2b0cab0c3	[COFF] Fix crashes when writing a PDB after adding thunks. When writing a PDB, the OutputSection of all chunks needs to be set. The thunks are added directly to OutputSection after the normal machinery that sets it for all other chunks. This fixes part of PR40467. Differential Revision: https://reviews.llvm.org/D57574 llvm-svn: 352928	2019-02-01 22:08:03 +00:00
Alexandre Ganea	864d2639f1	[LLD][COFF] Partial sections Persist (input) sections that make up an OutputSection. This is a supporting patch for the upcoming D54802. Differential Revision: https://reviews.llvm.org/D55293 llvm-svn: 352336	2019-01-28 01:45:35 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Martin Storsjo	333e0d180f	[COFF] Remove empty sections before calculating the size of section headers The number of sections is used in assignAddresses (in finalizeAddresses) and the space for all sections is permanent from that point on, even if we later decide we won't write some of them. The VirtualSize field also gets calculated in assignAddresses, so we need to manually check whether the section is empty here instead. Differential Revision: https://reviews.llvm.org/D54495 llvm-svn: 347704	2018-11-27 20:48:09 +00:00
Martin Storsjo	3c046af5a9	[COFF] Generate a codeview build id signature for MinGW even when not creating a PDB GNU ld, which doesn't generate PDBs, can optionally generate a build id by passing the --build-id option. LLD's MinGW frontend knows about this option but ignores it, as I had falsely assumed that LLD already generated build IDs even in those cases. If debug info is requested and no PDB path is set, generate a build id signature as a hash of the binary itself. This allows associating a binary to a minidump, even if debug info isn't written in PDB form by the linker. Differential Revision: https://reviews.llvm.org/D54828 llvm-svn: 347645	2018-11-27 09:20:55 +00:00
Reid Kleckner	a37d672da9	[COFF] Add exported functions to gfids table for /guard:cf Summary: MSVC does this, and we should too. The .gfids table is a table of RVAs, so it's impossible for a DLL to indicate that an imported symbol is address taken. Therefore, exports appear to be listed as address taken by the DLL that exports them. This fixes an issue that Firefox ran into here: https://bugzilla.mozilla.org/show_bug.cgi?id=1485016#c12 In Firefox, the export directive came from a .def file, but we need to do this for any kind of export. Reviewers: dmajor, hans, amccarth, alex Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54723 llvm-svn: 347623	2018-11-27 01:50:17 +00:00
Martin Storsjo	49037d2b3c	[COFF] Fix a longstanding typo in a variable name. NFC. llvm-svn: 346846	2018-11-14 10:26:47 +00:00
Alexandre Ganea	149de8de19	[LLD][COFF] Fix ordering of CRT global initializers in COMDAT sections (patch by Benoit Rousseau) This patch fixes a bug where the global variable initializers were sometimes not invoked in the correct order when it involved a C++ template instantiation. Differential Revision: https://reviews.llvm.org/D52749 llvm-svn: 343847	2018-10-05 12:56:46 +00:00
Martin Storsjo	57ddec0dd1	[COFF] Add support for creating range extension thunks for ARM This is a feature that MS link.exe lacks; it currently errors out on such relocations, just like lld did before. This allows linking clang.exe for ARM - practically, any image over 16 MB will likely run into the issue. Differential Revision: https://reviews.llvm.org/D52156 llvm-svn: 342962	2018-09-25 10:59:29 +00:00
Martin Storsjo	5f6d527f09	[COFF] Support linking to import libraries from GNU binutils GNU binutils import libraries aren't the same kind of short import libraries as link.exe and LLD produce, but are a plain static library containing .idata section chunks. MSVC link.exe can successfully link to them. In order for imports from GNU import libraries to mix properly with the normal import chunks, the chunks from the existing mechanism need to be added into named sections like .idata$2. These GNU import libraries consist of one header object, a number of object files, one for each imported function/variable, and one trailer. Within the import libraries, the object files are ordered alphabetically in this order. The chunks stemming from these libraries have to be grouped by what library they originate from and sorted, to make sure the section chunks for headers and trailers for the lists are ordered as intended. This is done on all sections named .idata$*, before adding the synthesized chunks to them. Differential Revision: https://reviews.llvm.org/D38513 llvm-svn: 342777	2018-09-21 22:01:06 +00:00
Nico Weber	0bd2d304e6	lld-link: Set PDB GUID to hash of PDB contents instead of to a random byte sequence. Previously, lld-link would use a random byte sequence as the PDB GUID. Instead, use a hash of the PDB file contents. To not disturb llvm-pdbutil pdb2yaml, the hash generation is an opt-in feature on InfoStreamBuilder and lld/COFF/PDB.cpp always sets it. Since writing the PDB computes this ID which also goes in the exe, the PDB writing code now must be called before writeBuildId(). writeBuildId() for that reason is no longer included in the "Code Layout" timer. Since the PDB GUID is now a function of the PDB contents, the PDB Age is always set to 1. There was a long comment above loadExistingBuildId (now gone) about how not changing the GUID and only incrementing the age was important, but according to the discussion in PR35914 that comment was incorrect. Differential Revision: https://reviews.llvm.org/D51956 llvm-svn: 342334	2018-09-15 18:37:22 +00:00
Martin Storsjo	7a41693898	[COFF] Provide __CTOR_LIST__ and __DTOR_LIST__ symbols for MinGW MinGW uses this kind of list terminator symbol for traversing the constructor/destructor lists. These list terminators are actual pointer entries in the lists, with the values 0 and (uintptr_t)-1 (instead of just symbols pointing to the start/end of the list). (This mechanism exists in both the mingw-w64 crt startup code and in libgcc; normally the mingw-w64 one is used, but a DLL build of libgcc uses the libgcc one. Therefore it's not trivial to change the mechanism without lots of cross-project synchronization and potentially invalidating some combinations of old/new versions of them.) When mingw-w64 has been used with lld so far, the CRT startup object files have provided these symbols, ending up with different, incompatible builds of the CRT startup object files depending on whether binutils or lld are going to be used. In order to avoid the need of different configuration of the CRT startup object files depending on what linker to be used, provide these symbols in lld instead. (Mingw-w64 checks at build time whether the linker provides these symbols or not.) This unifies this particular detail between the two linkers. This does disallow the use of the very latest lld with older versions of mingw-w64 (the configure check for the list was added recently; earlier it simply checked whether the CRT was built with gcc or clang), and requires rebuilding the mingw-w64 CRT. But the number of users of lld+mingw still is low enough that such a change should be tolerable, and unifies this aspect of the toolchains, easing interoperability between the toolchains for the future. The actual test for this feature is added in ctors_dtors_priority.s, but a number of other tests that checked absolute output addresses are updated. Differential Revision: https://reviews.llvm.org/D52053 llvm-svn: 342294	2018-09-14 22:26:59 +00:00
Martin Storsjo	4c201a8ba5	[COFF] Avoid copying of chunk vectors. NFC. When declaring the pair variable as "auto Pair : Map", it is effectively declared as std::pair<std::pair<StringRef, uint32_t>, std::vector<Chunk *>>. This effectively does a full, shallow copy of the Chunk vector, just to be thrown away after each iteration. Differential Revision: https://reviews.llvm.org/D52051 llvm-svn: 342205	2018-09-14 06:08:51 +00:00
Nico Weber	cc08366035	Remove an effectively unused local variable. llvm-svn: 341823	2018-09-10 13:20:16 +00:00
Nico Weber	13b55bbc2f	lld-link: Write an empty "repro" debug directory entry if /Brepro is passed If the coff timestamp is set to a hash, like lld-link does if /Brepro is passed, the coff spec suggests that an IMAGE_DEBUG_TYPE_REPRO entry is in the debug directory. This lets lld-link write such a section. Fixes PR38429, see bug for details. Differential Revision: https://reviews.llvm.org/D51652 llvm-svn: 341486	2018-09-05 18:02:43 +00:00
Martin Storsjo	802fcb4167	[COFF] When doing automatic dll imports, replace whole .refptr.<var> chunks with __imp_<var> After fixing up the runtime pseudo relocation, the .refptr.<var> will be a plain pointer with the same value as the IAT entry itself. To save a little binary size and reduce the number of runtime pseudo relocations, redirect references to the IAT entry (via the __imp_<var> symbol) itself and discard the .refptr.<var> chunk (as long as the same section chunk doesn't contain anything else than the single pointer). As there are now cases for both setting the Live variable to true and false externally, remove the accessors and setters and just make the variable public instead. Differential Revision: https://reviews.llvm.org/D51456 llvm-svn: 341175	2018-08-31 07:45:20 +00:00
Martin Storsjo	eac1b05f1d	[COFF] Support MinGW automatic dllimport of data Normally, in order to reference exported data symbols from a different DLL, the declarations need to have the dllimport attribute, in order to use the __imp_<var> symbol (which contains an address to the actual variable) instead of the variable itself directly. This isn't an issue in the same way for functions, since any reference to the function without the dllimport attribute will end up as a reference to a thunk which loads the actual target function from the import address table (IAT). GNU ld, in MinGW environments, supports automatically importing data symbols from DLLs, even if the references didn't have the appropriate dllimport attribute. Since the PE/COFF format doesn't support the kind of relocations that this would require, the MinGW CRT startup code has a custom framework of its own for manually fixing the missing relocations once the module is loaded and the target addresses in the IAT are known. For this to work, the linker (originally in GNU ld) creates a list of remaining references needing fixup, which the runtime processes on startup before handing over control to user code. While this feature is rather controversial, it's one of the main features allowing unix style libraries to be used on windows without any extra porting effort. Some sort of automatic fixing of data imports is also necessary for the itanium C++ ABI on windows (as clang implements it right now) for importing vtable pointers in certain cases, see D43184 for some discussion on that. The runtime pseudo relocation handler supports 8/16/32/64 bit addresses, either PC relative references (like IMAGE_REL_*_REL32*) or absolute references (IMAGE_REL_AMD64_ADDR64, IMAGE_REL_AMD64_ADDR32, IMAGE_REL_I386_DIR32). On linking, the relocation is handled as a relocation against the corresponding IAT slot. 
For the absolute references, a normal base relocation is created, to update the embedded address in case the image is loaded at a different address. The list of runtime pseudo relocations contains the RVA of the imported symbol (the IAT slot), the RVA of the location the relocation should be applied to, and a size of the memory location. When the relocations are fixed at runtime, the difference between the actual IAT slot value and the IAT slot address is added to the reference, doing the right thing for both absolute and relative references. With this patch alone, things work fine for i386 binaries, and mostly for x86_64 binaries, with feature parity with GNU ld. Despite this, there are a few gotchas: - References to data from within code work fine on both x86 architectures, since their relocations consist of plain 32 or 64 bit absolute/relative references. On ARM and AArch64, references to data don't consist of a plain 32 or 64 bit embedded address or offset in the code. On ARMNT, it's usually a MOVW+MOVT instruction pair represented by an IMAGE_REL_ARM_MOV32T relocation (each instruction containing 16 bits of the target address); on AArch64, it's usually an ADRP+ADD/LDR/STR instruction pair with an even more complex encoding, storing a PC relative address (with a range of +/- 4 GB). This could theoretically be remedied by extending the runtime pseudo relocation handler with new relocation types, to support these instruction encodings. This isn't an issue for GCC/GNU ld since they don't support windows on ARMNT/AArch64. - For x86_64, if references in code are encoded as 32 bit PC relative offsets, the runtime relocation will fail if the target turns out to be out of range for a 32 bit offset. - Fixing up the relocations at runtime requires making sections writable if necessary, with the VirtualProtect function. In Windows Store/UWP apps, this function is forbidden. These limitations are addressed by a few later patches in lld and llvm. 
Differential Revision: https://reviews.llvm.org/D50917 llvm-svn: 340726	2018-08-27 08:43:31 +00:00
Hans Wennborg	bdd8493f2b	[COFF] Make the relocation scanning for CFG more discriminating link.exe ignores REL32 relocations on 32-bit x86, as well as relocations against non-function symbols such as labels. This makes lld do the same. Differential Revision: https://reviews.llvm.org/D50430 llvm-svn: 339345	2018-08-09 13:43:22 +00:00
Martin Storsjo	98ff9f845d	[COFF] Sort .reloc before all other discardable sections If a binary is stripped, which can remove discardable sections (except for the .reloc section, which also is marked as discardable as it isn't loaded at runtime, only read by the loader), the .reloc section should be first of them, in order not to create gaps in the image. Previously, binaries with relocations were broken if they were stripped by GNU binutils strip. Trying to execute such binaries produces an error about "xx is not a valid win32 application". This fixes GNU binutils bug 23348. Prior to SVN r329370 (which didn't intend to have functional changes), the code for moving discardable sections to the end didn't clearly express how other discardable sections should be ordered compared to .reloc, but the change retained the exact same end result as before. After SVN r329370, the code (and comments) more clearly indicate that it tries to make the .reloc section the absolutely last one; this patch changes that. This matches how GNU binutils ld sorts .reloc compared to dwarf debug info sections. Differential Revision: https://reviews.llvm.org/D49351 Signed-off-by: Martin Storsjö <martin@martin.st> llvm-svn: 337598	2018-07-20 18:43:35 +00:00
Martin Storsjo	a55fc71614	[COFF] Write the debug directory and build id to a separate section for MinGW For dwarf debug info, an executable normally either contains the debug info, or it is stripped out. To reduce the storage needed (slightly) for the debug info kept separately from the released, stripped binaries, one can choose to only copy the debug data from the original executable (essentially the reverse of the strip operation), producing a file with only debug info. When copying the debug data from an executable with GNU objcopy, the build id and debug directory need to reside in a separate section, as this will be kept while the rest of the .rdata section is removed. Differential Revision: https://reviews.llvm.org/D49352 llvm-svn: 337526	2018-07-20 05:44:34 +00:00
Martin Storsjo	c35e4bf7eb	[COFF] Don't produce base relocs for discardable sections Dwarf debug info contains some data that contains absolute addresses. Since these sections are discardable and aren't loaded at runtime, there's no point in adding base relocations for them. This makes sure that after stripping out dwarf debug info, there are no base relocations that point to nonexistent sections. Differential Revision: https://reviews.llvm.org/D49350 llvm-svn: 337438	2018-07-19 04:25:22 +00:00
Zachary Turner	e2ce2a5c86	[coff] remove_dots from /PDBPATH but not /PDBALTPATH. This more closely matches the behavior of link.exe, and also simplifies the code slightly. llvm-svn: 336882	2018-07-12 03:22:39 +00:00
Zachary Turner	bf9abccacd	[coff] Remove dots in path pointing to PDB file. Some Microsoft tools (e.g. new versions of WPA) fail when the COFF Debug Directory contains a path to the PDB that contains dots, such as D:\foo\./bar.pdb. Remove dots before writing this path. This fixes pr38126. llvm-svn: 336873	2018-07-12 00:44:15 +00:00
Martin Storsjo	474be005db	[COFF] Store import symbol pointers as pointers to the base class Future symbol insertions can potentially change the type of these symbols - keep pointers to the base class to reflect this, and use dynamic casts to inspect them before using as the subclass type. This fixes crashes that were possible before, by touching these symbols that now are populated as e.g. a DefinedRegular, via the old pointers with DefinedImportThunk type. Differential Revision: https://reviews.llvm.org/D48953 llvm-svn: 336652	2018-07-10 10:40:11 +00:00
Martin Storsjo	3a7905b2aa	[COFF] Add an LLD specific option -debug:symtab With this set, we retain the symbol table, but skip the actual debug information. This is meant to be used by the MinGW frontend. Differential Revision: https://reviews.llvm.org/D48745 llvm-svn: 335946	2018-06-29 06:08:25 +00:00
Bob Haarman	c103156c60	lld-link: align sections to 16 bytes if referenced from the gfids table Summary: Control flow guard works best when targets it checks are 16-byte aligned. Microsoft's link.exe helps ensure this by aligning code from sections that are referenced from the gfids table to 16 bytes when linking with -guard:cf, even if the original section specifies a smaller alignment. This change implements that behavior in lld-link. See https://crbug.com/857012 for more details. Reviewers: ruiu, hans, thakis, zturner Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48690 llvm-svn: 335864	2018-06-28 15:22:40 +00:00
Shoaib Meenai	02c4344262	[COFF] Fix crash when emitting symbol tables with GC When running with linker GC (`-opt:ref`), defined imported symbols that are referenced but then dropped by GC end up with their `Location` member being nullptr, which means `getChunk()` returns nullptr for them and attempting to call `getChunk()->getOutputSection()` causes a crash from the nullptr dereference. Check for `getChunk()` being nullptr and bail out early to avoid the crash. Differential Revision: https://reviews.llvm.org/D48092 llvm-svn: 334548	2018-06-12 21:19:33 +00:00
Nico Weber	d657c25649	lld-link: Implement /INTEGRITYCHECK flag /INTEGRITYCHECK has the effect of setting IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY. Fixes PR31066. https://reviews.llvm.org/D47472 llvm-svn: 333652	2018-05-31 13:43:02 +00:00
Shoaib Meenai	663518d61a	[COFF] Unify output section code. NFC Peter Collingbourne suggested moving the switch to the top of the function, so that all the code that cares about the output section for a symbol is in the same place. Differential Revision: https://reviews.llvm.org/D47497 llvm-svn: 333472	2018-05-29 22:49:56 +00:00
Shoaib Meenai	4e51833611	[COFF] Simplify symbol table output section computation Rather than using a loop to compare symbol RVAs to the starting RVAs of sections to determine which section a symbol belongs to, just get the output section of a symbol directly via its chunk, and bail if the symbol doesn't have an output section, which avoids having to hardcode logic for handling dead symbols, CodeView symbols, etc. This was suggested by Reid Kleckner; thank you. This also fixes writing out symbol tables in the presence of RVA table input sections (e.g. .sxdata and .gfids). Such sections aren't written to the output file directly, so their RVA is 0, and the loop would thus fail to find an output section for them, resulting in a segfault. Extend some existing tests to cover this case. Fixes PR37584. Differential Revision: https://reviews.llvm.org/D47391 llvm-svn: 333450	2018-05-29 19:07:47 +00:00
Zachary Turner	c8dd6ccc8a	[COFF] Add /Brepro and /TIMESTAMP options. Previously we would always write a hash of the binary into the PE file, for reproducible builds. This breaks AppCompat, which is a feature of Windows that relies on the timestamp in the PE header being set to a real value (or at the very least, a value that satisfies certain properties). To address this, we put the old behavior of writing the hash behind the /Brepro flag, which mimics MSVC linker behavior. We also match MSVC default behavior, which is to write an actual timestamp to the PE header. Finally, we add the /TIMESTAMP option (an lld extension) so that the user can specify the exact value to be used in case he/she manually constructs a value which is both reproducible and satisfies AppCompat. Differential Revision: https://reviews.llvm.org/D46966 llvm-svn: 332613	2018-05-17 15:11:01 +00:00
Peter Collingbourne	e28faed768	COFF: Don't create unnecessary thunks. A thunk is only needed if a relocation points to the undecorated import name. Differential Revision: https://reviews.llvm.org/D46673 llvm-svn: 332019	2018-05-10 19:01:28 +00:00
Peter Collingbourne	71c7de5b77	COFF: Preserve section type when processing /section flag. It turns out that we were dropping this before. Differential Revision: https://reviews.llvm.org/D45802 llvm-svn: 330481	2018-04-20 21:23:16 +00:00
Peter Collingbourne	381b3d8aa3	COFF: Use (name, output characteristics) as a key when grouping input sections into output sections. This is what link.exe does and lets us avoid needing to worry about merging output characteristics while adding input sections to output sections. With this change we can't process /merge in the same way as before because sections with different output characteristics can still be merged into one another. So this change moves the processing of /merge to just before we assign addresses. In the case where there are multiple output sections with the same name, link.exe only merges the first section with the source name into the first section with the target name, and we do the same. At the same time I also implemented transitive merging (which means that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a). This isn't quite enough though because link.exe has a special case for .CRT in 32-bit mode: it processes sections whose output characteristics are DATA | R | W as though the output characteristics were DATA | R (so that they get merged into things like constructor lists in the expected way). Chromium has a few such sections, and it turns out that those sections were causing the problem that resulted in r318699 (merge .xdata into .rdata) being reverted: because of the previous permission merging semantics, the .CRT sections were causing the entire .rdata section to become writable, which caused the SEH runtime to crash because it apparently requires .xdata to be read-only. This change also implements the same special case. This should unblock being able to merge .xdata into .rdata by default, as well as .bss into .data, both of which will be done in followups. Differential Revision: https://reviews.llvm.org/D45801 llvm-svn: 330479	2018-04-20 21:10:33 +00:00
Peter Collingbourne	be084eca5b	COFF: Remove OutputSection::getPermissions() and getCharacteristics(). All callers can just access the header directly. Differential Revision: https://reviews.llvm.org/D45800 llvm-svn: 330367	2018-04-19 21:48:37 +00:00
Peter Collingbourne	fa322abee9	COFF: Rename Chunk::getPermissions to getOutputCharacteristics. In an upcoming change I will need to make a distinction between section type (code, data, bss) and permissions. The term that I use for both of these things is "output characteristics". Differential Revision: https://reviews.llvm.org/D45799 llvm-svn: 330361	2018-04-19 20:03:24 +00:00
Reid Kleckner	8f1a28f190	[COFF] Mark images with no exception handlers for /safeseh Summary: DLLs and executables with no exception handlers need to be marked with IMAGE_DLL_CHARACTERISTICS_NO_SEH, even if they have a load config. Discovered here when building Chromium with LLD on Windows: https://crbug.com/833951 Reviewers: ruiu, mstorsjo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45778 llvm-svn: 330300	2018-04-18 22:37:10 +00:00
Peter Collingbourne	94aa62e48a	COFF: Implement /pdbaltpath flag. I needed to revert r330223 because we were embedding an absolute PDB path in the .rdata section, which ended up being laid out before the .idata section and affecting its RVAs. This flag will let us control the embedded path. Differential Revision: https://reviews.llvm.org/D45747 llvm-svn: 330232	2018-04-17 23:28:38 +00:00
Peter Collingbourne	4902508934	COFF: Process /merge flag as we create output sections. With this we can merge builtin sections. Differential Revision: https://reviews.llvm.org/D45350 llvm-svn: 329471	2018-04-07 00:46:55 +00:00
Peter Collingbourne	f2c0f39b91	COFF: Create output sections early. NFCI. With this, all output sections are created in one place. This will make it simpler to implement merging of builtin sections. Differential Revision: https://reviews.llvm.org/D45349 llvm-svn: 329370	2018-04-06 03:25:49 +00:00
Peter Collingbourne	05f0bae318	COFF: Sort non-discardable sections at the same time as other sections. NFC. This makes the sort order a little clearer. Differential Revision: https://reviews.llvm.org/D45282 llvm-svn: 329227	2018-04-04 20:30:37 +00:00
Hans Wennborg	9a9fc78744	COFF: Layout sections in the same order as link.exe One place where this seems to matter is to make sure the .rsrc section comes after .text. The Win32 UpdateResource() function can change the contents of .rsrc. It will move the sections that come after, but if .text gets moved, the entry point header will not get updated and the executable breaks. This was found by a test in Chromium. Differential Revision: https://reviews.llvm.org/D45260 llvm-svn: 329221	2018-04-04 19:15:55 +00:00
Shoaib Meenai	290f26fefd	[COFF] Clarify comment. NFC Reid pointed out the string table for supporting long section names is a BFD extension and the comments should reflect that. Explicitly spell out link.exe's and binutil's behavior around section names and the rationale for LLD's behavior. Differential Revision: https://reviews.llvm.org/D42659 llvm-svn: 327736	2018-03-16 20:20:01 +00:00
Peter Collingbourne	f1a11f87a0	COFF: Implement string tail merging. In COFF, duplicate string literals are merged by placing them in a comdat whose leader symbol name contains a specific prefix followed by the hash and partial contents of the string literal. This gives us an easy way to identify sections containing string literals in the linker: check for leader symbol names with the given prefix. Any sections that are identified in this way as containing string literals may be tail merged. We do so using the StringTableBuilder class, which is also used to tail merge string literals in the ELF linker. Tail merging is enabled only if ICF is enabled, as this provides a signal as to whether the user cares about binary size. Differential Revision: https://reviews.llvm.org/D44504 llvm-svn: 327668	2018-03-15 21:14:02 +00:00
Peter Collingbourne	435b099115	COFF: Move assignment of section RVAs to assignAddresses(). NFCI. This makes the design a little more similar to the ELF linker and should allow for features such as ARM range extension thunks to be implemented more easily. Differential Revision: https://reviews.llvm.org/D44501 llvm-svn: 327667	2018-03-15 21:13:46 +00:00
Zachary Turner	b575f46b6d	Resubmit "Write a hash of the executable into the PE timestamp fields." This fixes the broken tests that were causing failures. The tests before were verifying that the time stamp was 0, but now that we are actually writing a timestamp, I just removed the match against the timestamp value. llvm-svn: 327049	2018-03-08 19:33:47 +00:00
Hans Wennborg	aee5881a85	[COFF] Make the DOS stub a real DOS program It only adds a few bytes and is nice for backward compatibility. Differential Revision: https://reviews.llvm.org/D44018 llvm-svn: 327001	2018-03-08 14:27:28 +00:00
Zachary Turner	0b4af0434b	Revert "Write a hash of the executable into the PE timestamp fields." This is breaking a couple of tests, so I'm reverting temporarily until I can get everything resolved properly. llvm-svn: 326943	2018-03-07 21:22:10 +00:00
Zachary Turner	69f3347b56	Write a hash of the executable into the PE timestamp fields. Windows tools treat the timestamp fields as sort of a build id, using it to archive executables on a symbol server, as well as for matching executables to PDBs. We were writing 0 for these fields, which would cause symbol servers to break as they are indexed in the symbol server based on this value. Although the field is called timestamp, it can really be any value that is unique per build, so to support reproducible builds we use a hash of the executable here. Differential Revision: https://reviews.llvm.org/D43978 llvm-svn: 326920	2018-03-07 18:13:41 +00:00
Rui Ueyama	b3107476a4	Remove an unused accessor and simplify the logic a bit. NFC. llvm-svn: 325445	2018-02-17 20:41:38 +00:00
Reid Kleckner	fd52096259	[LLD] Implement /guard:[no]longjmp Summary: This protects calls to longjmp from transferring control to arbitrary program points. Instead, longjmp calls are limited to the set of registered setjmp return addresses. This also implements /guard:nolongjmp to allow users to link in object files that call setjmp that weren't compiled with /guard:cf. In this case, the linker will approximate the set of address taken functions, but it will leave longjmp unprotected. I used the following program to test, compiling it with different -guard flags: $ cl -c t.c -guard:cf $ lld-link t.obj -guard:cf #include <setjmp.h> #include <stdio.h> jmp_buf buf; void g() { printf("before longjmp\n"); fflush(stdout); longjmp(buf, 1); } void f() { if (setjmp(buf)) { printf("setjmp returned non-zero\n"); return; } g(); } int main() { f(); printf("hello world\n"); } In particular, the program aborts when the code is compiled without -guard:cf and linked with -guard:cf. That indicates that longjmps are protected. Reviewers: ruiu, inglorion, amccarth Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43217 llvm-svn: 325047	2018-02-13 20:32:53 +00:00
Reid Kleckner	af2f7da74c	[COFF] Add minimal support for /guard:cf Summary: This patch adds some initial support for Windows control flow guard. At the end of the day, the linker needs to synthesize a table of RVAs very similar to the structured exception handler table (/safeseh). Both /safeseh and /guard:cf take sections of symbol table indices (.sxdata and .gfids$y) and turn them into RVA tables referenced by the load config struct in the CRT through special symbols. Reviewers: ruiu, amccarth Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42592 llvm-svn: 324306	2018-02-06 01:58:26 +00:00
Shoaib Meenai	34a1101b06	[COFF] Update comment to reflect link.exe behavior. NFC In my experimentation with link.exe from both VS 2015 and 2017, it always produces images with truncated section names. Update the comment accordingly. Differential Revision: https://reviews.llvm.org/D42603 llvm-svn: 323598	2018-01-27 18:17:08 +00:00
Rui Ueyama	57175aa1e9	Add the /order option. With the /order option, you can give an order file. An order file contains symbol names, one per line, and the linker places comdat sections in that given order. The option is used often to optimize an output binary for (in particular, startup) speed by improving locality. Differential Revision: https://reviews.llvm.org/D42598 llvm-svn: 323579	2018-01-27 00:34:46 +00:00
Zachary Turner	727f153b6f	[coff] Print detailed timing information with /TIME. The classes used to print and update time information are in common, so other linkers could use this as well if desired. Differential Revision: https://reviews.llvm.org/D41915 llvm-svn: 322736	2018-01-17 19:16:26 +00:00
Rui Ueyama	2c95e798a0	[LLD][COFF] Report error when file will exceed Windows maximum image size (4GB) Patch by Colden Cullen. Currently, when a large PE (>4 GiB) is to be produced, a crash occurs because: 1. Calling setOffset with a number greater than UINT32_MAX causes the PointerToRawData to overflow 2. When adding the symbol table to the end of the file, the last section's offset was used to calculate file size. Because this had overflowed, this number was too low, and the file created would not be large enough. This led to the actual crash I saw, which was a buffer overrun. This change: 1. Adds comment to setOffset, clarifying that overflow can occur, but it's somewhat safe because the error will be handled elsewhere 2. Adds file size check after all output data has been created This matches the MS link.exe error, which prints as: "LINK : fatal error LNK1248: image size (10000EFC9) exceeds maximum allowable size (FFFFFFFF)" 3. Changes calculation of the symbol table offset to just use the existing FileSize. This should match the previous calculations, but doesn't rely on the use of a u32 that can overflow. 4. Removes trivial usage of a magic number that bugged me while I was debugging the issue I'm not sure how to add a test for this outside of adding 4GB of object files to the repo. If there's an easier way, let me know and I'll be happy to add a test. Differential Revision: https://reviews.llvm.org/D42010 llvm-svn: 322605	2018-01-17 01:08:02 +00:00
Martin Storsjo	a1e9b6e3d2	[COFF] Set the IMAGE_DLL_CHARACTERISTICS_NO_SEH flag automatically This seems to match how link.exe sets it. Differential Revision: https://reviews.llvm.org/D41252 llvm-svn: 320860	2017-12-15 20:53:03 +00:00
Martin Storsjo	9603b8e3f5	[COFF] Sort .pdata for arm64 This works for linking the output from the MSVC compiler. The pdata entries for arm64 seem to be 8 bytes in the same (or at least similar) form to arm. Differential Revision: https://reviews.llvm.org/D41160 llvm-svn: 320676	2017-12-14 08:56:29 +00:00
Rui Ueyama	bdc5150984	Always evaluate the second argument for CHECK() lazily. This patch is to rename check CHECK and make it a C macro, so that we can evaluate the second argument lazily. Differential Revision: https://reviews.llvm.org/D40915 llvm-svn: 319974	2017-12-06 22:08:17 +00:00
Peter Collingbourne	24ca79c776	COFF: Simplify construction of safe SEH table. NFCI. Instead of building intermediate sets of exception handlers for each object file, just create one for the final output file. Differential Revision: https://reviews.llvm.org/D40581 llvm-svn: 319244	2017-11-28 22:50:53 +00:00
Rui Ueyama	2017d52b54	Move Memory.{h,cpp} to Common. Differential Revision: https://reviews.llvm.org/D40571 llvm-svn: 319221	2017-11-28 20:39:17 +00:00
Martin Storsjo	f2508f46ca	[COFF] Interpret a period as a separator for section suffix just like '$' This allows grouping all sections like ".ctors.12345" into ".ctors". For MinGW, the numerical values for such ctors are all zero-padded, so a lexical sort is good enough. Differential Revision: https://reviews.llvm.org/D40408 llvm-svn: 319151	2017-11-28 08:08:37 +00:00
Peter Collingbourne	f874bd67d8	COFF: Emit a COFF symbol table if /debug:dwarf is specified. This effectively reverts r318548 and r318635 while keeping the functionality behind the flag and preserving the bug fix from r318548. Differential Revision: https://reviews.llvm.org/D40264 llvm-svn: 318721	2017-11-21 01:14:14 +00:00
Peter Collingbourne	5e80bdebd2	COFF: Stop emitting a non-standard COFF symbol table into PEs. Now that our support for PDB emission is reasonably good, there is no longer a need to emit a COFF symbol table. Also fix a bug where we would fail to emit a string table for long section names if /debug was not specified. Differential Revision: https://reviews.llvm.org/D40189 llvm-svn: 318548	2017-11-17 19:51:20 +00:00
Martin Storsjo	46304e03ec	[COFF] Don't write long section names for sections that will be mapped at runtime Sections that will be mapped at runtime will only have the short section name available, since the string table it points into isn't mapped. Therefore prefer truncating those names over writing a long name that is unavailable at runtime. This allows libunwind to find the .eh_frame section at runtime even if the module was built with debug info enabled. Differential Revision: https://reviews.llvm.org/D40025 llvm-svn: 318391	2017-11-16 12:06:42 +00:00
Bob Haarman	fe059c782f	[coff] correctly emit safeseh entries for handlers defined in dlls Summary: We previously assumed that all SafeSEH handlers are DefinedRegular symbols. This is not the case for handlers defined in DLLs. As a result, we were failing to emit entries in the SafeSEH table for those handlers. This change fixes that. Fixes PR35324. Reviewers: rnk, ruiu Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40102 llvm-svn: 318364	2017-11-16 01:22:01 +00:00
Martin Storsjo	61716878ae	[COFF] Always include the size of the string table size field Even if we don't actually write any string table contents, the 4 byte size for the string table will always be written. Make sure we accommodate for this in the file size. Since this size is aligned up, this would seldom be an issue in practice. Differential Revision: https://reviews.llvm.org/D39891 llvm-svn: 318284	2017-11-15 08:18:25 +00:00
Rafael Espindola	0a7d0230fc	Try harder to delete the temporary file. This changes COFF to use the output buffer that is reset by the error handler. llvm-svn: 318062	2017-11-13 18:15:22 +00:00
Rafael Espindola	5f903f3848	Update for llvm change. llvm-svn: 317657	2017-11-08 01:50:34 +00:00
Bob Haarman	6c301b6eb1	[coff] use relative instead of absolute __safe_se_handler_base when present Summary: __safe_se_handler_base should be either absolute 0 (when no SafeSEH table is present), or relative to the image base (when the table is present). An earlier change inadvertently made the symbol absolute in both cases, leading to the SafeSEH table not being locatable at run time. This change fixes that and updates the safeseh test to check for the presence of the relocation. Reviewers: rnk, ruiu Reviewed By: ruiu Subscribers: ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D39765 llvm-svn: 317635	2017-11-07 23:24:10 +00:00
Rui Ueyama	f483da0038	Rename replaceBody -> replaceSymbol. llvm-svn: 317383	2017-11-03 22:48:47 +00:00
Rui Ueyama	f52496e1e0	Rename SymbolBody -> Symbol Now that we have only SymbolBody as the symbol class, "SymbolBody" is a bit strange name. This is a mechanical change generated by perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF) and clang-format-diff. Differential Revision: https://reviews.llvm.org/D39459 llvm-svn: 317370	2017-11-03 21:21:47 +00:00
Rui Ueyama	616cd99194	[COFF] Merge Symbol and SymbolBody. llvm-svn: 317007	2017-10-31 16:10:24 +00:00
Rui Ueyama	5ace35cba5	Fix SizeOfImage in the PE header. IIUC, SizeOfImage is the distance from the end of the last section to the image base, rounded up to the page size. So the previous code is wrong. Should fix https://bugs.llvm.org/show_bug.cgi?id=34949 (It is nice to know that lld is already being used to create Putty distribution binaries.) llvm-svn: 316626	2017-10-25 23:00:40 +00:00
Bob Haarman	b8a59c8aa5	[lld] unified COFF and ELF error handling on new Common/ErrorHandler Summary: The COFF linker and the ELF linker have long had similar but separate Error.h and Error.cpp files to implement error handling. This change introduces new error handling code in Common/ErrorHandler.h, changes the COFF and ELF linkers to use it, and removes the old, separate implementations. Reviewers: ruiu Reviewed By: ruiu Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits Differential Revision: https://reviews.llvm.org/D39259 llvm-svn: 316624	2017-10-25 22:28:38 +00:00
Shoaib Meenai	4aa7f8a30f	[COFF] Check for sections larger than 4 GiB Sections are limited to 4 GiB. Error out early if a section exceeds this size, rather than overflowing the section size and getting confusing assertion failures/segfaults later. Differential Revision: https://reviews.llvm.org/D38005 llvm-svn: 313699	2017-09-19 23:58:05 +00:00
Rui Ueyama	eef6b2a5c9	Revert r303378: Set IMAGE_DLL_CHARACTERISTICS_NO_BIND. r303378 was submitted because r303374 (Merge IAT and ILT) made lld's output incompatible with the Binding feature. Now that r303374 was reverted, we do not need to keep this change. Pointed out by pcc. llvm-svn: 313414	2017-09-15 22:49:13 +00:00
Rui Ueyama	cfc2f80df6	Remove {get,set}Align accessor functions and use Alignment member variable instead. llvm-svn: 313204	2017-09-13 21:54:55 +00:00
Rui Ueyama	cbf969eb20	Remove Symtab aliases. Various classes have `Symtab` member variables even though we have lld::coff::Symtab variable because a previous attempt to make COFF lld's internal structure resemble ELF's was incomplete. This patch finishes that job by removing member variables. llvm-svn: 311938	2017-08-28 21:51:07 +00:00
Sam Clegg	7dbd1fd73b	Update comments: parallel_for_each -> parallelForEach Also remove unused include of raw_ostream.h Differential Revision: https://reviews.llvm.org/D37048 llvm-svn: 311587	2017-08-23 19:03:20 +00:00
Zachary Turner	1bc6cb64b1	Fix warning about unused variable. I'm explicitly ignoring the warning by casting to void instead of deleting the local assignment, because it's confusing to see a function that fails when its return value evaluates to true. But when you see that it's a std::error_code, it makes more sense. llvm-svn: 310965	2017-08-15 21:46:51 +00:00
Zachary Turner	024323cb12	[LLD COFF/PDB] Incrementally update the build id. Previously, our algorithm to compute a build id involved hashing the executable and storing that as the GUID in the CV Debug Record chunk, and setting the age to 1. This breaks down in one very obvious case: a user adds some newlines to a file, rebuilds, but changes nothing else. This causes new line information and new file checksums to get written to the PDB, meaning that the debug info is different, but the generated code would be the same, so we would write the same build over again with an age of 1. Anyone using a symbol cache would have a problem now, because the debugger would open the executable, look at the age and guid, find a matching PDB in the symbol cache and then load it. It would never copy the new PDB to the symbol cache. This patch implements the canonical Windows algorithm for updating a build id, which is to check the existing executable first, and re-use an existing GUID while bumping the age if it already exists. Differential Revision: https://reviews.llvm.org/D36758 llvm-svn: 310961	2017-08-15 21:31:41 +00:00
Zachary Turner	4f588a93bf	Fix build breakage. llvm-svn: 310112	2017-08-04 20:07:08 +00:00
Zachary Turner	f1ca78c253	[lld] Write the absolute PDB path to the debug directory. This matches the behavior of MSVC's linker. Differential Revision: https://reviews.llvm.org/D36334 llvm-svn: 310108	2017-08-04 20:02:55 +00:00
Reid Kleckner	175af4bcc7	[PDB] Fix section contributions Summary: PDB section contributions are supposed to use output section indices and offsets, not input section indices and offsets. This allows the debugger to look up the index of the module that it should look up in the modules stream for symbol information. With this change, windbg can now find line tables, but it still cannot print local variables. Fixes PR34048 Reviewers: zturner Subscribers: hiraditya, ruiu, llvm-commits Differential Revision: https://reviews.llvm.org/D36285 llvm-svn: 309987	2017-08-03 21:15:09 +00:00

315 Commits