llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	6b0eb5a672	[ELF] Improve --gc-sections compatibility with GNU ld regarding section groups Based on D70020 by serge-sans-paille. The ELF spec says: > Furthermore, there may be internal references among these sections that would not make sense if one of the sections were removed or replaced by a duplicate from another object. Therefore, such groups must be included or omitted from the linked object as a unit. A section cannot be a member of more than one group. GNU ld has 2 behaviors that we don't have: - Group members (nextInSectionGroup != nullptr) are subject to garbage collection. This includes non-SHF_ALLOC SHT_NOTE sections. In particular, discarding non-SHF_ALLOC SHT_NOTE sections is an expected behavior by the Annobin project. See https://developers.redhat.com/blog/2018/02/20/annobin-storing-information-binaries/ for more information. - Group members are retained or discarded as a unit. Members may have internal references that are not expressed as SHF_LINK_ORDER, relocations, etc. It seems that we should be more conservative here: if a section is marked live, mark all the other members within the group. Both behaviors are reasonable. This patch implements them. A new field InputSectionBase::nextInSectionGroup tracks the next member within a group. On ELF64, this increases sizeof(InputSectionBase) from 144 to 152. InputSectionBase::dependentSections tracks section dependencies, which is used by both --gc-sections and /DISCARD/. We can't overload it for the "next member" semantic, because we should allow /DISCARD/ to discard sections independent of --gc-sections (GNU ld behavior). This behavior may be reasonably used by `/DISCARD/ : { (.ARM.exidx) }` or `/DISCARD/ : { (.note) }` (new test `linkerscript/discard-group.s`). Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D70146	2019-11-19 08:54:06 -08:00
Fangrui Song	e447d5afd3	[ELF] Delete SectionBase::assigned D67504 removed uses of `assigned` from OutputSection::addSection, which makes `assigned` purely used in processSectionCommands() and its callees. By replacing its references with `parent`, we can remove `assigned`. Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D67531 llvm-svn: 372735	2019-09-24 11:48:46 +00:00
Fangrui Song	47cfe8f321	[ELF] Fix variable names in comments after VariableName -> variableName change Also fix some typos. llvm-svn: 366181	2019-07-16 05:50:45 +00:00
Rui Ueyama	3837f4273f	[Coding style change] Rename variables so that they start with a lowercase letter This patch is mechanically generated by clang-llvm-rename tool that I wrote using Clang Refactoring Engine just for creating this patch. You can see the source code of the tool at https://reviews.llvm.org/D64123. There's no manual post-processing; you can generate the same patch by re-running the tool against lld's code base. Here is the main discussion thread to change the LLVM coding style: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html In the discussion thread, I proposed we use lld as a testbed for variable naming scheme change, and this patch does that. I chose to rename variables so that they are in camelCase, just because that is a minimal change to make variables to start with a lowercase letter. Note to downstream patch maintainers: if you are maintaining a downstream lld repo, just rebasing ahead of this commit would cause massive merge conflicts because this patch essentially changes every line in the lld subdirectory. But there's a remedy. clang-llvm-rename tool is a batch tool, so you can rename variables in your downstream repo with the tool. Given that, here is how to rebase your repo to a commit after the mass renaming: 1. rebase to the commit just before the mass variable renaming, 2. apply the tool to your downstream repo to mass-rename variables locally, and 3. rebase again to the head. Most changes made by the tool should be identical for a downstream repo and for the head, so at the step 3, almost all changes should be merged and disappear. I'd expect that there would be some lines that you need to merge by hand, but that shouldn't be too many. Differential Revision: https://reviews.llvm.org/D64121 llvm-svn: 365595	2019-07-10 05:00:37 +00:00
Peter Collingbourne	0282898586	ELF: Create synthetic sections for loadable partitions. We create several types of synthetic sections for loadable partitions, including: - The dynamic symbol table. This allows code outside of the loadable partitions to find entry points with dlsym. - Creating a dynamic symbol table also requires the creation of several other synthetic sections for the partition, such as the dynamic table and hash table sections. - The partition's ELF header is represented as a synthetic section in the combined output file, and will be used by llvm-objcopy to extract partitions. Differential Revision: https://reviews.llvm.org/D62350 llvm-svn: 362819	2019-06-07 17:57:58 +00:00
Peter Collingbourne	ba2816be82	ELF: Add basic partition data structures and behaviours. This change causes us to read partition specifications from partition specification sections and split output sections into partitions according to their reachability from partition entry points. This is only the first step towards a full implementation of partitions. Later changes will add additional synthetic sections to each partition so that they can be loaded independently. Differential Revision: https://reviews.llvm.org/D60353 llvm-svn: 361925	2019-05-29 03:55:20 +00:00
Bob Haarman	5ff1eb6418	Revert r358069 "Discard debuginfo for object files empty after GC" The change broke some scenarios where debug information is still needed, although MarkLive cannot see it, including the Chromium/Android build. Reverting to unbreak that build. llvm-svn: 360955	2019-05-16 23:33:06 +00:00
Fangrui Song	957c356ffe	[ELF] Place SectionPiece::{Live,Hash} bit fields together Summary: We access Live and OutputOff (which may share the same memory location) concurrently in 2 parallelForEachN loops. Separating them avoids subtle data races like D41884/PR35788. This patch places Live and Hash together. 2 reasons this is appealing: 1) Hash is immutable. Live is almost read-only - only written once in MarkLive.cpp where Hash is not accessed 2) we already discard low bits of Hash to decide ShardID. It doesn't matter much if we make 32-bit Hash to 31-bit. For a huge internal clang -O3 executable (1.6GiB), `Strings` in StringTableBuilder::finalizeStringTable contains at most 310253 elements. The expected number of pair-wise collisions 2^(-31) * C(310253,2) ~= 22.41 is too small to have a negative impact on performance. Actually, my benchmark shows there is actually a minor performance improvement. Differential Revision: https://reviews.llvm.org/D60765 llvm-svn: 358645	2019-04-18 07:46:09 +00:00
Rui Ueyama	3a8bb7cd2c	Discard debuginfo for object files empty after GC Patch by Robert O'Callahan. Rust projects tend to link in all object files from all dependent libraries and rely on --gc-sections to strip unused code and data. Unfortunately --gc-sections doesn't currently strip any debuginfo associated with GC'ed sections, so lld links in the full debuginfo from all dependencies even if almost all that code has been discarded. See https://github.com/rust-lang/rust/issues/56068 for some details. Properly stripping debuginfo for discarded sections would be difficult, but a simple approach that helps significantly is to mark debuginfo sections as live only if their associated object file has at least one live code/data section. This patch does that. In a (contrived but not totally artificial) Rust testcase linked above, it reduces the final binary size from 46MB to 5.1MB. Differential Revision: https://reviews.llvm.org/D54747 llvm-svn: 358069	2019-04-10 10:37:10 +00:00
Peter Collingbourne	e2b8c40a77	ELF: Use bump pointer allocator for uncompressed section buffers. NFCI. This shaves another word off SectionBase and makes it possible to clone a section using the implicit copy constructor. This basically reverts r311056, which removed the mutex in order to make the code easier to understand. On balance I think it's probably more straightforward to have a mutex here than to have an unusual copy constructor in SectionBase. Differential Revision: https://reviews.llvm.org/D59269 llvm-svn: 355966	2019-03-12 20:32:30 +00:00
Peter Collingbourne	dfbb9a793e	ELF: Reduce the size of InputSectionBase by two words. NFCI. - The Assigned bit was previously taking a word on its own. Move it into the bit fields in SectionBase. - NumRelocations and AreRelocsRela were previously also taking up a word despite only using half of it. Move them into the alignment gap after SectionBase's fields. Differential Revision: https://reviews.llvm.org/D59044 llvm-svn: 355622	2019-03-07 18:48:12 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Rui Ueyama	c9c34bdc1a	Do not use a hash table to uniquify mergeable strings. Previously, we had a hash table containing strings and their offsets to manage mergeable strings. Technically we can live without that, because we can do binary search on a vector of mergeable strings to find a mergeable string. We did have both the hash table and the binary search because we thought that that is faster. We recently observed that lld tends to consume more memory than gold when building an output with debug info. A few percent of memory is consumed by the hash table. So, we needed to reevaluate whether or not having the extra hash table is a good CPU/memory tradeoff. I ran a few benchmarks with and without the hash table. I got a mixed result for the benchmark. We observed a regression for some programs by removing the hash table (that's what we expected), but we also observed performance improvements for some programs. This is perhaps due to reduced memory usage. Differential Revision: https://reviews.llvm.org/D55234 llvm-svn: 348401	2018-12-05 19:13:31 +00:00
George Rimar	0fc5dcd1c8	[LLD][ELF] - Simplify. NFCI. This makes getRISCVPCRelHi20 to be static local helper, and rotates the 'if' condition. llvm-svn: 347497	2018-11-23 15:13:26 +00:00
Rui Ueyama	e28c146423	Avoid unnecessary buffer allocation and memcpy for compressed sections. Previously, we uncompress all compressed sections before doing anything. That works, and that is conceptually simple, but that could result in a waste of CPU time and memory if uncompressed sections are then discarded or just copied to the output buffer. In particular, if .debug_gnu_pub{names,types} are compressed and if no -gdb-index option is given, we wasted CPU and memory because we uncompress them into newly allocated buffers and then memcpy the buffers to the output buffer. That temporary buffer was redundant. This patch changes how to uncompress sections. Now, compressed sections are uncompressed lazily. To do that, `Data` member of `InputSectionBase` is now hidden from outside, and `data()` accessor automatically expands a compressed buffer if necessary. If no one calls `data()`, then `writeTo()` directly uncompresses compressed data into the output buffer. That eliminates the redundant memory allocation and redundant memcpy. This patch significantly reduces memory consumption (20 GiB max RSS to 15 GiB) for an executable whose .debug_gnu_pub{names,types} are in total 5 GiB in an uncompressed form. Differential Revision: https://reviews.llvm.org/D52917 llvm-svn: 343979	2018-10-08 16:58:59 +00:00
Rui Ueyama	e7688e6663	Revert r342297: Discard uncompressed buffer after creating .gdb_index contents. Looks like it broke some local builds that use -gdb-index. llvm-svn: 342298	2018-09-14 23:28:13 +00:00
Rui Ueyama	751dfbe39b	Discard uncompressed buffer after creating .gdb_index contents. Once we create .gdb_index contents, .zdebug_gnu_pub{names,types} are useless, so there's no need to keep their uncompressed data in memory. I observed that for a test case in which lld creates a 3GB .gdb_index section, the maximum resident set size reduced from 43GB to 29GB after this patch. Differential Revision: https://reviews.llvm.org/D52126 llvm-svn: 342297	2018-09-14 22:57:39 +00:00
Rui Ueyama	5cd9c6bcd8	Support RISC-V Patch by PkmX. This patch makes lld recognize RISC-V target and implements basic relocation for RV32/RV64 (and RVC). This should be necessary for static linking ELF applications. The ABI documentation for RISC-V can be found at: https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md. Note that the documentation is far from complete so we had to figure out some details from bfd. The patch should be pretty straightforward. Some highlights: - A new relocation Expr R_RISCV_PC_INDIRECT is added. This is needed as the low part of a PC-relative relocation is linked to the corresponding high part (auipc), see: https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses - LLVM's MC support for RISC-V is very incomplete (we are working on this), so tests are given in objectyaml format with the original assembly included in the comments. Once we have complete support for RISC-V in MC, we can switch to llvm-as/llvm-objdump. - We don't support linker relaxation for now as it requires greater changes to lld that is beyond the scope of this patch. Once this is accepted we can start to work on adding relaxation to lld. Differential Revision: https://reviews.llvm.org/D39322 llvm-svn: 339364	2018-08-09 17:59:56 +00:00
Sterling Augustine	4fd84c18df	Implement framework for linking split-stack object files, and x86_64 support. llvm-svn: 337332	2018-07-17 23:16:02 +00:00
Peter Collingbourne	11dc7fcae2	ELF: Do not ICF two sections with different output sections. Note that this doesn't do the right thing in the case where there is a linker script. We probably need to move output section assignment before ICF to get the correct behaviour here. Differential Revision: https://reviews.llvm.org/D47241 llvm-svn: 333052	2018-05-23 01:58:43 +00:00
Peter Smith	dbef8cc67c	[ELF] Implement --keep-unique option The --keep-unique <symbol> option is taken from gold. The intention is that <symbol> will be prevented from being folded by ICF. Although not specifically mentioned in the documentation <symbol> only matches global symbols, with a warning if the symbol is not found. The implementation finds the Section defining <symbol> and removes it from the set of sections considered for ICF. Differential Revision: https://reviews.llvm.org/D46755 llvm-svn: 332332	2018-05-15 08:57:21 +00:00
Rafael Espindola	9bf1006278	Split merge sections early. Now that getSectionPiece is fast (uses a hash) it is probably OK to split merge sections early. The reason I want to do this is to split eh_frame sections in the same place. This does mean that we have to decompress early. Given that the only compressed sections are debug info, I don't think we are missing much. It is a small improvement: 0.5% on the geometric mean. llvm-svn: 331058	2018-04-27 16:29:57 +00:00
Rafael Espindola	4809e2c11d	Define InputSection::getOffset inline. This is much simpler than the other section types and there are many places where the section type is statically known. llvm-svn: 330350	2018-04-19 18:00:46 +00:00
Rafael Espindola	6275a7aa39	Rename MergeInputSection::getOffset. Unlike the getOffset in the base class, this one computes the offset in the parent synthetic section, not the final output section. llvm-svn: 330339	2018-04-19 16:05:07 +00:00
Rafael Espindola	ea2c78369c	Reduce code duplication. getVA was already implemented in the base class. llvm-svn: 330036	2018-04-13 16:07:27 +00:00
Rafael Espindola	5f8e77afb5	Initialize OutputOff to zero. We have a dedicated Live bit, so we don't need a special value and we were not accounting for in at least one place. llvm-svn: 329307	2018-04-05 15:56:04 +00:00
Rafael Espindola	6cd7af51e1	Inline initOffsetMap. In the lld perf builder r328686 had a negative impact in stalled-cycles-frontend. Somehow that stat is not showing on my machine, but the attached patch shows an improvement on cache-misses, which is probably a reasonable proxy. My working theory is that given a large input the pieces vector is out of cache by the time initOffsetMap runs. Both finalizeContents implementation have a convenient location for initializing the OffsetMap, so this seems the best solution. llvm-svn: 329117	2018-04-03 21:38:18 +00:00
Rafael Espindola	816127ea17	Initialize OffsetMap in a known location. This is a small optimization and avoids the need to use call_once. llvm-svn: 328686	2018-03-28 03:20:18 +00:00
Rafael Espindola	92eba0e14a	Define a trivial method inline. llvm-svn: 328685	2018-03-28 03:14:11 +00:00
Rafael Espindola	5a7ca96e2d	Store live offsets as uint32_t. We don't support input merge sections larger than 4gb, so these can be uint32_t. llvm-svn: 328684	2018-03-28 02:32:31 +00:00
Rafael Espindola	4f058a2c6b	Add a SectionBase::getVA helper. NFC. There were a few too many places duplicating this. llvm-svn: 328402	2018-03-24 00:35:11 +00:00
Rui Ueyama	ac114d27ae	s/uncompress/decompress/g. In lld, we use both "uncompress" and "decompress" which is confusing. Since LLVM uses "decompress", we should use the same term. llvm-svn: 324944	2018-02-12 21:56:14 +00:00
Rafael Espindola	c7945c827d	Move function to the file where it is used. llvm-svn: 323780	2018-01-30 16:24:04 +00:00
Rafael Espindola	9a84f6b954	Detemplate reportDuplicate. We normally avoid "switch (Config->EKind)", but in this case I think it is worth it. It is only executed when there is an error and it allows detemplating a lot of code. llvm-svn: 321404	2017-12-23 17:21:39 +00:00
Rafael Espindola	ce3b52c186	Pass an InputFile to the InputSection constructor. This simplifies toRegularSection and reduces the noise in a followup patch. llvm-svn: 321240	2017-12-21 02:11:51 +00:00
Rafael Espindola	604032729c	Convert a few more InputFiles to references. We use null files in sections to represent linker created sections, so ObjFile<ELFT> is never null. llvm-svn: 321238	2017-12-21 02:03:39 +00:00
Rafael Espindola	5c73c49c9f	Detemplate createCommentSection. It was only templated so it could create a dummy section header that was immediately parsed back. llvm-svn: 321235	2017-12-21 01:21:59 +00:00
Rafael Espindola	f4fb5fd752	Move Repl to SectionBase. It is currently in InputSectionBase. Only InputSections are used in ICF, so Repl should be move to InputSection to clear the class hierarchy or, like this patch does, to SectionBase for convenience. The convenience of having it on the base class is that we can just access the replacement without having to first check if it is an InputSection. It is a bit less code and a bit faster as some of this code is very hot. I got up to 1.77% improvement in clang-gdb-index and no regressions according to lnt. llvm-svn: 320654	2017-12-13 22:59:23 +00:00
Rafael Espindola	b01cd86458	Fix the type of the Discarded section. It is constructed with a kind of Regular and will dyn_cast to InputSection, but is declared to be an InputSectionBase. llvm-svn: 320539	2017-12-13 01:39:35 +00:00
Rafael Espindola	10bcc1cf90	Fix line endings. NFC. llvm-svn: 320502	2017-12-12 17:37:01 +00:00
James Henderson	8d0efdd5db	[ELF] Reset OutputSection size prior to processing linker script commands The size of an OutputSection is calculated early, to aid handling of compressed debug sections. However, subsequent to this point, unused synthetic sections are removed. In the event that an OutputSection, from which such an InputSection is removed, is still required (e.g. because it has a symbol assignment), and no longer has any InputSections, dot assignments, or BYTE()-family directives, the size member is never updated when processing the commands. If the removed InputSection had a non-zero size (such as a .got.plt section), the section ends up with the wrong size in the output. The fix is to reset the OutputSection size prior to processing the linker script commands relating to that OutputSection. This ensures that the size is correct even in the above situation. Additionally, to reduce the risk of developers misusing OutputSection Size and InputSection OutSecOff, they are set to simply the number of InputSections in an OutputSection, and the corresponding index respectively. We cannot completely stop using them, due to SHF_LINK_ORDER sections requiring them. Compressed debug sections also require the full size. This is now calculated in maybeCompress for these kinds of sections. Reviewers: ruiu, rafael Differential Revision: https://reviews.llvm.org/D38361 llvm-svn: 320472	2017-12-12 11:51:13 +00:00
Rafael Espindola	bdcfb178b5	Delete dead code. llvm-svn: 319403	2017-11-30 05:52:42 +00:00
Peter Collingbourne	e9a9e0a1e7	ELF: Merge DefinedRegular and Defined. Now that DefinedRegular is the only remaining derived class of Defined, we can merge the two classes. Differential Revision: https://reviews.llvm.org/D39667 llvm-svn: 317448	2017-11-06 04:35:31 +00:00
Peter Collingbourne	6c55a70838	ELF: Remove DefinedCommon. Common symbols are now represented with a DefinedRegular that points to a BssSection, even during symbol resolution. Differential Revision: https://reviews.llvm.org/D39666 llvm-svn: 317447	2017-11-06 04:33:58 +00:00
Rui Ueyama	f52496e1e0	Rename SymbolBody -> Symbol Now that we have only SymbolBody as the symbol class. So, "SymbolBody" is a bit strange name now. This is a mechanical change generated by perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF) and clang-format-diff. Differential Revision: https://reviews.llvm.org/D39459 llvm-svn: 317370	2017-11-03 21:21:47 +00:00
George Rimar	82f0c42dad	[ELF] - Teach LLD to report line numbers for data symbols. This is PR34826. Currently LLD is unable to report line number when reporting duplicate declaration of some variable. That happens because for extracting line information we always use .debug_line section content which describes mapping from machine instructions to source file locations, what does not help for variables as does not describe them. In this patch I am taking the appropriate information about variables locations from the .debug_info section. Differential revision: https://reviews.llvm.org/D38721 llvm-svn: 317080	2017-11-01 07:42:38 +00:00
Rui Ueyama	95c142e208	Revert r316305: Remove a fast lookup table from MergeInputSection. This reverts commit r316305 because performance regression was observed. llvm-svn: 317026	2017-10-31 19:14:06 +00:00
Rui Ueyama	ea1398eeb3	Move "Assigned" bit from SectionBase to InputSectionBase. This bit is to manage whether an input section has already been assigned to some output section by linker scripts or not. So it logically belongs to InputSectionBase. SectionBase is a common base class for input and output sections, so that wasn't the right place to define the bit. llvm-svn: 316879	2017-10-29 23:32:23 +00:00
Rui Ueyama	da4d26dfa3	Initialize members not by assignment but by the member initializer list. llvm-svn: 316876	2017-10-29 22:26:52 +00:00
Rui Ueyama	d6b7a390d8	De-template elf::getObjMsg. NFC. llvm-svn: 316732	2017-10-27 03:13:54 +00:00
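
The collision estimate quoted in commit 957c356ffe (2^(-31) * C(310253,2) ~= 22.41) can be reproduced with a short calculation. The sketch below is only an illustration of that arithmetic, assuming hash values are uniformly distributed; the function name `expected_collisions` is hypothetical and is not part of lld.

```python
from math import comb


def expected_collisions(num_strings: int, hash_bits: int) -> float:
    """Expected number of colliding pairs among num_strings items,
    assuming each item gets an independent, uniform hash of hash_bits bits.

    Each of the C(n, 2) pairs collides with probability 2**-hash_bits.
    """
    return comb(num_strings, 2) / 2**hash_bits


if __name__ == "__main__":
    # Figures taken from the commit message: 310253 strings, 31-bit hash.
    print(round(expected_collisions(310253, 31), 2))
    # The same input with the original 32-bit hash, for comparison.
    print(round(expected_collisions(310253, 32), 2))
```

With the commit's figures the first value comes out at about 22.41, matching the message. Dropping one bit doubles the expected number of colliding pairs, which stays negligible next to 310253 entries.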


274 Commits