llvm-project/lld/COFF/Writer.h

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

86 lines
2.6 KiB
C
Raw Normal View History

//===- Writer.h -------------------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef LLD_COFF_WRITER_H
#define LLD_COFF_WRITER_H
#include "Chunks.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Object/COFF.h"
#include <chrono>
#include <cstdint>
#include <vector>
namespace lld {
namespace coff {
static const int pageSize = 4096;
void writeResult();
class PartialSection {
public:
PartialSection(StringRef n, uint32_t chars)
: name(n), characteristics(chars) {}
StringRef name;
unsigned characteristics;
std::vector<Chunk *> chunks;
};
// OutputSection represents a section in an output file. It's a
// container of chunks. OutputSection and Chunk are 1:N relationship.
// Chunks cannot belong to more than one OutputSections. The writer
// creates multiple OutputSections and assign them unique,
// non-overlapping file offsets and RVAs.
class OutputSection {
public:
COFF: Use (name, output characteristics) as a key when grouping input sections into output sections. This is what link.exe does and lets us avoid needing to worry about merging output characteristics while adding input sections to output sections. With this change we can't process /merge in the same way as before because sections with different output characteristics can still be merged into one another. So this change moves the processing of /merge to just before we assign addresses. In the case where there are multiple output sections with the same name, link.exe only merges the first section with the source name into the first section with the target name, and we do the same. At the same time I also implemented transitive merging (which means that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a). This isn't quite enough though because link.exe has a special case for .CRT in 32-bit mode: it processes sections whose output characteristics are DATA | R | W as though the output characteristics were DATA | R (so that they get merged into things like constructor lists in the expected way). Chromium has a few such sections, and it turns out that those sections were causing the problem that resulted in r318699 (merge .xdata into .rdata) being reverted: because of the previous permission merging semantics, the .CRT sections were causing the entire .rdata section to become writable, which caused the SEH runtime to crash because it apparently requires .xdata to be read-only. This change also implements the same special case. This should unblock being able to merge .xdata into .rdata by default, as well as .bss into .data, both of which will be done in followups. Differential Revision: https://reviews.llvm.org/D45801 llvm-svn: 330479
2018-04-21 05:10:33 +08:00
OutputSection(llvm::StringRef n, uint32_t chars) : name(n) {
header.Characteristics = chars;
}
void addChunk(Chunk *c);
[COFF] Provide __CTOR_LIST__ and __DTOR_LIST__ symbols for MinGW MinGW uses these kind of list terminator symbols for traversing the constructor/destructor lists. These list terminators are actual pointers entries in the lists, with the values 0 and (uintptr_t)-1 (instead of just symbols pointing to the start/end of the list). (This mechanism exists in both the mingw-w64 crt startup code and in libgcc; normally the mingw-w64 one is used, but a DLL build of libgcc uses the libgcc one. Therefore it's not trivial to change the mechanism without lots of cross-project synchronization and potentially invalidating some combinations of old/new versions of them.) When mingw-w64 has been used with lld so far, the CRT startup object files have so far provided these symbols, ending up with different, incompatible builds of the CRT startup object files depending on whether binutils or lld are going to be used. In order to avoid the need of different configuration of the CRT startup object files depending on what linker to be used, provide these symbols in lld instead. (Mingw-w64 checks at build time whether the linker provides these symbols or not.) This unifies this particular detail between the two linkers. This does disallow the use of the very latest lld with older versions of mingw-w64 (the configure check for the list was added recently; earlier it simply checked whether the CRT was built with gcc or clang), and requires rebuilding the mingw-w64 CRT. But the number of users of lld+mingw still is low enough that such a change should be tolerable, and unifies this aspect of the toolchains, easing interoperability between the toolchains for the future. The actual test for this feature is added in ctors_dtors_priority.s, but a number of other tests that checked absolute output addresses are updated. Differential Revision: https://reviews.llvm.org/D52053 llvm-svn: 342294
2018-09-15 06:26:59 +08:00
void insertChunkAtStart(Chunk *c);
COFF: Use (name, output characteristics) as a key when grouping input sections into output sections. This is what link.exe does and lets us avoid needing to worry about merging output characteristics while adding input sections to output sections. With this change we can't process /merge in the same way as before because sections with different output characteristics can still be merged into one another. So this change moves the processing of /merge to just before we assign addresses. In the case where there are multiple output sections with the same name, link.exe only merges the first section with the source name into the first section with the target name, and we do the same. At the same time I also implemented transitive merging (which means that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a). This isn't quite enough though because link.exe has a special case for .CRT in 32-bit mode: it processes sections whose output characteristics are DATA | R | W as though the output characteristics were DATA | R (so that they get merged into things like constructor lists in the expected way). Chromium has a few such sections, and it turns out that those sections were causing the problem that resulted in r318699 (merge .xdata into .rdata) being reverted: because of the previous permission merging semantics, the .CRT sections were causing the entire .rdata section to become writable, which caused the SEH runtime to crash because it apparently requires .xdata to be read-only. This change also implements the same special case. This should unblock being able to merge .xdata into .rdata by default, as well as .bss into .data, both of which will be done in followups. Differential Revision: https://reviews.llvm.org/D45801 llvm-svn: 330479
2018-04-21 05:10:33 +08:00
void merge(OutputSection *other);
void setPermissions(uint32_t c);
uint64_t getRVA() { return header.VirtualAddress; }
uint64_t getFileOff() { return header.PointerToRawData; }
void writeHeaderTo(uint8_t *buf);
void addContributingPartialSection(PartialSection *sec);
// Returns the size of this section in an executable memory image.
// This may be smaller than the raw size (the raw size is multiple
// of disk sector size, so there may be padding at end), or may be
// larger (if that's the case, the loader reserves spaces after end
// of raw data).
uint64_t getVirtualSize() { return header.VirtualSize; }
// Returns the size of the section in the output file.
uint64_t getRawSize() { return header.SizeOfRawData; }
// Set offset into the string table storing this section name.
// Used only when the name is longer than 8 bytes.
void setStringTableOff(uint32_t v) { stringTableOff = v; }
// N.B. The section index is one based.
uint32_t sectionIndex = 0;
llvm::StringRef name;
COFF: Use (name, output characteristics) as a key when grouping input sections into output sections. This is what link.exe does and lets us avoid needing to worry about merging output characteristics while adding input sections to output sections. With this change we can't process /merge in the same way as before because sections with different output characteristics can still be merged into one another. So this change moves the processing of /merge to just before we assign addresses. In the case where there are multiple output sections with the same name, link.exe only merges the first section with the source name into the first section with the target name, and we do the same. At the same time I also implemented transitive merging (which means that /merge:.c=.b /merge:.b=.a merges both .c and .b into .a). This isn't quite enough though because link.exe has a special case for .CRT in 32-bit mode: it processes sections whose output characteristics are DATA | R | W as though the output characteristics were DATA | R (so that they get merged into things like constructor lists in the expected way). Chromium has a few such sections, and it turns out that those sections were causing the problem that resulted in r318699 (merge .xdata into .rdata) being reverted: because of the previous permission merging semantics, the .CRT sections were causing the entire .rdata section to become writable, which caused the SEH runtime to crash because it apparently requires .xdata to be read-only. This change also implements the same special case. This should unblock being able to merge .xdata into .rdata by default, as well as .bss into .data, both of which will be done in followups. Differential Revision: https://reviews.llvm.org/D45801 llvm-svn: 330479
2018-04-21 05:10:33 +08:00
llvm::object::coff_section header = {};
std::vector<Chunk *> chunks;
std::vector<Chunk *> origChunks;
std::vector<PartialSection *> contribSections;
private:
uint32_t stringTableOff = 0;
};
} // namespace coff
} // namespace lld
#endif