It looks like the contents of the table need to be sorted according to its
value, so that the runtime can find the entry by binary search. I'm not 100%
sure if we really have to do that, but at least I can say it's safe to do
because the contents of .sxdata is just a list of exception handlers' RVAs.
llvm-svn: 202550
If all input files are compatible with Structured Exception Handling, linker
is supposed to create an exectuable with a table for SEH handlers. The table
consists of exception handlers entry point addresses.
The basic idea of SEH in x86 Microsoft ABI is to list all valid entry points
of exception handlers in an read-only memory, so that an attacker cannot
override the addresses in it. In x86 ABI, data for exception handling is mostly
on stack, so it's volnerable to stack overflow attack. In order to protect
against it, Windows runtime uses the table to check a return address, to
ensure that the address is really an valid entry point for an exception handler.
Compiler emits a list of exception handler functions to .sxdata section. It
also emits a marker symbol "@feat.00" to indicate that the object is compatible
with SEH. SEH is a relatively new feature for COFF, and mixing SEH-compatible
and SEH-incompatible objects will result in an invalid executable, so is the
marker.
If all input files are compatible with SEH, LLD emits a SEH table. SEH table
needs to be pointed by Load Configuration strucutre, so when emitting a SEH
table LLD emits it too. The address of a Load Configuration will be stored to
the file header.
llvm-svn: 202248
This restores the debug output to how it was before r197727 broke it. This
went undetected because the corresponding test was never run due to broken
feature detection.
llvm-svn: 202079
The sections .rela/.rel.(*) have a alignment of 2 in the final image created by
the linker. This needs to be properly set to the right alignment depending on
the architecture(32/64bits).
llvm-svn: 201740
The target machine type affects the meaning of other options, in particular
how to mangle symbols. So we want to handle the option first and then parse
all the other options.
llvm-svn: 200589
Because the object files are now readable to humans, I don't think we need the
source assembly file any more, so I removed them too in this commit.
llvm-svn: 200276
Add new virtual virtual function `isRelaOutputFormat` to the
`ELFLinkingContext` class. Call this function everywhere we need to
select a relocation table format.
Patch reviewed by Shankar Easwaran and Rui Ueyama.
llvm-svn: 199973
The main goal of this patch is to allow "mach-o encoded as yaml" and "native
encoded as yaml" documents to be intermixed. They are distinguished via
yaml tags at the start of the document. This will enable all mach-o test cases
to be written using yaml instead of checking in object files.
The Registry was extend to allow yaml tag handlers to be registered. The
mach-o Reader adds a yaml tag handler for the tag "!mach-o".
Additionally, this patch fixes some buffer ownership issues. When parsing
mach-o binaries, the mach-o atoms can have pointers back into the memory
mapped .o file. But with yaml encoded mach-o, name and content are ephemeral,
so a copyRefs parameter was added to cause the mach-o atoms to make their
own copy.
llvm-svn: 198986
Module-definition (.def) files are the file containing linker directives,
such as export symbols. Because link.exe supports the same features as command
line options, just as some Linker Script commands overlaps with command line
options, use of module-definition file is not really necessary. It provides
an alternative way to specify some linker options.
This patch implements EXPORTS directive. Other directives will be implemented
in the future.
llvm-svn: 198925
Currently LLD always print a warning message if the same symbol is specified
more than once for /export option. It's a bit annoying because specifying the
same symbol with compatible options is actually safe and considered as a
normal use case. This patch makes LLD to warn only when incompatible export
options are given.
llvm-svn: 198104
Currently .drectve section contents are parsed after other sections are parsed.
That order may result in wrong results if other sections depend on command line
options in the directive section.
For example, if a weak symbol is defined using /alternatename option in the
directive section, we have to read it first and then read the text section
contents. Otherwise the weak symbol won't be defined.
This patch changes the order to fix the issue.
llvm-svn: 198071
There was a bug that the linker does not report an error if symbols specified
by -u (or /include on Windows) are not resolved. This patch fixes it by adding
such symbols to the dead strip root.
llvm-svn: 198041
Subsystem field in the PE/COFF file header has no meanining for the DLL.
It looks like MSVC link.exe sets the default subsystem (Windows GUI) to
the field if no /subsystem option is specified.
llvm-svn: 198015
The main changes are in:
include/lld/Core/Reference.h
include/lld/ReaderWriter/Reader.h
Everything else is details to support the main change.
1) Registration based Readers
Previously, lld had a tangled interdependency with all the Readers. It would
have been impossible to make a streamlined linker (say for a JIT) which
just supported one file format and one architecture (no yaml, no archives, etc).
The old model also required a LinkingContext to read an object file, which
would have made .o inspection tools awkward.
The new model is that there is a global Registry object. You programmatically
register the Readers you want with the registry object. Whenever you need to
read/parse a file, you ask the registry to do it, and the registry tries each
registered reader.
For ease of use with the existing lld code base, there is one Registry
object inside the LinkingContext object.
2) Changing kind value to be a tuple
Beside Readers, the registry also keeps track of the mapping for Reference
Kind values to and from strings. Along with that, this patch also fixes
an ambiguity with the previous Reference::Kind values. The problem was that
we wanted to reuse existing relocation type values as Reference::Kind values.
But then how can the YAML write know how to convert a value to a string? The
fix is to change the 32-bit Reference::Kind into a tuple with an 8-bit namespace
(e.g. ELF, COFFF, etc), an 8-bit architecture (e.g. x86_64, PowerPC, etc), and
a 16-bit value. This tuple system allows conversion to and from strings with
no ambiguities.
llvm-svn: 197727
Executable files do not use a string table, so section names longer than 8
characters are not permitted. Long section names should just be truncated.
llvm-svn: 197470
If NONAME option is given for an export, that symbol will be exported only by
its ordinal. LLD will not emit the symbol name to the export table.
llvm-svn: 197371
OrdinalBase is an addend to the ordinals. We used to always set 1 to the field.
Although it produced a valid a DLL export table, it'd be a waste if the first
ordinal does not start with 1 -- we had to have NULL fields at the beginning of
the export address table. By setting the ordinal base, we can eliminate the
NULL fields.
llvm-svn: 197367
You can specify exported function's ordinal by /export:func,@<number> command
line option, but LLD ignored the option until now. This patch implements the
feature.
Ordinal is basically the index into the exported function address table. So,
for example, if /export:foo,@42 is specified, the linker writes foo's address
to 42th entry in the address table. Windows supports import-by-ordinal; you
can not only import a function by name, but by its ordinal. If you want to
allow your DLL users to import your functions by their ordinals, you need to
make sure that your functions are always exported with the same ordinals.
This is the feature for that situation.
llvm-svn: 197364
The following are the most significant peculiarities of MIPS target:
- MIPS ABI requires some special tags in the dynamic table.
- GOT consists of two parts local and global. The local part contains
entries refer locally visible symbols. The global part contains entries
refer global symbols.
- Entries in the .dynsym section which have corresponded entries in the
GOT should be:
* Emitted at the end of .dynsym section
* Sorted accordingly to theirs GOT counterparts
- There are "paired" relocations. One or more R_MIPS_HI16 and R_MIPS_GOT16
relocations should be followed by R_MIPS_LO16 relocation. To calculate
result of R_MIPS_HI16 and R_MIPS_GOT16 relocations we need to combine
addends from these relocations and paired R_MIPS_LO16 relocation.
The patch reviewed by Michael Spencer, Shankar Easwaran, Rui Ueyama.
http://llvm-reviews.chandlerc.com/D2156
llvm-svn: 197342
The only data in .edata whose length varies is the string. This patch moves
all the strings to the end of the section, so that 16-bit or 32-bit integers
are aligned on correct boundaries.
llvm-svn: 197213
This is the first patch to emit data for the DLL export table. The DLL export
table is the data used by the Windows loader to find the address of exported
function from DLL. With this patch, LLD is able to emit a valid DLL export
table which the Windows loader can interpret and load.
The data structure of the DLL export table is described in the Microsoft
PE/COFF Specification, section 5.3.
DLL support is not complete yet; the linker needs to emit an import library
for a DLL, otherwise the linker cannot link against the DLL. We also do not
support export-only-by-ordinal yet.
llvm-svn: 197212
If section size is not multiple of 512, the writer added NULL bytes at the end
of it to make it so. That is not required by the PE/COFF spec, and the MSVC's
linker does not do that too. So we don't need to do that, too.
llvm-svn: 197002
GroupedSectionsPass was a complicated pass. That pass's job was to reorder
atoms by section name, so that the atoms with the same section prefix will be
emitted consecutively to the executable. The pass added layout edges to atoms,
and let the layout pass to actually reorder them.
This patch simplifies the design by making GroupedSectionPass to directly
reorder atoms, rather than adding layout edges. This resembles ELF's
ArrayOrderPass.
This patch improves the performance of LLD; it used to take 7.1 seconds to
link LLD with LLD on my Macbook Pro, but it now takes 6.1 seconds.
llvm-svn: 196628
Emitting idata atoms to their own section would make debugging easier.
The Windows loader do not really care about whether the DLL import table is
in .rdata or its own .idata section, so there is no change in functionality.
llvm-svn: 196458
This is a patch to let the PECOFF writer to use the information passed
by the parser for /section option. The implementation of /section should
now be complete.
llvm-svn: 195893
Atom ordinals are the indeces in a file. Currently the PECOFF reader assigns
ordinals for each section, so it's (incorrectly) assigning duplicate ordinals.
llvm-svn: 195852
Instead of having multiple SectionChunks for each section (.text, .data,
.rdata and .bss), we could have one chunk writer that can emit any sections.
This patch does that -- removing all section-sepcific chunk writers and
replace them with one "generic" writer.
This change should simplify the code because it eliminates similar-but-
slightly-different classes.
It also fixes an issue in the previous design. Before this patch, we could
emit only limited set of sections (i.e. .text, .data, .rdata and .bss). With
this patch, we can emit any sections.
llvm-svn: 195797
Looks like -L paths are not positional. They need to be added to a list of
search paths and those needs to be searched when lld looks for a library.
llvm-svn: 195594
If /subsystem option is not specified, the linker needs to infer it from the
entry point function. If "main" or "wmain" is defined, it's a console
application. If "WinMain" or "wWinMain" is defined, it's a GUI application.
llvm-svn: 195592
This adds functionality to limit shared library undefined atoms to be added
only once by the Resolver.
Dynamic libraries may be processed more than once if they exist within a
Group.
Also adds a test to verify the change.
llvm-svn: 195307
The fallback atom was used only when it's searching for a symbol in a library;
if an undefined symbol was not found in a library, the LLD looked for its
fallback symbol in the library.
Although it worked in most cases, because symbols with fallbacks usually occur
only in OLDNAMES.LIB (a standard library), that behavior was incompatible with
link.exe. This patch fixes the issue so that the semantics is the same as
MSVC's link.exe
The new (and correct, I believe) behavior is this:
- If there's no definition for an undefined atom, replace the undefined atom
with its fallback and then proceed (e.g. look in the next file or stop
linking as usual.)
Weak External symbols are underspecified in the Microsoft PE/COFF spec. However,
as long as I observed the behavior of link.exe, this seems to be what we want
for compatibility.
Differential Revision: http://llvm-reviews.chandlerc.com/D2162
llvm-svn: 195269
We can add multiple undefined atoms having the same name to the symbol table.
If such atoms are added, the symbol table compares their canBeNull attributes,
and select one having a stronger constraint. If their canBeNulls are the same,
the choice is arbitrary. Currently it choose the existing one.
This patch changes the preference, so that the symbol table choose the new one
if the new atom has a greater canBeNull or a fallback atom. This shouldn't
change the behavior except the case described below.
A new undefined atom may have a new fallback atom attribute. By choosing the new
atom, we can update the fallback atom during Core Linking. PE/COFF actually need
that. For example, _lseek is an alias for __lseek on Windows. One of an object
file in OLDNAMES.LIB has an undefined atom for _lseek with the fallback to
__lseek. When the linker tries to resolve _read, it supposed to read the file
from OLDNAMES.LIB and use the new fallback from the file. Currently LLD cannot
handle such case because duplicate undefined atoms with the same attributes are
ignored.
Differential Revision: http://llvm-reviews.chandlerc.com/D2161
llvm-svn: 194777
This patch adds support for converting normalized mach-o to and from binary
mach-o. It also changes WriterMachO (which previously directly wrote a
mach-o binary given a set of Atoms) to instead do it in two steps. The first
step uses normalizedFromAtoms() to convert Atoms to normalized mach-o, and the
second step uses writeBinary() which to generate the mach-o binary file.
llvm-svn: 194167
This patch should fix the test when it runs on Windows, by allowing drive
letter separator (colon) in the path. Now all LLD ELF tests passed on MSVC
2012 32-bit. Hooray!
llvm-svn: 193978
On Windows, neither "(" nor ")" are shell special characters, so -\( is passed
as-is to LLD. Because of that this test was failing on Windows.
llvm-svn: 193905
This patch adds "-target x86_64" to the command line. Without this option,
a 32 bit object file would be created on 32 bit machine, resulting in test
failure.
llvm-svn: 193904
Enable this for the following flavors
a) core
b) gnu
c) darwin
Its disabled for the flavor PECOFF. Convenient markers are added with FIXME
comments in the Driver that would be removed and code removed from each flavor.
llvm-svn: 193585
__ImageBase is an absolute symbol whose address is the same as the image base
address. What we did before this patch was to create __ImageBase symbol as a
symbol whose *contents* (not location) is the image base address, which is
clearly wrong.
llvm-svn: 193565
Disable tests to be run with REQUIRES: disable. Note disable is not added to the
config by the test runner Mkaefiles, so essentially disables the test.
Code changes would be required to fix these tests :-
test/darwin/hello-world.objtxt
test/elf/check.test
test/elf/phdr.test
test/elf/ppc.test
test/elf/undef-from-main-dso.test
test/elf/X86_64/note-sections-ro_plus_rw.test
test/pecoff/alignment.test
test/pecoff/base-reloc.test
test/pecoff/bss-section.test
test/pecoff/drectve.test
test/pecoff/dynamic.test
test/pecoff/dynamicbase.test
test/pecoff/entry.test
test/pecoff/hello.test
test/pecoff/imagebase.test
test/pecoff/importlib.test
test/pecoff/lib.test
test/pecoff/multi.test
test/pecoff/reloc.test
test/pecoff/weak-external.test
llvm-svn: 193300
This patch fixes a bug in r190608. The results of a comparison function
passed to std::sort must be transitive, which is, if a < b and b < c, and if
a != b, a < c must be also true. CompareAtoms::compare did not actually
guarantee the transitivity. As a result the sort results were sometimes just
wrong.
Consider there are three atoms, X, Y, and Z, whose file ordinals are 1, 2, 3,
respectively. Z has a property "layout-after X". In this case, all the
following conditionals become true:
X < Y because X's ordinal is less than Y's
Y < Z because Y's ordinal is less than Z's
Z < X because of the layout-after relationship
This is not of course transitive. The reason why this happened is because
we used follow-on relationships for comparison if two atoms falls in the same
follow-on chain, but we used each atom's properties if they did not. This patch
fixes the issue by using follow-on root atoms for comparison to get consistent
results.
Differential Revision: http://llvm-reviews.chandlerc.com/D1980
llvm-svn: 193029
We should dead-strip atoms only if they are created for COMDAT symbols. If we
remove non-COMDAT atoms from a binary, it will no longer be guaranteed that
the binary will work correctly.
In COFF, you can manipulate the order of section contents in the resulting
binary by section name. For example, if you have four sections
.data$unique_prefix_{a,b,c,d}, it's guaranteed that the contents of A, B, C,
and D will be consecutive in the resulting .data section in that order.
Thus, you can access B's and C's contents by incrementing a pointer pointing
to A until it reached to D. That's why we cannot dead-strip B or C even if
no one is directly referencing to them.
Some object files in the standard library actually use that technique.
llvm-svn: 193017
Dead-strip root symbols can be undefined atoms, but should not really be
nonexistent, because dead-strip root symbols should be added to initial
undefined atoms at startup. Whenever you look up its name in the symbol
table, some type of atom will always exist.
llvm-svn: 192831
There are aliases for --start-group/--end-group options represented
by -( and -) respectively in the command line.
This change adds and improves the test for the alias options to be
tested.
Looks like users use this option widely than explicitly using
--start-group/--end-group.
llvm-svn: 192470
-- so that command line options to specify new input files, such as
/defaultlib:foo, is handled properly. Such options were ignored before
this patch.
llvm-svn: 192342
This associates resolveState to FileNodes. The control node derive
their resolution state from the inputElements that are contained in
it.
This makes --start-group/--end-group to work with ELF linking.
llvm-svn: 192269
Changes :-
a) Functionality in InputGraph to insert Input elements at any position
b) Functionality in the Resolver to use nextFile
c) Move the functionality of assigning file ordinals to InputGraph
d) Changes all inputs to MemoryBuffers
e) Remove LinkerInput, InputFiles, ReaderArchive
llvm-svn: 192081
Found this with asan. Code assumes that find doesn't return end, thus if
both atoms didn't have followon roots it would still compare their positions.
llvm-svn: 191865
This will eventually need to be refactored to better handle COPY relocations,
as other relocations can also generate them. I'm not yet sure the exact
circumstances in which they are needed yet.
llvm-svn: 191567
We used to support both Windows and Unix style command line options. In Windows
style, an option and its value are separated by ":" (colon). In Unix, separator
is a space. Accepting both styles were convenient, but we can no longer allow
Unix style because I found that can be ambiguous.
For example, /nodefaultlib option takes an optional argument. In Windows style
it's going to be something like "/nodefaultlib:foo". There's no ambiguity what
"foo" means. However, if the option is "/nodefaultlib foo", "foo" can be
interpreted either an optional argument for "/nodefaultlib" or an input file
"foo.obj". We should just stop accepting the non-standard command line style.
llvm-svn: 191247
Summary:
This patch changes WriterPECOFF to actually write down the address instead of ignoring it.
Also, it changes the order of adding the BaseReloc chunk as otherwise the address wasn't set yet.
I think a better way of doing it would be to change DataDirectoryAtom to create a Reference
instead of using a number, and to change IdataPass accordingly, but I'm not sure how to do that.
Reviewers: ruiu
Reviewed By: ruiu
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1743
llvm-svn: 191220
Summary: This patch changes WritePECOFF to calculate the value of the SizeOfHeaders PE header field instead of just using 512.
Reviewers: rui314, ruiu
Reviewed By: ruiu
CC: llvm-commits, ruiu
Differential Revision: http://llvm-reviews.chandlerc.com/D1708
llvm-svn: 191212
This adds an option --output-filetype that can be set to either
YAML/Native(case insensitive). The linker would create the outputs
associated with the type specified by the user.
Changes all the tests to use the new option.
llvm-svn: 191183
This also makes it support debugging executables built with lld.
Initial patch done by Bigcheese. This is only a revised patch to
have the functionality in the Writer.
llvm-svn: 191032
This sets the sectionChoice property for DefinedAtoms. The output section name
is derived by the property of the atom. This also decreases native file size.
Adds a test.
llvm-svn: 190840
This patch changes lld to go through all sections while calculating the size
for SizeOfCode, SizeOfInitializedData and SizeOfUninitializedData fields in the
PE header, instead of using only a small set of hard-coded sections.
This only really changes SizeOfInitializedData which didn't include .reloc
section before this patch.
Patch by Ron Ofir.
llvm-svn: 190799
This patch sets the IMAGE_SCN_MEM_DISCARDABLE characteristic to the base
relocations section in order to match MS PECOFF specification.
Patch by Ron Ofir.
llvm-svn: 190798
There was a bug that if a section has an alignment requirement and there are
multiple symbols at offset 0 in the section, only the last atom at offset 0
would be aligned properly. That bug would move only the last symbol to an
alignment boundary, leaving other symbols unaligned, although they should be at
the same location. That caused a mysterious SEGV error of the resultant
executable.
With this patch, we manage all symbols at the same location properly, rather
than keeping the last one.
llvm-svn: 190724
This handles multiple weak symbols which appear back to back. This fix is needed
which otherwise will lead to symbols getting initialized to arbitrary values.
There was a constructor/destructor test that really triggered this to be fixed
on X86_64.
Adds a test.
llvm-svn: 190658
In COFF, an undefined symbol can have up to one alternative name. If a symbol
is resolved by its regular name, then it's linked normally. If a symbol is not
found in any input files, all references to the regular name are resolved using
the alternative name. If the alternative name is not found, it's a link error.
This mechanism is called "weak externals".
To support this mechanism, I added a new member function fallback() to undefined
atom. If an undefined atom has the second name, fallback() returns a new undefined
atom that should be used instead of the original one to resolve undefines. If it
does not have the second name, the function returns nullptr.
Differential Revision: http://llvm-reviews.chandlerc.com/D1550
llvm-svn: 190625
We need to order atoms that exist in the same chain. This is to make sure that
the command line order is preserved when we emit the atoms to the output file.
Credits: BigCheese for finding the bug.
Adds a test which otherwise would fail.
llvm-svn: 190608
attribute in LinkerInput to isWholeArchive and use that for deciding
whether library archives should be expanded. Implement the -all_load
option of the Darwin linker using this flag and drop the support for it
in GNU mode.
llvm-svn: 190275