Accept /machine:arm as an argument. This is changed to support ARM NT.
Although there is no way to differentiate between ARM (Windows CE) and ARM NT
(Windows on ARM), since LLVM currently only supports Windows on ARM, simply take
/machine:arm to mean Windows on ARM.
llvm-svn: 218105
Rather than saving whether we are targeting 64-bit x86 (x86_64), simply convert
the single use of that information to the actual relocation type. This will
permit the selection of non-x86 relocation types (e.g. for WoA support).
Inline the access of the machine type field as it is relatively cheap (a couple
of pointer dereferences) rather than storing the relocation type as a member
variable.
llvm-svn: 218104
When we encounter an unknown machine type, we print out the machine type magic.
However, we would print out the magic in decimal rather than hex. Perform this
conversion to make it easier to identify what machine is unsupported.
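
A minimal sketch of the hex formatting described above. The helper names and the message wording are hypothetical and are not taken from the lld sources.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: renders a COFF machine type as "0x" plus hex digits,
// which is how the values are listed in the PE/COFF documentation.
std::string formatMachineType(uint16_t machine) {
  std::ostringstream os;
  os << "0x" << std::hex << machine;
  return os.str();
}

// Hypothetical helper: builds the diagnostic for an unsupported machine.
std::string unknownMachineMessage(uint16_t machine) {
  return "unknown machine type: " + formatMachineType(machine);
}
```

With this, the ARM NT machine value 452 is reported as 0x1c4, which can be looked up directly.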
llvm-svn: 218103
This patch fixes a forbidden use of Twine. It should only be used
as an intermediary value, but never stored.
This caused a bug in lld when running on Linux and compiled with
optimizations - it couldn't properly search libs.
Patch from Rafael Auler!
llvm-svn: 218083
I made LLD report an error if the /safeseh:no option is given on x64,
but it turned out that MSVC link.exe does not report an error for it.
Removing the check.
llvm-svn: 218077
The contents of sections .CRT$XLA through .CRT$XLZ form an array of function
pointers. They are called by the runtime when a new thread is created
or (gracefully) terminated.
You can make your own initialization function to be called by that
mechanism. All you have to do is:
- Define a pointer to a function in a .CRT$XL* section using pragma
- Make an external reference to "__tls_used" symbol
That technique is used in many projects. This patch is to support that.
What this patch does is to set the relative virtual address of
"__tls_used" to the PECOFF directory table. __tls_used is actually a
struct containing pointers to a symbol in .CRT$XLA and another symbol
in .CRT$XLZ. The runtime looks at the directory table, gets the address
of the struct, and calls the function pointers between XLA and XLZ.
llvm-svn: 218007
On darwin, the linker records which dylib (DSO) each undefined was found
in, and then at runtime, the loader (dyld) only looks in that one specific
dylib for each undefined symbol. Now that llvm-objdump can display that info
I can write test cases.
llvm-svn: 217898
The provided base must also be a multiple of the system's page size, which is a
reasonable enough demand.
Also check the other diagnostics more thoroughly.
llvm-svn: 217577
The existing system linkers on Darwin and Linux are called "ld". We'd like to
eventually drop in lld as "ld" and have it just work. But lld is a universal
linker that requires the first option to be -flavor to know which command line
mode to emulate (gnu or darwin).
This change tests if argv[0] is "ld" and if so, if the tool was built on MacOSX
then assume the darwin flavor otherwise the gnu flavor. There are two test
cases which copy lld to "ld" and then run it. One for darwin and one for linux.
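
The selection rule can be sketched as follows. This is an illustration only: the enum, the function names, and the stripping of leading directories from argv[0] are assumptions, not the driver's actual code.

```cpp
#include <cassert>
#include <string>

enum class Flavor { Unknown, Gnu, Darwin };

// Assumption of this sketch: argv[0] may carry a directory prefix,
// so only the last path component is compared.
std::string baseName(const std::string &path) {
  std::string::size_type pos = path.find_last_of("/\\");
  return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Returns Unknown when the tool was not invoked as "ld"; the caller would
// then keep requiring an explicit -flavor option.
Flavor flavorFromArgv0(const std::string &argv0, bool builtOnDarwin) {
  if (baseName(argv0) != "ld")
    return Flavor::Unknown;
  return builtOnDarwin ? Flavor::Darwin : Flavor::Gnu;
}
```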
llvm-svn: 217566
lld shouldn't directly use the COFF header nor should it use raw
coff_symbols. Instead, query the header properties from the
COFFObjectFile and use COFFSymbolRef to abstractly reference COFF
symbols.
This is just enough to get lld compiling with the changes to
llvm::object. Bigobj specific testing will come later.
Differential Revision: http://reviews.llvm.org/D5280
llvm-svn: 217497
Most of the changes are in the new file ArchHandler_arm64.cpp. But a few
things had to be fixed to support 16KB pages (instead of 4KB) which iOS arm64
requires. In addition the StubInfo struct had to be expanded because
arm64 uses two instructions (ADRP/LDR) to load a global which requires two
relocations. The other mach-o arches just needed one relocation.
llvm-svn: 217469
There is a bit (MH_PIE) in the flags field of the mach_header which tells
the kernel if a program was built position independent (for ASLR). The linker
automatically attempts to build programs PIE if they are built for a recent
OS version. But the -pie and -no_pie options override that default behavior.
llvm-svn: 217408
defined in a shared library.
Currently LLD does not export a strong defined symbol if it coalesces away a
weak symbol defined in a shared library. This bug affects all ELF
architectures and leads to segfault:
% cat foo.c
extern int __attribute__((weak)) flag;
int foo() { return flag; }
% cat main.c
int flag = 1;
int foo();
int main() { return foo() == 1 ? 0 : -1; }
% clang -c -fPIC foo.c main.c
% lld -flavor gnu -target x86_64 -shared -o libfoo.so ... foo.o
% lld -flavor gnu -target x86_64 -o a.out ... main.o libfoo.so
% ./a.out
Segmentation fault
The problem is caused by the fact that we lose all information about
coalesced symbols after the `Resolver::resolve()` method is finished.
The patch solves the problem by overriding the
`LinkingContext::notifySymbolTableCoalesce()` method and saving names
of coalesced symbols. Later in the `buildDynamicSymbolTable()` routine
we use this information to export these symbols.
llvm-svn: 217363
When a file is not found, produce a proper error message. The previous error
message produced a file format error, which made me wonder for a while why
there is a file format error, but essentially the file was not found.
This fixes the problem by producing a proper error message.
llvm-svn: 217359
By default linker would not create a separate segment to hold read only data.
This option overrides that behavior by creating a separate read only segment
for read only data.
llvm-svn: 217358
Moved code used only by isDataSymbol from find to isDataSymbol member
function. Also changed the return type of isDataSymbol because
previously "if (isDataSymbol(...))" meant "if it is *not* a data symbol"
which is opposite from what you'd expect.
llvm-svn: 217285
If we are creating a PE+ executable, we need to run cvtres with
/machine:x64 instead of /machine:x86. Otherwise the resulting executable
would be invalid.
llvm-svn: 217214
Mach-O has a "fat" (or "universal") variant where the same contents built for
different architectures are concatenated into one file with a table-of-contents
header at the start. But this leaves a dilemma for the linker - which
architecture to use.
Normally, the linker command line -arch is used to force which slice of any fat
files are used. The clang compiler always passes -arch to the linker when
invoking it. But some Makefiles invoke the linker directly and don’t specify
the -arch option. For those cases, the linker scans all input files in command
line order and finds the first non-fat object file. Whatever architecture it
is becomes the architecture for the link.
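
A rough sketch of that fallback scan. The InputFile struct and its fields are hypothetical stand-ins for whatever the driver knows about each input after sniffing its header.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one command line input.
struct InputFile {
  std::string path;
  bool isObject;    // true for relocatable object files
  bool isFat;       // true for fat (universal) files
  std::string arch; // architecture of a non-fat file, e.g. "x86_64"
};

// Walks the inputs in command line order and returns the architecture of
// the first non-fat object file, or an empty string if there is none.
std::string inferArch(const std::vector<InputFile> &inputs) {
  for (const InputFile &file : inputs)
    if (file.isObject && !file.isFat)
      return file.arch;
  return std::string();
}
```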
llvm-svn: 217189
The use of default: was disabling the warning about unused enumerators. Fix
that, then fix the one enumerator that was not handled. Add coverage for
it in test suite.
llvm-svn: 217078
On Darwin at runtime, dyld will prefer to use the export trie of a dylib instead
of the traditional symbol table (which is large and requires a binary search).
This change enables the linker to generate an export trie and to prefer it if
found in a dylib being linked against. This also simplifies the yaml for dylibs
because the yaml form of the trie can be reduced to just a sequence of names.
llvm-svn: 217066
I hope this is the last fix for x64 relocations as I've wasted
a few days on this.
This caused a mysterious issue that some C++ programs crash on
startup. It was because a null pointer is passed as argv to main.
__tmainCRTStartup calls main, but before that it calls all
initialization routines between .text$xc_a and .text$xc_z.
pre_cpp_init is one of such routines, and it is the one who
initializes a heap pointer for argv for later use. That routine
was not called for some reason.
It turned out that __tmainCRTStartup was skipping a block of
code because of the relocation bug. A condition in the function
depends on a memory load, and that memory load was referring to
a wrong location. As a result a jump instruction took the
wrong branch, skipping pre_cpp_init and so on.
This patch fixes the issue. Also added more tests to fix them
once and for all.
llvm-svn: 216772
When a relocation is applied to a location, the new value needs
to be added to the existing value at the location. Existing
value is in most cases zero, but if not, the current code does
not work.
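
As an illustration, a 32-bit little-endian fixup that honors the existing value could look like this. The function names are hypothetical, and the real code handles more relocation widths and kinds.

```cpp
#include <cassert>
#include <cstdint>

// Little-endian accessors, so the sketch does not depend on host byte order.
uint32_t read32le(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}

void write32le(uint8_t *p, uint32_t v) {
  for (int i = 0; i < 4; ++i)
    p[i] = uint8_t(v >> (8 * i));
}

// Adds the new value to whatever is already stored at the location,
// instead of overwriting it.
void applyAdd32(uint8_t *loc, uint32_t value) {
  write32le(loc, read32le(loc) + value);
}

// Demonstration helper: a location holding `existing` is relocated by `value`.
uint32_t relocatedWord(uint32_t existing, uint32_t value) {
  uint8_t buf[4];
  write32le(buf, existing);
  applyAdd32(buf, value);
  return read32le(buf);
}
```

With an existing value of zero the result equals the new value, which is why the overwrite bug went unnoticed in most cases.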
llvm-svn: 216680
The Image Base field in the PE/COFF header is used as a hint for the loader.
If the loader can load the executable at the specified address, that's
fine, but if not, it has to load it at a different address.
If that happens, the loader has to fix up the addresses in the
executable by adding the offset. The list of addresses that need to
be fixed is in .reloc section.
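
The loader's side of this can be modeled roughly as below. The image is simplified to a vector of 64-bit slots, and the function name and parameters are hypothetical; the real .reloc section encodes page-relative offsets in blocks rather than slot indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// slots:      64-bit words of the loaded image (simplified model)
// relocSlots: indices of the words that hold absolute addresses
std::vector<uint64_t> applyBaseRelocations(std::vector<uint64_t> slots,
                                           const std::vector<std::size_t> &relocSlots,
                                           uint64_t preferredBase,
                                           uint64_t actualBase) {
  // Unsigned arithmetic wraps, so this also works when the image is
  // loaded below its preferred base.
  uint64_t delta = actualBase - preferredBase;
  for (std::size_t index : relocSlots) {
    assert(index < slots.size());
    slots[index] += delta;
  }
  return slots;
}
```

Words that are not listed, such as plain data, are left untouched.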
This patch is to emit x64 .reloc section contents.
llvm-svn: 216636
IMAGE_REL_AMD64_ADDR64 relocation should set 64-bit *VA* (virtual
address) instead of *RVA* (relative virtual address), so we have
to add the image base to the target's RVA.
llvm-svn: 216512
The implementation of AMD64 relocations was incomplete
and wrong. On AMD64, we of course have to use AMD64
relocations instead of i386 ones. This patch fixes the
issue.
LLD is now able to link hello64.obj (created from
hello64.asm) against user32.lib and kernel32.lib to
create a Win64 binary.
llvm-svn: 216253
Mach-O symbols can have an attribute on them that means their content should never be
dead code stripped. This translates to deadStrip() == deadStripNever.
llvm-svn: 216234
Both options control the final scope of atoms.
When -exported_symbols_list <file> is used, the file is parsed into one
symbol per line in the file. Only those symbols will be exported (global)
in the final linked image.
The -keep_private_externs option is only used with -r mode. Normally, -r
mode reduces private extern (scopeLinkageUnit) symbols to non-external. But
adding the -keep_private_externs option keeps them private external.
llvm-svn: 216146
This is the one interesting aspect from:
http://reviews.llvm.org/D4965
These hooks are useful for flavor specific processing, such as recording that
a DefinedAtom replaced a weak SharedLibraryAtom.
llvm-svn: 216122
The darwin linker has an option, heavily used by Xcode, in which, instead
of listing all input files on the command line, the input file paths are
written to a text file and the path of that text file is passed to the linker
with the -filelist option (similar to @file).
In order to make test cases for this, I generalized the -test_libresolution
option to become -test_file_usage.
llvm-svn: 215762
Darwin has a packaging mechanism for shared libraries and headers called
frameworks. A directory Foo.framework contains a shared library binary file
"Foo" and a subdirectory "Headers". Most OS frameworks are all in one
directory /System/Library/Frameworks/. As a linking convenience, the linker
option "-framework Foo" means search the framework directories specified
with -F (analogous to -L) looking for a shared library Foo.framework/Foo.
llvm-svn: 215680
In general two-level namespace means each program records exactly which dylib
each undefined (imported) symbol comes from. But, sometimes the implementor
wants to hide the implementation dylib. For instance libSystem.dylib is the base
dylib all Darwin programs must link with. A few years ago it was split up
into two dozen dylibs but all are hidden behind libSystem.dylib which re-exports
each sub-dylib. All clients still think libSystem.dylib is the implementor.
To support this, the linker must load "indirect" dylibs and not just the
"direct" dylibs specified on the command line. This is done in the
createImplicitFiles() method after all command line specified files are
loaded. Since an indirect dylib may have already been loaded as a direct dylib
(or indirectly via a previous direct dylib), the MachOLinkingContext keeps
a list of all loaded dylibs.
With this change hello world can now be linked against the real OS or SDK.
llvm-svn: 215605
Split up the CRuntimeFile into one part for output types that need an entry
point and another part for output types that use stubs.
Add file 'test/mach-o/Inputs/libSystem.yaml' for use by test cases that
use -dylib and therefore may now need the helper symbol in libSystem.dylib.
llvm-svn: 215602
Mach-o uses "two-level namespace" where each undefined symbol is associated
with a specific dylib. This means at runtime the loader (dyld) does not need to
search all loaded dylibs for that symbol but rather just the one specified.
Now that llvm-nm -m prints out that info, properly set that info, and test
it in the hello world test cases.
llvm-svn: 215598
This patch adds the initial ELF/AArch64 support to lld. Only a basic "Hello
World" app has been successfully tested for both dynamic and static compiling.
Differential Revision: http://reviews.llvm.org/D4778
Patch by Daniel Stewart <stewartd@codeaurora.org>!
llvm-svn: 215544
The tests assume the path separator is '/', but if you run
them on Windows it is '\'. As a result the tests are failing
on Windows. This should be the minimal change to make these
tests pass on the Windows platform.
Differential Revision: http://reviews.llvm.org/D4710
llvm-svn: 214990
/INCLUDE arguments passed as command line options are handled in the
same way as Unix -u. All option values are converted to an undefined
symbol and added to a dummy input file, so that the specified symbols
are resolved.
One tricky thing on Windows is that the option is also allowed to
appear in the object file's directive section. At the time when
it's being read, all (regular) command line options have already
been processed. We cannot add undefined atoms to the dummy file
anymore.
Previously, we added such /INCLUDE to a set that has already been
processed. As a result the options were ignored.
This patch fixes the issue. Now, /INCLUDE symbols in the directive
section are handled as real undefined symbol in the COFF file.
We create an undefined symbol for each /INCLUDE argument and add
it to the file being parsed.
llvm-svn: 214824
The PE/COFF spec says that SizeOfRawData field in the section
header must be a multiple of FileAlignment from the optional
header. LLD emits 512 as FileAlignment, so it must have been
a multiple of 512.
LLD did not follow that. It emitted the actual section size
without the last padding as the SizeOfRawData. Although it's
not correct as per the spec, the Windows loader doesn't seem
to actually bother to check that. Executables created by LLD
worked fine.
However, tools dealing with executable files may expect it
to be the correct value, and one instance of it is the mt.exe
tool distributed as a part of the Windows SDK.
If CMake is invoked with the "-E vs_link_exe" option, it silently
runs mt.exe to embed a resource file into the resulting file.
And mt.exe sometimes breaks an input file if its section
header does not follow the standard. That caused a mysterious
error that CMake with Ninja occasionally produces a broken
executable.
This patch fixes the section header to make mt.exe and
other tools happy.
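
The rounding involved is the usual align-up computation. A small sketch with a hypothetical function name, assuming the alignment is a power of two (512 here):

```cpp
#include <cassert>
#include <cstdint>

// Rounds the raw section size up to the next multiple of fileAlignment.
uint32_t sizeOfRawData(uint32_t sectionSize, uint32_t fileAlignment) {
  // The bit trick below is only valid for powers of two.
  assert(fileAlignment != 0 && (fileAlignment & (fileAlignment - 1)) == 0);
  return (sectionSize + fileAlignment - 1) & ~(fileAlignment - 1);
}
```

A section of 513 bytes is therefore recorded as 1024 bytes, with the padding counted in the size.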
llvm-svn: 214453
In some cases the address of a function will be materialized with a movw/movt
pair. If the function is a thumb function, the low bit needs to be set on
the movw immediate value.
llvm-svn: 214277
The -sectalign option is used to increase the alignment required for a section.
It required some reworking of how the __TEXT segment is laid out because that
segment also contains the mach_header and load commands. And the size of load
commands depend on the number of segments, sections, and dependent dylibs used.
Using this option will simplify some future test cases because the final
address of code can be pinned down, making tests of its content easier.
llvm-svn: 214268
All iOS arm processors support switching between arm and thumb mode at call sites
by using the BLX instruction (instead of BL). But the compiler does not know
the implementation mode for extern functions, so the linker must update BL/BLX
instructions to match what is actually linked together. In addition,
pointers to functions (such as vtables) must have the low bit set if the target
of the pointer is a thumb mode function.
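
The two rules can be written down as tiny predicates. This is a sketch with hypothetical names; the actual instruction rewriting (re-encoding BL as BLX and back) is not shown.

```cpp
#include <cassert>
#include <cstdint>

// A call needs the mode-switching form (BLX) exactly when caller and
// callee are implemented in different instruction sets.
bool callNeedsModeSwitch(bool callerIsThumb, bool calleeIsThumb) {
  return callerIsThumb != calleeIsThumb;
}

// A stored function pointer carries the mode in its low bit:
// set for thumb targets, left unchanged for arm targets.
uint32_t pointerToFunction(uint32_t address, bool targetIsThumb) {
  return targetIsThumb ? (address | 1u) : address;
}
```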
llvm-svn: 214140
The following expression
m[i] = m[j]
where m is a DenseMap and i != j is not safe. m[j] returns a
reference, which would be invalidated when a rehashing occurs.
If rehashing occurs to make room for m[i], m[j] becomes
invalid, and that invalid reference would be used as the RHS
value of the expression.
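
One safe way to write such a copy is to take the source value by value before touching the destination slot. The sketch below uses std::map as a stand-in so that it compiles without LLVM headers (std::map itself does not invalidate references on insertion); the helper name is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Copies m[src] into m[dst] without holding a reference into the map
// across the insertion that m[dst] may perform.
template <typename Map, typename Key>
void copyEntry(Map &m, const Key &dst, const Key &src) {
  auto value = m[src];       // copy out first
  m[dst] = std::move(value); // any rehash here can no longer hurt
}

// Demonstration helper.
int copiedValueDemo() {
  std::map<int, int> m;
  m[1] = 7;
  copyEntry(m, 2, 1);
  return m[2];
}
```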
llvm-svn: 213969
Sometimes compilers emit data into code sections (e.g. constant pools or
jump tables). These runs of data can throw off disassemblers. The solution
in mach-o is that ranges of data-in-code are encoded into a table pointed to
by the LC_DATA_IN_CODE load command.
The way the data-in-code information is encoded into lld's Atom model is that
the start and end of each data run is marked with a Reference whose offset
is the start/end of the data run. For arm, the switch back to code also marks
whether it is thumb or arm code.
llvm-svn: 213901
insertElementAt(x, END) does the identical thing as addInputElement(x),
so the only reasonable use of insertElementAt is to call it with the
other possible argument, BEGIN. That means the second parameter of the
function is just redundant. This patch is to remove the second
parameter and rename the function accordingly.
llvm-svn: 213821
The entry point file needs to be processed after all other
object files and before all .lib files. It was processed
after .lib files. That caused an issue that the entry point
function was not resolved from the standard library files.
llvm-svn: 213804
On Windows there are four "main" functions -- main, wmain, WinMain,
or wWinMain. Their parameter types are different. The standard
library provides four different entry functions (i.e.
{w,}{WinMain,main}CRTStartup) for them. You need to use the right
entry routine for your "main" function.
If you give an /entry option, the specified name is used
unconditionally.
Otherwise, the linker needs to select the right one based on
user-supplied entry point function. This can be done after the
linker reads all the input files.
This patch moves the code to determine the entry point function
from the driver to a virtual input file. It also implements the
correct logic for the entry point function selection.
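
The selection can be sketched as a table lookup. The function name is hypothetical, and two details are assumptions of this sketch rather than statements from the patch: the order in which the four names are tried, and the use of undecorated symbol names.

```cpp
#include <cassert>
#include <set>
#include <string>

// explicitEntry:  value of /entry, or empty if the option was not given
// definedSymbols: symbols defined by the input files read so far
std::string selectEntryPoint(const std::string &explicitEntry,
                             const std::set<std::string> &definedSymbols) {
  if (!explicitEntry.empty())
    return explicitEntry; // /entry is used unconditionally
  static const char *const table[][2] = {
      {"main", "mainCRTStartup"},
      {"wmain", "wmainCRTStartup"},
      {"WinMain", "WinMainCRTStartup"},
      {"wWinMain", "wWinMainCRTStartup"}};
  for (const auto &row : table)
    if (definedSymbols.count(row[0]))
      return row[1];
  return std::string(); // no known "main" function found
}
```

Because the lookup needs the set of defined symbols, it can only run after the input files have been read, which is the reason for moving it out of the driver.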
llvm-svn: 213713
This patch just supports marking ranges that are thumb code (vs arm code).
Future patches will mark data and jump table ranges. The ranges are encoded
as References with offsetInAtom being the start of the range and the target
being the same atom.
llvm-svn: 213712
This is a part of a larger change to move the entry point
processing to a later pass than the driver. On Windows the default
entry point function varies depending on user-provided functions.
That means the driver is not able to correctly know the entry point
function name. Only passes after the core linker can infer it.
llvm-svn: 213697
Over time the symbols and relocations have changed for dwarf unwind info
in the __eh_frame section. Add test cases for older and new style.
llvm-svn: 213585
Add support for adding section relocations in -r mode. Enhance the test
cases which validate the parsing of .o files to also round trip. They now
write out the .o file and then parse that, verifying all relocations survived
the round trip.
llvm-svn: 213333
The code to manage resolvable symbols is now separated from
ExportedSymbolRenameFile so that other classes can reuse it.
I'm planning to use it to find the entry function symbol
based on resolvable symbols.
llvm-svn: 213322
All architecture specific handling is now done in the appropriate
ArchHandler subclass.
The StubsPass and GOTPass have been simplified. All architecture specific
variations in stubs are now encoded in a table which is vended by the
current ArchHandler.
llvm-svn: 213187
There are two forms of `-l` prefixed expression:
* -l<libname>
* -l:<filename>
In the first case a linker should construct a full library name
`lib + libname + .[so|a]` and search this library as usual. In the second case
a linker should use the `<filename>` as is and search this file through library
search directories.
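
A short sketch of how the two forms expand into file names to look for. The function name is hypothetical, the argument is the text following `-l`, and trying the `.so` name before the `.a` name is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the file names to search for in each library search directory.
std::vector<std::string> libraryCandidates(const std::string &value) {
  if (!value.empty() && value[0] == ':')
    return {value.substr(1)};                           // -l:<filename>, used as is
  return {"lib" + value + ".so", "lib" + value + ".a"}; // -l<libname>
}
```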
The patch reviewed by Shankar Easwaran.
llvm-svn: 213077
Previously we invoked cvtres.exe for each compiled Windows
resource file. The generated files were then concatenated
and embedded to the executable.
That was not the correct way to merge compiled Windows
resource files. If you just concatenate generated files,
only the first file would be recognized and the rest would
be ignored as trailing garbage.
The right way to merge them is to call cvtres.exe with
multiple input files. In this patch we do that in the
Windows driver.
llvm-svn: 212763
These behave slightly idiosyncratically in the best of cases, and have
additional hacks layered on top of that for compatibility with badly behaved
build systems (via ld64).
For -lXYZ:
+ If XYZ is actually XY.o then search all library paths for XY.o
+ Otherwise search all library paths, first for libXYZ.dylib, then libXYZ.a
+ By default the library paths are /usr/lib and /usr/local/lib in that order.
For -syslibroot:
+ -syslibroot options apply to absolute paths in the search order.
+ All -syslibroot prefixes that exist are added to the search path *instead*
of the original.
+ If no -syslibroot prefixed path exists, the original is kept.
+ Hacks^WExceptions:
+ If only 1 -syslibroot is given and doesn't contain /usr/lib or
/usr/local/lib, that path is dropped entirely. (rdar://problem/6438270).
+ If the last -syslibroot is "/", all of them are ignored entirely.
(rdar://problem/5829579).
At least, that's my best interpretation of what ld64 does in buildSearchPaths.
llvm-svn: 212706
Previously the alignment of the .bss section was not
properly set because of a bug in AtomizeDefinedSymbolsInSection.
We set the alignment of the section at the end of the function,
but we use an early return for the .bss section, so the code had
been skipped.
llvm-svn: 212571
This converts the very complicated mach-o arm
relocations into the simple Reference Kinds in lld.
The next patch will use the internal Reference kinds
to fix up arm/thumb code.
llvm-svn: 212306
When trying to map atom types to sections, we were iterating through an array
until we hit a sentinel value. There's no need for such dances when range-based
for loops are available.
llvm-svn: 212035
This isn't really the right place to put them in final object files (that would
be __TEXT,__unwind_info), but the format is different between relocatable and
final objects, which means we really need a pass to handle the translation.
For now, re-emitting in __LD,__compact_unwind is harmless (dyld ignores it and
moves straight on to inspecting __TEXT,__eh_frame), and sidesteps an assertion
failure when processing files containing compact-unwind info.
llvm-svn: 212032
Segments must occupy a multiple of the page size in memory (4096 currently). We
check for this when emitting files, but the placement algorithm broke down for
the second non-__TEXT segment encountered: the offset wasn't aligned up to 4096
before starting its layout.
llvm-svn: 212031
Because of how we were calculating fileOffset and fileSize for segments, most
ended up at a single offset in a finalised MachO file. This meant the data
often didn't even get written in the final object, let alone where it would be
useful.
llvm-svn: 212030
This is first step in reworking how mach-o relocations are processed.
The existing KindHandler is going to become a delegate/helper object for
processing architecture specific references. The KindHandler knows how
to convert mach-o relocations into References and back, as well as fixing
up the content the relocation is on.
One of the messy things about mach-o relocations is that they sometimes
come in pairs, but the pairs still convert to one lld::Reference. So, the
conversion has to detect pairs (arch specific) and change the stride.
llvm-svn: 211921
The previous function returned true for "s < s", which could completely mess up
the sorting of symbols within a section.
Unfortunately, I don't think there's a robust way to write a test for this.
Anything I come up with will be making assumptions about the particular
implementation of std::sort.
llvm-svn: 211704
When looking through sections with zero-terminated string-literals (__cstring
or __ustring) we were constantly rechecking the first few bytes of the string
for '\0' rather than advancing along. This obviously failed unless all strings
within the section had the same length as that first one.
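
The intended scan, sketched with a hypothetical function over the raw section bytes. The point is that each search for '\0' starts at the current offset, and the offset then advances past the terminator.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits the content of a string-literal section into its individual
// zero-terminated strings.
std::vector<std::string> splitCStrings(const std::string &section) {
  std::vector<std::string> strings;
  std::string::size_type offset = 0;
  while (offset < section.size()) {
    // Search from the current offset, not from the start of the section.
    std::string::size_type end = section.find('\0', offset);
    if (end == std::string::npos)
      end = section.size(); // tolerate a missing final terminator
    strings.emplace_back(section, offset, end - offset);
    offset = end + 1;       // step over the terminator
  }
  return strings;
}
```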
llvm-svn: 211682
We were trying to examine the first symbol in a section that we wanted to
atomize by symbols, even when there wasn't one. Instead, we should make the
initial anonymous symbol cover the entire section in that situation.
llvm-svn: 211681
dynamic symbol table populating and DT_NEEDED tag creation.
The `isDynSymEntryRequired` function returns true if the specified shared
library atom requires a dynamic symbol table entry. The `isNeededTagRequired`
function returns true if we need to create a DT_NEEDED tag for the shared
library defining the specified shared library atom.
By default both functions return true, so there are no functional changes
for any target except MIPS. We probably need to apply the same modifications
to other ELF targets, but I want to implement and fully test the complete set of
changes for the MIPS target first.
For MIPS we create a dynamic symbol table entry for a shared library atom iff
this atom is referenced by a regular defined atom. For example, if library L1
defines symbol T1, library L2 defines symbol T2 and uses symbol T1
and executable file E1 uses symbol T2 but does not use symbol T1 we create
an entry in the E1 dynamic symbol table for symbol T2 and do not create
an entry for T1.
The patch creates DT_NEEDED tags for shared libraries that contain shared library
atoms which a) are referenced by regular defined atoms; b) have corresponding
copy dynamic relocations (R_MIPS_COPY).
Currently the patch does not take into account the --as-needed / --no-as-needed command
line options. So it is too restrictive and creates DT_NEEDED tags for really
needed shared libraries only. I plan to fix that in subsequent patches.
llvm-svn: 211674
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
I added a new reference type, kindAssociate. If a target atom is
coalesced away, the referring atom is removed by Resolver, so that
they are treated as a group.
Differential Revision: http://reviews.llvm.org/D4028
llvm-svn: 211106
This code was never being used and any use of it would look fairly strange.
For example, it would try to map a NativeReaderError::file_malformed to
std::errc::invalid_argument.
llvm-svn: 210913
The previous commit uncovered a bug in the mach-o writer whereby two __text
sections were created. But the test case did not catch that. So I updated
the test case to run the linker a second time, reading the output of the
first pass.
llvm-svn: 210624
isCoalescedAway(x) is faster than replacement(x) != x as the former
does not follow the replacement atom chain. Also it's easier to use.
llvm-svn: 210242
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
Implementing such feature is easy. We can add a reference from a
target atom to an original atom, so that if the target is linked,
the original atom is also linked. If not linked, both will be
dead-stripped. So they are treated as a group.
I added a new reference type, kindAssociate. It does nothing except
preventing referenced atoms from being dead-stripped.
No change to the Resolver is needed.
Reviewers: Bigcheese, shankarke, atanasyan
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3946
llvm-svn: 210240
This provides support for the autoconfing & make build style.
The format, style and implementation follows that used within the llvm and clang projects.
TODO: implement out-of-source documentation builds.
llvm-svn: 210177
Previously FileArchive ctor comment said that only its subclasses
can be instantiated, but the ctor is actually public and is
instantiated by ArchiveReader.
Remove the wrong comment and reorder the member functions so that
public members appear before private ones.
llvm-svn: 210175
In sections that are broken into atoms at symbols, if the first symbol in the
section is not at the start of the section, then make an anonymous atom for
the section content that is before the first symbol.
llvm-svn: 210142
Previously each section kind had its own code to loop over the section and
parse it into atoms. This refactoring has two tables. The first maps sections
to ContentType. The second maps ContentType to information on how to find
the atom boundaries.
A few bugs in test cases were discovered as part of the refactoring.
No change in functionality intended.
llvm-svn: 210138
FileToMutable is what this class does, but this class (or, to be precise,
an instance of this class) is a wrapper of the other SimpleFile. It's odd
that the class was named like a function.
llvm-svn: 210089
Previously section groups were doubly linked to their children.
That is, an atom representing a group has group-child references
to its group contents, and content atoms also have group-parent
references to the group atom. That relationship was invariant;
if X has a group-child edge to Y, Y must have a group-parent
edge to X.
However we were not using group-parent references at all. The
resolver only needs group-child edges.
This patch simplifies the section group by removing the unused
reverse edge. No functionality change intended.
Differential Revision: http://reviews.llvm.org/D3945
llvm-svn: 210066
Layout-before edges are no longer used for layout, but they are
still there for dead-stripping. If we would just remove them
from code, LLD would wrongly remove live atoms that were
referenced by layout-befores.
This patch fixes the issue. Before dead-stripping, it scans all
atoms to construct a reverse map for layout-after edges. Dead-
stripping pass uses the map to traverse the graph.
Differential Revision: http://reviews.llvm.org/D3986
llvm-svn: 210057
Reference::target() never returns a nullptr, so NULL check
is not needed and is more harmful than doing nothing.
No functionality change.
llvm-svn: 210008
Arrange .ctors/.dtors sections in the following order:
.ctors from crtbegin.o or crtbegin?.o
.ctors from regular object files
.ctors.* (sorted) from regular object files
.ctors from crtend.o or crtend?.o
This order is specific to the MIPS target. For example, on X86
the .ctors.* sections are merged into the .init_array section.
llvm-svn: 209987
The main problem is in the predicate passed to the `std::stable_sort()`.
This predicate always returns false if **both** section's names do not
start with `.init_array` or `.fini_array` prefixes. In short, it does not
define a strict weak ordering. Suppose we have the following sections:
.A .init_array.1 .init_array.2
The predicate states that:
not .init_array.1 < .A
not .A < .init_array.2
but .init_array.1 < .init_array.2 !!!
The second problem is that `.init_array` section without number should
go last in the list. Now it has the lowest priority '0' and goes first.
The patch fixes both of the problems.
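
One way to obtain a valid ordering is to map every section name to an integer key and compare only the keys; a comparison induced by a single key is always a strict weak ordering. The sketch below is not the code from the patch. The names are hypothetical, and placing sections without a numeric suffix (including unrelated sections) in the last slot is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Larger than any 16-bit priority, so names mapped to it sort last.
const uint32_t kLastPriority = 65536;

// ".init_array.5" -> 5; ".init_array" and anything else -> kLastPriority.
uint32_t arrayPriority(const std::string &name) {
  const std::string prefixes[] = {".init_array", ".fini_array"};
  for (const std::string &prefix : prefixes) {
    if (name.compare(0, prefix.size(), prefix) != 0)
      continue;
    // Expect "<prefix>.<digits>"; anything else gets the last slot.
    if (name.size() < prefix.size() + 2 || name[prefix.size()] != '.')
      return kLastPriority;
    uint32_t value = 0;
    for (std::string::size_type i = prefix.size() + 1; i < name.size(); ++i) {
      if (name[i] < '0' || name[i] > '9')
        return kLastPriority;
      value = value * 10 + uint32_t(name[i] - '0');
      if (value >= kLastPriority)
        return kLastPriority;
    }
    return value;
  }
  return kLastPriority;
}

bool arrayLess(const std::string &a, const std::string &b) {
  return arrayPriority(a) < arrayPriority(b);
}

std::vector<std::string> sortArraySections(std::vector<std::string> names) {
  std::stable_sort(names.begin(), names.end(), arrayLess);
  return names;
}
```

Since std::stable_sort keeps equivalent elements in their input order, sections that share a key are not reshuffled.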
llvm-svn: 209875
This is a short-term fix to allow lld Readers to return error messages
with dynamic content.
The long term fix will be to enhance ErrorOr<> to work with errors other
than error_code. Or to change the interface to Readers to pass down a
diagnostics object through which all error messages are written.
llvm-svn: 209681
/alternatename is a command line option to define a weak alias. You
can use it as /alternatename:foo=bar to define "foo" as a weak alias
for "bar".
Because it's a command line option, the weak alias mapping is in the
LinkingContext object, and not in an object file being read.
Previously, we looked up the mapping each time we read a new symbol
from a file, to check if there is a weak alias defined for the symbol.
That's not wrong, but it made the function signatures a bit complicated --
we had to pass the mapping object to many functions. Now their
parameter lists are much cleaner.
This also has another (unrealized) benefit. parseFile() now reads a
file and then adds alias symbols to the file. In the first pass a
LinkingContext object is not used at all. That should make it easy
to read files from archive files speculatively, as the first pass
is free from side effects.
llvm-svn: 209486
Alias symbols are SimpleDefinedAtoms and are platform neutral. They
don't have to belong to ELF. This patch makes them available to all
platforms. No functionality change intended.
Differential Revision: http://reviews.llvm.org/D3862
llvm-svn: 209475
addResolvableSymbols() queues input files, and readAllSymbols() reads
from them. In practice it's currently safe because they are called from
a single thread. But it's not guaranteed.
Also, both functions need to acquire the same mutex so as not to see the
inconsistent memory contents that the C++ memory model allows.
llvm-svn: 209254
ExportedSymbolRenameFile is not always used. In most cases we don't
need to read the given files at all. So lazy loading helps. This doesn't
change the meaning of the program.
llvm-svn: 208818
In r205566, I made a change to Resolver so that it revisits
only archive files in a --start-group and --end-group pair. That's
not correct, as it also has to revisit DSO files.
This patch is to fix the issue.
Added a test to demonstrate the fix. I confirmed that it succeeded
before r205566, failed after r205566, and is ok with this patch.
Differential Revision: http://reviews.llvm.org/D3734
llvm-svn: 208797
As written in the comment in this patch, symbol names specified with
the /export option are resolved in a special way; for /export:foo, the
linker finds a foo@<number> symbol if such a symbol exists.
On Windows, a function in stdcall calling convention is mangled with
a leading underscore and following "@" and numbers. This name
mangling is kind of automatic, so you can sometimes omit _ and @number
when specifying a symbol. The /export option is one such case.
Previously, if a file in an archive file foo.lib provides a symbol
_fn@8, and /export:fn is specified, LLD failed to resolve the symbol.
It only tried to find _fn, and failed to find _fn@8. With this patch,
_fn@8 will be searched on the second iteration.
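The name matching described above can be sketched as follows. This is an illustration only: isStdcallVariant and resolveExport are hypothetical helpers, and the sketch searches a flat list of defined names rather than driving the resolver through a second iteration as the patch does.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// True if `sym` is `name` with stdcall decoration, i.e. "_" + name + "@"
// followed by one or more decimal digits.
bool isStdcallVariant(const std::string &sym, const std::string &name) {
  const std::string prefix = "_" + name + "@";
  if (sym.size() <= prefix.size() ||
      sym.compare(0, prefix.size(), prefix) != 0)
    return false;
  for (std::size_t i = prefix.size(); i < sym.size(); ++i)
    if (!std::isdigit(static_cast<unsigned char>(sym[i])))
      return false;
  return true;
}

// Look for the plain decorated name first, then fall back to the
// stdcall-decorated form. Returns an empty string if neither exists.
std::string resolveExport(const std::vector<std::string> &defined,
                          const std::string &name) {
  for (const std::string &sym : defined)
    if (sym == "_" + name)
      return sym;
  for (const std::string &sym : defined)
    if (isStdcallVariant(sym, name))
      return sym;
  return "";
}
```

For /export:fn with defined symbols `_fnord@4` and `_fn@8`, the sketch picks `_fn@8` and does not confuse it with `_fnord@4`.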
Differential Revision: http://reviews.llvm.org/D3736
llvm-svn: 208754
Make it possible to add observers to an Input Graph, so that files
returned from an Input Graph can be examined before they are
passed to Resolver.
To implement some PE/COFF features we need to know all the symbols
that *can* be resolved, including ones in archive files that have not
been read yet.
Currently, Resolver only maintains a set of symbols that are
already read. It has no knowledge of symbols in skipped files in
an archive file.
There are many ways to implement that. I chose to apply the
observer pattern here because it seems the least intrusive. We don't
want to mess up Resolver with architecture-specific features.
Even in PE/COFF, the feature that needs this mechanism is minor.
So I chose not to modify Resolver, but add a hook to Input Graph.
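A minimal sketch of such a hook is shown below. ObservableInputGraph, addObserver and nextFile are hypothetical names chosen for the example and are not lld's actual InputGraph interface; files are represented by their paths only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical input graph that lets callers observe every file before
// it is handed to the consumer (the Resolver, in the text above).
class ObservableInputGraph {
public:
  using Observer = std::function<void(const std::string &)>;

  void addFile(std::string path) { files_.push_back(std::move(path)); }
  void addObserver(Observer fn) { observers_.push_back(std::move(fn)); }

  // Returns the next file, or nullptr when the graph is exhausted.
  // Observers run first, so they see the file before the consumer does.
  const std::string *nextFile() {
    if (next_ == files_.size())
      return nullptr;
    const std::string &file = files_[next_++];
    for (const Observer &fn : observers_)
      fn(file);
    return &file;
  }

private:
  std::vector<std::string> files_;
  std::vector<Observer> observers_;
  std::size_t next_ = 0;
};
```

The consumer keeps calling nextFile() as before and needs no knowledge of who is observing, which is the point of putting the hook in the graph rather than in the resolver.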
Differential Revision: http://reviews.llvm.org/D3735
llvm-svn: 208753
If one or more dynamic relocations might modify a read-only section,
the dynamic table should contain the DT_TEXTREL tag.
The patch introduces a new `RelocationTable::canModifyReadonlySection()`
method. This method checks through the relocations to see if any modifies
a read-only section. The DynamicTable class calls this method and emits
the DT_TEXTREL tag if necessary.
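The check can be sketched roughly as below. Section, DynamicRelocation and dynamicTags are simplified, hypothetical stand-ins and not lld's classes; only the method name and the DT_TEXTREL tag come from the description above, and the tag value 22 is the one defined by the ELF gABI.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::int64_t kDtTextrel = 22; // DT_TEXTREL in the ELF gABI

// Hypothetical, simplified model of an output section.
struct Section {
  bool writable;
};

// Hypothetical dynamic relocation that patches a location in `target`.
struct DynamicRelocation {
  const Section *target;
};

// True if at least one relocation patches a section that is not writable.
bool canModifyReadonlySection(const std::vector<DynamicRelocation> &relocs) {
  for (const DynamicRelocation &r : relocs)
    if (!r.target->writable)
      return true;
  return false;
}

// Emit DT_TEXTREL only when it is actually needed.
std::vector<std::int64_t>
dynamicTags(const std::vector<DynamicRelocation> &relocs) {
  std::vector<std::int64_t> tags;
  if (canModifyReadonlySection(relocs))
    tags.push_back(kDtTextrel);
  return tags;
}
```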
The patch was reviewed by Rui Ueyama and Shankar Easwaran.
llvm-svn: 208670
We did not actively try to resolve dllexported symbols specified
by /export or by a module definition file. If exported symbols
were resolved for other reasons, for example because other symbols
referred to them, that was fine. But if (unreferenced) exported symbols
were in an archive file, and nothing referred to that file in the archive,
they remained unresolved.
That would obviously cause the issue that dllexported symbols are
not in a resultant DLL.
In this patch, we create an undefined symbol for each dllexported
symbol, to let the core linker resolve it.
llvm-svn: 208452
Previously the handling of an exported symbol was wrong if it was
specified in a module definition file in the form of
<externalname>=<internalname>. Export the correct symbol.
llvm-svn: 208446
Previously only the GNU driver called InputGraph::normalize, but its
functionality is not and should not be limited to GNU ld. Other
drivers should be able to use it.
Currently only linker scripts use the feature, so this change
won't change the existing behavior.
llvm-svn: 208266
Previously only the toplevel elements were expanded by expandElements().
Now we recursively call getReplacements() to expand input elements even
if they are nested in, say, a group.
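The recursion can be sketched as follows. InputElement here is a hypothetical, simplified structure (a name, optional group members, optional replacements) and is not lld's InputElement class; only the function name expandElements is taken from the text above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical input element. A group has members; an element such as a
// linker script has replacements that stand in for it after expansion.
struct InputElement {
  std::string name;
  std::vector<InputElement> groupMembers;
  std::vector<InputElement> replacements;
};

// Expand replacements at every nesting level, not only at the top level.
std::vector<InputElement>
expandElements(const std::vector<InputElement> &elements) {
  std::vector<InputElement> result;
  for (const InputElement &e : elements) {
    if (!e.replacements.empty()) {
      // Replacements may themselves need expanding.
      std::vector<InputElement> expanded = expandElements(e.replacements);
      result.insert(result.end(), expanded.begin(), expanded.end());
      continue;
    }
    InputElement copy = e;
    copy.groupMembers = expandElements(e.groupMembers); // recurse into groups
    result.push_back(copy);
  }
  return result;
}
```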
llvm-svn: 208144
isAlias always returns false and no one is using it. It was
originally added to Atom to query if an atom is an alias for another
atom, assuming that alias atoms are different from normal atoms.
We now support atom aliasing, but it is implemented differently
from what isAlias assumed. An alias atom is
just a regular defined atom with no content, and it has a layout-
before edge to the alias-to atom so that they are laid out at the
same location in the result. So this is dead code, and it doesn't
make much sense to keep it.
llvm-svn: 207884
An export definition in a module definition file looks as follows:
exportedname[=internalname] [@ordinal [NONAME]] [PRIVATE] [DATA]
Previously we did not support =internalname, so users couldn't export
symbols from a DLL with a different name.
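A minimal sketch of parsing one such definition is shown below, for illustration only. ExportDesc and parseExportLine are hypothetical names and this is not the parser added by the patch; the sketch assumes whitespace-separated tokens and does not enforce that NONAME follows an ordinal.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical description of one EXPORTS entry.
struct ExportDesc {
  std::string name;         // exported (external) name
  std::string internalName; // name inside the DLL; defaults to `name`
  int ordinal = -1;         // -1 means no ordinal was given
  bool noname = false;
  bool isPrivate = false;
  bool isData = false;
};

// Parses "exportedname[=internalname] [@ordinal [NONAME]] [PRIVATE] [DATA]".
// Returns false on an empty line, a malformed ordinal or an unknown token.
bool parseExportLine(const std::string &line, ExportDesc &out) {
  out = ExportDesc();
  std::istringstream in(line);
  std::string tok;
  if (!(in >> tok))
    return false;
  const std::string::size_type eq = tok.find('=');
  out.name = tok.substr(0, eq);
  out.internalName = eq == std::string::npos ? out.name : tok.substr(eq + 1);
  while (in >> tok) {
    if (tok[0] == '@') {
      char *end = nullptr;
      const long n = std::strtol(tok.c_str() + 1, &end, 10);
      if (end == tok.c_str() + 1 || *end != '\0')
        return false; // "@" without a valid number
      out.ordinal = static_cast<int>(n);
    } else if (tok == "NONAME") {
      out.noname = true;
    } else if (tok == "PRIVATE") {
      out.isPrivate = true;
    } else if (tok == "DATA") {
      out.isData = true;
    } else {
      return false;
    }
  }
  return true;
}
```

For example, `foo=bar @3 NONAME DATA` yields the exported name foo, the internal name bar and ordinal 3, while a bare `foo` uses foo for both names.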
llvm-svn: 207827
In general the linker script's GROUP command works like a pair
of command line options --start-group/--end-group. But there is
a difference in the file lookup algorithm.
The --start-group/--end-group commands use a trivial approach:
a) If the path has '-l' prefix, add 'lib' prefix and '.a'/'.so'
suffix and search the path through library search directories.
b) Otherwise, use the path 'as-is'.
The GROUP command implements a more complicated approach:
a) If the path has '-l' prefix, add 'lib' prefix and '.a'/'.so'
suffix and search the path through library search directories.
b) If the path does not have '-l' prefix, and sysroot is configured,
and the path starts with the / character, and the script being
processed is located inside the sysroot, search the path under
the sysroot. Otherwise, try to open the path in the current
directory. If it is not found, search through library search
directories.
https://www.sourceware.org/binutils/docs-2.24/ld/File-Commands.html
The patch was reviewed by Shankar Easwaran, Rui Ueyama.
llvm-svn: 207769
When creating a .lib file, we should strip the leading underscore,
but should not strip the stdcall at-sign suffix. Otherwise the produced
.lib files cannot be linked.
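In code, the rule amounts to something like the sketch below. importLibraryName is a hypothetical helper written for this note, not the function changed by the patch.

```cpp
#include <cassert>
#include <string>

// Strip one leading underscore but keep any "@<number>" stdcall suffix.
std::string importLibraryName(const std::string &symbol) {
  if (!symbol.empty() && symbol[0] == '_')
    return symbol.substr(1);
  return symbol;
}
```

So `_fn@8` becomes `fn@8`, not `fn`.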
llvm-svn: 207729
Previously the input file for the lib.exe command would be removed
as soon as the command exits, so we couldn't write a test to check
the file contents are correct.
This patch adds /lldmoduledeffile: option to retain a copy of the
temporary file at the given file path, so that you can see the file
if you want.
llvm-svn: 207727
element is a FileNode, request error description. If the element is Group,
print a hard-coded error message. We need to implement better diagnostics
here, but even the current solution is better than a segmentation fault.
llvm-svn: 207691
Linker should create _imp_ symbols for local use only when such
symbols cannot be resolved in any other way. If it overrides real
imported symbols, such symbols remain virtually unresolved without
error, causing odd issues. I observed that a program linked with
LLD entered an infinite loop before reaching main() because of
this issue.
This patch moves the virtual file creating _imp_ symbols to the
very end of the input file list. Previously, the file was at the end
of the library file group. The linker might revisit the group many times,
so it was not really at the end of the input file list.
llvm-svn: 207605
1. Re-implement the emission of PLT entries and dynamic relocations to
keep the PLT and the relocation table in a consistent state.
2. Initialize the st_value and st_other fields of a dynamic symbol table
entry if the entry corresponds to an external function whose address is
taken in a non-PIC executable. In that case the st_value field holds the
address of the function's PLT entry. Also set the STO_MIPS_PLT bit in the
st_other field.
llvm-svn: 207494
The implicit symbols for local use implemented in r207141 were not fully
compatible with MSVC link.exe. In r207141, I implemented the feature
in such a way that implicit symbols are defined only when they are
exported with the /EXPORT option.
After that I found that implicit symbols are defined not only for
dllexported symbols but for all defined symbols. Actually _imp_
implicit symbols have no relationship with the dllexport feature. You
could add _imp_ to any symbol to get a pointer to the symbol, whether
the symbol is dllexported or not. It looks pretty weird to me but
that's what we want if link.exe behaves that way.
Here is a bit about the implementation: Creating all implicit symbols
beforehand is going to be a huge waste of resources. This feature is
rarely used, and MSVC link.exe even prints out a warning message when
it finds this feature is being used. So we create implicit symbols
on demand. There is an archive file that creates implicit symbols when
they are needed.
llvm-svn: 207476
I'm a bit surprised that I have not implemented this yet. This is
definitely needed to handle real-world module definition files.
This patch contains a unit test for r207294.
llvm-svn: 207297
I'm fixing another bug in the parser, and I wanted to submit this
fix as a separate change as it's logically independent from the other.
I'll add a test for this shortly.
llvm-svn: 207294
This patch is to fix a compatibility issue with MSVC link.exe as to
use of dllexported symbols inside DLL.
A DLL exports two symbols for a function. One is non-decorated one,
and the other is with __imp_ prefix. The former is a function that
you can directly call, and the latter is a pointer to the function.
These dllexported symbols are created by linker for programs that
link against the DLL. So, I naturally believed that __imp_ symbols
become available once you create a DLL and link against it, and that
they don't exist until then. That turned out not to be true.
MSVC link.exe is smart enough to allow users to use __imp_ symbols
locally. That is, if a symbol is specified with /export option, it
implicitly creates a new symbol with __imp_ prefix as a pointer to
the exported symbol. This feature allows the following program to
be linked and run, although _imp__hello is not defined in this code.
#include <stdio.h>
__declspec(dllexport)
void hello(void) { printf("Hello\n"); }
extern void (*_imp__hello)(void);
int main() {
  _imp__hello();
  return 0;
}
MSVC link.exe prints out the following warning when linking it.
LNK4217: locally defined symbol _hello imported in function _main
Using __imp_ symbols locally is, I think, not good coding style. One
should just take an address using the "&" operator rather than prepending
the __imp_ prefix. However, there are programs in the wild that depend
on this link.exe behavior, so we need this feature.
llvm-svn: 207141
Not all symbols are decorated with an underscore in x86. You can
write undecorated symbols in assembly, for example. Thus this
assertion is too strong.
llvm-svn: 207125
We don't use sections with IMAGE_SYM_DEBUG attribute so we basically
want to skip the symbols for them when reading the symbol table. When we skip
them, we need to skip auxiliary symbols too. Otherwise weird error
would happen because aux symbols would be interpreted as regular ones.
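The skipping logic can be sketched as below. SymbolRecord and readRegularSymbols are hypothetical, simplified stand-ins for the COFF symbol table reader; the value -2 for IMAGE_SYM_DEBUG and the rule that auxiliary records directly follow the symbol that owns them come from the PE/COFF format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr int kImageSymDebug = -2; // IMAGE_SYM_DEBUG section number

// Hypothetical, simplified symbol table record. In a real COFF file the
// auxiliary records that follow a symbol have a different layout, so
// reading them as ordinary symbols yields garbage.
struct SymbolRecord {
  std::string name;
  int sectionNumber;
  unsigned numberOfAuxSymbols;
};

// Collect the names of the symbols we care about. The index always
// advances past a symbol's auxiliary records, whether the symbol itself
// is kept or skipped.
std::vector<std::string>
readRegularSymbols(const std::vector<SymbolRecord> &table) {
  std::vector<std::string> names;
  std::size_t i = 0;
  while (i < table.size()) {
    const SymbolRecord &sym = table[i];
    if (sym.sectionNumber != kImageSymDebug)
      names.push_back(sym.name);
    i += 1 + sym.numberOfAuxSymbols;
  }
  return names;
}
```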
llvm-svn: 206931