lld was regenerating LC_DATA_IN_CODE in .o output files, but not into
final linked images.
Update test case to verify data-in-code info makes it into final linked images.
llvm-svn: 220827
Objective-C switched to a new ABI which uses a different mangling for class
names. But to keep projects building that use export lists that use the old
class name mangling, the linker recognizes the old names and transforms them
to the new mangling.
llvm-svn: 220598
In final linked shared images, the __TEXT segment contains both code and
the mach-o header/load-commands. In the case of a data-only dylib, there is
no code, so we need to force the addition of the __TEXT segment.
llvm-svn: 220597
This is a follow-up patch for r220333. r220333 renames exported symbols.
That raised another issue; if we have both decorated and undecorated names
for the same symbol, we'll end up have two duplicate exported symbol
entries.
This is a fix for that issue by removing duplciate entries.
llvm-svn: 220350
All compiler generated mach-o object files are marked with MH_SUBSECTIONS_VIA_SYMBOLS.
But hand written assembly files need to opt-in if they are written correctly.
The flag means the linker can break up a sections at symbol addresses and
dead strip or re-order functions.
This change recognizes object files without the flag and marks its atoms as
not dead strippable and adds a layout-after chain of references so that the
atoms cannot be re-ordered.
llvm-svn: 220348
There are two ways to specify a symbol to be exported in the module
definition file.
1) EXPORT <external name> = <symbol>
2) EXPORT <symbol>
In (1), you give both external name and internal name. In that case,
the linker tries to find a symbol using the internal name, and write
that address to the export table with the external name. Thus, from
the outer world, the symbol seems to be exported as the external name.
In (2), internal name is basically the same as the external name
with an exception: if you give an undecorated symbol to the EXPORT
directive, and if the linker finds a decorated symbol, the external
name for the symbol will become the decorated symbol.
LLD didn't implement that exception correctly. This patch fixes that.
llvm-svn: 220333
The darwin linker operates differently than the gnu linker with respect to
libraries. The darwin linker first links in all object files from the command
line, then to resolve any remaining undefines, it repeatedly iterates over
libraries on the command line until either all undefines are resolved or no
undefines were resolved in the last pass.
When Shankar made the InputGraph model, the plan for darwin was for the darwin
driver to place all libraries in a group at the end of the InputGraph. Thus
making the darwin model a subset of the gnu model. But it turns out that does
not work because the driver cannot tell if a file is an object or library until
it has been loaded, which happens later.
This solution is to subclass InputGraph for darwin and just iterate the graph
the way darwin linker needs.
llvm-svn: 220330
HAVE_CXXABI_H is not defined on FreeBSD but the system actually
has the header. CMake test fails because the header depends on size_t.
llvm-svn: 220315
The canParse function for all the ELF subtargets check if the input files match
the subtarget.
There were few mismatches in the input files that didnt match the subtarget for
which the link was being invoked, which also acts as a test for this change.
llvm-svn: 220182
For PC relative accesses, negative addends were to be ignored. The linker was
not ignoring it and would fail with an assert. This fixes the issue and is able
to get Helloworld working.
llvm-svn: 220179
This fixes the way archive members are displayed when the linker is used with a
flag to show all the files that it processes.
When an archive file member is read, we need to show the archive filename and
the member.
llvm-svn: 220144
This would permit the ELF reader to check the architecture that is being
selected by the linking process.
This patch also sorts the include files according to LLVM conventions.
llvm-svn: 220129
-all_load tells the darwin linker to immediately load all members of all
archives. The code do that used reinterpret_cast<> instead of dyn_cast<>.
If the file was a dylib, the reinterpret_cast<> turned a pointer to a dylib
into a pointer to an archive...boom.
Added test case to reproduce the crash, simplified the code and used dyn_cast<>.
llvm-svn: 219990
To deal with cycles in shared library dependencies, the darwin linker supports
marking specific link dependencies as "upward". An upward link is when a
lower level library links against a higher level library.
llvm-svn: 219949
This patch creates the import address table and sets its
address to the delay-load import table. This also creates
wrapper functions for __delayLoadHelper2.
x86 only for now.
llvm-svn: 219948
Not all situations are representable in the compressed __unwind_info format,
and when this happens the entry needs to point to the more general __eh_frame
description.
Just x86_64 implementation for now.
rdar://problem/18208653
llvm-svn: 219836
We'll also need references back to the CIE eventually, but for now making sure
we can work out what an FDE is referring to is enough.
The actual kind of reference needs to be different between architectures,
probably because of MachO's chronic shortage of relocation types but I don't
really want to know in case I find out something that distresses me even more.
rdar://problem/18208653
llvm-svn: 219824
Arm code has two instruction encodings "thumb" and "arm". When branching from
one code encoding to another, you need to use an instruction that switches
the instruction mode. Usually the transition only happens at call sites, and
the linker can transform a BL instruction in BLX (or vice versa). But if the
compiler did a tail call optimization and a function ends with a branch (not
branch and link), there is no pc-rel BX instruction.
The ShimPass looks for pc-rel B instructions that will need to switch mode.
For those cases it synthesizes a shim which does the transition, then modifies
the original atom with the B instruction to target to the shim atom.
llvm-svn: 219655
Previously the field was not set. The field should be pointing to
a placeholder where the DLL delay-loader writes the base address
of a DLL.
llvm-svn: 219415
Previously, LLVM object tools didn't know the true size of the sections.
This would result in tools thinking that a section with a VirtualSize
smaller than FileAlignment would end in zeros when actually those zeros
weren't really part of the section contents.
llvm-svn: 219394
This is a partial patch to emit the delay-import table. With this,
LLD is now able to emit the table that llvm-readobj can read and
dump.
The table lacks a few fields, such as the address of HMODULE, the
import address table, etc. They'll be added in subsequent patches.
llvm-svn: 219384
This commit implements in the X86_64 ELF lld backend yet another feature that
was only available in the MIPS backend. However, this patch changes generic ELF
classes to make it trivial for other ELF backends to use this logic too. When
creating a dynamic executable that has dynamic relocations against weak
undefined symbols, these symbols must be exported to the dynamic symbol table
to seek a possible resolution at run time.
A common use case is the __gmon_start__ weak glibc undefined symbol.
Reviewer: shankarke
http://reviews.llvm.org/D5571
llvm-svn: 219349
When creating a dynamic executable and receiving the -E flag, the linker should
export all globally visible symbols in its dynamic symbol table.
This commit also moves the logic that exports symbols in the dynamic symbol
table from OutputELFWriter to the ExecutableWriter class. It is not correct to
leave this at OutputELFWriter because DynamicLibraryWriter, another subclass of
OutputELFWriter, already exports all symbols, meaning we can potentially end up
with duplicated symbols in the dynamic symbol table when creating shared libs.
Reviewers: shankarke
http://reviews.llvm.org/D5585
llvm-svn: 219334
mach-o supports "fat" files which are a header/table-of-contents followed by a
concatenation of mach-o files (or archives of mach-o files) built for
different architectures. Previously, the support for fat files was in the
MachOReader, but that only supported fat .o files and dylibs (not archives).
The fix is to put the fat handing into MachOFileNode. That way any input file
kind (including archives) can be fat. MachOFileNode selects the sub-range
of the fat file that matches the arch being linked and creates a MemoryBuffer
for just that subrange.
llvm-svn: 219268
Previously, we would not check the target machine type and the module (object)
machine type. Add a check to ensure that we do not attempt to use an object
file with a different target architecture.
This change identified a couple of tests which were incorrectly mixing up
architecture types, using x86 input for a x64 target. Adjust the tests
appropriately. The renaming of the input and the architectures covers the
changes to the existing tests.
One significant change to the existing tests is that the newly added test input
for x64 uses the correct user label prefix for X64.
llvm-svn: 219093
This option is added by Xcode when it runs the linker. It produces a binary
file which contains the file the linker used. Xcode uses the info to
dynamically update it dependency tracking.
To check the content of the binary file, the test case uses a python script
to dump the binary file as text which FileCheck can check.
llvm-svn: 219039
When creating the graph edges of the atoms of an ELF file, special care must be
taken with atoms that represent weak symbols. They cannot be the target of any
Reference::kindLayoutAfter edge because they can be merged and point to other
code, screwing up the final layout of the atoms. ELFFile::createAtoms()
correctly handles this corner case. The problem is that createAtoms() assumed
that there can be no zero-sized weak symbols, which is not true. Consider:
my_weak_func1:
my_weak_func2:
my_weak_func3:
code
In this case, we have two zero-sized weak symbols, my_weak_func1 and
my_weak_func2, and one non-zero weak symbol my_weak_func3. createAtoms() would
correctly handle my_weak_func3, but not the first two symbols. This problem
happens in the musl C library when a zero-sized weak symbol is merged and
screws up the file layout. Since this musl code lives at the finalization hooks,
any C program linked with LLD and musl was correctly executing, but segfaulting
at the end.
Reviewers: shankarke
http://reviews.llvm.org/D5606
llvm-svn: 219034
DLL delay importing is a feature to load a DLL lazily, instead of
at program start-up time.
If the feature is turned on with the /delayload flag, the linker
resolves the delay-load helper function. All function pointer table
entries for the DLL are initially pointing to the helper function.
When called, the function loads and resolves the DLL symbols using
dlopen-ish Windows system calls and then write the reuslts to the
function pointer table. The helper function is in "delayimp.lib".
Note that this feature is not completely implemented yet. LLD
also needs to emit the table that's consumed by the delay-load
helper function. That'll be done in another patch.
llvm-svn: 218943
The mergeByContent attribute on DefinedAtoms triggers the symbol table to
coalesce atoms with the exact same content. The problem is that atoms can also
have a required custom section. The coalescing should never change the custom
section of an atom.
The fix is to only consider to atoms to have the same content if their
sectionChoice() and customSectionName() attributes match.
llvm-svn: 218893
The darwin linker has the -demangle option which directs it to demangle C++
(and soon Swift) mangled symbol names. Long term we need some Diagnostics object
for formatting errors and warnings. But for now we have the Core linker just
writing messages to llvm::errs(). So, to enable demangling, I changed the
Resolver to call a LinkingContext method on the symbol name.
To make this more interesting, the demangling code is done via __cxa_demangle()
which is part of the C++ ABI, which is only supported on some platforms, so I
had to conditionalize the code with the config generated HAVE_CXXABI_H.
llvm-svn: 218718
This is yet another edge case of ambiguous name resolution.
When a symbol is specified with /entry:SYM, SYM may be resolved
to the C++ mangled function name (?SYM@@YAXXZ).
llvm-svn: 218706
This is a minimally useful pass to construct the __unwind_info section in a
final object from the various __compact_unwind inputs. Currently it doesn't
produce any compressed pages, only works for x86_64 and will fail if any
function ends up without __compact_unwind.
rdar://problem/18208653
llvm-svn: 218703
MSDN doesn't say about /export:foo=bar style option, but
it turned out MSVC link.exe actually accepts that. So we need that
too.
It also means that the export directive in the module definition
file and /export command line option are functionally equivalent.
llvm-svn: 218695
Summary:
This patch adds support for the general dynamic TLS access model for X86_64 (see www.akkadia.org/drepper/tls.pdf).
To properly support TLS, the patch also changes the __tls_get_addr atom to be a shared library atom instead of a regularly defined atom (the previous lld approach). This closely models the reality of a function that will be resolved at runtime by the dynamic linker and loader itself (ld.so). I was tempted to force LLD to link against ld.so itself to resolve these symbols, but since GNU ld does not need the ld.so library to resolve this symbol, I decided to mimic its behavior and keep hardwired a definition of __tls_get_addr in the lld code.
This patch also moves some important logic that previously was only available to the MIPS lld backend to be used to all ELF backends. This logic, which now lives in the DefaultLayout class, will monitor which external (shared lib) symbols are really imported by the current module and will only populate the dynamic symbol table with used symbols, as opposed to the previous approach of dumping all shared lib symbols in the dynamic symbol table. This is important to this patch to avoid __tls_get_addr from getting injected into all dynamic symbol tables.
By solving the previous problem of always adding __tls_get_addr, now the produced symbol tables are slightly smaller. But this impacted several tests that relied on hardwired/predefined sizes of the symbol table, requiring this patch to update such tests.
Test Plan: Added a LIT test case that exercises a simple use case of TLS variable in a shared library.
Reviewers: ruiu, rafael, Bigcheese, shankarke
Reviewed By: Bigcheese, shankarke
Subscribers: emaste, shankarke, joerg, kledzik, mcrosier, llvm-commits
Projects: #lld
Differential Revision: http://reviews.llvm.org/D5505
llvm-svn: 218633
Previously we emit two or more identical definitions for an
exported symbol if the same /export option is given more than
once. This patch fixes that bug.
llvm-svn: 218433
Currently you can omit the leading underscore from exported
symbol name. LLD will look for mangled name for you. But it won't
look for C++ mangled name.
This patch is to support that.
If "sym" is specified to be exported, the linker looks for not
only "sym", but also "_sym" and "?sym@@<whatever>", so that you
can export a C++ function without decorating it.
llvm-svn: 218355
If two or more /export options are given for the same symbol, we should
always print a warning message and use the first one regardless of other
parameters.
Previously there was a case that the first parameter is not used.
llvm-svn: 218342
A symbol in a module definition file may be annotated with the
PRIVATE keyword like this.
EXPORTS
func PRIVATE
The PRIVATE keyword does not affect the resulting .dll file.
But it prevents the symbol to be listed in the .lib (import
library) file.
llvm-svn: 218273
Patch from Rafael Auler!
When a shared lib has an undefined symbol that is defined in a regular object
(the program), the final executable must export this symbol in the dynamic
symbol table. However, in the current logic, lld only puts the symbol in the
dynamic symbol table if the symbol is weak. This patch fixes lld to put the
symbol in the dynamic symbol table regardless if it is weak or not.
This caused a problem in FreeBSD10, whose programs link against a crt1.o
that defines the symbol __progname, which is, in turn, undefined in libc.so.7
and will only be resolved in runtime.
http://reviews.llvm.org/D5424
llvm-svn: 218259
Atoms are ordered in the output file by ordinal. File has file ordinal,
and atom has atom ordinal which is unique within the file.
No two atoms should have the same combination of ordinals.
However that contract was not satisifed for alias atoms. Alias atom
is defined by /alternatename:sym1=sym2. In this case sym1 is defined
as an alias for sym2. sym1 always got ordinal 0.
As a result LLD failed with an assertion failure.
This patch assigns ordinal to alias atoms.
llvm-svn: 218158
I made LLD to report an error if /safeseh:no option is given on x64,
but it turned out MSVC link.exe doesn't report error on it.
Removing the check.
llvm-svn: 218077
The contents from section .CRT$XLA to .CRT$XLZ is an array of function
pointers. They are called by the runtime when a new thread is created
or (gracefully) terminated.
You can make your own initialization function to be called by that
mechanism. All you have to do is:
- Define a pointer to a function in a .CRT$XL* section using pragma
- Make an external reference to "__tls_used" symbol
That technique is used in many projects. This patch is to support that.
What this patch does is to set the relative virtual address of
"__tls_used" to the PECOFF directory table. __tls_used is actually a
struct containing pointers to a symbol in .CRT$XLA and another symbol
in .CRT$XLZ. The runtime looks at the directory table, gets the address
of the struct, and call the function pointers between XLA and XLZ.
llvm-svn: 218007
On darwin, the linker tools records which dylib (DSO) each undefined was found
in, and then at runtime, the loader (dyld) only looks in that one specific
dylib for each undefined symbol. Now that llvm-objdump can display that info
I can write test cases.
llvm-svn: 217898
The provided base must also be a multiple of the system's page size, which is a
reasonable enough demand.
Also check the other diagnostics more thoroughly.
llvm-svn: 217577
The existing system linkers on Darwin and Linux are called "ld". We'd like to
eventually drop in lld as "ld" and have it just work. But lld is a universal
linker that requires the first option to be -flavor to know which command line
mode to emulate (gnu or darwin).
This change tests if argv[0] is "ld" and if so, if the tool was built on MacOSX
then assume the darwin flavor otherwise the gnu flavor. There are two test
cases which copy lld to "ld" and then run it. One for darwin and one for linux.
llvm-svn: 217566
Most of the changes are in the new file ArchHandler_arm64.cpp. But a few
things had to be fixed to support 16KB pages (instead of 4KB) which iOS arm64
requires. In addition the StubInfo struct had to be expanded because
arm64 uses two instruction (ADRP/LDR) to load a global which requires two
relocations. The other mach-o arches just needed one relocation.
llvm-svn: 217469
There is a bit (MH_PIE) in the flags field of the mach_header which tells
the kernel is a program was built position independent (for ASLR). The linker
automatically attempts to build programs PIE if they are built for a recent
OS version. But the -pie and -no_pie options override that default behavior.
llvm-svn: 217408
defined in a shared library.
Now LLD does not export a strong defined symbol if it coalesces away a
weak symbol defined in a shared library. This bug affects all ELF
architectures and leads to segfault:
% cat foo.c
extern int __attribute__((weak)) flag;
int foo() { return flag; }
% cat main.c
int flag = 1;
int foo();
int main() { return foo() == 1 ? 0 : -1; }
% clang -c -fPIC foo.c main.c
% lld -flavor gnu -target x86_64 -shared -o libfoo.so ... foo.o
% lld -flavor gnu -target x86_64 -o a.out ... main.o libfoo.so
% ./a.out
Segmentation fault
The problem is caused by the fact that we lose all information about
coalesced symbols after the `Resolver::resolve()` method is finished.
The patch solves the problem by overriding the
`LinkingContext::notifySymbolTableCoalesce()` method and saving names
of coalesced symbols. Later in the `buildDynamicSymbolTable()` routine
we use this information to export these symbols.
llvm-svn: 217363
When a file is not found, produce a proper error message. The previous error
message produced a file format error, which made me wonder for a while why
there is a file format error, but essentially the file was not found.
This fixes the problem by producing a proper error message.
llvm-svn: 217359
By default linker would not create a separate segment to hold read only data.
This option overrides that behavior by creating the a separate read only segment
for read only data.
llvm-svn: 217358
Mach-O has a "fat" (or "universal") variant where the same contents built for
different architectures are concatenated into one file with a table-of-contents
header at the start. But this leaves a dilemma for the linker - which
architecture to use.
Normally, the linker command line -arch is used to force which slice of any fat
files are used. The clang compiler always passes -arch to the linker when
invoking it. But some Makefiles invoke the linker directly and don’t specify
the -arch option. For those cases, the linker scans all input files in command
line order and finds the first non-fat object file. Whatever architecture it
is becomes the architecture for the link.
llvm-svn: 217189
The use of default: was disabling the warning about unused enumerators. Fix
that, then fix the one enumerator that was not handled. Add coverage for
it in test suite.
llvm-svn: 217078
On Darwin at runtime, dyld will prefer to use the export trie of a dylib instead
of the traditional symbol table (which is large and requires a binary search).
This change enables the linker to generate an export trie and to prefer it if
found in a dylib being linked against. This also simples the yaml for dylibs
because the yaml form of the trie can be reduced to just a sequence of names.
llvm-svn: 217066
I hope this is the last fix for x64 relocations as I've wasted
a few days on this.
This caused a mysterious issue that some C++ programs crash on
startup. It was because a null pointer is passed as argv to main.
__tmainCRTStartup calls main, but before that it calls all
initialization routines between .text$xc_a and .text$xc_z.
pre_cpp_init is one of such routines, and it is the one who
initializes a heap pointer for argv for later use. That routine
was not called for some reason.
It turned out that __tmainCRTStartup was skipping a block of
code because of the relocation bug. A condition in the function
depends on a memory load, and that memory load was referring
a wrong location. As a result a jump instruction took the
wrong branch, skipping pre_cpp_init and so on.
This patch fixes the issue. Also added more tests to fix them
once and for all.
llvm-svn: 216772
When a relocation is applied to a location, the new value needs
to be added to the existing value at the location. Existing
value is in most cases zero, but if not, the current code does
not work.
llvm-svn: 216680
Image Base field in the PE/COFF header is used as hint for the loader.
If the loader can load the executable at the specified address, that's
fine, but if not, it has to load it at a different address.
If that happens, the loader has to fix up the addresses in the
executable by adding the offset. The list of addresses that need to
be fixed is in .reloc section.
This patch is to emit x64 .reloc section contents.
llvm-svn: 216636
IMAGE_REL_AMD64_ADDR64 relocation should set 64-bit *VA* (virtual
address) instead of *RVA* (relative virtual address), so we have
to add the iamge base to the target's RVA.
llvm-svn: 216512
The implementation of AMD64 relocations was imcomplete
and wrong. On AMD64, we of course have to use AMD64
relocations instead of i386 ones. This patch fixes the
issue.
LLD is now able to link hello64.obj (created from
hello64.asm) against user32.lib and kernel32.lib to
create a Win64 binary.
llvm-svn: 216253
Mach-O symbols can have an attribute on them means their content should never be
dead code stripped. This translates to deadStrip() == deadStripNever.
llvm-svn: 216234
Both options control the final scope of atoms.
When -exported_symbols_list <file> is used, the file is parsed into one
symbol per line in the file. Only those symbols will be exported (global)
in the final linked image.
The -keep_private_externs option is only used with -r mode. Normally, -r
mode reduces private extern (scopeLinkageUnit) symbols to non-external. But
add the -keep_private_externs option keeps them private external.
llvm-svn: 216146
The darwin linker has an option, heavily used by Xcode, in which, instead
of listing all input files on the command line, the input file paths are
written to a text file and the path of that text file is passed to the linker
with the -filelist option (similar to @file).
In order to make test cases for this, I generalized the -test_libresolution
option to become -test_file_usage.
llvm-svn: 215762
Darwin has a packaging mechanism for shared libraries and headers called
frameworks. A directory Foo.framework contains a shared library binary file
"Foo" and a subdirectory "Headers". Most OS frameworks are all in one
directory /System/Library/Frameworks/. As a linking convenience, the linker
option "-framework Foo" means search the framework directories specified
with -F (analogous to -L) looking for a shared library Foo.framework/Foo.
llvm-svn: 215680
In general two-level namespace means each program records exactly which dylib
each undefined (imported) symbol comes from. But, sometimes the implementor
wants to hide the implementation dylib. For instance libSytem.dylib is the base
dylib all Darwin programs must link with. A few years ago it was split up
into two dozen dylibs by all are hidden behind libSystem.dylib which re-exports
each sub-dylib. All clients still think libSystem.dylib is the implementor.
To support this, the linker must load "indirect" dylibs and not just the
"direct" dylibs specified on the command line. This is done in the
createImplicitFiles() method after all command line specified files are
loaded. Since an indirect dylib may have already been loaded as a direct dylib
(or indirectly via a previous direct dylib), the MachOLinkingContext keeps
a list of all loaded dylibs.
With this change hello world can now be linked against the real OS or SDK.
llvm-svn: 215605
Split up the CRuntimeFile into one part for output types that need an entry
point and another part for output types that use stubs.
Add file 'test/mach-o/Inputs/libSystem.yaml' for use by test cases that
use -dylib and therefore may now need the helper symbol in libSystem.dylib.
llvm-svn: 215602
Mach-o uses "two-level namespace" where each undefined symbols is associated
with a specific dylib. This means at runtime the loader (dyld) does need to
search all loaded dylibs for that symbol but rather just the one specified.
Now that llvm-nm -m prints out that info, properly set that info, and test
it in the hello world test cases.
llvm-svn: 215598
This patch adds the initial ELF/AArch64 support to lld. Only a basic "Hello
World" app has been successfully tested for both dynamic and static compiling.
Differential Revision: http://reviews.llvm.org/D4778
Patch by Daniel Stewart <stewartd@codeaurora.org>!
llvm-svn: 215544
The commit of the .ent/.end implementation will change the result of the
relocation evaluations (because of a new section and additional relocations)
which will lead to a failure if the .ent/.end directives are present in this
test.
We don't really need the .ent/.end directives in this test so let's just remove
them to preserve the current output.
llvm-svn: 215534
/INCLUDE arguments passed as command line options are handled in the
same way as Unix -u. All option values are converted to an undefined
symbol and added to a dummy input file, so that the specified symbols
are resolved.
One tricky thing on Windows is that the option is also allowed to
appear in the object file's directive section. At the time when
it's being read, all (regular) command line options have already
been processed. We cannot add undefined atoms to the dummy file
anymore.
Previously, we added such /INCLUDE to a set that has already been
processed. As a result the options were ignored.
This patch fixes the issue. Now, /INCLUDE symbols in the directive
section are handled as real undefined symbol in the COFF file.
We create an undefined symbol for each /INCLUDE argument and add
it to the file being parsed.
llvm-svn: 214824
The PE/COFF spec says that SizeOfRawData field in the section
header must be a multiple of FileAlignment from the optional
header. LLD emits 512 as FileAlignment, so it must have been
a multiple of 512.
LLD did not follow that. It emitted the actual section size
without the last padding as the SizeOfRawData. Although it's
not correct as per the spec, the Windows loader doesn't seem
to actually bother to check that. Executables created by LLD
worked fine.
However, tools dealing with executalbe files may expect it
to be the correct value, and one instance of it is mt.exe
tool distributed as a part of Windows SDK.
If CMake is invoked with "-E vs_link_exe" option, it silently
run mt.exe to embed a resource file to the resulting file.
And mt.exe sometimes breaks an input file if it's section
header does not follow the standard. That caused a misterous
error that CMake with Ninja occasionally produces a broken
executable.
This patch fixes the section header to make mt.exe and
other tools happy.
llvm-svn: 214453
In some cases the address of a function will be materialized with a movw/movt
pair. If the function is a thumb function, the low bit needs to be set on
the movw immediate value.
llvm-svn: 214277
The -sectalign option is used to increase the alignment required for a section.
It required some reworking of how the __TEXT segment is laid out because that
segment also contains the mach_header and load commands. And the size of load
commands depend on the number of segments, sections, and dependent dylibs used.
Using this option will simplify some future test cases because the final
address of code can be pinned down, making tests of its content easier.
llvm-svn: 214268
All iOS arm processor support switching between arm and thumb mode at call sites
by using the BLX instruction (instead of BL). But the compiler does not know
the implementation mode for extern functions, so the linker must update BL/BLX
instructions to match what is linked is actually linked together. In addition,
pointers to functions (such as vtables) must have the low bit set if the target
of the pointer is a thumb mode function.
llvm-svn: 214140
Sometimes compilers emit data into code sections (e.g. constant pools or
jump tables). These runs of data can throw off disassemblers. The solution
in mach-o is that ranges of data-in-code are encoded into a table pointed to
by the LC_DATA_IN_CODE load command.
The way the data-in-code information is encoded into lld's Atom model is that
that start and end of each data run is marked with a Reference whose offset
is the start/end of the data run. For arm, the switch back to code also marks
whether it is thumb or arm code.
llvm-svn: 213901
On Windows there are four "main" functions -- main, wmain, WinMain,
or wWinMain. Their parameter types are diffferent. The standard
library provides four different entry functions (i.e.
{w,}{WinMain,main}CRTStartup) for them. You need to use the right
entry routine for your "main" function.
If you give an /entry option, the specified name is used
unconditionally.
Otherwise, the linker needs to select the right one based on
user-supplied entry point function. This can be done after the
linker reads all the input files.
This patch moves the code to determine the entry point function
from the driver to a virtual input file. It also implements the
correct logic for the entry point function selection.
llvm-svn: 213713
This patch just supports marking ranges that are thumb code (vs arm code).
Future patches will mark data and jump table ranges. The ranges are encoded
as References with offsetInAtom being the start of the range and the target
being the same atom.
llvm-svn: 213712
Over time the symbols and relocations have changed for dwarf unwind info
in the __eh_frame section. Add test cases for older and new style.
llvm-svn: 213585
Add support for adding section relocations in -r mode. Enhance the test
cases which validate the parsing of .o files to also round trip. They now
write out the .o file and then parse that, verifying all relocations survived
the round trip.
llvm-svn: 213333
All architecture specific handling is now done in the appropriate
ArchHandler subclass.
The StubsPass and GOTPass have been simplified. All architecture specific
variations in stubs are now encoded in a table which is vended by the
current ArchHandler.
llvm-svn: 213187
There are two forms of `-l` prefixed expression:
* -l<libname>
* -l:<filename>
In the first case a linker should construct a full library name
`lib + libname + .[so|a]` and search this library as usual. In the second case
a linker should use the `<filename>` as is and search this file through library
search directories.
The patch reviewed by Shankar Easwaran.
llvm-svn: 213077
Previously we invoked cvtres.exe for each compiled Windows
resource file. The generated files were then concatenated
and embedded to the executable.
That was not the correct way to merge compiled Windows
resource files. If you just concatenate generated files,
only the first file would be recognized and the rest would
be ignored as trailing garbage.
The right way to merge them is to call cvtres.exe with
multiple input files. In this patch we do that in the
Windows driver.
llvm-svn: 212763
These behave slightly idiosyncratically in the best of cases, and have
additional hacks layered on top of that for compatibility with badly behaved
build systems (via ld64).
For -lXYZ:
+ If XYZ is actually XY.o then search all library paths for XY.o
+ Otherwise search all library paths, first for libXYZ.dylib, then libXYZ.a
+ By default the library paths are /usr/lib and /usr/local/lib in that order.
For -syslibroot:
+ -syslibroot options apply to absolute paths in the search order.
+ All -syslibroot prefixes that exist are added to the search path *instead*
of the original.
+ If no -syslibroot prefixed path exists, the original is kept.
+ Hacks^WExceptions:
+ If only 1 -syslibroot is given and doesn't contain /usr/lib or
/usr/local/lib, that path is dropped entirely. (rdar://problem/6438270).
+ If the last -syslibroot is "/", all of them are ignored entirely.
(rdar://problem/5829579).
At least, that's my best interpretation of what ld64 does in buildSearchPaths.
llvm-svn: 212706
Previously the alignment of the .bss section was not
properly set because of a bug in AtomizeDefinedSymbolsInSection.
We set the alignment of the section at the end of the function,
but we use an eraly return for the .bss section, so the code had
been skipped.
llvm-svn: 212571
This converts the very complicated mach-o arm
relocations into the simple Reference Kinds in lld.
The next patch will use the internal Reference kinds
to fix up arm/thumb code.
llvm-svn: 212306
Unfortunately, the creation of (the default) output file, a.out races with all
the other tests in this directory. When the wrong one is read by macho-dump,
the test fails.
llvm-svn: 212269
When trying to map atom types to sections, we were iterating through an array
until we hit a sentinel value. There's no need for such dances when range-based
for loops are available.
llvm-svn: 212035
This isn't really the right place to put them in final object files (that would
be __TEXT,__unwind_info), but the format is different between relocatable and
final objects, which means we really need a pass to handle the translation.
For now, re-emitting in __LD,__compact_unwind is harmless (dyld ignores it and
moves straight on to inspecting __TEXT,__eh_frame), and sidesteps an assertion
failure when processing files containing compact-unwind info.
llvm-svn: 212032
Segments must occupy a multiple of the page size in memory (4096 currently). We
check for this when emitting files, but the placement algorithm broke down for
the second non-__TEXT segment encountered: the offset wasn't aligned up to 4096
before starting its layout.
llvm-svn: 212031
Because of how we were calculating fileOffset and fileSize for segments, most
ended up at a single offset in a finalised MachO file. This meant the data
often didn't even get written in the final object, let alone where it would be
useful.
llvm-svn: 212030