In general two-level namespace means each program records exactly which dylib
each undefined (imported) symbol comes from. But, sometimes the implementor
wants to hide the implementation dylib. For instance libSytem.dylib is the base
dylib all Darwin programs must link with. A few years ago it was split up
into two dozen dylibs by all are hidden behind libSystem.dylib which re-exports
each sub-dylib. All clients still think libSystem.dylib is the implementor.
To support this, the linker must load "indirect" dylibs and not just the
"direct" dylibs specified on the command line. This is done in the
createImplicitFiles() method after all command line specified files are
loaded. Since an indirect dylib may have already been loaded as a direct dylib
(or indirectly via a previous direct dylib), the MachOLinkingContext keeps
a list of all loaded dylibs.
With this change hello world can now be linked against the real OS or SDK.
llvm-svn: 215605
Split up the CRuntimeFile into one part for output types that need an entry
point and another part for output types that use stubs.
Add file 'test/mach-o/Inputs/libSystem.yaml' for use by test cases that
use -dylib and therefore may now need the helper symbol in libSystem.dylib.
llvm-svn: 215602
Mach-o uses "two-level namespace" where each undefined symbols is associated
with a specific dylib. This means at runtime the loader (dyld) does need to
search all loaded dylibs for that symbol but rather just the one specified.
Now that llvm-nm -m prints out that info, properly set that info, and test
it in the hello world test cases.
llvm-svn: 215598
This patch adds the initial ELF/AArch64 support to lld. Only a basic "Hello
World" app has been successfully tested for both dynamic and static compiling.
Differential Revision: http://reviews.llvm.org/D4778
Patch by Daniel Stewart <stewartd@codeaurora.org>!
llvm-svn: 215544
The commit of the .ent/.end implementation will change the result of the
relocation evaluations (because of a new section and additional relocations)
which will lead to a failure if the .ent/.end directives are present in this
test.
We don't really need the .ent/.end directives in this test so let's just remove
them to preserve the current output.
llvm-svn: 215534
The tests assume the path separator is '/', but if you run
them on Windows it is '\'. As a result the tests are failing
on Windows. This should be the minimal change to make these
tests to pass on Windows platform.
Differential Revision: http://reviews.llvm.org/D4710
llvm-svn: 214990
``html_favicon`` seem to conflict with [what it in the sphinx
docs](http://sphinx-doc.org/config.html#confval-html_favicon).
So I've copied the comments from there to conf.py and changed its
value appropriately to remove the missing favicon.ico warning.
llvm-svn: 214971
/INCLUDE arguments passed as command line options are handled in the
same way as Unix -u. All option values are converted to an undefined
symbol and added to a dummy input file, so that the specified symbols
are resolved.
One tricky thing on Windows is that the option is also allowed to
appear in the object file's directive section. At the time when
it's being read, all (regular) command line options have already
been processed. We cannot add undefined atoms to the dummy file
anymore.
Previously, we added such /INCLUDE to a set that has already been
processed. As a result the options were ignored.
This patch fixes the issue. Now, /INCLUDE symbols in the directive
section are handled as real undefined symbol in the COFF file.
We create an undefined symbol for each /INCLUDE argument and add
it to the file being parsed.
llvm-svn: 214824
The PE/COFF spec says that SizeOfRawData field in the section
header must be a multiple of FileAlignment from the optional
header. LLD emits 512 as FileAlignment, so it must have been
a multiple of 512.
LLD did not follow that. It emitted the actual section size
without the last padding as the SizeOfRawData. Although it's
not correct as per the spec, the Windows loader doesn't seem
to actually bother to check that. Executables created by LLD
worked fine.
However, tools dealing with executalbe files may expect it
to be the correct value, and one instance of it is mt.exe
tool distributed as a part of Windows SDK.
If CMake is invoked with "-E vs_link_exe" option, it silently
run mt.exe to embed a resource file to the resulting file.
And mt.exe sometimes breaks an input file if it's section
header does not follow the standard. That caused a misterous
error that CMake with Ninja occasionally produces a broken
executable.
This patch fixes the section header to make mt.exe and
other tools happy.
llvm-svn: 214453
In some cases the address of a function will be materialized with a movw/movt
pair. If the function is a thumb function, the low bit needs to be set on
the movw immediate value.
llvm-svn: 214277
The -sectalign option is used to increase the alignment required for a section.
It required some reworking of how the __TEXT segment is laid out because that
segment also contains the mach_header and load commands. And the size of load
commands depend on the number of segments, sections, and dependent dylibs used.
Using this option will simplify some future test cases because the final
address of code can be pinned down, making tests of its content easier.
llvm-svn: 214268
All iOS arm processor support switching between arm and thumb mode at call sites
by using the BLX instruction (instead of BL). But the compiler does not know
the implementation mode for extern functions, so the linker must update BL/BLX
instructions to match what is linked is actually linked together. In addition,
pointers to functions (such as vtables) must have the low bit set if the target
of the pointer is a thumb mode function.
llvm-svn: 214140
The following expression
m[i] = m[j]
where m is a DenseMap and i != j is not safe. m[j] returns a
reference, which would be invalidated when a rehashing occurs.
If rehashing occurs to make room for m[i], m[j] becomes
invalid, and that invalid reference would be used as the RHS
value of the expression.
llvm-svn: 213969
Sometimes compilers emit data into code sections (e.g. constant pools or
jump tables). These runs of data can throw off disassemblers. The solution
in mach-o is that ranges of data-in-code are encoded into a table pointed to
by the LC_DATA_IN_CODE load command.
The way the data-in-code information is encoded into lld's Atom model is that
that start and end of each data run is marked with a Reference whose offset
is the start/end of the data run. For arm, the switch back to code also marks
whether it is thumb or arm code.
llvm-svn: 213901
insertElementAt(x, END) does the identical thing as addInputElement(x),
so the only reasonable use of insertElementAt is to call it with the
other possible argument, BEGIN. That means the second parameter of the
function is just redundant. This patch is to remove the second
parameter and rename the function accordingly.
llvm-svn: 213821
The entry point file needs to be processed after all other
object files and before all .lib files. It was processed
after .lib files. That caused an issue that the entry point
function was not resolved from the standard library files.
llvm-svn: 213804
On Windows there are four "main" functions -- main, wmain, WinMain,
or wWinMain. Their parameter types are diffferent. The standard
library provides four different entry functions (i.e.
{w,}{WinMain,main}CRTStartup) for them. You need to use the right
entry routine for your "main" function.
If you give an /entry option, the specified name is used
unconditionally.
Otherwise, the linker needs to select the right one based on
user-supplied entry point function. This can be done after the
linker reads all the input files.
This patch moves the code to determine the entry point function
from the driver to a virtual input file. It also implements the
correct logic for the entry point function selection.
llvm-svn: 213713
This patch just supports marking ranges that are thumb code (vs arm code).
Future patches will mark data and jump table ranges. The ranges are encoded
as References with offsetInAtom being the start of the range and the target
being the same atom.
llvm-svn: 213712
This is a part of a larger change to move the entry point
processing to a later pass than the driver. On Windows the default
entry point function varies depending on user-provided functions.
That means the driver is not able to correctly know the entry point
function name. Only passes after the core linker can infer it.
llvm-svn: 213697
Over time the symbols and relocations have changed for dwarf unwind info
in the __eh_frame section. Add test cases for older and new style.
llvm-svn: 213585
Add support for adding section relocations in -r mode. Enhance the test
cases which validate the parsing of .o files to also round trip. They now
write out the .o file and then parse that, verifying all relocations survived
the round trip.
llvm-svn: 213333
The code to manage resolvable symbols is now separated from
ExportedSymbolRenameFile so that other class can reuse it.
I'm planning to use it to find the entry function symbol
based on resolvable symbols.
llvm-svn: 213322
All architecture specific handling is now done in the appropriate
ArchHandler subclass.
The StubsPass and GOTPass have been simplified. All architecture specific
variations in stubs are now encoded in a table which is vended by the
current ArchHandler.
llvm-svn: 213187
There are two forms of `-l` prefixed expression:
* -l<libname>
* -l:<filename>
In the first case a linker should construct a full library name
`lib + libname + .[so|a]` and search this library as usual. In the second case
a linker should use the `<filename>` as is and search this file through library
search directories.
The patch reviewed by Shankar Easwaran.
llvm-svn: 213077
Previously we invoked cvtres.exe for each compiled Windows
resource file. The generated files were then concatenated
and embedded to the executable.
That was not the correct way to merge compiled Windows
resource files. If you just concatenate generated files,
only the first file would be recognized and the rest would
be ignored as trailing garbage.
The right way to merge them is to call cvtres.exe with
multiple input files. In this patch we do that in the
Windows driver.
llvm-svn: 212763
These behave slightly idiosyncratically in the best of cases, and have
additional hacks layered on top of that for compatibility with badly behaved
build systems (via ld64).
For -lXYZ:
+ If XYZ is actually XY.o then search all library paths for XY.o
+ Otherwise search all library paths, first for libXYZ.dylib, then libXYZ.a
+ By default the library paths are /usr/lib and /usr/local/lib in that order.
For -syslibroot:
+ -syslibroot options apply to absolute paths in the search order.
+ All -syslibroot prefixes that exist are added to the search path *instead*
of the original.
+ If no -syslibroot prefixed path exists, the original is kept.
+ Hacks^WExceptions:
+ If only 1 -syslibroot is given and doesn't contain /usr/lib or
/usr/local/lib, that path is dropped entirely. (rdar://problem/6438270).
+ If the last -syslibroot is "/", all of them are ignored entirely.
(rdar://problem/5829579).
At least, that's my best interpretation of what ld64 does in buildSearchPaths.
llvm-svn: 212706
Previously the alignment of the .bss section was not
properly set because of a bug in AtomizeDefinedSymbolsInSection.
We set the alignment of the section at the end of the function,
but we use an eraly return for the .bss section, so the code had
been skipped.
llvm-svn: 212571
This converts the very complicated mach-o arm
relocations into the simple Reference Kinds in lld.
The next patch will use the internal Reference kinds
to fix up arm/thumb code.
llvm-svn: 212306
Unfortunately, the creation of (the default) output file, a.out races with all
the other tests in this directory. When the wrong one is read by macho-dump,
the test fails.
llvm-svn: 212269
When trying to map atom types to sections, we were iterating through an array
until we hit a sentinel value. There's no need for such dances when range-based
for loops are available.
llvm-svn: 212035
This isn't really the right place to put them in final object files (that would
be __TEXT,__unwind_info), but the format is different between relocatable and
final objects, which means we really need a pass to handle the translation.
For now, re-emitting in __LD,__compact_unwind is harmless (dyld ignores it and
moves straight on to inspecting __TEXT,__eh_frame), and sidesteps an assertion
failure when processing files containing compact-unwind info.
llvm-svn: 212032
Segments must occupy a multiple of the page size in memory (4096 currently). We
check for this when emitting files, but the placement algorithm broke down for
the second non-__TEXT segment encountered: the offset wasn't aligned up to 4096
before starting its layout.
llvm-svn: 212031
Because of how we were calculating fileOffset and fileSize for segments, most
ended up at a single offset in a finalised MachO file. This meant the data
often didn't even get written in the final object, let alone where it would be
useful.
llvm-svn: 212030
This is first step in reworking how mach-o relocations are processed.
The existing KindHandler is going to become a delgate/helper object for
processing architecture specific references. The KindHandler knows how
to convert mach-o relocations into References and back, as well, as fixing
up the content the relocation is on.
One of the messy things about mach-o relocations is that they sometime
come in pairs, but the pairs still convert to one lld::Reference. So, the
conversion has to detect pairs (arch specific) and change the stride.
llvm-svn: 211921
The previous function returned true for "s < s", which could completely mess up
the sorting of symbols within a section.
Unfortunately, I don't think there's a robust way to write a test for this.
Anything I come up with will be making assumptions about the particular
implementation of std::sort.
llvm-svn: 211704
When looking through sections with zero-terminated string-literals (__cstring
or __ustring) we were constantly rechecking the first few bytes of the string
for '\0' rather than advancing along. This obviously failed unless all strings
within the section had the same length as that first one.
llvm-svn: 211682
We were trying to examine the first symbol in a section that we wanted to
atomize by symbols, even when there wasn't one. Instead, we should make the
initial anonymous symbol cover the entire section in that situation.
llvm-svn: 211681
dynamic symbol table populating and DT_NEEDED tag creation.
The `isDynSymEntryRequired` function returns true if the specified shared
library atom requires a dynamic symbol table entry. The `isNeededTagRequired`
function returns true if we need to create DT_NEEDED tag for the shared
library defined specified shared atom.
By default the both functions return true. So there is no functional changes
for all targets except MIPS. Probably we need to spread the same modifications
on other ELF targets but I want to implement and fully tested complete set of
changes for MIPS target first.
For MIPS we create a dynamic symbol table entry for a shared library atom iif
this atom is referenced by a regular defined atom. For example, if library L1
defines symbol T1, library L2 defines symbol T2 and uses symbol T1
and executable file E1 uses symbol T2 but does not use symbol T1 we create
an entry in the E1 dynamic symbol table for symbol T2 and do not create
an entry for T1.
The patch creates DT_NEEDED tags for shared libraries contain shared library
atoms which a) referenced by regular defined atoms; b) have corresponding
copy dynamic relocations (R_MIPS_COPY).
Now the patch does not take in account --as-needed / --no-as-needed command
line options. So it is too restrictive and create DT_NEEDED tags for really
needed shared libraries only. I plan to fix that by subsequent patches.
llvm-svn: 211674
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
I added a new reference type, kindAssociate. If a target atom is
coalesced away, the referring atom is removed by Resolver, so that
they are treated as a group.
Differential Revision: http://reviews.llvm.org/D4028
llvm-svn: 211106
This code was never being used and any use of it would look fairly strange.
For example, it would try to map a NativeReaderError::file_malformed to
std::errc::invalid_argument.
llvm-svn: 210913
The previous commit uncovered a bug in the mach-o writer whereby two __text
sections were created. But the test case did not catch that. So I updated
the test case to run the linker a second time, reading the output of the
first pass.
llvm-svn: 210624
isCoalescedAway(x) is faster than replacement(x) != x as the former
does not follow the replacement atom chain. Also it's easier to use.
llvm-svn: 210242
COFF supports a feature similar to ELF's section groups. This
patch implements it.
In ELF, section groups are identified by their names, and they are
treated somewhat differently from regular symbols. In COFF, the
feature is realized in a more straightforward way. A section can
have an annotation saying "if Nth section is linked, link this
section too."
Implementing such feature is easy. We can add a reference from a
target atom to an original atom, so that if the target is linked,
the original atom is also linked. If not linked, both will be
dead-stripped. So they are treated as a group.
I added a new reference type, kindAssociate. It does nothing except
preventing referenced atoms from being dead-stripped.
No change to the Resolver is needed.
Reviewers: Bigcheese, shankarke, atanasyan
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3946
llvm-svn: 210240
This provides support for the autoconfing & make build style.
The format, style and implementation follows that used within the llvm and clang projects.
TODO: implement out-of-source documentation builds.
llvm-svn: 210177
Previously FileArchive ctor comment said that only its subclasses
can be instantiated, but the ctor is actually public and is
instantiated by ArchiveReader.
Remove the wrong comment and reorder the member functions so that
public members appear before private ones.
llvm-svn: 210175
In sections that are broken into atoms at symbols, if the first symbol in the
section is not at the start of the section, then make an anonymous atom for
the section content that is before the first symbol.
llvm-svn: 210142
Previously each section kind had its own code to loop over the section and
parse it into atoms. This refactoring has two tables. The first maps sections
to ContentType. The second maps ContentType to information on how to find
the atom boundaries.
A few bugs in test cases were discovered as part of the refactoring.
No change in functionality intended.
llvm-svn: 210138
FileToMutable is what this class does, but this class (or, to be precise,
an instance of this class) is a wrapper of the other SimpleFile. It's odd
that the class was named like a function.
llvm-svn: 210089
Previously section groups are doubly linked to their children.
That is, an atom representing a group has group-child references
to its group contents, and content atoms also have group-parent
references to the group atom. That relationship was invariant;
if X has a group-child edge to Y, Y must have a group-parent
edge to X.
However we were not using group-parent references at all. The
resolver only needs group-child edges.
This patch simplifies the section group by removing the unused
reverse edge. No functionality change intended.
Differential Revision: http://reviews.llvm.org/D3945
llvm-svn: 210066
Layout-before edges are no longer used for layout, but they are
still there for dead-stripping. If we would just remove them
from code, LLD would wrongly remove live atoms that were
referenced by layout-befores.
This patch fixes the issue. Before dead-stripping, it scans all
atoms to construct a reverse map for layout-after edges. Dead-
stripping pass uses the map to traverse the graph.
Differential Revision: http://reviews.llvm.org/D3986
llvm-svn: 210057
Reference::target() never returns a nullptr, so NULL check
is not needed and is more harmful than doing nothing.
No functionality change.
llvm-svn: 210008
Arrange .ctors/.dtors sections in the following order:
.ctors from crtbegin.o or crtbegin?.o
.ctors from regular object files
.ctors.* (sorted) from regular object files
.ctors from crtend.o or crtend?.o
This order is specific for MIPS traget. For example, on X86
the .ctors.* sections are merged into the .init_array section.
llvm-svn: 209987
The main problem is in the predicate passed to the `std::stable_sort()`.
This predicate always returns false if **both** section's names do not
start with `.init_array` or `.fini_array` prefixes. In short, it does not
define a strict weak orderng. Suppose we have the following sections:
.A .init_array.1 .init_array.2
The predicate states that:
not .init_array.1 < .A
not .A < .init_array.2
but .init_array.1 < .init_array.2 !!!
The second problem is that `.init_array` section without number should
go last in the list. Not it has the lowest priority '0' and goes first.
The patch fixes both of the problems.
llvm-svn: 209875