With this, a simple hello world links against libSystem.tbd and the
old ld64.lld linker kind of works again with newer SDKs.
The motivation here is to have an arm64 cross linker that's good
enough to be able to run simple configure link checks on non-mac
systems for generating config.h files. Once -flavor darwinnew can
link arm64, we'll switch to that.
One instance looks like a false positive:
lld/ELF/Relocations.cpp:1622:14: note: use reference type 'const std::pair<ThunkSection *, uint32_t> &' (aka 'const pair<lld::elf::ThunkSection *, unsigned int> &') to prevent copying
for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections)
It is not changed in this commit.
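For context, the note concerns range-based for loops whose loop variable is a const value rather than a const reference. The following is a minimal sketch, not lld code: `Thunk`, `Entry`, and the `sum*` helpers are hypothetical names. It shows that the two forms compute the same result and differ only in whether each element is copied, which here means copying a pointer and an integer.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct Thunk {}; // hypothetical stand-in for lld's ThunkSection

using Entry = std::pair<Thunk *, uint32_t>;

// By-value loop variable: each element is copied (the form the note flags).
uint64_t sumByValue(const std::vector<Entry> &entries) {
  uint64_t total = 0;
  for (const Entry e : entries)
    total += e.second;
  return total;
}

// By-reference loop variable: no copy is made.
uint64_t sumByRef(const std::vector<Entry> &entries) {
  uint64_t total = 0;
  for (const Entry &e : entries)
    total += e.second;
  return total;
}

// Both forms observe the same elements, so the results must agree.
void checkSame(const std::vector<Entry> &entries) {
  assert(sumByValue(entries) == sumByRef(entries));
}
```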
Patch by Nicholas Allegra.
The Mach-O writer calculates the size of load commands multiple times.
First, Util::assignAddressesToSections() (in MachONormalizedFileFromAtoms.cpp)
calculates the size using headerAndLoadCommandsSize() (in
MachONormalizedFileBinaryWriter.cpp), which creates a temporary
MachOFileLayout for the NormalizedFile, only to retrieve its
headerAndLoadCommandsSize. Later, writeBinary() (in
MachONormalizedFileBinaryWriter.cpp) creates a new layout and uses the offsets
from that layout to actually write out everything in the NormalizedFile.
But the NormalizedFile changes between the first computation and the second.
When Util::assignAddressesToSections is called, file.functionStarts is always
empty because Util::addFunctionStarts has not yet been called. Yet
MachOFileLayout decides whether to include a LC_FUNCTION_STARTS command based
on whether file.functionStarts is nonempty. Therefore, the initial computation
always omits it.
Because padding for the __TEXT segment (to make its size a multiple of the
page size) is added between the load commands and the first section, LLD still
generates a valid binary as long as the amount of padding happens to be large
enough to fit the LC_FUNCTION_STARTS command, which it usually is.
However, it's easy to reproduce the issue by adding a section of a precise
size. Given foo.c:
__attribute__((section("__TEXT,__foo")))
char foo[0xd78] = {0};
Run:
clang -dynamiclib -o foo.dylib foo.c -fuse-ld=lld -install_name /usr/lib/foo.dylib
otool -lvv foo.dylib
This should produce:
truncated or malformed object (offset field of section 1 in LC_SEGMENT_64
command 0 not past the headers of the file)
This commit:
- Changes MachOFileLayout to always assume LC_FUNCTION_STARTS is present for
  the initial computation, as long as generating LC_FUNCTION_STARTS is
  enabled. It would be slightly better to check whether there are actually
  any functions, since no LC_FUNCTION_STARTS will be generated if not, but it
  doesn't cause a problem if the initial computation is too high.
- Adds a test.
- Adds an assert in MachOFileLayout::writeSectionContent() that we are not
  writing section content into the load commands region (which would happen
  if the offset was calculated too low due to the initial load commands size
  calculation being too low). Adds an assert in
  MachOFileLayout::writeLoadCommands() to validate a similar situation where
  two size-of-load-commands computations are expected to be equivalent.
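The failure mode can be illustrated with a small model. This is a sketch under stated assumptions, not the MachOFileLayout code. The helper names and the 0x280 header size used in the usage note are hypothetical. The model assumes the sections sit at the end of the page-aligned segment, so all padding falls between the load commands and the first section. The 16-byte figure is the size of a linkedit_data_command, which is four 32-bit fields.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kPageSize = 0x1000;
// LC_FUNCTION_STARTS uses a linkedit_data_command: four 32-bit fields.
constexpr uint64_t kFunctionStartsCmdSize = 16;

uint64_t alignTo(uint64_t value, uint64_t align) {
  assert(align != 0 && "alignment must be nonzero");
  return (value + align - 1) / align * align;
}

// Hypothetical model: sections are packed at the end of the page-aligned
// segment, so the padding sits between the load commands and the first section.
uint64_t firstSectionOffset(uint64_t estimatedHeaderSize,
                            uint64_t sectionBytes) {
  uint64_t segmentSize = alignTo(estimatedHeaderSize + sectionBytes, kPageSize);
  return segmentSize - sectionBytes;
}

// The layout is only valid if the first section starts past the real headers.
bool sectionClearsHeaders(uint64_t estimatedHeaderSize,
                          uint64_t actualHeaderSize, uint64_t sectionBytes) {
  return firstSectionOffset(estimatedHeaderSize, sectionBytes) >=
         actualHeaderSize;
}
```

With an assumed estimate of 0x280 bytes and an actual size of 0x290 bytes, a 0xd78-byte section lands at offset 0x288 in this model, inside the real load commands. Starting from the conservative 0x290 estimate pushes the section to the next page instead.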
llvm-svn: 358545
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The new lld's files are spread across the lib subdirectory, and it isn't easy
to find which files are actually maintained. This patch moves the maintained
files to the Common subdirectory.
Differential Revision: https://reviews.llvm.org/D37645
llvm-svn: 314719
Patch by Patricio Villalobos.
I discovered that lld for darwin is generating the wrong code for lazy
bindings in the __stub_helper section (at least for OS X 10.12). I can
reproduce the problem using this program:
#include <stdio.h>
int main(int argc, char **argv) {
  printf("C: printf!\n");
  puts("C: puts!\n");
  return 0;
}
Then I link it as follows (I have tested versions 3.9, 4.0 and 4.1):
$ clang -c hello.c
$ lld -flavor darwin hello.o -o h1 -lc
When I execute the binary h1, the system gives me the following error:
C: printf!
dyld: lazy symbol binding failed: BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB has segment 4 which is too large (0..3)
dyld: BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB has segment 4 which is too large (0..3)
Trace/BPT trap: 5
Investigating the code, the problem seems to be in the assembly generated
in StubPass.cpp, specifically at line 323, where it adds what seems to be
an arbitrary number (12) to the offset into the lazy bind opcodes section.
The offset should instead be calculated from the
MachONormalizedFileBinaryWrite::lazyBindingInfo result.
I confirmed this bug by patching the code manually in the binary and
writing the right offset in the asm code (__stub_helper).
This patch fixes the content of the atom that contains the assembly code
when the offset is known.
Differential Revision: https://reviews.llvm.org/D35387
llvm-svn: 311734
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
It only makes sense to set N_NO_DEAD_STRIP on a relocatable object file. Otherwise the bits aren't useful for anything. Matches the ld64 behaviour.
llvm-svn: 278419
Currently we do this when an atom is used, but we need to do it when a
dylib is referenced on the cmdline as this matches ld64.
This fixes much confusion over which maps are indexed with installName
vs path. There is likely other confusion, so I'll see whether I can remove
path() completely in a future commit, as path() shouldn't really be needed by anyone.
llvm-svn: 278396
The MachO debug support code (committed in r276935) occasionally needs to
allocate string copies, and was doing so by creating std::strings on a
BumpPtrAllocator. The strings were untracked, so the destructors weren't being
run and we were leaking the memory when the allocator was thrown away. Since
it's easier than tracking the strings, this patch switches the copies to char
buffers allocated directly in the bump-ptr allocator.
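The approach can be sketched as follows. This is a minimal illustration rather than the patch itself: `Arena` is a toy, hypothetical stand-in for llvm::BumpPtrAllocator (one slab per allocation, for brevity), and `copyString` is an assumed helper name. Only raw characters are placed in the arena, so there is no destructor that needs to run when it is discarded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Toy arena (hypothetical stand-in for llvm::BumpPtrAllocator): memory is
// released wholesale, and destructors of objects placed in it never run.
class Arena {
public:
  char *allocate(std::size_t size) {
    assert(size > 0 && "zero-sized allocation");
    slabs.push_back(std::make_unique<char[]>(size));
    return slabs.back().get();
  }

private:
  std::vector<std::unique_ptr<char[]>> slabs;
};

// Copy the characters into the arena as a NUL-terminated char buffer.
// Unlike a std::string constructed in the arena, this owns no heap memory
// of its own, so nothing leaks when the arena goes away.
const char *copyString(Arena &arena, const std::string &s) {
  char *buf = arena.allocate(s.size() + 1);
  std::memcpy(buf, s.data(), s.size());
  buf[s.size()] = '\0';
  return buf;
}
```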
llvm-svn: 277208
This patch causes LLD to build stabs debugging symbols for files containing
DWARF debug info, and to propagate existing stabs symbols for object files
built using '-r' mode. This enables debugging of binaries generated by LLD
from MachO objects.
llvm-svn: 276921
This reverts commit r264945.
The commit only removed an unreachable in a method with a covered switch, but
GCC is likely to warn on this, and the coding standards recommend just leaving
in the unreachable.
llvm-svn: 264983
It's possible for a file to have no entry atom, which means that there
is no atom to check for being a thumb function. Instead just skip
the thumb check and set the entry address to 0, which matches the
current behaviour of getting a default initialised int from a map.
llvm-svn: 264233
The size of a section can be zero, even when it contains atoms, so
long as all of the atoms are also size 0. In this case we were
allocating space for a 0 sized buffer.
Changed this to only allocate when we need the space, but also cleaned
up all the code to use MutableArrayRef instead of uint8_t*, so it's much
safer as we get bounds checking on all of our section creation logic.
llvm-svn: 264204
Turns out that checking only x86 for empty atoms to fix UBSan then
requires the same code in the other targets too. Better to just
check this in the main run loop instead of in each target.
Should be NFC, other than fixing UBSan failures.
llvm-svn: 264116
The non lazy atoms generated in the stubs pass use an image cache to
hold all of the pointers. On arm archs, this is the __got section,
but on x86 archs it should be __nl_symbol_ptr.
rdar://problem/24572729
llvm-svn: 260271
Also added the defaults for whether to generate this load command, which
the cmdline options are able to override.
There was also a difference to ld64 which is fixed here in that ld64 will
generate an empty data in code command if requested.
rdar://problem/24472630
llvm-svn: 260191
This load command generates data in the LINKEDIT section which
is a list of ULEB128 deltas to all of the functions in the __text section.
The list is then zero-terminated and padded to pointer alignment.
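The encoding described above can be sketched like this. It is an illustrative version, not the lld implementation: the function names are hypothetical, and the sketch assumes the offsets are nonzero, strictly increasing, and measured from a common base (which base the first delta uses is an assumption here).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append value as ULEB128: 7 bits per byte, low bits first, with the high
// bit set on every byte except the last.
void appendULEB128(uint64_t value, std::vector<uint8_t> &out) {
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80;
    out.push_back(byte);
  } while (value != 0);
}

// Encode function offsets as deltas from the previous offset, then append
// the zero terminator and pad with zeros up to pointer alignment.
std::vector<uint8_t> encodeFunctionStarts(const std::vector<uint64_t> &offsets,
                                          unsigned pointerSize) {
  assert(pointerSize != 0 && "pointer size must be nonzero");
  std::vector<uint8_t> out;
  uint64_t previous = 0;
  for (uint64_t offset : offsets) {
    // A zero delta would be indistinguishable from the terminator.
    assert(offset > previous && "offsets must be strictly increasing");
    appendULEB128(offset - previous, out);
    previous = offset;
  }
  out.push_back(0);
  while (out.size() % pointerSize != 0)
    out.push_back(0);
  return out;
}
```

For offsets 0x1000, 0x1010, and 0x1100 with 8-byte pointers, the deltas are 0x1000, 0x10, and 0xf0, which encode to the bytes 80 20, 10, and f0 01, followed by the terminator and two bytes of padding.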
ld64 exposes the -function_starts and -no_function_starts cmdline options
to override behaviour from the defaults based on file types.
rdar://problem/24472630
llvm-svn: 260188
The atom content type enum is used as a tie breaker to sort atoms.
In that case, we want MachHeader to be before typeCode as it really will
be before the code in the final executable.
Test case to follow in the next commit or two.
llvm-svn: 260184
The initial segment protection was also being used to set the maximum
segment protection level. Instead, the maximum should be set according
to the architecture we are linking. For example, on Mac OS it should be
RWX on most pages, but on iOS it is often R_X.
rdar://problem/24515136
llvm-svn: 259966
We currently tag on a "__LINKEDIT" when we are emitting the segments.
However, an upcoming patch aims to set the initprot and maxprot segment members
to their correct values, and in order to share code, it's better to create this
segment for real and handle it in buildFileOffsets the same way ld64 does.
The commit for segment protections will add a test for this all being correct so
no test here until that code is committed.
llvm-svn: 259960
This is of the form A.B.C.D.E and to match ld64's behaviour, is
always output to files, even when the version is 0.
rdar://problem/24472630
llvm-svn: 259746
ld64 sets both S_ATTR_PURE_INSTRUCTIONS and S_ATTR_SOME_INSTRUCTIONS
on __TEXT, __text. We only had the S_ATTR_PURE_INSTRUCTIONS attribute.
rdar://problem/24495801
llvm-svn: 259744
In the case where we are emitting to an object file, the platform is
possibly unknown, and the source object files contained load commands
for version min, we can take the maximum of those min versions and
emit it in the output object file.
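As an illustration of taking the maximum of the input minimum versions, here is a small sketch. The helper names are hypothetical and this is not the lld code. It relies on the X.Y.Z version being packed as xxxx.yy.zz in a 32-bit field, as in version_min_command, so comparing the packed integers orders versions correctly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Pack X.Y.Z as xxxx.yy.zz: 16 bits of major, 8 of minor, 8 of patch.
uint32_t packVersion(uint32_t x, uint32_t y, uint32_t z) {
  assert(x <= 0xffff && y <= 0xff && z <= 0xff && "component out of range");
  return (x << 16) | (y << 8) | z;
}

// Returns the highest minimum version among the inputs, or 0 when no input
// carried a version-min load command.
uint32_t mergedMinVersion(const std::vector<uint32_t> &inputMinVersions) {
  uint32_t result = 0;
  for (uint32_t version : inputMinVersions)
    result = std::max(result, version);
  return result;
}
```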
This test also tests r259739.
llvm-svn: 259742
This option is emitted in the min_version load commands.
Note, there's currently a difference in behaviour compared to ld64 in
that we emit a warning if we generate a min_version load command and
didn't give an sdk_version. We need to decide what the correct behaviour
is here, as it's possible we want to emit an error and force clients to
provide the option.
llvm-svn: 259729
If the command line contains something like -macosx_version_min and we
don't explicitly disable generation with -no_version_load_command then
we generate the LC_VERSION_MIN command in the output file.
There are a couple of FIXMEs in here. These will be handled soon with
more tests but I didn't want to grow this patch any more than it already was.
rdar://problem/24472630
llvm-svn: 259718
In r259574 I fixed some of the issues with the mach header symbols
and DSO handles.
This is the next issue whereby the __mh_execute_header has to not
be dead stripped, and (to match ld64) should be dynamically referenced.
The test here should also have been added in r259574 to make sure that
we emit this symbol. But checking that it is not only emitted but also
has the correct reference type is fine.
llvm-svn: 259589
The magic file which contained these symbols inherited from archive
which meant that the resolver didn't add the required atoms as archive
members only get added when referenced. Instead we now inherit from
SimpleFile which always links in the atoms needed.
The second issue was in the handling of these symbols when we emit
the MachO. The mach header symbol needs to be in the atom list as
it gets an offset (0), and being in the atom list makes sure it is
emitted to the symbol table. DSO handles are not emitted to the
symbol table.
rdar://problem/24450654
llvm-svn: 259574
Now that MachoFile has classof(), we can use dyn_cast instead which
is actually the only safe way to handle this.
Turns out this actually manifests as a bug as we were incorrectly
casting instances which weren't MachoFile to MachoFile.
Unfortunately, there's no reliable way of checking for this as it
requires that the file we are looking for has a 0 at exactly the byte
we need for the load of subsectionsViaSymbols.
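The difference between a checked and an unchecked cast can be sketched as follows. This is a simplified model, not the lld class hierarchy: the `File`, `MachOFile`, and `OtherFile` classes are hypothetical, and `dynCast` is a hand-rolled stand-in for llvm::dyn_cast that only illustrates how classof() drives the check.

```cpp
#include <cassert>

class File {
public:
  enum Kind { KindMachO, KindOther };
  explicit File(Kind k) : kind(k) {}
  Kind getKind() const { return kind; }

private:
  Kind kind;
};

class MachOFile : public File {
public:
  MachOFile() : File(KindMachO) {}
  // The hook a dyn_cast-style helper uses to decide whether the cast is valid.
  static bool classof(const File *f) { return f->getKind() == KindMachO; }
  bool subsectionsViaSymbols = false;
};

class OtherFile : public File {
public:
  OtherFile() : File(KindOther) {}
};

// Simplified stand-in for llvm::dyn_cast: returns null when the dynamic kind
// does not match, instead of reinterpreting an unrelated object.
template <typename To, typename From> To *dynCast(From *from) {
  assert(from && "dynCast on a null pointer");
  return To::classof(from) ? static_cast<To *>(from) : nullptr;
}
```

With an unchecked cast, reading subsectionsViaSymbols from an `OtherFile` would read whatever byte happens to sit at that offset. With the checked form, the caller gets a null pointer and can skip the file.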
llvm-svn: 259413
When generating a relocatable file, it's only valid to set this flag if
all of the inputs also had the flag. Otherwise we may atomize incorrectly
when we link the relocatable file again.
Reviewed by Lang Hames.
Differential Revision: http://reviews.llvm.org/D16018
llvm-svn: 257976
The __eh_frame section contains relocations which can always be implicitly generated.
This patch tracks whether sections have only implicit relocations and skips emitting them to the object file if that is the case.
The test case here ensures that this is the case for __eh_frame sections.
Reviewed by Lang Hames.
http://reviews.llvm.org/D15594
llvm-svn: 257099
The final section order in relocatable files was just a side effect
of the atom sorter. This meant that sections like __data were before
__text because __data has RW permissions and __text RX and RW was less
than RX in our enum.
Final linked images had an actual section/segment sorter. There was no
reason for the difference, so simplify a bunch of code and just use the
same sorter for everything.
Reviewed by Lang Hames.
http://reviews.llvm.org/D15868
llvm-svn: 256786
This is a basic initial implementation of the -flat_namespace and
-undefined options for LLD-darwin. It ignores several subtleties,
but the result is close enough that we can now link LLVM (but not
clang) on Darwin and pass all regression tests.
llvm-svn: 248732