When we are in a error state, script parser will not parse the -defsym
expression and hence will not tokenize it. Then ScriptLexer::Pos will be 0
and LLD will assert and crash here:
MemoryBufferRef ScriptLexer::getCurrentMB() {
assert(!MBs.empty() && Pos > 0); // Bang !
Solution - stop parsing the defsym in a error state. That is consistent
with the regular case (when we parse the linker script).
llvm-svn: 347549
The uint32_t type does not clearly convey that these fields are interpreted
in the target endianness. Converting them to byte arrays should make this
more obvious and less error-prone.
Patch by James Clarke
Differential Revision: http://reviews.llvm.org/D54207
llvm-svn: 346893
We need this to support 32-bit ARM. Add test cases for emulation
handling for this architecture as well.
Differential Revision: https://reviews.llvm.org/D53539
llvm-svn: 344976
This patch adds a support for OUTPUT_FORMAT linker script directive.
Since I'm not 100% confident with BFD names you can use in the directive
for all architectures, I added only a few in this patch. We can add
other names for other archtiectures later.
We still do not support triple-style OUTPUT_FORMAT directive, namely,
OUTPUT_FORMAT(bfdname, big, little). If you pass -EL (little endian)
or -EB (big endian) to the linker, GNU linkers pick up big or little
as a BFD name, correspondingly, so that you can use a single linker
script for bi-endian processor. I'm not sure if we really need to
support that, so I'll leave it alone for now.
Note that -m takes precedence over OUTPUT_FORAMT, but we always parse
a BFD name given to OUTPUT_FORMAT for error checking. You cannot write
an invalid name in the OUTPUT_FORMAT directive.
Differential Revision: https://reviews.llvm.org/D53495
llvm-svn: 344952
GNU ld's manual says that TARGET(foo) is basically an alias for
`--format foo` where foo is a BFD target name such as elf64-x86-64.
Unlike GNU linkers, lld doesn't allow arbitrary BFD target name for
--format. We accept only "default", "elf" or "binary". This makes
situation a bit tricky because we can't simply make TARGET an alias for
--target.
A quick code search revealed that the usage number of TARGET is very
small, and the only meaningful usage is to switch to the binary mode.
Thus, in this patch, we handle only TARGET(elf.*) and TARGET(binary).
Differential Revision: https://reviews.llvm.org/D48153
llvm-svn: 339060
This is PR36768.
Linker script OVERLAYs are described in 4.6.9. Overlay Description of the spec:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/sections.html
They are used to allow output sections which have different LMAs but the same VAs
and used for embedded programming.
Currently, LLD restricts overlapping of sections and that seems to be the most desired
behaviour for defaults. My thoughts about possible approaches for PR36768 are on the bug page,
this patch implements OVERLAY keyword and allows VAs overlapping for sections that within the overlay.
Differential revision: https://reviews.llvm.org/D44780
llvm-svn: 335714
Currently, LLD supports ASSERT as a separate command.
We support two forms now.
Assign expression-form: . = ASSERT(0x100)
(old GNU ld required it and some scripts in the wild are still using
something like . = ASSERT((_end - _text <= (512 * 1024 * 1024)), "kernel image bigger than KERNEL_IMAGE_SIZE");
Nowadays above is not a mandatory form and command-like form is commonly used:
ASSERT(<expr>, "text);
The return value of the ASSERT is Dot. That was implemented in D30171.
It looks like (2) is just a short version of (1) then.
GNU ld does *not* list ASSERT as a SECTIONS command:
https://sourceware.org/binutils/docs/ld/SECTIONS.html#SECTIONS
Given above we probably can change ASSERT to be an assignment to Dot.
That makes the rest of the code much simpler. Patch do that.
Differential revision: https://reviews.llvm.org/D45434
llvm-svn: 330814
I'm proposing a new command line flag, --warn-backrefs in this patch.
The flag and the feature proposed below don't exist in GNU linkers
nor the current lld.
--warn-backrefs is an option to detect reverse or cyclic dependencies
between static archives, and it can be used to keep your program
compatible with GNU linkers after you switch to lld. I'll explain the
feature and why you may find it useful below.
lld's symbol resolution semantics is more relaxed than traditional
Unix linkers. Therefore,
ld.lld foo.a bar.o
succeeds even if bar.o contains an undefined symbol that have to be
resolved by some object file in foo.a. Traditional Unix linkers
don't allow this kind of backward reference, as they visit each
file only once from left to right in the command line while
resolving all undefined symbol at the moment of visiting.
In the above case, since there's no undefined symbol when a linker
visits foo.a, no files are pulled out from foo.a, and because the
linker forgets about foo.a after visiting, it can't resolve
undefined symbols that could have been resolved otherwise.
That lld accepts more relaxed form means (besides it makes more
sense) that you can accidentally write a command line or a build
file that works only with lld, even if you have a plan to
distribute it to wider users who may be using GNU linkers. With
--check-library-dependency, you can detect a library order that
doesn't work with other Unix linkers.
The option is also useful to detect cyclic dependencies between
static archives. Again, lld accepts
ld.lld foo.a bar.a
even if foo.a and bar.a depend on each other. With --warn-backrefs
it is handled as an error.
Here is how the option works. We assign a group ID to each file. A
file with a smaller group ID can pull out object files from an
archive file with an equal or greater group ID. Otherwise, it is a
reverse dependency and an error.
A file outside --{start,end}-group gets a fresh ID when
instantiated. All files within the same --{start,end}-group get the
same group ID. E.g.
ld.lld A B --start-group C D --end-group E
A and B form group 0, C, D and their member object files form group
1, and E forms group 2. I think that you can see how this group
assignment rule simulates the traditional linker's semantics.
Differential Revision: https://reviews.llvm.org/D45195
llvm-svn: 329636
Currently, LLD print symbol assignment commands to the map file,
but it does not do that for assignments that are outside of the section
descriptions. Such assignments can affect the layout though.
The patch implements the following:
* Teaches LLD to print symbol assignments outside of section declaration.
* Teaches LLD to print PROVIDE/HIDDEN/PROVIDE hidden commands.
In case when symbol is not provided, nothing will be printed.
Differential revision: https://reviews.llvm.org/D44894
llvm-svn: 329272
Patch teaches LLD to print BYTE/SHORT/LONG/QUAD and
location move commands to the map file.
Differential revision: https://reviews.llvm.org/D44004
llvm-svn: 327612
This finishes PR35877.
INSERT BEFORE used similar to INSERT AFTER,
it inserts sections before the given target section.
Differential revision: https://reviews.llvm.org/D44380
llvm-svn: 327378
AFTER keyword is mandatory and consume() was
used by mistake here. We accepted broken script before
this patch, testcase shows the issue.
llvm-svn: 327260
This implements INSERT AFTER in a following way:
During reading scripts it collects all insert statements.
After we done and read all files it inserts statements into script commands list.
With that:
* Rest of code does know nothing about INSERT.
* Approach is straightforward and have no visible limitations.
* It is also easy to support INSERT BEFORE (was seen in clang code once).
* Should work for PR35877 and similar cases.
Cons:
* It assumes we have "main" scripts that describes sections.
Differential revision: https://reviews.llvm.org/D43468
llvm-svn: 327003
"division by zero" or "modulo by zero" are not
very informative errors and even probably confusing
as does not let to know that error is coming from linker script.
Patch adds location reporting.
Differential revision: https://reviews.llvm.org/D43934
llvm-svn: 326686
LLD crashes with broken scripts shown in testcase,
because fails to read memory regon name and accesses
MemoryRegions's element which is nullptr.
Patch fixes it.
Differential revision: https://reviews.llvm.org/D43866
llvm-svn: 326431
This is PR36515.
Currenly if we have a script like .debug_info 0 : { *(.debug_info) },
we would not remove this section and keep it in the output.
That does not work, because it is common case for
debug sections to have a zero address expression.
Patch changes behavior so that we remove only sections
that do not use symbols in its expressions.
Differential revision: https://reviews.llvm.org/D43863
llvm-svn: 326430
In GNU linkers, the last semicolon is optional. We can't link libstdc++
with lld because of that difference.
Differential Revision: https://reviews.llvm.org/D42820
llvm-svn: 324036
The problem we had with it is that anything inside an AT is an
expression, so we failed to parse the section name because of the - in
it.
llvm-svn: 322801
parseInt assumed that it could take a negative number literal (e.g.
"-123"). However, such number is in reality already handled as a
unary operator '-' followed by a number literal, so the number
literal is always non-negative. Thus, this code is dead.
llvm-svn: 322453
AT> lma_region expression allows to specify the memory region
for section load address.
Should fix PR35684.
Differential revision: https://reviews.llvm.org/D41397
llvm-svn: 322359
When two linker script symbols are subtracted, the result should be absolute.
This is the behavior of binutils' ld.
Patch by Erick Reyes!
llvm-svn: 321390
Summary:
This matches the behaviour of ld.bfd:
https://sourceware.org/binutils/docs/ld/Options.html#Options
If scriptfile does not exist in the current directory, ld looks for it in
the directories specified by any preceding '-L' options. Multiple '-T'
options accumulate.
Reviewers: ruiu, grimar
Reviewed By: ruiu, grimar
Subscribers: emaste, llvm-commits
Differential Revision: https://reviews.llvm.org/D40129
llvm-svn: 318655
When findMemoryRegion do search to find a region for output section it
iterates over MemoryRegions which is DenseMap and so does not
guarantee iteration in insertion order. As a result selected region depends
on its name and not on its definition position
Testcase shows the issue, patch fixes it. Behavior after applying the patch
seems consistent with bfd.
Differential revision: https://reviews.llvm.org/D39544
llvm-svn: 317307
Summary:
The COFF linker and the ELF linker have long had similar but separate
Error.h and Error.cpp files to implement error handling. This change
introduces new error handling code in Common/ErrorHandler.h, changes the
COFF and ELF linkers to use it, and removes the old, separate
implementations.
Reviewers: ruiu
Reviewed By: ruiu
Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D39259
llvm-svn: 316624
This is PR34886.
SUBALIGN command currently triggers failture if result expression
is zero. Patch fixes the issue, treating zero as 1, what is consistent with
other places and ELF spec it seems.
Patch also adds "is power of 2" check for this and other expressions
returning alignment.
Differential revision: https://reviews.llvm.org/D38846
llvm-svn: 316580
This patch doesn't change the behavior of the program because it
would eventually return None at end of the function. But it is
better to return None early if we know it will eventually happen.
llvm-svn: 315495
Usually, a function that does symbol lookup takes symbol name as
its first argument. Also, if a function takes a source location hint,
it is usually the last parameter. So the previous parameter order
was counter-intuitive.
llvm-svn: 315433
"Commands" was ambiguous because in the linker script, everything is
a command. We used to handle only SECTIONS commands, and at the time,
it might make sense to call them the commands, but it is no longer
the case. We handle not only SECTIONS but also MEMORY, PHDRS, VERSION,
etc., and they are all commands.
llvm-svn: 315409
HasSections is true if there is at least one SECTIONS linker
script command, and it is not directly related to whether we have
section objects or not. So I think the new name is better.
llvm-svn: 315405
ScriptConfiguration was a class to contain parsed results of
linker scripts. LinkerScript is a class to interpret it.
That ditinction was needed because we haven't instantiated
LinkerScript early (because, IIRC, LinkerScript class was a
ELFT template function). So, when we parse linker scripts,
we couldn't directly store the result to a LinkerScript instance.
Now, that limitation is gone. We instantiate LinkerScript
at the very beginning of our main function. We can directly
store parse results to a LinkerScript instance.
llvm-svn: 315403
This patch goes back to considering ForceAbsolute in moveAbsRight, but
only if the second argument is not already absolute.
With this we can handle "foo + ABSOLUTE(foo)" and "ABSOLUTE(foo) + foo".
llvm-svn: 313800
The idea of this function is to simplify the implementation of binary
operators like add.
A value might be absolute because of an ABSOLUTE expression, but it
still depends on the value of a section and we might not be able to
evaluate it early. We should keep such values on the LHS, so that we
can delay the evaluation.
We can now handle both "1 + ABSOLUTE(foo)" and "ABSOLUTE(foo) + 1".
llvm-svn: 313794
This allows combining --dynamic-list and version scripts too. The
version script controls which symbols are visible, and
--dynamic-list controls which of those are preemptible.
Unlike previous versions, undefined symbols are still considered
preemptible, which was the issue breaking the cfi tests.
This fixes pr34053.
llvm-svn: 312806
REGION_ALIAS(alias, region)
Alias names can be added to existing memory regions created with
the MEMORY command. Each name corresponds to at most one
memory region.
Differential revision: https://reviews.llvm.org/D37477
llvm-svn: 312777
If --dynamic-list is given, only those symbols are preemptible.
This allows combining --dynamic-list and version scripts too. The
version script controls which symbols are visible, and --dynamic-list
controls which of those are preemptible.
This fixes pr34053.
llvm-svn: 312757
Patch by Rafael Espíndola.
This is PR34053.
The implementation is a bit of a hack, given the precise location where
IsPreemtible is set, it cannot be used from
SymbolTable::handleAnonymousVersion.
I could add another method to SymbolTable if you think that would be
better.
Differential Revision: https://reviews.llvm.org/D36499
llvm-svn: 311468
Summary: This small patch adds the support for ! operator in linker scripts.
Reviewers: ruiu, rafael
Reviewed By: ruiu
Subscribers: meadori, grimar, emaste, llvm-commits
Differential Revision: https://reviews.llvm.org/D36451
llvm-svn: 310607
D35945 introduces change when there is useless to check Error flag
in few places, but ErrorCount must be checked instead.
But then we probably can just check ErrorCount always. That should simplify
things. Patch do that.
Differential revision: https://reviews.llvm.org/D36266
llvm-svn: 310046
Following possible scripts triggered accessing to Target when it was not yet
initialized (was nullptr).
MEMORY { name : ORIGIN = DATA_SEGMENT_RELRO_END; }
MEMORY { name : ORIGIN = CONSTANT(COMMONPAGESIZE); }
Patch errors out instead.
Differential revision: https://reviews.llvm.org/D36140
llvm-svn: 309953
Previously we would crash when tried to ALIGN(0).
Patch uses value 1 instead in this case, that
looks to be consistent with GNU linkers
and reasonable and simple behavior itself.
Differential revision: https://reviews.llvm.org/D35942
llvm-svn: 309372
This is a bit of a hack, but it is *so* convenient.
Now that we create synthetic linker scripts when none is provided, we
always have to handle paired OutputSection and OutputsectionCommand and
keep a mapping from one to the other.
This patch simplifies things by merging them and creating what used to
be OutputSectionCommands really early.
llvm-svn: 309311
This patch fixes a small issue with respect to how memory region names
are parsed on output section descriptions. For example, consider:
.text : { *(.text) } > rom
That can also be written like:
.text : { *(.text) } >rom
The latter form is accepted by GNU LD and is fairly common.
Differential Revision: https://reviews.llvm.org/D35920
llvm-svn: 309191
Also add the test cases for the addition and subtraction both for
the relative and absolute case.
Differential Revision: https://reviews.llvm.org/D35346
llvm-svn: 308692
Summary:
If the linker is invoked with `--chroot /foo` and `/bar/baz.o`, it
tries to read the file from `/foo/bar/baz.o`. This feature is useful
when you are dealing with files created by the --reproduce option.
Reviewers: grimar
Subscribers: llvm-commits, emaste
Differential Revision: https://reviews.llvm.org/D35517
llvm-svn: 308646
Functions declared in Strings.h should provide generic string operations
for the linker, but some of them are too specific to some features. This
patch moves them to the location where they are used.
llvm-svn: 307949
The assignAddresses() function accumulates state in the LinkerScript that
prevents it from being called multiple times. This change moves the state
into a separate structure AddressState that is created at the start of the
function and disposed of at the end.
CurAddressState is used rather than passing a reference to the state as a
parameter to the functions used by assignAddresses(). This is because the
getSymbolValue function needs to be executed in the context of AddressState
but it is stored in ScriptParser when AddressState is not available.
The AddressState is also used in a limited context by processCommands()
Differential Revision: https://reviews.llvm.org/D34345
llvm-svn: 307367
Previously, it couldn't parse
SECTIONS .text (0x1000) : { *(.text) }
because "(" was interpreted as the begining of the "(NOLOAD)" directive.
llvm-svn: 305006
When linking linux kernel LLD currently reports next errors:
ld: error: unable to evaluate expression: input section .head.text has no output section assigned
ld: error: At least one side of the expression must be absolute
ld: error: At least one side of the expression must be absolute
That does not provide file/line information and overall looks unclear.
Patch adds location information to ExprValue and that allows
to provide more clear error messages.
Differential revision: https://reviews.llvm.org/D33943
llvm-svn: 304881
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
We were looking up sections by name during expression evaluation. By
keeping track of forward declarations we can do the lookup during
script parsing.
Doing the lookup earlier will be more efficient when assignAddresses
is run twice and removes two uses of OutputSections.
llvm-svn: 304381
While the following expression is handled fine:
PROVIDE_HIDDEN(newsym = oldsym + address);
The following expression triggers an error because the expression
is evaluated as absolute:
PROVIDE_HIDDEN(newsym = ALIGN(oldsym, CONSTANT(MAXPAGESIZE)) + address);
To avoid this error, we use late evaluation for ALIGN by making the
alignment an attribute of the expression itself.
Differential Revision: https://reviews.llvm.org/D33629
llvm-svn: 304185
Switch to llvm::to_integer() everywhere in LLD instead of
StringRef::getAsInteger() because API of latter is confusing.
It returns true on error and false otherwise what makes reading
the code incomfortable.
Differential revision: https://reviews.llvm.org/D33187
llvm-svn: 303149
Adds support for the ORIGIN and LENGTH linker script built in functions.
ORIGIN(memory) Return the origin of the memory region
LENGTH(memory) Return the length of the memory region
Redo of D29775 for refactored linker script parsing.
Patch by Robert Clarke
Differential Revision: https://reviews.llvm.org/D32934
llvm-svn: 302564
"read" is used as a prefix for functions that read tokens from input
streams. This function doesn't really read anything, but just parses
a given string as an integer, so rename.
llvm-svn: 300281
Previously, we allowed only integers in this context. Now you can
write expressions there. LLD is now able to handle the following
linker, for example.
MEMORY { rom (rx) : ORIGIN = (1024 * 1024) }
llvm-svn: 300131
Fixes PR32572.
When
(a) a library has no soname
and (b) library is given on the command line with path (and not through -L/-l flags)
DT_NEEDED entry for such library keeps the path as given.
This behavior is consistent with gold and bfd, and is used in compiler-rt test suite.
This is a second attempt after r300007 got reverted. This time relro-omagic test is
changed in a way to avoid hardcoding the path to the test directory in the objdump'd
binary.
llvm-svn: 300011
Fixes PR32572.
When
(a) a library has no soname
and (b) library is given on the command line with path (and not through -L/-l flags)
DT_NEEDED entry for such library keeps the path as given.
This behavior is consistent with gold and bfd, and is used in compiler-rt test suite.
llvm-svn: 300007
Symbols referenced by linker scripts are not necessarily be undefined,
so the previous name didn't convey the meaining of the variable.
llvm-svn: 299573