LinkerScript.cpp contains both the linker script processor and the
linker script parser. I put both into a single file, but the file grown
too large, so it's time to put them into two different files.
llvm-svn: 299515
This requires collectign all symbols referenced in the linker script
and adding them to symbol table as undefined symbol.
Differential Revision: https://reviews.llvm.org/D31147
llvm-svn: 298577
LinkerScript used to be a template class, so we couldn't instantiate
that class in elf::link. We instantiated ScriptConfig class earlier
instead so that the linker script parser can store configurations to
the object.
Now that LinkerScript is not a template, it doesn't make sense to
separate ScriptConfig from LinkerScript. This patch merges them.
llvm-svn: 298457
This fixes pr32031 by representing the expressions results as a
SectionBase and offset. This allows us to use an input section
directly instead of getting lost trying to compute an offset in an
outputsection when not all the information is available yet.
This also creates a struct to represent the *value* of and expression,
allowing the expression itself to be a simple typedef. I think this is
easier to read and will make it easier to extend the expression
computation to handle more complicated cases.
llvm-svn: 298079
This also requires postponing the assignment the assignment of
symbols defined in input linker scripts since those can refer to
output sections and in case we don't have a SECTIONS command, we
need to wait until all output sections have been created and
assigned addresses.
Differential Revision: https://reviews.llvm.org/D30851
llvm-svn: 297802
That moves all members that s possible to move for now (all which
does not depend on ELFT templating).
After that change LinkerScript contains only 8 methods in total,
and I believe it is possible to move them all after tweaking other
parts of linker. And we will be able to have single class for
linkerscript at the end.
llvm-svn: 297735
We can move all not templated functionality to LinkerScriptBase.
Patch do that for hasPhdrsCommands() and shows how it helps to detemplate
things in other places.
Probably we should be able to merge these 2 classes into single one after such steps.
Even if not, it still looks as reasonable cleanup for me.
Differential revision: https://reviews.llvm.org/D30895
llvm-svn: 297714
With this we have a single section hierarchy. It is a bit less code,
but the main advantage will be in a future patch being able to handle
foo = symbol_in_obj;
in a linker script. Currently that fails since we try to find the
output section of symbol_in_obj. With this we should be able to just
return an InputSection from the expression.
llvm-svn: 297313
With the current design an InputSection is basically anything that
goes directly in a OutputSection. That includes plain input section
but also synthetic sections, so this should probably not be a
template.
llvm-svn: 295993
Previously we evaluated the values of LMA incorrectly for next cases:
.text : AT(ADDR(.text) - 0xffffffff80000000) { ... }
.data : AT(ADDR(.data) - 0xffffffff80000000) { ... }
.init.begin : AT(ADDR(.init.begin) - 0xffffffff80000000) { ... }
Reason was that we evaluated offset when VA was not assigned. For case above
we ended up with 3 loads that has similar LMA and it was incorrect.
That is critical for linux kernel.
Patch updates the offset after VA calculation. That fixes the issue.
Differential revision: https://reviews.llvm.org/D30163
llvm-svn: 295722
Previously LLD would error out just "ld.lld: error: unable to move location counter backward"
What does not really reveal the place of issue,
Patch adds location to the output.
Differential revision: https://reviews.llvm.org/D30187
llvm-svn: 295720
This case should be possible to handle, but it is hard:
* In order to create program headers correctly, we have to scan the
sections in the order they are in the file.
* To find that order, we have to "execute" the linker script.
* The linker script can contain SIZEOF_HEADERS.
So to support this we have to start with a guess of how many headers
we need (3), run the linker script and try to create the program
headers. If it turns out we need more headers, we run the script again
with a larger SIZEOF_HEADERS.
Also, running the linker script depends on knowing the size of the
sections, so we have to finalize them. But creating the program
headers can change the value stored in some sections, so we have to
split size finalization and content finalization.
Looks like the last part is also needed for range extension thunks, so
we might support this at some point. For now just report an error
instead of producing broken files.
llvm-svn: 295458
As specified here:
* https://sourceware.org/binutils/docs/ld/MEMORY.html#MEMORY
There are two deviations from what is specified for GNU ld:
1. Only integer constants and *not* constant expressions
are allowed in `LENGTH` and `ORIGIN` initializations.
2. The `I` and `L` attributes are *not* implemented.
With (1) there is currently no easy way to evaluate integer
only constant expressions. This can be enhanced in the
future.
With (2) it isn't clear how these flags map to the `SHF_*`
flags or if they even make sense for an ELF linker.
Differential Revision: https://reviews.llvm.org/D28911
llvm-svn: 292875
The feature is documented as
-----------------------------
The format of the dynamic list is the same as the version node
without scope and node name. See *note VERSION:: for more
information.
--------------------------------
And indeed qt uses a dynamic list with an 'extern "C++"' in it. With
this patch we support that
The change to gc-sections-shared makes us match bfd. Just because we
kept bar doesn't mean it has to be in the dynamic symbol table.
The changes to invalid-dynamic-list.test and reproduce.s are because
of the new parser.
The changes to version-script.s are the only case where we change
behavior with regards to bfd, but I would like to see a mix of
--version-script and --dynamic-list used in the wild before
complicating the code.
llvm-svn: 289082
Linker script doesn't create a section if it has no content. So the following
script doesn't create .norelocs section if it doesn't have any .rel* sections.
.norelocs : { *(.rel*) }
Later, if you assert that the size of .norelocs is 0, LLD printed out
an error message, because it didn't allow calling SIZEOF() on nonexistent
sections.
This patch allows SIZEOF() on nonexistent sections, so that you can do
something like this.
ASSERT(SIZEOF(.norelocs), "shouldn't contain .rel sections!")
Note that this behavior is compatible with GNU.
Differential Revision: https://reviews.llvm.org/D26810
llvm-svn: 287257
Propagate program headers by walking the commands, not the
sections. This allows us to propagate program headers even from
sections that don't end up in the output.
Fixes pr30997.
llvm-svn: 286837
The disadvantage is that we use uint64_t instad of uint32_t for some
value in 32 bit files. The advantage is a substantially simpler code,
faster builds and less code duplication.
llvm-svn: 286414
Previously, we do this piece of code to iterate over all input sections.
for (elf::ObjectFile<ELFT> *F : Symtab.getObjectFiles())
for (InputSectionBase<ELFT> *S : F->getSections())
It turned out that this mechanisms doesn't work well with synthetic
input sections because synthetic input sections don't belong to any
input file.
This patch defines a vector that contains all input sections including
synthetic ones.
llvm-svn: 286051
Previously, it didn't support the character class, so we couldn't
eliminate the use fo llvm::Regex. Now that it is supported, we
can remove compileGlobPattern, which converts a glob pattern to
a regex.
This patch contains optimization for exact/prefix/suffix matches.
Differential Revision: https://reviews.llvm.org/D26284
llvm-svn: 285949
This can speed up lld up to 5 times when linking applications
with large number of sections and using linker script.
Differential revision: https://reviews.llvm.org/D26241
llvm-svn: 285895
With this patch we keep track of the fact that . is a position in the
file and therefore not absolute. This allow us to compute relative
relocations that involve symbol that are defined in linker scripts
with '.'.
This fixes https://llvm.org/bugs/show_bug.cgi?id=30406
There is still more work to track absoluteness over the various
expressions, but this should unblock linking the EFI bootloader.
llvm-svn: 285641
We parse linker scripts very early, but whether an expression is
absolute or not can depend on a symbol defined in a .o. Given that, we
have to delay the computation of IsAbsolute. We can do that by storing
an AST when parsing or by also making IsAbsolute a function like we do
for the expression value. This patch implements the second option.
llvm-svn: 285628
And as a token of the new feature, make ALIGNOF always absolute.
This is a step in making it possible to have non absolute symbols out
of output sections.
llvm-svn: 285608
Previously, we have a lot of BumpPtrAllocators, but all these
allocators virtually have the same lifetime because they are
not freed until the linker finishes its job. This patch aggregates
them into a single allocator.
Differential revision: https://reviews.llvm.org/D26042
llvm-svn: 285452
If there is not sufficient address space, just give up and don't put
the header in the PT_LOAD.
This matches bfd behaviour and I found at least one script that
depends on having a section at address 0.
llvm-svn: 282750
The BYTE, SHORT, LONG, and QUAD commands store one, two, four, and eight bytes (respectively).
After storing the bytes, the location counter is incremented by the number of bytes
stored.
Previously our scripts handles these commands incorrectly. For example:
SECTIONS {
.foo : {
*(.foo.1)
BYTE(0x11)
...
We accepted the script above treating BYTE as input section description.
These commands are used in the wild though.
Differential revision: https://reviews.llvm.org/D24830
llvm-svn: 282429
DEFINED(symbol)
Return 1 if symbol is in the linker global symbol table and is defined before
the statement using DEFINED in the script, otherwise return 0.
Can be used to define default values for symbols. Found it in the wild.
Differential revision: https://reviews.llvm.org/D24858
llvm-svn: 282245
This is PR30442.
Previously we were failed to parce complex expressions like:
foo : { *(SORT_BY_NAME(bar) zed) }
Main idea of patch that globs and excludes can be wrapped in a SORT.
There is a difference in semanics of ld/gold:
ld likes:
*(SORT(EXCLUDE_FILE (*file1.o) .foo.1))
gold likes:
*(EXCLUDE_FILE (*file1.o) SORT(.foo.1))
Patch implements ld grammar, complex expressions like
next is not a problem anymore:
.abc : { *(SORT(.foo.* EXCLUDE_FILE (*file1.o) .bar.*) .bar.*) }
Differential revision: https://reviews.llvm.org/D24758
llvm-svn: 282078
This reverts commit r282021, bringing back r282015.
The problem was that the comparison function was not a strict weak
ordering anymore, which this patch fixes.
Original message:
Only restrict order if both sections are in the script.
This matches gold and bfd behavior and is required to handle some scripts.
The script has to assume where PT_LOADs start in order to align that
spot. If we don't allow section it doesn't know about to move to the
middle, we can need more PT_LOADs and those will not be aligned.
llvm-svn: 282035
With fix for 2 bots. Details about the fix performed is on a review page.
Initial commit message:
This is PR30387:
From PR description:
We fail to parse
SECTIONS
{
foo :
{
*(sec0 EXCLUDE_FILE (zed1.o) sec1 EXCLUDE_FILE (zed2.o) sec2 )
}
}
The semantics according to bfd are:
Include sec1 from every file but zed1.o
Include sec2 from every file but zed2.o
Include sec0 from every file
Patch implements the support.
Differential revision: https://reviews.llvm.org/D24650
llvm-svn: 281754
This fixes pr30367, but more importantly, it changes how we compute offsets.
Now offset computation in a walk over linker script commands, like the
rest of assignAddresses. IMHO this is simpler to understand and if we
ever have to create multiple outputsections or chunks to change how we
handle test/ELF/linkerscript/alternate-sections.s it should be easier
to do it.
llvm-svn: 281736
This is PR30387:
From PR description:
We fail to parse
SECTIONS
{
foo :
{
*(sec0 EXCLUDE_FILE (zed1.o) sec1 EXCLUDE_FILE (zed2.o) sec2 )
}
}
The semantics according to bfd are:
Include sec1 from every file but zed1.o
Include sec2 from every file but zed2.o
Include sec0 from every file
Patch implements the support.
Differential revision: https://reviews.llvm.org/D24650
llvm-svn: 281721
This is PR30386,
SORT_BY_INIT_PRIORITY is a keyword can be used to sort sections by numerical value of the
GCC init_priority attribute encoded in the section name.
Differential revision: https://reviews.llvm.org/D24611
llvm-svn: 281646
Absence of it caused a clang warning:
warning: 'lld:🧝:LinkerScriptBase' has virtual functions but non-virtual destructor [-Wnon-virtual-dtor]
At fact we don't need it here because do not destroy this object by
base pointer.
llvm-svn: 280916
Previous way of accessing templated methods was a bit bulky,
Patch introduces small interface based solution.
Differential revision: https://reviews.llvm.org/D23872
llvm-svn: 280910
Use std::regex instead of hand written matcher.
Patch based on code and ideas of Rui Ueyama.
Differential revision: https://reviews.llvm.org/D23829
llvm-svn: 280544
Previously we used LayoutInputSection class to correctly assign
symbols defined in linker script. This patch removes it and uses
pointer to preceding input section in SymbolAssignment class instead.
Differential revision: https://reviews.llvm.org/D23661
llvm-svn: 280348
Symbol assignments outside of SECTIONS command need to be created
even when SECTIONS command is not used.
Differential Revision: https://reviews.llvm.org/D23751
llvm-svn: 280252
Patch removes VersionScriptParser class and moves the members to ScriptParser
It opens road for implementation of VERSION linkerscript command.
Differential revision: https://reviews.llvm.org/D23774
llvm-svn: 280212
You can force input section alignment within an output section by using SUBALIGN. The
value specified overrides any alignment given by input sections, whether larger or smaller.
SUBALIGN is used in many projects in the wild.
Differential revision: https://reviews.llvm.org/D23063
llvm-svn: 279256
The linker will normally set the LMA equal to the VMA.
You can change that by using the AT keyword.
The expression lma that follows the AT keyword specifies
the load address of the section.
Patch implements this keyword.
Differential revision: https://reviews.llvm.org/D19272
llvm-svn: 278911
Previously filtering that was used worked incorrectly.
For example for next script it would just remove both sections completely:
SECTIONS {
. = 0x1000;
.aaa : ONLY_IF_RW { *(.aaa.*) }
. = 0x2000;
.aaa : ONLY_IF_RO { *(.aaa.*) }
}
Patch fixes above issues and adds testcase showing the issue. Testcase is a subset of
FreeBSD script which has:
.eh_frame : ONLY_IF_RO { KEEP (*(.eh_frame)) }
...
.eh_frame : ONLY_IF_RW { KEEP (*(.eh_frame)) }
Differential revision: https://reviews.llvm.org/D23326
llvm-svn: 278486
Previously, we created two or more output sections if there are
input sections with the same name but with different attributes.
That is a wrong behavior. This patch fixes the issue.
One thing we need to do is to merge output section attributes.
Currently, we create an output section based on the first input
section's attributes. This may make a wrong output section
attributes. What we need to do is to bitwise-OR attributes.
We'll do it in a follow-up patch.
llvm-svn: 278461
SIZEOF_HEADERS - Return the size in bytes of the output file’s headers.
It is is a feature used in FreeBsd script, for example.
There is a discussion on PR28688 page about it.
Differential revision: https://reviews.llvm.org/D23165
llvm-svn: 278204
Summary:
The comparator function to compare input sections as instructed by
SORT command was a bit too complicated because it needed to handle
four different cases. This patch split it into two function calls.
This patch also simplifies the parser.
Reviewers: grimar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23140
llvm-svn: 277780
ASSERT(exp, message)
Ensure that exp is non-zero. If it is zero, then exit the linker with an error
code, and print message.
ASSERT is useful and was seen in few projects in the wild.
Differential revision: https://reviews.llvm.org/D22912
llvm-svn: 277710
Previously we supported only sorting by name.
When there are nested section sorting commands in linker script, there can be at most 1
level of nesting for section sorting commands.
SORT_BY_NAME (SORT_BY_ALIGNMENT (wildcard section pattern)). It will sort the input
sections by name first, then by alignment if 2 sections have the same name.
SORT_BY_ALIGNMENT (SORT_BY_NAME (wildcard section pattern)). It will sort the input
sections by alignment first, then by name if 2 sections have the same alignment.
SORT_BY_NAME (SORT_BY_NAME (wildcard section pattern)) is treated the same as SORT_
BY_NAME (wildcard section pattern).
SORT_BY_ALIGNMENT (SORT_BY_ALIGNMENT (wildcard section pattern)) is treated the
same as SORT_BY_ALIGNMENT (wildcard section pattern).
All other nested section sorting commands are invalid.
Patch implements that all above.
Differential revision: https://reviews.llvm.org/D23019
llvm-svn: 277583
Previously, Ignore flag is set if we don't want to assign
a value to symbols. It happens if a symbol assingment is in
PROVIDE() and there's already a symbol with the same name.
The previous code had a subtle but that we assume that the
existing symbol is an absolute symbol even if it is not.
This patch fixes the issue by always overwriting an absolute
symbol.
llvm-svn: 277115
All other singleton instances are accessible globally.
CommonInputSection shouldn't be an exception.
Differential Revision: https://reviews.llvm.org/D22935
llvm-svn: 277034
createSections function is getting longer, so it is time to split it
into small functions. The reason why the function is long is because
it has deeply nested for-loops. This patch constructs temporary data
to reduce nesting level.
Differential Revision: https://reviews.llvm.org/D22786
llvm-svn: 276706
Previously, we handled an expression as a vector of tokens. In other
words, an expression was a vector of uncooked raw StringRefs.
When we need a value of an expression, we used ExprParser to run
the expression.
The separation was needed essentially because parse time is too
early to evaluate an expression. In order to evaluate an expression,
we need to finalize section sizes. Because linker script parsing
is done at very early stage of the linking process, we can't
evaluate expressions while parsing.
The above mechanism worked fairly well, but there were a few
drawbacks.
One thing is that we sometimes have to parse the same expression
more than once in order to find the end of the expression.
In some contexts, linker script expressions have no clear end marker.
So, we needed to recognize balanced expressions and ternary operators.
The other is poor error reporting. Since expressions are parsed
basically twice, and some information that is available at the first
stage is lost in the second stage, it was hard to print out
apprpriate error messages.
This patch fixes the issues with a new approach.
Now the expression parsing is integrated into ScriptParser.
ExprParser class is removed. Expressions are represented as lambdas
instead of vectors of tokens. Lambdas captures information they
need to run themselves when they are created.
In this way, ends of expressions are naturally detected, and
errors are handled in the usual way. This patch also reduces
the amount of code.
Differential Revision: https://reviews.llvm.org/D22728
llvm-svn: 276574
LinkerScript<ELFT>::assignAddresses is becoming larger and looks
it can be good time for splitting. I expect to can more SectionsCommand's there,
and dispatching some of them separatelly can help to keep method smaller either.
Differential revision: https://reviews.llvm.org/D22506
llvm-svn: 276300
This adds InputSectionDescription command to represent
the input section declaration.
This leads to next cleanup:
SectionRule removed.
ScriptConfiguration::Sections mamber removed.
LinkerScript<ELFT>::getOutputSection() removed.
Differential revision: https://reviews.llvm.org/D22617
llvm-svn: 276283
Approach uses LLVM-style RTTI for representing the linker script
commands in a form of tree for future simplification of parsing.
Core idea and code sample belongs to Rui Ueyama.
Differential revision: https://reviews.llvm.org/D22604
llvm-svn: 276243
This patch simplifies output section management by making
Factory class have ownership of sections that creates.
Differential Revision: https://reviews.llvm.org/D22575
llvm-svn: 276141