When -verbose is specified, patch outputs names of each input orphan section
assigned to output.
Differential revision: https://reviews.llvm.org/D37517
llvm-svn: 314098
The previous logic was to try to detect if a linker script defined _gp
by checking !ElfSym::MipsGp->Value. That doesn't work in all cases as
the assigned value can be 0.
We now just always defined it Writer.cpp and always overwrite it
afterwards if needed.
llvm-svn: 313788
Normally to find the offset of a value in a section, we have to
compute the value since the alignment is defined on the final address.
If the alignment is trivial, we can skip the value computation. This
allows us to know the offset even in cases where we cannot yet know
the value.
llvm-svn: 313777
We try to evaluate expressions early when possible, but it is not
possible to evaluate them early if they are based on a section.
Before we would get this wrong on ABSOLUTE expressions.
llvm-svn: 313764
This patch removes lot of static Instances arrays from different input file
classes and introduces global arrays for access instead. Similar to arrays we
have for InputSections/OutputSectionCommands.
It allows to iterate over input files in a non-templated code.
Differential revision: https://reviews.llvm.org/D35987
llvm-svn: 313619
Does not seem we need to set SectionIndex here.
It is set in finalizeSections() later.
Differential revision: https://reviews.llvm.org/D37815
llvm-svn: 313522
When given
foobar = ALIGN(., 0x100);
my expectation from what the manual says is that the final address of
foobar will be aligned. It seems that bfd aligns the offset in the
section, which causes some odd results if the section is not 0x100
aligned. Gold aligns the address.
This changes lld to align the final address.
llvm-svn: 312979
to separate commons based on file name patterns. The following linker script
construct does not work because commons are allocated before section placement
is done and the only synthesized BssSection that holds all commons has no file
associated with it:
SECTIONS { .common_0 : { *file0.o(COMMON) }}
This patch changes the allocation of commons to create a section per common
symbol and let the section logic do the layout.
Differential revision: https://reviews.llvm.org/D37489
llvm-svn: 312796
REGION_ALIAS(alias, region)
Alias names can be added to existing memory regions created with
the MEMORY command. Each name corresponds to at most one
memory region.
Differential revision: https://reviews.llvm.org/D37477
llvm-svn: 312777
It is a bit more convinent and helps to simplify logic
of program headers allocation a little.
Differential revision: https://reviews.llvm.org/D34956
llvm-svn: 312711
Previously LLD did not calculate LMAOffset correctly when
AT and MEMORY were used together.
Patch fixes PR34407.
Differential revision: https://reviews.llvm.org/D37469
llvm-svn: 312625
linker script SECTION rules. This patch extends it to use a fully specified
file name as it appears in --trace output to match agains, i.e,
"<path>/<objname>.o" or "<path>/<libname>.a(<objname>.o)".
Differential Revision: https://reviews.llvm.org/D37031
llvm-svn: 311713
Currently, LLD checks whether there's enough space for headers by
checking if headers fit below the address of the first allocated
section. However, that's always thue if the binary doesn't start
at zero which means that LLD always emits a segment for headers,
even if no other sections belong to that segment.
This is a problem in cases when linker script is being used with a
non-zero start address when we don't want to make the headers visible
by not leaving enough space for them. This pattern is common in
embedded programming but doesn't work in LLD.
This patch changes the behavior of LLD in case when linker script
is being to match the behavior of BFD ld and gold, which is to only
place headers into a segment when they're covered by some output
section.
Differential Revision: https://reviews.llvm.org/D36256
llvm-svn: 311586
Previously we would crash on samples from testcase,
because were trying to access zero pointer to output section.
Differential revision: https://reviews.llvm.org/D36145
llvm-svn: 311311
We would previously crash on next script:
MEMORY { name : ORIGIN = .; }
Patch fixes that.
Differential revision: https://reviews.llvm.org/D36138
llvm-svn: 311073
With this Symbol has the same size as before, but DefinedRegular goes
from 72 to 64 bytes.
I also find this a bit easier to read. There are fewer places
initializing File for example.
This has a small but measurable speed improvement on all tests (1%
max).
llvm-svn: 310142
This is PR33889,
Patch adds support of combination of linkerscript and
-symbol-ordering-file option.
If no sorting commands are present in script inside section declaration
and no --sort-section option specified, code uses sorting from ordering
file if any exist.
Differential revision: https://reviews.llvm.org/D35843
llvm-svn: 310045
This is a bit of a hack, but it is *so* convenient.
Now that we create synthetic linker scripts when none is provided, we
always have to handle paired OutputSection and OutputsectionCommand and
keep a mapping from one to the other.
This patch simplifies things by merging them and creating what used to
be OutputSectionCommands really early.
llvm-svn: 309311
That is slightly more convinent as allows to store pointer on
program header entry in a more safe way.
Change was used in 2 patches currently on review.
Differential revision: https://reviews.llvm.org/D35832
llvm-svn: 309253
This is PR33714.
Previously for each input section offset of memory region
was incremented on a size of output section.
That resulted in a wrong error message saying about
overflow. Patch fixes that.
Differential revision: https://reviews.llvm.org/D35803
llvm-svn: 308955
The comment at the top of compareByFilePosition indicates that it relies
on stable_sort to preserve the order of synthetic sections. We were
using sort instead of stable_sort, however, leading to incorrect
synthetic section ordering.
Differential Revision: https://reviews.llvm.org/D35473
llvm-svn: 308207
Functions declared in Strings.h should provide generic string operations
for the linker, but some of them are too specific to some features. This
patch moves them to the location where they are used.
llvm-svn: 307949
Patch removes restriction about moving location counter
backwards outside of output sections declarations.
That may be useful for some apps relying on such scripts,
known example is linux kernel.
Differential revision: https://reviews.llvm.org/D34977
llvm-svn: 307794
r307367 via D34345 split out the temporary address state used within
processCommands() and assignAddresses(). Due to the way that getSymbolValue
is used by the ScriptParser there is no way of giving the current
OutputSection to getSymbolValue() without somehow accessing the created
addressState. The suggestion was that by making a pointer that would go out
of scope we would find out by ASAN/MSAN or a crash if someone had misused
currentAddressState.
Differential Revision: https://reviews.llvm.org/D34345
llvm-svn: 307637
The assignAddresses() function accumulates state in the LinkerScript that
prevents it from being called multiple times. This change moves the state
into a separate structure AddressState that is created at the start of the
function and disposed of at the end.
CurAddressState is used rather than passing a reference to the state as a
parameter to the functions used by assignAddresses(). This is because the
getSymbolValue function needs to be executed in the context of AddressState
but it is stored in ScriptParser when AddressState is not available.
The AddressState is also used in a limited context by processCommands()
Differential Revision: https://reviews.llvm.org/D34345
llvm-svn: 307367
The allocateHeaders() function is called at the end of assignAddresses(), it
decides whether the ELF header and program header table can be allocated to
a PT_LOAD program header. As the function alters state, it prevents
assignAddresses() from being called multiple times.
This change splits out the call to allocateHeaders() from assignAddresses()
this will permit assignAddresses() to be called while processing range
extension thunks without trying to allocateHeaders().
Differential Revision: https://reviews.llvm.org/D34344
llvm-svn: 307131
This patch makes changes to allow sections without the SHF_ALLOC bit to be
assigned to segments in a linker script.
The assignment of output sections to segments is performed in
LinkerScript::createPhdrs. Previously, this function would bail as soon as it
encountered an output section which did not have the SHF_ALLOC bit set, thus
preventing any output section without SHF_ALLOC from being assigned to a
segment.
This restriction has now been removed from LinkerScript::createPhdrs and instead
a check for SHF_ALLOC has been added to LinkerScript::adjustSectionsAfterSorting
to not propagate program headers to sections without SHF_ALLOC which matches the
behaviour of bfd linker scripts.
Differential Revision: https://reviews.llvm.org/D34204
llvm-svn: 307013
I found this while trying to build u-boot. It uses -Ttext in
combination with linker scripts.
My first reaction was to change the linker scripts to have the correct
value, but I found that it is actually quite convenient to have -Ttext
take precedence.
By having just
.text : { *(.text) }
In the script, they can define the text address in a single Makefile
and pass it to ld with -Ttext and for the C code with
-DFoo=value. Doing the same with linker scripts would require them to
be generated during the build.
llvm-svn: 305766
This patch adds support for segment NONE in linker scripts which enables the
specification that a section should not be assigned to any segment.
Note that GNU ld does not disallow the definition of a segment named NONE, which
if defined, effectively overrides the behaviour described above. This feature
has been copied.
Differential Revision: https://reviews.llvm.org/D34203
llvm-svn: 305700
This is probably the main patch left in unifying our intermediary
representation.
It moves the creation of default commands before section sorting. This
has the nice effect that we now have one location where we decide
where an orphan section should be placed.
Before this patch sortSections would decide the relative location of
orphan sections to other sections, but it was up to placeOrphanSection
to decide on the exact location.
We now only sort sections we created since the linker script is
already in the correct order.
llvm-svn: 305512
Currently we do layout as if non alloc sections had an actual address
and then set it to zero. This produces a few odd results where a
symbol has an address that is inconsistent with the section address.
The simplest way to fix it is probably to just set the address earlier.
The behavior of bfd seems to be similar, but it only sets the non
alloc section address is missing from the linker script or if the
script has an explicit " : 0" setting the address of the output
section (which the default script does).
llvm-svn: 305323
Thunks are now generated per InputSectionDescription instead of per
OutputSection. This allows created ThunkSections to be inserted directly
into InputSectionDescription.
Changes in this patch:
- Loop over InputSectionDescriptions to find relocations to Thunks
- Generate a ThunkSection per InputSectionDescription
- Remove synchronize() as we no longer need it
- Move fabricateDefaultCommands() before createThunks
Differential Revision: https://reviews.llvm.org/D33835
llvm-svn: 304887
When linking linux kernel LLD currently reports next errors:
ld: error: unable to evaluate expression: input section .head.text has no output section assigned
ld: error: At least one side of the expression must be absolute
ld: error: At least one side of the expression must be absolute
That does not provide file/line information and overall looks unclear.
Patch adds location information to ExprValue and that allows
to provide more clear error messages.
Differential revision: https://reviews.llvm.org/D33943
llvm-svn: 304881
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
We were looking up sections by name during expression evaluation. By
keeping track of forward declarations we can do the lookup during
script parsing.
Doing the lookup earlier will be more efficient when assignAddresses
is run twice and removes two uses of OutputSections.
llvm-svn: 304381
Before InputSectionBase had an OutputSection pointer, but that was not
always valid. For example, if it was a merge section one actually had
to look at MergeSec->OutSec.
This was brittle and caused bugs like the one fixed by r304260.
We now have a single Parent pointer that points to an OutputSection
for InputSection, but to a SyntheticSection for merge sections and
.eh_frame. This makes it impossible to accidentally access an invalid
OutSec.
llvm-svn: 304338
I found that during visual inspection of code while wrote different patch.
Script in testcase probably have nothing common with real life, but
we segfault currently using it.
If output section is known NOBITS, there is no need to create
writers threads for doing nothing or proccess any filler logic that
is useless here. We can just early return, that is what this patch do.
DIfferential revision: https://reviews.llvm.org/D33646
llvm-svn: 304192
InputSections may contain MergeInputSection members which trigger
a segmentation fault when trying to cast them to InputSection.
Differential Revision: https://reviews.llvm.org/D33628
llvm-svn: 304189
While the following expression is handled fine:
PROVIDE_HIDDEN(newsym = oldsym + address);
The following expression triggers an error because the expression
is evaluated as absolute:
PROVIDE_HIDDEN(newsym = ALIGN(oldsym, CONSTANT(MAXPAGESIZE)) + address);
To avoid this error, we use late evaluation for ALIGN by making the
alignment an attribute of the expression itself.
Differential Revision: https://reviews.llvm.org/D33629
llvm-svn: 304185
This reduces how many times we have to map from OutputSection to
OutputSectionCommand. It is a required step to moving
clearOutputSections earlier.
In order to always use writeTo in OutputSectionCommand we have to call
fabricateDefaultCommands for -r links and move section compression
after it.
llvm-svn: 303784
By the time we get to linker scripts, all special InputSectionBase
should have been combined into synthetic sections, which are a type of
InputSection. The net result is that we can use InputSection in a few
places that were using InputSectionBase.
llvm-svn: 303702
This converts the last (chronologically) user of OutputSections to use
the linker script commands instead.
The idea is to convert all uses after fabricateDefaultCommands, so
that we have a single representation.
llvm-svn: 303384
This is PR32664.
Issue was revealed by linux kernel script which was:
SECTIONS {
. = (0xffffffff80000000 + ALIGN(0x1000000, 0x200000));
phys_startup_64 = ABSOLUTE(startup_64 - 0xffffffff80000000);
.text : AT(ADDR(.text) - 0xffffffff80000000) {
.....
*(.head.text)
Where startup_64 is in .head.text.
At the place of assignment to phys_startup_64 we can not calculate absolute value for startup_64
because .text section has no VA assigned. Two patches were prepared earlier to address this: D32173 and D32174.
And in comments for D32173 was suggested not try to support this case, but error out.
Differential revision: https://reviews.llvm.org/D32793
llvm-svn: 302668
Previously it was impossible to use linkerscript with --compress-debug-sections
because of assert failture:
Assertion failed: isFinalized(), file C:\llvm\lib\MC\StringTableBuilder.cpp, line 64
Patch fixes the issue
llvm-svn: 302413
We can set SectionIndex tentatively as we process the linker script
instead of looking it repeatedly.
In general we should try to have as few name lookups as possible.
llvm-svn: 302299
In the non linker script case we would try very early to find out if
we could allocate the headers. Failing to do that would add extra
alignment to the first ro section, since we would set PageAlign
thinking it was the first section in the PT_LOAD.
In the linker script case the header allocation must be done in the
end, causing some duplication.
We now tentatively add the headers to the first PT_LOAD and if it
turns out they don't fit, remove them. With this we only need to
allocate the headers in one place in the code.
llvm-svn: 302186
We were correctly computing the size contribution of a .tbss input
section (it is none), but we were incorrectly considering the
alignment of the output section: it was advancing Dot instead of
ThreadBssOffset.
As far as I can tell this was always wrong in our linkerscript
implementation, but that became more visible now that the code is
shared with the non linker script case.
llvm-svn: 302107
The --section-start <name>=<address> needs to be translated into equivalent
linker script commands. There are a couple of problems with the existing
implementation:
- The --section-start with the lowest address is assumed to be at the start
of the map. This assumption is incorrect, we have to iterate through the
SectionStartMap to find the lowest address.
- The addresses in --section-start were being over-aligned when the
sections were marked as PageAlign. This is inconsistent with the use of
SectionStartMap in fixHeaders(), and can cause problems when the PageAlign
causes an "unable to move location counter backward" error when the
--section-start with PageAlign is aligned to an address higher than the next
--section-start. The ld.bfd and ld.gold seem to be more consistent with this
approach but this is not a well specified area.
This change fixes the problems above and also corrects a typo in which
fabricateDefaultCommands() is called with the wrong parameter, it should be
called with AllocateHeader not Config->MaxPageSize.
Differential Revision: https://reviews.llvm.org/D32749
llvm-svn: 302007
When using linkerscripts we were trying to sort SHF_LINK_ORDER
sections too early. Instead of always doing two runs of
assignAddresses, record the section order in processCommands.
llvm-svn: 301830
This version uses a set to speed up the synchronize method.
Original message:
Remove LinkerScript::flush.
This patch replaces flush with a last ditch attempt at synchronizing
the section list with the linker script "AST".
The synchronization is a bit of a hack and should in time be avoided
by creating the AST earlier so that modifications can be made directly
to it instead of modifying the section list and synchronizing it back.
This is the main step for fixing
https://bugs.llvm.org/show_bug.cgi?id=32816. With this in place I
think the only missing thing would be to have processCommands assign
section indexes as dummy offsets so that the sort in
OutputSection::finalize works.
With this LinkerScript::assignAddresses becomes much simpler, which
should help with the thunk work.
llvm-svn: 301745
This reverts commit r301678 since that change significantly slowed
down the linker. Before this patch, LLD could link clang in 8 seconds,
but with this patch it took 40 seconds.
llvm-svn: 301709
This patch replaces flush with a last ditch attempt at synchronizing
the section list with the linker script "AST".
The synchronization is a bit of a hack and should in time be avoided
by creating the AST earlier so that modifications can be made directly
to it instead of modifying the section list and synchronizing it back.
This is the main step for fixing
https://bugs.llvm.org/show_bug.cgi?id=32816. With this in place I
think the only missing thing would be to have processCommands assign
section indexes as dummy offsets so that the sort in
OutputSection::finalize works.
With this LinkerScript::assignAddresses becomes much simpler, which
should help with the thunk work.
llvm-svn: 301678
We were already pretty close, the one exception was when a name was
reused in another SECTIONS directive:
SECTIONS {
.text : { *(.text) }
.data : { *(.data) }
}
SECTIONS {
.data : { *(other) }
}
In this case we would create a single .data and magically output
"other" while looking at the first OutputSectionCommand.
We now create two .data sections. This matches what gold does. If we
really want to create a single one, we should change the parser so that
the above is parsed as if the user had written
SECTIONS {
.text : { *(.text) }
.data : { *(.data) *(other)}
}
That is, there should be only one OutputSectionCommand for .data and
it would have two InputSectionDescriptions.
By itself this patch makes the code a bit more complicated, but is an
important step in allowing assignAddresses to operate just on the
linker script.
llvm-svn: 301484
This change fabricates linker script commands for the case where there is
no linker script SECTIONS to control address assignment. This permits us
to have a single Script->assignAddresses() function.
There is a small change in user-visible-behavior with respect to the
handling of .tbss SHT_NOBITS, SHF_TLS as the Script->assignAddresses()
requires setDot() to be called with monotically increasing addresses.
The tls-offset.s test has been updated so that the script and non-script
results match.
This change should make the non-script behavior of lld closer to an
equivalent linker script.
Differential Revision: https://reviews.llvm.org/D31888
llvm-svn: 300687
Imagine next script:
SECTIONS { BYTE(0x11); }
Section content written to disk will be 0x11. Previous LLD behavior was to make this
section SHT_NOBITS. What is not correct because section has content.
ld.bfd makes such sections SHT_PROGBITS, this patch do the same.
This fixes PR32537
Differential revision: https://reviews.llvm.org/D32016
llvm-svn: 300317
This fixes an assertion `Align != 0u && "Align can't be 0."'
in llvm::alignTo() when a linker script references a globally
defined variable in an ALIGN() context.
Patch by Alexander Richardson !
Differential revision: https://reviews.llvm.org/D31984
llvm-svn: 300315
Executable sections should not be padded with zero by default. On some
architectures, 0x00 is the start of a valid instruction sequence, so can confuse
disassembly between InputSections (and indeed the start of the next InputSection
in some situations). Further, in the case of misjumps into padding, padding may
start to be executed silently.
On x86, the "0xcc" byte represents the int3 trap instruction. It is a single
byte long so can serve well as padding. This change switches x86 (and x86_64) to
use this value for padding in executable sections, if no linker script directive
overrides it. It also puts the behaviour into place making it easy to change the
behaviour of other targets when desired. I do not know the relevant instruction
sequences for trap instructions on other targets however, so somebody should add
this separately.
Because the old behaviour simply wrote padding in the whole section before
overwriting most of it, this change also modifies the padding algorithm to write
padding only where needed. This in turn has caused a small behaviour change with
regards to what values are written via Fill commands in linker scripts, bringing
it into line with ld.bfd. The fill value is now written starting from the end of
the previous block, which means that it always starts from the first byte of the
fill, whereas the old behaviour meant that the padding sometimes started mid-way
through the fill value. See the test changes for more details.
Reviewed by: ruiu
Differential Revision: https://reviews.llvm.org/D30886
Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32227
llvm-svn: 299635
LinkerScript.cpp contains both the linker script processor and the
linker script parser. I put both into a single file, but the file grown
too large, so it's time to put them into two different files.
llvm-svn: 299515
This requires collectign all symbols referenced in the linker script
and adding them to symbol table as undefined symbol.
Differential Revision: https://reviews.llvm.org/D31147
llvm-svn: 298577
LinkerScript used to be a template class, so we couldn't instantiate
that class in elf::link. We instantiated ScriptConfig class earlier
instead so that the linker script parser can store configurations to
the object.
Now that LinkerScript is not a template, it doesn't make sense to
separate ScriptConfig from LinkerScript. This patch merges them.
llvm-svn: 298457
This fixes pr32031 by representing the expressions results as a
SectionBase and offset. This allows us to use an input section
directly instead of getting lost trying to compute an offset in an
outputsection when not all the information is available yet.
This also creates a struct to represent the *value* of and expression,
allowing the expression itself to be a simple typedef. I think this is
easier to read and will make it easier to extend the expression
computation to handle more complicated cases.
llvm-svn: 298079
This also requires postponing the assignment the assignment of
symbols defined in input linker scripts since those can refer to
output sections and in case we don't have a SECTIONS command, we
need to wait until all output sections have been created and
assigned addresses.
Differential Revision: https://reviews.llvm.org/D30851
llvm-svn: 297802
That moves all members that s possible to move for now (all which
does not depend on ELFT templating).
After that change LinkerScript contains only 8 methods in total,
and I believe it is possible to move them all after tweaking other
parts of linker. And we will be able to have single class for
linkerscript at the end.
llvm-svn: 297735
We can move all not templated functionality to LinkerScriptBase.
Patch do that for hasPhdrsCommands() and shows how it helps to detemplate
things in other places.
Probably we should be able to merge these 2 classes into single one after such steps.
Even if not, it still looks as reasonable cleanup for me.
Differential revision: https://reviews.llvm.org/D30895
llvm-svn: 297714
With this we have a single section hierarchy. It is a bit less code,
but the main advantage will be in a future patch being able to handle
foo = symbol_in_obj;
in a linker script. Currently that fails since we try to find the
output section of symbol_in_obj. With this we should be able to just
return an InputSection from the expression.
llvm-svn: 297313
That function doesn't use any member of SymbolTableSection, so I
couldn't see a reason to make it a member of that class. The function
takes a SymbolBody, so it is more natural to make it a member of
SymbolBody.
llvm-svn: 296433
The list of all input sections was defined in SymbolTable class for a
historical reason. The list itself is not a template. However, because
SymbolTable class is a template, we needed to pass around ELFT to access
the list. This patch moves the list out of the class so that it doesn't
need ELFT.
llvm-svn: 296309
With the current design an InputSection is basically anything that
goes directly in a OutputSection. That includes plain input section
but also synthetic sections, so this should probably not be a
template.
llvm-svn: 295993
This change exposes the symbol table insert method and uses it to
insert the linkerscript defined symbols directly into the symbol
table to avoid unnecessarily pulling the object out of an archive.
Differential Revision: https://reviews.llvm.org/D30224
llvm-svn: 295780
Patch fixes PR32024.
Sections that were not marked as Live has null output section.
Previously we tried to access that field and segfaulted.
Differential revision: https://reviews.llvm.org/D30188
llvm-svn: 295727
Previously we evaluated the values of LMA incorrectly for next cases:
.text : AT(ADDR(.text) - 0xffffffff80000000) { ... }
.data : AT(ADDR(.data) - 0xffffffff80000000) { ... }
.init.begin : AT(ADDR(.init.begin) - 0xffffffff80000000) { ... }
Reason was that we evaluated offset when VA was not assigned. For case above
we ended up with 3 loads that has similar LMA and it was incorrect.
That is critical for linux kernel.
Patch updates the offset after VA calculation. That fixes the issue.
Differential revision: https://reviews.llvm.org/D30163
llvm-svn: 295722
Previously LLD would error out just "ld.lld: error: unable to move location counter backward"
What does not really reveal the place of issue,
Patch adds location to the output.
Differential revision: https://reviews.llvm.org/D30187
llvm-svn: 295720
Previously ASSERT we implemented returned expression value.
Ex:
. = ASSERT(0x100);
would set Dot value to 0x100
Form of assert when it is assigned to Dot was implemented for
compatibility with very old GNU ld which required it.
Some scripts in the wild, including linux kernel scripts
use such ASSERTs at the end for doing different checks.
Currently we fail with "unable to move location counter backward"
for such scripts. Patch changes ASSERT to return location counter
value to fix that.
Differential revision: https://reviews.llvm.org/D30171
llvm-svn: 295703
I splitted it from D29273.
Since we plan to make relocatable sections as dependent for target ones for
--emit-relocs implementation, this change is required to support .eh_frame case.
EhInputSection inherets from InputSectionBase and not from InputSection.
So for case when it has relocation section, it should be able to access DependentSections
vector.
This case is real for Linux kernel.
Differential revision: https://reviews.llvm.org/D30084
llvm-svn: 295483
This is a small difference I noticed to gold and bfd. When given
--print-gc-sections, we print sections a linkerscript marks
DISCARD. The other linkers don't.
llvm-svn: 295467
This case should be possible to handle, but it is hard:
* In order to create program headers correctly, we have to scan the
sections in the order they are in the file.
* To find that order, we have to "execute" the linker script.
* The linker script can contain SIZEOF_HEADERS.
So to support this we have to start with a guess of how many headers
we need (3), run the linker script and try to create the program
headers. If it turns out we need more headers, we run the script again
with a larger SIZEOF_HEADERS.
Also, running the linker script depends on knowing the size of the
sections, so we have to finalize them. But creating the program
headers can change the value stored in some sections, so we have to
split size finalization and content finalization.
Looks like the last part is also needed for range extension thunks, so
we might support this at some point. For now just report an error
instead of producing broken files.
llvm-svn: 295458
SHF_LINK_ORDER sections adds special ordering requirements.
Such sections references other sections. Previously we would crash
if section that other were referenced to was discarded by script.
Patch fixes that by discarding all dependent sections in that case.
It supports chained dependencies, testcase is provided.
Differential revision: https://reviews.llvm.org/D30033
llvm-svn: 295332
Unfortunately, the common way of writing linker scripts seems to be
to get the output of ld.bfd --verbose and edit it a bit.
Also unfortunately, the bfd default script contains things like
.rela.dyn : { *(... .rela.data ...) }
but bfd actually ignores that for -emit-relocs, so we have to do the
same.
llvm-svn: 295324
The linker script lexer is context-sensitive. In the regular context,
arithmetic operator characters are regular characters, but in the
expression context, they are independent tokens. This afects how the
lexer tokenizes "3*4", for example. (This kind of expression is real;
the Linux kernel uses it.)
This patch defines function `maybeSplitExpr`. This function splits the
current token into multiple expression tokens if the lexer is in the
expression context.
Differential Revision: https://reviews.llvm.org/D29963
llvm-svn: 295225
OUTPUT_ARCH command can contain architecture values separated with ":", like:
OUTPUT_ARCH(i386:x86-64)
We did not support that, because got 3 lexer tokens here after recent changes.
This trivial patch fixes the issue, now whole expression inside
OUTPUT_ARCH is just ignored.
Differential revision: https://reviews.llvm.org/D29640
llvm-svn: 294432
LLD already parses ALIGN expression to specifiy alignment for output
sections in linker scripts but it never applies the alignment to the
output section. This change handles that.
Differential Revision: https://reviews.llvm.org/D29689
llvm-svn: 294374
DefinedSynthetic symbols are attached to sections,
for the case when such symbol was attached to non-allocated section,
we calculated its value incorrectly.
We subtracted Body->Section->Addr, but non-allocatable sections
should have zero VA in output and therefore result value was wrong.
And at the same time we have Body->Section->Addr != 0 for them
internally because use it for calculation of section size.
Patch fixes calculation of such symbols values.
Differential revision: https://reviews.llvm.org/D29653
llvm-svn: 294322
We had assignSymbol and assignSectionSymbol methods which has similar functionality.
Patch removes one of copy and reuses another in code.
Differential revision: https://reviews.llvm.org/D29582
llvm-svn: 294290
We now create a dummy section with index 1 before processing the
linker script.
Thanks to George Rimar for finding the bug and providing the initial
testcase.
llvm-svn: 294252
This is a fix for Bugzilla 31813.
The problem is that the tokenizer does not create a separate token for
":" unless there's white space before it. Changed it to always create
a token for ":" and reworked some logic that relied on ":" being
attached to some tokens like "global:" and "local:".
llvm-svn: 294006
This is alternative to D28857 which was incorrect.
One of linux scripts contains:
vvar_start = . - 2 * (1 << 12);
vvar_page = vvar_start;
vvar_vsyscall_gtod_data = vvar_page + 128;
Previously we did not mark first expression as non-absolute,
though it contains location counter.
And LLD failed with error:
relocation R_X86_64_PC32 cannot refer to absolute symbol
This patch should fix the issue, and opens road for doing the same for other operators
(though not clear if that is needed).
Differential revision: https://reviews.llvm.org/D29332
llvm-svn: 293748
Linux kernel linkerscript contains additional semicolon (last line):
.apicdrivers : AT(ADDR(.apicdrivers) - LOAD_OFFSET) {
__apicdrivers = .;
*(.apicdrivers);
I checked that both gold and bfd are able to parse something like:
.text : { ;;*(.text);;S = 0;; } }
Patch do the same.
Differential revision: https://reviews.llvm.org/D29276
llvm-svn: 293612
[ELF] Fixed formatting. NFC
and
[ELF] Bypass section type check
Differential revision: https://reviews.llvm.org/D28761
They do the opposite of what was asked for in the code review.
llvm-svn: 293320
As specified here:
* https://sourceware.org/binutils/docs/ld/MEMORY.html#MEMORY
There are two deviations from what is specified for GNU ld:
1. Only integer constants and *not* constant expressions
are allowed in `LENGTH` and `ORIGIN` initializations.
2. The `I` and `L` attributes are *not* implemented.
With (1) there is currently no easy way to evaluate integer
only constant expressions. This can be enhanced in the
future.
With (2) it isn't clear how these flags map to the `SHF_*`
flags or if they even make sense for an ELF linker.
Differential Revision: https://reviews.llvm.org/D28911
llvm-svn: 292875
Found that during attempts of linking linux kernel,
previously we partially duplicated code from getOutputSection(),
and it missed commons symbol case.
Differential revision: https://reviews.llvm.org/D28903
llvm-svn: 292594
This actually simplifies the code a bit as now all local symbols are
handled uniformly.
This should fix the build of www/webkit2-gtk3.
llvm-svn: 291569
This patch allows for linker scripts to assign a new value
to a symbol that is already defined (either in an object file
or the linker script itself).
llvm-svn: 291459
Previously, files added using INCLUDE directive weren't added
to reproduce archives. In this patch, I defined a function to
open a file and use that from Driver and LinkerScript.
llvm-svn: 291413
After Mark's patch I was wondering what was the rationale for the ELF
spec requiring us to merge only sections with matching flags and
types. I tried emailing
https://groups.google.com/forum/#!forum/generic-abi, but looks like my
emails are not being posted (the list is probably moderated). I
emailed Cary Coutant instead.
Cary pointed out that the section was a late addition and didn't got
the scrutiny it deserved. Given that and the problems found by
implementing the letter of the standard, I propose changing lld to
merge all sections with the same name and issue errors if the types or
some critical flags are different.
This should allow an unmodified firefox linked with lld to run.
This also merges some code with the linkerscript path.
llvm-svn: 291107
DefinedSynthetic is not created for a real ELF object, so it doesn't
have to be a template function. It has a virtual st_value, which is
either 32 bit or 64 bit, but we can simply use 64 bit.
llvm-svn: 290241
It was revealed by D27831.
If we have linkerscript that includes another one that sets OUTPUT for example:
RUN: echo "INCLUDE \"foo.script\"" > %t.script
RUN: echo "OUTPUT(\"%t.out\")" > %T/foo.script
then we do:
void ScriptParser::readInclude() {
...
std::unique_ptr<MemoryBuffer> &MB = *MBOrErr;
tokenize(MB->getMemBufferRef());
OwningMBs.push_back(std::move(MB));
}
void ScriptParser::readOutput() {
...
Config->OutputFile = unquote(Tok);
...
}
Problem is that OwningMBs are destroyed after script parser do its job.
So all Toks are dead and Config->OutputFile points to destroyed data.
Patch suggests to save all included scripts into using string Saver.
Differential revision: https://reviews.llvm.org/D27987
llvm-svn: 290238
This handles all the corner cases if setting a section address:
- If the address is too low, we cannot allocate the program headers.
- If the load address is lowered, we have to do that before finalize
This also shares some code with the linker script since it was already
hitting similar cases.
This is used by the freebsd boot loader. It is not clear if we need to
support this with a non binary output, but it is not as bad as I was
expecting.
llvm-svn: 290136
I thought for a while about how to remove it, but it looks like we
can just copy the file for now. Of course I'm not happy about that,
but it's just less than 50 lines of code, and we already have
duplicate code in Error.h and some other places. I want to solve
them all at once later.
Differential Revision: https://reviews.llvm.org/D27819
llvm-svn: 290062
PR31335 shows that we do that in next case:
SECTIONS { .text 0x2000 : {. = 0x100 ; *(.text) } }
though documentations says that "If . is used inside a section
description however, it refers to the byte offset from the start
of that section, not an absolute address. " looks does not work
as documented in bfd (as mentioned in comments for PR31335).
Until we find out the expected behavior was suggested at least not
to 'crash', what we do after trying to generate huge file.
Differential revision: https://reviews.llvm.org/D27712
llvm-svn: 289782
The feature is documented as
-----------------------------
The format of the dynamic list is the same as the version node
without scope and node name. See *note VERSION:: for more
information.
--------------------------------
And indeed qt uses a dynamic list with an 'extern "C++"' in it. With
this patch we support that
The change to gc-sections-shared makes us match bfd. Just because we
kept bar doesn't mean it has to be in the dynamic symbol table.
The changes to invalid-dynamic-list.test and reproduce.s are because
of the new parser.
The changes to version-script.s are the only case where we change
behavior with regards to bfd, but I would like to see a mix of
--version-script and --dynamic-list used in the wild before
complicating the code.
llvm-svn: 289082