This allows combining --dynamic-list and version scripts too. The
version script controls which symbols are visible, and
--dynamic-list controls which of those are preemptible.
Unlike previous versions, undefined symbols are still considered
preemptible, which was the issue breaking the cfi tests.
This fixes pr34053.
llvm-svn: 312806
to separate commons based on file name patterns. The following linker script
construct does not work because commons are allocated before section placement
is done and the only synthesized BssSection that holds all commons has no file
associated with it:
SECTIONS { .common_0 : { *file0.o(COMMON) }}
This patch changes the allocation of commons to create a section per common
symbol and let the section logic do the layout.
Differential revision: https://reviews.llvm.org/D37489
llvm-svn: 312796
If --dynamic-list is given, only those symbols are preemptible.
This allows combining --dynamic-list and version scripts too. The
version script controls which symbols are visible, and --dynamic-list
controls which of those are preemptible.
This fixes pr34053.
llvm-svn: 312757
It is a bit more convinent and helps to simplify logic
of program headers allocation a little.
Differential revision: https://reviews.llvm.org/D34956
llvm-svn: 312711
The fixSectionAlignments() function may alter the alignment of some
OutputSections, this is likely to alter the addresses calculated earlier
in assignAddresses(). By moving the call to fixSectionAlignments() we
make sure that assignAddresses() is consistent with the early calculation
used for RangeThunks and the final call just before writing the image.
Differential Revision: https://reviews.llvm.org/D36739
llvm-svn: 312636
std::vector::insert invalidates all iterators, so it was not safe to do
Script->Opt.Commands.insert(++I, Make(ElfSym::End1));
Script->Opt.Commands.insert(++I, Make(ElfSym::End2));
because after the first line, `I` is no longer valid.
This patch rewrites fixes the issue. I belive the new code without
higher-order functions is a bit more readable than before.
llvm-svn: 312570
The problem with symbol assignments in implicit linker scripts is that
they can refer synthetic symbols such as _end, _etext or _edata. The
value of these symbols is currently fixed only after all linker script
commands are processed, so these assignments will be using non-final and
hence invalid value.
Rather than fixing the symbol values after all command processing have
finished, we instead change the logic to generate symbol assignment
commands that set the value of these symbols while processing the
commands, this ensures that the value is going to be correct by the time
any reference to these symbol is processed and is equivalent to defining
these symbols explicitly in linker script as BFD ld does.
Differential Revision: https://reviews.llvm.org/D36986
llvm-svn: 312305
Patch by Rafael Espíndola.
This is PR34053.
The implementation is a bit of a hack, given the precise location where
IsPreemtible is set, it cannot be used from
SymbolTable::handleAnonymousVersion.
I could add another method to SymbolTable if you think that would be
better.
Differential Revision: https://reviews.llvm.org/D36499
llvm-svn: 311468
With fix: explicitly specify ouput format for hexdump tool call.
Original commit message:
[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions.
Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.
Differential revision: https://reviews.llvm.org/D36262
llvm-svn: 311315
Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.
Differential revision: https://reviews.llvm.org/D36262
llvm-svn: 311310
GdbIndexSection doesn't need lazy finalization because when an instance
of the class is created, we already know all debug info sections.
We can initialize the instnace in the ctor.
llvm-svn: 310931
Liveness is usually a notion of input sections, but this patch adds
"liveness" bit to common symbols because they don't belong to any
input section.
This patch is based on https://reviews.llvm.org/D36520
Differential Revision: https://reviews.llvm.org/D36546
llvm-svn: 310617
This is probably a small optimization, but the main motivation is
having a way of fixing pr34053 that doesn't require a hash lookup in
isPreempitible.
llvm-svn: 310602
Emit these symbols as long as we're building in a static configuration,
even if we're emitting a dynamic symbol table. This is consistent with
both bfd and gold.
Ordinarily, the combination of -static and -export-dynamic wouldn't make
much sense. Unfortunately, cmake versions prior to 3.4 forcefully
injected -rdynamic [1], so it seems worthwhile to support.
[1] https://cmake.org/cmake/help/v3.4/policy/CMP0065.html
Differential Revision: https://reviews.llvm.org/D36350
llvm-svn: 310168
With this Symbol has the same size as before, but DefinedRegular goes
from 72 to 64 bytes.
I also find this a bit easier to read. There are fewer places
initializing File for example.
This has a small but measurable speed improvement on all tests (1%
max).
llvm-svn: 310142
This is PR33889,
Patch adds support of combination of linkerscript and
-symbol-ordering-file option.
If no sorting commands are present in script inside section declaration
and no --sort-section option specified, code uses sorting from ordering
file if any exist.
Differential revision: https://reviews.llvm.org/D35843
llvm-svn: 310045
When the data segment is the last segment, it is correct to leave
it unaligned. However, when the code segment is the last segment,
it should be aligned to the page boundary to avoid loading the
non-segment parts of the ELF file at the end of the file.
Differential Revision: https://reviews.llvm.org/D33630
llvm-svn: 309829
This is a bit of a hack, but it is *so* convenient.
Now that we create synthetic linker scripts when none is provided, we
always have to handle paired OutputSection and OutputsectionCommand and
keep a mapping from one to the other.
This patch simplifies things by merging them and creating what used to
be OutputSectionCommands really early.
llvm-svn: 309311
That is slightly more convinent as allows to store pointer on
program header entry in a more safe way.
Change was used in 2 patches currently on review.
Differential revision: https://reviews.llvm.org/D35832
llvm-svn: 309253
I have a patch to let DwarfContext defer to lld for getting section
contents and relocations.
That is a pretty big performance improvement.
This is just a refactoring to make that easier to do.
This change makes the *creation* of gdb index a dedicated step and
makes that templated. That is so that we can uses Elf_Rel in the code.
llvm-svn: 307867
In preparation for the addition of rangeThunks() calculate the addresses
of all the inputSections so that ThunkSections can be inserted at the right
place.
Differential Revision: https://reviews.llvm.org/D34688
llvm-svn: 307373
The allocateHeaders() function is called at the end of assignAddresses(), it
decides whether the ELF header and program header table can be allocated to
a PT_LOAD program header. As the function alters state, it prevents
assignAddresses() from being called multiple times.
This change splits out the call to allocateHeaders() from assignAddresses()
this will permit assignAddresses() to be called while processing range
extension thunks without trying to allocateHeaders().
Differential Revision: https://reviews.llvm.org/D34344
llvm-svn: 307131
This is finally getting to the point where output sections are
constructed.
createOrphanCommands and fabricateDefaultCommands are moved out so
they can be merged with existing code in followup commits.
llvm-svn: 307107
Script commands are processed before unused synthetic sections are
removed. Therefore, if a linker script matches one of these sections
it'll get emitted as an empty output section because the logic for
removing unused synthetic sections ignores script commands which
could have already matched and captured one of these sections. This
patch fixes that by also removing the unused synthetic sections from
the script commands.
Differential Revision: https://reviews.llvm.org/D34800
llvm-svn: 307037
This is PR33596. Previously LLD would crash
because BYTE command synthesized output section,
but it was not assigned to Sec member of OutputSectionCommand.
Behaviour of -script and -r combination is not well defined,
but it seems after this change LLD naturally inherits behavior of
GNU linkers - creates output section requested in script and does not
crash anymore.
Differential revision: https://reviews.llvm.org/D34676
llvm-svn: 306527
When -ffunction-sections and ARM C++ exceptions are used each .text.suffix
section will have at least one .ARM.exidx.suffix section and may have an
additional .ARM.extab.suffix section if the unwinding instructions are too
large to inline into the .ARM.exidx table entry. For a large program without
a linker script this can lead to a large number of section header table
entries that can increase the size of the ELF file.
This change introduces a default rule for .ARM.extab.* to be placed in
a single output section called .ARM.extab . This follows the behavior of
ld.gold and ld.bfd.
fixes pr33407
Differential Revision: https://reviews.llvm.org/D34678
llvm-svn: 306522
Most "reserved" symbols are in ElfSym and it looks like there's no
reason to not do the same thing for _GLOBAL_OFFSET_TABLE_. This should
help https://reviews.llvm.org/D34618 too.
llvm-svn: 306292
On many architectures gcc and clang will recognize _GLOBAL_OFFSET_TABLE_ - .
and produce a relocation that can be processed without needing to know the
value of _GLOBAL_OFFSET_TABLE_. This is not always the case; for example ARM
gcc produces R_ARM_BASE_PREL but clang produces the more general
R_ARM_REL32 to _GLOBAL_OFFSET_TABLE_. To evaluate this relocation
correctly _GLOBAL_OFFSET_TABLE_ must be defined to be the either the base of
the GOT or end of the GOT dependent on architecture..
If/when llvm-mc is changed to recognize _GLOBAL_OFFSET_TABLE_ - . this
change will not be necessary for new objects. However there may still be
old objects and versions of clang.
Differential Revision: https://reviews.llvm.org/D34355
llvm-svn: 306282
I found this while trying to build u-boot. It uses -Ttext in
combination with linker scripts.
My first reaction was to change the linker scripts to have the correct
value, but I found that it is actually quite convenient to have -Ttext
take precedence.
By having just
.text : { *(.text) }
In the script, they can define the text address in a single Makefile
and pass it to ld with -Ttext and for the C code with
-DFoo=value. Doing the same with linker scripts would require them to
be generated during the build.
llvm-svn: 305766
In preparation for supporting range extension thunks we now continually
call createThunks() until no more thunks are added. This requires us to
record the thunks we add on each pass and only merge the new ones into the
OutputSection. We also need to check if a Relocation is targeting a thunk
to prevent us from infinitely creating more thunks.
Differential Revision: https://reviews.llvm.org/D34034
llvm-svn: 305555
This is probably the main patch left in unifying our intermediary
representation.
It moves the creation of default commands before section sorting. This
has the nice effect that we now have one location where we decide
where an orphan section should be placed.
Before this patch sortSections would decide the relative location of
orphan sections to other sections, but it was up to placeOrphanSection
to decide on the exact location.
We now only sort sections we created since the linker script is
already in the correct order.
llvm-svn: 305512
Thunks are now generated per InputSectionDescription instead of per
OutputSection. This allows created ThunkSections to be inserted directly
into InputSectionDescription.
Changes in this patch:
- Loop over InputSectionDescriptions to find relocations to Thunks
- Generate a ThunkSection per InputSectionDescription
- Remove synchronize() as we no longer need it
- Move fabricateDefaultCommands() before createThunks
Differential Revision: https://reviews.llvm.org/D33835
llvm-svn: 304887
Previously we would merge relocation sections by name.
That did not work in some cases, like testcase shows.
Patch implements logic to merge relocation sections if their target
sections were merged into the same output section.
Differential revision: https://reviews.llvm.org/D33824
llvm-svn: 304886
Before this patch in -r we compute the OutputSection sizes early in
the various calls to assignOffsets. With this change we can remove
most of those calls.
llvm-svn: 304860
Traditionally, it has been defined in crtbegin.o, which is typically
provided by libgcc or as part of the C library on some systems. However,
but there's no principled reason for it to be there. We optionaly
define this symbol, which can be used on platforms that don't provide
__dso_handle in crtbegin.o or which don't use crtbegin.o at all.
Differential Revision: https://reviews.llvm.org/D33856
llvm-svn: 304732
This is PR33289.
Previously LLD leaved section naming as is and that lead to wrong result,
because we decompress sections when using -r,
and hence should remove ".z" prefix.
Differential revision: https://reviews.llvm.org/D33885
llvm-svn: 304711
This change alters the sorting for OutputSections with the SHF_LINK_ORDER
flag in OutputSection::finalize() to use the InputSectionDescription
representation and not the OutputSection::Sections representation.
Differential revision: https://reviews.llvm.org/D33772
llvm-svn: 304700
This is probably the correct location for it: next to
fabricateDefaultCommands. If we don't have a linker script, we
fabricate one. If we have one, we patch it.
llvm-svn: 304436
Before InputSectionBase had an OutputSection pointer, but that was not
always valid. For example, if it was a merge section one actually had
to look at MergeSec->OutSec.
This was brittle and caused bugs like the one fixed by r304260.
We now have a single Parent pointer that points to an OutputSection
for InputSection, but to a SyntheticSection for merge sections and
.eh_frame. This makes it impossible to accidentally access an invalid
OutSec.
llvm-svn: 304338
When there is a linker script with .ARM.exidx in the SECTIONS
command we must add the .ARM.exidx sentinel section to the
InputSectionDescriptions as well as to OutputSection::Sections.
Differential Revision: https://reviews.llvm.org/D33496
llvm-svn: 304206
Now that we are trying to use the linker script representation as the
canonycal one, there are a few loops looking for just OutputSectionCommands.
Create a vector with just the OutputSectionCommands once that is
stable to simplify the rest of the code.
llvm-svn: 304181
On SPARC, .plt is both writeable and executable. The current way
sections are sorted means that lld puts it after .data/.bss. but it
really needs to be close to .test to make sure branches into .plt
don't overflow. I'd argue that because .bss is supposed to come last
on all architectures, we should change the default sort order such
that writable and executable sections come before sections that are
just writeable. read-only executable sections should still come after
sections that are just read-only of course. This diff makes this
change.
llvm-svn: 304008
In this way, the content and the flag is always consistent, which I
think better than removing the bit when input sections reaches the Writer.
llvm-svn: 303926
This reduces how many times we have to map from OutputSection to
OutputSectionCommand. It is a required step to moving
clearOutputSections earlier.
In order to always use writeTo in OutputSectionCommand we have to call
fabricateDefaultCommands for -r links and move section compression
after it.
llvm-svn: 303784
Once the dummy linker script is created, we want it to be used for
everything to avoid having two redundant representations that can get
out of sync.
We were already clearing OutputSections. With this patch we clear the
Sections vector of every OutputSection.
llvm-svn: 303703
This converts the last (chronologically) user of OutputSections to use
the linker script commands instead.
The idea is to convert all uses after fabricateDefaultCommands, so
that we have a single representation.
llvm-svn: 303384
GetSection is a template because write calls relocate.
relocate has two parts. The non alloc code really has to be a
template, as it is looking a raw input file data.
The alloc part is only a template because of getSize.
This patch folds the value of getSize early, detemplates
getRelocTargetVA and splits relocate into a templated non alloc case
and a regular function for the alloc case. This has the nice advantage
of making sure we collect all the information we need for relocations
before getting to InputSection::relocateNonAlloc.
Since we know got is alloc, it can just call the function directly and
avoid the template.
llvm-svn: 303355
Nothing special here, just detemplates code that became possible
to detemplate after recent commits in a straghtforward way.
Differential revision: https://reviews.llvm.org/D33234
llvm-svn: 303237
SymbolTableBaseSection was introduced.
Detemplation of SymbolTableSection should allow to detemplate more things.
Differential revision: https://reviews.llvm.org/D33124
llvm-svn: 303150
Switch to llvm::to_integer() everywhere in LLD instead of
StringRef::getAsInteger() because API of latter is confusing.
It returns true on error and false otherwise what makes reading
the code incomfortable.
Differential revision: https://reviews.llvm.org/D33187
llvm-svn: 303149
We used to place orphans by just using compareSectionsNonScript.
Then we noticed that since linker scripts can use another order, we
should first try match the section to a given PT_LOAD. But there is
nothing special about PT_LOAD. The same issue can show up for
PT_GNU_RELRO for example.
In general, we have to search for the most similar section and put the
orphan next to it. Most similar being defined as how long they follow
the same code path in compareSecitonsNonScript.
That is what this patch does. We now compute a rank for each output
section, with a bit for each branch in what was
compareSectionsNonScript.
With this findOrphanPos is now fully general and orphan placement can
be optimized by placing every section with the same rank at once.
The included testcase is a variation of many-sections.s that uses
allocatable sections to avoid the fast path in the existing
code. Without threads it goes form 46 seconds to 0.9 seconds.
llvm-svn: 302903
This behavior differs from the semantics implemented by GNU linkers
which only define this symbol iff ELF headers are in the memory
mapped segment.
Differential Revision: https://reviews.llvm.org/D33019
llvm-svn: 302687
This is a bit easier to read and a lot faster in some cases. A version
of many-sections.s with linker scripts goes from 0m41.232s to
0m19.575s.
llvm-svn: 302528
The code following this one already considers every possible insertion
point for orphan sections, there is no point in sorting them before.
llvm-svn: 302441
We can set SectionIndex tentatively as we process the linker script
instead of looking it repeatedly.
In general we should try to have as few name lookups as possible.
llvm-svn: 302299
In the non linker script case we would try very early to find out if
we could allocate the headers. Failing to do that would add extra
alignment to the first ro section, since we would set PageAlign
thinking it was the first section in the PT_LOAD.
In the linker script case the header allocation must be done in the
end, causing some duplication.
We now tentatively add the headers to the first PT_LOAD and if it
turns out they don't fit, remove them. With this we only need to
allocate the headers in one place in the code.
llvm-svn: 302186
The --section-start <name>=<address> needs to be translated into equivalent
linker script commands. There are a couple of problems with the existing
implementation:
- The --section-start with the lowest address is assumed to be at the start
of the map. This assumption is incorrect, we have to iterate through the
SectionStartMap to find the lowest address.
- The addresses in --section-start were being over-aligned when the
sections were marked as PageAlign. This is inconsistent with the use of
SectionStartMap in fixHeaders(), and can cause problems when the PageAlign
causes an "unable to move location counter backward" error when the
--section-start with PageAlign is aligned to an address higher than the next
--section-start. The ld.bfd and ld.gold seem to be more consistent with this
approach but this is not a well specified area.
This change fixes the problems above and also corrects a typo in which
fabricateDefaultCommands() is called with the wrong parameter, it should be
called with AllocateHeader not Config->MaxPageSize.
Differential Revision: https://reviews.llvm.org/D32749
llvm-svn: 302007
This version uses a set to speed up the synchronize method.
Original message:
Remove LinkerScript::flush.
This patch replaces flush with a last ditch attempt at synchronizing
the section list with the linker script "AST".
The synchronization is a bit of a hack and should in time be avoided
by creating the AST earlier so that modifications can be made directly
to it instead of modifying the section list and synchronizing it back.
This is the main step for fixing
https://bugs.llvm.org/show_bug.cgi?id=32816. With this in place I
think the only missing thing would be to have processCommands assign
section indexes as dummy offsets so that the sort in
OutputSection::finalize works.
With this LinkerScript::assignAddresses becomes much simpler, which
should help with the thunk work.
llvm-svn: 301745
This reverts commit r301678 since that change significantly slowed
down the linker. Before this patch, LLD could link clang in 8 seconds,
but with this patch it took 40 seconds.
llvm-svn: 301709
This patch replaces flush with a last ditch attempt at synchronizing
the section list with the linker script "AST".
The synchronization is a bit of a hack and should in time be avoided
by creating the AST earlier so that modifications can be made directly
to it instead of modifying the section list and synchronizing it back.
This is the main step for fixing
https://bugs.llvm.org/show_bug.cgi?id=32816. With this in place I
think the only missing thing would be to have processCommands assign
section indexes as dummy offsets so that the sort in
OutputSection::finalize works.
With this LinkerScript::assignAddresses becomes much simpler, which
should help with the thunk work.
llvm-svn: 301678
addIgnored defines a given symbol even if there is no existing
symbol with the same name. So, even if libc provides __tls_get_addr,
we should still be able to call addIgnored.
Differential Revision: https://reviews.llvm.org/D32053
llvm-svn: 301290
This change fabricates linker script commands for the case where there is
no linker script SECTIONS to control address assignment. This permits us
to have a single Script->assignAddresses() function.
There is a small change in user-visible-behavior with respect to the
handling of .tbss SHT_NOBITS, SHF_TLS as the Script->assignAddresses()
requires setDot() to be called with monotically increasing addresses.
The tls-offset.s test has been updated so that the script and non-script
results match.
This change should make the non-script behavior of lld closer to an
equivalent linker script.
Differential Revision: https://reviews.llvm.org/D31888
llvm-svn: 300687
Patch implements --compress-debug-sections=zlib.
In compare with D20211 (a year old patch, abandoned), it implementation
uses streaming and fully reimplemented, does not support zlib-gnu for
simplification.
This is PR32308.
Differential revision: https://reviews.llvm.org/D31941
llvm-svn: 300444
RELRO is a feature to make segments read-only after dynamic relocations
are applied. It is different from read-only segments because RELRO is
initially writable. And of course RELRO is different from writable
segments.
RELRO is not a very well known feature. We have a series of checks to
make a decision whether a section should be in a RELRO segment or not,
but we didn't describe why. This patch adds comments to explain how
that decision is made.
llvm-svn: 300176
Executable sections should not be padded with zero by default. On some
architectures, 0x00 is the start of a valid instruction sequence, so can confuse
disassembly between InputSections (and indeed the start of the next InputSection
in some situations). Further, in the case of misjumps into padding, padding may
start to be executed silently.
On x86, the "0xcc" byte represents the int3 trap instruction. It is a single
byte long so can serve well as padding. This change switches x86 (and x86_64) to
use this value for padding in executable sections, if no linker script directive
overrides it. It also puts the behaviour into place making it easy to change the
behaviour of other targets when desired. I do not know the relevant instruction
sequences for trap instructions on other targets however, so somebody should add
this separately.
Because the old behaviour simply wrote padding in the whole section before
overwriting most of it, this change also modifies the padding algorithm to write
padding only where needed. This in turn has caused a small behaviour change with
regards to what values are written via Fill commands in linker scripts, bringing
it into line with ld.bfd. The fill value is now written starting from the end of
the previous block, which means that it always starts from the first byte of the
fill, whereas the old behaviour meant that the padding sometimes started mid-way
through the fill value. See the test changes for more details.
Reviewed by: ruiu
Differential Revision: https://reviews.llvm.org/D30886
Bugzilla: http://bugs.llvm.org/show_bug.cgi?id=32227
llvm-svn: 299635
For range extension thunks we will need to repeatedly call createThunks()
until no more thunks are created. We will need to retain the state of
Thunks that we have created so far to avoid recreating them on later
passes. This change does not change the functionality of createThunks().
Differential Revision: https://reviews.llvm.org/D31654
llvm-svn: 299530
GNU linkers define __bss_start symbol.
Patch teaches LLD to do that. This is PR32051.
Below is part of standart ld.bfd script:
.data1 : { *(.data1) }
_edata = .; PROVIDE (edata = .);
. = .;
__bss_start = .;
.bss :
{
Currently LLD can emit up to 3 .bss* sections as one of testcase shows.
Implementation inserts this symbol before first .bss* output section.
Differential revision: https://reviews.llvm.org/D30419
llvm-svn: 299528
LinkerScript used to be a template class, so we couldn't instantiate
that class in elf::link. We instantiated ScriptConfig class earlier
instead so that the linker script parser can store configurations to
the object.
Now that LinkerScript is not a template, it doesn't make sense to
separate ScriptConfig from LinkerScript. This patch merges them.
llvm-svn: 298457
Patch moves Sections array to
InputFile (root class for input files).
That allows to detemplate GdbIndexSection.
Differential revision: https://reviews.llvm.org/D30976
llvm-svn: 298345
The patch introduces two new relocations expressions R_MIPS_GOT_GP and
R_MIPS_GOT_GP_PC. The first one represents a current value of `_gp`
pointer and used to calculate relocations against the `__gnu_local_gp`
symbol. The second one represents the offset between the beginning of
the function and the `_gp` pointer's value.
There are two motivations for introducing new expressions:
- It's better to keep all non-trivial relocation calculations in the
single place - `getRelocTargetVA` function.
- Relocations against both `_gp_disp` and `__gnu_local_gp` symbols
depend on the `_gp` value. It's a magical value points to the "middle"
of GOT. Now all relocations use a common `_gp` value. But in fact,
under some conditions each input file might require its own `_gp`
value. I'm going to implement it in the future patches. So it's
better to make `MipsGotSection` responsible for calculation of
the `_gp` value.
llvm-svn: 298306
This continues detemplation process.
Detemplating MipsGotSection<ELFT> is helpfull because can
help to detemplate getRelocTargetVA. (one more change is required)
It opens road to detemplation of GotSection<ELFT> and probably
something else after that.
Differential revision: https://reviews.llvm.org/D31090
llvm-svn: 298272
Does not introduce anything new,
just performs detemplate, using methods we
already have.
Differential revision: https://reviews.llvm.org/D30935
llvm-svn: 298269
We had a few Config member functions that returns configuration values.
For example, we had is64() which returns true if the target is 64-bit.
The return values of these functions are constant and never change.
This patch is to compute them only once to make it clear that they'll
never change.
llvm-svn: 298168
With fix of next warning:
Writer.cpp:361:3: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
Original commit message:
Patch reuses BssSection section to simplify creation of
COMMON section.
Differential revision: https://reviews.llvm.org/D30690
llvm-svn: 298086
Was fixed, details on review page.
Original commit message:
That removes CopyRelSection class completely, making
Bss/BssRelRo to be just regular synthetics.
This is splitted from D30541 and polished.
Difference from D30541 that all logic of SharedSymbol
converting to DefinedRegular was removed for now and
probably will be posted as separate patch.
Differential revision: https://reviews.llvm.org/D30892
llvm-svn: 298062
After introducing Config->is64Bit() and
recent changes in LinkerScriptBase, some
sections can be detemplated trivially. This
is one of such cases.
llvm-svn: 297825
StringTableSection was <ELFT> templated previously,
It disallow to de-template code that uses it,
for example LinkerScript<ELFT>::discard uses it as:
if (S == In<ELFT>::ShStrTab)
error("discarding .shstrtab section is not allowed");
It seems we can try to detemplate some of synthetic sections
and somehow make them available for non-templated calls.
(move out of In<ELFT> struct probably).
Differential revision: https://reviews.llvm.org/D30933
llvm-svn: 297815
That removes CopyRelSection class completely, making
Bss/BssRelRo to be just regular synthetics.
This is splitted from D30541 and polished.
Difference from D30541 that all logic of SharedSymbol
converting to DefinedRegular was removed for now and
probably will be posted as separate patch.
Differential revision: https://reviews.llvm.org/D30892
llvm-svn: 297814
Patch introduces Config->is64Bit() and with help of that detemplates
GotPltSection and IgotPltSection sections
Differential revision: https://reviews.llvm.org/D30944
llvm-svn: 297813
This also requires postponing the assignment the assignment of
symbols defined in input linker scripts since those can refer to
output sections and in case we don't have a SECTIONS command, we
need to wait until all output sections have been created and
assigned addresses.
Differential Revision: https://reviews.llvm.org/D30851
llvm-svn: 297802
We can move all not templated functionality to LinkerScriptBase.
Patch do that for hasPhdrsCommands() and shows how it helps to detemplate
things in other places.
Probably we should be able to merge these 2 classes into single one after such steps.
Even if not, it still looks as reasonable cleanup for me.
Differential revision: https://reviews.llvm.org/D30895
llvm-svn: 297714
lld crashes when .eh_frame or .eh_frame_hdr section is discarded
in linker script and there is no PHDRS directive.
Differential revision: https://reviews.llvm.org/D30885
llvm-svn: 297712
Fix a bug introduced in r297313 which caused them to resolve to the end
of the ELF header in PIEs and DSOs.
Differential Revision: https://reviews.llvm.org/D30843
llvm-svn: 297638
Using .eh_frame input section pattern in linker script currently
causes a crash; this is because .eh_frame input sections require
special handling since they're all combined into a synthetic
section rather than regular output section.
Differential Revision: https://reviews.llvm.org/D30627
llvm-svn: 297501
.eh_frame_hdr is a header constructed for .eh_frame sections.
We do not proccess .eh_frame when doing relocatable output,
so should not try to create .eh_frame_hdr too.
Previous behavior without this patch is segfault.
Fixes PR32118.
Differential revision: https://reviews.llvm.org/D30566
llvm-svn: 297365
With this we have a single section hierarchy. It is a bit less code,
but the main advantage will be in a future patch being able to handle
foo = symbol_in_obj;
in a linker script. Currently that fails since we try to find the
output section of symbol_in_obj. With this we should be able to just
return an InputSection from the expression.
llvm-svn: 297313
This change moves the calls to finalizeContent() for each synthetic section
before createThunks(). This will allow us to assign addresses prior to
calling createThunks(). As addition of thunks may add to the static
symbol table and may affect the size of the mips got section we introduce a
couple of additional member functions to update these values.
Differential revision: https://reviews.llvm.org/D29983
llvm-svn: 297277
In compare with D30458, this makes Bss/BssRelRo to be pure
synthetic sections.
That removes CopyRelSection class completely, making
Bss/BssRelRo to be just regular synthetics.
SharedSymbols involved in creating copy relocations are
converted to DefinedRegular, what also simplifies things.
Differential revision: https://reviews.llvm.org/D30541
llvm-svn: 297008
In many places we reset Size to 0 before calling assignOffsets()
manually. Sometimes we don't do that.
It looks we can just always do that inside.
Previous code had:
template <class ELFT> void OutputSection::assignOffsets() {
uint64_t Off = Size;
And tests feels fine with Off = 0.
I think Off = Size make no sence.
Differential revision: https://reviews.llvm.org/D30463
llvm-svn: 296609
Previously, these two functions put their symbols in different queues.
Now that the queues have been merged. So there's no point to keep two
separate functions.
llvm-svn: 296435
The list of all input sections was defined in SymbolTable class for a
historical reason. The list itself is not a template. However, because
SymbolTable class is a template, we needed to pass around ELFT to access
the list. This patch moves the list out of the class so that it doesn't
need ELFT.
llvm-svn: 296309
__ehdr_start should be pointing to ELF file headers, not program
headers.
This is a reland of D30319.
Differential Revision: https://reviews.llvm.org/D30323
llvm-svn: 296085
With this we complete the transition out of special output sections,
and with the previous patches it should be possible to merge
OutputSectionBase and OuputSection.
llvm-svn: 296023
With the current design an InputSection is basically anything that
goes directly in a OutputSection. That includes plain input section
but also synthetic sections, so this should probably not be a
template.
llvm-svn: 295993
This change moves the SymbolBodies with isLocal() == true before the global
symbols then calculating NumLocals rather than assuming all locals are
added before globals and the first NumLocals have isLocal() == true. This
permits Thunks to be moved after the pass that adds global symbols from
synthetics to the symbol table.
Differential revision: https://reviews.llvm.org/D30085
llvm-svn: 295650
This is a small difference I noticed to gold and bfd. When given
--print-gc-sections, we print sections a linkerscript marks
DISCARD. The other linkers don't.
llvm-svn: 295467
Without this we would produce two relocation sections pointing to the
same section, which gnu tools reject.
This fixes pr31986.
The implementation of -r/--emit-reloc is getting fairly
complicated. But lets get the test passing before trying to refactor
it.
llvm-svn: 295385
Previously, space in a BSS section for copy relocations are reserved
in a special way. We directly manipulated size of the BSS section.
r294577 changed the way of doing it. Now, we create an instance of
CopyRelSection (which is a synthetic input section) for each copy
relocation.
This patch removes the remains of the old way and add CopyRelSections
to BSS sections using `addSections` function, which is the usual
way to add an input section to an output section.
llvm-svn: 295278
For CloudABI I'm only interested in generating non-PIC/PIE executables
on armv6 and i686, as PIE introduces larger overhead than on aarch64 and
x86_64. Still, I want to be able to instruct the linker to generate a
dynamic symbol table if requested. One example use for this is that
dynamic symbol tables can be used by programs to print nicely formatted
stacktraces, including symbol names.
Right now there seems to be some logic in LLD that it only wants to emit
dynamic symbol tables when either linking against libraries or when
building PIC. Let's extend this to also take --export-dynamic into
account.
Reviewed by: ruiu
Differential Revision: https://reviews.llvm.org/D29982
llvm-svn: 295240
Unfortunately some consumers of our .o files produced with -r expect
only one section symbol per section. That is true of at least of go's
own linker.
Combining them is a somewhat convoluted process. We have to create a
symbol for every section since we don't know which ones will be
needed. The relocation sections also have to be written first to
handle the Elf_Rel addend.
I did consider a completely different approach:
We could remove the -r special case of relocation sections when
reading. We would instead have a copyRelocs function that is used
instead of scanRelocs. It would create a DynamicReloc for each
relocation and a RelocationSection for each input relocation section.
A complication of such change is that DynamicReloc would have to take
a section index and a input section instead of a symbol since with
-emit-relocs some DynamicReloc would hold relocations referring to the
dynamic symbol table and other to the static symbol table.
That would be a pretty big change, and if we do it it is probably
better to do it as a refactoring.
llvm-svn: 294816
Much of the code in PltSection and IPltSection is similar, we identify
the IPlt by a HeaderSize of 0 and alter our behaviour in the member
functions appropriately:
-Iplt does not have a header
-Iplt always follows after the Plt
Differential Revision: https://reviews.llvm.org/D29664
llvm-svn: 294579
with temporarily file name fix in testcase.
Original commit message:
-q, --emit-relocs - Generate relocations in output
Simplest implementation:
* no GC case,
* no "/DISCARD/" linkerscript command support.
This patch is extracted from D28612 / D29636,
Relative to PR31579.
Differential revision: https://reviews.llvm.org/D29663
llvm-svn: 294469
-q, --emit-relocs - Generate relocations in output
Simplest implementation:
* no GC case,
* no "/DISCARD/" linkerscript command support.
This patch is extracted from D28612 / D29636,
Relative to PR31579.
Differential revision: https://reviews.llvm.org/D29663
llvm-svn: 294464
With a synthetic merge section we can have, for example, a single
.rodata section with stings, fixed sized constants and non merge
constants.
I can be simplified further by not setting Entsize, but that is
probably better done is a followup patch.
This should allow some cleanup in the linker script code now that
every output section command maps to just one output section.
llvm-svn: 294005
Instead of creating multiple PHDRs in a single loop, this patch
runs one for loop for each PHDR type. I think this improves code
readability.
llvm-svn: 293832
There could be multiple discontiguous output .note sections in which
case we need to put these into separate PT_NOTE segments rather then
placing them into a single segment. Where possible, we could reorder
the input sections to make sure that all .note are layed out next to
each other to avoid creation multiple PT_NOTE segments, but even in
that case, it's still possible to construct a discontiguous case e.g.
by using a linker script.
Differential Revision: https://reviews.llvm.org/D29364
llvm-svn: 293811
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
first caller to the Thunk.
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.
This is a recommit of r293283 with a fixed comparison predicate as
std::merge requires a strict weak ordering.
Differential revision: https://reviews.llvm.org/D29327
llvm-svn: 293757
If no bss sections appear after the relro segment, the loader will round
the r/w segment size to the target's page size. Align the relro size in the
same way to ensure that it does not extend past the end of the program's
own memory region.
Differential Revision: https://reviews.llvm.org/D29242
llvm-svn: 293519
The symbols _end, end, _etext, etext, _edata, edata and __ehdr_start
refer to positions in the file and are therefore not absolute. Making
them absolute was on unfortunate cargo cult of what bfd was doing.
Changing the symbols allows for pc relocations to them to be resolved,
which should fix the wine build.
llvm-svn: 293385
[ELF] Fixed formatting. NFC
and
[ELF] Bypass section type check
Differential revision: https://reviews.llvm.org/D28761
They do the opposite of what was asked for in the code review.
llvm-svn: 293320
Thunks are now implemented by redirecting the relocation to the
symbol S, to a symbol TS in a Thunk. The Thunk will transfer control
to S. This has the following implications:
- All the side-effects of Thunks happen within createThunks()
- Thunks are no longer stored in InputSections and Symbols no longer
need to hold a pointer to a Thunk
- The synthetic Thunk sections need to be merged into OutputSections
This implementation is almost a direct conversion of the existing
Thunks with the following exceptions:
- Mips LA25 Thunks are placed before the InputSection that defines
the symbol that needs a Thunk.
- All ARM Thunks are placed at the end of the OutputSection of the
first caller to the Thunk.
Range extension Thunks are not supported yet so it is optimistically
assumed that all Thunks can be reused.
Differential Revision: https://reviews.llvm.org/D29129
llvm-svn: 293283
It now uses the same infrastructure as symbol versions. This fixes us
creating a dynamic relocation without a corresponding dynamic symbol.
This also means that unlike gold and bfd we keep a STB_LOCAL in the
static symbol table. It seems an odd feature to offer precise control
over what is in a symbol table that is not used by the dynamic
linker. We can bring this back if needed, but it would probably be
better to just have --discard option that tells the linker to keep in
the static symbol table only what is in the dynamic one.
Should fix the eog build.
llvm-svn: 293093
Mapping symbols allow a mapping symbol aware disassembler to
correctly disassemble the PLT when the code immediately prior to the
PLT is Thumb.
To implement this we add a function to add symbols with local
binding to be defined in SyntheticSymbols.
Differential Revision: https://reviews.llvm.org/D28956
llvm-svn: 293044
Previously we stored kept locals in a KeptLocalSyms arrays,
belonged to files.
Patch makes SymbolTableSection to store locals in Symbols member,
that already present and was used for globals.
SymbolTableSection already had NumLocals counter member, so change
itself is trivial.
That allows to simplify handling of -r,
Body::DynsymIndex is no more used as "symbol table index" for relocatable
output.
Change was suggested during review of D28773 and opens road for D28612.
Differential revision: https://reviews.llvm.org/D29021
llvm-svn: 292789
A necessary first step towards range extension thunks is to delay
the creation of thunks until the layout of InputSections within
OutputSections has been done.
The change scans the relocations directly from InputSections rather
than looking in the ELF File the InputSection came from. This will
allow a future change to redirect the relocations to symbols defined
by Thunks rather than indirect when resolving relocations.
A side-effect of moving ThunkCreation is that the OutSecOff of
InputSections may change in an OutputSection that contains Thunks.
In well behaved programs thunks are not in OutputSections with
dynamic relocations.
Differential Revision: https://reviews.llvm.org/D28811
llvm-svn: 292359
This patch adds a test for an invalid output path for -Map option,
though that test is not for verifying that we are using error()
instead of fatal() in writeMapFile.
llvm-svn: 292336
Patch changes behavior to not try open the output
file if we already know about error.
That is not just cleaner, but also fixes nasty behavior of linker that
could create huge temporarily files under certain conditions.
Differential revision: https://reviews.llvm.org/D28107
llvm-svn: 292219
On MIPS .got section cannot be included into the PT_GNU_RELRO segment.
Sometimes it might work, but in general it is unsupported. One of the
problem is that all sections marked by SHF_MIPS_GPREL should be grouped
together because data in these sections is addressable with a gp
relative address, but such sections might be writable.
This patch exclude mips .got from PT_GNU_RELRO segment and group
SHF_MIPS_GPREL sections.
llvm-svn: 292161
These were 3 last synthetics that were added in addPredefinedSections() instead
of createSyntheticSections(). Now it is possible to move addition to correct common place.
Also patch fixes testcase which discards .shstrtab, by restricting doing that.
Differential revision: https://reviews.llvm.org/D28561
llvm-svn: 291908
When reserving copy relocation space for a shared symbol, scan the DSO's
program headers to see if the symbol is in a read-only segment. If so,
reserve space for that symbol in a new synthetic section named .bss.rel.ro
which will be covered by the relro program header.
This fixes the security issue disclosed on the binutils mailing list at:
https://sourceware.org/ml/libc-alpha/2016-12/msg00914.html
Differential Revision: https://reviews.llvm.org/D28272
llvm-svn: 291524
This is in preparation for my next change, which will introduce a relro
nobits section. That requires that relro sections appear at the end of the
progbits part of the r/w segment so that the relro nobits section can appear
contiguously.
Because of the amount of churn required in the test suite, I'm making this
change separately.
llvm-svn: 291523
This patch enables something like "-o /dev/null".
Previouly, it failed because such files cannot be renamed.
Differential Revision: https://reviews.llvm.org/D28010
llvm-svn: 291496
The glibc dynamic loader rounds the size down, so without this the loader
will fail to change the memory protection for the last page.
Differential Revision: https://reviews.llvm.org/D28267
llvm-svn: 290986
DefinedSynthetic is not created for a real ELF object, so it doesn't
have to be a template function. It has a virtual st_value, which is
either 32 bit or 64 bit, but we can simply use 64 bit.
llvm-svn: 290241
That was requested by Mark Kettenis in llvm-dev:
"It is the intention that .openbsd.randomdata sections are made
read-only after initialization. The native (ld.bfd based) OpenBSD
toolchain accomplishes this by including .openbsd.randomdata into the
PT_GNU_RELRO segment."
He suggested code change, I added testcase.
Differential revision: https://reviews.llvm.org/D27974
llvm-svn: 290174
That variable was of type DenseMap<StringRef, unsigned>, but the
unsigned numbers needed to be monotonicly increasing numbers because
the implementation that used the variable depended on that fact.
That was an implementation detail and shouldn't have leaked into Config.
This patch simplifies its type to std::vector<StringRef>.
llvm-svn: 290151
This handles all the corner cases if setting a section address:
- If the address is too low, we cannot allocate the program headers.
- If the load address is lowered, we have to do that before finalize
This also shares some code with the linker script since it was already
hitting similar cases.
This is used by the freebsd boot loader. It is not clear if we need to
support this with a non binary output, but it is not as bad as I was
expecting.
llvm-svn: 290136
--retain-symbols-file=filename
Retain only the symbols listed in the file filename, discarding all others.
filename is simply a flat file, with one symbol name per line. This option
is especially useful in environments (such as VxWorks) where a large global
symbol table is accumulated gradually, to conserve run-time memory.
Note: though documentation says "--retain-symbols-file does not discard
undefined symbols, or symbols needed for relocations.", both bfd and gold
do that, and this patch too, like testcase show.
Differential revision: https://reviews.llvm.org/D27716
llvm-svn: 290122
Use of CachedHashStringRef makes sense only when we reuse hash values.
Sprinkling it to all DenseMap has no benefits and just complicates data types.
Basically we shouldn't use CachedHashStringRef unless there is a strong
reason to to do so.
llvm-svn: 290076
I thought for a while about how to remove it, but it looks like we
can just copy the file for now. Of course I'm not happy about that,
but it's just less than 50 lines of code, and we already have
duplicate code in Error.h and some other places. I want to solve
them all at once later.
Differential Revision: https://reviews.llvm.org/D27819
llvm-svn: 290062
This change introduces new synthetic sections IpltSection, IgotPltSection
that represent the ifunc entries that would previously have been put in
the PltSection and the GotPltSection. The separation makes sure that
the R_*_IRELATIVE relocations are placed after the non R_*_IRELATIVE
relocations, which permits ifunc resolvers to know that the .got.plt
slots will be initialized prior to the resolver being called.
A secondary benefit is that for ARM we can move the IgotPltSection and its
dynamic relocations to the .got and .rel.dyn as the ARM glibc expects all
the R_*_IRELATIVE relocations to be in the .rel.dyn
Differential revision: https://reviews.llvm.org/D27406
llvm-svn: 289045
These MIPS specific symbols should be global because in general they can
have an arbitrary value. By default this value is a fixed offset from .got
section.
This patch adds more checks to the mips-gp-local.s test case but marks
it as XFAIL because LLD does not allow redefinition of absolute symbols
value by a linker script. This should be fixed by D27276.
Differential revision: https://reviews.llvm.org/D27524
llvm-svn: 289025
Shared libraries should have entry set following the same rules as for
regular binaries. The only difference is that in case the default entry
point (_start or __start) isn't found (unless it was set explicitly), we
shouldn't give a warning as in case of regular binaries.
Differential Revision: https://reviews.llvm.org/D27497
llvm-svn: 288878
If we do, the freebsd dynamic linker tries to call mmap with a size 0,
which fails.
It is hard to avoid creating them when linker scripts are used, so we
just delete empty PT_LOADs at the end.
llvm-svn: 288808
On Linux (and probably on other Unix-like systems), unlink(2) is
noticeably slow. It takes 250 milliseconds to remove a 1 GB file
on ext4 filesystem on my machine, whether the file is on SSD or
on a spinning disk.
To create a new result file, we remove existing file first. So, if
you repeatedly link a 1 GB program in a regular compile-link-debug
cycle, every cycle wastes 250 milliseconds only to remove a file.
Since LLD can link a 1 GB in about 5 seconds, that waste actually
matters.
This patch defines `unlinkAsync` function. The function spawns a
background thread to call unlink. The calling thread returns
almost immediately.
Differential Revision: https://reviews.llvm.org/D27295
llvm-svn: 288680
Binary output feature is a bit confuzing. bfd and gold output differs a lot sometimes,
though it is important for FreeBSD mbr loaders.
Patch change the way how we compute file offsets for binary output.
This fixes PR31196.
Previously offsets were calculated basing on offsets and addresses of sections
from the same loads:
if (Sec == First)
return alignTo(Off, Target->MaxPageSize, Sec->Addr);
return First->Offset + Sec->Addr - First->Addr;
bfd assigns offsets for each section to VA - MinVA:
https://github.com/redox-os/binutils-gdb/blob/master/bfd/binary.c#L27https://github.com/redox-os/binutils-gdb/blob/master/bfd/binary.c#L255
(LMA == VA usually)
This patch for now just stops creating phdrs for binary output.
An effect from this that no any additional calculation for offset is performed:
uintX_t getFileAlignment(uintX_t Off, OutputSectionBase *Sec) {
OutputSectionBase *First = Sec->FirstInPtLoad;
// If the section is not in a PT_LOAD, we have no other constraint.
if (!First)
return Off; //**First is always null, condition always happens**
That is enough now with combination of another patch to generate output
that is similar to what bfd produce for mbr loader.
Differential revision: https://reviews.llvm.org/D27341
llvm-svn: 288580
This change continues what was started by D27040
Now all allocatable synthetics should be available from script side.
Differential revision: https://reviews.llvm.org/D27131
llvm-svn: 288150
-N (-omagic)
Set the text and data sections to be readable and writable.
Also, do not page-align the data segment.
Differential revision: https://reviews.llvm.org/D26888
llvm-svn: 288123
That unifies handling cases when we have SECTIONS and when
-no-rosegment is given in compareSectionsNonScript()
Now Config->SingleRoRx is used for check, testcase is provided.
llvm-svn: 288022
--no-rosegment: Do not put read-only non-executable sections in their own segment
Differential revision: https://reviews.llvm.org/D26889
llvm-svn: 288020
Unfortunatelly PT_ARM_EXIDX is special. There is no way to create it
from linker scripts, so we have to create it even if PHDRS is used.
This matches bfd and is required for the lld output to survive bfd's strip.
llvm-svn: 288012
This is important for cases like:
.sdata : {
*(.got.plt .got)
...
}
That was not supported before as there was no way to get access to
synthetic sections from script.
More details on review page.
Differential revision: https://reviews.llvm.org/D27040
llvm-svn: 287913
The .ARM.exidx table has an entry for each function with the first entry
giving the start address of the function, the table is sorted in ascending
order of function address. Given a PC value, the unwinder will search the
table for the entry that contains the PC value.
If the table entry happens to be the last, the range of the addresses that
the final unwinding table describes will extend to the end of the address
space. To prevent an incorrect address outside the address range of the
program matching the last entry we follow ld.bfd's example and add a
sentinel EXIDX_CANTUNWIND entry at the end of the table. This gives the
final real table entry an upper bound.
In addition the llvm libunwind unwinder currently depends on the presence
of a sentinel entry (PR31091).
Differential revision: https://reviews.llvm.org/D26977
llvm-svn: 287869
Previously, if a symbol specified by -e or ENTRY() is not found,
we didn't set entry point address. That is incompatible with GNU
because GNU linkers set the first address of .text to entry.
This patch implement that behavior.
llvm-svn: 287836
Offset between beginning of a .got section and _gp symbols used in MIPS
GOT relocations calculations. Usually the expression looks like
VA + Offset - GP, where VA is the .got section address, Offset - offset
of the GOT entry, GP - offset between .got and _gp. Also there two "magic"
symbols _gp_disp and __gnu_local_gp which hold the offset mentioned above.
These symbols might be referenced by MIPS relocations.
Now the linker always defines _gp symbol and uses hardcoded value for
its initialization. So offset between .got and _gp is 0x7ff0. The _gp_disp
and __gnu_local_gp defined if required and initialized by 0x7ff0.
In fact that is not correct because _gp symbol might be defined by a linker
script and holds arbitrary value. In that case we need to use this value
in relocation calculation and initialize _gp_disp and __gnu_local_gp
properly.
The patch fixes the problem and completes fixing the bug #30311.
https://llvm.org/bugs/show_bug.cgi?id=30311
Differential revision: https://reviews.llvm.org/D27036
llvm-svn: 287832
We have different functions to stringize objects to construct
error messages. For InputFile, we have getFilename, and for
InputSection, we have getName. You had to memorize them.
I think this is the case where the function overloading comes in handy.
This patch defines toString() functions that are overloaded for all these
types, so that you just call it in error().
Differential Revision: https://reviews.llvm.org/D27030
llvm-svn: 287787
Previously, we stored offsets in string tables to symbols, so
you needed to pass a string table to get a symbol name. This patch
stores const char pointers instead to eliminate the need to pass
a string table.
llvm-svn: 287737
If the linker script has SECTIONS, the address computation is now
always done in LinkerScript::assignAddresses, like for any other
section.
Before fixHeaders would do a tentative computation that
assignAddresses would sometimes override.
This patch also splits the cases where assignAddresses needs to add
the headers to the first PT_LOAD and the address computation. The net
effect is that we no longer create an empty page for no reason in the
included test case, which matches bfd behavior.
llvm-svn: 287565
MergeOutputSection class was a bit hard to use because it provdes
a series of finalize functions that have to be called in a right way
at a right time. It also intereacted with MergeInputSection, and the
logic was somewhat entangled between the two classes.
This patch simplifies it by providing only one finalize function.
Now, all you have to do is to call MergeOutputSection::finalize
when you have added all sections to the output section. Then, it
internally merges strings and initliazes StringPiece objects.
I think this is much easier to understand.
This patch also adds comments.
llvm-svn: 287314
I hit an internal linker script that was defining _DYNAMIC instead of
letting the linker do it. It turns out that both bfd and gold allow
that.
This is pretty easy to implement, just make the linker defined symbol
weak. This should have no impact in the case where there is no user
defined symbol: The visibility is hidden, which causes the output to
still be local.
llvm-svn: 287260
MIPS GOT handling is very different from other targets so it is better
to keep the code in the separatre section class MipsGotSection. This
patch introduces the new section and moves all MIPS specific code from
GotSection to the new class. I did not rename fields and methods in the
MipsGotSection class to reduce the diff and plan to do that by the
separate commit.
Differential revision: https://reviews.llvm.org/D26733
llvm-svn: 287150
This patch introduces the following changes:
- DynamicSection now inherits InputSection<ELFT> and was moved
to SyntheticSections.h/.cpp.
- Link and Entsize fields of DynamicSection are propagated to
its output section
- In<ELFT>::SyntheticSections was removed.
- Finalization of synthetic sections was removed from
OutputSection<ELFT>::finalize. Now finalizeSyntheticSections is
used instead.
Differential revision: https://reviews.llvm.org/D26603
llvm-svn: 286950
This patch stops creating symbols like __ehdr_start,
_end/_etext_edata,__tls_get_addr when using -r.
This fixes PR30984.
Differential revision: https://reviews.llvm.org/D26600
llvm-svn: 286941
Patch adds a filename to that error message.
I faced next error when debugged one of FreeBSD port:
error: relocation R_X86_64_PLT32 cannot refer to absolute symbol __tls_get_addr
error message was poor and this patch improves it to show the locations
of symbol declaration and using.
Differential revision: https://reviews.llvm.org/D26508
llvm-svn: 286940
Propagate program headers by walking the commands, not the
sections. This allows us to propagate program headers even from
sections that don't end up in the output.
Fixes pr30997.
llvm-svn: 286837
Unlike gold, bfd, gas or MC we were putting exidx sections first since
they are ro.
The spec doesn't explicitly say that they must come after, but it is
definitely more convenient for the consumer, matches other producers
and matches other areas in ELF (like SHT_GROUP) where sections are
ordered in a natural way.
llvm-svn: 286659
We would create a MergeInputSection for the synthetic .comment and
crash trying to add it to a regular output section.
With this we just don't add the synthetic section with -r. That is
consistent with gold that doesn't create .note.gnu.gold-version with
-r.
llvm-svn: 286635
Summary:
This patch adds a ".comment" section to an output. The comment
section contains the linker's version string. You can now
find out whether a binary is created by LLD or not using objdump
command like this.
$ objdump -s -j .comment foo
foo: file format elf64-x86-64
Contents of section .comment:
0000 00474343 3a202855 62756e74 7520342e .GCC: (Ubuntu 4.
0010 382e342d 32756275 6e747531 7e31342e 8.4-2ubuntu1~14.
...
00c0 766d2f74 72756e6b 20323835 38343629 vm/trunk 285846)
00d0 004c696e 6b65723a 204c4c44 20342e30 .Linker: LLD 4.0
00e0 2e302028 7472756e 6b203238 36343036 .0 (trunk 286406
00f0 2900 ).
Compilers emits .comment section as well, so the output contains
both compiler and linker information.
Alternative considered:
I first tried to add a SHT_NOTE section because GNU gold does that.
A NOTE section starts with a header which contains content type.
It turned out that ld.gold sets type NT_GNU_GOLD_VERSION to their
NOTE section. So the NOTE type is only for GNU gold (surprise!)
Next, I tried to create ".linker-version" section. However, it seems
that reusing the existing ".comment" section is better because 1)
other tools already know about .comment section and is able to strip
it and 2) the result contans not only linker info but also compiler
info.
Differential Revision: https://reviews.llvm.org/D26487
llvm-svn: 286496
Relocations are the last thing that we wore storing a raw section
pointer to and parsing on demand.
With this patch we parse it only once and store a pointer to the
actual data.
The patch also changes where we store it. It is now in
InputSectionBase. Not all sections have relocations, but most do and
this simplifies the logic. It also means that we now only support one
relocation section per section. Given that that constraint is
maintained even with -r with gold bfd and lld, I think it is OK.
llvm-svn: 286459
Patch allows to pass a symbols file to linker.
LLD will map symbols to sections and sort sections
in output according to symbol ordering file.
That can help to reduce the startup time and/or
amount of pagefaults during startup.
Also, interesting benchmark result was produced by Rafael Espíndola.
After applying the symbols file for clang he timed compiling
X86MCTargetDesc.ii to an object file.
The page faults went from just
56,988 to 56,946 since most faults are not in the binary.
Running time went from 4.403053515 to 4.178112244.
The speedup seems to be because of better cache
locality.
Differential revision: https://reviews.llvm.org/D26130
llvm-svn: 286440
The disadvantage is that we use uint64_t instad of uint32_t for some
value in 32 bit files. The advantage is a substantially simpler code,
faster builds and less code duplication.
llvm-svn: 286414
Previously, we have both input and output section for .MIPS.abiflags.
Now we have only one class for .MIPS.abiflags, which is MipsAbiFlagsSection.
This class is a synthetic input section.
.MIPS.abiflags sections are handled as regular sections until
the control reaches Writer. Writer then aggregates all sections
whose type is SHT_MIPS_ABIFLAGS to create a single synthesized
input section. The synthesized section is then processed normally
as if it came from an input file.
llvm-svn: 286398
Previously, we have both input and output sections for .reginfo and
.MIPS.options. Now for each such sections we have one synthetic input
sections: MipsReginfoSection and MipsOptionsSection respectively.
Both sections are handled as regular sections until the control reaches
Writer. Writer then aggregates all sections whose type is SHT_MIPS_REGINFO
or SHT_MIPS_OPTIONS to create a single synthesized input section. In that
moment Writer also save GP0 value to the MipsGp0 field of the corresponding
ObjectFile. This value required for R_MIPS_GPREL16 and R_MIPS_GPREL32
relocations calculation.
Differential revision: https://reviews.llvm.org/D26444
llvm-svn: 286397
This is similar to what was done for InputSection.
With this the various fields are stored in host order and only
converted to target order when writing.
llvm-svn: 286327
A CommonInputSection is a section containing all common symbols.
That was an input section but was abstracted in a different way
than the synthetic input sections because it was written before
the synthetic input section was invented.
This patch rewrites CommonInputSection as a synthetic input section
so that it behaves better with other sections.
llvm-svn: 286053
Previously, we do this piece of code to iterate over all input sections.
for (elf::ObjectFile<ELFT> *F : Symtab.getObjectFiles())
for (InputSectionBase<ELFT> *S : F->getSections())
It turned out that this mechanisms doesn't work well with synthetic
input sections because synthetic input sections don't belong to any
input file.
This patch defines a vector that contains all input sections including
synthetic ones.
llvm-svn: 286051
Previously, we added strings from DynamicSection::finalize().
It was a bit tricky because finalize() is supposed to fix the final
size of the section, but adding new strings would change the size of
.dynstr section. So there was a dependency between finalize functions
of .dynamic and .dynstr.
However, I noticed that we can elimiante the dependency by simply
add strings early; we don't have to do that in finalize() but can do
from DynamicSection's ctor.
This patch defines a new function, DynamicSection::addEntries, to
add .dynamic entries that doesn't depend on other sections.
llvm-svn: 285784
We are going to have many more classes for linker-synthesized
input sections, so it's worth to be added to a separate file
than to the file for regular input sections.
llvm-svn: 285740
Previously, we have a lot of BumpPtrAllocators, but all these
allocators virtually have the same lifetime because they are
not freed until the linker finishes its job. This patch aggregates
them into a single allocator.
Differential revision: https://reviews.llvm.org/D26042
llvm-svn: 285452
Instead of having 3 section allocators per file, have 3 for all files.
This is a substantial performance improvement for some cases. Linking
chromium without gc speeds up by 1.065x.
This requires using _exit in fatal since we have to avoid destructing
an InputSection if fatal is called from the constructor.
Thanks to Rui for the suggestion.
llvm-svn: 285290
When static linking in ARM (like Mips) __tls_get_addr is defined by
the library so we should not define it as a synthetic.
We also need to add __exidx_start and __exidx_end for the .ARM.exidx
section as the static libc library startup code is expecting them to
be defined by the default linker script for static linking on ARM.
Differential revision: https://reviews.llvm.org/D25978
llvm-svn: 285279
As the state of lld gets more complicated, shutting down gets more
expensive.
In a normal lld run we can just call _exit immediately after renaming
the temporary output file. We still want the ability to run a full
shutdown since that is useful for detecting memory leaks.
This patch adds a --full-shutdown flag and changes lit to use it.
llvm-svn: 285224
Instead of storing a pointer, store the members we need.
The reason for doing this is that it makes it far easier to create
synthetic sections. It also avoids reading data from files multiple
times., which might help with cross endian linking and host
architectures with slow unaligned access.
There are obvious compacting opportunities, but this already has mixed
results even on native x86_64 linking.
There is also the possibility of better refactoring the code for
handling common symbols, but this already shows that a custom class is
not necessary.
llvm-svn: 285148
We were fairly inconsistent as to what information should be accessed
with getSectionHdr and what information (like alignment) was stored
elsewhere.
Now all section info has a dedicated getter. The code is also a bit
more compact.
llvm-svn: 285079
We were previously using the (static) addSynthetic function to create
*_start/*_end symbols. This function was doing almost the same thing as
addOptionalSynthetic, except that it would also create the symbol in the
case where it is unreferenced. Because the symbol has hidden visibility,
creating it in that case would have no effect other than adding another
entry to the static symbol table. Remove addSynthetic and change callers to
use addOptionalSynthetic instead.
Differential Revision: https://reviews.llvm.org/D25545
llvm-svn: 285021
In this patch partial gdb_index section is created.
For costructing the .gdb_index section 6 steps should be performed (details are in
SplitDebugInfo.cpp file header), this patch do first 3:
Creates proper section header.
Fills list of compilation units.
Types CU list area is not supposed to be supported, so it is ignored and therefore
can be treated as implemented either.
Differential revision: https://reviews.llvm.org/D24706
llvm-svn: 284708
Previously, we were checking the existence of an entry symbol
too early. It was done before the linker script processor creates
symbols defined in scripts. Fixes bug 30743.
llvm-svn: 284676
This is 30646.
PT_OPENBSD_RANDOMIZE
The array element specifies the location and size of a part of the memory image of the program that must be filled with random data before any code in the object is executed. The memory region specified by a segment of this type may overlap the region specified by a PT_GNU_RELRO segment, in which case the intersection will be filled with random data before being marked read-only.
Reference links:
http://man.openbsd.org/OpenBSD-current/man5/elf.5c494713c45
Differential revision: https://reviews.llvm.org/D25469
llvm-svn: 284234
-z wxneeded creates a PHDR PT_OPENBSD_WXNEEDED.
PT_OPENBSD_WXNEEDED
The array element specifies that a process executing this file may need to be able to map or protect memory regions as simultaneously executable and writable. If the system is unable or unwilling to permit that for this executable then it may fail immediately. This segment type is meaningful only for executable files and is ignored in other objects.
http://man.openbsd.org/OpenBSD-current/man5/elf.5
Differential revision: https://reviews.llvm.org/D25472
llvm-svn: 284226
Previously we would fail to synthesise a __start_ or __stop_ symbol if
there existed a definition in a DSO. Instead, we would try to link against
the DSO definition. This became possible after D23552 when linking against
lld-produced DSOs but could in principle also occur when linking against
DSOs produced by other linkers.
Not only does it seem more likely that a user would expect the resolved
definition to be local to the executable, but if a __start_ or __stop_
symbol was synthesised by the linker, it is effectively impossible to link
against correctly from a non-PIC executable in a read-only section. Neither
a PLT nor a copy relocation would give us the right semantics here. The only
way the link could succeed is if the executable provided its own synthetic
definition of the symbol.
The fix is to also synthesise the definition if the only definition comes
from a DSO. Since this is what the addOptionalSynthetic function does,
switch to using that function.
Fixes PR30680.
Differential Revision: https://reviews.llvm.org/D25544
llvm-svn: 284168
Previously, we supported only SHF_COMPRESSED sections because it's
new and it's the ELF standard. But there are object files compressed
in the GNU style out there, so we had to support it.
Sections compressed in the GNU style start with ".zdebug_" and
contain different headers than the ELF standard's one. In this
patch, getRawCompressedData is responsible to handle it.
A tricky thing about GNU-style compressed sections is that we have
to rename them when creating output sections. ".zdebug_" prefix
implies the section is compressed. We need to rename ".zdebug_"
".debug" because our output sections are not compressed.
We do that in this patch.
llvm-svn: 284068
This part was splitted from D25016.
When sh_info value was set in the way that non-local symbol was treated as local, lld
was asserting, patch fixes that.
Differential revision: https://reviews.llvm.org/D25371
llvm-svn: 283859
Absolute local symbols with name staring from ".L" were reason of crash.
The same could happen when using some broken inputs found by AFL.
Patch fixes that.
Differential revision: https://reviews.llvm.org/D25365
llvm-svn: 283731
The .ARM.exidx sections contain a table. Each entry has two fields:
- PREL31 offset to the function the table entry describes
- Action to take, either cantunwind, inline unwind, or PREL31 offset to
.ARM.extab section
The table entries must be sorted in order of the virtual addresses the
first entry of the table describes. Traditionally this is implemented by
the SHF_LINK_ORDER dependency. Instead of implementing this directly we
sort the table entries post relocation.
The .ARM.exidx OutputSection is described by the PT_ARM_EXIDX program
header
Differential revision: https://reviews.llvm.org/D25127
llvm-svn: 283730
Since they end up going on the same PT_LOAD, there is no reason to
sort them. This matches bfd's behaviour and is user visible in the
placement of orphan sections.
llvm-svn: 282799
If there is not sufficient address space, just give up and don't put
the header in the PT_LOAD.
This matches bfd behaviour and I found at least one script that
depends on having a section at address 0.
llvm-svn: 282750
If we two sections reside in the same PT_LOAD segment,
we compute second section using the following formula:
Off2 = Off1 + VA2 - VA1. This allows OS kernel allocating
sections correctly when loading an image.
Differential revision: https://reviews.llvm.org/D25014
llvm-svn: 282705
This matches the behavior of Binutils linkers. We also change the
default MaxPageSize on x86-64 to 0x1000 to preserver the current
behavior, which is the same as the behavior implemented by gold.
https://llvm.org/bugs/show_bug.cgi?id=30541
Differential Revision: https://reviews.llvm.org/D24987
llvm-svn: 282560
If section contains local symbols ldd crashes, because local
symbols are added to symbol table before section is discarded
by linker script processor. This patch calls copyLocalSymbols()
after createSections, so discarded section symbols are not copied
llvm-svn: 282244
This reverts commit r282021, bringing back r282015.
The problem was that the comparison function was not a strict weak
ordering anymore, which this patch fixes.
Original message:
Only restrict order if both sections are in the script.
This matches gold and bfd behavior and is required to handle some scripts.
The script has to assume where PT_LOADs start in order to align that
spot. If we don't allow section it doesn't know about to move to the
middle, we can need more PT_LOADs and those will not be aligned.
llvm-svn: 282035
Linker scripts are responsible for aliging '.'. Since they are
designed for bfd which has no --rosegment, they don't align the RO to
RX transition.
llvm-svn: 281978
InputSection<ELFT>::Discarded has no name and it's not backed by
a file. Trying to report it as discared will cause a nullptr
dereference, therefore a crash. Skip it.
Differential Revision: https://reviews.llvm.org/D24731
llvm-svn: 281946
The InputSection variables in the Writer were named `C`. This
was because when the ELF linker was ported (from COFF)
the name `Chunks` for input sections was retained.
Luckily we switched to a more ELF-compliant jargon, but these
variables weren't reanamed accordingly during the transition.
llvm-svn: 281917
This matches gold and bfd, and is pretty much required by some linker
scripts. They end with commands like
foo 0 : { *(bar) }
if we put any SHF_ALLOC sections after they can have an address that
is too low.
llvm-svn: 281778
--section-start=sectionname=org
Locate a section in the output file at the absolute address given by org.
You may use this option as many times as necessary to locate multiple sections in the command line.
org must be a single hexadecimal integer; for compatibility with other linkers,
you may omit the leading `0x' usually associated with hexadecimal values.
Note: there should be no white space between sectionname, the equals sign (“<=>”), and org.
-Tbss=org
-Tdata=org
-Ttext=org
Same as --section-start, with .bss, .data or .text as the sectionname.
Differential revision: https://reviews.llvm.org/D24294
llvm-svn: 281458
Previously, all input files were owned by the symbol table.
Files were created at various places, such as the Driver, the lazy
symbols, or the bitcode compiler, and the ownership of new files
was transferred to the symbol table using std::unique_ptr.
All input files were then free'd when the symbol table is freed
which is on program exit.
I think we don't have to transfer ownership just to free all
instance at once on exit.
In this patch, all instances are automatically collected to a
vector and freed on exit. In this way, we no longer have to
use std::unique_ptr.
Differential Revision: https://reviews.llvm.org/D24493
llvm-svn: 281425
Without this flag set, an AArch64 Linux kernel won't try to load the executable
(even if a 32 bit arm kernel will run the binary just fine).
Patch by Martin Storsjö!
Differential revision: https://reviews.llvm.org/D24471
llvm-svn: 281394
This simplifies error handling as there is now only one place in the
code that needs to consider the possibility that the name is
corrupted. Before we would do it in every access.
llvm-svn: 280937
On most architectures the linker is required to optimize away any
references to __tls_get_addr in case of static linking. As usual
a special case is MIPS - libc defines __tls_get_addr itself because
there are no TLS optimizations for this architecture.
llvm-svn: 280664
The primary use of build-id is in debugging, hence omitting debug
sections when computing it significantly reduces its usability as
changes in debug section content wouldn't alter the build-id.
Differential Revision: https://reviews.llvm.org/D24120
llvm-svn: 280421
Symbol assignments outside of SECTIONS command need to be created
even when SECTIONS command is not used.
Differential Revision: https://reviews.llvm.org/D23751
llvm-svn: 280252
DiscardPolicy is enum replacing several boolean options.
This approach is not only consistent with what we use for
unresolveds (UnresolvedPolicy), but also should help to solve a problem
of options with opposing meanings, mentioned in PR28843
Differential revision: https://reviews.llvm.org/D23868
llvm-svn: 280209
This approach is not only consistent with UnresolvedPolicy,
but also should help to solve a problem
of options with opposing meanings, mentioned in PR28843
Differential revision: https://reviews.llvm.org/D23869
llvm-svn: 280206
-oformat output-format
`-oformat' option can be used to specify the binary format for the output object file.
Patch implements binary format output type.
Differential revision: https://reviews.llvm.org/D23769
llvm-svn: 279726
The ARM Exception handling ABI requires that all ARM exception index
table sections have a prefix of .ARM.exidx and are combined into a
single contiguous block either in their own output section or as part
of another output section.
In general clang will output a single .ARM.exidx section per object,
but will use .ARM.exidx.<section name> when -ffunction-sections is used.
This change canonicalizes the names of sections with the .ARM.exidx
prefix to just .ARM.exidx, which ensures that there is only a single
output section.
Differential Revision: https://reviews.llvm.org/D23775
llvm-svn: 279617
This patch is opposite to D19024, which made this symbols to be hidden by default.
Unfortunately FreeBSD loader wants to see
start_set_modmetadata_set/stop_set_modmetadata_set in the dynamic symbol table.
They were not placed there because had hidden visibility.
Patch makes them to have default visibility again.
Differential revision: https://reviews.llvm.org/D23552
llvm-svn: 279262
The linker will normally set the LMA equal to the VMA.
You can change that by using the AT keyword.
The expression lma that follows the AT keyword specifies
the load address of the section.
Patch implements this keyword.
Differential revision: https://reviews.llvm.org/D19272
llvm-svn: 278911