Previously we evaluated the values of LMA incorrectly for next cases:
.text : AT(ADDR(.text) - 0xffffffff80000000) { ... }
.data : AT(ADDR(.data) - 0xffffffff80000000) { ... }
.init.begin : AT(ADDR(.init.begin) - 0xffffffff80000000) { ... }
Reason was that we evaluated offset when VA was not assigned. For case above
we ended up with 3 loads that has similar LMA and it was incorrect.
That is critical for linux kernel.
Patch updates the offset after VA calculation. That fixes the issue.
Differential revision: https://reviews.llvm.org/D30163
llvm-svn: 295722
Previously LLD would error out just "ld.lld: error: unable to move location counter backward"
What does not really reveal the place of issue,
Patch adds location to the output.
Differential revision: https://reviews.llvm.org/D30187
llvm-svn: 295720
Previously ASSERT we implemented returned expression value.
Ex:
. = ASSERT(0x100);
would set Dot value to 0x100
Form of assert when it is assigned to Dot was implemented for
compatibility with very old GNU ld which required it.
Some scripts in the wild, including linux kernel scripts
use such ASSERTs at the end for doing different checks.
Currently we fail with "unable to move location counter backward"
for such scripts. Patch changes ASSERT to return location counter
value to fix that.
Differential revision: https://reviews.llvm.org/D30171
llvm-svn: 295703
Previously LLD crashed on on provided testcases because "/DISCARD/" was
not supported. Patch implements that.
After this I think there is no known issues with --emit-relocs implementation
required for linux kernel linking.
Differential revision: https://reviews.llvm.org/D29273
llvm-svn: 295488
This is a small difference I noticed to gold and bfd. When given
--print-gc-sections, we print sections a linkerscript marks
DISCARD. The other linkers don't.
llvm-svn: 295467
This case should be possible to handle, but it is hard:
* In order to create program headers correctly, we have to scan the
sections in the order they are in the file.
* To find that order, we have to "execute" the linker script.
* The linker script can contain SIZEOF_HEADERS.
So to support this we have to start with a guess of how many headers
we need (3), run the linker script and try to create the program
headers. If it turns out we need more headers, we run the script again
with a larger SIZEOF_HEADERS.
Also, running the linker script depends on knowing the size of the
sections, so we have to finalize them. But creating the program
headers can change the value stored in some sections, so we have to
split size finalization and content finalization.
Looks like the last part is also needed for range extension thunks, so
we might support this at some point. For now just report an error
instead of producing broken files.
llvm-svn: 295458
SHF_LINK_ORDER sections adds special ordering requirements.
Such sections references other sections. Previously we would crash
if section that other were referenced to was discarded by script.
Patch fixes that by discarding all dependent sections in that case.
It supports chained dependencies, testcase is provided.
Differential revision: https://reviews.llvm.org/D30033
llvm-svn: 295332
Unfortunately, the common way of writing linker scripts seems to be
to get the output of ld.bfd --verbose and edit it a bit.
Also unfortunately, the bfd default script contains things like
.rela.dyn : { *(... .rela.data ...) }
but bfd actually ignores that for -emit-relocs, so we have to do the
same.
llvm-svn: 295324
The linker script lexer is context-sensitive. In the regular context,
arithmetic operator characters are regular characters, but in the
expression context, they are independent tokens. This afects how the
lexer tokenizes "3*4", for example. (This kind of expression is real;
the Linux kernel uses it.)
This patch defines function `maybeSplitExpr`. This function splits the
current token into multiple expression tokens if the lexer is in the
expression context.
Differential Revision: https://reviews.llvm.org/D29963
llvm-svn: 295225
OUTPUT_ARCH command can contain architecture values separated with ":", like:
OUTPUT_ARCH(i386:x86-64)
We did not support that, because got 3 lexer tokens here after recent changes.
This trivial patch fixes the issue, now whole expression inside
OUTPUT_ARCH is just ignored.
Differential revision: https://reviews.llvm.org/D29640
llvm-svn: 294432
LLD already parses ALIGN expression to specifiy alignment for output
sections in linker scripts but it never applies the alignment to the
output section. This change handles that.
Differential Revision: https://reviews.llvm.org/D29689
llvm-svn: 294374
DefinedSynthetic symbols are attached to sections,
for the case when such symbol was attached to non-allocated section,
we calculated its value incorrectly.
We subtracted Body->Section->Addr, but non-allocatable sections
should have zero VA in output and therefore result value was wrong.
And at the same time we have Body->Section->Addr != 0 for them
internally because use it for calculation of section size.
Patch fixes calculation of such symbols values.
Differential revision: https://reviews.llvm.org/D29653
llvm-svn: 294322
We now create a dummy section with index 1 before processing the
linker script.
Thanks to George Rimar for finding the bug and providing the initial
testcase.
llvm-svn: 294252
This is a fix for Bugzilla 31813.
The problem is that the tokenizer does not create a separate token for
":" unless there's white space before it. Changed it to always create
a token for ":" and reworked some logic that relied on ":" being
attached to some tokens like "global:" and "local:".
llvm-svn: 294006
With a synthetic merge section we can have, for example, a single
.rodata section with stings, fixed sized constants and non merge
constants.
I can be simplified further by not setting Entsize, but that is
probably better done is a followup patch.
This should allow some cleanup in the linker script code now that
every output section command maps to just one output section.
llvm-svn: 294005
This is alternative to D28857 which was incorrect.
One of linux scripts contains:
vvar_start = . - 2 * (1 << 12);
vvar_page = vvar_start;
vvar_vsyscall_gtod_data = vvar_page + 128;
Previously we did not mark first expression as non-absolute,
though it contains location counter.
And LLD failed with error:
relocation R_X86_64_PC32 cannot refer to absolute symbol
This patch should fix the issue, and opens road for doing the same for other operators
(though not clear if that is needed).
Differential revision: https://reviews.llvm.org/D29332
llvm-svn: 293748
Linux kernel linkerscript contains additional semicolon (last line):
.apicdrivers : AT(ADDR(.apicdrivers) - LOAD_OFFSET) {
__apicdrivers = .;
*(.apicdrivers);
I checked that both gold and bfd are able to parse something like:
.text : { ;;*(.text);;S = 0;; } }
Patch do the same.
Differential revision: https://reviews.llvm.org/D29276
llvm-svn: 293612
[ELF] Fixed formatting. NFC
and
[ELF] Bypass section type check
Differential revision: https://reviews.llvm.org/D28761
They do the opposite of what was asked for in the code review.
llvm-svn: 293320
As specified here:
* https://sourceware.org/binutils/docs/ld/MEMORY.html#MEMORY
There are two deviations from what is specified for GNU ld:
1. Only integer constants and *not* constant expressions
are allowed in `LENGTH` and `ORIGIN` initializations.
2. The `I` and `L` attributes are *not* implemented.
With (1) there is currently no easy way to evaluate integer
only constant expressions. This can be enhanced in the
future.
With (2) it isn't clear how these flags map to the `SHF_*`
flags or if they even make sense for an ELF linker.
Differential Revision: https://reviews.llvm.org/D28911
llvm-svn: 292875
Found that during attempts of linking linux kernel,
previously we partially duplicated code from getOutputSection(),
and it missed commons symbol case.
Differential revision: https://reviews.llvm.org/D28903
llvm-svn: 292594
Inputs shown in that testcase previously created
a huge temporarily file under 32 bits.
It was fixed by D28107. During review was suggested to
add a testcase even without CHECKs for documentation purposes.
Patch do that.
llvm-svn: 292220
These were 3 last synthetics that were added in addPredefinedSections() instead
of createSyntheticSections(). Now it is possible to move addition to correct common place.
Also patch fixes testcase which discards .shstrtab, by restricting doing that.
Differential revision: https://reviews.llvm.org/D28561
llvm-svn: 291908
This is in preparation for my next change, which will introduce a relro
nobits section. That requires that relro sections appear at the end of the
progbits part of the r/w segment so that the relro nobits section can appear
contiguously.
Because of the amount of churn required in the test suite, I'm making this
change separately.
llvm-svn: 291523
This patch allows for linker scripts to assign a new value
to a symbol that is already defined (either in an object file
or the linker script itself).
llvm-svn: 291459
After Mark's patch I was wondering what was the rationale for the ELF
spec requiring us to merge only sections with matching flags and
types. I tried emailing
https://groups.google.com/forum/#!forum/generic-abi, but looks like my
emails are not being posted (the list is probably moderated). I
emailed Cary Coutant instead.
Cary pointed out that the section was a late addition and didn't got
the scrutiny it deserved. Given that and the problems found by
implementing the letter of the standard, I propose changing lld to
merge all sections with the same name and issue errors if the types or
some critical flags are different.
This should allow an unmodified firefox linked with lld to run.
This also merges some code with the linkerscript path.
llvm-svn: 291107
PR31335 shows that we do that in next case:
SECTIONS { .text 0x2000 : {. = 0x100 ; *(.text) } }
though documentations says that "If . is used inside a section
description however, it refers to the byte offset from the start
of that section, not an absolute address. " looks does not work
as documented in bfd (as mentioned in comments for PR31335).
Until we find out the expected behavior was suggested at least not
to 'crash', what we do after trying to generate huge file.
Differential revision: https://reviews.llvm.org/D27712
llvm-svn: 289782
linkerscript.s is the first test file for linker script, and at the moment
it contains all tests for linker scripts. Now that test file doesn't make
sense.
linkerscript2.s was just badly named. Renamed searchdir.s.
llvm-svn: 289148
This change continues what was started by D27040
Now all allocatable synthetics should be available from script side.
Differential revision: https://reviews.llvm.org/D27131
llvm-svn: 288150
Unfortunatelly PT_ARM_EXIDX is special. There is no way to create it
from linker scripts, so we have to create it even if PHDRS is used.
This matches bfd and is required for the lld output to survive bfd's strip.
llvm-svn: 288012
Unfortunatelly some scripts look like
kernphys = ...
. = ....
and the expectation in that every orphan section is after the
assignment.
llvm-svn: 287996
This is an horrible special case, but seems to match bfd's behaviour
and is important for avoiding placing an orphan section before the
expected start of the file.
llvm-svn: 287994
This is important for cases like:
.sdata : {
*(.got.plt .got)
...
}
That was not supported before as there was no way to get access to
synthetic sections from script.
More details on review page.
Differential revision: https://reviews.llvm.org/D27040
llvm-svn: 287913
Align to the large page size (known as a superpage or huge page).
FreeBSD automatically promotes large, superpage-aligned allocations.
Differential Revision: https://reviews.llvm.org/D27042
llvm-svn: 287782
GNU LD allows `ASSERT` commands to be in output section descriptions.
Note that LD also mandates that `ASSERT` commands in this context must
end with a semicolon.
llvm-svn: 287677
If the linker script has SECTIONS, the address computation is now
always done in LinkerScript::assignAddresses, like for any other
section.
Before fixHeaders would do a tentative computation that
assignAddresses would sometimes override.
This patch also splits the cases where assignAddresses needs to add
the headers to the first PT_LOAD and the address computation. The net
effect is that we no longer create an empty page for no reason in the
included test case, which matches bfd behavior.
llvm-svn: 287565
LLD's error messages contain line numbers, function names or section names.
Currently they are formatter as follows.
foo.c (32): symbol 'foo' not found
foo.c (function bar): symbol 'foo' not found
foo.c (.text+0x1234): symbol 'foo' not found
This patch changes them so that they are consistent with Clang's output.
foo.c:32: symbol 'foo' not found
foo.c:(function bar): symbol 'foo' not found
foo.c:(.text+0x1234): symbol 'foo' not found
Differential Revision: https://reviews.llvm.org/D26901
llvm-svn: 287537
I hit an internal linker script that was defining _DYNAMIC instead of
letting the linker do it. It turns out that both bfd and gold allow
that.
This is pretty easy to implement, just make the linker defined symbol
weak. This should have no impact in the case where there is no user
defined symbol: The visibility is hidden, which causes the output to
still be local.
llvm-svn: 287260
Linker script doesn't create a section if it has no content. So the following
script doesn't create .norelocs section if it doesn't have any .rel* sections.
.norelocs : { *(.rel*) }
Later, if you assert that the size of .norelocs is 0, LLD printed out
an error message, because it didn't allow calling SIZEOF() on nonexistent
sections.
This patch allows SIZEOF() on nonexistent sections, so that you can do
something like this.
ASSERT(SIZEOF(.norelocs), "shouldn't contain .rel sections!")
Note that this behavior is compatible with GNU.
Differential Revision: https://reviews.llvm.org/D26810
llvm-svn: 287257
Propagate program headers by walking the commands, not the
sections. This allows us to propagate program headers even from
sections that don't end up in the output.
Fixes pr30997.
llvm-svn: 286837
Summary:
This patch adds a ".comment" section to an output. The comment
section contains the linker's version string. You can now
find out whether a binary is created by LLD or not using objdump
command like this.
$ objdump -s -j .comment foo
foo: file format elf64-x86-64
Contents of section .comment:
0000 00474343 3a202855 62756e74 7520342e .GCC: (Ubuntu 4.
0010 382e342d 32756275 6e747531 7e31342e 8.4-2ubuntu1~14.
...
00c0 766d2f74 72756e6b 20323835 38343629 vm/trunk 285846)
00d0 004c696e 6b65723a 204c4c44 20342e30 .Linker: LLD 4.0
00e0 2e302028 7472756e 6b203238 36343036 .0 (trunk 286406
00f0 2900 ).
Compilers emits .comment section as well, so the output contains
both compiler and linker information.
Alternative considered:
I first tried to add a SHT_NOTE section because GNU gold does that.
A NOTE section starts with a header which contains content type.
It turned out that ld.gold sets type NT_GNU_GOLD_VERSION to their
NOTE section. So the NOTE type is only for GNU gold (surprise!)
Next, I tried to create ".linker-version" section. However, it seems
that reusing the existing ".comment" section is better because 1)
other tools already know about .comment section and is able to strip
it and 2) the result contans not only linker info but also compiler
info.
Differential Revision: https://reviews.llvm.org/D26487
llvm-svn: 286496
A CommonInputSection is a section containing all common symbols.
That was an input section but was abstracted in a different way
than the synthetic input sections because it was written before
the synthetic input section was invented.
This patch rewrites CommonInputSection as a synthetic input section
so that it behaves better with other sections.
llvm-svn: 286053
With this patch we keep track of the fact that . is a position in the
file and therefore not absolute. This allow us to compute relative
relocations that involve symbol that are defined in linker scripts
with '.'.
This fixes https://llvm.org/bugs/show_bug.cgi?id=30406
There is still more work to track absoluteness over the various
expressions, but this should unblock linking the EFI bootloader.
llvm-svn: 285641
We parse linker scripts very early, but whether an expression is
absolute or not can depend on a symbol defined in a .o. Given that, we
have to delay the computation of IsAbsolute. We can do that by storing
an AST when parsing or by also making IsAbsolute a function like we do
for the expression value. This patch implements the second option.
llvm-svn: 285628
And as a token of the new feature, make ALIGNOF always absolute.
This is a step in making it possible to have non absolute symbols out
of output sections.
llvm-svn: 285608
This patch make lld show following details for undefined symbol errors:
- file (line)
- file (function name)
- file (section name + offset)
Differential revision: https://reviews.llvm.org/D25826
llvm-svn: 285186
This script below shouldn't include file and program headers
to PT_LOAD segment, because it doesn't have PHDRS and FILEHDR
attributes:
PHDRS { all PT_LOAD; }
SECTIONS { /* list of sections here */ }
Differential revision: https://reviews.llvm.org/D25774
llvm-svn: 284709
Previously, we were checking the existence of an entry symbol
too early. It was done before the linker script processor creates
symbols defined in scripts. Fixes bug 30743.
llvm-svn: 284676
Linker scripts may specify PHDRS, but not specify section to
segment assignments, i.e:
PHDRS { seg PT_LOAD; }
SECTIONS {
.sec1 {} : seg
.sec2 {}
}
In such case linker should still choose some segment for .sec2 section.
This patch will add .sec2 to previously opened segments (seg) or to the
very first PT_LOAD segment, if no section-to-segment assignments has been
made
Differential revision: https://reviews.llvm.org/D24795
llvm-svn: 284600
Both gold and ld accepts integers instead of named constants
for PHDRS.
Patch adds support for that.
Differential revision: https://reviews.llvm.org/D25549
llvm-svn: 284470
This is 30646.
PT_OPENBSD_RANDOMIZE
The array element specifies the location and size of a part of the memory image of the program that must be filled with random data before any code in the object is executed. The memory region specified by a segment of this type may overlap the region specified by a PT_GNU_RELRO segment, in which case the intersection will be filled with random data before being marked read-only.
Reference links:
http://man.openbsd.org/OpenBSD-current/man5/elf.5c494713c45
Differential revision: https://reviews.llvm.org/D25469
llvm-svn: 284234
-z wxneeded creates a PHDR PT_OPENBSD_WXNEEDED.
PT_OPENBSD_WXNEEDED
The array element specifies that a process executing this file may need to be able to map or protect memory regions as simultaneously executable and writable. If the system is unable or unwilling to permit that for this executable then it may fail immediately. This segment type is meaningful only for executable files and is ignored in other objects.
http://man.openbsd.org/OpenBSD-current/man5/elf.5
Differential revision: https://reviews.llvm.org/D25472
llvm-svn: 284226
r283984 introduced a problem of too many warning messages being shown
when -ffunction-sections and -fdata-sections were used in conjunction
with --gc-sections linker flag and debugging information present. This
happens because lot of relocations from .debug_line section may become
invalid in such case. The newer fix doesn't show any warning message but
zeroes OutSec pointer in createInputSectionList() to avoid crash, when
relocations are written
llvm-svn: 284010
Sometimes the very first PT_LOAD segment, created by lld, can be empty.
This happens when (all conditions met):
- Linker script is used
- First section in ELF image is not RO
- Not enough space for program headers.
Differential revision: https://reviews.llvm.org/D25330
llvm-svn: 283760
We were implicitly creating space for the headers. That is not the
behaviour of bfd, which requires the script to use SIZEOF_HEADERS. The
difference is important for scripts that don't use SIZEOF_HEADERS and
expect the first section to be at 0.
llvm-svn: 282818
Currently lld will implicitly reserve space for the headers. This is
not the case is bfd, where it is the script responsibility to use
SIZEOF_HEADERS. This means that a script not using SIZEOF_HEADERS and
expecting the address of the first section to be 0 would fail with lld.
I am fixing that is the next commit. This one just makes the tests
explicitly use SIZEOF_HEADERS to avoid the dependency on the current
behaviour.
llvm-svn: 282814
Since they end up going on the same PT_LOAD, there is no reason to
sort them. This matches bfd's behaviour and is user visible in the
placement of orphan sections.
llvm-svn: 282799
If there is not sufficient address space, just give up and don't put
the header in the PT_LOAD.
This matches bfd behaviour and I found at least one script that
depends on having a section at address 0.
llvm-svn: 282750
If we two sections reside in the same PT_LOAD segment,
we compute second section using the following formula:
Off2 = Off1 + VA2 - VA1. This allows OS kernel allocating
sections correctly when loading an image.
Differential revision: https://reviews.llvm.org/D25014
llvm-svn: 282705
This matches the behavior of Binutils linkers. We also change the
default MaxPageSize on x86-64 to 0x1000 to preserver the current
behavior, which is the same as the behavior implemented by gold.
https://llvm.org/bugs/show_bug.cgi?id=30541
Differential Revision: https://reviews.llvm.org/D24987
llvm-svn: 282560
The BYTE, SHORT, LONG, and QUAD commands store one, two, four, and eight bytes (respectively).
After storing the bytes, the location counter is incremented by the number of bytes
stored.
Previously our scripts handles these commands incorrectly. For example:
SECTIONS {
.foo : {
*(.foo.1)
BYTE(0x11)
...
We accepted the script above treating BYTE as input section description.
These commands are used in the wild though.
Differential revision: https://reviews.llvm.org/D24830
llvm-svn: 282429
We were counting the size of the bss section holding common symbols twice:
Dot += CurOutSec->getSize();
flush();
The new code is also simpler as now flush is the only function that
inserts in AlreadyOutputOS, which makes sense since the set hold fully
output sections.
llvm-svn: 282285
Previously we failed to parse next scripts because disallowed
a space between filler value and '=':
.text : {
...
} :text = 0x9090
Differential revision: https://reviews.llvm.org/D24831
llvm-svn: 282248
DEFINED(symbol)
Return 1 if symbol is in the linker global symbol table and is defined before
the statement using DEFINED in the script, otherwise return 0.
Can be used to define default values for symbols. Found it in the wild.
Differential revision: https://reviews.llvm.org/D24858
llvm-svn: 282245
If section contains local symbols ldd crashes, because local
symbols are added to symbol table before section is discarded
by linker script processor. This patch calls copyLocalSymbols()
after createSections, so discarded section symbols are not copied
llvm-svn: 282244
Found this operators used in the wild scripts, for example:
__got2_entries = (_FIXUP_TABLE_ - _GOT2_TABLE_) >>2;
__fixup_entries = (. - _FIXUP_TABLE_)>>2;
Differential revision: https://reviews.llvm.org/D24860
llvm-svn: 282243
The actual logic is to keep the output section if the output section
would have been ro/rw.
This is both simpler and more practical, as the intention is linker
scripts is to always keep of of a pair of ONLY_IF_RO/ONLY_IF_RW.
llvm-svn: 282099
This is PR30442.
Previously we were failed to parce complex expressions like:
foo : { *(SORT_BY_NAME(bar) zed) }
Main idea of patch that globs and excludes can be wrapped in a SORT.
There is a difference in semanics of ld/gold:
ld likes:
*(SORT(EXCLUDE_FILE (*file1.o) .foo.1))
gold likes:
*(EXCLUDE_FILE (*file1.o) SORT(.foo.1))
Patch implements ld grammar, complex expressions like
next is not a problem anymore:
.abc : { *(SORT(.foo.* EXCLUDE_FILE (*file1.o) .bar.*) .bar.*) }
Differential revision: https://reviews.llvm.org/D24758
llvm-svn: 282078
When final image has several .bss sections, lld fails
because second .bss always has zero VA. This causes
link error "Not enough space for ELF and program headers"
llvm-svn: 282067
It is not only a bit more straightforward now, but also next 2 issues are solved:
* It just crashed on ".foo : { *(EXCLUDE_FILE (*file1.o)) }" before.
* It accepted multiple EXCLUDE_FILEs in a row.
Differential revision: https://reviews.llvm.org/D24726
llvm-svn: 282060
This reverts commit r282021, bringing back r282015.
The problem was that the comparison function was not a strict weak
ordering anymore, which this patch fixes.
Original message:
Only restrict order if both sections are in the script.
This matches gold and bfd behavior and is required to handle some scripts.
The script has to assume where PT_LOADs start in order to align that
spot. If we don't allow section it doesn't know about to move to the
middle, we can need more PT_LOADs and those will not be aligned.
llvm-svn: 282035
This matches gold and bfd behavior and is required to handle some scripts.
The script has to assume where PT_LOADs start in order to align that
spot. If we don't allow section it doesn't know about to move to the
middle, we can need more PT_LOADs and those will not be aligned.
llvm-svn: 282015
This is particularly important when the symbol comes from a linker
script. It is common to use the same linker script for shared
libraries and executables. Without this we would always fail to link
shared libraries with -z,defs and a linker script with an ENTRY
directive.
llvm-svn: 281989
Linker scripts are responsible for aliging '.'. Since they are
designed for bfd which has no --rosegment, they don't align the RO to
RX transition.
llvm-svn: 281978
In most cases that means just not checking the address when we don't
need it.
For some tests it is easier to just set . to a known value.
llvm-svn: 281976
Previously it used llvm-readobj -s, now changed to llvm-objdump -section-headers,
that reduced the output significantly and helps to read/support it.
llvm-svn: 281832
Our implementation supported integer value previously.
ld can use expression,
for example, it is OK to write
. = SEGMENT_START("foobar", .);
Patch implements that.
llvm-svn: 281831
It was possible situation about some commands just were not processed
(were skipped) because of a bug appeared when constraint checking used.
Testcase is attached.
llvm-svn: 281818
This matches gold and bfd, and is pretty much required by some linker
scripts. They end with commands like
foo 0 : { *(bar) }
if we put any SHF_ALLOC sections after they can have an address that
is too low.
llvm-svn: 281778
This matches bfd behavior. It also makes future changes simpler as we
don't have to worry about ignoring these commands in multiple places
llvm-svn: 281775