Previously, on each iteration in ICF, we scan the entire vector of
input sections to find boundaries of groups having the same ID.
This patch changes the algorithm so that we now have a vector of ranges.
Each range contains a starting index and an ending index of the group.
So we no longer have to search boundaries on each iteration.
Performance-wise, this seems neutral. Instead of searching boundaries,
we now have to maintain ranges. But I think this is more readable
than the previous implementation.
Moreover, this makes easy to parallelize the main loop of ICF,
which I'll do in a follow-up patch.
llvm-svn: 288228
StringRefZ is a class to represent a null-terminated string. String
length is computed lazily, so it's more efficient than StringRef to
represent strings in string table.
The motivation of defining this new class is to merge functions
that only differ in string types; we have many constructors that takes
`const char *` or `StringRef`. With StringRefZ, we can merge them.
Differential Revision: https://reviews.llvm.org/D27037
llvm-svn: 288172
The module index dynamic relocation R_ARM_DTPMOD32 is always 1 for an
executable. When static linking and when we know that we are not a shared
object we can resolve the module index relocation statically.
The logic in handleNoRelaxTlsRelocation remains the same for Mips as it
has its own custom GOT writing code. For ARM we add the module index
relocation to the GOT when it can be resolved statically.
In addition the type of the RelExpr for the static resolution of TlsGotRel
should be R_TLS and not R_ABS as we need to include the size of
the thread control block in the calculation.
Addresses the TLS part of PR30218.
Differential revision: https://reviews.llvm.org/D27213
llvm-svn: 288153
When -O0 is specified, we do not do section merging.
Though before this patch several sections were generated instead
of single, what is useless.
Differential revision: https://reviews.llvm.org/D27041
llvm-svn: 288151
This change continues what was started by D27040
Now all allocatable synthetics should be available from script side.
Differential revision: https://reviews.llvm.org/D27131
llvm-svn: 288150
The MipsGotSection::getPageEntryOffset calculates index of GOT entry
with a "page" address. Previously this method changes the state
of MipsGotSection because it modifies PageIndexMap field. That leads
to the unpredictable results if getPageEntryOffset called from multiple threads.
The patch makes getPageEntryOffset constant. To do so it calculates GOT
entry index but does not update PageIndexMap field. Later in the
MipsGotSection::writeTo method linker calculates "page" addresses and
writes them to the output.
llvm-svn: 288129
If output section which referenced by R_MIPS_GOT_PAGE or R_MIPS_GOT16
relocations is small (less that 0x10000 bytes) and occupies two adjacent
0xffff-bytes pages, current formula gives incorrect number of required "page"
GOT entries. The problem is that in time of calculation we do not know
the section address and so we cannot calculate number of 0xffff-bytes
pages exactly.
This patch fix the formula. Now it gives a correct number of pages in
the worst case when "small" section intersects 0xffff-bytes page
boundary. From the other side, sometimes it adds one more redundant GOT
entry for each output section. But usually number of output sections
referenced by GOT relocations is small.
llvm-svn: 288127
-N (-omagic)
Set the text and data sections to be readable and writable.
Also, do not page-align the data segment.
Differential revision: https://reviews.llvm.org/D26888
llvm-svn: 288123
Right now we just remember a SymbolBody for each got entry and
duplicate a bit of logic to decide what value, if any, should be
written for that SymbolBody.
With ARM there will be more complicated values, and it seems better to
just use the relocation code to fill the got entries. This makes it
clear that each entry is filled by the dynamic linker or by the static
linker.
llvm-svn: 288107
That unifies handling cases when we have SECTIONS and when
-no-rosegment is given in compareSectionsNonScript()
Now Config->SingleRoRx is used for check, testcase is provided.
llvm-svn: 288022
Previously Config->SingleRoRx was set in
createFiles() and used HasSections.
This change moves it to readConfigs at place of
common flags handling, and adds logic that sets
this flag separatelly from ScriptParser if SECTIONS present.
llvm-svn: 288021
--no-rosegment: Do not put read-only non-executable sections in their own segment
Differential revision: https://reviews.llvm.org/D26889
llvm-svn: 288020
Unfortunatelly PT_ARM_EXIDX is special. There is no way to create it
from linker scripts, so we have to create it even if PHDRS is used.
This matches bfd and is required for the lld output to survive bfd's strip.
llvm-svn: 288012
Unfortunatelly some scripts look like
kernphys = ...
. = ....
and the expectation in that every orphan section is after the
assignment.
llvm-svn: 287996
This is an horrible special case, but seems to match bfd's behaviour
and is important for avoiding placing an orphan section before the
expected start of the file.
llvm-svn: 287994
They return new vectors, but at the same time they mutate other vectors,
so returning values doesn't make much sense. We should just mutate two
vectors.
llvm-svn: 287979
-color-diagnostics=auto is default because that's the same as
Clang's default. When color is enabled, error or warning messages
are colored like this.
error:
<bold>ld.lld</bold> <red>error:</red> foo.o: no such file
warning:
<bold>ld.lld</bold> <magenta>warning:</magenta> foo.o: no such file
Differential Revision: https://reviews.llvm.org/D27117
llvm-svn: 287949
Uncompressing section contents and spliting mergeable section contents
into smaller chunks are heavy tasks. They scan entire section contents
and do CPU-intensive tasks such as uncompressing zlib-compressed data
or computing a hash value for each section piece.
Luckily, these tasks are independent to each other, so we can do that
in parallel_for_each. The number of input sections is large (as opposed
to the number of output sections), so there's a large parallelism here.
Actually the current design to call uncompress() and splitIntoPieces()
in batch was chosen with doing this in mind. Basically what we need to
do here is to replace `for` with `parallel_for_each`.
It seems this patch improves latency significantly if linked programs
contain debug info (which in turn contain lots of mergeable strings.)
For example, the latency to link Clang (debug build) improved by 20% on
my machine as shown below. Note that ld.gold took 19.2 seconds to do
the same thing.
Before:
30801.782712 task-clock (msec) # 3.652 CPUs utilized ( +- 2.59% )
104,084 context-switches # 0.003 M/sec ( +- 1.02% )
5,063 cpu-migrations # 0.164 K/sec ( +- 13.66% )
2,528,130 page-faults # 0.082 M/sec ( +- 0.47% )
85,317,809,130 cycles # 2.770 GHz ( +- 2.62% )
67,352,463,373 stalled-cycles-frontend # 78.94% frontend cycles idle ( +- 3.06% )
<not supported> stalled-cycles-backend
44,295,945,493 instructions # 0.52 insns per cycle
# 1.52 stalled cycles per insn ( +- 0.44% )
8,572,384,877 branches # 278.308 M/sec ( +- 0.66% )
141,806,726 branch-misses # 1.65% of all branches ( +- 0.13% )
8.433424003 seconds time elapsed ( +- 1.20% )
After:
35523.764575 task-clock (msec) # 5.265 CPUs utilized ( +- 2.67% )
159,107 context-switches # 0.004 M/sec ( +- 0.48% )
8,123 cpu-migrations # 0.229 K/sec ( +- 23.34% )
2,372,483 page-faults # 0.067 M/sec ( +- 0.36% )
98,395,342,152 cycles # 2.770 GHz ( +- 2.62% )
79,294,670,125 stalled-cycles-frontend # 80.59% frontend cycles idle ( +- 3.03% )
<not supported> stalled-cycles-backend
46,274,151,813 instructions # 0.47 insns per cycle
# 1.71 stalled cycles per insn ( +- 0.47% )
8,987,621,670 branches # 253.003 M/sec ( +- 0.60% )
148,900,624 branch-misses # 1.66% of all branches ( +- 0.27% )
6.747548004 seconds time elapsed ( +- 0.40% )
llvm-svn: 287946
The function was used only within Relocations.cpp, but now we are
using it in many places, so this patch moves it to a file that fits
to the functionality.
llvm-svn: 287943
This is important for cases like:
.sdata : {
*(.got.plt .got)
...
}
That was not supported before as there was no way to get access to
synthetic sections from script.
More details on review page.
Differential revision: https://reviews.llvm.org/D27040
llvm-svn: 287913
This patch changes the error message from
too many errors emitted, stopping now
to
too many errors emitted, stopping now (use -error-limit=0 to see all errors)
Thanks for Sean for the suggestion!
llvm-svn: 287900
The .ARM.exidx table has an entry for each function with the first entry
giving the start address of the function, the table is sorted in ascending
order of function address. Given a PC value, the unwinder will search the
table for the entry that contains the PC value.
If the table entry happens to be the last, the range of the addresses that
the final unwinding table describes will extend to the end of the address
space. To prevent an incorrect address outside the address range of the
program matching the last entry we follow ld.bfd's example and add a
sentinel EXIDX_CANTUNWIND entry at the end of the table. This gives the
final real table entry an upper bound.
In addition the llvm libunwind unwinder currently depends on the presence
of a sentinel entry (PR31091).
Differential revision: https://reviews.llvm.org/D26977
llvm-svn: 287869
Previously, if a symbol specified by -e or ENTRY() is not found,
we didn't set entry point address. That is incompatible with GNU
because GNU linkers set the first address of .text to entry.
This patch implement that behavior.
llvm-svn: 287836
Offset between beginning of a .got section and _gp symbols used in MIPS
GOT relocations calculations. Usually the expression looks like
VA + Offset - GP, where VA is the .got section address, Offset - offset
of the GOT entry, GP - offset between .got and _gp. Also there two "magic"
symbols _gp_disp and __gnu_local_gp which hold the offset mentioned above.
These symbols might be referenced by MIPS relocations.
Now the linker always defines _gp symbol and uses hardcoded value for
its initialization. So offset between .got and _gp is 0x7ff0. The _gp_disp
and __gnu_local_gp defined if required and initialized by 0x7ff0.
In fact that is not correct because _gp symbol might be defined by a linker
script and holds arbitrary value. In that case we need to use this value
in relocation calculation and initialize _gp_disp and __gnu_local_gp
properly.
The patch fixes the problem and completes fixing the bug #30311.
https://llvm.org/bugs/show_bug.cgi?id=30311
Differential revision: https://reviews.llvm.org/D27036
llvm-svn: 287832
This is in the context of https://llvm.org/bugs/show_bug.cgi?id=31109.
When LLD prints out errors for relocations, it tends to print out
extremely large number of errors (like millions) because it would
print out one error per relocation.
This patch makes LLD bail out if it prints out more than 20 errors.
You can configure the limitation using -error-limit argument.
-error-limit=0 means no limit.
I chose the flag name because Clang has the same feature as -ferror-limit.
"f" doesn't make sense to us, so I omitted it.
Differential Revision: https://reviews.llvm.org/D26981
llvm-svn: 287789
We have different functions to stringize objects to construct
error messages. For InputFile, we have getFilename, and for
InputSection, we have getName. You had to memorize them.
I think this is the case where the function overloading comes in handy.
This patch defines toString() functions that are overloaded for all these
types, so that you just call it in error().
Differential Revision: https://reviews.llvm.org/D27030
llvm-svn: 287787
Align to the large page size (known as a superpage or huge page).
FreeBSD automatically promotes large, superpage-aligned allocations.
Differential Revision: https://reviews.llvm.org/D27042
llvm-svn: 287782
There are two ways to set symbol versions. One way is to use symbol
definition file, and the other is to embed version names to symbol
names. In the latter way, symbol name is in the form of `foo@version1`
where `foo` is a real name and `version1` is a version.
We were parsing symbol names in insert(). That seems unnecessarily
too early. We can do it later after we resolve all symbols. Doing it
lazily is a good thing because it makes code easier to read
(because now we have a separate pass to parse symbol names). Also
it could slightly improve performance because if two identical symbols
have versions, we now parse them only once.
llvm-svn: 287741
For now MipsGotSection class is not ready for concurrent access from
multiple threads. The problem is in the getPageEntryOffset method. It
changes state of MipsGotSection object and might be called from
different threads at the same time. So turn Threads off for this target.
It's a temporary solution. The patch fixes MipsGotSection::getPageEntryOffset
is almost ready.
Differential revision: https://reviews.llvm.org/D27035
llvm-svn: 287740
Previously, we stored offsets in string tables to symbols, so
you needed to pass a string table to get a symbol name. This patch
stores const char pointers instead to eliminate the need to pass
a string table.
llvm-svn: 287737
Now that lld switched to lib/LTO, which always calls setDataLayout(),
we don't need this check anymore.
Thanks to Peter for pointing out!
llvm-svn: 287699
GNU LD allows `ASSERT` commands to be in output section descriptions.
Note that LD also mandates that `ASSERT` commands in this context must
end with a semicolon.
llvm-svn: 287677
Some synthetic sections are not derived calsses of SyntehticSection.
They are derived directly from InputSection. For consistencly, we should
use SyntheticSection.
llvm-svn: 287606
If the linker script has SECTIONS, the address computation is now
always done in LinkerScript::assignAddresses, like for any other
section.
Before fixHeaders would do a tentative computation that
assignAddresses would sometimes override.
This patch also splits the cases where assignAddresses needs to add
the headers to the first PT_LOAD and the address computation. The net
effect is that we no longer create an empty page for no reason in the
included test case, which matches bfd behavior.
llvm-svn: 287565
LLD's error messages contain line numbers, function names or section names.
Currently they are formatter as follows.
foo.c (32): symbol 'foo' not found
foo.c (function bar): symbol 'foo' not found
foo.c (.text+0x1234): symbol 'foo' not found
This patch changes them so that they are consistent with Clang's output.
foo.c:32: symbol 'foo' not found
foo.c:(function bar): symbol 'foo' not found
foo.c:(.text+0x1234): symbol 'foo' not found
Differential Revision: https://reviews.llvm.org/D26901
llvm-svn: 287537
Previously, we set (uintptr_t)-1 to InputSectionBase::OutSec to record
that a section has already been set to be assigned to some output section
by linker scripts. Later, we restored nullptr to the pointer to use
the field for the original purpose. That overloading is not very easy to
understand.
This patch adds a bit flag for that purpose, so that we don't need
to piggyback the flag on an unrelated pointer.
llvm-svn: 287508
Also this patch uses file-scope functions instead of class member function.
Now that ICF class is not visible from outside, InputSection class
can no longer be "friend" of it. So I removed the friend relation
and just make it expose the features to public.
llvm-svn: 287480
GNU linkers disagree here.
Though both -version and -v are mentioned
in help to print the version information, GNU ld just normally exits,
while gold can continue linking. We are compatible with ld.bfd here.
This fixes PR31057.
Differential revision: https://reviews.llvm.org/D26865
llvm-svn: 287448
readVersionDeclaration was to read anonymous version definition and
named version definition. Splitting it into two functions should
improve readability as the two cases are different enough.
I also changed a few helper functions to return values instead of
mutating given references.
llvm-svn: 287319
MergeOutputSection class was a bit hard to use because it provdes
a series of finalize functions that have to be called in a right way
at a right time. It also intereacted with MergeInputSection, and the
logic was somewhat entangled between the two classes.
This patch simplifies it by providing only one finalize function.
Now, all you have to do is to call MergeOutputSection::finalize
when you have added all sections to the output section. Then, it
internally merges strings and initliazes StringPiece objects.
I think this is much easier to understand.
This patch also adds comments.
llvm-svn: 287314
Since the output has a section table too, it is meaningful to compute
the sh_link. In a more practical note, the binutils' strip crashes if
sh_link is not set for SHT_ARM_EXIDX.
llvm-svn: 287280
I hit an internal linker script that was defining _DYNAMIC instead of
letting the linker do it. It turns out that both bfd and gold allow
that.
This is pretty easy to implement, just make the linker defined symbol
weak. This should have no impact in the case where there is no user
defined symbol: The visibility is hidden, which causes the output to
still be local.
llvm-svn: 287260
Linker script doesn't create a section if it has no content. So the following
script doesn't create .norelocs section if it doesn't have any .rel* sections.
.norelocs : { *(.rel*) }
Later, if you assert that the size of .norelocs is 0, LLD printed out
an error message, because it didn't allow calling SIZEOF() on nonexistent
sections.
This patch allows SIZEOF() on nonexistent sections, so that you can do
something like this.
ASSERT(SIZEOF(.norelocs), "shouldn't contain .rel sections!")
Note that this behavior is compatible with GNU.
Differential Revision: https://reviews.llvm.org/D26810
llvm-svn: 287257
LLD supports multi-threading, and it seems to be working well as
you can see in r287140. In short, LLD runs a few percent to 30%
faster with -threads and more than 50% faster if you are using
-build-id (your mileage may vary depending on your computer).
However, I don't think most users even don't know about that because
-threads is not a default option.
This patch enables it by default.
Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107160.html
llvm-svn: 287237
This patch updates a couple places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.
llvm-svn: 287205