Commit Graph

278 Commits

Author SHA1 Message Date
Rafael Espindola 99558efed6 Pass an InputSectionData to classof.
This allows a non template class to hold input sections.

llvm-svn: 285221
2016-10-26 18:44:57 +00:00
Rafael Espindola 1854a8ebb8 Delete trivial getters. NFC.
llvm-svn: 285190
2016-10-26 12:36:56 +00:00
Rafael Espindola 0e090522c8 Read section headers upfront.
Instead of storing a pointer, store the members we need.

The reason for doing this is that it makes it far easier to create
synthetic sections. It also avoids reading data from files multiple
times, which might help with cross endian linking and host
architectures with slow unaligned access.

There are obvious compacting opportunities, but this already has mixed
results even on native x86_64 linking.

There is also the possibility of better refactoring the code for
handling common symbols, but this already shows that a custom class is
not necessary.

llvm-svn: 285148
2016-10-26 00:54:03 +00:00
Rafael Espindola 397f0aa0d3 Be a bit more consistent about using getters. NFC.
llvm-svn: 285082
2016-10-25 16:42:46 +00:00
Rafael Espindola 58139d1758 Delete getSectionHdr.
We were fairly inconsistent as to what information should be accessed
with getSectionHdr and what information (like alignment) was stored
elsewhere.

Now all section info has a dedicated getter. The code is also a bit
more compact.

llvm-svn: 285079
2016-10-25 16:14:25 +00:00
Hans Wennborg 7314c48bcb Fix SectionPiece size when compiling with MSVC
Builds were failing with:

  InputSection.h(139): error C2338: SectionPiece is too big

because MSVC does record layout differently, probably not packing the
'OutputOff' and 'Live' bitfields because their types are of different
size. Using size_t for 'Live' seems to fix it.

llvm-svn: 284740
2016-10-20 15:59:08 +00:00
Rafael Espindola 113860b9ae Compact SectionPiece.
We allocate a lot of these when linking debug info. This speeds up the
link of debug programs by 1% to 2%.

llvm-svn: 284716
2016-10-20 10:55:58 +00:00
Rui Ueyama 388838ed23 Format. NFC.
llvm-svn: 284697
2016-10-20 05:23:23 +00:00
Rafael Espindola 116d83fbe0 Don't call markLiveAt for non alloc sections.
We don't gc them anyway, so just use an early return in Enqueue.

llvm-svn: 284663
2016-10-19 23:13:40 +00:00
Rui Ueyama 05384080df Support GNU-style ZLIB-compressed input sections.
Previously, we supported only SHF_COMPRESSED sections because it's
new and it's the ELF standard. But there are object files compressed
in the GNU style out there, so we had to support it.

Sections compressed in the GNU style have names that start with
".zdebug_" and contain different headers than the ELF standard's one.
In this patch, getRawCompressedData is responsible for handling it.

A tricky thing about GNU-style compressed sections is that we have
to rename them when creating output sections. ".zdebug_" prefix
implies the section is compressed. We need to rename ".zdebug_"
to ".debug_" because our output sections are not compressed.
We do that in this patch.
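
A rough sketch of the two steps described above. The function names are hypothetical and not LLD's API. The header layout ("ZLIB" followed by an 8-byte big-endian uncompressed size) comes from the commonly documented GNU format, not from this message, which only says the headers differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// True if Data starts with a GNU-style header: the magic "ZLIB" followed by
// an 8-byte big-endian uncompressed size.
bool hasGnuZlibHeader(const std::string &Data) {
  return Data.size() >= 12 && Data.compare(0, 4, "ZLIB") == 0;
}

// Reads the uncompressed size from a GNU-style header.
std::uint64_t gnuUncompressedSize(const std::string &Data) {
  assert(hasGnuZlibHeader(Data));
  std::uint64_t Size = 0;
  for (int I = 4; I < 12; ++I)
    Size = (Size << 8) | static_cast<unsigned char>(Data[I]);
  return Size;
}

// Maps ".zdebug_foo" to ".debug_foo"; other names are returned unchanged.
std::string outputSectionName(const std::string &Name) {
  const std::string Prefix = ".zdebug_";
  if (Name.compare(0, Prefix.size(), Prefix) == 0)
    return ".debug_" + Name.substr(Prefix.size());
  return Name;
}
```

For example, outputSectionName(".zdebug_info") yields ".debug_info", while ".text" passes through untouched.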

llvm-svn: 284068
2016-10-12 22:36:31 +00:00
Peter Smith 0760605ac5 [ELF][ARM] Garbage collection support for .ARM.exidx sections
.ARM.exidx sections have a reverse dependency on the section they have
a SHF_LINK_ORDER dependency on. In other words a .ARM.exidx section is
live only if the executable section it describes is live. We implement
this with a reverse dependency field in InputSection.

Adding the dependency to InputSection is the simplest implementation
but it could be moved out to a separate map if it were found to decrease
performance for non ARM targets.
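
The reverse dependency can be sketched as an extra edge followed during marking. The Section type and its fields below are hypothetical; the message only says InputSection gained a reverse dependency field.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Section {
  std::vector<std::size_t> Refs;        // sections reached through relocations
  std::vector<std::size_t> Dependents;  // e.g. the .ARM.exidx describing this one
  bool Live = false;
};

// Mark-and-sweep style marking: a section becomes live if it is a root, is
// referenced by a live section, or depends on a live section.
void markLive(std::vector<Section> &Secs, const std::vector<std::size_t> &Roots) {
  std::vector<std::size_t> Work;
  auto Enqueue = [&](std::size_t I) {
    assert(I < Secs.size());
    if (Secs[I].Live)
      return;
    Secs[I].Live = true;
    Work.push_back(I);
  };
  for (std::size_t R : Roots)
    Enqueue(R);
  while (!Work.empty()) {
    std::size_t I = Work.back();
    Work.pop_back();
    for (std::size_t T : Secs[I].Refs)
      Enqueue(T);
    for (std::size_t D : Secs[I].Dependents)
      Enqueue(D);
  }
}
```

Note that the .ARM.exidx sections are not roots in this sketch. If they were, their relocations would keep every function they describe alive.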

Differential revision: https://reviews.llvm.org/D25234

llvm-svn: 283734
2016-10-10 10:10:27 +00:00
Peter Smith 0a259f3b9c [ELF][ARM] Initial implementation of ARM exceptions support
The .ARM.exidx sections contain a table. Each entry has two fields:
- PREL31 offset to the function the table entry describes
- Action to take, either cantunwind, inline unwind, or PREL31 offset to
  .ARM.extab section

The table entries must be sorted by the virtual address that the first
field of each entry describes. Traditionally this is implemented by
the SHF_LINK_ORDER dependency. Instead of implementing this directly we
sort the table entries post relocation.
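
Sorting after relocation means each PREL31 field has to be re-encoded, because it is relative to the entry's own address. A minimal sketch follows. The type and function names are hypothetical, and the action-word conventions (1 for cantunwind, bit 31 set for inline unwind data) are taken from the ARM EHABI rather than from this message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ExidxEntry {
  std::uint32_t Fn;      // PREL31 offset to the function
  std::uint32_t Action;  // cantunwind (1), inline data, or PREL31 to .ARM.extab
};

// Decodes a PREL31 field located at address Place into an absolute address.
std::uint64_t decodePrel31(std::uint32_t Word, std::uint64_t Place) {
  std::int64_t Off = Word & 0x7fffffff;
  if (Off & 0x40000000)  // sign bit of the 31-bit value
    Off -= 0x80000000LL;
  return Place + Off;
}

// Encodes Target as a PREL31 field to be stored at address Place.
std::uint32_t encodePrel31(std::uint64_t Target, std::uint64_t Place) {
  std::int64_t Delta = static_cast<std::int64_t>(Target - Place);
  assert(Delta >= -(1LL << 30) && Delta < (1LL << 30));
  return static_cast<std::uint32_t>(Delta) & 0x7fffffff;
}

// Sorts the table (located at address Base, 8 bytes per entry) by function
// address and re-encodes the place-relative fields for their new positions.
void sortExidx(std::vector<ExidxEntry> &Table, std::uint64_t Base) {
  struct Abs {
    std::uint64_t Fn, Extab;
    std::uint32_t Raw;
    bool HasExtab;
  };
  std::vector<Abs> Tmp;
  for (std::size_t I = 0; I < Table.size(); ++I) {
    std::uint64_t Place = Base + 8 * I;
    std::uint32_t A = Table[I].Action;
    bool HasExtab = A != 1 && !(A & 0x80000000u);
    Tmp.push_back(Abs{decodePrel31(Table[I].Fn, Place),
                      HasExtab ? decodePrel31(A, Place + 4) : 0, A, HasExtab});
  }
  std::stable_sort(Tmp.begin(), Tmp.end(),
                   [](const Abs &L, const Abs &R) { return L.Fn < R.Fn; });
  for (std::size_t I = 0; I < Tmp.size(); ++I) {
    std::uint64_t Place = Base + 8 * I;
    Table[I].Fn = encodePrel31(Tmp[I].Fn, Place);
    Table[I].Action =
        Tmp[I].HasExtab ? encodePrel31(Tmp[I].Extab, Place + 4) : Tmp[I].Raw;
  }
}
```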

The .ARM.exidx OutputSection is described by the PT_ARM_EXIDX program
header.

Differential revision: https://reviews.llvm.org/D25127

llvm-svn: 283730
2016-10-10 09:39:26 +00:00
Rafael Espindola 5fc2b1d2fe Store the hash in SectionPiece.
This spreads out computing the hash and using it in a hash table. The
speedups are:

firefox
  master 6.811232891
  patch  6.559280249 1.03841162939x faster
chromium
  master 4.369323666
  patch  4.33171853 1.00868134338x faster
chromium fast
  master 1.856679971
  patch  1.850617741 1.00327578725x faster
the gold plugin
  master 0.32917962
  patch  0.325711944 1.01064645023x faster
clang
  master 0.558015452
  patch  0.550284165 1.01404962652x faster
llvm-as
  master 0.032563515
  patch  0.032152077 1.01279662275x faster
the gold plugin fsds
  master 0.356221362
  patch  0.352772162 1.00977741549x faster
clang fsds
  master 0.635096494
  patch  0.627249229 1.01251060127x faster
llvm-as fsds
  master 0.030183188
  patch  0.029889544 1.00982430511x faster
scylla
  master 3.071448906
  patch  2.938484138 1.04524944215x faster

This seems to be because we don't stall as much. When linking firefox
stalled-cycles-frontend goes from 57.56% to 55.55%.

With -O2 the difference is even more significant since we avoid
recomputing the hash. For firefox we go from 9.990295265 to
 9.149627521 seconds (1.09x faster).
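
The idea of computing the hash once and reusing it at table-insertion time can be sketched as follows. Piece, hashBytes and the other names are hypothetical, and FNV-1a is only a stand-in hash, not necessarily what LLD uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct Piece {
  std::string Data;
  std::uint32_t Hash;  // computed once, when the section is split
};

// FNV-1a, used here only as a stand-in hash function.
std::uint32_t hashBytes(const std::string &S) {
  std::uint32_t H = 2166136261u;
  for (unsigned char C : S) {
    H ^= C;
    H *= 16777619u;
  }
  return H;
}

// Pass 1: split a section of NUL-terminated strings and hash each piece.
std::vector<Piece> splitStrings(const std::string &Sec) {
  std::vector<Piece> Out;
  std::size_t Start = 0;
  while (Start < Sec.size()) {
    std::size_t End = Sec.find('\0', Start);
    if (End == std::string::npos)
      End = Sec.size();
    std::string S = Sec.substr(Start, End - Start);
    Out.push_back(Piece{S, hashBytes(S)});
    Start = End + 1;
  }
  return Out;
}

struct Key { const Piece *P; };
struct KeyHash {
  // Reuse the stored hash instead of rehashing the bytes.
  std::size_t operator()(const Key &K) const { return K.P->Hash; }
};
struct KeyEq {
  bool operator()(const Key &A, const Key &B) const { return A.P->Data == B.P->Data; }
};

// Pass 2: deduplicate pieces and return each piece's offset in the output.
std::vector<std::size_t> assignOffsets(const std::vector<Piece> &Pieces) {
  std::unordered_map<Key, std::size_t, KeyHash, KeyEq> Table;
  std::vector<std::size_t> Offsets;
  std::size_t Next = 0;
  for (const Piece &P : Pieces) {
    assert(P.Hash == hashBytes(P.Data));  // debug-only sanity check
    auto R = Table.emplace(Key{&P}, Next);
    if (R.second)
      Next += P.Data.size() + 1;  // string plus its NUL terminator
    Offsets.push_back(R.first->second);
  }
  return Offsets;
}
```

Separating the two passes is what spreads out the hashing work from the table lookups, which the message credits for the reduced stalls.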

llvm-svn: 283367
2016-10-05 19:36:02 +00:00
Rafael Espindola 32aca87bf8 Compact SectionPiece.
It is pretty easy to get the data from the InputSection, so we don't
have to store it.

This opens the way for storing the hash instead.

llvm-svn: 283357
2016-10-05 18:40:00 +00:00
Rafael Espindola 939e9493bf Simplify setting the Live bit in SectionPiece. NFC.
llvm-svn: 283340
2016-10-05 17:02:09 +00:00
Rafael Espindola c7e1e03498 Store an ArrayRef for Data in InputSectionData.
llvm-svn: 281210
2016-09-12 13:13:53 +00:00
Rafael Espindola 54f1614ec1 Revert "Revert "Compact InputSectionData from 64 to 48 bytes. NFC.""
This reverts commit r281096.

The previous link errors should be fixed by r281208.

llvm-svn: 281209
2016-09-12 13:06:10 +00:00
Rafael Espindola 78fe670994 Revert "Compact InputSectionData from 64 to 48 bytes. NFC."
This reverts commit r281084.

The link was failing on some bots. No idea why. I will try to
reproduce it on Monday.

llvm-svn: 281096
2016-09-09 21:20:30 +00:00
Rafael Espindola 82621dcb10 Compact InputSectionData from 64 to 48 bytes. NFC.
llvm-svn: 281084
2016-09-09 19:42:11 +00:00
Rafael Espindola 042a3f209b Compute section names only once.
This simplifies error handling as there is now only one place in the
code that needs to consider the possibility that the name is
corrupted. Before we would do it in every access.

llvm-svn: 280937
2016-09-08 14:06:08 +00:00
Rafael Espindola 16853bb00f Pack InputSectionData from 72 to 64 bytes. NFC.
llvm-svn: 280925
2016-09-08 12:33:41 +00:00
Rafael Espindola 0a75850fa7 Move field to the base class. NFC.
llvm-svn: 280858
2016-09-07 20:41:19 +00:00
Rafael Espindola 664c6522fa Delete dead field. NFC.
llvm-svn: 280856
2016-09-07 20:37:34 +00:00
Eugene Leviant 97403d15ee Eliminate LayoutInputSection class
Previously we used LayoutInputSection class to correctly assign
symbols defined in linker script. This patch removes it and uses
pointer to preceding input section in SymbolAssignment class instead.

Differential revision: https://reviews.llvm.org/D23661

llvm-svn: 280348
2016-09-01 09:55:57 +00:00
Rafael Espindola e7553e4eac Delete unnecessary template.
llvm-svn: 280237
2016-08-31 13:28:33 +00:00
Simon Atanasyan 85c6b44817 [ELF][MIPS] Support .MIPS.abiflags section
This section supersedes .reginfo and .MIPS.options sections. But for now
we have to support all three sections for ABI transition period.

llvm-svn: 278482
2016-08-12 06:28:49 +00:00
Eugene Leviant ceabe80e97 [ELF] Symbol assignment within output section description
llvm-svn: 278322
2016-08-11 07:56:43 +00:00
Rui Ueyama d6bd1371fc Include filenames and section names in error messages.
llvm-svn: 277566
2016-08-03 04:39:42 +00:00
Rui Ueyama 09d4f177fc Remove dependency to SymbolTable from CommonInputSection.
llvm-svn: 277103
2016-07-29 03:39:44 +00:00
Rui Ueyama ad10c3d8d4 Make CommonInputSection singleton class.
All other singleton instances are accessible globally.
CommonInputSection shouldn't be an exception.

Differential Revision: https://reviews.llvm.org/D22935

llvm-svn: 277034
2016-07-28 21:05:04 +00:00
Eugene Leviant 3e6b027705 [ELF] Allows setting section for common symbols in linker script
llvm-svn: 277023
2016-07-28 19:24:13 +00:00
Rafael Espindola 2deeb6093d Fix PR28575.
Not all relocations from a .eh_frame that point to an executable
section should be ignored. In particular, the relocation finding the
personality function should not.

This is a reduction from trying to bootstrap a static lld on linux.

llvm-svn: 276329
2016-07-21 20:18:30 +00:00
Rafael Espindola 6eae9f2c67 Delete SplitInputSection.
This opens the way for having a different Piece type for EhInputSection.

llvm-svn: 276275
2016-07-21 13:32:37 +00:00
Rafael Espindola 2197311c31 Delete EhInputSection::getOffset.
We no longer need it for relocations in .eh_frame.

The only relocations that point to .eh_frame are the ones trying to
find the output .eh_frame.

This actually fixes a bug in the symbol value code. It was not
handling -1 as an indicator for a piece not being included in the
output.

llvm-svn: 276175
2016-07-20 20:19:58 +00:00
Eugene Leviant e63d81bd05 [ELF] Create output sections in LinkerScript class
llvm-svn: 276121
2016-07-20 14:43:20 +00:00
George Rimar 5d53d1f42c [ELF] - Make few members of Writer to be global and export them for reuse
Creating sections on the linker script side requires some methods
that can be reused if they are exported from the writer.

This patch implements that change.

Differential revision: http://reviews.llvm.org/D20104

llvm-svn: 275162
2016-07-12 08:50:42 +00:00
Peter Smith fb05cd997c Recommit R274836 Add Thunk support framework for ARM and Mips
The TinyPtrVector of const Thunk<ELFT>* in InputSections.h can cause
build failures on certain compiler/library combinations when Thunk<ELFT>
is not a complete type or is an abstract class. Fixed by making Thunk<ELFT>
non-abstract.

llvm-svn: 274863
2016-07-08 16:10:27 +00:00
Peter Smith eeb827447e Revert R274836 Add Thunk support framework for ARM and Mips
This seems to be causing a buildbot failure on lld-x86_64-freebsd. Will
reproduce locally and fix. 

llvm-svn: 274841
2016-07-08 12:25:50 +00:00
Peter Smith de01b98a26 Add Thunk support framework for ARM and Mips
Generalise the Mips LA25 Thunk code and implement ARM and Thumb
interworking Thunks.

- Introduce a new module Thunks.cpp to store the Target Specific Thunk
  implementations.
- DefinedRegular and Shared have a ThunkData field to record Thunk.
- A Target can have more than one type of Thunk.
- Support PC-relative calls to Thunks.
- Support Thunks to PLT entries.
- Existing Mips LA25 Thunk code integrated.
- Support for ARMv7A interworking Thunks.

Limitations:
- Only one Thunk per SymbolBody, this is sufficient for all currently
  implemented Thunks.
- ARM thunks assume presence of V6T2 MOVT and MOVW instructions.

Differential revision: http://reviews.llvm.org/D21891

llvm-svn: 274836
2016-07-08 11:13:40 +00:00
Rui Ueyama 1d12ac1d11 Fix endianness issue.
Previously, ch_size was read in host byte order, so if a host and
a target are different in byte order, we would produce a corrupted
output.
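
A byte-order-independent read can be sketched like this. The helper name is hypothetical and it handles a 32-bit field only. LLD itself may rely on LLVM's endian-aware types instead.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit field stored in the target's byte order. The result is the
// same on little-endian and big-endian hosts because it is assembled byte by
// byte instead of being loaded as a host integer.
std::uint32_t read32(const unsigned char *P, bool TargetIsLittleEndian) {
  assert(P != nullptr);
  if (TargetIsLittleEndian)
    return std::uint32_t(P[0]) | std::uint32_t(P[1]) << 8 |
           std::uint32_t(P[2]) << 16 | std::uint32_t(P[3]) << 24;
  return std::uint32_t(P[3]) | std::uint32_t(P[2]) << 8 |
         std::uint32_t(P[1]) << 16 | std::uint32_t(P[0]) << 24;
}
```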

llvm-svn: 274729
2016-07-07 03:55:55 +00:00
George Rimar 602fbee9fc [ELF] - Support of compressed input sections implemented.
Patch implements support of zlib style compressed sections.
SHF_COMPRESSED flag is used to recognize that decompression is required.
After that decompression is performed and flag is removed from output.

Differential revision: http://reviews.llvm.org/D20272

llvm-svn: 273661
2016-06-24 11:18:44 +00:00
Rui Ueyama 809d8e2d41 Fix a bug that MIPS thunks can overwrite other section contents.
Peter Smith found while trying to support thunk creation for ARM that
LLD sometimes creates broken thunks for MIPS. The cause of the bug is
that we assign file offsets to input sections too early. We need to
create all sections and then assign section offsets because appending
thunks changes file offsets for all following sections.

This patch separates the pass to assign file offsets from thunk
creation pass. This effectively reverts r265673.

Differential Revision: http://reviews.llvm.org/D21598

llvm-svn: 273532
2016-06-23 04:33:42 +00:00
Rui Ueyama 424b408165 Rename Align -> Alignment.
I think it is me who named these variables, but I always find that
they are slightly confusing because align is a verb.
Adding four letters is worth it.

llvm-svn: 272984
2016-06-17 01:18:46 +00:00
Davide Italiano e6c8fa4530 [ELF] Unbreak build with GCC.
Differential Revision:  http://reviews.llvm.org/D20777

llvm-svn: 271148
2016-05-28 23:27:38 +00:00
Rui Ueyama 406b469de4 Avoid doing binary search.
MergedInputSection::getOffset is the busiest function in LLD if string
merging is enabled and input files have lots of mergeable sections.
It is usually the case when creating an executable with debug info,
so it is pretty common.

The reason why it is slow is because it has to do fairly complex
computations. For non-mergeable sections, section contents are
contiguous in output, so in order to compute an output offset,
we only have to add the output section's base address to an input
offset. But for mergeable strings, section contents are split for
merging, so they are not contiguous. We've got to do some lookups.

We used to do binary search on the list of section pieces.
It is slow because I think it's hostile to branch prediction.

This patch replaces it with hash table lookup. Seems it's working
pretty well. Below is "perf stat -r10" output when linking clang
with debug info. In this case this patch speeds up about 4%.

Before:

       6584.153205 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.09% )
               238 context-switches          #    0.036 K/sec                    ( +-  6.59% )
                 0 cpu-migrations            #    0.000 K/sec                    ( +- 50.92% )
         1,067,675 page-faults               #    0.162 M/sec                    ( +-  0.15% )
    18,369,931,470 cycles                    #    2.790 GHz                      ( +-  0.09% )
     9,640,680,143 stalled-cycles-frontend   #   52.48% frontend cycles idle     ( +-  0.18% )
   <not supported> stalled-cycles-backend
    21,206,747,787 instructions              #    1.15  insns per cycle
                                             #    0.45  stalled cycles per insn  ( +-  0.04% )
     3,817,398,032 branches                  #  579.786 M/sec                    ( +-  0.04% )
       132,787,249 branch-misses             #    3.48% of all branches          ( +-  0.02% )

       6.579106511 seconds time elapsed                                          ( +-  0.09% )

After:

       6312.317533 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.19% )
               221 context-switches          #    0.035 K/sec                    ( +-  4.11% )
                 1 cpu-migrations            #    0.000 K/sec                    ( +- 45.21% )
         1,280,775 page-faults               #    0.203 M/sec                    ( +-  0.37% )
    17,611,539,150 cycles                    #    2.790 GHz                      ( +-  0.19% )
    10,285,148,569 stalled-cycles-frontend   #   58.40% frontend cycles idle     ( +-  0.30% )
   <not supported> stalled-cycles-backend
    18,794,779,900 instructions              #    1.07  insns per cycle
                                             #    0.55  stalled cycles per insn  ( +-  0.03% )
     3,287,450,865 branches                  #  520.799 M/sec                    ( +-  0.03% )
        72,259,605 branch-misses             #    2.20% of all branches          ( +-  0.01% )

       6.307411828 seconds time elapsed                                          ( +-  0.19% )
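
The two lookup strategies can be contrasted with a small sketch. All names are hypothetical, and falling back to binary search for offsets that do not start a piece is an assumption, not something this message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Piece {
  std::uint64_t InputOff;   // where the piece starts in the input section
  std::uint64_t OutputOff;  // where the piece was placed in the output
};

// Binary search for the piece containing Offset. Pieces must be sorted by
// InputOff and Offset must not precede the first piece.
std::uint64_t getOffsetBinary(const std::vector<Piece> &Pieces, std::uint64_t Offset) {
  assert(!Pieces.empty() && Offset >= Pieces.front().InputOff);
  auto It = std::upper_bound(
      Pieces.begin(), Pieces.end(), Offset,
      [](std::uint64_t Off, const Piece &P) { return Off < P.InputOff; });
  const Piece &P = *(It - 1);
  return P.OutputOff + (Offset - P.InputOff);
}

using OffsetMap = std::unordered_map<std::uint64_t, std::uint64_t>;

// Precompute input offset -> output offset for every piece start.
OffsetMap buildOffsetMap(const std::vector<Piece> &Pieces) {
  OffsetMap M;
  for (const Piece &P : Pieces)
    M[P.InputOff] = P.OutputOff;
  return M;
}

// Hash lookup first; offsets into the middle of a piece use the slow path.
std::uint64_t getOffsetHashed(const OffsetMap &M, const std::vector<Piece> &Pieces,
                              std::uint64_t Offset) {
  auto It = M.find(Offset);
  if (It != M.end())
    return It->second;
  return getOffsetBinary(Pieces, Offset);
}
```

The hash path pays off when most queries hit the start of a piece, which is the usual case for references to mergeable strings.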

Differential Revision: http://reviews.llvm.org/D20645

llvm-svn: 270999
2016-05-27 14:39:13 +00:00
Rui Ueyama d884927463 Make SectionPiece 8 bytes smaller on LP64.
This patch makes SectionPiece class 8 bytes smaller on platforms
on which pointer size is 8 bytes. Sean suggested in a post commit
review for r270340 that this could make a difference, and it
actually does. Time to link clang (with debug info) improved from
6.725 seconds to 6.589 seconds or by about 2%.

Differential Revision: http://reviews.llvm.org/D20613

llvm-svn: 270717
2016-05-25 16:37:01 +00:00
Rui Ueyama 0fcdc730ad Create Relocations.cpp and move scanRelocs there.
scanReloc and the functions on which scanReloc depends are in total
more than 600 lines of code. Since scanReloc does not depend on Writer,
it is better to move it into a separate file.

Differential Revision: http://reviews.llvm.org/D20554

llvm-svn: 270606
2016-05-24 20:24:43 +00:00
Rafael Espindola fe3a2f1b81 Revert "Simplify. Thanks to Rui for the suggestion."
This reverts commit r270551.

Sorry, I committed the wrong branch :-(

llvm-svn: 270554
2016-05-24 12:12:06 +00:00
Rafael Espindola dba64b8ea4 Simplify. Thanks to Rui for the suggestion.
llvm-svn: 270551
2016-05-24 11:53:15 +00:00
Rui Ueyama 0b9a90364b Rename EHInputSection -> EhInputSection.
llvm-svn: 270532
2016-05-24 04:19:20 +00:00
Rui Ueyama b91bf1a9a0 Do not split mergeable sections if they are gc'ed.
Previously, mergeable sections' constructors did more than just
set member variables; they split section contents into small
pieces. That is not always a computationally cheap task because if
the section is a mergeable string section, it needs to scan the
entire section to split it by NUL characters.

If a section would be thrown away by GC, that cost ended up
being a waste of time. It is going to be a larger problem if the
section is compressed -- the whole time to uncompress it and
split it up is going to be a waste.

Luckily, we can defer section splitting after GC. We just have
to remember which offsets are in use during GC and apply that later.
This patch implements it.
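
Deferring the split can be sketched as recording raw offsets during GC and resolving them to pieces afterwards. The types and member names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct Piece {
  std::size_t InputOff;
  std::size_t Size;  // includes the NUL terminator
  bool Live;
};

struct MergeSection {
  std::string Data;                   // NUL-terminated strings
  std::set<std::size_t> LiveOffsets;  // recorded during GC, before any split
  std::vector<Piece> Pieces;          // filled in only after GC

  // Called by GC: cheap, does not scan the section.
  void markLiveAt(std::size_t Off) {
    assert(Off < Data.size());
    LiveOffsets.insert(Off);
  }

  // Called after GC, and only for sections that survived it.
  void split() {
    std::size_t Start = 0;
    while (Start < Data.size()) {
      std::size_t End = Data.find('\0', Start);
      if (End == std::string::npos)
        End = Data.size() - 1;  // tolerate a missing final terminator
      std::size_t Size = End - Start + 1;
      // The piece is live if any recorded offset falls inside it.
      auto It = LiveOffsets.lower_bound(Start);
      bool Live = It != LiveOffsets.end() && *It < Start + Size;
      Pieces.push_back(Piece{Start, Size, Live});
      Start += Size;
    }
  }
};
```

A section discarded by GC never has split() called on it, so neither the scan nor any decompression is paid for.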

Differential Revision: http://reviews.llvm.org/D20516

llvm-svn: 270455
2016-05-23 16:55:43 +00:00
Rui Ueyama 88abd9b300 Move splitInputSection from EHOutputSection to EHInputSection.
llvm-svn: 270385
2016-05-22 23:53:00 +00:00
Rui Ueyama 34dc99e2c5 Store section contents to SectionPiece. NFC.
So that we don't need to cut a slice when we use a SectionPiece.

llvm-svn: 270348
2016-05-22 01:15:32 +00:00
Rui Ueyama 90fa3722d2 Simplify SplitInputSection::getRangeAndSize.
This patch adds Size member to SectionPiece so that getRangeAndSize
can just return a SectionPiece instead of a std::pair<SectionPiece *, uint_t>.
Also renamed the function.

llvm-svn: 270346
2016-05-22 00:41:38 +00:00
Rui Ueyama 3ea8727188 Define SectionPiece and use it instead of std::pair<uint_t, uint_t>.
We were using std::pair to represent pieces of splittable section
contents. It hurt readability because "first" and "second" are not
meaningful. This patch gives them names.

One more thing is that piecewise liveness information was stored in
the second element of the pair as a special value of output section
offset. It was confusing, so I defined a new bit, "Live", in the
new struct.

llvm-svn: 270340
2016-05-22 00:13:04 +00:00
Rafael Espindola ebed1fe0de Refactor R_RELAX_TLS_* value computation.
This makes it explicit that each R_RELAX_TLS_* is equivalent to some
other expression.

With this I think we are at a sweet spot for how much is done in
Target.cpp. I did experiment with moving *all* the value math out of it.
It has the advantage that we know the final value in target independent
code, but it gets quite verbose.

llvm-svn: 270277
2016-05-20 21:23:52 +00:00
Simon Atanasyan 4e3a15c9f3 [ELF][MIPS] Rename R_MIPS_GOT_xxx relocation expression kinds
New names reflect purpose of corresponding GOT entries better.
Both expression types relate to entries allocated in the 'local'
part of MIPS GOT. R_MIPS_GOT_LOCAL_PAGE is for entries that contain
'page' addresses. R_MIPS_GOT_LOCAL is for entries that contain 'full'
addresses.

llvm-svn: 269597
2016-05-15 18:13:50 +00:00
Rafael Espindola 3e0b7837bf Cache result when tail merging too.
This speeds up a link of chromium with -O2 (but no icf,gc) from
1.940664632 to 1.925578119.

llvm-svn: 268639
2016-05-05 16:12:25 +00:00
Peter Collingbourne e29e142a10 ELF: Do not use -1 to mark pieces of merge sections as being tail merged.
We were previously using an output offset of -1 for both GC'd and tail
merged pieces. We need to distinguish these two cases in order to filter
GC'd symbols from the symbol table -- we were previously asserting when we
asked for the VA of a symbol pointing into a dead piece, which would end
up asking the tail merging string table for an offset even though we hadn't
initialized it properly.

This patch fixes the bug by using an offset of -1 to exclusively mean GC'd
pieces, using 0 for tail merges, and distinguishing the tail merge case from
an offset of 0 by asking the output section whether it is tail merge.

Differential Revision: http://reviews.llvm.org/D19953

llvm-svn: 268604
2016-05-05 04:10:12 +00:00
Rafael Espindola ebb04b9eb6 Simplify handling of hint relocations.
llvm-svn: 268501
2016-05-04 14:44:22 +00:00
Rafael Espindola de2c76ed73 Sort entries. NFC.
llvm-svn: 268499
2016-05-04 14:38:55 +00:00
Simon Atanasyan add74f37f2 [ELF][MIPS] Read/write .MIPS.options section
The MIPS N64 ABI introduces the .MIPS.options section, which specifies
miscellaneous options to be applied to an object/shared/executable file.
LLVM as well as modern versions of GNU tools read and write only one type
of option, ODK_REGINFO. It is an exact copy of the .reginfo section used
by the O32 ABI.

llvm-svn: 268485
2016-05-04 10:07:38 +00:00
Rafael Espindola a85efd985c Don't create dynamic relocations to ro segments.
These would just crash at runtime.

If we ever decide to support rw text segments this should make it easier
to implement as there is now a single point where we notice the problem.

I have tested this with a freebsd buildworld. It found a non pic
assembly file being linked into a .so. With that fixed, buildworld
finished.

llvm-svn: 268149
2016-04-30 01:15:17 +00:00
Rui Ueyama 2b6fb80384 Skip scanRelocs for non-alloc sections.
Relocations against sections with no SHF_ALLOC bit are R_ABS relocations.
Currently we are creating Relocations vector for them, but that is wasteful.
This patch is to skip vector construction and to directly apply relocations
in place.
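
Applying absolute relocations directly, without building an intermediate vector, might look like the sketch below. The Reloc type is hypothetical, and the fixed 64-bit little-endian field (as for R_X86_64_64) is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Reloc {
  std::uint64_t Offset;    // where to write inside the section
  std::uint64_t SymValue;  // S: value of the referenced symbol
  std::int64_t Addend;     // A: explicit addend
};

// Writes V as 8 little-endian bytes.
void write64le(unsigned char *P, std::uint64_t V) {
  for (int I = 0; I < 8; ++I)
    P[I] = static_cast<unsigned char>(V >> (8 * I));
}

// For a section without SHF_ALLOC every relocation is treated as absolute,
// so S + A can be written straight into the output buffer in a single pass.
void relocateNonAlloc(std::vector<unsigned char> &Buf, const std::vector<Reloc> &Rels) {
  for (const Reloc &R : Rels) {
    assert(R.Offset + 8 <= Buf.size());
    write64le(Buf.data() + R.Offset, R.SymValue + R.Addend);
  }
}
```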

This patch seems to be pretty effective for large executables with debug info.
r266158 (Rafael's patch to change the way how we apply relocations) caused a
temporary performance degradation for such executables, but this patch makes
it even faster than before.

Time to link clang with debug info (output size is 1070 MB):

  before r266158: 15.312 seconds (0%)
  r266158:        17.301 seconds (+13.0%)
  Head:           16.484 seconds (+7.7%)
  w/patch:        13.166 seconds (-14.0%)

Differential Revision: http://reviews.llvm.org/D19645

llvm-svn: 267917
2016-04-28 18:42:04 +00:00
Peter Collingbourne 676c7cd1ed ELF: Move code to where it is used, and related cleanups. NFC.
Differential Revision: http://reviews.llvm.org/D19490

llvm-svn: 267637
2016-04-26 23:52:44 +00:00
Rafael Espindola 0b9531c8e6 Bring r267164 back with a fix.
The fix is to handle local symbols referring to SHF_MERGE sections.

Original message:

GC entries of SHF_MERGE sections.

It is a fairly direct extension of the gc algorithm. For merge sections
instead of remembering just a live bit, we remember which offsets
were used.

This reduces the .rodata sections in chromium from 9648861 to 9477472
bytes.

llvm-svn: 267233
2016-04-22 22:09:35 +00:00
Rafael Espindola 46c039f2c0 Revert "GC entries of SHF_MERGE sections."
This reverts commit r267164.

    Revert "Trying to fix the windows build."

    This reverts commit r267168.

Debugging a bootstrap problem.

llvm-svn: 267194
2016-04-22 19:31:35 +00:00
Rafael Espindola caa831d85a GC entries of SHF_MERGE sections.
It is a fairly direct extension of the gc algorithm. For merge sections
instead of remembering just a live bit, we remember which offsets were
used.

This reduces the .rodata sections in chromium from 9648861 to 9477472
bytes.

llvm-svn: 267164
2016-04-22 16:46:08 +00:00
Rafael Espindola 197d6a882f This reverts commit r267154 and r267161.
It turns out that this will read data from the section to properly
handle Elf_Rel implicit addends.

Sorry for the noise.

Original messages:

Try to fix Windows lld build.

Move getRelocTarget to ObjectFile.
It doesn't use anything from the InputSection.

llvm-svn: 267163
2016-04-22 16:39:59 +00:00
Rafael Espindola ea4d177977 Move getRelocTarget to ObjectFile.
It doesn't use anything from the InputSection.

llvm-svn: 267154
2016-04-22 14:17:14 +00:00
Rafael Espindola c6b17bdc29 Delete refersToGotEntry.
It can be computed from the expression.

llvm-svn: 266890
2016-04-20 17:30:22 +00:00
Rafael Espindola 475dbf42e4 Simplify mips gp0 handling.
In all currently supported cases this is a nop.

llvm-svn: 266888
2016-04-20 17:20:49 +00:00
Rafael Espindola 58cd5db4ef Simplify mips got handling.
This avoids computing the address of a position in the got just to then
subtract got->getva().

llvm-svn: 266831
2016-04-19 22:46:03 +00:00
Rafael Espindola 3f5d634c73 Have getRelExpr handle all cases on x86.
This requires adding a few more expression types, but is already a small
simplification. Having Writer.cpp know the exact expression will also
allow further simplifications.

llvm-svn: 266604
2016-04-18 12:07:13 +00:00
Rafael Espindola 38c67a27fe Store a Symbol for EntrySym.
This makes it impossible to forget to call repl on the SymbolBody.

llvm-svn: 266432
2016-04-15 14:41:56 +00:00
Rafael Espindola 22ef956a45 Change how we apply relocations.
With this patch we use the first scan over the relocations to remember
the information we found about them: will they be relaxed, will a plt be
used, etc.

With that the actual relocation application becomes much simpler. That
is particularly true for the interfaces in Target.h.

This unfortunately means that we now do two passes over relocations for
non SHF_ALLOC sections. I think this can be solved by factoring out the
code that scans a single relocation. It can then be used both as a scan
that records info and for a dedicated direct relocation of non SHF_ALLOC
sections.

I also think it is possible to reduce the number of enum values by
representing a target with just an OutputSection and an offset (which
can be from the start or end).

This should unblock adding features like relocation optimizations.

llvm-svn: 266158
2016-04-13 01:40:19 +00:00
Rafael Espindola 0f7ccc3d92 Update for llvm change.
llvm-svn: 265404
2016-04-05 14:47:28 +00:00
Rafael Espindola ccfe3cb3d6 Don't store an Elf_Sym for most symbols.
Our symbol representation was redundant, and sometimes would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.

Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.

There are two general approaches to fix this:
* Copy the required information and don't store an Elf_Sym.
* Don't copy the information and always use the Elf_Sym.

The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templating
*everything* and creates a bit of a chicken-and-egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.

As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.

In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.

llvm-svn: 265293
2016-04-04 14:04:16 +00:00
Simon Atanasyan 13f6da1d2c [ELF] Implement infrastructure for thunk code creation
Some targets might require creation of thunks. For example, MIPS targets
require stubs to call PIC code from non-PIC one. The patch implements
infrastructure for thunk code creation and provides support for MIPS
LA25 stubs. Any MIPS PIC code function is invoked with its address
in register $t9. So if we have a branch instruction from non-PIC code
to the PIC one we cannot make the jump directly and need to create a small
stub to save the target function address.
See page 3-38 ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf

- In the relocation scanning phase we ask the target whether a thunk is
needed by calling the `TargetInfo::needsThunk` method. The `InputSection`
class maintains a list of Symbols that require thunk creation.

- Offsets are reassigned for each input section after relocation
scanning completes, because the position of each section might change
due to thunk creation.

- The patch introduces a new dedicated value for DefinedSynthetic symbols,
DefinedSynthetic::SectionEnd. A synthetic symbol with that value always
points to the end of the corresponding output section. That avoids
updating synthetic symbols if output section sizes change after
relocation scanning due to thunk creation.

- In the `InputSection::writeTo` method we write thunks after the
corresponding input section. Each thunk is written by calling the
`TargetInfo::writeThunk` method.

- The patch supports only one type of thunk code for each target. For now,
it is enough.

Differential Revision: http://reviews.llvm.org/D17934

llvm-svn: 265059
2016-03-31 21:26:23 +00:00
Rafael Espindola 163974dd33 Simplify AHL handling.
This simplifies a few things

* Read the value as early as possible, instead of passing a pointer to
  the location.
* Print the warning for missing pair close to where we find out it is
  missing.
* Don't pass the value to relocateOne.

llvm-svn: 264802
2016-03-29 23:05:59 +00:00
Rafael Espindola 69082f051d Revert "bar"
This reverts commit r263799.
It was a mistake. Sorry about that.

llvm-svn: 263801
2016-03-18 18:11:26 +00:00
Rafael Espindola c2cfd9fa34 bar
llvm-svn: 263799
2016-03-18 18:09:32 +00:00
Rui Ueyama 9328b2cdde Use ELFT instead of ELFFile<ELFT>.
llvm-svn: 263510
2016-03-14 23:16:09 +00:00
Rui Ueyama fc467e77b8 Use RelTy instead of Elf_Rel_Impl<ELFT, isRela> for readability.
llvm-svn: 263368
2016-03-13 05:06:50 +00:00
Rui Ueyama 2039847062 Simplify findMipsPairedReloc function signature. NFC.
llvm-svn: 263356
2016-03-13 03:09:40 +00:00
Simon Atanasyan 92a32559fd [ELF][MIPS] Put type of symbol (local/global) to the findMipsPairedReloc and call it from the single place. NFC.
llvm-svn: 263339
2016-03-12 11:58:15 +00:00
Rafael Espindola e0df00b91f Rename elf2 to elf.
llvm-svn: 262159
2016-02-28 00:25:54 +00:00
Rui Ueyama 6a9ef8550c Remove default values which are always overwritten.
llvm-svn: 261913
2016-02-25 18:49:09 +00:00
Rui Ueyama 0b28952993 ELF: Implement ICF.
This patch implements the same algorithm as LLD/COFF's ICF. I'm
not going to repeat the same description about how it works, so you
want to read the comment in ICF.cpp in this patch if you want to know
the details. This algorithm should be more powerful than the ICF
algorithm implemented in GNU gold. It can even merge mutually-recursive
functions (which is harder than one might think).
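
To give a feel for how mutually-recursive functions can be merged, here is a generic partition-refinement sketch. It is not the code from ICF.cpp. The Section type is hypothetical and it compares only section bytes and relocation targets, ignoring flags, symbol kinds and everything else a real linker must check.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Section {
  std::string Data;                  // raw section bytes
  std::vector<std::size_t> Targets;  // sections referenced by relocations, in order
};

// Returns an equivalence class id per section; equal ids mean "foldable".
// Start optimistically (same bytes and relocation count => same class), then
// split classes whose members point at different classes until nothing changes.
std::vector<std::size_t> foldIdentical(const std::vector<Section> &Secs) {
  std::vector<std::size_t> Class(Secs.size());
  std::size_t NumClasses = 0;
  {
    std::map<std::pair<std::string, std::size_t>, std::size_t> Ids;
    for (std::size_t I = 0; I < Secs.size(); ++I) {
      auto Key = std::make_pair(Secs[I].Data, Secs[I].Targets.size());
      Class[I] = Ids.emplace(Key, Ids.size()).first->second;
    }
    NumClasses = Ids.size();
  }
  for (;;) {
    std::map<std::vector<std::size_t>, std::size_t> Ids;
    std::vector<std::size_t> Next(Secs.size());
    for (std::size_t I = 0; I < Secs.size(); ++I) {
      // Key: own class followed by the classes of all relocation targets.
      std::vector<std::size_t> Key{Class[I]};
      for (std::size_t T : Secs[I].Targets) {
        assert(T < Secs.size());
        Key.push_back(Class[T]);
      }
      Next[I] = Ids.emplace(Key, Ids.size()).first->second;
    }
    // Each round only splits classes, so an unchanged count means a fixpoint.
    if (Ids.size() == NumClasses)
      return Class;
    NumClasses = Ids.size();
    Class = std::move(Next);
  }
}
```

Because the starting assumption is that look-alike sections are equal, two functions that call each other stay in one class unless some difference is actually found.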

ICF is a fairly effective size optimization. Here are some examples.

 LLD:   37.14 MB -> 35.80 MB (-3.6%)
 Clang: 59.41 MB -> 57.80 MB (-2.7%)

The missing feature is a "safe" version of ICF. This patch merges all
identical sections, which is not compatible with the C/C++ language
requirement that two distinct functions must have distinct addresses.

But as long as your program does not rely on pointer equality
(which is true in many cases), your program should work with the
feature. LLD works fine, for example.

GNU gold implements so-called "safe ICF" that identifies functions
that are safe to merge by heuristics -- for example, gold thinks
that constructors are safe to merge because there is no way to
take the address of a constructor in C++. We have a different idea,
suggested by David Majnemer: add NOPs at the beginning of merged
functions so that two or more pointers can have distinct values.
We can do whichever we want, but this patch includes neither.

http://reviews.llvm.org/D17529

llvm-svn: 261912
2016-02-25 18:43:51 +00:00
George Rimar 58941ee12a [ELF2] - Basic implementation of -r/--relocatable
-r, -relocatable - Generate relocatable output

Currently there is no support for files containing
relocation sections with entries that refer to local
symbols (like rel[a].eh_frame, which refers to sections
and not to symbols).

Differential revision: http://reviews.llvm.org/D14382

llvm-svn: 261838
2016-02-25 08:23:37 +00:00
Rui Ueyama 733153de3c ELF: Do not instantiate InputSectionBase::Discarded.
The "Discarded" section is a marker for discarded sections, and we do not
use the instance except for checking its identity. In that sense, it
is just another type of "null" pointer for InputSectionBase. So
it doesn't have to be a real instance of the InputSectionBase class.

In this patch, we no longer instantiate Discarded section but instead
use -1 as a pointer value. This eliminates a global variable which
needed initialization at startup.
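
A minimal sketch of the sentinel-pointer idea described above. The
`InputSectionStub` type and the helper names are hypothetical and are not
LLD's actual declarations; the point is only that the marker is compared
by identity and never dereferenced.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for InputSectionBase; not the real class.
struct InputSectionStub {
  int Dummy;
};

// The "discarded" marker is the pointer value -1. No object lives at
// that address, so nothing needs to be constructed at startup.
inline InputSectionStub *discardedMarker() {
  return reinterpret_cast<InputSectionStub *>(static_cast<std::uintptr_t>(-1));
}

// Only the pointer's identity is checked; it is never dereferenced.
inline bool isDiscarded(const InputSectionStub *S) {
  return S == discardedMarker();
}
```

The marker stays distinct from both a real section and a null pointer, so
callers can still tell "no section" apart from "discarded section".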

llvm-svn: 261761
2016-02-24 18:33:35 +00:00
Rui Ueyama 5ac589171d ELF: Remove InputSectionBase::getAlign and instead add Align member.
This is a preparation for ICF. If we merge two sections, we want to
align the merged section at the largest alignment requirement.
That means we want to update the alignment value, which was
impossible before this patch because Header is a const value.

llvm-svn: 261712
2016-02-24 00:38:18 +00:00
Rui Ueyama 8fc070d64d ELF: Remove InputSectionBase::isLive and use Live member instead. NFC.
This is also a preparation for ICF.

llvm-svn: 261711
2016-02-24 00:23:15 +00:00
Rui Ueyama d7e4a281c4 ELF: Make some functions constant. NFC.
This is a preparation for ICF.

llvm-svn: 261710
2016-02-24 00:23:13 +00:00
Rui Ueyama c00718fd8e Use TinyPtrVector<Ty *> instead of SmallVector<Ty *, 1>.
Thanks to Sean Silva for the suggestion.

llvm-svn: 261606
2016-02-23 03:34:37 +00:00
Rui Ueyama 70eed364fc Simplify MipsReginfoInputSection.
MipsReginfoInputSection is basically just a container for the Elf_Mips_Reginfo
struct. This patch makes that struct directly accessible from others.

llvm-svn: 256984
2016-01-06 22:42:43 +00:00
Rui Ueyama 5a32c7647e Add comments.
llvm-svn: 256905
2016-01-06 03:16:23 +00:00
Simon Atanasyan 57830b60dc [ELF][MIPS] Implement R_MIPS_GPREL16/R_MIPS_GPREL32 relocations
The R_MIPS_GPREL16 / R_MIPS_GPREL32 relocations use the following
expressions for calculations:
```
local symbol:  S + A + GP0 - GP
global symbol: S + A - GP

GP  - Represents the final gp value, i.e. _gp symbol
GP0 - Represents the gp value used to create the relocatable object
```
The GP0 value is taken from the .reginfo data section defined by an object
file. To implement that I keep a reference to `MipsReginfoInputSection`
in the `ObjectFile` class. This reference is used by the
`ObjectFile::getMipsGp0` method to return the GP0 value.
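
The two expressions above can be sketched as a single function. This is an
illustration only: `mipsGpRel` and its parameters are hypothetical names,
not the code in this patch, and truncation to 16 or 32 bits is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLD's API.
// S   = symbol address, A = addend,
// GP0 = gp value recorded in the input object's .reginfo,
// GP  = final gp value, i.e. the _gp symbol.
inline int64_t mipsGpRel(uint64_t S, int64_t A, uint64_t GP0, uint64_t GP,
                         bool IsLocal) {
  // global symbol: S + A - GP
  int64_t V = static_cast<int64_t>(S) + A - static_cast<int64_t>(GP);
  // local symbol: S + A + GP0 - GP
  if (IsLocal)
    V += static_cast<int64_t>(GP0);
  return V;
}
```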

Differential Revision: http://reviews.llvm.org/D15760

llvm-svn: 256416
2015-12-25 13:02:13 +00:00
Simon Atanasyan 1d7df40711 [ELF][MIPS] MIPS .reginfo sections handling
The MIPS .reginfo section provides information on the registers used by
the code in the object file. The linker should collect this information and
write a .reginfo section in the output file. This section contains the union
of the used-register masks taken from the input .reginfo sections and the final
value of the `_gp` symbol.

For details see the "Register Information" section in Chapter 4 in the
following document:
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf

The patch implements .reginfo section handling with a couple of missing
features: a) it does not put the output .reginfo section into a separate
REGINFO segment; b) it does not merge `ri_cprmask` masks from input
sections. These features will be implemented later.

Differential Revision: http://reviews.llvm.org/D15669

llvm-svn: 256119
2015-12-20 10:57:34 +00:00
Simon Atanasyan dddbeb7a46 [ELF][MIPS] Match paired relocation using relocation type and symbol index
If we have an R_MIPS_HI16 relocation, the paired relocation is the next
R_MIPS_LO16 relocation with the same symbol as a target.

llvm-svn: 255452
2015-12-13 06:49:08 +00:00
Simon Atanasyan 09b3e3685f [ELF] MIPS paired R_MIPS_HI16/LO16 relocations support
Some MIPS relocations, including `R_MIPS_HI16/R_MIPS_LO16`, use combined
addends. Such an addend is calculated using the addends of both paired relocations.
Each `R_MIPS_HI16` relocation is paired with the next `R_MIPS_LO16`
relocation. The ABI requires computing such a combined addend only for the REL
relocation record format.

For details see p. 4-17 at
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf

This patch implements lookup of the next paired relocation using the new
`InputSectionBase::findPairedRelocLocation` method. The primary
disadvantage of this approach is that we put MIPS-specific logic into
the common code. The next disadvantage is that we look up `R_MIPS_LO16`
for each `R_MIPS_HI16` relocation, while in fact multiple `R_MIPS_HI16`
might be paired with a single `R_MIPS_LO16`. On the other hand,
this approach allows us to keep the `MipsTargetInfo` class stateless and to
implement relocation handling in parallel later.

This patch does not support `R_MIPS_HI16/R_MIPS_LO16` relocations against
`_gp_disp` symbol. In that case the relocations use a special formula for
the calculation. That will be implemented later.
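
A rough sketch of the pairing and the combined addend, for illustration
only. The `Rel` type and both helpers are hypothetical and do not mirror
`findPairedRelocLocation`; the sketch assumes the usual MIPS ABI rule
AHL = (AHI << 16) + sign_extend(ALO), where AHI and ALO are the low 16
bits of the two relocated instruction words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative types; the names are assumptions, not LLD's.
enum RelType { HI16, LO16, Other };
struct Rel {
  RelType Type;
  uint32_t Insn; // instruction word at the relocated location
};

// Returns the index of the next LO16 after index I, or Rels.size() if
// there is none (the "missing pair" case).
inline size_t findPairedLo16(const std::vector<Rel> &Rels, size_t I) {
  for (size_t J = I + 1; J < Rels.size(); ++J)
    if (Rels[J].Type == LO16)
      return J;
  return Rels.size();
}

// Combines the two in-place addends: the HI16 half is shifted up and the
// LO16 half is sign-extended before the addition.
inline int32_t combinedAddend(uint32_t HiInsn, uint32_t LoInsn) {
  uint32_t AHI = HiInsn & 0xffff;
  int32_t ALO = static_cast<int32_t>((LoInsn & 0xffff) ^ 0x8000) - 0x8000;
  return static_cast<int32_t>((AHI << 16) + static_cast<uint32_t>(ALO));
}
```

For example, a HI16 half of 1 and a LO16 half of 0xfffc (that is, -4)
combine to 0x10000 - 4 = 0xfffc.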

Differential Revision: http://reviews.llvm.org/D15112

llvm-svn: 254461
2015-12-01 21:24:45 +00:00
Rafael Espindola 0c6a4f197f Add support for processing .eh_frame.
This adds support for:
* Uniquing CIEs
* Dropping FDEs that point to dropped sections

It drops 657 488 bytes from the .eh_frame of a Release+Asserts clang.

The link time impact is smallish. Linking clang with a Release+Asserts
lld goes from 0.488064805 seconds to 0.504763060 seconds (1.034x slower).

llvm-svn: 252790
2015-11-11 19:54:14 +00:00
Rafael Espindola db9bf4dbfe Add a helper for getting the output offset of an input offset.
This will get a non st_value use shortly.

llvm-svn: 252753
2015-11-11 16:50:37 +00:00
Rafael Espindola 32994991ce Replace size_t with uintX_t in a few places.
If linking a 32 bit binary, these values must fit in 32 bits.

llvm-svn: 252739
2015-11-11 15:40:37 +00:00
Rafael Espindola 8e37f791f7 Don't pass a member variable to a method. NFC.
llvm-svn: 252718
2015-11-11 10:23:32 +00:00
Rafael Espindola 9a6e4632a0 Move relocate to the base class.
This is in preparation for adding .eh_frame support. They will have
another input section type but will also need to be relocated.

llvm-svn: 252717
2015-11-11 10:18:52 +00:00
George Rimar 8b8222b04c [ELF2] merge-string.s test fixed for win32 configuration.
Differential revision: http://reviews.llvm.org/D14171

llvm-svn: 251644
2015-10-29 19:30:28 +00:00
Rui Ueyama 12504649dc ELF2: Move some code from MarkLive.cpp to InputSection.cpp.
This function is useful for ICF, so move that to a common place.

llvm-svn: 251455
2015-10-27 21:51:13 +00:00
Rafael Espindola f82ed2a28c Add support for merging strings from SHF_STRINGS sections.
llvm-svn: 251212
2015-10-24 22:51:01 +00:00
Rafael Espindola 48225b4433 Drop a few const to reduce the noise from the next patch. NFC.
llvm-svn: 251140
2015-10-23 19:55:11 +00:00
Rui Ueyama c4aaed9255 ELF2: Implement --gc-sections.
Section garbage collection is a feature to remove unused sections
from outputs. Unused sections are sections that are not reachable
from known GC-root symbols or sections. Naturally the feature is
implemented as a mark-sweep garbage collector.

In this patch, I added a Live bit to InputSectionBase. If and only
if the Live bit is on, the section will be written to the output.
Starting from GC-root symbols or sections, a new function, markLive(),
visits all reachable sections and sets their Live bits. Writer then
ignores sections whose Live bit is off, so that such sections are
excluded from the output.
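
The mark phase described above can be sketched roughly as follows. The
`Section` struct and this `markLive` signature are simplified assumptions
for illustration, not the real InputSectionBase or the function added in
this patch.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for an input section (hypothetical).
struct Section {
  bool Live = false;
  std::vector<Section *> Refs; // sections this section's relocations point to
};

// Mark: starting from the GC roots, set the Live bit on every section
// that is reachable through relocations.
inline void markLive(const std::vector<Section *> &Roots) {
  std::vector<Section *> Work;
  auto Enqueue = [&Work](Section *S) {
    if (S && !S->Live) {
      S->Live = true;
      Work.push_back(S);
    }
  };
  for (Section *R : Roots)
    Enqueue(R);
  while (!Work.empty()) {
    Section *S = Work.back();
    Work.pop_back();
    for (Section *T : S->Refs)
      Enqueue(T);
  }
}

// Sweep: the writer keeps only the sections whose Live bit is on.
inline std::vector<Section *> liveSections(const std::vector<Section *> &All) {
  std::vector<Section *> Out;
  for (Section *S : All)
    if (S->Live)
      Out.push_back(S);
  return Out;
}
```

Setting the bit before pushing a section onto the worklist keeps cycles
between sections from looping forever.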

This change has a small negative impact on performance if you use
the feature because marking sections means more work. The time to
link Clang changes from 0.356s to 0.386s, or +8%.

It reduces Clang size from 57,764,984 bytes to 55,296,600 bytes.
That is a 4.3% reduction.

http://reviews.llvm.org/D13950

llvm-svn: 251043
2015-10-22 18:49:53 +00:00
Rafael Espindola c159c967f6 Add support for merging the contents of SHF_MERGE sections.
For now SHF_STRINGS are not supported.

llvm-svn: 250737
2015-10-19 21:00:02 +00:00
Rafael Espindola 932efcfa77 Change getLocalRelTarget to include the addend.
Given the name, it is natural for this function to compute the full target.

This will simplify SHF_MERGE handling by allowing getLocalRelTarget to
centralize the addend logic.

llvm-svn: 250731
2015-10-19 20:24:44 +00:00
Rui Ueyama c7cc6ecf08 ELF2: Use ELFT to template OutputSections.
This patch uses ELFT instead of Is64Bits to template OutputSection
and its subclasses. This increases code size slightly because it creates
two identical functions for some classes, but that's only 20 KB out of
33 MB, so it's negligible.

This is as per discussion with Rafael. He's not a fan of the idea but is OK
with this. We'll revisit this topic later.

llvm-svn: 250466
2015-10-15 22:27:29 +00:00
Rafael Espindola aa19708f88 Centralize the handling of r_addend. NFC.
When a relocation points to a SHF_MERGE section, the addend has special meaning.
It should be used to find what in the section the relocation points to. It
should not be added to the output position.

Centralizing it means that the above rule will be implemented once, not once
per target.
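
A small sketch of the rule above, under stated assumptions: `relocTarget`
and the `ToOutput` callback (which maps an offset in the input section to
an address in the output) are hypothetical and are not the functions
touched by this change.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper showing where the addend is applied.
inline uint64_t relocTarget(bool IsMergeSection, uint64_t SymOffset,
                            int64_t Addend,
                            const std::function<uint64_t(uint64_t)> &ToOutput) {
  // SHF_MERGE: the addend selects which piece of the input section is
  // meant, so it is applied before mapping to the output.
  if (IsMergeSection)
    return ToOutput(SymOffset + Addend);
  // Regular section: map the symbol first, then add the addend.
  return ToOutput(SymOffset) + Addend;
}
```

With merged data the two orders can give different answers, because two
input pieces that are adjacent in the input may end up far apart (or
folded together) in the output.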

llvm-svn: 250421
2015-10-15 15:52:12 +00:00
Rafael Espindola ae81a7bf49 Use OutputSectionBase in a few cases where we don't need an OutputSection.
NFC. This is just preparation for adding a new OutputSection dedicated to
SHF_MERGE input sections.

llvm-svn: 250419
2015-10-15 15:29:53 +00:00
Rui Ueyama 55c3f89edb ELF2: Do not use OutputSection as a member variable name.
We have OutputSection<ELFT> type. GCC 4.9.2 warns on the duplication.

llvm-svn: 250358
2015-10-15 01:58:40 +00:00
Rui Ueyama 80edbbbdf8 ELF2: Remove {set,get}OutputSection accessors.
These accessors didn't provide any additional value over a public
member variable either.

llvm-svn: 250328
2015-10-14 21:09:55 +00:00
Rui Ueyama edffd91bce ELF2: Remove {set,get}OutputSectionOff accessors.
These accessors didn't provide any additional value over a public
member variable.

llvm-svn: 250326
2015-10-14 21:00:23 +00:00
Hal Finkel 87bbd5ffd4 [ELF2] Allow PPC64 to add the TOC-restore after .plt-based relocations
Under the PPC64 ELF ABI, functions that might call into other modules (and
thus need to load a different TOC base value into %r2) need to restore the
old value after the call. The old value is saved by the .plt code, and the
caller only needs to include a nop instruction after the call, which the linker
will transform into a TOC restore if necessary.

In order to do this the relocation handler needs two things:

 1. It needs to know whether the call instruction it is modifying is targeting
    a .plt stub that will load a new TOC base value (necessitating a restore after
    the call).

 2. It needs to know where the buffer ends, so that it does not accidentally
    run off the end of the buffer when looking for the 'nop' instruction after the
    call.

Given these two pieces of information, we can insert the restore instruction in
place of the following nop when necessary.

llvm-svn: 250110
2015-10-12 21:19:18 +00:00
Rafael Espindola 444576d4c4 Add support for comdats.
The implementation is a direct translation to c++ of the rules in the ELF spec.

llvm-svn: 249881
2015-10-09 19:25:07 +00:00
Rui Ueyama 15ef5e174b ELF2: Make singleton output sections globally accessible.
Previously, output sections that are handled specially by the linker
(e.g. PLT or GOT) were created by Writer and passed to other classes
that need them. The problem was that because these special sections
are required by so many classes, the plumbing work became too much
of a burden.

This patch simply makes them accessible from anywhere in the
linker to eliminate the plumbing work once and for all.

http://reviews.llvm.org/D13486

llvm-svn: 249590
2015-10-07 19:18:16 +00:00
Rui Ueyama b4908761f8 ELF2: Rename local variable name `Out` in preparation to define `Out` global var.
llvm-svn: 249568
2015-10-07 17:04:18 +00:00
Rafael Espindola e1901cc33d Simplify memory management by having ELFData contain an ELFObj.
llvm-svn: 248502
2015-09-24 15:11:50 +00:00
Michael J. Spencer 2812aa82d0 [elf2] Pass BSSSec to the relocation handling code differently. Don't store it in the symbol.
llvm-svn: 248393
2015-09-23 16:57:31 +00:00
Rafael Espindola c40108858d Move relocation processing to Target.
I will add a couple of ppc64 relocs in the next patches.

llvm-svn: 248319
2015-09-22 20:54:08 +00:00
Rafael Espindola 7167585c94 Remove the Chunk terminology from ELF.
llvm-svn: 248229
2015-09-22 00:16:19 +00:00
Rafael Espindola 9d06ab6ded Rename Chunks.(h|cpp) to InputSection.(h|cpp). NFC.
llvm-svn: 248226
2015-09-22 00:01:39 +00:00