Commit Graph

26 Commits

Author SHA1 Message Date
Fangrui Song 5391f158c2 [ELF] Add -z separate-code and pad the last page of last PF_X PT_LOAD with traps only if -z separate-code is specified
This patch

1) adds -z separate-code and -z noseparate-code (default).
2) changes the condition that the last page of last PF_X PT_LOAD is
 padded with trap instructions.
 Current condition (after D33630): if there is no `SECTIONS` commands.
 After this change: if -z separate-code is specified.

-z separate-code was introduced to ld.bfd in 2018, to place the text
segment in its own pages. There is no overlap in pages between an
executable segment and a non-executable segment:

1) RX cannot load initial contents from R or RW(or non-SHF_ALLOC).
2) R and RW(or non-SHF_ALLOC) cannot load initial contents from RX.

lld's current status:

- Between R and RX: in `Writer<ELFT>::fixSectionAlignments()`, the start of a
  segment is always aligned to maxPageSize, so the initial contents loaded by R
  and RX do not overlap. I plan to allow overlaps in D64906 if -z noseparate-code
  is in effect.
- Between RX and RW(or non-SHF_ALLOC if RW doesn't exist):
  we currently unconditionally pad the last page to commonPageSize
  (defaults to 4096 on all targets we support).
  This patch will make it effective only if -z separate-code is specified.

-z separate-code is a dubious feature that intends to reduce the number
of ROP gadgets (which is actually ineffective because attackers can find
plenty of gadgets in the text segment, no need to find gadgets in
non-code regions).

With the overlapping PT_LOAD technique D64906, -z noseparate-code
removes two more alignments at segment boundaries than -z separate-code.
This saves at most defaultCommonPageSize*2 bytes, which are significant
on targets with large defaultCommonPageSize (AArch64/MIPS/PPC: 65536).

Issues/feedback on alignment at segment boundaries to help understand
the implication:

* binutils PR24490 (the situation on ld.bfd is worse because they have
  two R-- on both sides of R-E so more alignments.)

* In binutils, the 2018-02-27 commit "ld: Add --enable-separate-code" made -z separate-code the default on Linux.
  d969dea983
  In musl-cross-make, binutils is configured with --disable-separate-code
  to address size regressions caused by -z separate-code. (lld actually has the same
  issue, which I plan to fix in a future patch. The ld.bfd x86 status is
  worse because they default to max-page-size=0x200000).

* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237676 people want
  smaller code size. This patch will remove one alignment boundary.

* Stef O'Rear: I'm opposed to any kind of page alignment at the
  text/rodata line (having a partial page of text aliased as rodata and
  vice versa has no demonstrable harm, and I actually care about small
  systems).

So, make -z noseparate-code the default.

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D64903

llvm-svn: 367537
2019-08-01 09:58:25 +00:00
Fangrui Song d45df09435 [ELF] Place SHT_NOTE sections with the same alignment into one PT_NOTE
Summary:
While the generic ABI requires notes to be 8-byte aligned in ELF64, many
vendor-specific notes (from Linux, NetBSD, Solaris, etc) use 4-byte
alignment.

In a PT_NOTE segment, if 4-byte aligned notes are followed by an 8-byte
aligned note, the possible 4-byte padding may make consumers fail to
parse the 8-byte aligned note. See PR41000 for a recent report about
.note.gnu.property (NT_GNU_PROPERTY_TYPE_0).
(Note, for NT_GNU_PROPERTY_TYPE_0, the consumers should probably migrate
to PT_GNU_PROPERTY, but the alignment issue affects other notes as well.)

To fix the issue, don't mix notes with different alignments in one
PT_NOTE. If compilers emit 4-byte aligned notes before 8-byte aligned
notes, we'll create at most 2 segments.

sh_size%sh_addralign=0 is actually implied by the rule for linking
unrecognized sections (in generic ABI), so we don't have to check that.
Notes that match in name, type and attribute flags are concatenated into
a single output section. The compilers have to ensure
sh_size%sh_addralign=0 to make concatenated notes parsable.

An alternative approach is to create a PT_NOTE for each SHT_NOTE, but
we'll have to incur the sizeof(Elf64_Phdr)=56 overhead every time a new
note section is introduced.

Reviewers: ruiu, jakehehrlich, phosek, jhenderson, pcc, espindola

Subscribers: emaste, arichardson, krytarowski, fedor.sergeev, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61296

llvm-svn: 359853
2019-05-03 00:35:49 +00:00
Fangrui Song b159906a9a [test] Change llvm-readobj -long-option to --long-option or well-known short options. NFC
Also change some options that have different semantics (cause confusion) in llvm-readelf mode:

-s => -S
-t => --symbols
-sd => --section-data

llvm-svn: 359651
2019-05-01 05:49:01 +00:00
Rui Ueyama fa9f699d30 Add --build-id=fast as a synonym for --build-id.
When you omit an argument, most options fall back to their defaults.
For example, --color-diagnostics is a synonym for --color-diagnostics=auto.
We don't have a way to specify the default choice for --build-id, so we
can't describe --build-id (without an argument) in that way.
This patch adds "fast" for the default build-id choice.

Differential Revision: https://reviews.llvm.org/D43032

llvm-svn: 324502
2018-02-07 19:22:42 +00:00
Rafael Espindola a5d43d004a Propagate sh_entsize out.
No difference in practice other than having sh_entsize in the output.

This should simplify the patch for handling SHF_MERGE in -r.

Based on a patch by George Rimar.

llvm-svn: 318306
2017-11-15 16:56:20 +00:00
Jake Ehrlich 1128dc5e30 Give .note.gnu.build-id section alignment 4
All SHT_NOTE sections should have minimum alignment 4.

Differential Revision: https://reviews.llvm.org/D38907

llvm-svn: 316961
2017-10-30 22:08:11 +00:00
Petr Hosek 7ab9f7be0c [ELF] Set p_memsz to p_filesz when aligning the last segment to page boundary
Having p_filesz different from p_memsz is confusing some tools.

Differential Revision: https://reviews.llvm.org/D37369

llvm-svn: 312384
2017-09-01 21:48:20 +00:00
George Rimar f7ef2a13f6 [ELF] - Recommit "[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions."
With fix: explicitly specify ouput format for hexdump tool call.

Original commit message:

[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions.

Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.

Differential revision: https://reviews.llvm.org/D36262

llvm-svn: 311315
2017-08-21 08:31:14 +00:00
George Rimar 09a6945b48 [ELF] - Revert r311310 "[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions."
It broke BB:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/11792/steps/test_lld/logs/stdio

llvm-svn: 311314
2017-08-21 08:13:45 +00:00
George Rimar c7392cbe9a [ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions.
Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.

Differential revision: https://reviews.llvm.org/D36262

llvm-svn: 311310
2017-08-21 07:51:21 +00:00
Petr Hosek edd6c3587c [ELF] When the code segment is the last, align it to the page boundary
When the data segment is the last segment, it is correct to leave
it unaligned. However, when the code segment is the last segment,
it should be aligned to the page boundary to avoid loading the
non-segment parts of the ELF file at the end of the file.

Differential Revision: https://reviews.llvm.org/D33630

llvm-svn: 309829
2017-08-02 16:35:00 +00:00
Peter Collingbourne 968fe93803 ELF: The later of --build-id and --build-id= wins.
If the compiler driver passes --build-id and the user uses -Wl to
pass --build-id= then the user's flag should take precedence.

Differential Revision: https://reviews.llvm.org/D33461

llvm-svn: 303689
2017-05-23 21:16:48 +00:00
Rafael Espindola 2532431332 Stop propagating Entsize.
Now that we combine multiple synthetic merge section into one output
section there is no point in trying to propagate a value.

llvm-svn: 294048
2017-02-03 21:29:51 +00:00
Ed Maste 8fd0196c6f lld: Default image base address to 0x200000 on x86-64
Align to the large page size (known as a superpage or huge page).
FreeBSD automatically promotes large, superpage-aligned allocations.

Differential Revision:	https://reviews.llvm.org/D27042

llvm-svn: 287782
2016-11-23 17:44:02 +00:00
Rui Ueyama cb87675186 Define -build-id=tree as a synonym for -build-id=sha1.
Our build-id is a tree hash anyway, so I'll define this as a synonym
for sha1. GNU gold takes this parameter, so this is for compatibility
with that.

llvm-svn: 287119
2016-11-16 17:14:11 +00:00
Rui Ueyama 3da3f06dd3 Include version string into ".comment" section.
Summary:
This patch adds a ".comment" section to an output. The comment
section contains the linker's version string. You can now
find out whether a binary is created by LLD or not using objdump
command like this.

  $ objdump -s -j .comment foo

  foo:     file format elf64-x86-64

  Contents of section .comment:
   0000 00474343 3a202855 62756e74 7520342e  .GCC: (Ubuntu 4.
   0010 382e342d 32756275 6e747531 7e31342e  8.4-2ubuntu1~14.
   ...
   00c0 766d2f74 72756e6b 20323835 38343629  vm/trunk 285846)
   00d0 004c696e 6b65723a 204c4c44 20342e30  .Linker: LLD 4.0
   00e0 2e302028 7472756e 6b203238 36343036  .0 (trunk 286406
   00f0 2900                                 ).

Compilers emits .comment section as well, so the output contains
both compiler and linker information.

Alternative considered:

I first tried to add a SHT_NOTE section because GNU gold does that.
A NOTE section starts with a header which contains content type.
It turned out that ld.gold sets type NT_GNU_GOLD_VERSION to their
NOTE section. So the NOTE type is only for GNU gold (surprise!)

Next, I tried to create ".linker-version" section. However, it seems
that reusing the existing ".comment" section is better because 1)
other tools already know about .comment section and is able to strip
it and 2) the result contans not only linker info but also compiler
info.

Differential Revision: https://reviews.llvm.org/D26487

llvm-svn: 286496
2016-11-10 20:20:37 +00:00
George Rimar 364b59e266 [ELF] - Implemented threaded --build-id computation
Patch switches computing of --build-id hash to tree.

This is the way when input data is splitted by chunks,
hash is computed for each one in threaded/non-threaded way.
At the end hash is conputed for result tree.

With or without -threads the result hash is the same.

Differential revision: https://reviews.llvm.org/D26199

llvm-svn: 286061
2016-11-06 07:42:55 +00:00
Eugene Leviant 282251a226 Convert BuildIdSection to input section
Differential revision: https://reviews.llvm.org/D25627

llvm-svn: 285682
2016-11-01 09:49:24 +00:00
Eugene Leviant 933dae7435 Implement support for --build-id=uuid switch
Differential revision: https://reviews.llvm.org/D23349

llvm-svn: 279810
2016-08-26 09:55:37 +00:00
Rui Ueyama 9194db78fb Support --build-id=0x<hexstring>.
If you specify the option in the form of --build-id=0x<hexstring>,
that hexstring is set as a build ID. We observed that the feature
is actually in use in some builds, so we want this feature.

llvm-svn: 269495
2016-05-13 21:55:56 +00:00
Rafael Espindola c5b8ed241d Implement --build-id=none.
Both bfd and gold have this. It allows disabling build-id when it is the
default with by adding -Wl,--build-id=none no the clang command line.

llvm-svn: 268435
2016-05-03 20:55:47 +00:00
Rui Ueyama 144debcc0f Remove unnecessary trailing semicolons.
Since this semicolon existed in an early test file,
it has spread to many files.

llvm-svn: 267659
2016-04-27 02:58:27 +00:00
Rui Ueyama d86ec30168 ELF: Add --build-id=sha1 option.
llvm-svn: 265748
2016-04-07 23:51:56 +00:00
Rui Ueyama 3a41be277a ELF: Implement --build-id=md5.
Previously, we supported only one hash function, FNV-1, so
BuildIdSection directly handled hash computation. In this patch,
I made BuildIdSection an abstract class and defined two subclasses,
BuildIdFnv1 and BuildIdMd5.

llvm-svn: 265737
2016-04-07 22:49:21 +00:00
Rui Ueyama 28286cdfc7 ELF: Include the build ID section in the first page.
At least Linux has the kernel configuration to include the first page
of the executable into core files. We want build ID section to be
included in core files to identify them.

Here is the link to the description about the kernel configuration.

097f70b3c4/fs/Kconfig.binfmt (L46)

llvm-svn: 263351
2016-03-13 01:54:48 +00:00
Rui Ueyama 634ddf0bec ELF: Implement --build-id.
This patch implements --build-id. After the linker creates an output file
in the memory buffer, it computes the FNV1 hash of the resulting file
and set the hash to the .note section as a build-id.

GNU ld and gold have the same feature, but their default choice of the
hash function is different. Their default is SHA1.

We made a deliberate choice to not use a secure hash function for the
sake of performance. Computing a secure hash is slow -- for example,
MD5 throughput is usually 400 MB/s or so. SHA1 is slower than that.

As a result, if you pass --build-id to gold, then the linker becomes about
10% slower than that without the option. We observed a similar degradation
in an experimental implementation of build-id for LLD. On the other hand,
we observed only 1-2% performance degradation with the FNV hash.

Since build-id is not for digital certificate or anything, we think that
a very small probability of collision is acceptable.

We considered using other signals such as using input file timestamps as
inputs to a secure hash function. But such signals would have an issue
with build reproducibility (if you build a binary from the same source
tree using the same toolchain, the build id should become the same.)

GNU linkers accepts --build-id=<style> option where style is one of
"MD5", "SHA1", or an arbitrary hex string. That option is out of scope
of this patch.

http://reviews.llvm.org/D18091

llvm-svn: 263292
2016-03-11 20:51:53 +00:00