This patch
1) adds -z separate-code and -z noseparate-code (default).
2) changes the condition that the last page of last PF_X PT_LOAD is
padded with trap instructions.
Current condition (after D33630): if there is no `SECTIONS` commands.
After this change: if -z separate-code is specified.
-z separate-code was introduced to ld.bfd in 2018, to place the text
segment in its own pages. There is no overlap in pages between an
executable segment and a non-executable segment:
1) RX cannot load initial contents from R or RW(or non-SHF_ALLOC).
2) R and RW(or non-SHF_ALLOC) cannot load initial contents from RX.
lld's current status:
- Between R and RX: in `Writer<ELFT>::fixSectionAlignments()`, the start of a
segment is always aligned to maxPageSize, so the initial contents loaded by R
and RX do not overlap. I plan to allow overlaps in D64906 if -z noseparate-code
is in effect.
- Between RX and RW(or non-SHF_ALLOC if RW doesn't exist):
we currently unconditionally pad the last page to commonPageSize
(defaults to 4096 on all targets we support).
This patch will make it effective only if -z separate-code is specified.
-z separate-code is a dubious feature that intends to reduce the number
of ROP gadgets (which is actually ineffective because attackers can find
plenty of gadgets in the text segment, no need to find gadgets in
non-code regions).
With the overlapping PT_LOAD technique D64906, -z noseparate-code
removes two more alignments at segment boundaries than -z separate-code.
This saves at most defaultCommonPageSize*2 bytes, which are significant
on targets with large defaultCommonPageSize (AArch64/MIPS/PPC: 65536).
Issues/feedback on alignment at segment boundaries to help understand
the implication:
* binutils PR24490 (the situation on ld.bfd is worse because they have
two R-- on both sides of R-E so more alignments.)
* In binutils, the 2018-02-27 commit "ld: Add --enable-separate-code" made -z separate-code the default on Linux.
d969dea983
In musl-cross-make, binutils is configured with --disable-separate-code
to address size regressions caused by -z separate-code. (lld actually has the same
issue, which I plan to fix in a future patch. The ld.bfd x86 status is
worse because they default to max-page-size=0x200000).
* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237676 people want
smaller code size. This patch will remove one alignment boundary.
* Stef O'Rear: I'm opposed to any kind of page alignment at the
text/rodata line (having a partial page of text aliased as rodata and
vice versa has no demonstrable harm, and I actually care about small
systems).
So, make -z noseparate-code the default.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64903
llvm-svn: 367537
Summary:
While the generic ABI requires notes to be 8-byte aligned in ELF64, many
vendor-specific notes (from Linux, NetBSD, Solaris, etc) use 4-byte
alignment.
In a PT_NOTE segment, if 4-byte aligned notes are followed by an 8-byte
aligned note, the possible 4-byte padding may make consumers fail to
parse the 8-byte aligned note. See PR41000 for a recent report about
.note.gnu.property (NT_GNU_PROPERTY_TYPE_0).
(Note, for NT_GNU_PROPERTY_TYPE_0, the consumers should probably migrate
to PT_GNU_PROPERTY, but the alignment issue affects other notes as well.)
To fix the issue, don't mix notes with different alignments in one
PT_NOTE. If compilers emit 4-byte aligned notes before 8-byte aligned
notes, we'll create at most 2 segments.
sh_size%sh_addralign=0 is actually implied by the rule for linking
unrecognized sections (in generic ABI), so we don't have to check that.
Notes that match in name, type and attribute flags are concatenated into
a single output section. The compilers have to ensure
sh_size%sh_addralign=0 to make concatenated notes parsable.
An alternative approach is to create a PT_NOTE for each SHT_NOTE, but
we'll have to incur the sizeof(Elf64_Phdr)=56 overhead every time a new
note section is introduced.
Reviewers: ruiu, jakehehrlich, phosek, jhenderson, pcc, espindola
Subscribers: emaste, arichardson, krytarowski, fedor.sergeev, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61296
llvm-svn: 359853
Also change some options that have different semantics (cause confusion) in llvm-readelf mode:
-s => -S
-t => --symbols
-sd => --section-data
llvm-svn: 359651
When you omit an argument, most options fall back to their defaults.
For example, --color-diagnostics is a synonym for --color-diagnostics=auto.
We don't have a way to specify the default choice for --build-id, so we
can't describe --build-id (without an argument) in that way.
This patch adds "fast" for the default build-id choice.
Differential Revision: https://reviews.llvm.org/D43032
llvm-svn: 324502
No difference in practice other than having sh_entsize in the output.
This should simplify the patch for handling SHF_MERGE in -r.
Based on a patch by George Rimar.
llvm-svn: 318306
With fix: explicitly specify ouput format for hexdump tool call.
Original commit message:
[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions.
Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.
Differential revision: https://reviews.llvm.org/D36262
llvm-svn: 311315
Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.
Differential revision: https://reviews.llvm.org/D36262
llvm-svn: 311310
When the data segment is the last segment, it is correct to leave
it unaligned. However, when the code segment is the last segment,
it should be aligned to the page boundary to avoid loading the
non-segment parts of the ELF file at the end of the file.
Differential Revision: https://reviews.llvm.org/D33630
llvm-svn: 309829
If the compiler driver passes --build-id and the user uses -Wl to
pass --build-id= then the user's flag should take precedence.
Differential Revision: https://reviews.llvm.org/D33461
llvm-svn: 303689
Align to the large page size (known as a superpage or huge page).
FreeBSD automatically promotes large, superpage-aligned allocations.
Differential Revision: https://reviews.llvm.org/D27042
llvm-svn: 287782
Our build-id is a tree hash anyway, so I'll define this as a synonym
for sha1. GNU gold takes this parameter, so this is for compatibility
with that.
llvm-svn: 287119
Summary:
This patch adds a ".comment" section to an output. The comment
section contains the linker's version string. You can now
find out whether a binary is created by LLD or not using objdump
command like this.
$ objdump -s -j .comment foo
foo: file format elf64-x86-64
Contents of section .comment:
0000 00474343 3a202855 62756e74 7520342e .GCC: (Ubuntu 4.
0010 382e342d 32756275 6e747531 7e31342e 8.4-2ubuntu1~14.
...
00c0 766d2f74 72756e6b 20323835 38343629 vm/trunk 285846)
00d0 004c696e 6b65723a 204c4c44 20342e30 .Linker: LLD 4.0
00e0 2e302028 7472756e 6b203238 36343036 .0 (trunk 286406
00f0 2900 ).
Compilers emits .comment section as well, so the output contains
both compiler and linker information.
Alternative considered:
I first tried to add a SHT_NOTE section because GNU gold does that.
A NOTE section starts with a header which contains content type.
It turned out that ld.gold sets type NT_GNU_GOLD_VERSION to their
NOTE section. So the NOTE type is only for GNU gold (surprise!)
Next, I tried to create ".linker-version" section. However, it seems
that reusing the existing ".comment" section is better because 1)
other tools already know about .comment section and is able to strip
it and 2) the result contans not only linker info but also compiler
info.
Differential Revision: https://reviews.llvm.org/D26487
llvm-svn: 286496
Patch switches computing of --build-id hash to tree.
This is the way when input data is splitted by chunks,
hash is computed for each one in threaded/non-threaded way.
At the end hash is conputed for result tree.
With or without -threads the result hash is the same.
Differential revision: https://reviews.llvm.org/D26199
llvm-svn: 286061
If you specify the option in the form of --build-id=0x<hexstring>,
that hexstring is set as a build ID. We observed that the feature
is actually in use in some builds, so we want this feature.
llvm-svn: 269495
Both bfd and gold have this. It allows disabling build-id when it is the
default with by adding -Wl,--build-id=none no the clang command line.
llvm-svn: 268435
Previously, we supported only one hash function, FNV-1, so
BuildIdSection directly handled hash computation. In this patch,
I made BuildIdSection an abstract class and defined two subclasses,
BuildIdFnv1 and BuildIdMd5.
llvm-svn: 265737
At least Linux has the kernel configuration to include the first page
of the executable into core files. We want build ID section to be
included in core files to identify them.
Here is the link to the description about the kernel configuration.
097f70b3c4/fs/Kconfig.binfmt (L46)
llvm-svn: 263351
This patch implements --build-id. After the linker creates an output file
in the memory buffer, it computes the FNV1 hash of the resulting file
and set the hash to the .note section as a build-id.
GNU ld and gold have the same feature, but their default choice of the
hash function is different. Their default is SHA1.
We made a deliberate choice to not use a secure hash function for the
sake of performance. Computing a secure hash is slow -- for example,
MD5 throughput is usually 400 MB/s or so. SHA1 is slower than that.
As a result, if you pass --build-id to gold, then the linker becomes about
10% slower than that without the option. We observed a similar degradation
in an experimental implementation of build-id for LLD. On the other hand,
we observed only 1-2% performance degradation with the FNV hash.
Since build-id is not for digital certificate or anything, we think that
a very small probability of collision is acceptable.
We considered using other signals such as using input file timestamps as
inputs to a secure hash function. But such signals would have an issue
with build reproducibility (if you build a binary from the same source
tree using the same toolchain, the build id should become the same.)
GNU linkers accepts --build-id=<style> option where style is one of
"MD5", "SHA1", or an arbitrary hex string. That option is out of scope
of this patch.
http://reviews.llvm.org/D18091
llvm-svn: 263292