This fixes PR# 45336.
Output sections described in a linker script as NOLOAD with no input sections would be marked as SHT_PROGBITS.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D76981
Before this patch, it wasn't possible to extend the ThinLTO threads to all SMT/CMT threads in the system. Only one thread per core was allowed, instructed by usage of llvm::heavyweight_hardware_concurrency() in the ThinLTO code. Any number passed to the LLD flag /opt:lldltojobs=..., or any other ThinLTO-specific flag, was previously interpreted in the context of llvm::heavyweight_hardware_concurrency(), which means SMT disabled.
One can now say in LLD:
/opt:lldltojobs=0 -- Use one std::thread / hardware core in the system (no SMT). Default value if flag not specified.
/opt:lldltojobs=N -- Limit usage to N threads, regardless of usage of heavyweight_hardware_concurrency().
/opt:lldltojobs=all -- Use all hardware threads in the system. Equivalent to /opt:lldltojobs=$(nproc) on Linux and /opt:lldltojobs=%NUMBER_OF_PROCESSORS% on Windows. When an affinity mask is set for the process, threads will be created only for the cores selected by the mask.
When N > number-of-hardware-threads-in-the-system, the threads in the thread pool will be dispatched equally on all CPU sockets (tested only on Windows).
When N <= number-of-hardware-threads-on-a-CPU-socket, the threads will remain on the CPU socket where the process started (only on Windows).
Differential Revision: https://reviews.llvm.org/D75153
The error previously talked about a "section header" but was actually
referring to a program header.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D76846
This behavior matches GNU ld and seems reasonable.
```
// If a SECTIONS command is not specified
.text.* -> .text
.rodata.* -> .rodata
.init_array.* -> .init_array
```
A proposed Linux feature CONFIG_FG_KASLR may depend on the GNU ld behavior.
Reword a comment about -z keep-text-section-prefix and a comment about
CommonSection (deleted by rL286234).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75225
This essentially drops the change by r288021 (discussed with Georgii Rymar
and Peter Smith and noted down in the release note of lld 10).
GNU ld>=2.31 enables -z separate-code by default for Linux x86. By
default (in the absence of a PHDRS command) a readonly PT_LOAD is
created, which is different from its traditional behavior.
Not emulating GNU ld's traditional behavior is good for us because it
improves code consistency (we create a readonly PT_LOAD in the absence
of a SECTIONS command).
Users can add --no-rosegment to restore the previous behavior (combined
readonly and read-executable sections in a single RX PT_LOAD).
Fixes https://bugs.llvm.org/show_bug.cgi?id=44903
It is about the following case:
```
SECTIONS {
.foo : { *(.foo) } =0x90909090
/DISCARD/ : { *(.bar) }
}
```
Here while parsing the fill expression we treated the
"/" of "/DISCARD/" as operator.
With this change, suggested by Fangrui Song, we do
not allow expressions with operators (e.g. "0x1100 + 0x22")
that are not wrapped into round brackets. It should not
be an issue for users, but helps to resolve parsing ambiguity.
Differential revision: https://reviews.llvm.org/D74687
Hexagon ABI specifies that call x@gdplt is transformed to call __tls_get_addr.
Example:
call x@gdplt
is changed to
call __tls_get_addr
When x is an external tls variable.
Differential Revision: https://reviews.llvm.org/D74443
Any OUTPUT_FORMAT in a linker script overrides the emulation passed on
the command line, so record the passed bfdname and use that in the error
message about incompatible input files.
This prevents confusing error messages. For example, if you explicitly
pass `-m elf_x86_64` to LLD but accidentally include a linker script
which sets `OUTPUT_FORMAT(elf32-i386)`, LLD would previously complain
about your input files being compatible with elf_x86_64, which isn't the
actual issue, and is confusing because the input files are in fact
x86-64 ELF files.
Interestingly enough, this also prevents a segfault! When we don't pass
`-m` and we have an object file which is incompatible with the
`OUTPUT_FORMAT` set by a linker script, the object file is checked for
compatibility before it's added to the objectFiles vector.
config->emulation, objectFiles, and sharedFiles will all be empty, so
we'll attempt to access bitcodeFiles[0], but bitcodeFiles is also empty,
so we'll segfault. This commit prevents the segfault by adding
OUTPUT_FORMAT as a possible source of machine configuration, and it also
adds an llvm_unreachable to diagnose similar issues in the future.
Differential Revision: https://reviews.llvm.org/D76109
On an internal target,
* Before D74773: time -f '%M' => 18275680
* After D74773: time -f '%M' => 22088964
This patch restores to the status before D74773.
-M output can be useful when diagnosing an "error: output file too large" problem (emitted in openFile()).
I just ran into such a situation where I had to debug an erronerous
Linux kernel linker script. It tried to create a file larger than
INT64_MAX bytes.
This patch could have helped https://bugs.llvm.org/show_bug.cgi?id=44715 as well.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75966
See `docs/ELF/linker_script.rst` for the new computation for sh_addr and sh_addralign.
`ALIGN(section_align)` now means: "increase alignment to section_align"
(like yet another input section requirement).
The "start of section .foo changes from 0x11 to 0x20" warning no longer
makes sense. Change it to warn if sh_addr%sh_addralign!=0.
To decrease the alignment from the default max_input_align,
use `.output ALIGN(8) : {}` instead of `.output : ALIGN(8) {}`
See linkerscript/section-address-align.test as an example.
When both an output section address and ALIGN are set (can be seen as an
"undefined behavior" https://sourceware.org/ml/binutils/2020-03/msg00115.html),
lld may align more than GNU ld, but it makes a linker script working
with GNU ld hard to break with lld.
This patch can be considered as restoring part of the behavior before D74736.
Differential Revision: https://reviews.llvm.org/D75724
Summary:
Places orphan sections into a unique output section. This prevents the merging of orphan sections of the same name.
Matches behaviour of GNU ld --unique. --unique=pattern is not implemented.
Motivated user case shown in the test has 2 local symbols as they would appear if C++ source has been compiled with -ffunction-sections. The merging of these sections in the case of a partial link (-r) may limit the effectiveness of -gc-sections of a subsequent link.
Reviewers: espindola, jhenderson, bd1976llvm, edd, andrewng, JonChesterfield, MaskRay, grimar, ruiu, psmith
Reviewed By: MaskRay, grimar
Subscribers: emaste, arichardson, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75536
```
createFiles(args)
readDefsym
readerLinkerScript(*mb)
...
readMemory
readMemoryAssignment("ORIGIN", "org", "o") // eagerly evaluated
target = getTarget();
link(args)
writeResult<ELFT>()
...
finalizeSections()
script->processSymbolAssignments()
addSymbol(cmd) // with this patch, evaluated here
```
readMemoryAssignment eagerly evaluates ORIGIN/LENGTH and returns an uint64_t.
This patch postpones the evaluation to make
* --defsym and symbol assignments
* `CONSTANT(COMMONPAGESIZE)` (requires a non-null `lld:🧝:target`)
work. If the expression somehow requires interaction with memory
regions, the circular dependency may cause the expression to evaluate to
a strange value. See the new test added to memory-err.s
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75763
Added a write method for TimeTrace that takes two strings representing
file names. The first is any file name that may have been provided by the
user via `time-trace-file` flag, and the second is a fallback that should
be configured by the caller. This method makes it cleaner to write the
trace output because there is no longer a need to check file names at the
caller and simplifies future TimeTrace usages.
Reviewed By: modocache
Differential Revision: https://reviews.llvm.org/D74514
llvm::call_once(initDwarfLine, [this]() { initializeDwarf(); });
Though it is not used in all places.
I need that patch for implementing "Remove obsolete debug info" feature
(D74169). But this caching mechanism is useful by itself, and I think it
would be good to use it without connection to "Remove obsolete debug info"
feature. So this patch changes inplace creation of DWARFContext with
its cached version.
Depends on D74308
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D74773
* Delete boilerplate
* Change functions to return `Error`
* Test parsing errors
* Update callers of ARMAttributeParser::parse() to check the `Error` return value.
Since this patch touches nearly everything in the file, I apply
http://llvm.org/docs/Proposals/VariableNames.html and change variable
names to lower case.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D75015
Summary:
LLD has workaround for the times when SectionIndex was not passed properly:
LT->getFileLineInfoForAddress(
S->getOffsetInFile() + Offset, nullptr,
DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath, Info));
S->getOffsetInFile() was added to differentiate offsets between
various sections. Now SectionIndex is properly specified.
Thus it is not necessary to use getOffsetInFile() workaround.
See https://reviews.llvm.org/D58194, https://reviews.llvm.org/D58357.
This patch removes getOffsetInFile() workaround.
Reviewers: ruiu, grimar, MaskRay, espindola
Reviewed By: grimar, MaskRay
Subscribers: emaste, arichardson, llvm-commits
Tags: #llvm, #lld
Differential Revision: https://reviews.llvm.org/D75636
Similar to D63182 [ELF][PPC64] Don't report "relocation refers to a discarded section" for .toc
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D75419
Summary:
Extracted from D74773. Currently, errors happened while parsing
debug info are reported as errors. DebugInfoDWARF library treats such
errors as "Recoverable errors". This patch makes debug info errors
to be reported as warnings, to support DebugInfoDWARF approach.
Reviewers: ruiu, grimar, MaskRay, jhenderson, espindola
Reviewed By: MaskRay, jhenderson
Subscribers: emaste, aprantl, arichardson, arphaman, llvm-commits
Tags: #llvm, #debug-info, #lld
Differential Revision: https://reviews.llvm.org/D75234
MC will now output the R_ARM_THM_PC8, R_ARM_THM_PC12 and
R_ARM_THM_PREL_11_0 relocations. These are short-ranged relocations that
are used to implement the adr rd, literal and ldr rd, literal pseudo
instructions.
The instructions use a new RelExpr called R_ARM_PCA in order to calculate
the required S + A - Pa expression, where Pa is AlignDown(P, 4) as the
instructions add their immediate to AlignDown(PC, 4). We also do not want
these relocations to generate or resolve against a PLT entry as the range
of these relocations is so short they would never reach.
The R_ARM_THM_PC8 has a special encoding convention for the relocation
addend, the immediate field is unsigned, yet the addend must be -4 to
account for the Thumb PC bias. The ABI (not the architecture) uses the
convention that the 8-byte immediate of 0xff represents -4.
Differential Revision: https://reviews.llvm.org/D75042
They are purposefully skipped by input section descriptions (rL295324).
Similarly, --orphan-handling= should not warn/error for them.
This behavior matches GNU ld.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75151
This makes --orphan-handling= less noisy.
This change also improves our compatibility with GNU ld.
GNU ld special cases .symtab, .strtab and .shstrtab . We need output section
descriptions for .symtab, .strtab and .shstrtab to suppress:
<internal>:(.symtab) is being placed in '.symtab'
<internal>:(.shstrtab) is being placed in '.shstrtab'
<internal>:(.strtab) is being placed in '.strtab'
With --strip-all, .symtab and .strtab can be omitted (note, --strip-all is not compatible with --emit-relocs).
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D75149
With this --shuffle-sections=seed produces the same result in every
host.
Reviewed By: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D74971
When the output section address (addrExpr) is specified, GNU ld warns if
sh_addr is different. This patch implements the warning.
Note, LinkerScript::assignAddresses can be called more than once. We
need to record the changed section addresses, and only report the
warnings after the addresses are finalized.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D74741
Follow-up for D74286.
Notations:
* alignExpr: the computed ALIGN value
* max_input_align: the maximum of input section alignments
This patch changes the following two cases to match GNU ld:
* When ALIGN is present, GNU ld sets output sh_addr to alignExpr, while lld use max(alignExpr, max_input_align)
* When addrExpr is specified but alignExpr is not, GNU ld sets output sh_addr to addrExpr, while lld uses `advance(0, max_input_align)`
Note, sh_addralign is still set to max(alignExpr, max_input_align).
lma-align.test is enhanced a bit to check we don't overalign sh_addr.
fixSectionAlignments() sets addrExpr but not alignExpr for the `!hasSectionsCommand` case.
This patch sets alignExpr as well so that max_input_align will be respected.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D74736
Summary:
This option causes lld to shuffle sections by assigning different
priorities in each run.
The use case for this is to introduce randomization in benchmarks. The
idea is inspired by the paper "Producing Wrong Data Without Doing
Anything Obviously Wrong!"
(https://www.inf.usi.ch/faculty/hauswirth/publications/asplos09.pdf). Unlike
the paper, we shuffle individual sections, not just input files.
Doing this in lld is particularly convenient as the --reproduce option
makes it easy to collect all the necessary bits for relinking the
program being benchmarked. Once that it is done, all that is needed is
to add --shuffle-sections=0 to the response file and relink before each
run of the benchmark.
Differential Revision: https://reviews.llvm.org/D74791
With this patch lld recognizes ARM SBREL relocations.
R_ARM*_MOVW_BREL relocations are not tested because they are not used.
Patch by Tamas Petz
Differential Revision: https://reviews.llvm.org/D74604
Summary:
Generate PAC protected plt only when "-z pac-plt" is passed to the
linker. GNU toolchain generates when it is explicitly requested[1].
When pac-plt is requested then set the GNU_PROPERTY_AARCH64_FEATURE_1_PAC
note even when not all function compiled with PAC but issue a warning.
Harmonizing the warning style for BTI/PAC/IBT.
Generate BTI protected PLT if case of "-z force-bti".
[1] https://www.sourceware.org/ml/binutils/2019-03/msg00021.html
Reviewers: peter.smith, espindola, MaskRay, grimar
Reviewed By: peter.smith, MaskRay
Subscribers: tatyana-krasnukha, emaste, arichardson, kristof.beyls, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74537
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
Fixes https://bugs.llvm.org//show_bug.cgi?id=44878
When --strip-debug is specified, .debug* are removed from inputSections
while .rel[a].debug* (incorrectly) remain.
LinkerScript::addOrphanSections() requires the output section of a relocated
InputSectionBase to be created first.
.debug* are not in inputSections ->
output sections .debug* are not created ->
getOutputSectionName(.rel[a].debug*) dereferences a null pointer.
Fix the null pointer dereference by deleting .rel[a].debug* from inputSections as well.
Reviewed By: grimar, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D74510
Recommit of 0b4a047bfb
(reverted in c29003813a) to incorporate
subsequent fix and add a warning when LLD's interworking behavior has
changed.
D73474 disabled the generation of interworking thunks for branch
relocations to non STT_FUNC symbols. This patch handles the case of BL and
BLX instructions to non STT_FUNC symbols. LLD would normally look at the
state of the caller and the callee and write a BL if the states are the
same and a BLX if the states are different.
This patch disables BL/BLX substitution when the destination symbol does
not have type STT_FUNC. This brings our behavior in line with GNU ld which
may prevent difficult to diagnose runtime errors when switching to lld.
This change does change how LLD handles interworking of symbols that do not
have type STT_FUNC from previous versions including the 10.0 release. This
brings LLD in line with ld.bfd but there may be programs that have not been
linked with ld.bfd that depend on LLD's previous behavior. We emit a warning
when the behavior changes.
A summary of the difference between 10.0 and 11.0 is that for symbols
that do not have a type of STT_FUNC LLD will not change a BL to a BLX or
vice versa. The table below enumerates the changes
| relocation | STT_FUNC | bit(0) | in | 10.0- out | 11.0+ out |
| R_ARM_CALL | no | 1 | BL | BLX | BL |
| R_ARM_CALL | no | 0 | BLX | BL | BLX |
| R_ARM_THM_CALL | no | 1 | BLX | BL | BLX |
| R_ARM_THM_CALL | no | 0 | BL | BLX | BL |
Differential Revision: https://reviews.llvm.org/D73542
D43468+D44380 added INSERT [AFTER|BEFORE] for non-orphan sections. This patch
makes INSERT work for orphan sections as well.
`SECTIONS {...} INSERT [AFTER|BEFORE] .foo` does not set `hasSectionCommands`, so the result
will be similar to a regular link without a linker script. The differences when `hasSectionCommands` is set include:
* image base is different
* -z noseparate-code/-z noseparate-loadable-segments are unavailable
* some special symbols such as `_end _etext _edata` are not defined
The behavior is similar to GNU ld:
INSERT is not considered an external linker script.
This feature makes the section layout more flexible. It can be used to:
* Place .nv_fatbin before other readonly SHT_PROGBITS sections to mitigate relocation overflows.
* Disturb the layout to expose address sensitive application bugs.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D74375
GNU ld has a counterintuitive lang_propagate_lma_regions rule.
```
// .foo's LMA region is propagated to .bar because their VMA region is the same,
// and .bar does not have an explicit output section address (addr_tree).
.foo : { *(.foo) } >RAM AT> FLASH
.bar : { *(.bar) } >RAM
// An explicit output section address disables propagation.
.foo : { *(.foo) } >RAM AT> FLASH
.bar . : { *(.bar) } >RAM
```
In both cases, lld thinks .foo's LMA region is propagated and
places .bar in the same PT_LOAD, so lld diverges from GNU ld w.r.t. the
second case (lma-align.test).
This patch changes Writer<ELFT>::createPhdrs to disable propagation
(start a new PT_LOAD). A user of the first case can make linker scripts
portable by explicitly specifying `AT>`. By contrast, there was no
workaround for the old behavior.
This change uncovers another LMA related bug in assignOffsets() where
`ctx->lmaOffset = 0;` was omitted. It caused a spurious "load address
range overlaps" error for at2.test
The new PT_LOAD rule is complex. For convenience, I listed the origins of some subexpressions:
* rL323449: `sec->memRegion == load->firstSec->memRegion`; linkerscript/at3.test
* D43284: `load->lastSec == Out::programHeaders` (don't start a new PT_LOAD after program headers); linkerscript/at4.test
* D58892: `sec != relroEnd` (start a new PT_LOAD after PT_GNU_RELRO)
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D74297
When lmaRegion is non-null, respect `sec->alignment`
This rule is analogous to `switchTo(sec)` which advances sh_addr (VMA).
This fixes the p_paddr misalignment issue as reported by
https://android-review.googlesource.com/c/trusty/external/trusted-firmware-a/+/1230058
Note, `sec->alignment` is the maximum of ALIGN and input section alignments. We may overalign LMA than GNU ld.
linkerscript/align-lma.s has a FIXME that demonstrates another bug:
`.bss ... >RAM` should be placed in a different PT_LOAD (GNU ld
behavior) because its lmaRegion (nullptr) is different from the previous
section's lmaRegion (ROM).
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D74286
There are still problems after the fix in
"[ELF][ARM] Fix regression of BL->BLX substitution after D73542"
so let's revert to get trunk back to green while we investigate.
See https://reviews.llvm.org/D73542
This reverts commit 5461fa2b1f.
This reverts commit 0b4a047bfb.
This adds some of LLD specific scopes and picks up optimisation scopes
via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in
77e6bb3c.
Differential Revision: https://reviews.llvm.org/D71060
D73542 made a typo (`rel.type == R_PLT_PC`; should be `rel.expr`) and introduced a regression:
BL->BLX substitution was disabled when the target symbol is preemptible
(expr is R_PLT_PC).
The two added bl instructions in arm-thumb-interwork-shared.s check that
we patch BL to BLX.
Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1047531
For an out-of-range relocation referencing a non-local symbol, report the symbol name and the object file that defines the symbol. As an example:
```
t.o:(function func: .text.func+0x3): relocation R_X86_64_32S out of range: -281474974609120 is not in [-2147483648, 2147483647]
```
=>
```
t.o:(function func: .text.func+0x3): relocation R_X86_64_32S out of range: -281474974609120 is not in [-2147483648, 2147483647]; references func
>>> defined in t1.o
```
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D73518
D73474 disabled the generation of interworking thunks for branch
relocations to non STT_FUNC symbols. This patch handles the case of BL and
BLX instructions to non STT_FUNC symbols. LLD would normally look at the
state of the caller and the callee and write a BL if the states are the
same and a BLX if the states are different.
This patch disables BL/BLX substitution when the destination symbol does
not have type STT_FUNC. This brings our behavior in line with GNU ld which
may prevent difficult to diagnose runtime errors when switching to lld.
Differential Revision: https://reviews.llvm.org/D73542
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
Similar to R_MIPS_GPREL16 and R_MIPS_GPREL32 (D45972).
If the addend of an R_PPC_PLTREL24 is >= 0x8000, it indicates that r30
is relative to the input section .got2.
```
addis 30, 30, .got2+0x8000-.L1$pb@ha
addi 30, 30, .got2+0x8000-.L1$pb@l
...
bl foo+0x8000@PLT
```
After linking, the relocation will be relative to the output section .got2.
To compensate for the shift `address(input section .got2) - address(output section .got2) = ppc32Got2OutSecOff`, adjust by `ppc32Got2OutSecOff`:
```
addis 30, 30, .got2+0x8000-.L1+ppc32Got2OutSecOff$pb@ha
addi 30, 30, .got2+0x8000-.L1+ppc32Got2OutSecOff$pb@ha$pb@l
...
bl foo+0x8000+ppc32Got2OutSecOff@PLT
```
This rule applys to a relocatable link or a non-relocatable link with --emit-relocs.
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D73532
ELF for the ARM architecture requires linkers to provide
interworking for symbols that are of type STT_FUNC. Interworking for
other symbols must be encoded directly in the object file. LLD was always
providing interworking, regardless of the symbol type, this breaks some
programs that have branches from Thumb state targeting STT_NOTYPE symbols
that have bit 0 clear, but they are in fact internal labels in a Thumb
function. LLD treats these symbols as ARM and inserts a transition to Arm.
This fixes the problem for in range branches, R_ARM_JUMP24,
R_ARM_THM_JUMP24 and R_ARM_THM_JUMP19. This is expected to be the vast
majority of problem cases as branching to an internal label close to the
function.
There is at least one follow up patch required.
- R_ARM_CALL and R_ARM_THM_CALL may do interworking via BL/BLX
substitution.
In theory range-extension thunks can be altered to not change state when
the symbol type is not STT_FUNC. I will need to check with ld.bfd to see if
this is the case in practice.
Fixes (part of) https://github.com/ClangBuiltLinux/linux/issues/773
Differential Revision: https://reviews.llvm.org/D73474
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.
The bot failure should be fixed by D73418, committed as
af954e441a.
I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
* Generalize the code added in D70637 and D70937. We should eventually remove the EM_MIPS special case.
* Handle R_PPC_LOCAL24PC the same way as R_PPC_REL24.
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D73424
-fno-pie produces a pair of non-GOT-non-PLT relocations R_PPC_ADDR16_{HA,LO} (R_ABS) referencing external
functions.
```
lis 3, func@ha
la 3, func@l(3)
```
In a -no-pie/-pie link, if func is not defined in the executable, a canonical PLT entry (st_value>0, st_shndx=0) will be needed.
References to func in shared objects will be resolved to this address.
-fno-pie -pie should fail with "can't create dynamic relocation ... against ...", so we just need to think about -no-pie.
On x86, the PLT entry passes the JMP_SLOT offset to the rtld PLT resolver.
On x86-64: the PLT entry passes the JUMP_SLOT index to the rtld PLT resolver.
On ARM/AArch64: the PLT entry passes &.got.plt[n]. The PLT header passes &.got.plt[fixed-index]. The rtld PLT resolver can compute the JUMP_SLOT index from the two addresses.
For these targets, the canonical PLT entry can just reuse the regular PLT entry (in PltSection).
On PPC32: PltSection (.glink) consists of `b PLTresolve` instructions and `PLTresolve`. The rtld PLT resolver depends on r11 having been set up to the .plt (GotPltSection) entry.
On PPC64 ELFv2: PltSection (.glink) consists of `__glink_PLTresolve` and `bl __glink_PLTresolve`. The rtld PLT resolver depends on r12 having been set up to the .plt (GotPltSection) entry.
We cannot reuse a `b PLTresolve`/`bl __glink_PLTresolve` in PltSection as a canonical PLT entry. PPC64 ELFv2 avoids the problem by using TOC for any external reference, even in non-pic code, so the canonical PLT entry scenario should not happen in the first place.
For PPC32, we have to create a PLT call stub as the canonical PLT entry. The code sequence sets up r11.
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D73399
Symbol information can be used to improve out-of-range/misalignment diagnostics.
It also helps R_ARM_CALL/R_ARM_THM_CALL which has different behaviors with different symbol types.
There are many (67) relocateOne() call sites used in thunks, {Arm,AArch64}errata, PLT, etc.
Rename them to `relocateNoSym()` to be clearer that there is no symbol information.
Reviewed By: grimar, peter.smith
Differential Revision: https://reviews.llvm.org/D73254
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html
This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.
Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.
Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.
Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.
I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.
Depends on D71907 and D71911.
Reviewers: pcc, evgeny777, steven_wu, espindola
Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D71913
I felt really sad to push this commit for my selfish purpose to make
glibc -static-pie build with lld. Some code constructs in glibc require
R_X86_64_GOTPCREL/R_X86_64_REX_GOTPCRELX referencing undefined weak to
be resolved to a GOT entry not relocated by R_X86_64_GLOB_DAT (GNU ld
behavior), e.g.
csu/libc-start.c
if (__pthread_initialize_minimal != NULL)
__pthread_initialize_minimal ();
elf/dl-object.c
void
_dl_add_to_namespace_list (struct link_map *new, Lmid_t nsid)
{
/* We modify the list of loaded objects. */
__rtld_lock_lock_recursive (GL(dl_load_write_lock));
Emitting a GLOB_DAT will make the address equal &__ehdr_start (true
value) and cause elf/ldconfig to segfault. glibc really should move away
from weak references, which do not have defined semantics.
Temporarily special case --no-dynamic-linker.
These functions call relocateOne(). This patch is a prerequisite for
making relocateOne() aware of `Symbol` (D73254).
Reviewed By: grimar, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D73250
Summary:
Linker scripts allow filenames to be put in double quotes to prevent
characters in filenames that are part of the linker script syntax from
having their special meaning. Case in point the * wildcard character.
Availability of double quoting filenames also allows to fix a failure in
ELF/linkerscript/filename-spec.s when the path contain a @ which the
lexer consider as a special characters and thus break up a filename
containing it. This may happens under Jenkins which createspath such as
pipeline@2.
To avoid the need for escaping GlobPattern metacharacters in filename
in double quotes, GlobPattern::create is augmented with a new parameter
to request literal matching instead of relying on the presence of a
wildcard character in the pattern.
Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap
Reviewed By: MaskRay
Subscribers: peter.smith, grimar, ruiu, emaste, arichardson, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72517
The --fix-cortex-a8 is sensitive to alignment and the precise destination
of branch instructions. These are not knowable at relocatable link time. We
follow GNU ld and the --fix-cortex-a53-843419 (D72968) by not patching the
code when there is a relocatable link.
Differential Revision: https://reviews.llvm.org/D73100
Add new method getFirstInputSection and use instead of getInputSections
where appropriate to avoid creation of an unneeded vector of input
sections.
Differential Revision: https://reviews.llvm.org/D73047
The INPUT_SECTION_FLAGS linker script command is used to constrain the
section pattern matching to sections that match certain combinations of
flags.
There are two ways to express the constraint.
withFlags: Section must have these flags.
withoutFlags: Section must not have these flags.
The syntax of the command is:
INPUT_SECTION_FLAGS '(' sect_flag_list ')'
sect_flag_list: NAME
| sect_flag_list '&' NAME
Where NAME matches a section flag name such as SHF_EXECINSTR, or the
integer value of a section flag. If the first character of NAME is ! then
it means must not contain flag.
We do not support the rare case of { INPUT_SECTION_FLAGS(flags) filespec }
where filespec has no input section description like (.text).
As an example from the ld man page:
SECTIONS {
.text : { INPUT_SECTION_FLAGS (SHF_MERGE & SHF_STRINGS) *(.text) }
.text2 : { INPUT_SECTION_FLAGS (!SHF_WRITE) *(.text) }
}
.text will match sections called .text that have both the SHF_MERGE and
SHF_STRINGS flag.
.text2 will match sections called .text that don't have the SHF_WRITE flag.
The flag names accepted are the generic to all targets and SHF_ARM_PURECODE
as it is very useful to filter all the pure code sections into a single
program header that can be marked execute never.
fixes PR44265
Differential Revision: https://reviews.llvm.org/D72756
Summary:
Unlike R_RISCV_RELAX, which is a linker hint, R_RISCV_ALIGN requires the
support of the linker even when ignoring all R_RISCV_RELAX relocations.
This is because the compiler emits as many NOPs as may be required for
the requested alignment, more than may be required pre-relaxation, to
allow for the target becoming more unaligned after relaxing earlier
sequences. This means that the target is often not initially aligned in
the object files, and so the R_RISCV_ALIGN relocations cannot just be
ignored. Since we do not support linker relaxation, we must turn these
into errors.
Reviewers: ruiu, MaskRay, espindola
Reviewed By: MaskRay
Subscribers: grimar, Jim, emaste, arichardson, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71820
The code doesn't apply the fix correctly to relocatable links. I could
try to fix the code that applies the fix, but it's pointless: we don't
actually know what the offset will be in the final executable. So just
ignore the flag for relocatable links.
Issue discovered building Android.
Differential Revision: https://reviews.llvm.org/D72968
This essentially reverts b841e119d7.
Such code construct can be used in the following way:
// glibc/stdlib/exit.c
// clang -fuse-ld=lld => succeeded
// clang -fuse-ld=lld -fpie -pie => relocation R_PLT_PC cannot refer to absolute symbol
__attribute__((weak, visibility("hidden"))) extern void __call_tls_dtors();
void __run_exit_handlers() {
if (__call_tls_dtors)
__call_tls_dtors();
}
Since we allow R_PLT_PC in -no-pie mode, it makes sense to allow it in
-pie mode as well.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D72943
In D71281 a fix was put in to round up the size of a ThunkSection to the
nearest 4KiB when performing errata patching. This fixed a problem with a
very large instrumented program that had thunks and patches mutually
trigger each other. Unfortunately it triggers an assertion failure in an
AArch64 allyesconfig build of the kernel. There is a specific assertion
preventing an InputSectionDescription being larger than 4KiB. This will
always trigger if there is at least one Thunk needed in that
InputSectionDescription, which is possible for an allyesconfig build.
Abstractly the problem case is:
.text : {
*(.text) ;
...
. = ALIGN(SZ_4K);
__idmap_text_start = .;
*(.idmap.text)
__idmap_text_end = .;
...
}
The assertion checks that __idmap_text_end - __idmap_start is < 4 KiB.
Note that there is more than one InputSectionDescription in the
OutputSection so we can't just restrict the fix to OutputSections smaller
than 4 KiB.
The fix presented here limits the D71281 to InputSectionDescriptions that
meet the following conditions:
1.) The OutputSection is bigger than the thunkSectionSpacing so adding
thunks will affect the addresses of following code.
2.) The InputSectionDescription is larger than 4 KiB. This will prevent
any assertion failures that an InputSectionDescription is < 4 KiB
in size.
We do this at ThunkSection creation time as at this point we know that
the addresses are stable and up to date prior to adding the thunks as
assignAddresses() will have been called immediately prior to thunk
generation.
The fix reverts the two tests affected by D71281 to their original state
as they no longer need the 4KiB size roundup. I've added simpler tests to
check for D71281 when the OutputSection size is larger than the ThunkSection
spacing.
Fixes https://github.com/ClangBuiltLinux/linux/issues/812
Differential Revision: https://reviews.llvm.org/D72344
`{clang,gcc} -nostdlib -r a.c` passes --dynamic-linker to the linker,
and the expected behavior is to ignore it.
If .interp is kept in the relocatable object file, a final link will get
PT_INTERP even if --dynamic-linker is not specified. glibc ld.so expects
to see PT_DYNAMIC and the executable will likely fail to run.
Ignore --dynamic-linker in -r mode as well as -shared.
ThunkSection contains 4-byte instructions on all targets that use
thunks. Thunks should not be used in any performance sensitive places,
and locality/cache line/instruction fetching arguments should not apply.
We use 16 bytes as preferred function alignments for modern PowerPC cores.
In any case, 8 is not optimal.
Differential Revision: https://reviews.llvm.org/D72819
Moved the section name check ahead of any filename matching or
exclusion. Firstly, this reduces the need to retrieve the filename and
secondly, reduces the amount of potentially expensive filename pattern
matching if such rules are present in the linker script.
The impact of this change is particularly significant when linking
objects built with -ffunction-sections and -fstack-size-section, using a
linker script that includes non-trivial filename patterns. In a number
of such cases, the link time is halved.
Differential Revision: https://reviews.llvm.org/D72775
This assertion was added as part of D70659 but did not account for .bss
input sections. I noticed that this assert was incorrectly triggering
while building FreeBSD for MIPS64. Fixed by relaxing the assert to also
account for SHT_NOBITS input sections and adjust the test
mips-jalr-non-function.s to link a file with a .bss section first.
Reviewed By: MaskRay, grimar
Differential Revision: https://reviews.llvm.org/D72567
R_HINT is ignored like R_NONE. There are no strong reasons to keep
R_HINT. The largest RelExpr member R_RISCV_PC_INDIRECT is 60 now.
Differential Revision: https://reviews.llvm.org/D71822
Suggested by Peter Collingbourne.
Non-VER_NDX_GLOBAL versions should not be assigned to defined symbols. --exclude-libs violates this and can cause a spurious error "cannot refer to absolute symbol" after D71795.
excludeLibs incorrectly assigns VER_NDX_LOCAL to an undefined weak symbol =>
isPreemptible is false =>
R_PLT_PC is optimized to R_PC =>
in isStaticLinkTimeConstant, an error is emitted.
Reviewed By: pcc, grimar
Differential Revision: https://reviews.llvm.org/D72681
This patch is a joint work by Rui Ueyama and me based on D58102 by Xiang Zhang.
It adds Intel CET (Control-flow Enforcement Technology) support to lld.
The implementation follows the draft version of psABI which you can
download from https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI.
CET introduces a new restriction on indirect jump instructions so that
you can limit the places to which you can jump to using indirect jumps.
In order to use the feature, you need to compile source files with
-fcf-protection=full.
* IBT is enabled if all input files are compiled with the flag. To force enabling ibt, pass -z force-ibt.
* SHSTK is enabled if all input files are compiled with the flag, or if -z shstk is specified.
IBT-enabled executables/shared objects have two PLT sections, ".plt" and
".plt.sec". For the details as to why we have two sections, please read
the comments.
Reviewed By: xiangzhangllvm
Differential Revision: https://reviews.llvm.org/D59780
RELA targets don't read initial .got.plt entries.
REL targets (ARM, x86-32) write the address of the IFUNC resolver to the
entry (`write32le(buf, s.getVA())`).
The default writeIgotPlt() is not meaningful. Make it a no-op. AArch64
and x86-64 will have 0 as initial .got.plt entries associated with
IFUNC.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D72474
down to pass builder in ltobackend.
Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang
are not passed down to pass builder in ltobackend when new pass manager is
used. This is inconsistent with the behavior when new pass manager is used
and thinlto is not used. Such inconsistency causes slp vectorization pass
not being enabled in ltobackend for O3 + thinlto right now. This patch
fixes that.
Differential Revision: https://reviews.llvm.org/D72386
An undefined weak does not fetch the lazy definition. A lazy weak symbol
should be considered undefined, and thus preemptible if .dynsym exists.
D71795 is not quite an NFC. It errors on an R_X86_64_PLT32 referencing
an undefined weak symbol. isPreemptible is false (incorrect) => R_PLT_PC
is optimized to R_PC => in isStaticLinkTimeConstant, an error is emitted
when an R_PC is applied on an undefined weak (considered absolute).
Weak undefined symbols are preemptible after D71794.
if (sym.isPreemptible)
return false;
if (!config->isPic)
return true;
// isPic means includeInDynsym is true after D71794.
...
// We can delete this if because it can never be true.
if (sym.isUndefWeak)
return true;
Differential Revision: https://reviews.llvm.org/D71795
D59275 added the following clause to Symbol::includeInDynsym()
if (isUndefWeak() && Config->Pie && SharedFiles.empty())
return false;
D59549 explored the possibility to generalize it for -no-pie.
GNU ld's rules are architecture dependent and partly controlled by -z
{,no-}dynamic-undefined-weak. Our attempts to mimic its rules are
actually half-baked and don't provide perceivable benefits (it can save
a few more weak undefined symbols in .dynsym in a -static-pie
executable). Let's just delete the rule for simplicity. We will expect
cosmetic inconsistencies with ld.bfd in certain -static-pie scenarios.
This permits a simplification in D71795.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D71794
In AArch64 a branch to an undefined weak symbol that does not have a PLT
entry should resolve to the next instruction. The thunk generation code
can prevent this from happening as a range extension thunk can be generated
if the branch is sufficiently far away from 0, the value of an undefined
weak symbol.
The fix is taken from the Arm implementation of needsThunk(), we prevent a
thunk from being generated to an undefined weak symbol.
fixes pr44451
Differential Revision: https://reviews.llvm.org/D72267
```
lld/ELF/Relocations.cpp:1622:56: warning: loop variable 'ts' of type 'const std::pair<ThunkSection *, uint32_t>' (aka 'const pair<lld:🧝:ThunkSection *, unsigned int>') creates a copy from type 'const std::pair<ThunkSection *, uint32_t>' [-Wrange-loop-analysis]
for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections)
```
Drop const qualifier to fix -Wrange-loop-analysis.
We can make -Wrange-loop-analysis warnings (DiagnoseForRangeConstVariableCopies) on `const A` more
permissive on more types (e.g. POD -> trivially copyable), unfortunately it will not make std::pair
good, because `constexpr pair& operator=(const pair& p);` is unfortunately user-defined.
Reviewed By: Mordante
Differential Revision: https://reviews.llvm.org/D72211
One instance looks like a false positive:
lld/ELF/Relocations.cpp:1622:14: note: use reference type 'const std::pair<ThunkSection *, uint32_t> &' (aka 'cons
t pair<lld:🧝:ThunkSection *, unsigned int> &') to prevent copying
for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections)
It is not changed in this commit.
GCC before r245813 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79439)
did not emit nop after b/bl. This can happen with recursive calls.
r245813 was back ported to GCC 5.5 and GCC 6.4.
This is common, for example, libstdc++.a(locale.o) shipped with GCC 4.9
and many objects in netlib lapack can cause lld to error. gold allows
such calls to the same section. Our __plt_foo symbol's `section` field
is used for ThunkSection, so we can't implement a similar loosen rule
easily. But we can make use of its `file` field which is currently NULL.
Differential Revision: https://reviews.llvm.org/D71639
Similar to D71509 (EM_PPC64), on EM_PPC, the IPLT code sequence should
be similar to a PLT call stub. Unlike EM_PPC64, EM_PPC -msecure-plt has
small/large PIC model differences.
* -fpic/-fpie: R_PPC_PLTREL24 r_addend=0. The call stub loads an address relative to `_GLOBAL_OFFSET_TABLE_`.
* -fPIC/-fPIE: R_PPC_PLTREL24 r_addend=0x8000. (A partial linked object
file may have an addend larger than 0x8000.) The call stub loads an address relative to .got2+0x8000.
Just assume large PIC model for now. This patch makes:
// clang -fuse-ld=lld -msecure-plt -fno-pie -no-pie a.c
// clang -fuse-ld=lld -msecure-plt -fPIE -pie a.c
#include <stdio.h>
static void impl(void) { puts("meow"); }
void thefunc(void) __attribute__((ifunc("resolver")));
void *resolver(void) { return &impl; }
int main(void) {
thefunc();
void (*theptr)(void) = &thefunc;
theptr();
}
work on Linux glibc. -fpie will crash because the compiler and the
linker do not agree on the value which r30 stores (_GLOBAL_OFFSET_TABLE_
vs .got2+0x8000).
Differential Revision: https://reviews.llvm.org/D71621
Non-preemptible IFUNC are placed in in.iplt (.glink on EM_PPC64). If
there is a non-GOT non-PLT relocation, for pointer equality, we change
the type of the symbol from STT_IFUNC and STT_FUNC and bind it to the
.glink entry.
On EM_386, EM_X86_64, EM_ARM, and EM_AARCH64, the PLT code sequence
loads the address from its associated .got.plt slot. An IPLT also has an
associated .got.plt slot and can use the same code sequence.
On EM_PPC64, the PLT code sequence is actually a bl instruction in
.glink . It jumps to `__glink_PLTresolve` (the PLT header). and
`__glink_PLTresolve` computes the .plt slot (relocated by
R_PPC64_JUMP_SLOT).
An IPLT does not have an associated R_PPC64_JUMP_SLOT, so we cannot use
`bl` in .iplt . Instead, create a call stub which has a similar code
sequence as PPC64PltCallStub. We don't save the TOC pointer, so such
scenarios will not work: a function pointer to a non-preemptible ifunc,
which resolves to a function defined in another DSO. This is the
restriction described by https://sourceware.org/glibc/wiki/GNU_IFUNC
(though on many architectures it works in practice):
Requirement (a): Resolver must be defined in the same translation unit as the implementations.
If an ifunc is taken address but not called, technically we don't need
an entry for it, but we currently do that.
This patch makes
// clang -fuse-ld=lld -fno-pie -no-pie a.c
// clang -fuse-ld=lld -fPIE -pie a.c
#include <stdio.h>
static void impl(void) { puts("meow"); }
void thefunc(void) __attribute__((ifunc("resolver")));
void *resolver(void) { return &impl; }
int main(void) {
thefunc();
void (*theptr)(void) = &thefunc;
theptr();
}
work on Linux glibc and FreeBSD. Calling a function pointer pointing to
a Non-preemptible IFUNC never worked before.
Differential Revision: https://reviews.llvm.org/D71509
This restores commit 1417558e4a and its follow-up, reverted by commit c3dbd782f1.
After this commit:
clang -fuse-ld=bfd -no-pie -nostdlib a.c => .interp not created
clang -fuse-ld=bfd -pie -fPIE -nostdlib a.c => .interp created
clang -fuse-ld=gold -no-pie -nostdlib a.c => .interp not created
clang -fuse-ld=gold -pie -fPIE -nostdlib a.c => .interp created
clang -fuse-ld=lld -no-pie -nostdlib a.c => .interp created
clang -fuse-ld=lld -pie -fPIE -nostdlib a.c => .interp created
This reverts commit 1417558e4a.
Also reverts commit 019a92bb28.
This causes check-sanitizer to fail. The "-Nolib" variant of the test
crashes on startup in the loader.