When the debug info contains a relocation against a dead symbol, wasm-ld
may emit spurious range-list terminator entries (entries with Start==0
and End==0). This change fixes this by emitting the WasmRelocation
Addend as End value for a non-live symbol.
Reviewed by: sbc100, dblaikie
Differential Revision: https://reviews.llvm.org/D74781
Summary:
This adds a test for D76752. Now the global section comes after the
event section, and this change makes sure it is satisfied.
Reviewers: sbc100, tlively
Reviewed By: tlively
Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76823
```
// llvm-objdump -d output (before)
0: bl .-4
4: bl .+0
8: bl .+4
// llvm-objdump -d output (after) ; GNU objdump -d
0: bl 0xfffffffc / bl 0xfffffffffffffffc
4: bl 0x4
8: bl 0xc
```
Many Operand's are not annotated as OPERAND_PCREL.
They are not affected (e.g. `b .+67108860`). I plan to fix them in future patches.
Modified test/tools/llvm-objdump/ELF/PowerPC/branch-offset.s to test
address space wraparound for powerpc32 and powerpc64.
Reviewed By: sfertile, jhenderson
Differential Revision: https://reviews.llvm.org/D76591
The error previously talked about a "section header" but was actually
referring to a program header.
Reviewed by: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D76846
```
// llvm-objdump -d output (before)
400000: e8 0b 00 00 00 callq 11
400005: e8 0b 00 00 00 callq 11
// llvm-objdump -d output (after)
400000: e8 0b 00 00 00 callq 0x400010
400005: e8 0b 00 00 00 callq 0x400015
// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
400000: e8 0b 00 00 00 callq 400010
400005: e8 0b 00 00 00 callq 400015
```
In llvm-objdump, we pass the address of the next MCInst. Ideally we
should just thread the address of the current address, unfortunately we
cannot call X86MCCodeEmitter::encodeInstruction (X86MCCodeEmitter
requires MCInstrInfo and MCContext) to get the length of the MCInst.
MCInstPrinter::printInst has other callers (e.g llvm-mc -filetype=asm, llvm-mca) which set Address to 0.
They leave MCInstPrinter::PrintBranchImmAsAddress as false and this change is a no-op for them.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76580
Added support for /map and /map:[filepath].
The output was derived from Microsoft's Link.exe output when using that same option.
Note that /MAPINFO support was not added.
The previous implementation of MapFile.cpp/.h was meant for /lldmap, and was renamed to LLDMapFile.cpp/.h
MapFile.cpp/.h is now for /MAP
However, a small fix was added to lldmap, replacing a std::sort with std::stable_sort to enforce reproducibility.
Differential Revision: https://reviews.llvm.org/D70557
Add the relevant magic bits to allow "-mllvm=-load=plugin.so" etc.
This is now using export_executable_symbols_for_plugins, so symbols are
only exported if plugins are enabled.
Differential Revision: https://reviews.llvm.org/D75879
This behavior matches GNU ld and seems reasonable.
```
// If a SECTIONS command is not specified
.text.* -> .text
.rodata.* -> .rodata
.init_array.* -> .init_array
```
A proposed Linux feature CONFIG_FG_KASLR may depend on the GNU ld behavior.
Reword a comment about -z keep-text-section-prefix and a comment about
CommonSection (deleted by rL286234).
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75225
This essentially drops the change by r288021 (discussed with Georgii Rymar
and Peter Smith and noted down in the release note of lld 10).
GNU ld>=2.31 enables -z separate-code by default for Linux x86. By
default (in the absence of a PHDRS command) a readonly PT_LOAD is
created, which is different from its traditional behavior.
Not emulating GNU ld's traditional behavior is good for us because it
improves code consistency (we create a readonly PT_LOAD in the absence
of a SECTIONS command).
Users can add --no-rosegment to restore the previous behavior (combined
readonly and read-executable sections in a single RX PT_LOAD).
Fixes https://bugs.llvm.org/show_bug.cgi?id=44903
It is about the following case:
```
SECTIONS {
.foo : { *(.foo) } =0x90909090
/DISCARD/ : { *(.bar) }
}
```
Here while parsing the fill expression we treated the
"/" of "/DISCARD/" as operator.
With this change, suggested by Fangrui Song, we do
not allow expressions with operators (e.g. "0x1100 + 0x22")
that are not wrapped into round brackets. It should not
be an issue for users, but helps to resolve parsing ambiguity.
Differential revision: https://reviews.llvm.org/D74687
MCTargetOptionsCommandFlags.inc and CommandFlags.inc are headers which contain
cl::opt with static storage.
These headers are meant to be incuded by tools to make it easier to parametrize
codegen/mc.
However, these headers are also included in at least two libraries: lldCommon
and handle-llvm. As a result, when creating DYLIB, clang-cpp holds a reference
to the options, and lldCommon holds another reference. Linking the two in a
single executable, as zig does[0], results in a double registration.
This patch explores an other approach: the .inc files are moved to regular
files, and the registration happens on-demand through static declaration of
options in the constructor of a static object.
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1756977#c5
Differential Revision: https://reviews.llvm.org/D75579
Follow-up for D74433
What the function returns are almost standard BFD names, except that "ELF" is
in uppercase instead of lowercase.
This patch changes "ELF" to "elf" and changes ARM/AArch64 to use their BFD names.
MIPS and PPC64 have endianness differences as well, but this patch does not intend to address them.
Advantages:
* llvm-objdump: the "file format " line matches GNU objdump on ARM/AArch64 objects
* "file format " line can be extracted and fed into llvm-objcopy -O literally.
(https://github.com/ClangBuiltLinux/linux/issues/779 has such a use case)
Affected tools: llvm-readobj, llvm-objdump, llvm-dwarfdump, MCJIT (internal implementation detail, not exposed)
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76046
When emitting PDBs, the TypeStreamMerger class is used to merge .debug$T records from the input .OBJ files into the output .PDB stream.
Records in .OBJs are not required to be aligned on 4-bytes, and "The Netwide Assembler 2.14" generates non-aligned records.
When compiling with -DLLVM_ENABLE_ASSERTIONS=ON, an assert was triggered in MergingTypeTableBuilder when non-ghash merging was used.
With ghash merging there was no assert.
As a result, LLD could potentially generate a non-aligned TPI stream.
We now align records on 4-bytes when record indices are remapped, in TypeStreamMerger::remapIndices().
Differential Revision: https://reviews.llvm.org/D75081
Hexagon ABI specifies that call x@gdplt is transformed to call __tls_get_addr.
Example:
call x@gdplt
is changed to
call __tls_get_addr
When x is an external tls variable.
Differential Revision: https://reviews.llvm.org/D74443
Any OUTPUT_FORMAT in a linker script overrides the emulation passed on
the command line, so record the passed bfdname and use that in the error
message about incompatible input files.
This prevents confusing error messages. For example, if you explicitly
pass `-m elf_x86_64` to LLD but accidentally include a linker script
which sets `OUTPUT_FORMAT(elf32-i386)`, LLD would previously complain
about your input files being compatible with elf_x86_64, which isn't the
actual issue, and is confusing because the input files are in fact
x86-64 ELF files.
Interestingly enough, this also prevents a segfault! When we don't pass
`-m` and we have an object file which is incompatible with the
`OUTPUT_FORMAT` set by a linker script, the object file is checked for
compatibility before it's added to the objectFiles vector.
config->emulation, objectFiles, and sharedFiles will all be empty, so
we'll attempt to access bitcodeFiles[0], but bitcodeFiles is also empty,
so we'll segfault. This commit prevents the segfault by adding
OUTPUT_FORMAT as a possible source of machine configuration, and it also
adds an llvm_unreachable to diagnose similar issues in the future.
Differential Revision: https://reviews.llvm.org/D76109
On an internal target,
* Before D74773: time -f '%M' => 18275680
* After D74773: time -f '%M' => 22088964
This patch restores to the status before D74773.
-M output can be useful when diagnosing an "error: output file too large" problem (emitted in openFile()).
I just ran into such a situation where I had to debug an erronerous
Linux kernel linker script. It tried to create a file larger than
INT64_MAX bytes.
This patch could have helped https://bugs.llvm.org/show_bug.cgi?id=44715 as well.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75966
See `docs/ELF/linker_script.rst` for the new computation for sh_addr and sh_addralign.
`ALIGN(section_align)` now means: "increase alignment to section_align"
(like yet another input section requirement).
The "start of section .foo changes from 0x11 to 0x20" warning no longer
makes sense. Change it to warn if sh_addr%sh_addralign!=0.
To decrease the alignment from the default max_input_align,
use `.output ALIGN(8) : {}` instead of `.output : ALIGN(8) {}`
See linkerscript/section-address-align.test as an example.
When both an output section address and ALIGN are set (can be seen as an
"undefined behavior" https://sourceware.org/ml/binutils/2020-03/msg00115.html),
lld may align more than GNU ld, but it makes a linker script working
with GNU ld hard to break with lld.
This patch can be considered as restoring part of the behavior before D74736.
Differential Revision: https://reviews.llvm.org/D75724
LLD implements Linker Scripts as they are described in the GNU ld manual.
This description is far from a specification, with the only true reference
the GNU ld implementation, which has undocumented behaviour that can vary
from release to release.
To make it easy for people to switch between linkers we try to follow GNU
ld implementation details wherever possible. We reserve the right to make
our own decisions where the undocumented GNU ld behaviour is not
appropriate for LLD. We don't have a place to document these decisions and
it can be difficult for users to find out this information.
This file is a statement of the LLD implementation policy and will contain
intentional deviations from GNU ld.
The first patch that will add concrete details to this file is D75724
Differential Revision: https://reviews.llvm.org/D75921
Summary:
Places orphan sections into a unique output section. This prevents the merging of orphan sections of the same name.
Matches behaviour of GNU ld --unique. --unique=pattern is not implemented.
Motivated user case shown in the test has 2 local symbols as they would appear if C++ source has been compiled with -ffunction-sections. The merging of these sections in the case of a partial link (-r) may limit the effectiveness of -gc-sections of a subsequent link.
Reviewers: espindola, jhenderson, bd1976llvm, edd, andrewng, JonChesterfield, MaskRay, grimar, ruiu, psmith
Reviewed By: MaskRay, grimar
Subscribers: emaste, arichardson, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75536
```
createFiles(args)
readDefsym
readerLinkerScript(*mb)
...
readMemory
readMemoryAssignment("ORIGIN", "org", "o") // eagerly evaluated
target = getTarget();
link(args)
writeResult<ELFT>()
...
finalizeSections()
script->processSymbolAssignments()
addSymbol(cmd) // with this patch, evaluated here
```
readMemoryAssignment eagerly evaluates ORIGIN/LENGTH and returns an uint64_t.
This patch postpones the evaluation to make
* --defsym and symbol assignments
* `CONSTANT(COMMONPAGESIZE)` (requires a non-null `lld:🧝:target`)
work. If the expression somehow requires interaction with memory
regions, the circular dependency may cause the expression to evaluate to
a strange value. See the new test added to memory-err.s
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75763
lld/.clang-tidy is almost identical to the top-level .clang-tidy, with the aforementioned customization.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D75809
Added a write method for TimeTrace that takes two strings representing
file names. The first is any file name that may have been provided by the
user via `time-trace-file` flag, and the second is a fallback that should
be configured by the caller. This method makes it cleaner to write the
trace output because there is no longer a need to check file names at the
caller and simplifies future TimeTrace usages.
Reviewed By: modocache
Differential Revision: https://reviews.llvm.org/D74514
llvm::call_once(initDwarfLine, [this]() { initializeDwarf(); });
Though it is not used in all places.
I need that patch for implementing "Remove obsolete debug info" feature
(D74169). But this caching mechanism is useful by itself, and I think it
would be good to use it without connection to "Remove obsolete debug info"
feature. So this patch changes inplace creation of DWARFContext with
its cached version.
Depends on D74308
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D74773
Currently `yaml2obj` require `Offset` field in a relocation description.
There are many cases when `Offset` is insignificant in a context of a test case.
Making `Offset` optional allows to simplify our test cases.
This is what this patch does.
Also, with this patch `obj2yaml` does not dump a zero offset of a relocation.
Differential revision: https://reviews.llvm.org/D75608
The new behavior matches GNU objdump. A pair of angle brackets makes tests slightly easier.
`.foo:` is not unique and thus cannot be used in a `CHECK-LABEL:` directive.
Without `-LABEL`, the CHECK line can match the `Disassembly of section`
line and causes the next `CHECK-NEXT:` to fail.
```
Disassembly of section .foo:
0000000000001634 .foo:
```
Bdragon: <> has metalinguistic connotation. it just "feels right"
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D75713
* Delete boilerplate
* Change functions to return `Error`
* Test parsing errors
* Update callers of ARMAttributeParser::parse() to check the `Error` return value.
Since this patch touches nearly everything in the file, I apply
http://llvm.org/docs/Proposals/VariableNames.html and change variable
names to lower case.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D75015
This fixes several issues. The behavior changes are:
A SHN_COMMON symbol does not have the 'g' flag.
An undefined symbol does not have 'g' or 'l' flag.
A STB_GLOBAL SymbolRef::ST_Unknown symbol has the 'g' flag.
A STB_LOCAL SymbolRef::ST_Unknown symbol has the 'l' flag.
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D75659
This changes the output of `llvm-readelf -n` from:
```
Displaying notes found at file offset 0x<...> with length 0x<...>:
```
to:
```
Displaying notes found in: .note.foo
```
And similarly, adds a `Name:` field to the `llvm-readobj -n` output for notes.
This change not only increases GNU compatibility, it also makes it much easier to read notes. Note that we still fall back to printing the file offset/length in cases where we don't have a section name, such as when printing notes in program headers or printing notes in a partially stripped file (GNU readelf does the same).
Fixes llvm.org/PR41339.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D75647
Summary:
LLD has workaround for the times when SectionIndex was not passed properly:
LT->getFileLineInfoForAddress(
S->getOffsetInFile() + Offset, nullptr,
DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath, Info));
S->getOffsetInFile() was added to differentiate offsets between
various sections. Now SectionIndex is properly specified.
Thus it is not necessary to use getOffsetInFile() workaround.
See https://reviews.llvm.org/D58194, https://reviews.llvm.org/D58357.
This patch removes getOffsetInFile() workaround.
Reviewers: ruiu, grimar, MaskRay, espindola
Reviewed By: grimar, MaskRay
Subscribers: emaste, arichardson, llvm-commits
Tags: #llvm, #lld
Differential Revision: https://reviews.llvm.org/D75636
Summary:
A test is passing `-o -` to lld in the hope of writing the output to
standard out but that is not the case. Instead it creates a file named
`-.lto.o`. This fixes it by creating a temporary file in the work
directory.
Differential Revision: https://reviews.llvm.org/D75605
and follow-ups:
a2ca1c2d "build: disable zlib by default on Windows"
2181bf40 "[CMake] Link against ZLIB::ZLIB"
1079c68a "Attempt to fix ZLIB CMake logic on Windows"
This changed the output of llvm-config --system-libs, and more
importantly it broke stand-alone builds. Instead of piling on more fix
attempts, let's revert this to reduce the risk of more breakages.
Otherwise ld.lld -save-temps will crash when writing to ResolutionFile.
llvm-lto2 -save-temps does not crash because it exits immediately.
Reviewed By: evgeny777
Differential Revision: https://reviews.llvm.org/D75426
Similar to D63182 [ELF][PPC64] Don't report "relocation refers to a discarded section" for .toc
Reviewed By: Bdragon28
Differential Revision: https://reviews.llvm.org/D75419
Summary:
Extracted from D74773. Currently, errors happened while parsing
debug info are reported as errors. DebugInfoDWARF library treats such
errors as "Recoverable errors". This patch makes debug info errors
to be reported as warnings, to support DebugInfoDWARF approach.
Reviewers: ruiu, grimar, MaskRay, jhenderson, espindola
Reviewed By: MaskRay, jhenderson
Subscribers: emaste, aprantl, arichardson, arphaman, llvm-commits
Tags: #llvm, #debug-info, #lld
Differential Revision: https://reviews.llvm.org/D75234
When there are both strong and weak references to an undefined
symbol ensure that the strong reference prevails in the output symbol
generating the correct error.
Test case copied from lld/test/ELF/weak-and-strong-undef.s
Differential Revision: https://reviews.llvm.org/D75322
MC will now output the R_ARM_THM_PC8, R_ARM_THM_PC12 and
R_ARM_THM_PREL_11_0 relocations. These are short-ranged relocations that
are used to implement the adr rd, literal and ldr rd, literal pseudo
instructions.
The instructions use a new RelExpr called R_ARM_PCA in order to calculate
the required S + A - Pa expression, where Pa is AlignDown(P, 4) as the
instructions add their immediate to AlignDown(PC, 4). We also do not want
these relocations to generate or resolve against a PLT entry as the range
of these relocations is so short they would never reach.
The R_ARM_THM_PC8 has a special encoding convention for the relocation
addend, the immediate field is unsigned, yet the addend must be -4 to
account for the Thumb PC bias. The ABI (not the architecture) uses the
convention that the 8-byte immediate of 0xff represents -4.
Differential Revision: https://reviews.llvm.org/D75042
WebAssembly requires that caller and callee signatures match, so it
can't do the usual trick of passing more arguments to main than it
expects. Instead WebAssembly will mangle "main" with argc/argv
parameters as "__main_argc_argv". This patch teaches lld how to
demangle it.
This patch is part of https://reviews.llvm.org/D70700.
They are purposefully skipped by input section descriptions (rL295324).
Similarly, --orphan-handling= should not warn/error for them.
This behavior matches GNU ld.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D75151
This makes --orphan-handling= less noisy.
This change also improves our compatibility with GNU ld.
GNU ld special cases .symtab, .strtab and .shstrtab . We need output section
descriptions for .symtab, .strtab and .shstrtab to suppress:
<internal>:(.symtab) is being placed in '.symtab'
<internal>:(.shstrtab) is being placed in '.shstrtab'
<internal>:(.strtab) is being placed in '.strtab'
With --strip-all, .symtab and .strtab can be omitted (note, --strip-all is not compatible with --emit-relocs).
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D75149
Instead, use `using namespace lld(::coff)`, and fully qualify the names
of free functions where they are defined in cpp files.
This effectively reverts d79c3be618 to follow the new style guide added
in 236fcbc21a.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D74882
While the value of the CIE pointer field in a DWARF FDE record is
an offset to the corresponding CIE record from the beginning of
the section, for EH FDE records it is relative to the current offset.
Previously, we did not make that distinction when dumped both kinds
of FDE records and just printed the same value for the CIE pointer
field and the CIE offset; that was acceptable for DWARF FDEs but was
wrong for EH FDEs.
This patch fixes the issue by explicitly printing the offset of the
linked CIE object.
Differential Revision: https://reviews.llvm.org/D74613
With this --shuffle-sections=seed produces the same result in every
host.
Reviewed By: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D74971
When the output section address (addrExpr) is specified, GNU ld warns if
sh_addr is different. This patch implements the warning.
Note, LinkerScript::assignAddresses can be called more than once. We
need to record the changed section addresses, and only report the
warnings after the addresses are finalized.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D74741
Follow-up for D74286.
Notations:
* alignExpr: the computed ALIGN value
* max_input_align: the maximum of input section alignments
This patch changes the following two cases to match GNU ld:
* When ALIGN is present, GNU ld sets output sh_addr to alignExpr, while lld use max(alignExpr, max_input_align)
* When addrExpr is specified but alignExpr is not, GNU ld sets output sh_addr to addrExpr, while lld uses `advance(0, max_input_align)`
Note, sh_addralign is still set to max(alignExpr, max_input_align).
lma-align.test is enhanced a bit to check we don't overalign sh_addr.
fixSectionAlignments() sets addrExpr but not alignExpr for the `!hasSectionsCommand` case.
This patch sets alignExpr as well so that max_input_align will be respected.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D74736
The changes the in-memory representation of wasm symbols such that their
optional ImportName and ImportModule use llvm::Optional.
ImportName is set whenever WASM_SYMBOL_EXPLICIT_NAME flag is set.
ImportModule (for imports) is currently always set since it defaults to
"env".
In the future we can possibly extent to binary format distingish
import which have explit module names.
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74109
Summary:
This option causes lld to shuffle sections by assigning different
priorities in each run.
The use case for this is to introduce randomization in benchmarks. The
idea is inspired by the paper "Producing Wrong Data Without Doing
Anything Obviously Wrong!"
(https://www.inf.usi.ch/faculty/hauswirth/publications/asplos09.pdf). Unlike
the paper, we shuffle individual sections, not just input files.
Doing this in lld is particularly convenient as the --reproduce option
makes it easy to collect all the necessary bits for relinking the
program being benchmarked. Once that it is done, all that is needed is
to add --shuffle-sections=0 to the response file and relink before each
run of the benchmark.
Differential Revision: https://reviews.llvm.org/D74791
With this patch lld recognizes ARM SBREL relocations.
R_ARM*_MOVW_BREL relocations are not tested because they are not used.
Patch by Tamas Petz
Differential Revision: https://reviews.llvm.org/D74604
Summary:
Generate PAC protected plt only when "-z pac-plt" is passed to the
linker. GNU toolchain generates when it is explicitly requested[1].
When pac-plt is requested then set the GNU_PROPERTY_AARCH64_FEATURE_1_PAC
note even when not all function compiled with PAC but issue a warning.
Harmonizing the warning style for BTI/PAC/IBT.
Generate BTI protected PLT if case of "-z force-bti".
[1] https://www.sourceware.org/ml/binutils/2019-03/msg00021.html
Reviewers: peter.smith, espindola, MaskRay, grimar
Reviewed By: peter.smith, MaskRay
Subscribers: tatyana-krasnukha, emaste, arichardson, kristof.beyls, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74537
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
This reverts commit 80a34ae311 with fixes.
Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
This reverts commit 80a34ae311 with fixes.
On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
Fixes https://bugs.llvm.org//show_bug.cgi?id=44878
When --strip-debug is specified, .debug* are removed from inputSections
while .rel[a].debug* (incorrectly) remain.
LinkerScript::addOrphanSections() requires the output section of a relocated
InputSectionBase to be created first.
.debug* are not in inputSections ->
output sections .debug* are not created ->
getOutputSectionName(.rel[a].debug*) dereferences a null pointer.
Fix the null pointer dereference by deleting .rel[a].debug* from inputSections as well.
Reviewed By: grimar, nickdesaulniers
Differential Revision: https://reviews.llvm.org/D74510
Recommit of 0b4a047bfb
(reverted in c29003813a) to incorporate
subsequent fix and add a warning when LLD's interworking behavior has
changed.
D73474 disabled the generation of interworking thunks for branch
relocations to non STT_FUNC symbols. This patch handles the case of BL and
BLX instructions to non STT_FUNC symbols. LLD would normally look at the
state of the caller and the callee and write a BL if the states are the
same and a BLX if the states are different.
This patch disables BL/BLX substitution when the destination symbol does
not have type STT_FUNC. This brings our behavior in line with GNU ld which
may prevent difficult to diagnose runtime errors when switching to lld.
This change does change how LLD handles interworking of symbols that do not
have type STT_FUNC from previous versions including the 10.0 release. This
brings LLD in line with ld.bfd but there may be programs that have not been
linked with ld.bfd that depend on LLD's previous behavior. We emit a warning
when the behavior changes.
A summary of the difference between 10.0 and 11.0 is that for symbols
that do not have a type of STT_FUNC LLD will not change a BL to a BLX or
vice versa. The table below enumerates the changes
| relocation | STT_FUNC | bit(0) | in | 10.0- out | 11.0+ out |
| R_ARM_CALL | no | 1 | BL | BLX | BL |
| R_ARM_CALL | no | 0 | BLX | BL | BLX |
| R_ARM_THM_CALL | no | 1 | BLX | BL | BLX |
| R_ARM_THM_CALL | no | 0 | BL | BLX | BL |
Differential Revision: https://reviews.llvm.org/D73542
We do not keep the actual value of the CIE ID field, because it is
predefined, and use a constant when dumping a CIE record. The issue
was that the predefined value is different for .debug_frame and
.eh_frame sections, but we always printed the one which corresponds
to .debug_frame. The patch fixes that by choosing an appropriate
constant to print.
See the following for more information about .eh_frame sections:
https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html
Differential Revision: https://reviews.llvm.org/D73627
Summary:
Even though this test is a check for failure, lld still attempts
to open the final output file, which fails when the default "a.out"
file is used and the current directory is read-only. Specifying an
output file works around this problem.
Reviewers: espindola
Subscribers: emaste, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74523
D43468+D44380 added INSERT [AFTER|BEFORE] for non-orphan sections. This patch
makes INSERT work for orphan sections as well.
`SECTIONS {...} INSERT [AFTER|BEFORE] .foo` does not set `hasSectionCommands`, so the result
will be similar to a regular link without a linker script. The differences when `hasSectionCommands` is set include:
* image base is different
* -z noseparate-code/-z noseparate-loadable-segments are unavailable
* some special symbols such as `_end _etext _edata` are not defined
The behavior is similar to GNU ld:
INSERT is not considered an external linker script.
This feature makes the section layout more flexible. It can be used to:
* Place .nv_fatbin before other readonly SHT_PROGBITS sections to mitigate relocation overflows.
* Disturb the layout to expose address sensitive application bugs.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D74375
GNU ld has a counterintuitive lang_propagate_lma_regions rule.
```
// .foo's LMA region is propagated to .bar because their VMA region is the same,
// and .bar does not have an explicit output section address (addr_tree).
.foo : { *(.foo) } >RAM AT> FLASH
.bar : { *(.bar) } >RAM
// An explicit output section address disables propagation.
.foo : { *(.foo) } >RAM AT> FLASH
.bar . : { *(.bar) } >RAM
```
In both cases, lld thinks .foo's LMA region is propagated and
places .bar in the same PT_LOAD, so lld diverges from GNU ld w.r.t. the
second case (lma-align.test).
This patch changes Writer<ELFT>::createPhdrs to disable propagation
(start a new PT_LOAD). A user of the first case can make linker scripts
portable by explicitly specifying `AT>`. By contrast, there was no
workaround for the old behavior.
This change uncovers another LMA related bug in assignOffsets() where
`ctx->lmaOffset = 0;` was omitted. It caused a spurious "load address
range overlaps" error for at2.test
The new PT_LOAD rule is complex. For convenience, I listed the origins of some subexpressions:
* rL323449: `sec->memRegion == load->firstSec->memRegion`; linkerscript/at3.test
* D43284: `load->lastSec == Out::programHeaders` (don't start a new PT_LOAD after program headers); linkerscript/at4.test
* D58892: `sec != relroEnd` (start a new PT_LOAD after PT_GNU_RELRO)
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D74297
When lmaRegion is non-null, respect `sec->alignment`
This rule is analogous to `switchTo(sec)` which advances sh_addr (VMA).
This fixes the p_paddr misalignment issue as reported by
https://android-review.googlesource.com/c/trusty/external/trusted-firmware-a/+/1230058
Note, `sec->alignment` is the maximum of ALIGN and input section alignments. We may overalign LMA than GNU ld.
linkerscript/align-lma.s has a FIXME that demonstrates another bug:
`.bss ... >RAM` should be placed in a different PT_LOAD (GNU ld
behavior) because its lmaRegion (nullptr) is different from the previous
section's lmaRegion (ROM).
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D74286
Summary:
GNU objdump prints the file format in lowercase, e.g. `elf64-x86-64`. llvm-objdump prints `ELF64-x86-64` right now, even though piping that into llvm-objcopy refuses that as a valid arch to use.
As an example of a problem this causes, see: https://github.com/ClangBuiltLinux/linux/issues/779
Reviewers: MaskRay, jhenderson, alexshap
Reviewed By: MaskRay
Subscribers: tpimh, sbc100, grimar, jvesely, nhaehnle, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74433
Also remove some test duplication and add a test case that shows the
maximum version is rejected (this also shows that the value in the error
message is actually in decimal, and not just missing an 0x prefix).
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D74403
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.
There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.