Emit warning when using deprecated option '-reorder-blocks=cache+'.
Auto switch to option '-reorder-blocks=ext-tsp'.
Test Plan:
```
ninja check-bolt
```
Added a new test cache+-deprecated.test.
Run and verify that the upstream tests are passed.
Reviewed By: rafauler, Amir, maksfb
Differential Revision: https://reviews.llvm.org/D126722
When we generate split dwarf with -fdebug-types-section we will have
.debug_types.dwo sections. These go into TU Index when we run llvm-dwp. BOLT was
not handling DWP input correctly with this section.
Added support for handling DWP with TU Index as an input and output for DWARF4.
Added support for handling DWP with TU Index as an input for DWARF5
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D126087
Replace "cache+" with "ext-tsp" in all BOLT tests
Test Plan:
```
ninja check-bolt
grep -rnw . -e "cache+"
```
no more tests containing "cache+"
"cache+" and "ext-tsp" are aliases
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D126714
After D126484, order in .debug-line-str and .debug-line is different. Changed
test accordingly.
Differential Revision: https://reviews.llvm.org/D126733
Summary:
While disassembling instructions, we need to replace certain immediate
operands with symbols. This symbolizing process relies on reading
relocations against instructions. However, some X86 instructions can
have multiple immediate operands and up to two relocations against
them. Thus, correctly matching a relocation to an operand is not
always possible without knowing the operand offset within the
instruction.
Luckily, LLVM provides an interface for passing the required info from
the disassembler via a virtual MCSymbolizer class. Creating a
target-specific version allows a precise matching of relocations to
operands.
This diff adds X86MCSymbolizer class that performs X86-specific
symbolizing (currently limited to non-branch instructions).
Reviewers: yota9, Amir, ayermolo, rafauler, zr33
Differential Revision: https://reviews.llvm.org/D120928
Fix BOLT's constant island mapping when a constant island marked by $d
spans multiple functions. Currently, because BOLT only marks the
constant island in the first function where $d is located, if the next
function contains data at its start, BOLT will miss the data and try
to disassemble it. This patch adds code to explicitly go through all
symbols between $d and $x markers and mark their respective offsets as
data, which stops BOLT from trying to disassemble data. It also adds
MarkerType enum and refactors related functions.
Reviewed By: yota9, rafauler
Differential Revision: https://reviews.llvm.org/D126177
Fix a bug where shrink-wrapping would use wrong stack offsets
because the stack was being aligned with an AND instruction, hence,
making its true offsets only available during runtime (we can't
statically determine where are the stack elements and we must give up
on this case).
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126110
Add a new testcase that reproduces a bug when BOLTing current
trunk LLD bootstrapped with trunk clang. This makes it official
that we do not support this transformation but are working on
it. When the support is ready, XFAIL should be removed.
Reviewed By: maksfb, Amir, yota9
Differential Revision: https://reviews.llvm.org/D125843
When a profile is collected in a BOLTed binary, the generated
profile is tagged with a header string "boltedcollection" in the first
line of the fdata file. Fix merge-fdata to recognize this header
string and preserve it into the output.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D125591
As BOLT support for monolithic and split DWARF5 is added, remove DWARF version
override for BOLT tests.
Reviewed By: ayermolo
Differential Revision: https://reviews.llvm.org/D125366
Address X86 tests failures on AArch64 builder:
https://lab.llvm.org/staging/#/builders/211/builds/82
Inputs fail to cross-compile due to a missing header:
```
/usr/include/stdio.h:27:10: fatal error: 'bits/libc-header-start.h' file not found
#include <bits/libc-header-start.h>
```
As inputs are linked with `-nostdlib` anyway, don't include stdio.h.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D124863
Address X86 tests failures on AArch64 builder:
https://lab.llvm.org/staging/#/builders/211/builds/82
Inputs fail to cross-compile due to a missing header:
```
/usr/include/stdio.h:27:10: fatal error: 'bits/libc-header-start.h' file not found
#include <bits/libc-header-start.h>
```
As inputs are linked with `-nostdlib` anyway, don't include stdio.h.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D124863
The relocation value is calculated using the formula S + A - P,
the verification of the value is performed by inversely calculating the location address
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D124270
Added implementation to support DWARF5 in monolithic mode.
Next step DWARF5 split dwarf support.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D121876
LLVM with LTO can generate function names in the form
func.llvm.<number>, where <number> could vary based on the compilation
environment. As a result, if a profiled binary originated from a
different build than a corresponding binary used for BOLT optimization,
then profiles for such LTO functions will be ignored.
To fix the problem, use "fuzzy" matching with "func.llvm.*" form.
Reviewed By: yota9, Amir
Differential Revision: https://reviews.llvm.org/D124117
Looks like implementation in llvm changed, and now we need to process error
being returned.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D124133
The ld might relax ADRP+ADD or ADRP+LDR sequences to the ADR+NOP, add
the new case to the skipRelocation for aarch64.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D123334
Align with an upstream change D120305 to make PIE the default on linux-gnu.
Add `-no-pie` to tests that require it.
Reviewed By: maksfb, yota9
Differential Revision: https://reviews.llvm.org/D123329
BOLT expects PC-relative relocations in data sections to reference code
and the relocated data to form a jump table. However, there are cases
where PC-relative addressing is used for data-to-data references
(e.g. clang-15 can generate such code). BOLT should recognize and ignore
such relocations. Otherwise, they will be considered relocations not
claimed by any jump table and cause a failure in the strict mode.
Reviewed By: yota9, Amir
Differential Revision: https://reviews.llvm.org/D123650
tls-lld test might be broken since compiler might optimize plt function
call and use address directly from got table. The test is removed since
plt-gnu-ld checks the same functionality + versioning symbol matching,
no need to keep both of the tests.
The toolchain might optimize relocations in runtime-relocs test, replace
the test compilation with yaml files.
Differential Revision: https://reviews.llvm.org/D123332
Add !isTailCall in isUnconditionalBranch check in order to sync the x86
and aarch64 and fix the fixDoubleJumps pass on aarch64.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122929
The bfd linker adds the symbol versioning string to the symbol name in symtab.
Skip the versioning part in order to find the registered PLT function.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122039
Check for supported target architecture instead of the host arch when
deciding to execute non-runtime tests.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D122498
Read static relocs on the same address, as dynamic in order to update
constant island data address properly.
Differential Revision: https://reviews.llvm.org/D122100
BOLT treats aarch64 objects located in text as empty functions with
contant islands. Emit them with at least 8-byte alignment to the new
text section.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122097
AArch64 requires CI to be aligned to 8 bytes due to access instructions
restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes.
Differential Revision: https://reviews.llvm.org/D122065
It seems the earlier implementation does not follow the description
in LoopRotationPass.h: It rotates loops even if they are already laid out
correctly. The diff adjusts the behaviour.
Given that the impact of LoopInversionPass is minor, this change won't
yield significant perf differences. Tested on clang-10: there seems to be a
0.1%-0.3% cpu win and a small reduction of branch misses.
**Before:**
BOLT-INFO: 120 Functions were reordered by LoopInversionPass
**After:**
BOLT-INFO: 79 Functions were reordered by LoopInversionPass
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D121921
Since LLVM MC now preserves redundant AdSize override prefix (0x67), remove it
in BOLT explicitly (-x86-strip-redundant-adsize, on by default).
Test Plan:
`bin/llvm-lit -a bolt/test/X86/addr32.s`
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120975
This clarifies that this is an LLVM specific variable and avoids
potential conflicts with other projects.
Differential Revision: https://reviews.llvm.org/D119918
The aarch64 uses the trampolines located in .iplt section, which
contains plt-like trampolines on the value stored in .got. In this case
we don't have JUMP_SLOT relocation, but we have a symbol that belongs to
ifunc trampoline, so use it and set set plt symbol for such functions.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D120850