Commit Graph

1279 Commits

Author SHA1 Message Date
Maksim Panchenko 30a6d3ada6 [BOLT][TEST] Fix stack alignment in section-reloc-with-addend.s
Misaligned stack can cause a runtime crash.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128227
2022-06-20 14:47:37 -07:00
Maksim Panchenko f263a66ba0 [BOLT] Split functions with exceptions in shared objects and PIEs
Add functionality to allow splitting code with C++ exceptions in shared
libraries and PIEs. To overcome a limitation in exception ranges format,
for functions with fragments spanning multiple sections, add trampoline
landing pads in the same section as the corresponding throwing range.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127936
2022-06-19 16:48:48 -07:00
Amir Ayupov 445bc88501 [BOLT] Use 32-bit MOV to zero 64-bit register in instrumentation code
Instead of `movabsq $0x0, %rax` emit shorter equivalent `movl $0x0, %eax`.
Intel SDM, 3.4.1.1 General-Purpose Registers in 64-Bit Mode:
>32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in
> the destination general-purpose register.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D127045
2022-06-19 11:34:32 -07:00
Huan Nguyen 543f13c99b [BOLT] Allow function entry to be a cold fragment
Allow cold fragment to get new address.

Our previous assumption is that a fragment (.cold) is only reached
through the main fragment of same function. In addition, .cold fragment
must be reached through either (a) direct transfer, or (b) split jump
table. For (a), we perform a simple fix-up. For (b), we currently mark
all relevant fragments as non-simple. Therefore, there is no need to
get new address for .cold fragment.

This is not always the case, as function entry can be rarely executed,
and is placed in .text.cold segment. Essentially we cannot tell which
the source-level function entry is based on hot and cold segments,
so we must treat each fragment a function on its own. Therfore, we
remove the assertion that a function entry cannot be cold fragment.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128111
2022-06-18 11:39:51 -07:00
Huan Nguyen 28b1dcb122 [BOLT] Allow function fragments to point to one jump table
Resolve a crash related to split functions

Due to split function optimization, a function can be divided to two

fragments, and both fragments can access same jump table. This
violates 
the assumption that a jump table can only have one parent
function, 
which causes a crash during instrumentation.

We want to support the case: different functions cannot access same
jump tables, but different fragments of same function can!

As all fragments are from same function, we point JT::Parent to one
specific fragment. Right now it is the first disassembled fragment, but
we can point it to the function's main fragment later.

Functions are disassembled sequentially. Previously, at the end of
processing a function, JT::OffsetEntries is cleared, so other fragment
can no longer reuse JT::OffsetEntries. To extend the support for split
function, we only clear JT::OffsetEntries after all functions are
disassembled.

Let say A.hot and A.cold access JT of three targets {X, Y, Z}, where
X and Y are in A.hot, and Z is in A.cold. Suppose that A.hot is
disassembled first, JT::OffsetEntries = {X',Y',INVALID_OFFSET}. When
A.cold is disassembled, it cannot reuse JT::OffsetEntries above due to
different fragment start. A simple solution:
A.hot  = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, INVALID_OFFSET}

We update the assertion to allow different fragments of same function
to get the same JumpTable object.

Potential improvements:
A.hot  = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, Z'}
The main issue is A.hot and A.cold have separate CFGs, thus jump table
targets are still constrained within fragment bounds.

Future improvements:
A.hot  = {X, Y, Z}
A.cold = {X, Y, Z}

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127924
2022-06-17 16:22:30 -07:00
Rafael Auler 9d5e6ccd9b [BOLT] Fix for missing entry offset
Temporary fix for missing entry offset when creating address
translation tables (BAT) after D127935 landed. Will later work on
assigning a more reasonable offset different than zero.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D128092
2022-06-17 13:14:42 -07:00
Maksim Panchenko 8228c70358 [BOLT][NFCI] Refactor interface for adding basic blocks
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127935
2022-06-16 11:51:57 -07:00
Maksim Panchenko 401a425d20 [BOLT][NFCI] Remove redundant code
createBasicBlock() will create a label for addBasicBlock().

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D127928
2022-06-15 21:26:30 -07:00
Amir Ayupov 02d510b416 [BOLT][NFC] Pass Function to BC.printInstructions in BinaryBasicBlock::dump
BC::printInstruction(s) has many uses of Function ptr if it's available:
# printing CFI instructions (unconditional)
# printing debug line information (-print-debug-info)
# printing instruction relocations (-print-relocations)

Enable these uses by passing Function ptr from the primary printing entry point:
BinaryBasicBlock::dump.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126916
2022-06-13 14:26:51 -07:00
Amir Ayupov a2c4d6d332 [BOLT][NFC] Forward declare ReorderBlocks for MSVC19
Fix bolt-x86_64-wine-msvc builder:
https://lab.llvm.org/buildbot/#/builders/222/builds/1154

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D127612
2022-06-13 10:26:58 -07:00
Vladislav Khmelevsky 6e26ffa064 [BOLT][AARCH64] Skip R_AARCH64_LD_PREL_LO19 relocation
Supress failed to analyze relocations warning for R_AARCH64_LD_PREL_LO19
relocation. This relocation is mostly used to get value stored in CI and
we don't process it since we are caluclating target address using the
instruction value in evaluateMemOperandTarget().

Differential Revision: https://reviews.llvm.org/D127413
2022-06-13 15:40:06 +03:00
Amir Ayupov 7dee646b28 [BOLT][NFC] Move printDebugInfo out of BC::printInstruction
Simplify `BinaryContext::printInstruction`.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D127561
2022-06-11 11:58:36 -07:00
Fangrui Song adf4142f76 [MC] De-capitalize SwitchSection. NFC
Add SwitchSection to return switchSection. The API will be removed soon.
2022-06-10 22:50:55 -07:00
Maksim Panchenko d648aa1b8e [BOLT][TEST] Use double dash flags in tests
Replace a single dash with a double dash for options that have more
than a single letter.

llvm-bolt-wrapper.py has special treatment for output options such as
"-o" and "-w" causing issues when a single dash is used, e.g. for
"-write-dwp". The wrapper can be fixed as well, but using a double dash
has other advantages as well.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D127538
2022-06-10 16:27:33 -07:00
Huan Nguyen 82095bd5ed [BOLT] Mark fragments related to split jump table as non-simple
Mark fragments related to split jump table as non-simple.

A function could be splitted into hot and cold fragments. A split jump table is
challenging for correctly reconstructing control flow graphs, so it was marked
as ignored. This update marks those fragments as non-simple, allowing them
to be printed and partial control flow graph construction.

Test Plan:
```
llvm-lit -a tools/bolt/test/X86/split-func-icf.s
```
This test has two functions (main, main2), each has a jump table target to the
same cold portion main2.cold.1(*2). We try to print out only this cold portion.
If it is ignored, it cannot be printed. If it is non-simple, it can be printed. We
verify that it can be printed.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D127464
2022-06-10 15:49:32 -07:00
John Ericson 0bb317b7bf Revert "[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore"
This reverts commit d5daa5c5b0.
2022-06-10 19:26:12 +00:00
John Ericson d5daa5c5b0 [cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.

Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.

@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.

Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.

To avoid the problems I face, and restore symmetry, I deleted the
exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.

For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would good to confirm I did that correctly.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117977
2022-06-10 14:35:18 +00:00
Denis Revunov 0b7e8baf83 [BOLT][AArch64] Handle data at the beginning of a function when disassembling and building CFG.
This patch adds getFirstInstructionOffset method for BinaryFunction
which is used to properly handle cases where data is at zero offset in
a function. The main change is that we add basic block at first
instruction offset when disassembling, which prevents assertion
failures in buildCFG.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D127111
2022-06-09 15:26:32 -07:00
Maksim Panchenko 1817642684 [BOLT] Add support for GOTPCRELX relocations
The linker can convert instructions with GOTPCRELX relocations into a
form that uses an absolute addressing with an immediate. BOLT needs to
recognize such conversions and symbolize the immediates.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126747
2022-06-09 13:37:04 -07:00
Alexander Yermolovich 901867b1ef [BOLT][DWARF] Fix dwarf5-loclist-offset-form test
I put it into wrong directory. As the result it is failing for aarch64. Moving
the test under X86. Orignial diff D126999.

Differential Revision: https://reviews.llvm.org/D127417
2022-06-09 10:54:09 -07:00
Alexander Yermolovich 1c6dc43de9 [BOLT]DWARF] Eagerly write out loclists
Taking advantage of us being able to re-write .debug_info to reduce memory
footprint loclists. Writing out loc-list as they are added, similar to how
we handle ranges.

Collected on clang-14
trunk
4:41.20 real,   389.50 user,    59.50 sys,      0 amem, 38412532 mmem
4:30.08 real,   376.10 user,    63.75 sys,      0 amem, 38477844 mmem
4:25.58 real,   373.76 user,    54.71 sys,      0 amem, 38439660 mmem
diff
4:34.66 real,   392.83 user,    57.73 sys,      0 amem, 38382560 mmem
4:35.96 real,   377.70 user,    58.62 sys,      0 amem, 38255840 mmem
4:27.61 real,    390.18 user,    57.02 sys,      0 amem, 38223224 mmem

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126999
2022-06-08 16:52:59 -07:00
Huan Nguyen db313a00b6 [BOLT][NFC] Replace stdio with raw_ostream in CallGraph
Replacing stdio functions, e.g., fopen, fprintf, with raw_ostream.

Test Plan:
```
ninja check-bolt
```

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126826
2022-06-08 15:38:04 -07:00
Vladislav Khmelevsky fd9604952d [BOLT] Set valid index for functions with profiles
Some of the passes that calculates tentative layout like LongJmp and
Golang are expecting that only functions with valid index will be
located in hot text section. But currently functions with valid profiles
and not set index are breaking this logic, to fix this we can move the
hasValidProfile() condition from AssignSections pass to ReorderFunctions.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D127223
2022-06-08 14:13:12 +03:00
Fangrui Song 15d82c62dc [MC] De-capitalize MCStreamer functions
Follow-up to c031378ce0 .
The class is mostly consistent now.
2022-06-07 00:31:02 -07:00
Fangrui Song b92436efcb [bolt] Remove unneeded cl::ZeroOrMore for cl::opt options 2022-06-05 13:29:49 -07:00
Fangrui Song 36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Amir Ayupov b346af6d44 [BOLT][UTILS] Usability improvements for nfc-check-setup
# Stash local changes before checkout.
# Print a message that the source repository revision has been changed, with
  instructions to switch back.
# Make the script executable.
# Print sample instructions how to run bolt tests.
# Assume that llvm-bolt-wrapper script is in the same source directory.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126941
2022-06-03 22:54:56 -07:00
Fangrui Song 72f9c69421 [Hexagon][bolt] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Similar to 557efc9a8b
2022-06-03 22:04:57 -07:00
Huan Nguyen 5ac26156fe [BOLT][NFC] Warning for deprecated option '-reorder-blocks=cache+'
Emit warning when using deprecated option '-reorder-blocks=cache+'.
Auto switch to option '-reorder-blocks=ext-tsp'.

Test Plan:
```
ninja check-bolt
```
Added a new test cache+-deprecated.test.
Run and verify that the upstream tests are passed.

Reviewed By: rafauler, Amir, maksfb

Differential Revision: https://reviews.llvm.org/D126722
2022-06-03 14:16:55 -07:00
spupyrev 5904836b8a [BOLT] Cache-Aware Tail Duplication
A new "cache-aware" strategy for tail duplication.

Differential Revision: https://reviews.llvm.org/D123050
2022-06-03 09:08:45 -07:00
Amir Ayupov e2142ff47c [BOLT][NFC] Make ICP::verifyProfile static
Follow LLVM style guide suggestion to avoid function definitions in anonymous
namespaces: https://llvm.org/docs/CodingStandards.html#anonymous-namespaces

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D124896
2022-06-02 19:09:29 -07:00
Amir Ayupov 65a84195ca [BOLT][DOCS] Add PACKAGE_VERSION to doxygen config
Clang's doxygen documentation specifies LLVM revision. Do the same for BOLT.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126912
2022-06-02 18:05:44 -07:00
Maksim Panchenko 986e5dedf2 [BOLT][NFC] Fix braces in BinaryEmitter
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126844
2022-06-02 12:45:25 -07:00
Amir Ayupov 6333e5dde9 [BOLT][NFC] Use colors in CFG dumps
Use color coding to distinguish nodes:
- Entry nodes have bold border
- Scalar (non-loopy) code is milk white
- Outer loops are light yellow
- Innermost loops are light blue

`-print-loops` needs to be enabled to provide BinaryLoopInfo.
Examples:
{F23170673}
{F23170680}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126248
2022-06-02 00:27:12 -07:00
Amir Ayupov cc23c64ff1 [BOLT][NFC] Print block instructions in dumpGraph as part of node label
Reuse the option `-dot-tooltip-code` to put block instructions into the label.
This way, the instructions are displayed by default when used with dot viewer.

When the .dot file is used with dot2html, instructions are hidden by default,
and are shown by clicking on a node.

{F23169510}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126237
2022-06-01 23:41:41 -07:00
Amir Ayupov 51c20e5804 [BOLT][UTILS] Add dot2html helper tool to embed dot into html
To be rendered in browser using d3-graphviz.
Example: {F23169510}

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126218
2022-06-01 23:37:43 -07:00
Alexander Yermolovich ab9a175990 [BOLT][DWARF] Fix TU Index handling for DWARF4/5
When we generate split dwarf with -fdebug-types-section we will have
.debug_types.dwo sections. These go into TU Index when we run llvm-dwp. BOLT was
not handling DWP input correctly with this section.

Added support for handling DWP with TU Index as an input and output for DWARF4.
Added support for handling DWP with TU Index as an input for DWARF5

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D126087
2022-06-01 18:16:12 -07:00
Huan Nguyen 38fb7d56e5 [BOLT][TEST] Replace cache+ option with ext-tsp
Replace "cache+" with "ext-tsp" in all BOLT tests

Test Plan:
```
ninja check-bolt
grep -rnw . -e "cache+"
```
no more tests containing "cache+"
"cache+" and "ext-tsp" are aliases

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D126714
2022-06-01 14:00:16 -07:00
Maksim Panchenko 0426100ff4 [BOLT][NFC] Remove unused variable
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126808
2022-06-01 13:43:10 -07:00
Yi Kong 716d428ab5 [BOLT] Add `-o` option to merge-fdata
Differential Revision: https://reviews.llvm.org/D126788
2022-06-02 01:29:04 +08:00
Alexander Yermolovich ec2711b354 [BOLT][DWARF] Fix dwarf5-debug-line test
After D126484, order in .debug-line-str and .debug-line is different. Changed
test accordingly.

Differential Revision: https://reviews.llvm.org/D126733
2022-05-31 18:17:16 -07:00
Maksim Panchenko e290133c76 [BOLT] Add new class for symbolizing X86 instructions
Summary:
While disassembling instructions, we need to replace certain immediate
operands with symbols. This symbolizing process relies on reading
relocations against instructions. However, some X86 instructions can
have multiple immediate operands and up to two relocations against
them. Thus, correctly matching a relocation to an operand is not
always possible without knowing the operand offset within the
instruction.

Luckily, LLVM provides an interface for passing the required info from
the disassembler via a virtual MCSymbolizer class. Creating a
target-specific version allows a precise matching of relocations to
operands.

This diff adds X86MCSymbolizer class that performs X86-specific
symbolizing (currently limited to non-branch instructions).

Reviewers: yota9, Amir, ayermolo, rafauler, zr33

Differential Revision: https://reviews.llvm.org/D120928
2022-05-31 17:48:19 -07:00
Denis Revunov 8579db96e8 [BOLT] [AArch64] Handle constant islands spanning multiple functions
Fix BOLT's constant island mapping when a constant island marked by $d
spans multiple functions. Currently, because BOLT only marks the
constant island in the first function where $d is located, if the next
function contains data at its start, BOLT will miss the data and try
to disassemble it. This patch adds code to explicitly go through all
symbols between $d and $x markers and mark their respective offsets as
data, which stops BOLT from trying to disassemble data. It also adds
MarkerType enum and refactors related functions.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D126177
2022-05-31 13:51:35 -07:00
Yi Kong 2a42f7f72a [BOLT] Allow merge-fdata to take a directory as input
and recursively merge all files under said directory. This is similar
to `llvm-profdata merge`.

Differential Revision: https://reviews.llvm.org/D126695
2022-06-01 03:01:14 +08:00
Rafael Auler b8a6345554 [BOLT] Fix LIT tests on Windows VS2019
Fix newline issue in link_fdata.py, as well as how to call the tool.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126437
2022-05-31 11:45:39 -07:00
Yi Kong 97715104c5 [BOLT][NFC] Don't over-specify the size of SmallVector
This is the recommended way, should make merging profiles ever so
slightly faster.
2022-05-31 16:16:38 +08:00
Balazs Benics a73b50ad06 Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable"
This reverts commit 3988bd1398.

Did not build on this bot:
https://lab.llvm.org/buildbot#builders/215/builds/6372

/usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to
‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock*>&, const std::pair<long unsigned int, std::nullptr_t>&)’
  177 |  { return bool(_M_comp(*__it, __val)); }
2022-05-27 11:19:18 +02:00
Balazs Benics 3988bd1398 [llvm][clang][bolt][NFC] Use llvm::less_first() when applicable
One could reuse this functor instead of rolling out your own version.
There were a couple other cases where the code was similar, but not
quite the same, such as it might have an assertion in the lambda or other
constructs. Thus, I've not touched any of those, as it might change the
behavior in some way.

As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal
Chris Lattner
> LLVM intentionally has a “yes, you can apply common sense judgement to
> things” policy when it comes to code review. If you are doing mechanical
> patches (e.g. adopting less_first) that apply to the entire monorepo,
> then you don’t need everyone in the monorepo to sign off on it. Having
> some +1 validation from someone is useful, but you don’t need everyone
> whose code you touch to weigh in.

Differential Revision: https://reviews.llvm.org/D126068
2022-05-27 11:15:23 +02:00
Rafael Auler c09cd64e5c [BOLT] Fix AND evaluation bug in shrink wrapping
Fix a bug where shrink-wrapping would use wrong stack offsets
because the stack was being aligned with an AND instruction, hence,
making its true offsets only available during runtime (we can't
statically determine where are the stack elements and we must give up
on this case).

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126110
2022-05-26 14:59:28 -07:00
zr33 e51a6b7374 [BOLT][DWARF] Convert dwarf5-df-* tests to assembly tests
Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D126086
2022-05-25 13:41:18 -07:00